Most Popular

1500 questions
6
votes
1 answer

Is this proof of $\epsilon$-greedy policy improvement correct?

The following paragraph about $\epsilon$-greedy policies can be found at the end of page 100, under section 5.4, of the book "Reinforcement Learning: An Introduction" by Richard Sutton and Andrew Barto (second edition, 2018). but with probability…
Jarvis1997
6
votes
1 answer

What are the conditions of convergence of temporal-difference learning?

In reinforcement learning, temporal-difference learning seems to update the value function with each new iteration of experience absorbed from the environment. What are the conditions for temporal-difference learning to converge in the end? How is it…
MJeremy
6
votes
1 answer

What's the difference between LSTM and GRU?

I have been reading about LSTMs and GRUs, which are recurrent neural networks (RNNs). The difference between the two is the number and specific type of gates that they have. The GRU has an update gate, which has a similar role to the role of the…
Pluviophile
6
votes
1 answer

How should I handle invalid actions in a grid world?

I'm building a really simple experiment, where I let an agent move from the bottom-left corner to the upper-right corner of a $3 \times 3$ grid world. I plan to use DQN to do this. I'm having trouble handling the starting point: what if the Q…
o_yeah
6
votes
2 answers

What is the difference between a Bayesian Network and a Markov Chain?

I am trying to understand the difference between a Bayesian Network and a Markov Chain. When I search for this on the web, the unanimous answer seems to be that a Bayesian Network is directed (i.e. it's a DAG) and a Markov Chain is not…
Newskooler
6
votes
4 answers

What are some datasets to train an MLP on simple tasks?

I have implemented an MLP. Now, I want to train it to solve simple tasks. Are there any data sets to train the MLP on simple tasks, that is, tasks with a small number of inputs and outputs? I would like to train it to solve problems which are…
6
votes
3 answers

What is the difference between artificial intelligence and swarm intelligence?

Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. The term may also be applied to any machine that exhibits traits associated with a human mind…
Pluviophile
6
votes
1 answer

How does the region proposal method work in Fast R-CNN?

I read so many articles and the Fast R-CNN paper, but I'm still confused about how the region proposal method works in Fast R-CNN. As you can see in the image below, they say they used a proposal method, but it is not specified how it works. What…
ozoubia
6
votes
2 answers

How to write a C decompiler using AI?

I would like to learn more about whether it is possible, and how, to write a program that decompiles an executable binary (an object file) to C source. I'm not asking exactly 'how', but rather how this can be achieved. Given the following hello.c…
kenorb
6
votes
1 answer

Is Expected SARSA an off-policy or on-policy algorithm?

I understand that SARSA is an on-policy algorithm, and Q-learning an off-policy one. Sutton and Barto's textbook describes Expected Sarsa thusly: In these cliff walking results Expected Sarsa was used on-policy, but in general it might use a…
Y. Xu
6
votes
0 answers

Why can Pixel RNN (Row LSTM) capture triangular contexts?

I'm reading the paper Pixel Recurrent Neural Network. I have a question about Row LSTM. Why can Row LSTM capture triangular contexts? In this paper, the kernel of the one-dimensional convolution has size $k \times 1$ where $k \geq 3$; the larger…
musako
6
votes
4 answers

Would it be ethical to allow an AI to make life-or-death medical decisions?

Would it be ethical to allow an AI to make life-or-death medical decisions? For instance, where there is an insufficient number of ventilators during a respiratory pandemic, not every patient can have one. It seems like a straightforward question,…
DukeZhou
6
votes
1 answer

In NEAT, is it a good idea to give the same ID to node genes created from the same connection gene?

Do I have to prevent nodes created from the same connection gene from having different IDs/innovation numbers? In this example, node 6 is created from the connection going from node 3 to node 4: In the case where that specific node was already…
Dara Kong
6
votes
1 answer

Which neural networks can be used only for storing and retrieving information?

Is there a neural network (NN) system or architecture that can be used only for storing and retrieving information? For example, to store the whole Avatar movie in HD format inside a neural network and retrieve it (without loss) from the neural network…
Eka
6
votes
1 answer

Is there any programming language designed by deep learning?

I know that AI can be used to design printed circuit boards (PCBs), so it can be used to solve complex tasks. Is there any programming language designed by deep learning (or any other AI technique)?
sailfish009