Highest Voted Questions - Artificial Intelligence Stack Exchange

7

votes

2 answers

How are the reward functions $R(s)$, $R(s, a)$ and $R(s, a, s')$ equivalent?

In this video, the lecturer states that $R(s)$, $R(s, a)$ and $R(s, a, s')$ are equivalent representations of the reward function. Intuitively, this is the case, according to the same lecturer, because $s$ can be made to represent the state and the…

asked Feb 07 '19 at 15:38

nbro

7

votes

2 answers

What's the role of bounding boxes in object detection?

I'm quite new to the field of computer vision and was wondering what the purpose of bounding boxes in object detection is. Obviously, it shows where the detected object is, and using a classifier can only classify one object per image,…

asked Jan 25 '19 at 06:39

Cody Chung

6

votes

2 answers

Can neural networks learn to ignore an input datum?

Disclaimer: I'm not a student in computer science and most of my knowledge about ML/NN comes from YouTube, so please bear with me! Let's say we have a classification neural network that takes some input data $w, x, y, z$, and has some number of…

asked Jan 10 '19 at 14:42

czz1850

6

votes

2 answers

How can the importance sampling ratio be different than zero when the target policy is deterministic?

In the book Reinforcement Learning: An Introduction (2nd edition), Sutton and Barto define on page 104 (p. 126 of the pdf), equation (5.3), the importance sampling ratio, $\rho _{t:T-1}$, as follows: $$\rho…

asked Jan 09 '19 at 17:28

F.M.F.

6

votes

1 answer

Would machine learning be suitable for finding the seed of a random number generator?

I'm new to machine learning, and AI in general (but with 20+ years of programming experience). I'm wondering if machine learning is a good general approach to finding the seed of a random number generator. Suppose I have a list of 2000 numbers. Is there a…

asked Jan 09 '19 at 08:59

Eden

6

votes

3 answers

How to train a logical XOR with reinforcement learning?

After reading the excellent blog post Deep Reinforcement Learning: Pong from Pixels and playing with the code a little, I tried to do something simple: use the same code to train a logical XOR gate. But no matter how I've tuned hyperparameters,…

reinforcement-learning

asked Jan 04 '19 at 23:45

Dimagog

6

votes

3 answers

What is a high dimensional state in reinforcement learning?

In the DQN paper, it is written that the state-space is high dimensional. I am a little bit confused about this terminology. Suppose my state is a high dimensional vector of length $N$, where $N$ is a huge number. Let's say I solve this task using…

asked Jan 03 '19 at 16:42

Siddhant Tandon

6

votes

1 answer

How should we choose the dimensions of the encoding layer in auto-encoders?

asked Dec 27 '18 at 17:26

Neha soni

6

votes

2 answers

What is the difference between imitation learning and classification done by experts?

In short, imitation learning means learning from experts. Suppose I have a dataset with labels based on the actions of experts. I use a simple binary classifier to assess whether an action is a good expert action or a bad expert action. How is…

asked Dec 19 '18 at 02:38

user781486

6

votes

1 answer

What does it mean to do multi-dimensional processing with tensors in tensor cores?

In some tweets about NeurIPS 2018, this video from NVIDIA appeared. At around 0.37, she says: If you think about the current computations in our deep learning systems, they are all based on Linear Algebra. Can we come up with better paradigms to do…

asked Dec 09 '18 at 15:13

wrong_path

6

votes

5 answers

Why can't the XOR linear inseparability problem be solved with one perceptron like this?

Consider a perceptron where $w_0=1$ and $w_1=1$. Now, suppose that we use the following activation function \begin{align} f(x)= \begin{cases} 1, & \text{if } x = 1\\ 0, & \text{otherwise} \end{cases} \end{align} The output is then summarised…
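
The setup in this excerpt can be checked with a minimal sketch. It assumes two binary inputs (named `x0` and `x1` here for illustration), no bias term, and a pre-activation equal to the weighted sum $w_0 x_0 + w_1 x_1$; none of these details are confirmed by the truncated excerpt.

```python
def activation(x):
    # Activation from the excerpt: 1 if the pre-activation equals 1, else 0.
    return 1 if x == 1 else 0


def perceptron(x0, x1, w0=1, w1=1):
    # Assumed form: weighted sum of the two inputs, no bias (hypothetical).
    return activation(w0 * x0 + w1 * x1)


# Evaluate the unit on all four binary input pairs.
truth_table = {(x0, x1): perceptron(x0, x1) for x0 in (0, 1) for x1 in (0, 1)}

for inputs, output in truth_table.items():
    print(inputs, "->", output)
```

Under these assumptions the unit does reproduce the XOR truth table. Note, though, that this activation is not the step (threshold) function used by the classical perceptron, which is what the linear-inseparability argument relies on.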

asked Dec 08 '18 at 21:09

rahs

6

votes

4 answers

Can Machine Learning be applied to decipher the script of lost ancient languages?

Can Machine Learning be applied to decipher the script of lost ancient languages (namely, languages that were being used many years ago, but currently are not used in human societies and have been forgotten, e.g. Avestan language)? If yes, is there…

asked Dec 06 '18 at 18:34

Questioner

6

votes

3 answers

How to deal with episode termination in Advantage Actor-Critic algorithm?

The Advantage Actor-Critic algorithm may use the following expression to get a 1-step estimate of the advantage: $A(s_t,a_t) = r(s_t, a_t) + \gamma V(s_{t+1}) (1 - done_{t+1}) - V(s_t)$, where $done_{t+1}=1$ if $s_{t+1}$ is a terminal state (end of the…
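
The 1-step estimate in this excerpt can be written as a small function. This is only a sketch: the function name, argument names, and the default `gamma=0.99` are assumptions for illustration, and the value estimates are passed in as plain numbers rather than produced by a critic network.

```python
def one_step_advantage(reward, value_s, value_next, done, gamma=0.99):
    """1-step advantage: r + gamma * V(s') * (1 - done) - V(s).

    `done` is True when s' is terminal, which zeroes out the bootstrap term.
    `gamma=0.99` is an assumed default, not taken from the question.
    """
    not_done = 1.0 - float(done)
    return reward + gamma * value_next * not_done - value_s


# Non-terminal transition: the bootstrap term gamma * V(s') is included.
a_continue = one_step_advantage(reward=1.0, value_s=0.5, value_next=2.0, done=False)

# Terminal transition: V(s') is masked out, leaving r - V(s).
a_terminal = one_step_advantage(reward=1.0, value_s=0.5, value_next=2.0, done=True)
```

When `done` is true the estimate reduces to $r(s_t, a_t) - V(s_t)$, so whatever the critic predicts for the terminal state has no effect on the advantage.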

reinforcement-learning

asked Nov 27 '18 at 04:07

Aleksei Petrenko

6

votes

1 answer

How is iterative deepening A* better than A*?

The iterative deepening A* search is an algorithm that can find the shortest path between a designated start node and any member of a set of goals. The A* algorithm evaluates nodes by combining the cost to reach the node and the cost to get from…

asked Nov 07 '18 at 13:22

Huma Qaseem

6

votes

2 answers

What are the differences in scope between statistical AI and classical AI?

What are the differences in scope between statistical AI and classical AI? Real-world examples would be appreciated.

asked Nov 05 '18 at 15:49

dua fatima
