Highest Voted Questions - Artificial Intelligence Stack Exchange

8

votes

2 answers

How to deal with large (or NaN) neural network weights?

My weights go from being between 0 and 1 at initialization to exploding into the tens of thousands in the next iteration. In the 3rd iteration, they become so large that only arrays of NaN values are displayed. How can I go about fixing this? Is it…

asked Sep 23 '19 at 20:39

FeedMeInformation

327
2
7

8

votes

5 answers

Could curiosity improve artificial intelligence?

While thinking about AI, this question came into my mind. Could curiosity help in developing a true AI? According to this website (for testing creativity): Curiosity in this context refers to persistent desire to learn and discover new things and…

asked Aug 11 '16 at 14:00

Eka

1,066
8
24

8

votes

6 answers

What is the difference between artificial intelligence and robots?

asked Aug 08 '16 at 17:53

Vishnu JK

1,072
1
9
21

8

votes

2 answers

What does the symbol $\mathbb E$ mean in these equations?

I came across some papers that use $\mathbb E$ in equations, in particular, this paper: https://arxiv.org/pdf/1511.06581.pdf. Here are some equations from the paper that use it: $Q^\pi \left(s,a \right) = \mathbb E \left[R_t|s_t = s, a_t = a, \pi…

asked Aug 27 '19 at 14:46

theonekeyg

91
1
4

8

votes

1 answer

An intuitive explanation of Adagrad, its purpose and its formula

It (Adagrad) adapts the learning rate to the parameters, performing smaller updates (i.e. low learning rates) for parameters associated with frequently occurring features, and larger updates (i.e. high learning rates) for parameters associated…

asked Aug 16 '19 at 16:51

DaddyMike

123
7
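The Adagrad excerpt above describes smaller updates for frequently updated parameters and larger ones for rarely updated parameters. The following is a minimal sketch of that behavior, not the asker's or any library's implementation: the function name `adagrad_step`, the toy gradients, the base learning rate `lr=0.1`, and `eps` are all assumptions chosen for illustration.

```python
import math

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """Apply one Adagrad update and return new (params, accum) lists.

    accum holds the running sum of squared gradients per parameter;
    each parameter's effective learning rate is lr / (sqrt(accum) + eps).
    """
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g                      # accumulate squared gradient
        step = lr / (math.sqrt(a) + eps)   # per-parameter learning rate
        new_params.append(p - step * g)
        new_accum.append(a)
    return new_params, new_accum

# Toy setup (hypothetical): parameter 0 receives a gradient on every step
# (a "frequent" feature); parameter 1 only on every 10th step (a "rare" one).
params = [0.0, 0.0]
accum = [0.0, 0.0]
for t in range(100):
    grads = [1.0, 1.0 if t % 10 == 0 else 0.0]
    params, accum = adagrad_step(params, grads, accum)

# Effective learning rates after 100 steps.
frequent_lr = 0.1 / (math.sqrt(accum[0]) + 1e-8)
rare_lr = 0.1 / (math.sqrt(accum[1]) + 1e-8)
print(frequent_lr, rare_lr)
```

In this toy run the rarely updated parameter ends with the larger effective learning rate, because its accumulated squared gradient grows more slowly.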

8

votes

3 answers

Does this argument refuting the existence of superintelligence work?

A superintelligence is a machine that can surpass any human at all intellectual activities, and such a machine is often portrayed in science fiction as a machine that brings mankind to an end. Any machine is executed using an algorithm. By the…

asked Aug 02 '16 at 20:37

wythagoras

1,521
12
28

8

votes

3 answers

Is there any research on the application of AI for drug design?

Is there any research on the application of AI for drug design? For example, you could train a deep learning model on current compounds, substances, structures, and their products and chemical reactions from the existing dataset (basically what…

asked Aug 05 '16 at 20:47

kenorb

10,483
3
44
94

8

votes

2 answers

What is the appropriate way to deal with multiple paths to same state in MCTS?

Many games have multiple paths to the same states. What is the appropriate way to deal with this in MCTS? If the state appears once in the tree, but with multiple parents, then it seems to be difficult to define back propagation: do we only…

monte-carlo-tree-search

asked Jul 30 '19 at 10:14

Jay McCarthy

225
1
4

8

votes

1 answer

Does AlphaZero use Q-Learning?

I was reading the AlphaZero paper Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, and it seems they don't mention Q-Learning anywhere. So does AZ use Q-Learning on the results of self-play or just a Supervised…

asked Jul 01 '19 at 17:02

Avetik

115
6

8

votes

1 answer

What is the difference between a stationary and a non-stationary policy?

In reinforcement learning, there are deterministic and non-deterministic (or stochastic) policies, but there are also stationary and non-stationary policies. What is the difference between a stationary and a non-stationary policy? How do you…

asked Jun 27 '19 at 15:14

nbro

40,472
12
105
192

8

votes

1 answer

Can a single neural network handle recognizing two types of objects, or should it be split into two smaller networks?

In particular, an embedded computer (with limited resources) analyzes a live video stream from a traffic camera, trying to pick good frames that contain license plate numbers of passing cars. Once a plate is located, the frame is handed over to an OCR…

asked Aug 02 '16 at 15:52

SF.

464
3
13

8

votes

1 answer

What is the purpose of the actor in actor-critic algorithms?

For discrete action spaces, what is the purpose of the actor in actor-critic algorithms? My current understanding is that the critic estimates the future reward given an action, so why not just take the action that maximizes the estimated…

asked May 18 '19 at 23:07

David Rein

93
6

8

votes

3 answers

What is the difference between a stochastic and a deterministic policy?

In reinforcement learning, there are the concepts of stochastic (or probabilistic) and deterministic policies. What is the difference between them?

asked May 12 '19 at 18:50

nbro

40,472
12
105
192
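As a rough illustration of the two policy types named in the question above: a deterministic policy maps each state to a single action, while a stochastic policy maps each state to a probability distribution over actions and samples from it. The state names, action names, and probabilities below are hypothetical and exist only for this sketch.

```python
import random

def deterministic_policy(state, table):
    """Return the single action assigned to this state."""
    return table[state]

def stochastic_policy(state, dist, rng):
    """Sample an action from the state's action distribution."""
    actions, probs = zip(*dist[state].items())
    return rng.choices(actions, weights=probs, k=1)[0]

# Hypothetical two-state, two-action example.
table = {"s0": "left", "s1": "right"}
dist = {
    "s0": {"left": 0.7, "right": 0.3},
    "s1": {"left": 0.1, "right": 0.9},
}

rng = random.Random(0)  # seeded so the run is repeatable
samples = [stochastic_policy("s0", dist, rng) for _ in range(1000)]

print(deterministic_policy("s0", table))             # always the same action
print(samples.count("left"), samples.count("right"))  # a mix of both actions
```

Calling the deterministic policy repeatedly in the same state always yields the same action, whereas the stochastic policy yields different actions with frequencies that approach the stated probabilities.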

8

votes

2 answers

How do we define the reward function for an environment?

How do you actually decide what reward value to give for each action in a given state for an environment? Is this purely experimental and down to the programmer of the environment? So, is it a heuristic approach of simply trying different reward…

asked May 12 '19 at 00:15

Hazzaldo

279
2
9

8

votes

1 answer

Which machine learning models are universal function approximators?

The universal approximation theorem states that a feed-forward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function (provided some assumptions on the activation function are…

asked May 10 '19 at 12:15

nbro

40,472
12
105
192
