Highest Voted Questions - Artificial Intelligence Stack Exchange

14

votes

1 answer

What are the state-of-the-art results on the generalization ability of deep learning methods?

I've read a few classic papers on different architectures of deep CNNs used to solve varied image-related problems. I'm aware there's some paradox in how deep networks generalize well despite seemingly overfitting training data. A lot of people in…

asked Nov 15 '19 at 09:22

Shirish Kulhari

393
1
11

14

votes

2 answers

How should I encode the structure of a neural network into a genome?

For a deterministic problem space, I need to find a neural network with the optimal node and link structure. I want to use a genetic algorithm to simulate many neural networks to find the best network structure for the problem domain. I've never…

asked Aug 14 '16 at 19:16

Mithical

2,905
5
27
39

14

votes

3 answers

Is there a way to understand neural networks without using the concept of brain?

Is there a way to understand, for instance, a multi-layered perceptron without hand-waving about them being similar to brains, etc.? For example, it is obvious that what a perceptron does is approximate a function; there might be many other ways,…

asked Oct 19 '19 at 18:23

Evgeniy

249
1
3

14

votes

1 answer

What are the implications of the "No Free Lunch" theorem for machine learning?

The No Free Lunch (NFL) theorem states (see the paper Coevolutionary Free Lunches by David H. Wolpert and William G. Macready) that any two algorithms are equivalent when their performance is averaged across all possible problems. Is the "No Free Lunch"…

asked Sep 27 '19 at 13:52

user9947

14

votes

3 answers

What is the relationship between the size of the hidden layer and the size of the cell state layer in an LSTM?

I was following some examples to get familiar with TensorFlow's LSTM API, but noticed that all LSTM initialization functions require only the num_units parameter, which denotes the number of hidden units in a cell. According to what I have learned…

asked Sep 25 '19 at 13:59

kuixiong

241
2
4

14

votes

4 answers

What is the relevance of AIXI to current artificial intelligence research?

From Wikipedia: AIXI ['ai̯k͡siː] is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000[1] and the results…

asked Aug 02 '16 at 21:10

rcpinto

2,119
1
16
31

14

votes

3 answers

MCTS for non-deterministic games with very high branching factor for chance nodes

I'm trying to use a Monte Carlo Tree Search for a non-deterministic game. Apparently, one of the standard approaches is to model non-determinism using chance nodes. The problem for this game is that it has a very high min-entropy for the random…

asked Aug 09 '19 at 06:43

Mark

241
2
5

14

votes

3 answers

What sort of mathematical problems are there in AI that people are working on?

I recently got an 18-month postdoc position in a math department. It's a position with relatively light teaching duties and a lot of freedom about what type of research I want to do. Previously I was mostly doing some research in probability and…

asked Jun 21 '19 at 09:37

NonalcoholicBeer

261
2
8

14

votes

2 answers

How large should the replay buffer be?

I'm learning the DDPG algorithm by following the OpenAI Spinning Up document on DDPG, where it is written: In order for the algorithm to have stable behavior, the replay buffer should be large enough to contain a wide range of…

asked Apr 04 '19 at 14:40

ycenycute

341
1
2
6

14

votes

1 answer

How can the convolution operation be implemented as a matrix multiplication?

How can the convolution operation used by CNNs be implemented as a matrix-vector multiplication? We often think of the convolution operation in CNNs as a kernel that slides across the input. However, rather than sliding this kernel (e.g. using…

asked Mar 12 '19 at 13:57

nbro

40,472
12
105
192

14

votes

1 answer

What is the relation between online (or offline) learning and on-policy (or off-policy) algorithms?

In the context of RL, there is the notion of on-policy and off-policy algorithms. I understand the difference between on-policy and off-policy algorithms. Moreover, in RL, there's also the notion of online and offline learning. What is the relation…

asked Feb 09 '19 at 14:48

nbro

40,472
12
105
192

14

votes

3 answers

Why does it make sense to normalize rewards per episode in reinforcement learning?

In OpenAI's actor-critic and in OpenAI's REINFORCE, the rewards are normalized like so: rewards = (rewards - rewards.mean()) / (rewards.std() + eps), on every episode individually. This is probably the baseline reduction, but I'm not entirely…

asked Jan 24 '19 at 13:56

Gulzar

759
1
9
24

13

votes

4 answers

Is the singularity something to be taken seriously?

The term Singularity is often used in mainstream media for describing visionary technology. It was introduced by Ray Kurzweil in a popular book The Singularity Is Near: When Humans Transcend Biology (2005). In his book, Kurzweil gives an outlook to…

asked Sep 07 '18 at 13:31

Bucky Rogerson

131
4

13

votes

2 answers

Which layer in a CNN consumes more training time: convolution layers or fully connected layers?

In a convolutional neural network, which layer consumes more training time: convolution layers or fully connected layers? We can take the AlexNet architecture to understand this. I want to see the time breakdown of the training process. I want a relative…

asked Sep 06 '18 at 23:27

Ruchit Dalwadi

325
3
11

13

votes

2 answers

How important is consciousness for making advanced artificial intelligence?

How important are consciousness and self-consciousness for making advanced AIs? How far away are we from making such? When making e.g. a neural network there's (very probably) no consciousness within it, just the mathematics behind it, but do we need…

asked Aug 14 '18 at 22:17

Mr. Eivind

578
4
27
