Most Popular (1500 questions)
6 votes • 2 answers

What is a good way to create artificial self-recognition?

Self-recognition seems to be a capability that designers are trying to integrate into artificial intelligence. Is there a generally recognized method of doing this in a machine, and how would one test for the capacity, as with a Turing test?
D. Wade • 541 • 2 • 7
6 votes • 5 answers

Emulating the human brain with analog NN chips

Considering the answers to this question, emulating a human brain with current computing capacity is impossible, but we aren't very far from it. Note that 1 or 2 decades ago, similar calculations gave similar results. The clock frequency of…
peterh • 203 • 4 • 16
6 votes • 1 answer

How should I deal with variable-length inputs for neural networks?

I am an absolute beginner in the field of AI. I am basically a pharma professional without much coding experience, and I use GUI-based tools for the neural network. I am trying to develop an ANN that receives a protein sequence as input and produces as…
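For illustration, a minimal sketch of one common way to feed variable-length sequences to a fixed-size network: encode each residue as an integer and pad or truncate every sequence to the same length (the alphabet, `max_len`, and function names below are illustrative assumptions, not taken from the question).

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # 0 is reserved for padding

def encode_and_pad(sequences, max_len=500):
    """Map each residue to an integer and pad/truncate to max_len."""
    batch = np.zeros((len(sequences), max_len), dtype=np.int64)
    for row, seq in enumerate(sequences):
        indices = [AA_TO_INDEX.get(aa, 0) for aa in seq[:max_len]]
        batch[row, :len(indices)] = indices
    return batch

X = encode_and_pad(["MKTAYIAKQR", "GAVLIPFMW"])
print(X.shape)  # (2, 500): every sequence now has the same length
```

Recurrent or attention-based models with masking avoid the fixed length altogether, but padding is the simplest option when the downstream tool expects a fixed-size input.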
6 votes • 2 answers

Can neurons in an MLP and filters in a CNN be compared?

I know they do not work the same way, but an input layer sends the input to $n$ neurons, each with a set of weights; based on these weights and the activation layer, each neuron produces an output that can be fed to the next layer. Aren't the filters the same,…
Tibo Geysen • 193 • 5
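A minimal sketch of the comparison in plain NumPy (illustrative only): both a dense neuron and a convolutional filter compute a weighted sum plus a bias followed by a nonlinearity; the filter just reuses the same small weight vector at every position of the input, producing a feature map instead of a single scalar.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Dense neuron: one weight per input feature, one scalar output.
x = np.random.randn(16)
w_dense = np.random.randn(16)
b_dense = 0.1
dense_out = relu(w_dense @ x + b_dense)

# 1D convolutional filter: 3 shared weights, applied at every position.
signal = np.random.randn(16)
w_filt = np.random.randn(3)
b_filt = 0.1
conv_out = np.array([
    relu(w_filt @ signal[i:i + 3] + b_filt)
    for i in range(len(signal) - 2)
])

print(dense_out, conv_out.shape)  # single activation vs. feature map of length 14
```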
6 votes • 1 answer

What are the pros and cons of a Bi-LSTM compared to an LSTM?

What are the pros and cons of an LSTM vs a Bi-LSTM in language modelling? Why was it necessary to introduce the Bi-LSTM?
DRV • 1,673 • 3 • 13 • 19
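A minimal sketch of the structural difference, assuming PyTorch (the layer sizes and shapes are illustrative): a Bi-LSTM runs one LSTM forward and one backward over the sequence and concatenates both hidden states at every time step, so each position sees past and future context, at the cost of not being usable for causal, left-to-right prediction.

```python
import torch
import torch.nn as nn

x = torch.randn(8, 20, 32)  # batch of 8 sequences, length 20, 32 features each

uni = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
bi = nn.LSTM(input_size=32, hidden_size=64, batch_first=True, bidirectional=True)

out_uni, _ = uni(x)  # (8, 20, 64): each step conditions only on the past
out_bi, _ = bi(x)    # (8, 20, 128): forward and backward states concatenated

print(out_uni.shape, out_bi.shape)
```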
6 votes • 1 answer

If vanishing gradients are NOT the problem that ResNets solve, then what is the explanation behind ResNet success?

I often see blog posts or questions on here starting with the premise that ResNets solve the vanishing gradient problem. The original 2015 paper contains the following passage in section 4.1: We argue that this optimization difficulty is unlikely…
Alexander Soare • 1,339 • 2 • 11 • 27
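For reference, a minimal residual block sketch, assuming PyTorch (batch normalization and downsampling omitted for brevity): the block outputs $F(x) + x$, so the stacked layers only need to learn the residual $F(x)$ rather than the full mapping, which is the ease-of-optimization argument the paper itself puts forward.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.conv2(self.relu(self.conv1(x)))
        return self.relu(residual + x)  # identity shortcut added before the final ReLU

block = ResidualBlock(16)
print(block(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```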
6 votes • 0 answers

How to correctly implement self-play with DQN?

I have an environment where an agent faces an equal opponent, and while I've achieved OK performance by implementing DQN and treating the opponent as part of the environment, I think performance would improve if the agent trained against itself…
Pell000 • 61 • 1
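A sketch of one common self-play pattern (illustrative only; the tiny `QNetwork`, observation shapes, and snapshot interval are assumptions): the opponent acts greedily with a periodically frozen copy of the learning agent's Q-network, so the agent always trains against a slightly older version of itself.

```python
import copy
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_obs=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_obs, 32), nn.ReLU(), nn.Linear(32, n_actions))

    def forward(self, obs):
        return self.net(obs)

agent_q = QNetwork()
opponent_q = copy.deepcopy(agent_q)  # frozen snapshot the opponent plays with
SNAPSHOT_EVERY = 500                 # how often to refresh the opponent

for step in range(2000):
    obs = torch.randn(1, 8)          # placeholder observation from the environment
    with torch.no_grad():
        opponent_action = opponent_q(obs).argmax(dim=1)  # opponent's greedy move
    # ... agent acts, stores transitions, and runs its usual DQN update here ...
    if step % SNAPSHOT_EVERY == 0:
        opponent_q.load_state_dict(agent_q.state_dict())  # refresh the snapshot
```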
6 votes • 2 answers

How can the policy iteration algorithm be model-free if it uses the transition probabilities?

I'm currently trying to understand policy iteration in the context of RL. I read an article presenting it, and at some point pseudocode for the algorithm is given. What I can't understand is this line: From what I understand, policy…
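For reference, a worked sketch of the two updates that textbook policy iteration alternates between, written in the usual Sutton and Barto notation so the role of the transition model is explicit; because both steps need $p(s', r \mid s, a)$, classical policy iteration is a model-based dynamic programming method rather than a model-free one.

$$V(s) \leftarrow \sum_{s', r} p(s', r \mid s, \pi(s)) \left[ r + \gamma V(s') \right]$$

$$\pi(s) \leftarrow \arg\max_a \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma V(s') \right]$$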
6 votes • 0 answers

How exactly does self-play work, and how does it relate to MCTS?

I am working towards using RL to create an AI for a two-player, hidden-information, turn-based board game. I have just finished David Silver's RL course and Denny Britz's coding exercises, and so am relatively familiar with MC control, SARSA,…
Alienator • 61 • 2
6 votes • 2 answers

What are the state-of-the-art meta-reinforcement learning methods?

This question may seem a little too broad, but I am wondering what the current state-of-the-art works on meta reinforcement learning are. Can you point me to the current state of the art in this field?
Sara El • 63 • 4
6 votes • 1 answer

Why is the evidence equal to the KL divergence plus the loss?

Why is the equation $$\log p_{\theta}(x^1,...,x^N)=D_{KL}(q_{\theta}(z|x^i)||p_{\phi}(z|x^i))+\mathbb{L}(\phi,\theta;x^i)$$ true, where $x^i$ are data points and $z$ are latent variables? I was reading the original variational autoencoder paper and I…
user8714896 • 797 • 1 • 6 • 24
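A short derivation sketch for a single data point, using the paper's usual convention of $q_\phi$ for the encoder and $p_\theta$ for the model (the question's excerpt swaps the subscripts, but the structure is the same):

$$\log p_\theta(x^{(i)}) = \mathbb{E}_{q_\phi(z \mid x^{(i)})}\left[\log \frac{p_\theta(x^{(i)}, z)}{q_\phi(z \mid x^{(i)})}\right] + \mathbb{E}_{q_\phi(z \mid x^{(i)})}\left[\log \frac{q_\phi(z \mid x^{(i)})}{p_\theta(z \mid x^{(i)})}\right] = \mathcal{L}(\theta, \phi; x^{(i)}) + D_{KL}\!\left(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z \mid x^{(i)})\right)$$

The identity follows from $p_\theta(x^{(i)}) = p_\theta(x^{(i)}, z) / p_\theta(z \mid x^{(i)})$ and from adding and subtracting $\log q_\phi(z \mid x^{(i)})$ inside the expectation; since the KL term is non-negative, $\mathcal{L}$ is a lower bound on the evidence.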
6 votes • 2 answers

In deep learning, is it possible to use discontinuous activation functions?

In deep learning, is it possible to use discontinuous activation functions (e.g. one with a jump discontinuity)? (My guess: for example, ReLU is non-differentiable at a single point, but it still has a well-defined derivative. If an activation…
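A sketch of one known workaround, assuming PyTorch (the class name and shapes are illustrative): a discontinuous step activation trained with a straight-through estimator, where the forward pass uses the hard step while the backward pass treats it as the identity so gradients can still flow.

```python
import torch

class StepSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return (x > 0).float()  # Heaviside step: discontinuous at 0, flat elsewhere

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output      # straight-through: pass the gradient unchanged

x = torch.randn(4, requires_grad=True)
loss = StepSTE.apply(x).sum()
loss.backward()
print(x.grad)  # non-zero despite the flat, discontinuous forward pass
```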
6 votes • 3 answers

Why are traditional ML models still used over deep neural networks?

I'm still taking my first steps in the data science field. I have played with some DL frameworks, like TensorFlow (pure) and Keras (on top of it), and I know a little about some "classic machine learning" algorithms like decision trees, k-nearest neighbors,…
Douglas Ferreira • 845 • 1 • 8 • 13
6 votes • 1 answer

What is the mathematical definition of an activation function?

What is the mathematical definition of an activation function to be used in a neural network? So far I have not found a precise one summarizing which criteria (e.g. monotonicity, differentiability, etc.) are required. Any recommendations for…
user32649 • 69 • 2
6 votes • 0 answers

Has anyone attempted to take a bunch of similar neural networks and extract general formulae about the focus area?

When a neural network learns something from a data set, we are left with a bunch of weights which represent some approximation of knowledge about the world. Although different data sets or even different runs of the same NN might yield completely…
Lawnmower Man • 300 • 1 • 7