Most Popular

1500 questions
14
votes
4 answers

What is the "dropout" technique?

What purpose does the "dropout" method serve and how does it improve the overall performance of the neural network?
kenorb
  • 10,483
  • 3
  • 44
  • 94
14
votes
6 answers

What would motivate a machine?

Currently, within the AI development field, the main focus seems to be on pattern recognition and machine learning. Learning is about adjusting internal variables based on a feedback loop. Maslow's hierarchy of needs is a theory in psychology…
Aleksei Maide
  • 251
  • 2
  • 14
14
votes
2 answers

Is the mean-squared error always convex in the context of neural networks?

Multiple resources I referred to mention that MSE is great because it's convex. But I don't get how, especially in the context of neural networks. Let's say we have the following: $X$: training dataset; $Y$: targets; $\Theta$: the set of parameters…
user74211
  • 141
  • 1
  • 3
14
votes
3 answers

Why are embeddings added, not concatenated?

Let's consider the following example from BERT. I cannot understand why "the input embeddings are the sum of the token embeddings, the segmentation embeddings, and the position embeddings". The thing is, these embeddings carry different types of…
nalzok
  • 301
  • 3
  • 9
14
votes
3 answers

How to train a neural network for a round-based board game?

I'm wondering how to train a neural network for a round-based board game like tic-tac-toe, chess, Risk, or any other round-based game. Getting the next move by inference seems to be pretty straightforward, by feeding the game state as input and…
soriak
  • 249
  • 1
  • 2
  • 3
14
votes
3 answers

What are some implications of Gödel's theorems on AI research?

Note: My experience with Gödel's theorem is quite limited: I have read Gödel, Escher, Bach; skimmed the first half of Introduction to Gödel's Theorem (by Peter Smith); and some random stuff here and there on the internet. That is, I only have a vague…
k.c. sayz 'k.c sayz'
  • 2,091
  • 10
  • 26
14
votes
6 answers

Is there actually a lack of fundamental theory on deep learning?

I heard several times that one of the fundamental/open problems of deep learning is the lack of "general theory" on it, because, actually, we don't know why deep learning works so well. Even the Wikipedia page on deep learning has similar comments.…
heleone
  • 151
  • 1
  • 3
14
votes
5 answers

Is a genetic algorithm an example of artificial intelligence?

Since human intelligence is presumably a function of a genetic algorithm in nature, is using a genetic algorithm in a computer an example of artificial intelligence? If not, how do they differ? Or perhaps some are and some are not expressing…
WilliamKF
  • 2,513
  • 1
  • 25
  • 31
14
votes
1 answer

What are the consequences of layer norm vs batch norm?

I'll start with my understanding of the literal difference between these two. First, let's say we have an input tensor to a layer, and that tensor has dimensionality $B \times D$, where $B$ is the size of the batch and $D$ is the dimensionality of…
Alexander Soare
  • 1,339
  • 2
  • 11
  • 27
14
votes
1 answer

Which approaches could I use to create a simple chatbot using a neural network?

I wanted to start experimenting with neural networks, so I decided to make a chatbot (like Cleverbot, which is not that clever anyway) using them. I looked around for some documentation and I found many tutorials on general tasks, but few on this…
Totem
  • 381
  • 2
  • 6
14
votes
2 answers

How does one prove comprehension in machines?

Say we have a machine and we give it a task to do (vision task, language task, game, etc.). How can one prove that the machine actually knows what's going on in that specific task? To narrow it down, some examples: Conversation - How would…
Landon G
  • 500
  • 2
  • 10
14
votes
3 answers

How does noise affect generalization?

Does increasing the noise in data help to improve the learning ability of a network? Does it make any difference or does it depend on the problem being solved? How does it affect the generalization process overall?
kenorb
  • 10,483
  • 3
  • 44
  • 94
14
votes
2 answers

Should deep residual networks be viewed as an ensemble of networks?

The question is about the architecture of Deep Residual Networks (ResNets). The model won first place at the "Large Scale Visual Recognition Challenge 2015" (ILSVRC2015) in all five main tracks: ImageNet Classification: “Ultra-deep” (quote…
Erba Aitbayev
  • 357
  • 1
  • 10
14
votes
3 answers

Has anyone thought about making a neural network ask questions, instead of only answering them?

Most people are trying to answer questions with a neural network. However, has anyone come up with ideas about how to make a neural network ask questions instead of answering them? For example, if a CNN can decide which category an…
cha
  • 141
  • 5
14
votes
7 answers

Is consciousness necessary for any AI task?

Consciousness is challenging to define, but for this question let's define it as "actually experiencing sensory input as opposed to just putting a bunch of data through an inanimate machine." Humans, of course, have minds; for normal computers, all…
Ben N
  • 2,599
  • 2
  • 20
  • 35