Most Popular

1500 questions
13
votes
1 answer

Why is A* optimal if the heuristic function is admissible?

A heuristic is admissible if it never overestimates the true cost to reach the goal node from $n$. If a heuristic is consistent, then the heuristic value of $n$ is never greater than the cost of reaching its successor, $n'$, plus the successor's heuristic…
Wizard
  • 303
  • 1
  • 2
  • 6
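
The two definitions in the excerpt above can be checked mechanically on a small graph. The following is a minimal sketch, not taken from the question or its answer: the graph `GRAPH`, the heuristic tables, and the helper names `true_costs_to_goal`, `is_admissible`, and `is_consistent` are all hypothetical and chosen only for illustration. It assumes non-negative edge costs and a heuristic given as a lookup table.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Cheapest cost from every node to `goal` (Dijkstra on the reversed graph)."""
    reverse = {}
    for u, edges in graph.items():
        for v, cost in edges.items():
            reverse.setdefault(v, {})[u] = cost
    dist = {goal: 0}
    queue = [(0, goal)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for prev, cost in reverse.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                heapq.heappush(queue, (nd, prev))
    return dist

def is_admissible(graph, h, goal):
    """h(n) never overestimates the true cost from n to the goal."""
    best = true_costs_to_goal(graph, goal)
    return all(h[n] <= best[n] for n in best)

def is_consistent(graph, h):
    """h(n) <= c(n, n') + h(n') for every edge n -> n'."""
    return all(h[n] <= cost + h[m]
               for n, edges in graph.items()
               for m, cost in edges.items())

# Hypothetical example graph: node -> {successor: edge cost}.
GRAPH = {
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 5},
    "B": {"G": 1},
    "G": {},
}
H_CONSISTENT = {"S": 4, "A": 3, "B": 1, "G": 0}   # equals the true costs
H_ADMISSIBLE_ONLY = {"S": 4, "A": 1, "B": 1, "G": 0}  # violates S -> A
H_OVERESTIMATE = {"S": 6, "A": 3, "B": 1, "G": 0}  # overestimates at S

if __name__ == "__main__":
    for name, h in [("consistent", H_CONSISTENT),
                    ("admissible only", H_ADMISSIBLE_ONLY),
                    ("overestimate", H_OVERESTIMATE)]:
        print(name, is_admissible(GRAPH, h, "G"), is_consistent(GRAPH, h))
```

The second table shows that admissibility does not imply consistency: every value stays below the true cost, yet the edge from S to A breaks the inequality.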
13
votes
1 answer

How exactly can ReLUs approximate non-linear and curved functions?

Currently, the most commonly used activation functions are ReLUs. So I answered the question "What is the purpose of an activation function in neural networks?" and, while writing the answer, it struck me: how exactly can ReLUs approximate a…
user9947
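
One way to see how ReLUs can follow a curve is to build a piecewise-linear interpolant by hand. The sketch below is an illustration under stated assumptions, not the approach from the linked answers: the helper names `relu_interpolant` and `max_error` are hypothetical, the weights are computed in closed form rather than learned, and the target $x^2$ on $[0, 1]$ is an arbitrary choice. Each ReLU unit adds one "kink" at a knot, and more knots give a closer fit.

```python
def relu(z):
    return max(0.0, z)

def relu_interpolant(f, knots):
    """Return g(x) = bias + sum_i w_i * relu(x - k_i) that matches f at the knots.

    Valid as an interpolant for x between knots[0] and knots[-1].
    """
    # Slope of the straight segment between each pair of neighbouring knots.
    slopes = [(f(b) - f(a)) / (b - a) for a, b in zip(knots, knots[1:])]
    # The first unit sets the initial slope; each later unit adds the change in slope.
    weights = [slopes[0]] + [s1 - s0 for s0, s1 in zip(slopes, slopes[1:])]
    bias = f(knots[0])

    def g(x):
        # zip stops at the shorter list, so the last knot carries no unit.
        return bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

    return g

def max_error(f, g, lo, hi, samples=1001):
    """Largest |f - g| over an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / (samples - 1)
    return max(abs(f(lo + i * step) - g(lo + i * step)) for i in range(samples))

def square(x):
    return x * x

if __name__ == "__main__":
    for n in (4, 16, 64):
        knots = [i / n for i in range(n + 1)]
        g = relu_interpolant(square, knots)
        print(n, "units, max error:", max_error(square, g, 0.0, 1.0))
```

For this target the error of linear interpolation with knot spacing $h$ is at most $h^2/4$, so quadrupling the number of units shrinks the worst-case gap by roughly a factor of sixteen.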
13
votes
5 answers

What is the fundamental difference between CNN and RNN?

What is the fundamental difference between convolutional neural networks and recurrent neural networks? Where are they applied?
Pradeep BV
  • 151
  • 1
  • 7
13
votes
3 answers

Is it possible to train a neural network to estimate a vehicle's length?

I have a large dataset (over 100k samples) of vehicles with the ground truth of their lengths. Is it possible to train a deep network to measure/estimate vehicle length? I haven't seen any papers related to estimating object size using a deep neural…
Naji
  • 139
  • 1
  • 1
  • 3
13
votes
4 answers

Why do LLMs and RNNs learn so fast during inference but, ironically, are so slow during training?

Why do LLMs learn so fast during inference but, ironically, are so slow during training? That is, if you teach an AI a new concept in a prompt, it will learn and use the concept perfectly and flawlessly throughout the whole prompt, after just one shot.…
MaiaVictor
  • 365
  • 3
  • 10
13
votes
5 answers

Is there a rigorous proof that AGI is possible, at least, in theory?

It is often implicitly assumed in computer science that the human mind, or at least some mechanical calculations that humans perform (see the Church-Turing thesis), can be replicated with a Turing machine, therefore Artificial General Intelligence…
yters
  • 387
  • 3
  • 11
13
votes
2 answers

Input/output encoding for a neural network to learn a grid-based game

I am writing a simple toy game with the intent of training a deep neural network on top of it. The game's rules are roughly the following: The game has a board made up of hexagonal cells. Both players have the same collection of pieces that they can…
Totem
  • 381
  • 2
  • 6
13
votes
2 answers

Is there a fundamental difference between an environment being stochastic and being partially observable?

In AI literature, deterministic vs stochastic and being fully-observable vs partially observable are usually considered two distinct properties of the environment. I'm confused about this because what appears random can be described by hidden…
martinkunev
  • 255
  • 1
  • 7
13
votes
6 answers

What are good alternatives to the expression "Artificial Intelligence"?

I read a really interesting article titled "Stop Calling it Artificial Intelligence" that made a compelling critique of the name "Artificial Intelligence". The word intelligence is so broad that it's hard to say whether "Artificial Intelligence" is…
user6698
13
votes
2 answers

Are the shortcomings of neural networks diminishing?

Having worked with neural networks for about half a year, I have experienced first-hand what are often claimed as their main disadvantages, i.e. overfitting and getting stuck in local minima. However, through hyperparameter optimization and some…
user4747
13
votes
3 answers

Why teach only search algorithms in a short introductory AI course?

I understood that the concept of search is important in AI. There's a question on this website regarding this topic, but one could also intuitively understand why. I've had an introductory course on AI, which lasted half of a semester, so of course…
nbro
  • 40,472
  • 12
  • 105
  • 192
13
votes
2 answers

What are other examples of theoretical machine learning books?

I am looking for a book about machine learning that would suit my physics background. I am more or less familiar with classical and complex analysis, theory of probability, calculus of variations, matrix algebra, etc. However, I have not studied…
Ilya
  • 133
  • 1
  • 5
13
votes
3 answers

Why is the reward in reinforcement learning always a scalar?

I'm reading Reinforcement Learning by Sutton & Barto, and in section 3.2 they state that the reward in a Markov decision process is always a scalar real number. At the same time, I've heard about the problem of assigning credit to an action for a…
Sid Mani
  • 233
  • 1
  • 4
13
votes
2 answers

How are generative adversarial networks trained?

I am reading about generative adversarial networks (GANs) and I have some doubts regarding them. So far, I understand that in a GAN there are two different types of neural networks: one is generative ($G$) and the other discriminative ($D$). The…
Eka
  • 1,066
  • 8
  • 24
13
votes
1 answer

How would DeepMind's new differentiable neural computer scale?

DeepMind just published a paper about a differentiable neural computer, which basically combines a neural network with a memory. The idea is to teach the neural network to create and recall useful explicit memories for a certain task. This…
BlindKungFuMaster
  • 4,255
  • 12
  • 23