Most Popular

1500 questions
7
votes
1 answer

Who manufactures Google's Tensor Processing Units?

Does Google manufacture TPUs? I know that Google engineers are responsible for the design and that Google is the one using them, but which company actually manufactures the chips?
Alecto Irene Perez
7
votes
1 answer

Are more than 8 high performance Nvidia GPUs practical for deep learning applications?

I was prompted to ask this question while trying to find server racks and motherboards that are specialized for artificial intelligence. Naturally, I went to the SuperMicro website. There the chassis+motherboard which supported the maximum GPUs…
Rushat Rai
7
votes
2 answers

How do we choose the kernel size depending on the problem?

Obviously, finding suitable hyper-parameters for a neural network is a complex task and is problem- or domain-specific. However, there should be at least some "rules" that hold most of the time for the size of the filter (or kernel)! In most cases, intuition…
daniel451
7
votes
2 answers

How can a neural network distinguish rotated 6 and 9 digits?

Rotated MNIST is a popular dataset for benchmarking models equivariant to rotations on $\mathbb{R}^2$, described by the $SO(2)$ group or its discrete subgroups like $\mathbb{Z}_{n}$: Group equivariant convolutional networks, Harmonic networks. It…
7
votes
1 answer

Is LSTM a subcategory of RNN?

Is the LSTM architecture a subcategory of RNNs, or are they totally different? The literature doesn't seem to be consistent on this. This figure appears to present the models as alternatives, but I thought of them otherwise (LSTM to be a subcategory of…
MrPlanck
7
votes
1 answer

What are the most recent and influential breakthroughs in NLP?

I'm looking at the history of NLP, starting in the 1950s, with the Georgetown–IBM experiment. What are examples of the most recent (e.g. in the last 5-10 years) and influential breakthroughs in natural language processing?
csheroe
7
votes
1 answer

Is there a proper initialization technique for the weight matrices in multi-head attention?

Self-attention layers have 4 learnable tensors (in the vanilla formulation): the query matrix $W_Q$, the key matrix $W_K$, the value matrix $W_V$, and the output matrix $W_O$. (Nice illustration from https://jalammar.github.io/illustrated-transformer/.) However, I do not…
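For readers who want to see where the four tensors named in this excerpt sit in the computation, here is a minimal NumPy sketch of vanilla multi-head self-attention. The dimensions, the helper names, and the small Gaussian initialization (standard deviation 0.02) are all assumptions picked for illustration; the sketch does not answer the question of which initialization is appropriate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for this illustration.
d_model, n_heads, seq_len = 8, 2, 4
d_head = d_model // n_heads

# The four learnable tensors. The Gaussian scale 0.02 is an arbitrary
# placeholder, not a recommended initialization scheme.
W_Q = rng.normal(0.0, 0.02, (d_model, d_model))
W_K = rng.normal(0.0, 0.02, (d_model, d_model))
W_V = rng.normal(0.0, 0.02, (d_model, d_model))
W_O = rng.normal(0.0, 0.02, (d_model, d_model))


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(X):
    """Return (output, attention_weights) for an input of shape (seq, d_model)."""
    n = X.shape[0]

    def split_heads(M):
        # (seq, d_model) -> (heads, seq, d_head)
        return M.reshape(n, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = (split_heads(X @ W) for W in (W_Q, W_K, W_V))
    # Scaled dot-product scores, shape (heads, seq, seq).
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    heads = weights @ V  # (heads, seq, d_head)
    # Concatenate the heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(n, d_model)
    return concat @ W_O, weights


X = rng.normal(size=(seq_len, d_model))
out, weights = multi_head_self_attention(X)
```

Each row of `weights` sums to 1 because of the softmax, and `out` has the same shape as `X`, which is what lets these layers be stacked.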
7
votes
2 answers

What is the difference between the US and global edition of the AIMA book by Russell and Norvig?

The book Artificial Intelligence: A Modern Approach by Russell and Norvig has two editions: the global edition and the US edition. It looks like these two are generally the same, but have some differences in the order of the chapters and in the context, is this…
Emad
7
votes
1 answer

How could AI solve the planet's major problems?

I have been reading that AI could solve the planet's major problems. How could this be done? For example, how exactly could AI be applied to address climate change? What are examples of applications of AI to solve these problems?
Shashank
7
votes
3 answers

What are some information processing models besides MLPs?

Feedforward or multilayered neural networks, like the one in the image above, are usually characterized by the fact that all weighted connections can be represented as a continuous real number. Furthermore, each node in a layer is connected to…
user289661
7
votes
2 answers

Can neural networks efficiently solve the traveling salesman problem?

Can neural networks efficiently solve the traveling salesman problem? Are there any research papers that show that neural networks can solve the TSP efficiently? The TSP is an NP-hard problem, so I suspect that there are only approximate solutions…
7
votes
3 answers

Is there a central focus on the communication methods between AI and humans?

AI is developing at a rapid pace and is becoming very sophisticated. One aspect of this is the way AI and humans interact. Currently, the interaction is elementary: voice, on-screen text, or images. Is there…
7
votes
1 answer

Which parsing algorithm can I use for an NLP question answering system?

I am currently working on my last project before graduating. For this project, I have to develop a Natural Language Question Answering System. I have read quite a few research papers on this topic and have figured out everything except…
lilienfa
7
votes
2 answers

Are policy gradient methods good for large discrete action spaces?

I have seen this question asked primarily in the context of continuous action spaces. I have a large action space (~2-4k discrete actions) for my custom environment that I cannot reduce any further: I am currently trying DQN approaches but was…
user9317212
7
votes
2 answers

Origins of the name of convolutional neural networks

Convolutional neural networks (CNNs) contain convolutional layers. In modern deep learning libraries such as TensorFlow and PyTorch, convolutional layers are implemented by using the cross-correlation operator instead of the convolution operator.…
mikkola
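The distinction this question draws between cross-correlation and convolution can be made concrete with a small 1-D sketch in NumPy. The helper names `cross_correlate_1d` and `convolve_1d` and the example arrays are made up for illustration; the only difference between the two operations is whether the kernel is flipped before sliding it over the input.

```python
import numpy as np


def cross_correlate_1d(signal, kernel):
    """Valid-mode cross-correlation: slide the kernel as-is (no flip)."""
    k = len(kernel)
    n_out = len(signal) - k + 1
    return np.array([np.dot(signal[i:i + k], kernel) for i in range(n_out)])


def convolve_1d(signal, kernel):
    """Valid-mode convolution: flip the kernel, then cross-correlate."""
    return cross_correlate_1d(signal, kernel[::-1])


# Illustrative data: an asymmetric kernel makes the difference visible.
signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])

xcorr = cross_correlate_1d(signal, kernel)  # [-2., -2., -2.]
conv = convolve_1d(signal, kernel)          # [ 2.,  2.,  2.]

# NumPy's built-ins follow the same definitions.
assert np.allclose(xcorr, np.correlate(signal, kernel, mode="valid"))
assert np.allclose(conv, np.convolve(signal, kernel, mode="valid"))
```

For a symmetric kernel the two results coincide, and because a learned kernel can absorb the flip, a network trained with either operator can represent the same functions.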
  • 579
  • 2
  • 10