Most Popular

1500 questions
11 votes · 0 answers

Extending FaceNet’s triplet loss to object recognition

FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…
asked by rossignol
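
Not part of the question, but for orientation: a minimal NumPy sketch of the triplet loss the excerpt describes, assuming squared Euclidean distances and a hypothetical margin of 0.2 (the paper mines triplets over 128-D embeddings; the vectors here are random stand-ins).

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull same-identity pairs together and push different-identity
    pairs apart by at least `margin` (hinge on squared L2 distances)."""
    d_pos = np.sum((anchor - positive) ** 2)   # distance to an embedding of the same identity
    d_neg = np.sum((anchor - negative) ** 2)   # distance to an embedding of a different identity
    return max(d_pos - d_neg + margin, 0.0)

# Toy 128-D embeddings standing in for FaceNet outputs
rng = np.random.default_rng(0)
anchor, positive, negative = (rng.normal(size=128) for _ in range(3))
print(triplet_loss(anchor, positive, negative))
```
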
11 votes · 1 answer

What is the time complexity of the forward pass algorithm of a feedforward neural network?

How do I determine the time complexity of the forward pass algorithm of a feedforward neural network? How many multiplications are done to generate the output?
asked by Artificial
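
As a rough aid to the question above, a sketch that counts multiplications for one forward pass, assuming plain fully connected layers and ignoring biases and activation functions: the count is $\sum_i n_{i-1} n_i$ over consecutive layer widths.

```python
def forward_pass_multiplications(layer_sizes):
    """Multiplications in one dense forward pass (weights only,
    no biases or activation costs): sum of n_in * n_out per layer."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Example: a 784-100-10 network needs 784*100 + 100*10 = 79,400 multiplications
print(forward_pass_multiplications([784, 100, 10]))
```
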
11 votes · 2 answers

What are the common myths associated with Artificial Intelligence?

What are some interesting myths about Artificial Intelligence, and what are the facts behind them?
asked by fabien
11 votes · 3 answers

What algorithms are considered reinforcement learning algorithms?

What are the areas/algorithms that belong to reinforcement learning? TD(0), Q-Learning and SARSA are all temporal-difference algorithms, which belong to the reinforcement learning area, but is there more to it? Are the dynamic programming…
asked by Miguel Saraiva
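
To make the temporal-difference family named in the question concrete, a minimal tabular Q-learning sketch; the environment interface (reset() and step() returning next state, reward, done) is a hypothetical stand-in rather than a specific library's API.

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.
    The TD target uses the greedy next action (off-policy), which is
    the main difference from SARSA's on-policy update."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = (random.choice(actions) if random.random() < eps
                      else max(actions, key=lambda a: Q[(state, a)]))
            next_state, reward, done = env.step(action)
            best_next = max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```
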
11 votes · 4 answers

What are the domains where SVMs are still state-of-the-art?

It seems that deep neural networks and other neural network based models are dominating many current areas like computer vision, object classification, reinforcement learning, etc. Are there domains where SVMs (or other models) are still producing…
asked by Steven Davis
11 votes · 3 answers

Why is dot product attention faster than additive attention?

In section 3.2.1 of Attention Is All You Need the claim is made that: Dot-product attention is identical to our algorithm, except for the scaling factor of $\frac{1}{\sqrt{d_k}}$. Additive attention computes the compatibility function using a…
asked by user3180
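
A plain-NumPy sketch of the scaled dot-product attention the question refers to, $\mathrm{softmax}(QK^T/\sqrt{d_k})V$; the paper's speed argument is that this reduces to a softmax plus matrix multiplications, whereas additive attention scores every query-key pair with a small feed-forward network.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for 2-D query/key/value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n_queries, n_keys) compatibilities
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V

Q = np.random.rand(4, 64); K = np.random.rand(6, 64); V = np.random.rand(6, 64)
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 64)
```
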
11 votes · 1 answer

What is the meaning of $V(D,G)$ in the GAN objective function?

Here is the GAN objective function. $$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log…
asked by i_rezic
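
For reference, the full minimax objective that the excerpt truncates, as written in Goodfellow et al. (2014):
$$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text{data}}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log (1-D(G(\boldsymbol{z})))]$$
Here $V(D, G)$ is the value of the two-player game: the discriminator $D$ tries to maximize it while the generator $G$ tries to minimize it.
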
11 votes · 1 answer

Are there neural networks with very few nodes that decently solve non-trivial problems?

I'm interested in knowing whether there exists any neural network that solves (with >= 80% accuracy) a nontrivial problem while using very few nodes (20 nodes is not a hard limit). I want to develop an intuition about the sizes of neural networks.
asked by Guillermo Mosse
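
One classic illustration (not from the question itself) of how little a network can need: two hidden units with hand-set, non-learned weights compute XOR exactly.

```python
import numpy as np

step = lambda z: (z > 0).astype(float)   # hard threshold activation

def tiny_xor(x):
    """2-input, 2-hidden-unit, 1-output network: hidden unit 1 computes OR,
    hidden unit 2 computes AND, and the output fires for OR-but-not-AND."""
    hidden = step(x @ np.array([[1.0, 1.0], [1.0, 1.0]]) + np.array([-0.5, -1.5]))
    return step(hidden @ np.array([1.0, -1.0]) - 0.5)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(tiny_xor(X))   # [0. 1. 1. 0.]
```
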
11 votes · 2 answers

What are the segment embeddings and position embeddings in BERT?

The paper only mentions that the position embeddings are learned, which is different from what was done in ELMo. ELMo paper - https://arxiv.org/pdf/1802.05365.pdf BERT paper - https://arxiv.org/pdf/1810.04805.pdf
asked by Skinish
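
A minimal sketch of how BERT builds its input representation: the element-wise sum of three learned lookup tables (token, segment, and position embeddings). The sizes and token IDs below match bert-base-uncased but are used here only as illustrative constants.

```python
import numpy as np

vocab_size, max_len, hidden = 30522, 512, 768
tok_emb = np.random.randn(vocab_size, hidden) * 0.02   # one row per WordPiece token
seg_emb = np.random.randn(2, hidden) * 0.02            # sentence A vs. sentence B
pos_emb = np.random.randn(max_len, hidden) * 0.02      # learned positions (not sinusoidal)

token_ids   = np.array([101, 7592, 2088, 102])         # illustrative IDs for [CLS] hello world [SEP]
segment_ids = np.array([0, 0, 0, 0])                   # every token belongs to segment A
positions   = np.arange(len(token_ids))

inputs = tok_emb[token_ids] + seg_emb[segment_ids] + pos_emb[positions]
print(inputs.shape)   # (4, 768), one vector per input token
```
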
10 votes · 2 answers

Can machine learning algorithms be used to differentiate between small differences in details between images?

I was wondering whether machine learning algorithms (CNNs?) can be used/trained to distinguish small differences in detail between images (such as slight differences in shades of red or other colours, or the presence of small objects between…
asked by The Pointer
10 votes · 1 answer

Is AlphaZero an example of an AGI?

From DeepMind's research paper on arxiv.org: In this paper, we apply a similar but fully generic algorithm, which we call AlphaZero, to the games of chess and shogi as well as Go, without any additional domain knowledge except the rules of the…
asked by Siddhartha
10 votes · 2 answers

What are the limitations of the hill climbing algorithm and how to overcome them?

What are the limitations of the hill climbing algorithm? How can we overcome these limitations?
asked by Abbas Ali
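
A short sketch of plain hill climbing and of random restarts, one standard way to get past the local maxima the question asks about; the 1-D objective below is a made-up toy with one local and one global maximum.

```python
import random

def hill_climb(objective, neighbors, start, max_steps=1000):
    """Greedy ascent: move to the best neighbor until none improves.
    Gets stuck on local maxima, plateaus and ridges."""
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=objective, default=current)
        if objective(best) <= objective(current):
            return current                    # no improving neighbor: local maximum
        current = best
    return current

def random_restart_hill_climb(objective, neighbors, random_start, restarts=20):
    """Run hill climbing from several random starting points and keep the best result."""
    return max((hill_climb(objective, neighbors, random_start())
                for _ in range(restarts)), key=objective)

# Toy objective: local maximum of 1 near x = -1, global maximum of 3 near x = 2
f = lambda x: -(x - 2) ** 2 + 3 if x > 0 else -(x + 1) ** 2 + 1
neighbors = lambda x: [x - 0.1, x + 0.1]
print(random_restart_hill_climb(f, neighbors, lambda: random.uniform(-5, 5)))
```
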
10 votes · 4 answers

Has the spontaneous emergence of replicators been modeled in Artificial Life?

One of the cornerstones of The Selfish Gene (Dawkins) is the spontaneous emergence of replicators, i.e. molecules capable of replicating themselves. Has this been modeled in silico in open-ended evolutionary/artificial life simulations? Systems like…
asked by sihubumi
10 votes · 2 answers

How to handle rectangular images in convolutional neural networks?

Almost all the convolutional neural network architectures I have come across have a square image input size, like $32 \times 32$, $64 \times 64$ or $128 \times 128$. In practice, we might not have a square image in all kinds of scenarios. For…
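
A minimal PyTorch sketch (an illustrative architecture, not one from the question): convolution and pooling layers do not care whether the input is square; only a fixed-size classifier head does, and adaptive pooling is one common way to accept rectangular inputs such as 64 x 128.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # collapses any H x W feature map to 1 x 1
    nn.Flatten(),
    nn.Linear(32, 10),         # classifier head no longer depends on the input size
)

x = torch.randn(1, 3, 64, 128)   # a rectangular (non-square) image batch
print(model(x).shape)            # torch.Size([1, 10])
```
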
10 votes · 1 answer

Loss jumps abruptly when I decay the learning rate with Adam optimizer in PyTorch

I'm training an auto-encoder network with the Adam optimizer (amsgrad=True) and MSE loss for a single-channel audio source separation task. Whenever I decay the learning rate by a factor, the network loss jumps abruptly and then decreases until the…
asked by imflash217
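
A minimal PyTorch sketch of the setup the question describes (Adam with amsgrad=True, MSE loss, and a stepwise learning-rate decay); the tiny model and random data are hypothetical placeholders, not the asker's audio separation network.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 128))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.MSELoss()

for epoch in range(30):
    x = torch.randn(32, 128)              # dummy batch standing in for audio features
    loss = criterion(model(x), x)         # auto-encoder style reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                      # learning rate drops by 10x every 10 epochs
```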