Most Popular
1500 questions
11 votes, 0 answers
Extending FaceNet’s triplet loss to object recognition
FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…
rossignol
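For concreteness, the loss this question wants to extend can be sketched in a few lines. This is a minimal pure-Python illustration of a FaceNet-style triplet loss on embedding vectors (squared Euclidean distances as in the paper; the function name, margin value, and plain-list vectors are illustrative assumptions, not the paper's code):

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss: push the anchor-positive distance
    below the anchor-negative distance by at least `margin`.
    Distances are squared Euclidean, as in the paper."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)

# A well-separated triplet incurs zero loss; a violating one does not.
print(triplet_loss([0.0, 0.0], [0.0, 0.0], [1.0, 0.0]))  # 0.0
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]))  # 0.2
```

Nothing in the formula is face-specific, which is what makes extending it to general object recognition plausible: only the embedding network and the triplet-mining strategy need to change.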
11 votes, 1 answer
What is the time complexity of the forward pass algorithm of a feedforward neural network?
How do I determine the time complexity of the forward pass algorithm of a feedforward neural network? How many multiplications are done to generate the output?
Artificial
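The multiplication count this question asks about has a closed form for a plain fully connected network with layer widths $n_0, n_1, \dots, n_L$: one forward pass performs $\sum_{i=1}^{L} n_{i-1} n_i$ multiplications (ignoring biases and activations). A minimal sketch, with an illustrative function name:

```python
def forward_pass_multiplications(layer_sizes):
    """Count multiplications in one forward pass of a fully connected
    network, ignoring bias additions and activation functions:
    each layer's weight matrix contributes n_in * n_out products."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# A 4-5-2 network: 4*5 + 5*2 = 30 multiplications.
print(forward_pass_multiplications([4, 5, 2]))  # 30
```

So the time complexity is linear in the total number of weights, which is why parameter count is the usual proxy for forward-pass cost.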
11 votes, 2 answers
What are the common myths associated with Artificial Intelligence?
What are some interesting myths about Artificial Intelligence, and what are the facts behind them?
fabien
11 votes, 3 answers
What algorithms are considered reinforcement learning algorithms?
What are the areas/algorithms that belong to reinforcement learning?
TD(0), Q-Learning and SARSA are all temporal-difference algorithms, which belong to the reinforcement learning area, but is there more to it?
Are the dynamic programming…
Miguel Saraiva
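The temporal-difference algorithms this question names differ only in what they bootstrap from, which a pair of one-step updates makes concrete. A minimal tabular sketch (the dict-of-dicts Q-table and the hyperparameter values are illustrative assumptions):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy next action."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next].values()) - Q[s][a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])

# Two states, two actions; state 1 already values action 'R' at 1.0.
Q = {0: {'L': 0.0, 'R': 0.0}, 1: {'L': 0.0, 'R': 1.0}}
q_learning_update(Q, 0, 'R', 0.0, 1)
# Q[0]['R'] is now 0.1 * (0.0 + 0.9 * 1.0) = 0.09
```

Beyond temporal-difference methods, the usual taxonomy also counts Monte Carlo methods, dynamic programming (when the model is known), and policy-gradient/actor-critic methods as reinforcement learning.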
11 votes, 4 answers
What are the domains where SVMs are still state-of-the-art?
It seems that deep neural networks and other neural network based models are dominating many current areas like computer vision, object classification, reinforcement learning, etc.
Are there domains where SVMs (or other models) are still producing…
Steven Davis
11 votes, 3 answers
Why is dot product attention faster than additive attention?
In section 3.2.1 of Attention Is All You Need the claim is made that:
Dot-product attention is identical to our algorithm, except for the scaling factor of $\frac{1}{\sqrt{d_k}}$. Additive attention computes the compatibility function using a…
user3180
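The speed difference the question asks about comes down to kernels: batched dot-product attention is two matrix multiplications plus a softmax, which map onto highly optimized GEMM routines, whereas additive attention evaluates a small feed-forward network for every query-key pair. A single-query, pure-Python sketch of the scaled dot-product form (function names and the toy vectors are ours):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(q, keys, values):
    """Scaled dot-product attention for one query vector: scores are
    plain dot products scaled by 1/sqrt(d_k); the batched version is
    just two matrix multiplications around a softmax."""
    d_k = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
              for k in keys]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]

# The query matches the first key, so the first value dominates.
out = dot_product_attention([1.0, 0.0],
                            [[1.0, 0.0], [0.0, 1.0]],
                            [[1.0, 0.0], [0.0, 1.0]])
```

The $\frac{1}{\sqrt{d_k}}$ scaling is exactly the factor the quoted passage mentions; it keeps the pre-softmax scores from growing with dimension.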
11 votes, 1 answer
What is the meaning of $V(D,G)$ in the GAN objective function?
Here is the GAN objective function.
$$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log (1-D(G(\boldsymbol{z})))]$$
i_rezic
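The quantity $V(D,G)$ in the objective above is just a value: two expectations that can be estimated by averaging over samples. A minimal Monte-Carlo sketch (the callables `discriminator` and `generator` and the sample lists are illustrative stand-ins, not a real GAN):

```python
import math

def gan_value(discriminator, generator, real_samples, noise_samples):
    """Monte-Carlo estimate of V(D, G): the mean log-score D assigns to
    real data, plus the mean log of (1 - D(G(z))) on generated data."""
    real_term = sum(math.log(discriminator(x))
                    for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1.0 - discriminator(generator(z)))
                    for z in noise_samples) / len(noise_samples)
    return real_term + fake_term

# A discriminator that always outputs 0.5 (the theoretical optimum when
# the generator matches the data) gives V = log(1/2) + log(1/2) = -2 ln 2.
v = gan_value(lambda x: 0.5, lambda z: z, [0.0, 1.0], [0.3, 0.7])
```

$D$ tries to make this value large (score real data near 1, fakes near 0) while $G$ tries to make it small, which is what the $\min_G \max_D$ notation encodes.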
11 votes, 1 answer
Are there neural networks with very few nodes that decently solve non-trivial problems?
I'm interested in knowing whether there exists any neural network that solves (with >= 80% accuracy) a non-trivial problem while using very few nodes (where 20 nodes is not a hard limit). I want to develop an intuition about the sizes of neural networks.
Guillermo Mosse
11 votes, 2 answers
What are the segment embeddings and position embeddings in BERT?
The paper only mentions that the position embeddings are learned, which is different from what was done in ELMo.
ELMo paper - https://arxiv.org/pdf/1802.05365.pdf
BERT paper - https://arxiv.org/pdf/1810.04805.pdf
Skinish
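Mechanically, BERT's input representation is just an elementwise sum of three learned lookup tables: token, segment, and position embeddings. A minimal sketch (the random tables stand in for trained parameters; dimensions and names are illustrative assumptions):

```python
import random

random.seed(0)
d_model = 8
vocab_size, n_segments, max_len = 100, 2, 16

def table(rows):
    """A learned embedding table; random floats stand in for
    trained parameters here."""
    return [[random.uniform(-1, 1) for _ in range(d_model)]
            for _ in range(rows)]

token_emb = table(vocab_size)
segment_emb = table(n_segments)    # segment A vs. segment B
position_emb = table(max_len)      # learned, not sinusoidal

def bert_input(token_ids, segment_ids):
    """BERT's input representation: for each position, the elementwise
    sum of token, segment, and learned position embeddings."""
    return [
        [t + s + p for t, s, p in zip(token_emb[tok],
                                      segment_emb[seg],
                                      position_emb[pos])]
        for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids))
    ]
```

The segment embedding tells the model which of the two packed sentences a token belongs to; the learned position table is the part that differs from ELMo's approach.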
10 votes, 2 answers
Can machine learning algorithms be used to differentiate between small differences in details between images?
I was wondering if machine learning algorithms (CNNs?) can be used/trained to differentiate between small differences in details between images (such as slight differences in shades of red or other colours, or the presence of small objects between…
The Pointer
10 votes, 1 answer
Is AlphaZero an example of an AGI?
From DeepMind's research paper on arxiv.org:
In this paper, we apply a similar but fully generic algorithm, which we call AlphaZero, to the games of chess and shogi as well as Go, without any additional domain knowledge except the rules of the…
Siddhartha
10 votes, 2 answers
What are the limitations of the hill climbing algorithm and how to overcome them?
What are the limitations of the hill climbing algorithm? How can we overcome these limitations?
Abbas Ali
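The classic limitations behind this question are local maxima, plateaus, and ridges, and the classic mitigation is random restarts. A minimal sketch over an integer domain (the two functions and the toy landscape are illustrative):

```python
def hill_climb(f, neighbors, start, max_steps=1000):
    """Greedy ascent: move to the best strictly-better neighbor until
    none exists. Can stall on a local maximum, plateau, or ridge."""
    x = start
    for _ in range(max_steps):
        better = [n for n in neighbors(x) if f(n) > f(x)]
        if not better:
            return x
        x = max(better, key=f)
    return x

def hill_climb_restarts(f, neighbors, starts):
    """Random-restart hill climbing: run from several starting points
    and keep the best result, trading compute for robustness."""
    return max((hill_climb(f, neighbors, s) for s in starts), key=f)

# A 1-D landscape with a local peak at index 2 and the global peak at 7.
scores = [0, 1, 2, 1, 0, 5, 6, 7, 6, 0, 0]
f = lambda x: scores[x]
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n < len(scores)]
print(hill_climb(f, neighbors, 0))             # 2 (stuck on local peak)
print(hill_climb_restarts(f, neighbors, [0, 6]))  # 7 (global peak found)
```

Other standard remedies include simulated annealing (accept occasional downhill moves) and stochastic or first-choice variants for large neighborhoods.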
10 votes, 4 answers
Has the spontaneous emergence of replicators been modeled in Artificial Life?
One of the cornerstones of The Selfish Gene (Dawkins) is the spontaneous emergence of replicators, i.e. molecules capable of replicating themselves.
Has this been modeled in silico in open-ended evolutionary/artificial life simulations?
Systems like…
sihubumi
10 votes, 2 answers
How to handle rectangular images in convolutional neural networks?
Almost all the convolutional neural network architectures I have come across have a square input size for the image, like $32 \times 32$, $64 \times 64$ or $128 \times 128$. Ideally, we might not have a square image for all kinds of scenarios. For…
Santhosh Dhaipule Chandrakanth
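One part of the answer is that nothing in a convolution requires square inputs: height and width flow through the standard output-size formula independently. A minimal sketch of that arithmetic (the function name and parameter defaults are ours):

```python
def conv2d_output_shape(h, w, kernel=3, stride=1, padding=1):
    """Spatial output size of a square-kernel 2-D convolution; the
    height and width computations are independent, so H == W is
    never required."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w

# A 64x128 input passes through a strided conv just like a square one:
print(conv2d_output_shape(64, 128, kernel=3, stride=2, padding=1))  # (32, 64)
```

Square sizes are a dataset convention, not an architectural constraint; fully convolutional backbones accept rectangles directly, and only fixed-size fully connected heads force resizing, padding, or adaptive pooling.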
10 votes, 1 answer
Loss jumps abruptly when I decay the learning rate with Adam optimizer in PyTorch
I'm training an auto-encoder network with the Adam optimizer (with amsgrad=True) and MSE loss for a single-channel audio source separation task. Whenever I decay the learning rate by a factor, the network loss jumps abruptly and then decreases until the…
imflash217