Highest Voted Questions - Artificial Intelligence Stack Exchange

5

votes

2 answers

Is there any way to draw a neural network's connections in a nice way?

I've been working with neural networks and artificial intelligence for a while. What I'm trying to do right now is, starting from a genotype I have (a sum of sensors, neurons and actuators), to draw what the neural network looks like (with recurrent/recursive connections…

asked Jun 12 '17 at 20:48

MiGu3X

53
3

5

votes

1 answer

What is the difference between simple reflex and model-based agents?

What is the difference between simple reflex and model-based agents? What is the role of the internal state in the case of model-based agents?

asked May 29 '17 at 11:17

Pierre P.

161
1
2

5

votes

1 answer

How can one find / collect data for, and come up with ideas for, using Deep Learning / AI to improve one's everyday life?

I can see a lot of tutorials and examples about using TensorFlow and other free, open-source AI/ML/DL frameworks at the enterprise level, where enough data has been collected for such AI solutions. How can one collect enough data in normal everyday life…

asked May 26 '17 at 15:31

user6933

5

votes

2 answers

Feasibility of an AI assistant to expedite game development?

Basically, an AI that can create, rig, and texture 3D models and game environments (by extrapolating from collections of reference models, according to user input), and that can set up physics and mechanics (assuming that the AI has access to a 3d…

ai-design

asked May 23 '17 at 21:15

Sebastian Hahn

119
2

5

votes

1 answer

Precise localization and characterization of rudimentary shapes with neural networks

I understand that there are flavors of (convolutional) neural networks that are useful for object localization and detection tasks of reasonable difficulty. In all of the examples I have seen so far, localization is formulated as finding the corners…

asked May 10 '17 at 20:07

mbaytas

151
3

5

votes

1 answer

What is the difference between an on-policy distribution and state visitation frequency?

On-policy distribution is defined as follows in Sutton and Barto: On the other hand, state visitation frequency is defined as follows in Trust Region Policy Optimization: $$\rho_{\pi}(s) = \sum_{t=0}^{T} \gamma^t P(s_t=s|\pi)$$ Question: What is…

asked Dec 08 '21 at 10:36

user529295

369
2
10
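
For readers skimming the formula quoted in the excerpt above, here is a minimal sketch of how $\rho_{\pi}(s) = \sum_{t=0}^{T} \gamma^t P(s_t=s|\pi)$ could be evaluated for a small finite Markov chain. The function name `discounted_visitation`, the transition matrix `P` (assumed to already be the state-to-state matrix induced by the policy), the initial distribution `mu0`, and the toy numbers are all illustrative assumptions, not taken from the question or from the cited papers.

```python
import numpy as np

def discounted_visitation(P, mu0, gamma, T):
    """Evaluate rho(s) = sum_{t=0}^{T} gamma^t * P(s_t = s | pi).

    P     : (n, n) row-stochastic matrix; P[s, s2] is the probability of
            moving from s to s2 under the policy (assumed given).
    mu0   : (n,) distribution of the initial state s_0 (assumed given).
    gamma : discount factor.
    T     : last time step included in the sum.
    """
    P = np.asarray(P, dtype=float)
    d = np.asarray(mu0, dtype=float)   # distribution of s_t, starting at t = 0
    rho = np.zeros_like(d)
    for t in range(T + 1):
        rho += (gamma ** t) * d        # add gamma^t * P(s_t = s | pi)
        d = d @ P                      # advance to the distribution of s_{t+1}
    return rho

# Hypothetical two-state chain, used only for illustration.
P = [[0.9, 0.1],
     [0.5, 0.5]]
mu0 = [1.0, 0.0]
rho = discounted_visitation(P, mu0, gamma=0.9, T=50)
print(rho)
```

Note that, as written, this quantity is not normalized: its entries sum to $(1-\gamma^{T+1})/(1-\gamma)$ rather than to 1.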

5

votes

2 answers

Why do Transformers have a sequence limit at inference time?

As far as I understand, the Transformer's time complexity increases quadratically with respect to the sequence length. As a result, to make training feasible, a maximum sequence limit is set during training, and, to allow batching, all sequences smaller…

asked Nov 26 '21 at 15:32

chessprogrammer

2,890
2
15
26

5

votes

1 answer

Do we use validation and test sets for training a reinforcement learning agent?

I am pretty new to reinforcement learning and was working with some code for the PPO and DQN algorithms. After looking at the code, I noticed that the authors did not include any code to set up a validation or testing dataloader. In most other…

asked Nov 22 '21 at 15:24

krishnab

207
2
8

5

votes

1 answer

How can I estimate how many photos I need to train ResNet-50 for image classification?

I am working on a project where I have to classify around 1000 unique objects. I'm trying to plan how much training data I will need to collect. I was planning on using ResNet-50. Is there any way I can estimate the number of photos I should plan to…

asked Nov 16 '21 at 14:56

Tyler Hilbert

145
5

5

votes

2 answers

Why does the activation function for a hidden layer in an MLP have to be non-polynomial?

Multiple pieces of literature describing MLPs or the universal approximation theorem are very specific about the activation function being non-polynomial. Is there a reason why it cannot be a higher-order…

asked Nov 15 '21 at 11:53

niil87

53
3

5

votes

1 answer

In TD(0) with linear function approximation, why is the gradient of $\hat v(S^{\prime}, \mathbf w)$ wrt parameters $\mathbf w$ not considered?

I am reading these slides. On page 38, the update for the parameters for the linear function approximation of TD(0) is given. I have a doubt regarding this. The cost function (RMSE) is given on page 37. My doubt is: why is the gradient of $\hat…

asked Nov 14 '21 at 10:17

A Yoghes

53
4

5

votes

3 answers

Use of machine learning for analyzing companies listed on the stock market

Can current trends and tools in the field of machine learning replicate the complexity of the financial market? If yes, what tools are available in this domain? Q. I am trying to build a model to infer results from the stock market using the…

asked Apr 28 '17 at 11:42

parth

161
5

5

votes

1 answer

Why isn't a target network used for the critic in on-policy actor-critic methods?

Based on my research, I've seen many on-policy AC approaches that utilise a critic network to estimate the value function $V$. The Bellman equation for the value function is as below: $$ V_\pi(s_t) = \sum_a \pi(a|s_t)\sum_{r,…

asked Nov 07 '21 at 16:12

Green Falcon

170
2
10

5

votes

3 answers

What is the intermediate (dense) layer between the attention-output and encoder-output dense layers within the transformer block in the PyTorch implementation?

In PyTorch, transformer (BERT) models have an intermediate dense layer in between the attention and output layers, whereas the BERT and Transformer papers just mention the attention connected directly to the output fully connected layer for the encoder just…

asked Oct 25 '21 at 20:05

mohammad ali Humayun

51
3

5

votes

2 answers

Do we need automatic hyper-parameter tuning when we have a large enough dataset?

Hyperparameter tuning is the process of selecting the optimal hyperparameters for an ANN. Now, my guess is that, if we have sufficient data (say, 1.4 million for, say, 6 features), the model can be optimally trained and we don't need a…

asked Oct 17 '21 at 18:10

user366312

311
1
12
