Most Popular

1500 questions
6
votes
1 answer

Why do most imperfect-information games use non-machine-learning AI?

To provide a bit of context, I'm a software engineer and game enthusiast (card games, especially). I've always been interested in AI for games. In college, I programmed my own Gomoku AI, so I'm a bit familiar with the basic…
6
votes
1 answer

What are good learning strategies for Deep Q-Network with opponents?

I am trying to find good learning strategies for a Deep Q-Network trained against opponents. Let's consider the well-known game Tic-Tac-Toe as an example: How should an opponent be implemented to get good and fast improvements? Is it better to…
murthy10
6
votes
1 answer

Does training happen during NEAT?

When one uses NEAT to evolve the best fitting network for a task, does training take place in each epoch as well? If I understand correctly, training is the adjustment of the weights of the neural network via backpropagation and gradient descent.…
Alexus
6
votes
1 answer

How to deal with different actions for different states of the environment?

I'm new to AI/machine learning and was playing around with OpenAI Gym a bit. When looking through the environments, I came across the Blackjack-v0 environment, which is a basic implementation of the game where the state is the hand count of the…
6
votes
1 answer

Understanding multi-iteration updates of the model in the Proximal Policy Optimization algorithm

I have a general question about the updating of the network/model in the PPO algorithm. If I understand it correctly, there are multiple iterations of weight updates done on the model with data that is created from the environment (with the model…
6
votes
1 answer

Which neural networks are suitable for visual place recognition?

I am doing a project on visual place recognition in changing environments. The CNN used here is mostly AlexNet, and a feature vector is constructed from layer 3. Does anyone know of similar work using other CNNs, for example, VGGnet (which I am…
6
votes
2 answers

Should the actor or actor-target model be used to make predictions after training is complete (DDPG)?

The situation: I am referring to the paper by T. P. Lillicrap et al., "Continuous control with deep reinforcement learning", where they discuss deep learning in the context of continuous action spaces ("Deep Deterministic Policy Gradient"). Based on the…
a_guest
6
votes
2 answers

What is the current state of AGI development?

Could you please provide some insight into the current state of development in the AGI area? Are there any projects that have had breakthroughs recently? Is there a news source to follow on this topic?
Alex
6
votes
2 answers

How can I formulate the map colouring problem as a hill climbing search problem?

I have a map. I need to colour it with $k$ colours, such that two adjacent regions do not share a colour. How can I formulate the map colouring problem as a hill climbing search problem?
jrk
6
votes
1 answer

How to implement an exploration function and learning rate in Q-learning

I'm trying to implement Q-learning (state-based representation and no neural / deep stuff) but I'm having a hard time getting it to learn anything. I believe my issue is with the exploration function and/or learning rate. Thing is, I see different…
6
votes
4 answers

What else can boost iterative deepening with alpha-beta pruning?

I read about minimax, then alpha-beta pruning, and then about iterative deepening. Iterative deepening coupled with alpha-beta pruning proves to be quite efficient compared to alpha-beta alone. I have implemented a game agent that uses iterative…
Suhail Gupta
6
votes
1 answer

How are the kernels initialized in a convolutional neural network?

I am currently learning about CNNs. I am confused about how filters (aka kernels) are initialized. Suppose that we have a $3 \times 3$ kernel. How are the values of this filter initialized before training? Do you just use predefined image kernels?…
Inkplay_
6
votes
1 answer

What is a weighted average in a non-stationary k-armed bandit problem?

In the book Reinforcement Learning: An Introduction (page 25), by Richard S. Sutton and Andrew G. Barto, there is a discussion of the k-armed bandit problem, where the expected reward from the bandits changes slightly over time (that is, the problem…
chessprogrammer
6
votes
2 answers

Neural network for data visualization

At my work, we're currently doing some research into data visualisation for highly interconnected data, basically graphs. We've been implementing all sorts of different layouts and trying to see which fits best, but, due to the nature of the problem…
tiansivive
6
votes
1 answer

Is one big network faster than several small ones?

The basis of my question is that a CNN that does great on MNIST is far smaller than a CNN that does great on ImageNet. Clearly, as the number of potential target classes increases, along with image complexity (background, illumination, etc.), the…
pshlady