Questions tagged [monte-carlo-tree-search]

For questions related to Monte Carlo Tree Search (MCTS), which is a best-first, rollout-based tree search algorithm. MCTS gradually improves its evaluations of nodes in the trees using (semi-)random rollouts through those nodes, focusing a larger proportion of rollouts on the parts of the tree that are the most promising.

Monte-Carlo Tree Search (MCTS) is a best-first tree search algorithm. "Best-first" means that it focuses the majority of the search effort on the most promising parts of a search tree. This is a popular algorithm for automated game-playing in Artificial Intelligence (AI), but also has applications outside of AI.

Older algorithms, such as minimax and alpha-beta-pruning, evaluate a node by systematically searching all the nodes below it, which can be very computationally expensive. The basic idea of MCTS is to rapidly approximate such systematic evaluations of nodes by using (semi-)random rollouts and averaging the evaluations at the end of those rollouts. Over time, the algorithm will focus a larger amount of search effort on parts of the search tree that appear promising based on the earlier rollouts. This enables the algorithm to gradually improve the approximations of the evaluations in those parts of the tree. Additionally, the algorithm gradually grows a tree by storing a number (typically 1) of nodes in memory for every rollout. The parts of the tree that are stored in memory are not traversed using a (semi-)random strategy, but using a different ("Selection") strategy. This guarantees that, given an infinite amount of computation time, the algorithm converges to optimal solutions.

An extensive survey, describing many enhancements and applications of the algorithm, can be found in:

  • Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P. I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., and Colton, S. (2012). A Survey of Monte Carlo Tree Search Methods. Computation Intelligence and AI in Games, IEEE Transactions on, Vol. 4, No. 1, pp. 1-43.

The algorithm was originally proposed in:

  • Kocsis, L. and Szepesvári, C. (2006). Bandit Based Monte-Carlo Planning. Proceedings of the 17th European Conference on Machine Learning (ECML 2006) (eds. J. Fürnkranz, T. Scheffer, and M. Spiliopoulou), Vol. 4212 of Lecture Notes in Computer Science, pp. 282-293, Springer Berlin Heidelberg.
  • Coulom, R. (2007). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. Computers and Games (eds. H. J. van den Herik, P. Ciancarini, and H. H. L. M. Donkers), Vol. 4630 of Lecture Notes in Computer Science, pp. 72-83, Springer Berlin Heidelberg.
127 questions
8
votes
2 answers

What is the appropriate way to deal with multiple paths to same state in MCTS?

Many games have multiple paths to the same states. What is the appropriate way to deal with this in MCTS? If the state appears once in the tree, but with multiple parents, then it seems to be difficult to define back propagation: do we only…
Jay McCarthy
  • 225
  • 1
  • 4
6
votes
1 answer

When to expand and when to simulate in Monte Carlo Tree Search?

In Monte Carlo Tree Search (MCTS), we start at root node $R$. Then we select some leaf node $L$. And we expand $L$ by one or more child nodes and simulate from the child to the end of the game. When should we expand and when should we simulate in…
Soroush
  • 98
  • 1
  • 6
6
votes
1 answer

What should we do when the selection step selects a terminal state?

In Monte Carlo tree search, what should we do when the selection step selects a terminal state (i.e. a won or lost state), which is, by definition, a leaf node? Expansion and simulation is not in order, as it's game over, but does the tree…
degski
  • 163
  • 6
5
votes
2 answers

What is a simple game for validation of MCTS?

What is a simple turn-based game, that can be used to validate a Monte-Carlo Tree Search code and it's parameters? Before applying it to problems where I do not have a possiblity to validate its moves for correctness, I would like to implement a…
allo
  • 310
  • 1
  • 9
4
votes
1 answer

How to run a Monte Carlo Tree Search MCTS for stochastic environment?

For MCTS there is an expansion phase where we make a move and list down all the next states. But this is complicated by the fact that for some games, after making the move, there is a stochastic change to the environment. Consider the game 2048,…
xiaodai
  • 141
  • 3
4
votes
1 answer

Is the playout started from a leaf or child of leaf in Monte Carlo Tree Search?

On Wikipedia, the MCTS algorithm is described Selection: start from root $R$ and select successive child nodes until a leaf node $L$ is reached. A leaf is any node from which no simulation (playout) has yet been initiated. Expansion: create one (or…
Charlie
  • 43
  • 2
3
votes
2 answers

How can we efficiently and unbiasedly decide which children to generate in the expansion phase of MCTS?

When executing MCTS' expansion phase, where you create a number of child nodes, select one of the numbers, and simulate from that child, how can you efficiently and unbiasedly decide which child(ren) to generate? One strategy is to always generate…
Jay McCarthy
  • 225
  • 1
  • 4
2
votes
1 answer

Does Monte Carlo Tree Search not work on games without the same initial state?

I'm curious how you would apply Monte Carlo Tree Search to a game that has a random initial state. You generate a tree where the root node is the initial state, then you expand if the options from that state are not explored yet. I'm also wondering…
user8714896
  • 797
  • 1
  • 6
  • 24
2
votes
0 answers

Formulating MCTS with random outcomes of actions?

I am working on implementing MCTS for a scheduling problem where MCTS is formulated each time there are multiple jobs that need to be scheduled. When a job is executed, the resulting state of the system is random. The challenge I'm having is that…
hoffee
  • 131
  • 3
1
vote
1 answer

In MCTS, does a simulation create new nodes

I am trying to implement a Monte-Carlo-Tree-Search algorithm. My question is, during the simulation/playout/rollout phase, are new nodes added as children to the node C from which the simulation occurs, or is the only impact (on the tree) the…
JohnT
  • 11
  • 1
1
vote
1 answer

Adding a Transposition Table to Monte Carlo Tree Search

I think I'm having a bit of trouble wrapping my head around how a transposition table functions: As I understand it you can store a value (simulation result?) for a given game state in this (hash) table and use it instead of a simulation when that…
NG.
  • 139
  • 6
1
vote
0 answers

What method to use for Monte-Carlo Tree Search to prefer depth search

The basic Monte-Carlo Tree Search algorithm uses the tree policy: while v is nonterminal: if v is not fully expanded: expand v else: v = v.best_child By always expanding a node as long as it is expandable a lot of time is consumed to…
allo
  • 310
  • 1
  • 9