Formulating MCTS with random outcomes of actions?

Asked Nov 20 '19 at 21:05

I am implementing MCTS for a scheduling problem, where a new search is formulated each time multiple jobs need to be scheduled. When a job is executed, the resulting state of the system is random. The challenge is that the implementation I'm currently using relies on being able to determine whether a node is fully expanded, but the root node has so many children that it is not feasible to expect all of them ever to be visited. Is there a suggested method of conducting MCTS when nodes are unlikely ever to be fully expanded?

asked Nov 20 '19 at 21:05

hoffee

Welcome to AI.SE @hoffee. I think you're asking how to perform MCTS when actions have non-deterministic outcomes. Is that the essence of your question, or are you trying to ask something else? – John Doucette Nov 20 '19 at 23:22
Yes, that is what I'm asking. – hoffee Nov 21 '19 at 14:01
There is a great paper, "A Survey of Monte Carlo Tree Search Methods". Read the chapter on stochastic MCTS. In short: introduce "random nodes", a fake player that takes random actions. – mirror2image Nov 22 '19 at 09:00
I am aware of that paper but I will revisit that chapter, thanks for the suggestion. – hoffee Nov 22 '19 at 16:25
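
For readers following the "random nodes" suggestion in the comments, here is a minimal sketch of one way it could look. It is not taken from the survey or from the asker's code. The toy model (`legal_actions`, `sample_outcome`, the "safe" and "risky" actions, the -1 reward per step) and all class and function names are hypothetical, chosen only for illustration. The idea is that each action of a decision node leads to a random node whose children are created lazily from sampled outcomes, so the random node never needs a "fully expanded" test.

```python
import math
import random

class DecisionNode:
    """State in which the scheduler chooses an action."""
    def __init__(self, state, actions):
        self.state = state
        self.untried = list(actions)   # actions not yet tried from this node
        self.children = {}             # action -> RandomNode
        self.visits = 0

class RandomNode:
    """'Fake player': the environment picks the outcome, not the agent."""
    def __init__(self):
        self.children = {}             # sampled next state -> DecisionNode
        self.visits = 0
        self.total = 0.0               # sum of returns seen through this node

# Hypothetical toy model: state = number of jobs left, each step costs 1.
ACTIONS = ("safe", "risky")

def legal_actions(state):
    return ACTIONS if state > 0 else ()

def sample_outcome(state, action, rng):
    if action == "safe":
        nxt = state - 1                                   # always one job done
    else:
        nxt = max(state - 2, 0) if rng.random() < 0.3 else state
    return nxt, -1.0

def rollout(state, rng, depth):
    """Random playout used to evaluate a newly added state."""
    total = 0.0
    while state > 0 and depth > 0:
        state, reward = sample_outcome(state, rng.choice(ACTIONS), rng)
        total += reward
        depth -= 1
    return total

def select_action(node, c):
    """UCB1 over the actions already tried from this decision node."""
    def ucb(action):
        child = node.children[action]
        mean = child.total / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children, key=ucb)

def simulate(node, rng, depth, c=1.4):
    """Run one iteration from a decision node and return the sampled return."""
    node.visits += 1
    if depth == 0 or not legal_actions(node.state):
        return 0.0
    if node.untried:
        action = node.untried.pop()
        node.children[action] = RandomNode()
    else:
        action = select_action(node, c)
    chance = node.children[action]
    # The outcome is sampled from the model, never chosen by UCB.
    nxt, reward = sample_outcome(node.state, action, rng)
    if nxt in chance.children:
        ret = reward + simulate(chance.children[nxt], rng, depth - 1, c)
    else:
        chance.children[nxt] = DecisionNode(nxt, legal_actions(nxt))
        ret = reward + rollout(nxt, rng, depth - 1)
    chance.visits += 1
    chance.total += ret
    return ret

def search(root_state, iterations=2000, depth=20, seed=0):
    rng = random.Random(seed)
    root = DecisionNode(root_state, legal_actions(root_state))
    for _ in range(iterations):
        simulate(root, rng, depth)
    # Recommend the most-visited action at the root.
    best = max(root.children, key=lambda a: root.children[a].visits)
    return best, root
```

Calling `best, root = search(3)` returns the most-visited root action together with the tree. Note that this sketch still tries every action once at each decision node; it only removes the need to enumerate the random outcomes of an action.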
