[−][src]Crate recon_mcts
A recombining and concurrent implementation of Monte Carlo tree search (MCTS).
Features
-
Recombining: the tree efficiently handles the fact that multiple different sequences of actions can lead to the same state. Since nodes can have multiple parents, the underlying data representation is a directed acyclic graph. The evaluation of a new leaf node will be propagated backwards to all parent nodes, rather than simply to nodes on the path along which the leaf was reached. Nodes that are no longer reachable are pruned from the tree when an action is taken.
-
Concurrent: multiple worker threads can grow the tree at the same time. Threads that are waiting on a new leaf to be created will steal work from the thread responsible for creating the new leaf in order to prevent log jams along the hot path of the Monte Carlo tree. Additional threads become more beneficial with increases in the amount of time required to run
GameDynamics::score_leaf
and greater exploration entropy for a given state inGameDynamics::select_node
. -
Topologically aware: the sequence of nodes backpropagating their scores is based on the node's position in the tree. In the process of backpropagating a score from a leaf to the root of the tree, a node does not backpropagate its score to its parents until it has received all updates from its children. This is desirable because in a recombining tree, some paths from a leaf to a (grand)*parent may pass through more nodes than other paths. A parent node waits on information to arrive via all possible paths before pushing score updates to its own parents. Nevertheless, backpropagation leads to O(n^2) complexity even with topological awareness (i.e. the nth node could have n-1 parents, all of which must be updated by the MCTS algorithm). It is possible to increase the performance of deep trees by only backpropagating a score if it represents a meaningful update to the node's prior score. See
GameDynamics::backprop_scores
for further details. -
Simple and flexible: the interface has a high degree of flexibility and is easy to implement for various discrete move games. Games can have one or many players; different players can have different evaluation functions (e.g. a player could store its score as a string or bytes array while another player simply uses a float).
-
Configurable: Memory usage for game states that are memory intensive can be reduced by choosing to recompute them or to store them as a hash only (at the cost of a possible hash collision that results in a suboptimal decision). See
state_memory
. -
Safe and self-contained: the library is written entirely in safe Rust without external dependencies (other than the Rust Standard Library).
Usage
Using this library mainly requires implementing the GameDynamics
trait for a game.
Interacting with the Tree
is accomplished via the SearchTree
trait.
Use of the library via a trait object is made possible via DynGD
since GameDynamics
is
automatically implemented for types that dereference to types that implement DynGD
.
See the provided implementation of Nim and the high-level code below for a demonstration of the library interface.
use recon_mcts::prelude::*; struct MyGame; // `Clone` is generally optional, but required for some methods such as // `SearchTree::get_root_info` #[derive(Clone, Debug, Hash, PartialEq)] enum Player { P1, P2, } impl GameDynamics for MyGame { type Player = Player; type State = usize; type Action = usize; type Score = f64; type ActionIter = Vec<(Self::Player, Self::Action)>; fn available_actions( &self, player: &Self::Player, state: &Self::State, ) -> Option<Self::ActionIter> { todo!() } fn apply_action( &self, state: Self::State, action: &Self::Action, ) -> Option<Self::State> { todo!() } fn select_node<II, Q, A>( &self, parent_score: Option<&Self::Score>, parent_player: &Self::Player, parent_node_state: &Self::State, purpose: SelectNodeState, scores_and_actions: II, ) -> Self::Action where II: IntoIterator<Item = (Q, A)>, Q: Deref<Target = Option<Self::Score>>, A: Deref<Target = Self::Action>, { todo!() } fn backprop_scores<II, Q>( &self, player: &Self::Player, score_current: Option<&Self::Score>, child_scores: II, ) -> Option<Self::Score> where II: IntoIterator<Item = Q>, Q: Deref<Target = Self::Score>, { todo!() } fn score_leaf( &self, parent_score: Option<&Self::Score>, parent_player: &Self::Player, state: &Self::State, ) -> Option<Self::Score> { todo!() } } let tree = Tree::new(MyGame, GetState, Player::P1, 0); loop { for _ in 0..9 { tree.step(); } match tree.apply_best_action() { Status::Terminal => break, _ => { dbg!(tree.get_root_info()); } } }
Contribute
You can browse the repo and make contributions on GitHub.
Modules
state_memory | Provides mixins to statically configure how each node's state is stored in memory and compared for equality. |
Structs
ArcWrap | Newtype to allow for a custom |
GetState | Slower performance but better memory efficiency for large states. |
HashOnly | There's a chance of hash collision, which would mean that a parent node connects to an incorrect child node. |
Node | The fundamental type composing a |
NodeInfo | Contains information about a specific |
RegistryInfo | Contains information about a |
StoreState | Memory usage is state dependent (could use lots of storage if states are large). |
Tree | An acyclic collection of connected |
WeakWrap | Newtype wrapping a |
Enums
SelectNodeState | A flag indicating whether an action is being evaluated for exploration or exploitation. |
Status | Provides information about the result of applying a |
Traits
BaseGD | A trait that can be used to implemented |
DynGD | A supertrait of |
GameDynamics | Requires implementation by the user in order to provide the rules of the game and desired specifics of the algorithm. |
OnDrop | A trait used to remove nodes from the transposition table that are no longer reachable from the root. Generally for internal use. |
SearchTree | An interface to reduce the number of bounds required to use a |
StateMemory | A trait used to modify how states are stored in the transposition table. Generally for internal use. |
Type Definitions
ArcNode | Convenience type alias. |
NodeAlias | Convenience type alias. |
TreeAlias | Convenience type alias. |
WeakNode | Convenience type alias. |