9.2 Rollouts in GNU Backgammon
In GNU Backgammon the Rollout function
implements the procedure described above, with the following
improvements:
- Truncation: instead of rolling out all the way to the end of the
game, it can stop and pretend its evaluation after a few plies is
perfect. This may obviously introduce some amount of systematic
error, but in practice this may not matter because:
- it makes rollouts much faster, which means you can do more of
them (and thus trade sampling error for systematic error);
- different positions will be reached in different trials, so the
correlation between errors in each trial weakens and the errors
cancel out to some extent;
- if you are rolling out the positions after making different
plays, then any remaining systematic error between the two rollouts
is likely to be somewhat correlated and so the error in the
comparison between the plays is hopefully small. This implies that
truncated rollouts are better for estimating
relative equity (which is the better
move here, 13/10*/9 or 13/10* 6/5*?) than
absolute equity (at this match
score I need 29% wins to accept a dead cube; can I take in this
position?).
- Race database truncation: when the game enters its 2-sided
bearoff database, GNU Backgammon can
estimate the probability of winning from that position with no error
at all (it can play and evaluate endgame positions perfectly), which
saves time and avoids introducing the errors that can result from
large equity variances at the end of the game.
- Variance reduction: when using lookahead evaluations, it can
reduce errors by making use of the equity difference from one ply to
the next. (This can be interpreted as either canceling out the
estimated luck (i.e. the difference in equity
evaluations before and after rolling) or using subsequent
evaluations to estimate the error in prior ones; the two views are
equivalent). GNU Backgammon automatically
performs variance reduction when looking ahead at least one ply.
- Stratified sampling: uses quasi-random number generation instead
of pseudo-random number generation (this is a standard technique in
Monte Carlo simulations where having a near-perfect uniform
distribution in your sample is more important than
unpredictability). GNU Backgammon only
stratifies the first 2 plies of a rollout, though it would be easy
enough to extend it to the remainder.