
Monte Carlo Go Has a Way to Go

Monte Carlo Go Has a Way to Go. Adapted from the slides presented at AAAI 2006. Haruhiro Yoshimoto (*1), Kazuki Yoshizoe (*1), Tomoyuki Kaneko (*1), Akihiro Kishimoto (*2), Kenjiro Taura (*1). (*1) University of Tokyo, (*2) Future University Hakodate.


Presentation Transcript


  1. Monte Carlo Go Has a Way to Go Adapted from the slides presented at AAAI 2006 Haruhiro Yoshimoto (*1) Kazuki Yoshizoe (*1) Tomoyuki Kaneko (*1) Akihiro Kishimoto (*2) Kenjiro Taura (*1) (*1) University of Tokyo (*2) Future University Hakodate

  2. Games in AI • An ideal test bed for AI research • Clear results • Clear motivation • A good challenge • Search-based approaches have succeeded • Chess (1997, Deep Blue) • and other games • Not yet successful in the game of Go • "Go is to Chess as Poetry is to Double-entry accounting" • Go goes to the core of artificial intelligence, which involves the study of learning and decision-making, strategic thinking, knowledge representation, pattern recognition and, perhaps most intriguingly, intuition

  3. The game of Go • A 4,000-year-old board game from China • Standard board size is 19×19 • Two players, Black and White, place stones in turn • Stones cannot be moved, but they can be captured and taken off the board • The player with the larger territory wins

  4. Terminology of Go

  5. Playing Strength • A $1.2M prize was offered for beating a professional with no handicap (now expired!) • In 1997, Handtalk claimed $7,700 for winning an 11-stone handicap match against an 8-9-year-old master

  6. Difficulties in Computer Go • Large search space • the game becomes progressively more complex, at least for the first 100 ply

  7. Difficulties in Computer Go • Lack of a good evaluation function • A material advantage does not imply an easy path to victory; it may just mean that short-term gain has been given priority • Around 150–250 legal moves per position, of which usually fewer than 50 (often fewer than 10) are acceptable, but computers have a hard time distinguishing them • Human capacity to play well involves a very high degree of pattern recognition

  8. Why Monte Carlo Go? • Replace the evaluation function by random sampling [Brugmann 1993, Bouzy 2003] • Success in other domains: Bridge [Ginsberg 1999], Poker [Billings et al. 2002] • Reasonable position evaluation based on sampling: the cost drops from O(b^d) for full search to O(N·b·d) for N sample games (b: branching factor, d: game length) • Easy to parallelize • Can win against search-based approaches • Crazy Stone won the 11th Computer Olympiad in 9×9 Go • MoGo won the 19th and 20th KGS 9×9 tournaments and is rated highest on CGOS

  9. Basic idea of Monte Carlo Go • Generate the next moves by a 1-ply search • For each move, play a number of random games to the end and compute the expected score • Choose the move with the maximal score • The only domain-dependent information used is the eye (random playouts do not fill eyes); a minimal sketch follows
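The move-selection loop described on this slide is simple enough to sketch directly. Below is a minimal Python sketch under the assumption that the caller supplies hypothetical rule helpers (legal_moves, play, random_playout_score); these names are illustrative, not the authors' implementation, and legal_moves is assumed to already exclude eye-filling moves.

```python
import random

# Minimal sketch of basic Monte Carlo Go move selection.
# Assumed caller-supplied helpers (hypothetical, not the paper's API):
#   legal_moves(pos, color)          -> list of moves (eye-filling excluded)
#   play(pos, move, color)           -> new position after the move
#   random_playout_score(pos, color) -> final score of one random game

def monte_carlo_move(pos, color, legal_moves, play, random_playout_score,
                     n_samples=100):
    best_move, best_avg = None, float("-inf")
    for move in legal_moves(pos, color):
        # Average the scores of n_samples random games played after this move.
        total = sum(random_playout_score(play(pos, move, color), color)
                    for _ in range(n_samples))
        avg = total / n_samples
        if avg > best_avg:
            best_move, best_avg = move, avg
    return best_move
```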

  10. Terminal Position of Go • The larger territory wins • Territory = surrounded area + stones • Example position: Black's territory is 36 points and White's territory is 45 points, so White wins by 9 points

  11. Play many sample games • Each player plays randomly • Compute the average score for each candidate move • Select the move with the highest average • Example: two random continuations after move A end in a 5-point win and a 9-point win for Black, so move A scores (5 + 9) / 2 = 7 points

  12. Monte Carlo Go and Sample Size • Monte Carlo with 1000 sample games is stronger than Monte Carlo with 100 sample games • Additional samples reduce statistical error: the sampling error shrinks as 1/√N, where N is the number of random games • The relationship between sample size and playing strength had not yet been investigated • Diminishing returns must appear eventually
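To make the 1/√N behavior concrete, here is a small self-contained Python experiment on a toy score distribution (not real Go playouts): quadrupling N roughly halves the empirical standard error of the estimated mean score.

```python
import random
import statistics

# Toy illustration of the 1/sqrt(N) sampling error from the slide.
# Each "playout score" is drawn from a noisy distribution; we estimate
# the mean with N samples and measure how the estimate spreads.

def estimate(n):
    scores = [random.gauss(7.0, 20.0) for _ in range(n)]  # toy playout scores
    return statistics.mean(scores)

for n in (100, 400, 1600):
    spread = statistics.pstdev([estimate(n) for _ in range(200)])
    # The spread roughly halves each time N quadruples.
    print(f"N={n:5d}  empirical standard error ~ {spread:.2f}")
```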

  13. Our Monte Carlo Go Implementation • Basic Monte Carlo Go • Atari-50 enhancement: utilization of simple Go knowledge in move selection • Progressive pruning [Bouzy 2003]: statistical move pruning in simulations

  14. Atari-50 Enhancement • Basic Monte Carlo: assign uniform probability to each move in a sample game (no eye filling) • Atari-50: assign higher probability (50%) to capture moves • Capture is "mostly" a good move • Example from the slide's diagram: move A, which captures Black's stones, is chosen 50% of the time; see the sketch below
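A minimal sketch of the biased selection rule, assuming a caller-supplied is_capture predicate; giving the capture moves a collective 50% of the probability mass is one plausible reading of the slide, not necessarily the authors' exact scheme.

```python
import random

# Sketch of the Atari-50 idea for picking a move inside a random playout:
# capture moves collectively get 50% of the selection probability, the
# rest is spread uniformly over the remaining moves.
# `is_capture` is an assumed rule-checking predicate supplied by the caller.

def atari50_choice(moves, is_capture):
    captures = [m for m in moves if is_capture(m)]
    others = [m for m in moves if not is_capture(m)]
    if captures and others:
        pool = captures if random.random() < 0.5 else others
        return random.choice(pool)
    # If only one kind of move exists, fall back to a uniform choice.
    return random.choice(captures or others)
```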

  15. Progressive Pruning [Bouzy 2003] • Start sampling with a smaller sample size • Prune statistically inferior moves (the slide plots a score interval per move) • More sample games can then be assigned to the promising moves; see the sketch below
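Below is one way to realize the pruning test, sketched in Python under the assumption that each move carries a running (mean, stddev, n_games) triple; the confidence-interval form follows the spirit of [Bouzy 2003] rather than its exact formulas.

```python
import math

# Sketch of progressive pruning: after an initial batch of playouts per
# move, drop every move whose upper score bound falls below the best
# move's lower bound, then spend the remaining samples on the survivors.
# `stats` maps move -> (mean, stddev, n_games); the interface is
# illustrative, not the paper's exact formulation.

def surviving_moves(stats, z=2.0):
    lower = {m: mu - z * sd / math.sqrt(n) for m, (mu, sd, n) in stats.items()}
    upper = {m: mu + z * sd / math.sqrt(n) for m, (mu, sd, n) in stats.items()}
    best_lower = max(lower.values())
    # Keep only moves that could still turn out to be best.
    return [m for m in stats if upper[m] >= best_lower]

# Example: "b2" is clearly better than "c3", so "c3" gets pruned.
print(surviving_moves({"b2": (30.0, 10.0, 100), "c3": (7.0, 10.0, 100)}))
```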

  16. Experimental Design • Machine • Dual Intel Xeon CPUs at 2.40 GHz with 2 GB of memory • 64 PCs (128 processors) connected by a 1 Gb/s network • Three versions of the program • BASIC: basic Monte Carlo Go • ATARI: BASIC + Atari-50 enhancement • ATARIPP: ATARI + progressive pruning • Experiments • 200 self-play games • Analysis of decision quality on 58 professional games

  17. Diminishing Returns: 4N samples vs. N samples for each move

  18. Additional Enhancements and Winning Percentage

  19. Decision Quality of Each Move • Evaluation scores of an "oracle" (64 million sample games per move) for a 3×3 region of candidate moves:

        a   b   c
    1  20  17  10
    2  25  30  15
    3  12  21   7

  • In 10 trials, the 100-sample-game Monte Carlo Go player selected move 2b 9 times and move 2c 1 time • The average error of one move is ((30 − 30) × 9 + (30 − 15) × 1) / 10 = 1.5 points
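The slide's arithmetic, reproduced as a small self-contained Python check (the move names and score table are taken from the slide itself):

```python
# Decision-quality measure: average oracle-score loss of the moves the
# 100-sample player chose, relative to the oracle's best score.

oracle = {"b2": 30, "c2": 15}      # oracle scores of the two chosen moves
chosen = ["b2"] * 9 + ["c2"] * 1   # 2b picked 9 times, 2c picked once
best = 30                          # oracle's best score on this position

avg_error = sum(best - oracle[m] for m in chosen) / len(chosen)
print(avg_error)  # ((30-30)*9 + (30-15)*1) / 10 = 1.5 points
```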

  20. Decision Quality of Each Move(Basic)

  21. Decision Quality of Each Move (with Atari50 Enhancement)

  22. Summary of Experimental Results • Additional enhancements improve the strength of Monte Carlo Go • Returns diminish eventually • The enhanced versions reach diminishing returns sooner • More samples need to be collected in the early stage of a 9×9 Go game

  23. Conclusions and Future Work • Conclusions • Additional samples achieve only small improvements • Unlike search algorithms, e.g., in chess • Good at strategy, not tactics • Blunders due to lack of domain knowledge • Easy to evaluate • Easy to parallelize • The way for Monte Carlo Go to go: small numbers of sample games combined with many enhancements look promising • Future Work • Adjust the playout probabilities with pattern matching • Learning • Search + Monte Carlo Go • MoGo (exploration-exploitation in the search tree using UCT) • Scale to 19×19

  24. Questions? References: • Go (board game), Wikipedia: http://en.wikipedia.org/wiki/Go_(board_game) • GNU Go: http://www.gnu.org/software/gnugo/ • KGS Go Server: http://www.gokgs.com • CGOS 9×9 Computer Go Server: http://cgos.boardspace.net
