File: Giraffe Using Deep Reinforcement Learning to Play Chess1509.01549v2.pdf
Matthew Lai, Imperial College London, Department of Computing
This report presents Giraffe, a chess engine that uses self-play to discover all of its domain-specific knowledge, with minimal hand-crafted knowledge supplied by the programmer. Unlike previous attempts, which used machine learning only to tune the parameters of hand-crafted evaluation functions, Giraffe's learning system also performs automatic feature extraction and pattern recognition. The trained evaluation function performs comparably to the evaluation functions of state-of-the-art chess engines, all of which contain thousands of lines of carefully hand-crafted pattern recognizers, tuned over many years by both computer chess experts and human chess masters. Giraffe is the most successful attempt thus far at using end-to-end machine learning to play chess.
We also investigated the possibility of using probability thresholds instead of depth to shape search trees. Depth-based search forms the backbone of virtually all chess engines in existence today, and is an approach that has become well established over the past half century. Preliminary comparisons between a basic implementation of probability-based search and a basic implementation of depth-based search showed that our new probability-based approach performs moderately better than the established approach. There is also evidence suggesting that many successful ad-hoc add-ons to depth-based search are generalized by switching to a probability-based search. We believe probability-based search to be a more fundamentally correct way to perform minimax.
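To make the contrast concrete, here is a minimal sketch of the two stopping rules on a toy game tree, not on chess and not taken from Giraffe's code. The `Node` class, both function names, and the threshold value are hypothetical. The sketch assumes the simplest possible rule for probabilities: a node's probability is split evenly among its children. The abstract does not specify how probabilities are assigned, so treat that rule as a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Static score from the point of view of the side to move at this node.
    static_eval: float
    children: List["Node"] = field(default_factory=list)


def depth_limited(node: Node, depth: int) -> float:
    """Negamax that stops after a fixed number of plies."""
    if depth == 0 or not node.children:
        return node.static_eval
    return max(-depth_limited(child, depth - 1) for child in node.children)


def probability_limited(node: Node, prob: float, threshold: float) -> float:
    """Negamax that stops once a node's probability drops below a threshold.

    ASSUMPTION: each child receives an equal share of its parent's
    probability. A learned move evaluator could supply unequal shares.
    """
    if prob < threshold or not node.children:
        return node.static_eval
    share = prob / len(node.children)
    return max(-probability_limited(child, share, threshold)
               for child in node.children)


# Toy tree: branch A is a forced line (one reply at each step),
# branch B is wide (three replies).
forced_line = Node(-1.0, [Node(0.5, [Node(3.0)])])
wide_branch = Node(0.0, [Node(0.2), Node(-0.4), Node(0.1)])
root = Node(0.0, [forced_line, wide_branch])

depth_score = depth_limited(root, depth=2)
prob_score = probability_limited(root, prob=1.0, threshold=0.2)
print(depth_score, prob_score)
```

In this toy tree the forced line keeps its full probability share because each node has a single child, so the probability-limited search follows it one ply further than the depth-2 search and returns a different score. This only illustrates how a probability threshold shapes the tree; it makes no claim about playing strength.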
Finally, we designed another machine learning system to shape search trees within the probability-based search framework. Given any position, this system estimates, without looking ahead, the probability that each move is the best move. The system is highly effective: the actual best move is among the top 3 ranked moves 70% of the time, out of an average of approximately 35 legal moves per position. This also resulted in a significant increase in playing strength. With the move evaluator guiding a probability-based search using the learned evaluator, Giraffe plays at approximately the level of a FIDE International Master (top 2.2% of tournament chess players with an official rating).
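For readers who want to see how a "best move within the top 3" figure can be computed, the following is a small sketch of a top-k hit rate. The function name and the four sample positions are made up for illustration and are not data from the report. The last lines compute the rate that picking 3 moves uniformly at random would reach with 35 legal moves, which puts the reported 70% in context.

```python
from typing import List, Sequence


def top_k_hit_rate(ranked_moves: Sequence[List[str]],
                   best_moves: Sequence[str],
                   k: int = 3) -> float:
    """Fraction of positions whose best move is among the first k ranked moves."""
    hits = sum(best in ranked[:k]
               for ranked, best in zip(ranked_moves, best_moves))
    return hits / len(best_moves)


# Hypothetical rankings for four positions (most likely move first).
sample_rankings = [
    ["e4", "d4", "Nf3", "c4"],
    ["Nf6", "e5", "c5", "d5"],
    ["O-O", "Re1", "h3", "d3"],
    ["Qxd5", "Nxd5", "exd5", "Bb5"],
]
# Hypothetical best moves for the same four positions.
sample_best = ["Nf3", "d5", "O-O", "Bb5"]

sample_rate = top_k_hit_rate(sample_rankings, sample_best, k=3)

# Chance level for a uniformly random ranking with 35 legal moves.
random_baseline = 3 / 35
print(sample_rate, round(random_baseline, 3))
```

With about 35 legal moves, a random ranking would place the best move in the top 3 under 9% of the time.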