MMTO for evaluation learning
This algorithm (MMTO), or a variant of it, is now used by all of the top Shogi programs. For its objective function, it uses agreement between low-depth search results and the moves actually played by strong players in games.
There is some brief discussion of its application to chess and an experiment using Crafty in this paper:
https://www.jair.org/media/4217/live-4217-7792-jair.pdf
(see p. 555).
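Roughly, the move-agreement part of the objective can be sketched like this (my own Python pseudocode, not the authors' code; the position/search helpers and the slope constant are made-up stand-ins):
```python
import math

def soft_step(x, slope=0.01):
    # Smooth 0..1 step (sigmoid); the slope constant is a placeholder, not the paper's value.
    return 1.0 / (1.0 + math.exp(-slope * x))

def move_agreement_loss(position, played_move, legal_moves, shallow_search, weights):
    """Penalty that grows whenever some other legal move's shallow-search score
    beats the score of the move the strong player actually chose.
    position.play(), shallow_search(), and weights are hypothetical stand-ins."""
    # Assumed convention: shallow_search() returns the score from the point of
    # view of the player who is to move in `position`.
    s_played = shallow_search(position.play(played_move), weights)
    loss = 0.0
    for move in legal_moves:
        if move == played_move:
            continue
        s_other = shallow_search(position.play(move), weights)
        # Near 1 when the alternative looks better than the played move,
        # near 0 when the played move is already clearly preferred.
        loss += soft_step(s_other - s_played)
    return loss

def total_loss(training_positions, shallow_search, weights):
    # Sum over positions sampled from strong players' games; tuning nudges
    # the evaluation weights in the direction that lowers this sum.
    return sum(move_agreement_loss(pos, played, moves, shallow_search, weights)
               for pos, played, moves in training_positions)
```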
--Jon
-
- Posts: 4396
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: MMTO for evaluation learning
Here is a less technical intro PPT that explains the error function better:
http://www.logos.ic.i.u-tokyo.ac.jp/~mi ... ummer.pptx
-
- Posts: 4845
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: MMTO for evaluation learning
Thanks Jon, that is interesting. What stands out is that it combines minimizing the error in the score with, at the same time, guiding the score so that the move actually played in a position is preferred.
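If I read that right, the idea could be sketched roughly like this (my own Python illustration of that description, not the paper's actual formulation; target_score, alpha, and the helpers are made up, and move_agreement_loss is the sketch from the first post above):
```python
def combined_loss(position, played_move, legal_moves, target_score,
                  shallow_search, weights, alpha=0.5):
    # Hypothetical mix of the two goals described above:
    # 1) keep the shallow-search score close to a target score for the position,
    # 2) push the evaluation to rank the played move above the alternatives.
    s_played = shallow_search(position.play(played_move), weights)
    score_error = (s_played - target_score) ** 2
    agreement = move_agreement_loss(position, played_move, legal_moves,
                                    shallow_search, weights)
    return alpha * score_error + (1.0 - alpha) * agreement
```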
-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: MMTO for evaluation learning
Weird that they mention KnightCap and the temporal-difference learning experiments that were done (Jonathan Baxter and Andrew Tridgell) but give a date of 2000 for some reason... I think they had published papers about it as early as 1998 (I was in university then and remember reading them).
This one for example: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.36.7885
Anyways, it looks interesting, thanks for the links!