MMTO for evaluation learning

This algorithm (MMTO), or a variant of it, is now used by all the top Shogi programs. For an objective function, it uses agreement between low-depth search results and the moves actually played by strong players.
There is some brief discussion of its application to chess and an experiment using Crafty in this paper:
https://www.jair.org/media/4217/live-4217-7792-jair.pdf
(see p. 555).
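For anyone skimming, here is a minimal Python sketch of how I read that objective; the sigmoid slope a and the evaluate() callback are placeholders of mine (in MMTO the score comes from a shallow search, not a static evaluation):

Code:

import math

def sigmoid(x, a=0.01):
    # T(x): squashes a score difference into (0, 1); the slope a is a guess
    return 1.0 / (1.0 + math.exp(-a * x))

def move_agreement_loss(records, evaluate, weights):
    # records: (position, played_move, other_legal_moves) triples taken
    # from strong players' game records
    # evaluate(pos, move, weights): low-depth search score after 'move'
    loss = 0.0
    for pos, played, others in records:
        s_played = evaluate(pos, played, weights)
        for move in others:
            # every move that scores above the played move adds ~1 to the loss
            loss += sigmoid(evaluate(pos, move, weights) - s_played)
    return loss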
--Jon
Re: MMTO for evaluation learning
Here is a less technical intro PPT that explains the error function better:
http://www.logos.ic.i.u-tokyo.ac.jp/~mi ... ummer.pptx
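As I understand the JAIR paper, the minimization step itself is deliberately simple: each weight moves one grid step against the sign of an approximated partial derivative. A rough sketch; the grad() callback and step size h are placeholders of mine:

Code:

def mmto_step(weights, grad, h=1):
    # grad(i): approximate partial derivative of the loss w.r.t. weights[i]
    # (hypothetical helper; the paper reuses the shallow-search results)
    for i in range(len(weights)):
        g = grad(i)
        if g > 0:
            weights[i] -= h   # step against the gradient on an integer grid
        elif g < 0:
            weights[i] += h
    return weights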
Re: MMTO for evaluation learning
Thanks Jon, that is interesting. What stands out is that it combines minimizing the error in the score with guiding the evaluation to prefer the move actually chosen in each position.
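Just to picture that combination, something like the following toy mix; this is only my illustration of the idea, not the paper's actual formulation (the mixing weight lam and the target scores are made up):

Code:

def score_error(targets, predictions):
    # mean squared error between target scores and current evaluation scores
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def combined_loss(move_agreement, targets, predictions, lam=0.5):
    # move_agreement: the move-ordering term sketched earlier in the thread
    # lam: made-up weight mixing the two objectives
    return move_agreement + lam * score_error(targets, predictions)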
Re: MMTO for evaluation learning
Weird that they mention KnightCap and the temporal-difference learning experiments that were done (by Jonathan Baxter and Andrew Tridgell) but give a date of 2000 for some reason... I think they had published papers about it as early as 1998 (I was at university then and remember reading them).
This one for example: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.36.7885
Anyways, it looks interesting, thanks for the links!
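For comparison, KnightCap's TDLeaf(lambda) update went in a different direction: nudge the weights along the evaluation gradient at each principal-variation leaf, scaled by discounted future temporal differences. A sketch from memory of the Baxter/Tridgell papers; the variable names and defaults are mine:

Code:

def tdleaf_update(weights, leaf_scores, leaf_grads, alpha=0.01, lam=0.7):
    # leaf_scores[t]: search score of the PV leaf after move t
    # leaf_grads[t][i]: gradient of that score w.r.t. weights[i]
    n = len(leaf_scores)
    for t in range(n - 1):
        # discounted sum of future temporal differences d_j = s[j+1] - s[j]
        delta = sum((lam ** (j - t)) * (leaf_scores[j + 1] - leaf_scores[j])
                    for j in range(t, n - 1))
        for i in range(len(weights)):
            weights[i] += alpha * leaf_grads[t][i] * delta
    return weights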