MMTO for evaluation learning

Posted: Sun Jan 25, 2015 4:11 pm
by jdart
This algorithm (MMTO, Minimax Tree Optimization), or a variant of it, is now used by all the top Shogi programs. Its objective function measures agreement between low-depth search results and the moves actually played by strong players.

There is some brief discussion of its application to chess and an experiment using Crafty in this paper:

https://www.jair.org/media/4217/live-4217-7792-jair.pdf

(see p. 555).
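Roughly, the error function is a sigmoid over the difference between the shallow-search score of each alternative move and that of the move the strong player actually chose. Here is a little Python sketch of the idea (mine, not the paper's code; search(pos, move, weights) is a stand-in for a low-depth search, and the sigmoid scale is just a guess, the paper tunes the slope):

import math

def sigmoid(x, scale=256.0):
    # Smooth step used as the error function T(x); the scale value
    # here is an assumption, the paper tunes the slope empirically.
    return 1.0 / (1.0 + math.exp(-x / scale))

def agreement_loss(positions, search, weights):
    # Penalize every legal move whose low-depth search score beats
    # the score of the move the strong player actually played.
    total = 0.0
    for pos, played, legal_moves in positions:
        s_played = search(pos, played, weights)
        for m in legal_moves:
            if m != played:
                total += sigmoid(search(pos, m, weights) - s_played)
    return total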

--Jon

Re: MMTO for evaluation learning

Posted: Sun Jan 25, 2015 5:22 pm
by jdart
Here is a less technical intro PPT that explains the error function better:

http://www.logos.ic.i.u-tokyo.ac.jp/~mi ... ummer.pptx
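As I understand the optimization itself, it is not a full gradient descent: since the search values are piecewise constant in the weights, each integer feature weight is simply moved one grid step against the sign of an approximated partial derivative. A rough sketch (the step size h is an assumption; the paper also adds a scale constraint and a regularization term that I am leaving out):

def grid_step(weights, grads, h=1):
    # MMTO-style grid-adjacent update: move each integer weight by a
    # fixed step h against the sign of its approximate partial
    # derivative, rather than taking a scaled gradient step.
    def sign(g):
        return (g > 0) - (g < 0)
    return [w - h * sign(g) for w, g in zip(weights, grads)]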

Re: MMTO for evaluation learning

Posted: Sun Jan 25, 2015 10:34 pm
by Ferdy
Thanks Jon, that is interesting. What I like is that it combines minimizing the error in the score and, at the same time, guiding the score so that the move considered best in a position is the one actually chosen.

Re: MMTO for evaluation learning

Posted: Mon Jan 26, 2015 3:35 am
by wgarvin
Weird that they mention KnightCap and the temporal-difference learning experiments that were done (by Jonathan Baxter and Andrew Tridgell) but give a date of 2000 for some reason... I think they had published papers about it as early as 1998 (I was in university then and remember reading them).

This one for example: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.36.7885

Anyways, it looks interesting, thanks for the links!