http://books.nips.cc/papers/files/nips2 ... 9_0508.pdf
I did not read it carefully. Some of you might be interested.
Rémi
A paper about parameter tuning
-
- Posts: 3533
- Joined: Tue Jan 09, 2007 8:33 pm
- Location: Antalya, Turkey
Re: A paper about parameter tuning
Quite an interesting resource.
http://books.nips.cc/
A query for "chess" returned 32 items.
Best,
hi, merhaba, hallo HT
-
- Posts: 718
- Joined: Fri Mar 20, 2009 8:59 pm
Re: A paper about parameter tuning
I read it carefully, but I still couldn't make sense of the equations. In a general sense, they are doing something similar to what KnightCap did, but on more nodes of the search tree, resulting in faster learning and better results from self-play...
Bleh, I should back up and learn more about TD learning first, then re-read this.
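For anyone else wanting the background, the basic TD(0) update for a linear evaluation has roughly this shape. This is my own minimal sketch with made-up names and sizes, not code from the paper; as I understand it, KnightCap's TDLeaf(λ) is a refinement of this basic idea.

```c
/* Rough TD(0) sketch for a linear evaluation, just to show the basic
   shape of the update.  Names, sizes and the learning rate are made up
   for illustration only. */
#define NUM_FEATURES 1024

double weights[NUM_FEATURES];

/* Linear evaluation: dot product of weights and feature values. */
double evaluate(const double f[NUM_FEATURES]) {
    double v = 0.0;
    for (int i = 0; i < NUM_FEATURES; i++)
        v += weights[i] * f[i];
    return v;
}

/* One TD(0) step: pull the evaluation of the previous position toward
   the evaluation of the position that actually followed it in the game
   (or toward the final result at the end of the game). */
void td0_update(const double prev[NUM_FEATURES], double target, double alpha) {
    double td_error = target - evaluate(prev);
    for (int i = 0; i < NUM_FEATURES; i++)
        weights[i] += alpha * td_error * prev[i];  /* gradient of a linear eval */
}
```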
-
- Posts: 613
- Joined: Sun Jan 18, 2009 7:03 am
Re: A paper about parameter tuning
Rémi Coulom wrote: http://books.nips.cc/papers/files/nips2 ... 9_0508.pdf
There are two things I don't like in this kind of paper:
1) Simple things are expressed in complicated math symbols. The writer puts great effort into expressing his ideas in a very abstract form, and the reader puts great effort into extracting them again. What's the point? I could get the ideas much more easily from well-documented C-like pseudo-code.
2) What's the point of testing this new kind of learning algorithm starting from completely random values? Any bad learning algorithm can improve from there! Take a strong existing chess program (Crafty, Fruit, Glaurung) and try to improve from there; then I'll believe the learning algorithm has some potential.
Joona Kiiski
-
- Posts: 292
- Joined: Tue Jul 07, 2009 4:56 am
Re: A paper about parameter tuning
This looks like a smarter approach than KnightCap, based on the description of KnightCap that they provide. The biggest differences that I see are that 1) all of the parameter updates occur inside a single timestep, so it eliminates a lot of the dependence on opponent strength, and 2) updates aren't limited to root and pv-leaf nodes, so you do a lot more updates per search, and you hit positions that will never occur on the board.
One downside is that since all the updates depend on the fidelity of your own search routine, you can screw up your own learning by doing nullmove search, forward pruning, and possibly, to a lesser extent, extensions and reductions. The authors get around this by turning all of that off while they do their learning. It's not clear to me how much of a problem this is for the actual learning.
Another downside is that it depends on your evaluation being expressed as a linear combination of weighted binary features. This is fine as a mathematical abstraction, but it would be a real pain to actually express your evaluation that way, which you would need to do in order to do the updates correctly. You're also extremely likely to get stuck in local maxima using a plain gradient descent algorithm like this in a complicated search space.
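To make the linear-feature and gradient-descent point concrete, here is my own minimal sketch of the general idea; it is not code from the paper, and NUM_FEATURES and all names are placeholders.

```c
/* Minimal sketch: an evaluation expressed as a weighted sum of binary
   features, and a plain gradient-descent step that pulls that evaluation
   toward a value backed up from the search tree -- applied at many nodes
   per search, not just at the root or PV leaf. */
#define NUM_FEATURES 1024

double weights[NUM_FEATURES];             /* the parameters being tuned */

/* Linear evaluation: sum of the weights of the features that are "on". */
double evaluate(const int features[NUM_FEATURES]) {
    double score = 0.0;
    for (int i = 0; i < NUM_FEATURES; i++)
        if (features[i])
            score += weights[i];
    return score;
}

/* One gradient step on the squared error between the static evaluation
   and the value backed up from search at this node. */
void update_node(const int features[NUM_FEATURES],
                 double backed_up_value, double alpha) {
    double error = backed_up_value - evaluate(features);
    for (int i = 0; i < NUM_FEATURES; i++)
        if (features[i])
            weights[i] += alpha * error;  /* d(eval)/d(w[i]) = features[i] */
}
```

After a search you would call update_node() for every node whose backed-up value you recorded, which is where the "many more updates per search" comes from.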
-
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: A paper about parameter tuning
zamar wrote: 1) Simple things are expressed in complicated math symbols. The writer puts great effort into expressing his ideas in a very abstract form, and the reader puts great effort into extracting them again. What's the point? I could get the ideas much more easily from well-documented C-like pseudo-code.
It depends on your background. Some people find formulas clearer than text or pseudocode.
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: A paper about parameter tuning
Michel wrote: It depends on your background. Some people find formulas clearer than text or pseudocode.
In one of the graphs, an ELO increase of 500 points is shown after just 100 games of training!
This is total crap, of course, because you cannot improve by training with fewer games than it takes to measure the ELO gain itself, and in 100 games you measure nothing, so this is just another piece of crap.
Just to be more clear: if the engine increased by 500 ELO points in 100 games, it is _theoretically_ only pure luck.
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: A paper about parameter tuning
zamar wrote:
Rémi Coulom wrote: http://books.nips.cc/papers/files/nips2 ... 9_0508.pdf
There are two things I don't like in this kind of paper:
1) Simple things are expressed in complicated math symbols. The writer puts great effort into expressing his ideas in a very abstract form, and the reader puts great effort into extracting them again. What's the point? I could get the ideas much more easily from well-documented C-like pseudo-code.
2) What's the point of testing this new kind of learning algorithm starting from completely random values? Any bad learning algorithm can improve from there! Take a strong existing chess program (Crafty, Fruit, Glaurung) and try to improve from there; then I'll believe the learning algorithm has some potential.
A scientific paper should try to tackle a problem of artificial intelligence, not how to make a competitive chess engine. Of course, you should start from scratch if that is your goal.
Miguel
Last edited by michiguel on Wed Jan 13, 2010 11:38 am, edited 1 time in total.
-
- Posts: 1260
- Joined: Sat Dec 13, 2008 7:00 pm
Re: A paper about parameter tuning
While there are some things in the paper that can be criticized, your criticism that "you cannot improve by training with fewer games than it takes to measure the ELO gain itself, and in 100 games you measure nothing" is just as flawed.
ELO gains cannot be measured exactly without an infinite number of games, so "measuring ELO gain" can realistically only be done with some error margin.
Measuring an improvement of 500 ELO in 100 games is going to be statistically significant. The error margin might be 300 ELO, but that means that in the worst case there is still an improvement of 200 ELO. An improvement of 500 ELO over 100 games means a score of something like 5 - 95!
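For what it's worth, here is a quick back-of-the-envelope check of that; it is my own sketch, it ignores draws and uses a crude normal approximation, so the numbers are ballpark only.

```c
/* Rough check of the "500 Elo in 100 games" claim: convert a 95% score
   to Elo with the logistic model, and attach a crude 95% confidence
   interval from the binomial standard error (draws ignored). */
#include <math.h>
#include <stdio.h>

static double elo_from_score(double p) {
    return 400.0 * log10(p / (1.0 - p));   /* logistic Elo model */
}

int main(void) {
    double p = 0.95;                        /* e.g. 95 points out of 100 games */
    int    n = 100;
    double se = sqrt(p * (1.0 - p) / n);    /* standard error of the score */
    double lo = p - 1.96 * se;              /* rough 95% confidence bounds */
    double hi = p + 1.96 * se;

    printf("measured: %+.0f Elo\n", elo_from_score(p));
    printf("95%% interval: %+.0f .. %+.0f Elo\n",
           elo_from_score(lo), elo_from_score(hi));
    return 0;
}
```

Even the lower end of that interval is still several hundred Elo above zero, which is the point: a 95% score over 100 games cannot plausibly be noise.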
Last edited by Gian-Carlo Pascutto on Wed Jan 13, 2010 11:42 am, edited 1 time in total.
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: A paper about parameter tuning
Michel wrote: It depends on your background. Some people find formulas clearer than text or pseudocode.
Which narrows the audience. I understand that the formalism is important most of the time, but many times it is completely unnecessary.
Science tries to fulfill several goals, among them accuracy and dissemination. A balance between the two is important.
Miguel