A paper about parameter tuning

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: A paper about parameter tuning

Post by Gian-Carlo Pascutto »

diep wrote: You can say whatever you want, but someone has been paying big cash to a bunch of programmers for the past few years to get things improved, and also manages to show up with really big LOW LATENCY SHARED MEMORY hardware, so-called 'clusters', even though none of those programs can run on any cluster.

Toga, Sjeng, Rybka, Jonny, to name just 4 of them.
What are you saying here, exactly?
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: A paper about parameter tuning

Post by nkg114mc »

Hi, guys:

I am quite interested in the method mentioned in this paper.
Does someone know where I can download the engine Meep from that paper? Is it open source?

Thanks! :) ~
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: A paper about parameter tuning

Post by diep »

joelv wrote:
Aaron Becker wrote: I can't speak for the authors, but I think they're trying to prove that their technique is superior to the earlier TD techniques that they refer to as TD-Leaf and TD-Root, and I think they make a reasonable case. You say that you could use any learning technique to improve on random data, but those earlier TD techniques don't work well without a reasonable starting vector. Obviously improving a real top engine would be a better result, but rewriting the evaluation of an engine like Stockfish to be expressed as a weighted feature vector would be a lot of work on its own, and improving values that have been painstakingly hand-tuned is an awfully high bar to clear for a new technique.
That's pretty much spot on.

Yes, I am also curious to know whether this technique (or an adaptation of it) could help a strong engine. It is something I would personally be interested in trying in the future... That said, I think it would be crazy to invest such a huge effort in a new technique without first validating it with some self-play / random-weight experiments on a simple program.

Some more general comments (directed at the whole thread):

a) This paper looks at learning from self-play, starting from random weights... and nothing more. So please try to understand it in this context, and don't get angry when it doesn't answer _your_ question.

b) I agree that the paper could have been improved by reporting the results of multiple learning runs. If I get a chance in the future, I will address this. For what it is worth, in my private testing (with different, and progressively more sophisticated sets of features), the TreeStrap method consistently outperformed TD-Leaf by similar margins.

c) Apologies if the paper is difficult to read. The mathematics is really not that hard. If you can understand TD-Leaf, or an ANN, then you can easily understand this work. I might post a chess-programmer-friendly version on my website if there is sufficient interest; in the meantime, please email me if you have any questions.

d) Automatic evaluation improvement of strong chess engines is an entire research topic in itself. It's different and more open-ended than learning from scratch with self-play, and I dare say more difficult. E.g. you can try to use the information provided by strong engines in a number of different ways (build a score-based training set, try to match moves, etc.)... Or, as mentioned somewhere in this thread, you could look at methods that only tune a small number of parameters in isolation... Or you can use a cluster and some computation-heavy technique like simulated annealing (with Elo as the objective function)... Take your pick, be creative, and enjoy the age of auto-tuning.
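
To make that last idea concrete, here is a minimal sketch of simulated annealing over an evaluation weight vector, with the match score against a fixed opponent as the objective. The helper play_match and all the constants here are hypothetical placeholders, not anything taken from the paper; it is only meant to illustrate the accept/reject rule.

Code:

import math
import random

def tune_by_annealing(weights, play_match, games_per_step=200,
                      start_temp=0.05, cooling=0.95, steps=50):
    # weights: list of evaluation parameters to tune
    # play_match(weights, games) is an assumed helper returning the match
    # score in [0, 1] of an engine using these weights vs. a fixed opponent
    current = list(weights)
    current_score = play_match(current, games_per_step)
    temp = start_temp
    for _ in range(steps):
        # perturb one randomly chosen weight by a small relative amount
        candidate = list(current)
        i = random.randrange(len(candidate))
        candidate[i] += random.gauss(0.0, 0.1) * (abs(candidate[i]) + 1.0)
        candidate_score = play_match(candidate, games_per_step)
        delta = candidate_score - current_score
        # always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature cools down
        if delta >= 0 or random.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
        temp *= cooling
    return current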
Hi Joel,

Nice to see a paper from you. What I wonder about is what is new in what you are doing when you compare it to the Deep Blue parameter tuning paper.

It seems you just reinvented what they did.

From what I understand, TD learning didn't start with random values but with everything initialized to 0.

So we can prove (in the mathematical sense) that even moving from 0 to random values would already give an Elo increase for TD learning; that, and another 1000 factors, of course make TD learning not so interesting.

One example is that the improvement graph Knightcap gave to show that TD learning worked was flawed.

Their comparison wasn't objective.

Basically, they compared an engine with the parameters of its piece-square tables set to 0, which played a bunch of games nonstop, against the TD-learning version, which would play another game whenever it won, but would go offline for 15 minutes whenever it lost or even drew. That is effectively a good strategy to win Elo rating, since you avoid anyone who knows how to beat you; it follows the old rule of "never attack someone at the moment he's strong; do it when he's playing weakly and losing". That strategy has been known to work for thousands of years already. Yet it's not what I call evidence that TD learning works.

Having played hundreds of games with Diep against Knightcap back then, I can assure you that this is a flawed way to run the experiment.

There are more problems with TD learning, but I just showed a few.

Now what you do is compare yourself with the biggest piece of crap ever invented in history, called TD learning. Even a trivial algorithm is better than that. The only good thing about TD learning is that they got *something* to work. The rest is all deliberately flawed research, and they knew it.

On the other hand, you just reinvented what the Deep Blue team had already invented in the 90s to tune their evaluation function.

I miss in your paper any defence of where you did things differently from them. In fact, you claim the exact same thing here.

So that voids your paper.

P.S. That's what you typically see happen when a few people go study a subject without speaking to any other expert in that field. A fair number of the Yanks checking out this forum would have remembered the Deep Blue research.

If Bob digs through his computer, maybe he'll still find the PostScript of it, or maybe some text write-up.

In those years we had bunches of self-tuning papers.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: A paper about parameter tuning

Post by Adam Hair »

nkg114mc wrote:Hi, guys:

I am quite interested in the method mentioned in this paper.
Does someone know where I can download the engine Meep from that paper? Is it open source?

Thanks! :) ~
Hi Ma Chao,

Unfortunately, Meep has never been made publicly available. The best I can do is direct you to Joel Veness' contact page.

We members of the CCRL certainly hope you continue working on your chess engine.

Best of Luck,
Adam
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: A paper about parameter tuning

Post by diep »

nkg114mc wrote:Hi, guys:

I am quite interested in the method mentioned in this paper.
Does someone know where I can download the engine Meep from that paper? Is it open source?

Thanks! :) ~
We can void the Meep experiment. They claim a new name for something that already existed, and I see no PROOF in their paper of the opposite. The Deep Blue paper should still be around, together with source code for how they tuned, though I'm not sure whether that code was from this experiment or an earlier one. Back then it was spread freely.

In fact, over the years they described several different ways of tuning; one of them is what gets the new name 'bootstrapping' here. It's not properly done research.

If you google a bit and ask around, I'm sure many still have the Deep Blue research somewhere, together with code for how they tried to do it (that code might involve another tuning experiment they did 10 years earlier, back when it was still called Deep Thought).

I am not a big fan of researchers who hide themselves Down Under, or in this case somewhere at the North Pole, with zero contact with the rest of the planet, ignoring the rest of the planet, and then suddenly claim something to be new in a field where basically everything has been invented already. You then run the risk that your 'invention' was already publicly described. People who hide somewhere and then show up out of the blue with a paper have something to hide, IMHO; at the very least they hide their personality. With so little communication with the community, the odds of mistakes are obviously bigger.

That's what obviously happened here.

Comparing a static evaluation with a search is very interesting, yet it was already described by the Deep Blue team back in the 90s.
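
For reference, the general idea being discussed here (tuning a static evaluation against the result of a search) can be sketched very roughly as below, assuming a plain linear evaluation and hypothetical helpers extract_features and search_score. This is only an illustration of the principle, not the exact procedure of either the Deep Blue work or the Meep paper.

Code:

def static_eval(weights, features):
    # linear evaluation: weighted sum of position features
    return sum(w * f for w, f in zip(weights, features))

def tune_toward_search(weights, positions, extract_features, search_score,
                       learning_rate=1e-4):
    # extract_features(pos) -> list of feature values   (assumed helper)
    # search_score(pos)     -> score of a deeper search (assumed helper)
    for pos in positions:
        feats = extract_features(pos)
        target = search_score(pos)
        error = target - static_eval(weights, feats)
        # gradient step on the squared error: each weight moves in
        # proportion to its feature value and the size of the error
        for i, f in enumerate(feats):
            weights[i] += learning_rate * error * f
    return weights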

Give credit to those who deserve it; don't steal the honour of invention. Add to that that this team wrapped it in unreadable math symbols. I can try to read those math symbols and, probably after long study, conclude that it's bad math, or maybe not; but it doesn't make the paper more readable.

The Deep Blue paper is very readable. A world of difference.

Back in the 90s many tuning efforts were undertaken. I also toyed with it a bit (never published), and the others I spoke with at computer chess tournaments had all toyed and tried as well, most of them with neural networks.

Much government money was thrown at neural networks, and that didn't change over the years; it just got funded more quietly in the past few years. Back then it was done openly.

I remember a researcher who for years fought against version 1.0 of Diep with his neural network, trying to have the net invent a new evaluation function by itself. Properly done research, though no success was reported after some years. Those neural nets were very expensive back then; most started at $150k+ a net, all hardware nets. That's $500k in today's money.

It's obvious that the material evaluation of Rybka 3 has been tuned by a neural network, unlike the Rybkas before it.

Note that in most of these experiments the programs' evaluation functions are so primitive that the term 'self-learning' can in fact be replaced by 'parameter tuning'; much unlike the real researchers I refer to, who genuinely tried to have neural nets find new patterns by themselves, all of which failed AFAIK.

Redoing such experiments with more statistical significance today would be interesting. Yet not in a sneaky way; I dislike sneaky research there. What's there to hide?
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: A paper about parameter tuning

Post by nkg114mc »

Adam Hair wrote:Hi Ma Chao,

Unfortunately, Meep has never been made publicly available. The best I can do is direct you to Joel Veness' contact page.

We members of the CCRL certainly hope you continue working on your chess engine.

Best of Luck,
Adam
Oh, thanks a lot, Adam :)!
I have already sent an email to Joel and am waiting for his reply.

My engine is too weak currently; a lot of work still needs to be done. I will spend more time fixing the bugs and building a more complex evaluation for it.

Thanks for the work of the CCRL! It really provides plenty of data for analysis, which would be almost impossible for an individual chess programmer to gather.

Best wishes,
Chao Ma.
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: A paper about parameter tuning

Post by nkg114mc »

Hi, Vincent!
diep wrote: Redoing such experiments with more statistical significance today would be interesting
Could you explain a little more about "experiments with more statistical significance" that you mentioned in your reply?

Thanks!

BTW: I still don't quite believe that the method in this paper can get such good performance, at least not for all arbitrary initial weights. So that's why I want to know in detail how their experiment was done.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: A paper about parameter tuning

Post by diep »

nkg114mc wrote:Hi, Vincent!
diep wrote: Redoing such experiments with more statistical significance today would be interesting
Could you explain a little more about "experiments with more statistical significance" that you mentioned in your reply?

Thanks!

BTW: I still don't quite believe that the method in this paper can get such good performance, at least not for all arbitrary initial weights. So that's why I want to know in detail how their experiment was done.
By more statistical significance I mean: if they used a 95% confidence level for experiments in the 90s, you now move to, for example, 99.9% certainty for every finding instead of 95%.

So basically, do things with better testing than was done in the 90s.
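
As a concrete example of such a check (my own sketch, not anything from the paper or the 90s experiments): given a match result of W wins, D draws and L losses, a normal approximation gives a z-score for the hypothesis that the new version scores above 50%, which you can compare against roughly 1.64 for 95% one-sided confidence or 3.09 for 99.9%.

Code:

import math

def z_score(wins, draws, losses):
    # normal approximation: is the match score significantly above 50%?
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n          # match score in [0, 1]
    # per-game variance of the results (win=1, draw=0.5, loss=0)
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return (score - 0.5) / stderr if stderr > 0 else 0.0

# example: +20 result over 280 games
print(z_score(120, 60, 100))   # compare against 1.64 (95%) or 3.09 (99.9%)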

Back then I would usually already modify the evaluation or search based upon 1 game :)

Against today's industrially engineered software that is not so effective.