Python script for TTM


mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Python script for TTM

Post by mar »

lucasart wrote:I think speed is key here, because the optimization process must run lots of iterations.
I see two potential problems: IO, FEN parsing/validation.

I wouldn't be surprised if the universal tuner ran an order of magnitude slower than an integrated native one. Obviously people hate to waste time,
so why wait for 1 parameter when you could do 10 in the same amount of time?

What comes to mind is batching: since you pre-split the FENs across threads (I assume), you can send all FENs to the engine only once, then tune many parameters/iterations.
This would reduce the tuner-to-engine IO to zero and would also get rid of the FEN processing.
The drawback is slightly increased complexity on the engine side.
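The tuner side of that batching idea can be sketched as below. The worker command names ("load", "seteval", "qsearch_all") are purely hypothetical, not any real engine protocol; the point is that after the one-time FEN upload, only parameter vectors and partial errors cross the pipe:

```python
# Sketch of batched tuning: upload FENs to each worker engine once,
# then iterate over parameter sets with almost no IO per iteration.
# The worker protocol ("load"/"seteval"/"qsearch_all") is invented
# for illustration only.

def split_fens(fens, n_workers):
    """Pre-split the FEN list so each engine instance receives
    its share exactly once (round-robin)."""
    return [fens[i::n_workers] for i in range(n_workers)]

def tune_loop(workers, param_sets):
    # Phase 1: one-time upload -- the only FEN IO in the whole run.
    for w in workers:
        for fen in w.fens:
            w.send("load " + fen)
    # Phase 2: many iterations, tiny IO per iteration.
    errors = []
    for params in param_sets:
        total = 0.0
        for w in workers:
            w.send("seteval " + " ".join(map(str, params)))
            # Each worker returns its partial error over its FENs.
            total += float(w.send("qsearch_all"))
        errors.append(total)
    return errors
```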

Another thing which might be interesting to try is to use plain eval instead of qs (what Miguel did in Gaviota IIRC), by filtering out non-quiet positions.
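One way to realize the plain-eval idea is to pre-filter the training set down to quiet positions, where static eval and qsearch already agree. Here `eval_fn` and `qsearch_fn` are assumed callbacks into the engine, not a real API:

```python
# Sketch of the "plain eval instead of qsearch" idea: keep only
# positions where the static eval matches the quiescence score,
# so the cheap eval call is an exact substitute during tuning.
# eval_fn / qsearch_fn are hypothetical engine callbacks.

def filter_quiet(positions, eval_fn, qsearch_fn, margin=0):
    """Keep positions whose static eval is within `margin` of the
    quiescence score; on these, plain eval is as good as qsearch."""
    return [p for p in positions
            if abs(eval_fn(p) - qsearch_fn(p)) <= margin]
```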
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Python script for TTM

Post by lucasart »

mar wrote:
lucasart wrote:I think speed is key here, because the optimization process must run lots of iterations.
I see two potential problems: IO, FEN parsing/validation.

I wouldn't be surprised if the universal tuner ran an order of magnitude slower than an integrated native one. Obviously people hate to waste time,
so why wait for 1 parameter when you could do 10 in the same amount of time?

What comes to mind is batching: since you pre-split the FENs across threads (I assume), you can send all FENs to the engine only once, then tune many parameters/iterations.
This would reduce the tuner-to-engine IO to zero and would also get rid of the FEN processing.
The drawback is slightly increased complexity on the engine side.

Another thing which might be interesting to try is to use plain eval instead of qs (what Miguel did in Gaviota IIRC), by filtering out non-quiet positions.
Based on my results, and the ones from Ferdinand, it seems that we have a 2 orders of magnitude speed difference. This is huge! Of course there's IO, task switching, and FEN parsing, but that's not all of the overhead, nor even most of it.

Based on my experience coding a cutechess-cli clone in Python, I'm pretty certain that 1 order of magnitude could be regained by writing the tuner in C++ instead of Python. We're just calling qsearch, which runs at 1.5M calls/s in my case, so *everything* else is overhead, and we need to kill overhead wherever possible.

On a side note, it's not so easy to call the qsearch like that, particularly with SMP. It's normally deeply nested, and many things must be set up before calling the QS. It was an interesting exercise, because it forced me to refactor my code better.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Python script for TTM

Post by lucasart »

I'm now doing a proper fit for lambda (using the Newton-Raphson method to minimize the quadratic error). That means the lambda before/after a patch is slightly different, but always optimal: score = 1 / (1 + exp(-lambda * qsearch)).
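The lambda fit described above can be sketched like this: Newton-Raphson on the quadratic error of the logistic model, with numerical derivatives and a step-halving safeguard so a bad Newton step is never accepted. This is a generic sketch, not lucasart's actual implementation; `data` is a list of (qsearch score, game result) pairs:

```python
import math

# Fit lambda in score = 1 / (1 + exp(-lambda * qsearch)) by
# Newton-Raphson on the quadratic error. Derivatives are numeric;
# a halving backtrack guards against overshooting the minimum.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def quad_error(lam, data):
    return sum((r - sigmoid(lam * q)) ** 2 for q, r in data)

def fit_lambda(data, lam=0.01, iters=30, h=1e-4):
    best = quad_error(lam, data)
    for _ in range(iters):
        e_plus = quad_error(lam + h, data)
        e_minus = quad_error(lam - h, data)
        d1 = (e_plus - e_minus) / (2 * h)             # E'(lambda)
        d2 = (e_plus - 2 * best + e_minus) / (h * h)  # E''(lambda)
        # Fall back to a tiny gradient step if curvature is unusable.
        step = d1 / d2 if d2 > 0 else (h if d1 > 0 else -h)
        # Halve the step until the error no longer increases.
        while abs(step) > 1e-12:
            e_new = quad_error(lam - step, data)
            if e_new <= best:
                lam, best = lam - step, e_new
                break
            step /= 2
        else:
            break  # no improving step found; treat as converged
    return lam
```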

I'm also guarding against over-fitting with a 75/25 validation split: optimize parameters on the 75% sample, then validate on the 25% sample.
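A minimal sketch of that split (the seed and shuffle are my own assumptions, just to make the split reproducible): fit on the first part, and only accept a parameter change if the error also drops on the held-out part.

```python
import random

# 75/25 train/validation split as an over-fitting guard:
# shuffle once with a fixed seed, then cut at the 3/4 mark.

def split_75_25(positions, seed=0):
    rng = random.Random(seed)
    shuffled = positions[:]
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 3) // 4
    return shuffled[:cut], shuffled[cut:]
```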

Still no success. So far, every single tuning patch that improved the logistic fit was an elo loss.

I'll try with games from random starting positions. But I don't expect much from that.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Python script for TTM

Post by Pio »

Hi, I think you will get much better results if you try to minimize the absolute error.

/Pio
ZirconiumX
Posts: 1334
Joined: Sun Jul 17, 2011 11:14 am

Re: Python script for TTM

Post by ZirconiumX »

I'm sure I'm going to get yelled at by statisticians everywhere - I am not a mathematician.

But thinking about this: what if, instead of fitting to the single game result, which may or may not contain blunders, you performed Monte-Carlo playouts to get an empirical winning probability for the position, and then fit to that instead of the game result?

Of course, this method would be significantly more compute-intensive, but perhaps it would overfit less?
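The playout-labelling idea boils down to replacing the single game result with the mean score of n fast playouts. In this sketch `play_game` is an assumed callback that plays one game from the position and returns 1, 0.5 or 0:

```python
# Label a training position with an empirical winning probability:
# the mean of n playout results instead of one (possibly blundered)
# game outcome. play_game is a hypothetical playout callback.

def playout_label(position, play_game, n=32):
    """Empirical score in [0, 1]: mean of n playout results."""
    return sum(play_game(position) for _ in range(n)) / n
```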
Some believe in the almighty dollar.

I believe in the almighty printf statement.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Python script for TTM

Post by Evert »

ZirconiumX wrote:I'm sure I'm going to get yelled at by statisticians everywhere - I am not a mathematician.

But thinking about this: what if, instead of fitting to the single game result, which may or may not contain blunders, you performed Monte-Carlo playouts to get an empirical winning probability for the position, and then fit to that instead of the game result?

Of course, this method would be significantly more compute-intensive, but perhaps it would overfit less?
I don't think it does anything for overfitting, but the idea is interesting in that it can give you a "fuzzy" game result, which could be more useful than a hard "0, 1/2, 1". For that to work though the monte-carlo playout needs to be a reasonable predictor of the result under optimal play, and there's the rub: I don't think monte-carlo playouts have so far proven to be very effective with chess.

In general, picking positions for this type of tuning is hard, and I'm still not entirely sure what the best way to do it is (which is part of why I haven't tried this myself). What you need are positions that a) represent what the engine encounters during search, and b) correlate well with the outcome of the game. In other words, positions from that branch of the tree where white bungled a rook are useless if the line actually played resulted in white winning. Conversely, even a root position is useless if the engine was winning but blundered a piece later (I suppose it needs to be at a level where it doesn't do that for this method to give good results anyway, but weird things happen every so often).
ZirconiumX
Posts: 1334
Joined: Sun Jul 17, 2011 11:14 am

Re: Python script for TTM

Post by ZirconiumX »

Evert wrote:I don't think it does anything for overfitting, but the idea is interesting in that it can give you a "fuzzy" game result, which could be more useful than a hard "0, 1/2, 1".
An evaluation function is a heuristic, so perhaps heuristic values would be more useful than game theoretic values for positions the eval may never see in search.
For that to work though the monte-carlo playout needs to be a reasonable predictor of the result under optimal play, and there's the rub: I don't think monte-carlo playouts have so far proven to be very effective with chess.
This is the issue. I'm sure somebody could engineer a way around this, however.
In general, picking positions for this type of tuning is hard, and I'm still not entirely sure what the best way to do it is (which is part of why I haven't tried this myself). What you need are positions that a) represent what the engine encounters during search, and b) correlate well with the outcome of the game. In other words, positions from that branch of the tree where white bungled a rook are useless if the line actually played resulted in white winning. Conversely, even a root position is useless if the engine was winning but blundered a piece later (I suppose it needs to be at a level where it doesn't do that for this method to give good results anyway, but weird things happen every so often).
Yeah, my experience is that your engine already needs to be good to be improved, which is somewhat circular. I suppose it can't make up for knowledge the engine does not have, however.
Some believe in the almighty dollar.

I believe in the almighty printf statement.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Python script for TTM

Post by lucasart »

lucasart wrote:I'll try with games from random starting positions. But I don't expect much from that.
Also fails miserably. I'm just finding more and more ways to lose elo while reducing the quadratic error of the logistic fit.

I have 2 more things left to try:
- fit with absolute error instead of quadratic. I really don't think this is going to make a difference, but let's give it a chance.
- use depth=2 instead of depth=0 (qsearch).
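The two objectives being compared can be written out side by side. This is a generic sketch of the loss functions, not lucasart's code; the intuition is that the L1 objective is less sensitive to outliers, so a decisive result the eval could never have predicted pulls an L2 fit much harder than an L1 fit:

```python
import math

# Quadratic (L2) vs absolute (L1) error of the logistic fit
# score = 1 / (1 + exp(-lambda * qsearch)), over (q, result) pairs.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def l2_error(lam, data):
    return sum((r - sigmoid(lam * q)) ** 2 for q, r in data)

def l1_error(lam, data):
    return sum(abs(r - sigmoid(lam * q)) for q, r in data)
```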
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Python script for TTM

Post by AlvaroBegue »

Would you like to try with the set of positions I am preparing? It's 1.34M positions taken from calls to the evaluation function of my program RuyDos, with results assigned by playing out from that position using SF6-vs-SF6 at 30 moves/second on a pretty fast computer.

I'll have it ready in a few hours (it's been computing for about a week), and I have some hope that it will be a good dataset to tune evaluation functions.

Are you interested?
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Python script for TTM

Post by lucasart »

- use depth=2 instead of depth=0 (qsearch).
This made no difference, apart from being much slower.
- fit with absolute error instead of quadratic.
We could be on to something here. It is the only objective that refuted the series of elo-losing patches that improved the regression in all the other test cases. And it suggested a patch that seems to be working (though the test is still running, so it's too early to say).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.