Uri Blass wrote:tiger wrote:Uri Blass wrote:Kirill Kryukov wrote:I did a quick test of Colossus 2007a under CCRL 40/4 conditions. The rating is 2686 Elo points after 224 games. This makes it the #16 engine in the CCRL 40/4 Free Single-CPU list (my version, which includes only stable public releases with default settings).
It's a bit early to draw conclusions; the ranking may change after more games (which are running right now). Still, I hope this improvement holds up.
Those few games were enough to get 97.3% LOS (Likelihood of Superiority) over the previous version, 2006f, which is rated 2644 (a 42-point difference).
All results of Colossus 2007a to date.
Comparison of 3 Colossus versions we tested
What makes me happier personally is that the new version does not crash when accessing tablebases on my Vista machine, as 2006f did.
Best,
Kirill
Note that the Movei personality 10 10 10, which is free and can be used by everyone, is not in the list (only in the complete list), in spite of the fact that the tests suggest that it is better than the default (there are not enough games to know, but other tests that I did also support it).
Movei 0.08.403(10 10 10) 2646 +27 −27 46.6% +22.0 25.2% 476
50.9%
Movei 0.08.403 2635 +20 −20 52.7% −22.6 30.1% 860
Uri
Uri, I seem to remember that this "10 10 10" stuff is somewhat related to "progress". Can you explain the concept? I have been playing with what I believe is a similar concept and I wanted to know about yours.
// Christophe
The concept is that Movei evaluates the path and not only the leaf position (progress 0 0 0 means no path-dependent evaluation).
Unfortunately it seems not to help much, and my guess is that it gives me only a 30 Elo improvement even with the best parameters.
I am sure it is possible to improve it by code changes (a different path evaluation), and I have not done many tests of different ideas for path-dependent evaluation, but I think it is better to try to use the hash more efficiently (for pruning), because I can probably gain more from effective use of the hash even if I need to give up progress.
I already tried once not to use progress, in the previous version 383, but unfortunately I failed to use the hash for pruning after that version (I tried using the hash for pruning only in qsearch, to save qsearch nodes; it did not save nodes, and instead of investigating the problem I preferred to try other ideas).
progress 10 10 10 means that the path evaluation can be 0.1 pawns or 0.1+0.1 pawns or 0.1+0.1+0.1 pawns different than the static evaluation that is dependent only on the leaf.
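To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the three numbers are weights in centipawns and that each one is applied to a signal in {-1, 0, +1} computed along the path. The function name and the signals are hypothetical; the post does not say what Movei actually measures on the path.

```python
def path_adjusted_eval(static_cp, signals, weights=(10, 10, 10)):
    """Add a path-dependent term to a leaf's static evaluation.

    static_cp: static evaluation of the leaf position, in centipawns.
    signals:   three values in {-1, 0, +1}, one per progress term
               (hypothetical; what triggers each term is not stated).
    weights:   the three "progress" numbers, in centipawns (10 = 0.1 pawn).
    """
    if len(signals) != len(weights):
        raise ValueError("need one signal per weight")
    return static_cp + sum(w * s for w, s in zip(weights, signals))


# With progress 0 0 0 the evaluation depends only on the leaf.
print(path_adjusted_eval(25, (1, 1, 1), weights=(0, 0, 0)))
# With progress 10 10 10 the result can move by 0.1, 0.2 or 0.3 pawns.
print(path_adjusted_eval(25, (1, 1, 0)))
```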
Uri
So how do you evaluate the path?
My idea was that if it is possible to reach a position by 2 different paths, it is best to follow the "safest" path. The idea is that you may discover later that the position you wanted to reach is not good because after reaching it something bad happens and the horizon effect has hidden it. In this case, you will have to change your mind. But you will have to find a new variation in the middle of one of the paths leading to the position you wanted to reach initially, because you realize the position is bad only after making a few moves, using one path or another.
My assumption is that it is easier to find a new safe position to reach from a path consisting of "safe" positions than from a path consisting of "unsafe" positions.
For example assume that in some position it is possible to exchange several pieces, the exchange leading to a better position for you, and there are two ways of doing it:
- first way ("path") is to sacrifice 3 pieces, then you get them back with a deep combination
- second path is to exchange them one after the other (you capture, your opponent is forced to recapture, and so on...)
In this position, a path-independent evaluation will choose one path AT RANDOM!
A path-dependent evaluation will choose the second way on purpose, because at no point during the variation is the program behind in material. So if it turns out after starting the exchanges that the sequence cannot go on because of some unseen threat (it was too deep to be seen before the exchanges started), then at least the program can look for an alternative from a position where it is not behind in material.
By extension, the same idea applies not only when it is possible to reach the same position by two different paths, but also when you can reach two positions that have the same evaluation by two different paths.
And it can go as far as including the path in the evaluation, so a program could choose to reach a position with a lower evaluation just because the path leading to it is safer.
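The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "safety" is taken to be the worst material balance seen along the path, and `path_safety`, `choose_path` and `safety_weight` are hypothetical names, not taken from any engine.

```python
def path_safety(balances):
    """Worst material balance (in pawns, from our side) seen along a path."""
    return min(balances)


def choose_path(paths, safety_weight=0.0):
    """Pick the preferred path.

    paths: dict mapping a name to (leaf_eval, balances), where balances is
           the material balance after each move of the path.
    safety_weight: how strongly a material deficit on the way lowers the
           score (assumption). With 0.0, safety only breaks ties between
           paths whose leaf evaluations are equal.
    """
    def key(name):
        leaf_eval, balances = paths[name]
        safety = path_safety(balances)
        # Only deficits are penalised; being ahead on the way is not rewarded.
        return (leaf_eval + safety_weight * min(safety, 0), safety)

    return max(paths, key=key)


# Same final evaluation: the exchange path is never behind in material.
same_leaf = {
    "sacrifice": (1.0, [0, -3, -6, -9, -6, -3, 1]),
    "exchange": (1.0, [0, 3, 0, 3, 0, 3, 1]),
}
print(choose_path(same_leaf))

# A slightly better leaf reached through a deficit can lose to a safer path.
different_leaf = {
    "risky": (1.2, [0, -3, 1]),
    "safe": (1.0, [0, 3, 1]),
}
print(choose_path(different_leaf, safety_weight=0.1))
```

With `safety_weight=0.0` the second call would still pick the risky path, which is the behaviour of a path-independent evaluation when the leaf scores differ.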
In current chess programs, there is only one reason holding us back from doing path-dependent evaluation: the recognition of transpositions through the use of the hash table. A path-dependent evaluation gives a different definition of a transposition, and this definition is not compatible with how we use the hash table.
However, it does not mean that a path-dependent evaluation is fundamentally incompatible with transposition detection by the use of a hash table.
The most obvious way to avoid the problem is to stop using the hash table for transposition detection. Maybe the loss of transposition detection would be more than offset by the gains of path-dependent evaluation. As I understand it, you do not use transposition detection at this time in Movei anyway.
But I think it is possible to design a transposition detection system that would work well together with path-dependent evaluation. For example, if the path-dependent part of the evaluation is the same for two paths leading to the same position, then the transposition detection can be used as usual (the search simply returns the exact score stored in the hash table, if it has one, for the position).
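One way to read this is to make the path-dependent term part of the lookup key, so that two paths only count as a transposition when they agree on it. The Python sketch below is an assumption about how that could look, not a description of Chess Tiger or Movei; `PathAwareTable` and its methods are hypothetical, and a real engine would use a fixed-size array indexed by a Zobrist key rather than a dict.

```python
class PathAwareTable:
    """Toy transposition table keyed on (position key, path term)."""

    def __init__(self):
        self._entries = {}

    def store(self, position_key, path_term, depth, score):
        key = (position_key, path_term)
        old = self._entries.get(key)
        # Keep the entry that was searched to the greater depth.
        if old is None or depth >= old[0]:
            self._entries[key] = (depth, score)

    def probe(self, position_key, path_term, depth):
        """Return the stored exact score, or None if it cannot be reused."""
        entry = self._entries.get((position_key, path_term))
        if entry is not None and entry[0] >= depth:
            return entry[1]
        return None


table = PathAwareTable()
table.store(0xABC, path_term=10, depth=6, score=42)
print(table.probe(0xABC, path_term=10, depth=5))  # same path term: reusable
print(table.probe(0xABC, path_term=20, depth=5))  # different path term: miss
```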
Also, if the path-dependent part of the evaluation is constrained between known bounds (for example [-0.50;+0.50]), then the scores stored in the hash table can also be used as bounds for beta cutoffs.
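A hedged sketch of that idea in Python: widen the stored score by a safety margin before comparing it with the window. The names `Bound`, `hash_cutoff` and `margin` are hypothetical. The default margin of 100 centipawns is an assumption: if the stored score already contains one path's term in [-0.50;+0.50], the score for another path can differ by up to 1.00 pawn. If only the path-independent part were stored, 50 would be enough.

```python
from enum import Enum


class Bound(Enum):
    EXACT = 0
    LOWER = 1  # stored score is a lower bound (search failed high)
    UPPER = 2  # stored score is an upper bound (search failed low)


def hash_cutoff(stored, bound, alpha, beta, margin=100):
    """Return a score that is safe to return at this node, or None.

    stored: score from the hash table, in centipawns, possibly computed
            along a different path to the same position.
    margin: largest possible difference between the stored score and the
            score for the current path (assumption, see above).
    """
    if bound in (Bound.EXACT, Bound.LOWER) and stored - margin >= beta:
        # Even in the worst case the score is still at least beta.
        return stored - margin
    if bound in (Bound.EXACT, Bound.UPPER) and stored + margin <= alpha:
        # Even in the best case the score is still at most alpha.
        return stored + margin
    return None


print(hash_cutoff(300, Bound.LOWER, alpha=0, beta=150))  # cutoff
print(hash_cutoff(200, Bound.LOWER, alpha=0, beta=150))  # too close, no cutoff
```

The cost of the margin is that entries close to the window no longer produce cutoffs, so some of the benefit of the table is lost.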
What I could do is run test matches using a version of Chess Tiger that would not use the hash table for transposition detection (it would just use it for move ordering) and see how much Elo would be lost from not detecting transpositions. That would give a lower bound on how much a path-dependent evaluation should gain in order to overcome the loss of transposition detection.
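For reference, a match score can be converted to an Elo difference with the standard logistic formula. The Python sketch below uses hypothetical function names and does not compute error bars, which matter a lot for short matches.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Elo difference implied by a match score strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# A version that scores 45% against the reference version.
print(round(elo_from_score(0.45), 1))
```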
// Christophe