Expected performance and eval of Komodo 8 and SF 6
Moderator: Ras
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
petero2 wrote: I don't have a lot of blitz games but I do have lots of hyper-bullet (1s+0.08s/move) games: Here are 37100 such games: http://dl.dropboxusercontent.com/u/8968 ... s105a32.xz. These games are played under the same conditions as I use when tuning the evaluation function.
If you want to I can play some games at longer time control. Just specify the time control and the number of games you want.

I did see something at hyper-bullet, not sure it's relevant. 1,000 self-games of pretty equivalent versions at 60''+0.6'' would be very good. I will post soon some results on your hyper-bullet games.
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Expected performance and eval of Komodo 8 and SF 6
I think your insight here is a very good one, and it gives me an idea how to solve this problem in Komodo. We shall see. Thank you for suggesting this line of attack.
Pio wrote: Hi Larry!
I guess the problem you have is that when adjusting the scores to better reflect the winning probabilities you will not search as deep into the simplified positions that are much easier to resolve to either a win, draw or a loss since the search is guided by the evaluation.
I guess you could fix this problem if you modify your search a little bit. I had an idea http://www.talkchess.com/forum/viewtopi ... 76&t=42677 how to do this.
What I want to say is that I think you should search a lot deeper in the simplified parts of the tree since it is cheaper but that you should not play those moves going into the simplified parts if you are not pretty sure that they are good.
Good luck!
Komodo rules!
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
petero2 wrote: I don't have a lot of blitz games but I do have lots of hyper-bullet (1s+0.08s/move) games: Here are 37100 such games: http://dl.dropboxusercontent.com/u/8968 ... s105a32.xz. These games are played under the same conditions as I use when tuning the evaluation function.
If you want to I can play some games at longer time control. Just specify the time control and the number of games you want.
Laskos wrote: I did see something at hyper-bullet, not sure it's relevant. 1,000 self-games of pretty equivalent versions at 60''+0.6'' would be very good. I will post soon some results on your hyper-bullet games.

A reliable result at hyper-bullet:
Code:
Texel 1.05
Eval   Expected score (move 30)   Expected score (move 70)
1.0    71%                        60%
1.5    80%                        70%
2.0    87%                        78%
-
Adam Hair
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: Expected performance and eval of Komodo 8 and SF 6
Hi Kai,
If you are interested, I will see if I have games played by Gaviota that are appropriate for your study. And if I don't, I do not mind spending a day or two of computer time producing them.
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
Adam Hair wrote: Hi Kai,
If you are interested, I will see if I have games played by Gaviota that are appropriate for your study. And if I don't, I do not mind spending a day or two of computer time producing them.

Hi, sure. I need 1,000-5,000 games against a roughly equal opponent, at blitz time control, played in cutechess-cli.
-
petero2
- Posts: 734
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: Expected performance and eval of Komodo 8 and SF 6
petero2 wrote: I don't have a lot of blitz games but I do have lots of hyper-bullet (1s+0.08s/move) games: Here are 37100 such games: http://dl.dropboxusercontent.com/u/8968 ... s105a32.xz. These games are played under the same conditions as I use when tuning the evaluation function.
If you want to I can play some games at longer time control. Just specify the time control and the number of games you want.
Laskos wrote: I did see something at hyper-bullet, not sure it's relevant. 1,000 self-games of pretty equivalent versions at 60''+0.6'' would be very good. I will post soon some results on your hyper-bullet games.

Here are 2431 games played at time control 60+0.6: https://dl.dropboxusercontent.com/u/896 ... s105a35.xz
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
petero2 wrote: I don't have a lot of blitz games but I do have lots of hyper-bullet (1s+0.08s/move) games: Here are 37100 such games: http://dl.dropboxusercontent.com/u/8968 ... s105a32.xz. These games are played under the same conditions as I use when tuning the evaluation function.
If you want to I can play some games at longer time control. Just specify the time control and the number of games you want.
Laskos wrote: I did see something at hyper-bullet, not sure it's relevant. 1,000 self-games of pretty equivalent versions at 60''+0.6'' would be very good. I will post soon some results on your hyper-bullet games.
petero2 wrote: Here are 2431 games played at time control 60+0.6: https://dl.dropboxusercontent.com/u/896 ... s105a35.xz

Great, I will produce plots for move 25 and move 70.
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
petero2 wrote: Here are 2431 games played at time control 60+0.6: https://dl.dropboxusercontent.com/u/896 ... s105a35.xz

With this database, here is the expected score of Texel 1.05 at move 25 and at move 70. The difference is substantial.
Move 25:

Move 70:

Combined fits:

-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
nimh wrote: It is obvious that the reason is that the reduced amount of material makes it harder to convert advantage into full point. Could you perform the analysis again for determining the relationship between material and expected scores?
You suggested I use a logistic function instead of centipawns for analyzing the quality of chess games. I think it would be useful to have some sort of formula to determine expected scores based on material as well.

Ferdinand wrote the necessary script; I am a bit bogged down analyzing the data. I tried to make a logistic fit the data, but it's pretty much hopeless. An excellent fit is given by a slightly modified logistic:
Expected Score = (tanh[eval^b/a] + 1) / 2
Move 25:

Here the blue line and dots are actual data, red line is (tanh[eval^b/a] + 1) / 2 fit, green line is the logistic fit. Logistic holds reasonably, but here:
Move 70:

The logistic is pretty broken.
So I abandoned the logistic fit in favor of (tanh[eval^b/a] + 1) / 2, where a and b are empirical parameters that differ for each engine (and may vary with time control or hardware). It has fitted all results so far VERY well: Komodo 8, SF6, Houdini 4, Texel 1.05. One thing to observe is that both the data and the fits have an inflection point (zero of the second derivative) at some eval>0, while the logistic has such a point only at eval=0.
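To make the inflection-point remark concrete, here is a minimal Python sketch (not code from the thread). It evaluates the tanh form for eval >= 0 and locates its steepest point numerically. The values a=1.26 and b=1.3 are purely illustrative, not fitted parameters, and `expected_score`, `logistic` and `steepest_eval` are hypothetical helper names. Note that for b=1 the tanh form is exactly a logistic with scale a/2, because (tanh(x)+1)/2 = 1/(1+exp(-2x)).

```python
import math

def expected_score(ev, a, b):
    """Modified logistic from the post: (tanh(ev**b / a) + 1) / 2, for ev >= 0."""
    return (math.tanh(ev ** b / a) + 1.0) / 2.0

def logistic(ev, s):
    """Plain logistic with scale s."""
    return 1.0 / (1.0 + math.exp(-ev / s))

def steepest_eval(a, b, hi=5.0, steps=5000):
    """Eval in [0, hi) where the curve is steepest, i.e. a numerical
    estimate of the inflection point, found by scanning forward differences."""
    h = hi / steps
    best_ev, best_slope = 0.0, -1.0
    for i in range(steps):
        ev = i * h
        slope = (expected_score(ev + h, a, b) - expected_score(ev, a, b)) / h
        if slope > best_slope:
            best_ev, best_slope = ev, slope
    return best_ev

# Illustrative parameters only (assumed, not taken from any fit in the thread).
print("b = 1.0:", steepest_eval(1.5, 1.0))   # logistic case
print("b = 1.3:", steepest_eval(1.26, 1.3))  # modified form
```

With b=1 the steepest point sits at eval=0, as for any logistic; with b>1 it moves to a positive eval, which is the behaviour described for the data and the fits.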
Now, including material. With usual counting 1,3,3,5,9 for pawn, knight, bishop, rook, queen, using Ferdinand's script for moves 15,25,35,50,70 of Komodo 8 games database, I got the following material:
material(move 15)= 67.63;
material(move 25)= 57.95;
material(move 35)= 49.34;
material(move 50)= 33.12;
material(move 70)= 18.22;
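The 1,3,3,5,9 counting can be sketched as follows. This is not Ferdinand's script, which is not shown in the thread; it is a minimal illustration that reads the piece-placement field of a FEN string. It assumes material is summed over both sides, which seems consistent with the figures above exceeding 39, the starting total of a single side. `PIECE_VALUES` and `material_from_fen` are hypothetical names.

```python
# Piece values as stated in the post: pawn 1, knight 3, bishop 3, rook 5, queen 9.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_from_fen(fen):
    """Total material of both sides, from the first (placement) field of a FEN.
    Kings, digits and rank separators contribute nothing."""
    placement = fen.split()[0]
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in placement)

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(material_from_fen(START_FEN))
```

Averaging this quantity over all games at a given move number would give per-move figures of the kind listed above.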
Fitting Expected Score = (tanh[eval^b/a] + 1) / 2 for its dependence on material, I found that a is inversely proportional to material^0.6 and b is inversely proportional to material^0.3. With this scaling, normalized to the actual data, I got the following generalization of the fits that includes material:
Expected Score = (tanh[eval^(b1/material^0.3) / (a1/material^0.6)] + 1) / 2
with a1=14.3684 and b1=4.36497 for the Komodo 8 fits on these blitz games. Other engines or other conditions will have different a1, b1.
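A minimal Python sketch of the material-dependent formula, using the a1 and b1 quoted above for Komodo 8 and the per-move material figures from this post. The function name `expected_score_material` is hypothetical, and the restriction to eval >= 0 is an assumption: a non-integer power of a negative number is not real, and the thread does not say how negative evals are treated.

```python
import math

A1 = 14.3684   # Komodo 8 blitz fit, from the post
B1 = 4.36497   # Komodo 8 blitz fit, from the post

def expected_score_material(ev, material, a1=A1, b1=B1):
    """(tanh[ev^(b1/material^0.3) / (a1/material^0.6)] + 1) / 2 for ev >= 0."""
    if ev < 0:
        raise ValueError("sketch covers eval >= 0 only (assumption)")
    a = a1 / material ** 0.6
    b = b1 / material ** 0.3
    return (math.tanh(ev ** b / a) + 1.0) / 2.0

# Material figures per move number, as listed in the post.
MATERIAL_BY_MOVE = {15: 67.63, 25: 57.95, 35: 49.34, 50: 33.12, 70: 18.22}

for move, material in MATERIAL_BY_MOVE.items():
    row = [expected_score_material(ev, material) for ev in (0.5, 1.0, 2.0)]
    print(move, [f"{s:.3f}" for s in row])
```

With these parameters, an eval of 1.0 gives a lower expected score as material comes off the board, in line with the move 25 versus move 70 difference discussed earlier in the thread.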
The fits for Komodo to moves 15,25,35,50,70 are shown here:

And they are very similar to the actual values I posted earlier:
http://www.talkchess.com/forum/viewtopi ... 10&start=5
For now I am a bit bogged down improving on this to include material.
-
Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Expected performance and eval of Komodo 8 and SF 6
Adam Hair wrote: Hi Kai,
If you are interested, I will see if I have games played by Gaviota that are appropriate for your study. And if I don't, I do not mind spending a day or two of computer time producing them.

Hi Adam, thanks for the PGNs.
Here is the first Gaviota result; TBs are to follow. The data seems a bit coarse, but the result is pretty clear. It is interesting to note that Gaviota's eval is very close to logistic at both move 25 and move 70 (the exponent b in the fit (1+tanh[eval^b/a])/2 is close to 1).


