Training data

Discussion of chess software programming and technical issues.

Moderator: Ras

Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: Training data

Post by Desperado »

j.t. wrote: Sun Feb 20, 2022 3:06 pm An idea why the quiet-labeled set has a lower average phase is that because the positions are sampled from all the positions an engine evaluation function encountered during play, many positions will be from the end of a quiesce search, which has likely made a number of captures, and thus reduced the phase value.

The lower number of positions in the opening/midgame phase could maybe partly explain why the error for these phases is lower. The opening parameters have fewer positions that they need to predict, and thus can specialize better on these.
The latest tables show the MSE per game phase (not the distribution of the game phases). Sorry if that was not clear.

To lower the average game phase when adjusting the properties, I used the phase distribution pattern of quiet-labeled.epd.
That lowered the average phase for the mentioned collection from 16.x to 10.x, so closer to the 9.x of quiet-labeled.epd.
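
A minimal sketch of how such an adjustment could be done, in case it helps the discussion. This is not necessarily the procedure used for the collections in this post: the function name `resample_to_distribution`, the bucket layout, and the idea of taking the largest subset that no bucket runs out of positions for are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def resample_to_distribution(phases, target, seed=0):
    """Return indices of a subset whose phase histogram follows `target`.

    phases: phase bucket (0 .. len(target)-1) of every position
    target: desired share per bucket, summing to roughly 1
    (hypothetical helper, not taken from any engine or tuner)
    """
    rng = random.Random(seed)

    # Group position indices by their phase bucket.
    by_phase = defaultdict(list)
    for idx, phase in enumerate(phases):
        by_phase[phase].append(idx)

    # Largest subset size for which every bucket still has enough positions.
    total = min(len(by_phase[p]) / share
                for p, share in enumerate(target) if share > 0)

    chosen = []
    for p, share in enumerate(target):
        want = min(int(total * share), len(by_phase[p]))
        chosen.extend(rng.sample(by_phase[p], want))
    return chosen
```

With three buckets of 100 positions each and a target of 50% / 25% / 25%, this keeps 100, 50 and 50 positions. The scarcest bucket relative to its target share limits the size of the resulting collection, which is one reason an adjusted set ends up smaller than the original.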

Phase distribution example (after adjusting)

3000_8_DP.epd
0.0130, 0.0109, 0.0692, 0.0342, 0.1192, 0.0585, 0.1057, 0.0363,
0.0707, 0.0285, 0.0480, 0.0193, 0.0344, 0.0149, 0.0292, 0.0126,
0.0235, 0.0109, 0.0223, 0.0125, 0.0440, 0.0114, 0.0758, 0.0181,
0.0770,

quiet-labeled.epd
0.0156, 0.0212, 0.0796, 0.0416, 0.1444, 0.0614, 0.1056, 0.0361,
0.0708, 0.0284, 0.0481, 0.0194, 0.0345, 0.0148, 0.0292, 0.0126,
0.0236, 0.0109, 0.0222, 0.0125, 0.0297, 0.0114, 0.0410, 0.0182,
0.0671,

So the phase distribution is already factored in. The conclusion and the effect are pretty clear too (as already mentioned).
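
As a cross-check, the average phase (APHS) can be recomputed from the two distributions above. The sketch assumes that the 25 values are the shares of phase 0 to 24 in that order; that ordering is an assumption, although the results agreeing with the APHS column of the adjusted table supports it.

```python
# Phase shares copied from the two blocks above; index = phase value 0..24
# (the index-to-phase mapping is an assumption).
ADJUSTED_3000_8_DP = [
    0.0130, 0.0109, 0.0692, 0.0342, 0.1192, 0.0585, 0.1057, 0.0363,
    0.0707, 0.0285, 0.0480, 0.0193, 0.0344, 0.0149, 0.0292, 0.0126,
    0.0235, 0.0109, 0.0223, 0.0125, 0.0440, 0.0114, 0.0758, 0.0181,
    0.0770,
]
QUIET_LABELED = [
    0.0156, 0.0212, 0.0796, 0.0416, 0.1444, 0.0614, 0.1056, 0.0361,
    0.0708, 0.0284, 0.0481, 0.0194, 0.0345, 0.0148, 0.0292, 0.0126,
    0.0236, 0.0109, 0.0222, 0.0125, 0.0297, 0.0114, 0.0410, 0.0182,
    0.0671,
]

def average_phase(shares):
    """Mean phase of a histogram whose i-th entry is the share of phase i."""
    weighted = sum(phase * share for phase, share in enumerate(shares))
    return weighted / sum(shares)  # divide by the sum to absorb rounding

print(round(average_phase(ADJUSTED_3000_8_DP), 2))  # 10.88
print(round(average_phase(QUIET_LABELED), 2))       # 9.76
```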

The reason for the low MSE in the MG phases (in quiet-labeled.epd) needs to be explored further.

MSE per game phase

3000_8_DP.epd
0.0936 0.0774 0.0753 0.0728 0.0744 0.0666 0.0947 0.0935
0.1053 0.1065 0.1245 0.1256 0.1399 0.1378 0.1499 0.1561
0.1549 0.1937 0.1734 0.2138 0.1884 0.2245 0.1934 0.2467
0.1957

quiet-labeled.epd
0.0994 0.0925 0.0492 0.0493 0.0486 0.0470 0.0559 0.0457
0.0604 0.0516 0.0707 0.0559 0.0829 0.0625 0.0912 0.0679
0.0890 0.0697 0.0941 0.0593 0.0993 0.0557 0.1220 0.0300
0.1247

Of course, the gap in total average error (roughly 0.04 versus 0.08) is mainly due to the MSEs in the MG phases.
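
For readers who want to reproduce a table like this, here is a rough sketch of an MSE-per-phase computation. It assumes the usual Texel-style mapping from a centipawn score to an expected result, 1 / (1 + 10^(-K * score / 400)); the exact sigmoid, the scaling constant 400 and the sample layout `(score_cp, result, phase)` are assumptions and may differ from what was used for the numbers in this post.

```python
def win_probability(score_cp, k):
    """Texel-style sigmoid: expected result (0..1) for a centipawn score."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def mse_per_phase(samples, k, n_phases=25):
    """Mean squared error per phase bucket.

    samples: iterable of (score_cp, result, phase) with result in {0, 0.5, 1}
    Returns a list of length n_phases; buckets without samples hold None.
    """
    sums = [0.0] * n_phases
    counts = [0] * n_phases
    for score_cp, result, phase in samples:
        err = result - win_probability(score_cp, k)
        sums[phase] += err * err
        counts[phase] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Tiny hand-made example: a score of 0 always predicts 0.5.
demo = [(0, 0.5, 0), (0, 1.0, 0), (0, 0.0, 24)]
print(mse_per_phase(demo, k=0.64)[0])   # 0.125
print(mse_per_phase(demo, k=0.64)[24])  # 0.25
```

Splitting the error this way makes it easy to see which phases dominate the total, which is the comparison the tables above are meant to support.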

To have everything in one post, I will repeat the relevant tables here.

Adjusted properties

3200_8_DP.epd      / CNT =  415968 / K = 0.683935950 / MSE = 0.0869939938770631 / DR = 0.275 / APHS = 11.06
3000_8_DP.epd      / CNT = 1321544 / K = 0.645552504 / MSE = 0.0876008663539680 / DR = 0.275 / APHS = 10.88
3000_8_P.epd       / CNT = 1856599 / K = 0.436934197 / MSE = 0.0744345800020777 / DR = 0.476 / APHS = 10.44
quiet-labeled.epd  / CNT =  750000 / K = 0.643410690 / MSE = 0.0465274694383763 / DR = 0.275 / APHS =  9.76

Original collections

       3400_8.epd / CNT =   256400 / K = 0.31087369  / MSE = 0.0771616689121431 / DR = 0.647 / APHS = 18.23
       3300_8.epd / CNT =   520624 / K = 0.366286730 / MSE = 0.0898464623713863 / DR = 0.582 / APHS = 18.40
       3200_8.epd / CNT =  1149832 / K = 0.416795321 / MSE = 0.1022347681140613 / DR = 0.515 / APHS = 18.52
       3100_8.epd / CNT =  2102367 / K = 0.445591220 / MSE = 0.1104224481755210 / DR = 0.469 / APHS = 18.54
       3000_8.epd / CNT =  3229357 / K = 0.450422900 / MSE = 0.1154083821281762 / DR = 0.445 / APHS = 18.52
       2800_8.epd / CNT =  7872226 / K = 0.497679208 / MSE = 0.1025738023441199 / DR = 0.434 / APHS = 16.40
       2500_8.epd / CNT = 12746017 / K = 0.488025865 / MSE = 0.1109184451157332 / DR = 0.388 / APHS = 16.35
        64K_5.epd / CNT =   750000 / K = 0.376087857 / MSE = 0.1307579005698916 / DR = 0.232 / APHS = 15.98
  
       corr_8.epd / CNT = 16082330 / K = 0.462039370 / MSE = 0.1223174055758201 / DR = 0.404 / APHS = 20.21
       human8.epd / CNT = 30205714 / K = 0.307684900 / MSE = 0.2005787651347260 / DR = 0.076 / APHS = 19.46
       
       quiet-labeled.epd / CNT =   750000 / K = 0.643410690 / MSE = 0.0465274694383763 / DR = 0.275 / APHS =  9.76