Labeled positions for Texel tuning

Robert Pope
Posts: 563
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Labeled positions for Texel tuning

Post by Robert Pope »

I have a dataset of 725K quiet labeled positions that were generated by Zurichess, and I was wondering if anyone knows of other good datasets that are publicly available. I'm working on generating my own, but I don't think it will be of the same quality. I also tried to follow some of the links on chessprogramming.org, but didn't find much.
AndrewGrant
Posts: 1953
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Labeled positions for Texel tuning

Post by AndrewGrant »

lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Labeled positions for Texel tuning

Post by lithander »

In my devlog I have written a few posts on how I weaned Leorik off the Zurichess set and created my own data from scratch via selfplay, starting with a completely dumb evaluation based on just material values (100, 300, 500 and 900).

It starts here if you're curious: https://talkchess.com/forum3/viewtopic. ... 40#p938897

Eventually I managed to exceed the performance I got out of the Zurichess set. Version 2.3 and later have been tuned on this growing repository of selfplay games (with randomization).

Be aware that these labels are far from perfect: each label is just the outcome of the game, and Stockfish was not involved in the labeling, as it was (afaik) for the Zurichess set. That it worked so well was a surprise to me, to be honest. And it would be interesting to know if it works well for other engines, too.
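For readers unfamiliar with how such outcome labels are used, here is a minimal sketch of the error function commonly minimised in Texel tuning. It is not Leorik's code: the `evaluate` callback, the sample layout and the default value of the scaling constant `k` are assumptions made for illustration, and in practice `k` is usually fitted to the dataset first.

```python
def win_probability(eval_cp: float, k: float = 1.0) -> float:
    """Map a centipawn score (White's point of view) to an expected result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * eval_cp / 400.0))


def tuning_error(samples, evaluate, k: float = 1.0) -> float:
    """Mean squared error between game results and predicted results.

    samples  -- iterable of (position, result), result being 1.0, 0.5 or 0.0 for White
    evaluate -- hypothetical callback returning the static eval in centipawns
    """
    total = 0.0
    count = 0
    for position, result in samples:
        total += (result - win_probability(evaluate(position), k)) ** 2
        count += 1
    return total / count
```

The tuner then adjusts the evaluation weights to make `tuning_error` as small as possible over the whole set of labeled positions.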

So, if you want I can upload a file with a few million labeled positions that I currently use for training version 2.4.X.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
Robert Pope
Posts: 563
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: Labeled positions for Texel tuning

Post by Robert Pope »

Thank you, both. I think Andrew's data has what I was looking for to start, so I'm going to try those while I work on generating my own data.
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Labeled positions for Texel tuning

Post by lithander »

Today I tuned a version of Leorik on the first 5M positions from Andrew's E12.52-1M-D12-Resolved.book (Leorik2.4.3d) to compare it with my current dev version tuned on 3.4M positions from Leorik selfplay games. I didn't change anything else except the dataset.

   # PLAYER           :  RATING  POINTS  PLAYED   (%)
   1 Leorik-2.4.3c    :  2334.7  1442.0    2413    60
   2 Leorik-2.4.3d    :  2265.3   971.0    2413    40
The result is quite decisive in favor of Leorik tuned on Leorik's own selfplay games. Now I'm curious if my dataset would work equally well for other engines. If you want to try it you can download it here: DATA-L26-3443372.zip
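As a sanity check on the table, the score fraction can be converted to an Elo difference with the usual logistic formula. This is only a sketch: the rating tool that produced the table may use a slightly different model, so the result (just under 70 Elo) only needs to be close to the rating gap shown.

```python
import math


def elo_difference(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model.

    Raises an error for a 0% or 100% score, where the difference is unbounded.
    """
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))


# Leorik-2.4.3c scored 1442.0 points out of 2413 games.
print(round(elo_difference(1442.0, 2413), 1))
```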

The format is the same as the one used in the Zurichess set:

8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";
8/Q5pk/5p1p/1P3q2/8/8/3r4/K7 w - - 0 1 c9 "0-1";
4R3/p5p1/5rk1/3B3p/2P3bP/5pP1/PP3P2/K7 b - - 2 1 c9 "1-0";
8/8/1p4Q1/pq4pp/5p2/6k1/8/K7 b - - 5 1 c9 "0-1";
4b3/8/8/1k4P1/pp2p3/3pN3/8/K7 b - - 1 1 c9 "0-1";
5k2/4r3/5p1p/2Q4P/p3b3/P5P1/2P2P2/K7 b - - 2 1 c9 "1-0";
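A minimal sketch of a reader for lines in this format. The function name is made up for illustration, and the `"1/2-1/2"` entry is an assumption for datasets that contain draws; it does not occur in the sample lines above.

```python
RESULT_TO_SCORE = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}


def parse_labeled_line(line: str):
    """Split one 'FEN c9 "result";' line into (fen, score from White's point of view)."""
    fen, separator, label = line.strip().partition(" c9 ")
    if not separator:
        raise ValueError(f"missing c9 field: {line!r}")
    # Drop the trailing semicolon and the surrounding double quotes.
    result = label.rstrip(";").strip().strip('"')
    if result not in RESULT_TO_SCORE:
        raise ValueError(f"unknown result {result!r} in line: {line!r}")
    return fen, RESULT_TO_SCORE[result]


fen, score = parse_labeled_line('8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";')
```

Raising on malformed lines instead of skipping them silently makes it easier to notice when a different dataset deviates from this layout.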
Robert Pope
Posts: 563
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: Labeled positions for Texel tuning

Post by Robert Pope »

Interesting. I'll try yours and report back. I'm embarrassed to note that my FEN/EPD reader code is pretty brittle, and I'm having a little trouble with Andrew's data, so this will be an easy interim project.
Ras
Posts: 2695
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Labeled positions for Texel tuning

Post by Ras »

lithander wrote: Sat Jun 17, 2023 2:18 pm Now I'm curious if my dataset would work equally well for other engines. If you want to try it you can download it here: DATA-L26-3443372.zip
I tried it against the Zurichess training set in my upcoming engine version. With your training data, the score was 49.5% at 10k games.

An interesting detail I noticed is that with your training set, the bishop pair gets a large advantage when there are no pawns. I suspected it might have something to do with Leorik struggling with KBN:K, which is why it would overrate the bishop pair compared to B+N. Here are two colour-swapped games at 10s/game from the same KBN:K starting position. No tablebases were used, and Leorik doesn't win that endgame (unlike KBB:K).

[pgn][White "Leorik 2.4"]
[Black "CT800 V1.45 64 bit"]
[Result "1/2-1/2"]
[Termination "50 moves rule"]
[FEN "8/8/3k4/8/8/8/8/K2N3B w - - 0 1"]
[PlyCount "100"]

1. Kb2 {1350/18} Ke5 {-738/11} 2. Kc3 {1350/19} Kf5 {-741/11} 3. Ne3+ {1350/19} Ke6 {-744/10}
4. Kd4 {1382/16} Kf6 {-751/11} 5. Ke4 {1423/17} Ke6 {-751/11} 6. Nf5 {1465/17} Kf6 {-750/11}
7. Kf4 {1465/18} Ke6 {-755/11} 8. Bc6 {1465/17} Kf6 {-760/11} 9. Bd5 {1472/17} Kg6 {-760/9}
10. Nh4+ {1492/17} Kh5 {-750/11} 11. Nf3 {1500/18} Kg6 {-760/11} 12. Ke5 {1536/17} Kh6 {-769/11}
13. Kf6 {1576/17} Kh7 {-769/11} 14. Nd4 {1536/17} Kh8 {-762/9} 15. Nf5 {1567/17} Kh7 {-759/2}
16. Kf7 {1536/18} Kh8 {-747/2} 17. Nd6 {1536/19} Kh7 {-759/2} 18. Kf6 {1536/18} Kh8 {-769/11}
19. Nf7+ {1536/18} Kg8 {-779/11} 20. Ne5+ {1536/18} Kh7 {-779/11} 21. Nf3 {1565/17} Kh8 {-768/9}
22. Bf7 {1536/18} Kh7 {-757/2} 23. Nd4 {1552/19} Kh8 {-769/11} 24. Bg6 {1565/20} Kg8 {-757/2}
25. Nc6 {1565/19} Kh8 {-774/11} 26. Ne5 {1565/20} Kg8 {-774/10} 27. Nf3 {1567/20} Kh8 {-768/10}
28. Ne1 {1565/17} Kg8 {-757/2} 29. Nd3 {1565/19} Kh8 {-769/10} 30. Bf7 {1565/19} Kh7 {-757/2}
31. Ne5 {1567/18} Kh8 {-771/10} 32. Ng4 {1567/19} Kh7 {-757/2} 33. Bh5 {1565/18} Kh8 {-769/11}
34. Nh6 {1565/19} Kh7 {-757/2} 35. Nf7 {1565/20} Kg8 {-759/2} 36. Ne5 {1565/20} Kh8 {-772/10}
37. Nc4 {1565/19} Kh7 {-771/10} 38. Bg6+ {1565/19} Kh8 {-768/9} 39. Nb6 {1565/20} Kg8 {-768/9}
40. Nd7 {1565/19} Kh8 {-745/2} 41. Bf7 {1565/20} Kh7 {-757/2} 42. Nb6 {1565/19} Kh8 {-769/11}
43. Nc4 {1565/20} Kh7 {-757/2} 44. Ke7 {1565/19} Kh8 {-764/9} 45. Bh5 {1565/18} Kg7 {-761/10}
46. Be8 {1565/19} Kh8 {0/10} 47. Kf7 {1536/22} Kh7 {-759/2} 48. Ne5 {0/59} Kh6 {0/42}
49. Ke6 {0/99} Kg5 {0/42} 50. Bf7 {0/99} Kf4 {0/42} 1/2-1/2[/pgn]

[pgn][White "CT800 V1.45 64 bit"]
[Black "Leorik 2.4"]
[Result "1-0"]
[Termination "checkmate"]
[FEN "8/8/3k4/8/8/8/8/K2N3B w - - 0 1"]
[PlyCount "27"]

1. Kb2 {733/11} Ke5 {-1350/16} 2. Kc3 {739/11} Kf4 {-1350/17} 3. Kd4 {743/11} Kg3 {-1350/18}
4. Ke5 {778/10} Kh3 {-1350/19} 5. Kf4 {799/10} Kh2 {-1404/18} 6. Bf3 {807/11} Kh3 {-M8/18}
7. Ne3 {808/10} Kh4 {-M7/18} 8. Be2 {809/10} Kh3 {-M6/19} 9. Bg4+ {M6/10} Kh2 {-M5/20}
10. Kf3 {M5/3} Kg1 {-M4/21} 11. Kg3 {M4/3} Kh1 {-M3/21} 12. Kf2 {M3/5} Kh2 {-M2/21}
13. Nf1+ {M2/3} Kh1 {-M1/23} 14. Bf3# {M1/3} 1-0[/pgn]
Rasmus Althoff
https://www.ct800.net
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Labeled positions for Texel tuning

Post by lithander »

Ras wrote: Sat Jun 17, 2023 10:06 pm An interesting detail I noticed is that with your training set, the bishop pair gets a large advantage when there are no pawns. I suspected it might have something to do with Leorik struggling with KBN:K, which is why it would overrate the bishop pair compared to B+N. Here are two colour-swapped games at 10s/game from the same KBN:K starting position. No tablebases were used, and Leorik doesn't win that endgame (unlike KBB:K).
Assuming that every engine has its unique strengths and weaknesses, tuning Leorik on Leorik selfplay games teaches it (besides general knowledge of chess) to avoid positions where it is weak and to play towards positions where it is stronger. That could explain why I get good tuning results on my own data, while for you it was a slight regression compared to Zurichess.
Ras wrote: Sat Jun 17, 2023 10:06 pm Leorik struggling with KBN:K
I have no idea how to fix that without using tablebases. Do you have a custom eval for certain endgames? In any case, thanks for sharing that observation, I'll try to look into it!
Ras
Posts: 2695
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Labeled positions for Texel tuning

Post by Ras »

lithander wrote: Sat Jun 17, 2023 11:28 pm Assuming that every engine has its unique strengths and weaknesses, tuning Leorik on Leorik selfplay games teaches it (besides general knowledge of chess) to avoid positions where it is weak and to play towards positions where it is stronger. That could explain why I get good tuning results on my own data, while for you it was a slight regression compared to Zurichess.
Makes sense, and the miracle of the Zurichess set is how generic it is, even though the so-called "quiet" set isn't entirely quiet: there are some 20k positions where the side to move is in check. However, resolving those doesn't change the outcome anyway. I also noticed that with your training set, and Andrew's as well, there's a huge difference between the pawn MG and EG values, like 70/130 or so. The Zurichess data don't lead to this. So I conclude that both Ethereal and Leorik are strong attackers and hence like to sac a pawn for the initiative early in the game.
I have no idea how to fix that without using tablebases. Do you have a custom eval for certain endgames?
Yes, and it's actually very easy. Basically, I use some special PSTs for that endgame, depending on what colour the bishop is, and override the standard eval. See my eval.c, which is really messy, but you will have no trouble getting the idea on that one. Other special endgames I have are e.g. KP:K with a 24k bitbase, and also "wrong bishop" plus rim pawn, and code for KQ:KR. It doesn't really give noticeable Elo, but it was a good pretext to avoid addressing my king safety issues. :)
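To illustrate the idea of a bishop-colour-dependent table for KBN:K, here is a rough sketch. It is not taken from eval.c: the square numbering (a1 = 0, h8 = 63), the function names and the 10-centipawn step are all assumptions, and the table is computed from a corner distance rather than hand-written.

```python
A1, H1, A8, H8 = 0, 7, 56, 63


def square_is_light(sq: int) -> bool:
    """Squares are numbered 0..63 with a1 = 0, b1 = 1, ..., h8 = 63; a1 is dark."""
    return (sq % 8 + sq // 8) % 2 == 1


def chebyshev(a: int, b: int) -> int:
    """Number of king moves between two squares on an empty board."""
    return max(abs(a % 8 - b % 8), abs(a // 8 - b // 8))


def kbnk_bonus(lone_king: int, bishop: int) -> int:
    """Bonus for the attacking side, larger when the lone king is close to
    a corner of the bishop's colour, where the mate can be delivered."""
    corners = (H1, A8) if square_is_light(bishop) else (A1, H8)
    distance = min(chebyshev(lone_king, corner) for corner in corners)
    return 10 * (7 - distance)
```

With such a term replacing the generic king tables in this endgame, the search has a gradient that pulls the defending king towards the correct corner instead of shuffling in the wrong one.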
Robert Pope
Posts: 563
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: Labeled positions for Texel tuning

Post by Robert Pope »

I also noticed that the Leorik data appears to contain only decisive games (1-0 or 0-1) and no draws. I wonder if that affects the results, too.
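A quick way to check the label distribution of such a file is to count the quoted result field on each line. This is a small sketch assuming the Zurichess-style layout shown earlier in the thread; the function name is made up.

```python
from collections import Counter


def label_distribution(lines):
    """Count how often each quoted result label occurs."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split('"')
        # A well-formed line looks like: <FEN> c9 "<result>";
        if len(parts) >= 3:
            counts[parts[-2]] += 1
    return counts


sample = [
    '8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";',
    '8/Q5pk/5p1p/1P3q2/8/8/3r4/K7 w - - 0 1 c9 "0-1";',
    '4R3/p5p1/5rk1/3B3p/2P3bP/5pP1/PP3P2/K7 b - - 2 1 c9 "1-0";',
]
distribution = label_distribution(sample)
```

For a real file, passing the open file object (for example `label_distribution(open(path))`) works the same way, since it iterates line by line.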