Labeled positions for Texel tuning
Moderator: Ras
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
Labeled positions for Texel tuning
I have a dataset of 725K quiet labeled positions that were generated by Zurichess, but I was wondering if anyone knows of other good datasets that are publicly available. I'm working on generating my own, but I don't think it will be of the same quality. I also tried following some of the links on chessprogramming.org, but didn't find much.
-
- Posts: 1953
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
-
- Posts: 915
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Labeled positions for Texel tuning
In my devlog I have written a few posts on how I weaned Leorik off the Zurichess set and created my own data from scratch via selfplay, starting with a completely dumb evaluation based on just material values (100, 300, 500 and 900).
It starts here if you're curious: https://talkchess.com/forum3/viewtopic. ... 40#p938897
Eventually I managed to exceed the performance I got out of the Zurichess set. Version 2.3 and later have been tuned on this growing repository of selfplay games (with randomization).
Be aware that these labels are far from perfect: they are just the outcome of the game, and Stockfish was not involved in the labeling as it was (afaik) for the Zurichess set. That it worked so well was a surprise to me, to be honest. And it would be interesting to know if it works well for other engines, too.
So, if you want I can upload a file with a few million labeled positions that I currently use for training version 2.4.X.
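For readers new to the method, here is roughly how game-outcome labels and a material-only evaluation fit into the usual Texel objective. This is a minimal sketch, assuming the common formulation (mean squared error between the game result and a logistic function of the evaluation in centipawns). The scaling constant `k`, the function names, and the choice to score knights and bishops at 300 each are illustrative assumptions, not taken from Leorik's source.

```python
# Material-only values in centipawns, matching the bootstrap eval described above.
# Assumption: knights and bishops both count 300; kings are not scored.
PIECE_VALUES = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 900, "k": 0}

def material_eval(fen: str) -> int:
    """Material balance from White's point of view, in centipawns."""
    board = fen.split()[0]  # first FEN field is the piece placement
    score = 0
    for ch in board:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            score += value if ch.isupper() else -value
    return score

def expected_score(cp: float, k: float = 1.0) -> float:
    """Logistic mapping from centipawns to an expected result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * cp / 400.0))

def texel_error(samples, k: float = 1.0) -> float:
    """Mean squared error between labels and predictions.

    samples: iterable of (fen, result), result being 1.0, 0.5 or 0.0
    from White's point of view.
    """
    samples = list(samples)
    total = sum((result - expected_score(material_eval(fen), k)) ** 2
                for fen, result in samples)
    return total / len(samples)
```

A tuner would then adjust the evaluation parameters (here only the piece values) to minimize `texel_error` over the whole dataset.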
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
Re: Labeled positions for Texel tuning
Thank you, both. I think Andrew's data has what I was looking for to start, so I'm going to try those while I work on generating my own data.
-
- Posts: 915
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Labeled positions for Texel tuning
Today I tuned a version of Leorik on the first 5M positions from Andrew's E12.52-1M-D12-Resolved.book (Leorik2.4.3d) to compare it with my current dev version tuned on 3.4M positions from Leorik selfplay games. I didn't change anything else except the dataset.
The result is quite decisive in favor of Leorik tuned on Leorik's own selfplay games. Now I'm curious if my dataset would work equally well for other engines. If you want to try it you can download it here: DATA-L26-3443372.zip
#  PLAYER         :  RATING  POINTS  PLAYED  (%)
1  Leorik-2.4.3c  :  2334.7  1442.0    2413   60
2  Leorik-2.4.3d  :  2265.3   971.0    2413   40
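As a sanity check on the table, the points and games columns can be converted into an Elo difference. This is a small sketch assuming the standard logistic Elo model. The rating tool that produced the table may fit ratings slightly differently, so only approximate agreement should be expected.

```python
import math

def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

# Leorik-2.4.3c scored 1442.0 points out of 2413 games.
print(round(elo_diff(1442.0 / 2413), 1))
```

The result comes out a little under 70 Elo, in line with the gap between the two ratings in the table.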
The format is equal to the one used in the Zurichess set:
8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";
8/Q5pk/5p1p/1P3q2/8/8/3r4/K7 w - - 0 1 c9 "0-1";
4R3/p5p1/5rk1/3B3p/2P3bP/5pP1/PP3P2/K7 b - - 2 1 c9 "1-0";
8/8/1p4Q1/pq4pp/5p2/6k1/8/K7 b - - 5 1 c9 "0-1";
4b3/8/8/1k4P1/pp2p3/3pN3/8/K7 b - - 1 1 c9 "0-1";
5k2/4r3/5p1p/2Q4P/p3b3/P5P1/2P2P2/K7 b - - 2 1 c9 "1-0";
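A reader for this record format can be quite small. The following is a sketch that assumes every record has the shape shown above: a FEN, the `c9` opcode, a quoted result, and a closing semicolon. It accepts FENs with either 4 or 6 fields, on the assumption that some files in this format omit the move counters. The names `parse_labeled_epd` and `RESULT_TO_SCORE` are made up for this example.

```python
import re

# One record looks like:  <FEN> c9 "<result>";
LINE_RE = re.compile(r'^(?P<fen>.+?)\s+c9\s+"(?P<result>1-0|0-1|1/2-1/2)";\s*$')

# Result from White's point of view, the usual tuning target.
RESULT_TO_SCORE = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}

def parse_labeled_epd(line: str):
    """Return (fen, score) for one record, or raise ValueError if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised record: {line!r}")
    fen = match.group("fen")
    # Assumption: 6 fields (full FEN) or 4 fields (EPD without move counters).
    if len(fen.split()) not in (4, 6):
        raise ValueError(f"unexpected number of FEN fields: {fen!r}")
    return fen, RESULT_TO_SCORE[match.group("result")]

fen, score = parse_labeled_epd('8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";')
```

Raising on malformed lines instead of skipping them makes it easier to notice when a dataset deviates from the expected layout.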
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
Re: Labeled positions for Texel tuning
Interesting. I'll try yours and report back. I'm embarrassed to note that my FEN/EPD reader code is pretty brittle, and I'm having a little trouble with Andrew's data, so this will be an easy interim project.
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Labeled positions for Texel tuning
lithander wrote: ↑Sat Jun 17, 2023 2:18 pm
> Now I'm curious if my dataset would work equally well for other engines. If you want to try it you can download it here: DATA-L26-3443372.zip
I tried it against the Zurichess training set in my upcoming engine version. With your training data, the score was 49.5% at 10k games.
An interesting detail I noticed is that with your training set, the bishop pair gets a large advantage when there are no pawns. I suspected it might have something to do with Leorik struggling with KBN:K, which is why it would overrate the bishop pair compared to B+N. Here are two colour-swapped games at 10s/game from the same KBN:K starting position, no tablebases used, and Leorik doesn't win that endgame (unlike KBB:K).
[pgn][White "Leorik 2.4"]
[Black "CT800 V1.45 64 bit"]
[Result "1/2-1/2"]
[Termination "50 moves rule"]
[FEN "8/8/3k4/8/8/8/8/K2N3B w - - 0 1"]
[PlyCount "100"]
1. Kb2 {1350/18} Ke5 {-738/11} 2. Kc3 {1350/19} Kf5 {-741/11} 3. Ne3+ {1350/19} Ke6 {-744/10}
4. Kd4 {1382/16} Kf6 {-751/11} 5. Ke4 {1423/17} Ke6 {-751/11} 6. Nf5 {1465/17} Kf6 {-750/11}
7. Kf4 {1465/18} Ke6 {-755/11} 8. Bc6 {1465/17} Kf6 {-760/11} 9. Bd5 {1472/17} Kg6 {-760/9}
10. Nh4+ {1492/17} Kh5 {-750/11} 11. Nf3 {1500/18} Kg6 {-760/11} 12. Ke5 {1536/17} Kh6 {-769/11}
13. Kf6 {1576/17} Kh7 {-769/11} 14. Nd4 {1536/17} Kh8 {-762/9} 15. Nf5 {1567/17} Kh7 {-759/2}
16. Kf7 {1536/18} Kh8 {-747/2} 17. Nd6 {1536/19} Kh7 {-759/2} 18. Kf6 {1536/18} Kh8 {-769/11}
19. Nf7+ {1536/18} Kg8 {-779/11} 20. Ne5+ {1536/18} Kh7 {-779/11} 21. Nf3 {1565/17} Kh8 {-768/9}
22. Bf7 {1536/18} Kh7 {-757/2} 23. Nd4 {1552/19} Kh8 {-769/11} 24. Bg6 {1565/20} Kg8 {-757/2}
25. Nc6 {1565/19} Kh8 {-774/11} 26. Ne5 {1565/20} Kg8 {-774/10} 27. Nf3 {1567/20} Kh8 {-768/10}
28. Ne1 {1565/17} Kg8 {-757/2} 29. Nd3 {1565/19} Kh8 {-769/10} 30. Bf7 {1565/19} Kh7 {-757/2}
31. Ne5 {1567/18} Kh8 {-771/10} 32. Ng4 {1567/19} Kh7 {-757/2} 33. Bh5 {1565/18} Kh8 {-769/11}
34. Nh6 {1565/19} Kh7 {-757/2} 35. Nf7 {1565/20} Kg8 {-759/2} 36. Ne5 {1565/20} Kh8 {-772/10}
37. Nc4 {1565/19} Kh7 {-771/10} 38. Bg6+ {1565/19} Kh8 {-768/9} 39. Nb6 {1565/20} Kg8 {-768/9}
40. Nd7 {1565/19} Kh8 {-745/2} 41. Bf7 {1565/20} Kh7 {-757/2} 42. Nb6 {1565/19} Kh8 {-769/11}
43. Nc4 {1565/20} Kh7 {-757/2} 44. Ke7 {1565/19} Kh8 {-764/9} 45. Bh5 {1565/18} Kg7 {-761/10}
46. Be8 {1565/19} Kh8 {0/10} 47. Kf7 {1536/22} Kh7 {-759/2} 48. Ne5 {0/59} Kh6 {0/42}
49. Ke6 {0/99} Kg5 {0/42} 50. Bf7 {0/99} Kf4 {0/42} 1/2-1/2[/pgn]
[pgn][White "CT800 V1.45 64 bit"]
[Black "Leorik 2.4"]
[Result "1-0"]
[Termination "checkmate"]
[FEN "8/8/3k4/8/8/8/8/K2N3B w - - 0 1"]
[PlyCount "27"]
1. Kb2 {733/11} Ke5 {-1350/16} 2. Kc3 {739/11} Kf4 {-1350/17} 3. Kd4 {743/11} Kg3 {-1350/18}
4. Ke5 {778/10} Kh3 {-1350/19} 5. Kf4 {799/10} Kh2 {-1404/18} 6. Bf3 {807/11} Kh3 {-M8/18}
7. Ne3 {808/10} Kh4 {-M7/18} 8. Be2 {809/10} Kh3 {-M6/19} 9. Bg4+ {M6/10} Kh2 {-M5/20}
10. Kf3 {M5/3} Kg1 {-M4/21} 11. Kg3 {M4/3} Kh1 {-M3/21} 12. Kf2 {M3/5} Kh2 {-M2/21}
13. Nf1+ {M2/3} Kh1 {-M1/23} 14. Bf3# {M1/3} 1-0[/pgn]
Rasmus Althoff
https://www.ct800.net
-
- Posts: 915
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Labeled positions for Texel tuning
Ras wrote: ↑Sat Jun 17, 2023 10:06 pm
> An interesting detail I noticed is that with your training set, the bishop pair gets a large advantage when there are no pawns. I suspected it might have something to do with Leorik struggling with KBN:K, which is why it would overrate the bishop pair compared to B+N. Here are two colour-swapped games at 10s/game from the same KBN:K starting position, no tablebases used, and Leorik doesn't win that endgame (unlike KBB:K).
Assuming that every engine has its unique strengths and weaknesses, tuning Leorik on Leorik selfplay games teaches it (besides general knowledge of chess) to avoid positions where it is weak and to play towards positions where it is stronger. That could explain why I get good tuning results on my own data, while for you it was a slight regression compared to Zurichess.
I have no idea how to fix that without using tablebases. Do you have a custom eval for certain endgames? In any case, thanks for sharing that observation, I'll try to look into it!
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Labeled positions for Texel tuning
lithander wrote: ↑Sat Jun 17, 2023 11:28 pm
> Assuming that every engine has its unique strengths and weaknesses, tuning Leorik on Leorik selfplay games teaches it (besides general knowledge of chess) to avoid positions where it is weak and to play towards positions where it is stronger. That could explain why I get good tuning results on my own data, while for you it was a slight regression compared to Zurichess.
Makes sense, and the miracle of the Zurichess set is how generic it is, even though the so-called "quiet" set isn't actually quiet: there are some 20k positions where the side to move is in check. However, resolving that doesn't change the outcome anyway. I also noticed that with your training set, and Andrew's as well, there's a huge difference in pawn MG/EG value, like 70/130 or so. The Zurichess data don't lead to this. So I conclude that both Ethereal and Leorik are strong attackers and hence like to sac a pawn for the initiative early in the game.
> I have no idea how to fix that without using tablebases. Do you have a custom eval for certain endgames?
Yes, and it's actually very easy. Basically some special PSTs for that endgame, depending on what colour the bishop is, which override the standard eval. See my eval.c, which is really messy, but you will have no trouble getting the idea on that one. Other special endgames I have are e.g. KP:K with a 24k bitbase, and also "wrong bishop" plus rim pawn, and code for KQ:KR. Doesn't really give noticeable Elo, but was a good pretext to avoid addressing my king safety issues.
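To illustrate the idea of a KBN:K override, here is a rough sketch. It is not CT800's eval.c: instead of the special piece-square tables described above, it uses plain distance formulas, which express the same goal of driving the lone king into a corner of the bishop's colour. The square numbering (0 = a1, 63 = h8), the function names and all weights are assumptions made up for this example.

```python
A1, H1, A8, H8 = 0, 7, 56, 63  # squares numbered 0 = a1 ... 63 = h8

def is_light(sq: int) -> bool:
    """a1 (file 0, rank 0) is dark, so an odd file+rank sum means a light square."""
    return (sq % 8 + sq // 8) % 2 == 1

def manhattan(a: int, b: int) -> int:
    """File distance plus rank distance between two squares."""
    return abs(a % 8 - b % 8) + abs(a // 8 - b // 8)

def chebyshev(a: int, b: int) -> int:
    """Number of king moves between two squares on an empty board."""
    return max(abs(a % 8 - b % 8), abs(a // 8 - b // 8))

def kbnk_eval(strong_king: int, weak_king: int, bishop: int) -> int:
    """Score in centipawns for the side that has bishop and knight."""
    # Mate can only be forced in a corner the bishop can attack.
    corners = (H1, A8) if is_light(bishop) else (A1, H8)
    corner_dist = min(manhattan(weak_king, c) for c in corners)
    king_dist = chebyshev(strong_king, weak_king)
    # Hypothetical weights: reward a lone king close to a mating corner
    # and an attacking king that stays close to it.
    return 1000 + 10 * (14 - corner_dist) + 5 * (7 - king_dist)
```

With a light-squared bishop, as in the games above, this scores a lone king on h1 higher than one on a1, so the search is nudged towards the corner where mate is possible.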

Rasmus Althoff
https://www.ct800.net
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
Re: Labeled positions for Texel tuning
I also noticed that the Leorik data seems to contain only decisive games: 1-0 or 0-1, no draws. I wonder if that affects the results, too.
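One quick way to check this on a full file is to count the labels. This is a sketch that assumes the label is the last double-quoted field on each line, as in the records posted earlier in the thread. The function name is made up, and the three sample lines only demonstrate the call. They say nothing about the distribution in the complete dataset.

```python
from collections import Counter

def result_histogram(lines):
    """Count game-result labels in Zurichess-style records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The label sits between the last pair of double quotes.
        label = line.rsplit('"', 2)[1]
        counts[label] += 1
    return counts

sample = [
    '8/8/8/8/2q4p/6k1/8/K7 b - - 1 1 c9 "0-1";',
    '8/Q5pk/5p1p/1P3q2/8/8/3r4/K7 w - - 0 1 c9 "0-1";',
    '4R3/p5p1/5rk1/3B3p/2P3bP/5pP1/PP3P2/K7 b - - 2 1 c9 "1-0";',
]
histogram = result_histogram(sample)
```

If `histogram["1/2-1/2"]` is zero for the whole file, the dataset really does contain no draws.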