Bayesian Elo ratings — 17 PGN files combined
8500 games loaded, 10 players rated
Rank  Name                  Elo   ±  Games  Score  Oppo  Draws
--------------------------------------------------------------
1     Leaf_v260410-1.6e6g  +345  18   1000    55%  +313    36%
2     Leaf_vclassic_eval   +336  12   3500    72%  +116    18%
3     Leaf_v260410-1.5e6g  +289  15   1500    48%  +308    25%
4     Leaf_v260410-8e5g    +244  15   1500    48%  +261    28%
5     Leaf_v260410-5e5g    +158  15   1500    42%  +222    25%
6     Leaf_v260410-2e5g     +88  16   2000    57%   -42    14%
7     Leaf_v260410-5e4g     -50  18   2000    54%  -112    10%
8     Leaf_v260410-1e4g    -259  22   1500    38%  -109     9%
9     Leaf_v260410-0g      -539  25    500    62%  -612    41%
10    Leaf_vmaterial_eval  -612  19   2000    14%  -190    14%
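As a rough way to read the trend in the table, here is a minimal sketch that fits Elo against log10 of the number of training games. It assumes the name suffixes (e.g. "5e4g") are training-game counts, it skips the 0g entry because log10(0) is undefined, and it ignores the error bars, so treat the result as a ballpark only. The function name fit_elo_per_decade is made up for this illustration.

```python
import math

# Ratings copied from the table. The training-game counts are an assumption:
# they are read off the name suffixes (e.g. "5e4g" -> 5e4 games).
ratings = {
    1e4: -259,
    5e4: -50,
    2e5: 88,
    5e5: 158,
    8e5: 244,
    1.5e6: 289,
    1.6e6: 345,
}


def fit_elo_per_decade(points):
    """Ordinary least-squares fit of Elo = intercept + slope * log10(games)."""
    xs = [math.log10(games) for games in points]
    ys = list(points.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_elo_per_decade(ratings)
print(f"about {slope:.0f} Elo per tenfold increase in training games")
```

The slope is in Elo per tenfold increase in training games. A straight line in log10(games) is only a convenient description of these seven points, not a claim that the trend continues.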
A few observations:
1) It will probably take an order of magnitude (or more) more games to make significantly more progress, regardless of the other changes I make. That is not necessarily a problem, other than time, but this is not a learning technique that yields obvious gains in a few hundred or a few thousand games (except very early in the learning process).
2) Experience-based learning like this is time-consuming to optimize, as one cannot use a bank of previously played games to test learning parameters. I need to think a bit about this and come up with a strategy for moving forward. Possibilities include:
(a) Deeper games (depth = 8, 10, or more, or timed games, although timing adds noise). It will take time before I can tell whether they are the right step.
(b) Changing up the opponents so that they play different lines, exposing the TDLeaf learning to a wider variety of positions.
(c) Adjusting the learning parameters -- currently using AdamW for this, so not a lot to adjust, but there are still several choices of learning rates for various parts of the net, batching before initiating a learning set, merging learned values from multiple threads, etc.
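For anyone following along who has not seen the method, here is a minimal sketch of the textbook TDLeaf(lambda) update, not my actual code. It assumes leaf_evals holds the evaluation of the principal-variation leaf for each position in one game, from one side's point of view and on a bounded scale, with the game result as the last entry, and that leaf_grads holds the gradient of each leaf evaluation with respect to the weights. It uses a plain gradient step rather than AdamW to keep it short, and the function names and default values are made up for the illustration.

```python
def tdleaf_error_sums(leaf_evals, lam=0.7):
    """For each position t, return sum over j >= t of lam**(j - t) * d_j,
    where d_j = leaf_evals[j + 1] - leaf_evals[j]."""
    diffs = [leaf_evals[j + 1] - leaf_evals[j] for j in range(len(leaf_evals) - 1)]
    sums = [0.0] * len(diffs)
    running = 0.0
    # Walk backwards so each sum reuses the discounted sum that follows it.
    for t in reversed(range(len(diffs))):
        running = diffs[t] + lam * running
        sums[t] = running
    return sums


def tdleaf_update(weights, leaf_grads, leaf_evals, lr=0.01, lam=0.7):
    """Plain gradient-style step: w += lr * sum_t grad_t * error_sum_t.
    (Hypothetical helper; a real trainer would hand the accumulated
    gradient to its optimizer instead.)"""
    sums = tdleaf_error_sums(leaf_evals, lam)
    new_weights = list(weights)
    for grad, err in zip(leaf_grads, sums):
        for i, g in enumerate(grad):
            new_weights[i] += lr * g * err
    return new_weights


# Toy game: three positions followed by the final result (a win = 1.0).
evals = [0.0, 0.1, 0.3, 1.0]
grads = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(tdleaf_error_sums(evals, lam=0.5))
print(tdleaf_update([0.0, 0.0], grads, evals, lr=0.1, lam=0.5))
```

With lam = 1 each position is pulled toward the final result, and with lam = 0 only toward the next position's leaf value, so lam is one more learning parameter alongside the ones listed in (c).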
No real question here, except to see whether anyone has experience using TDLeaf learning to adjust NNUE weights and biases through experiential play rather than offline learning from a large database of games. Any wisdom on areas to explore next is welcome.
- Dan