Re: best way to determine elos of a group
Posted: Thu Aug 08, 2019 6:56 pm
Things are going pretty well now - the piece values were learned long ago, and it is actually starting to develop a preference for well-known openings.
It hasn't learned to castle early or much about king safety yet -- training is still at <250k games, so there is a lot to go.
I have one question: when should I drop the learning rate? It is still using LR=0.15, and as you can see in the plot the loss is zigzagging a lot, though it has been like that from the beginning anyway. The zigzagging is most likely because I am training a net every 512 games.
I think there is a school of thought that keeping the learning rate as high as possible and dropping it in a stepwise fashion gives better generalization than using a gradually decaying schedule such as exponential decay. This all sounds like a black art to me.
I have not done a hyper-parameter study to determine the right learning rate, but often the right one is the one that gives a smooth reduction in the loss.
Mine is clearly zigzagging a lot, although it has been from the very beginning, so I am not sure when to drop it and by how much (maybe by 10x like A0?).
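To make the two options concrete, here is a minimal sketch of both schedules. The milestone step counts and decay rate are made-up placeholders, not values from any engine; the stepwise version just multiplies the base rate by 0.1 each time training passes a milestone (the A0-style 10x drop), while the exponential version shrinks it a little every step.

```python
# Hypothetical learning-rate schedules; milestones and decay are illustrative only.

def step_lr(base_lr, step, milestones, factor=0.1):
    """Keep the LR as high as possible, then drop it by `factor`
    each time `step` passes one of the `milestones` (stepwise schedule)."""
    drops = sum(1 for m in milestones if step >= m)
    return base_lr * factor ** drops

def exp_lr(base_lr, step, decay=0.99999):
    """Gradually decaying alternative: multiply by `decay` every step."""
    return base_lr * decay ** step

# Starting from LR=0.15 with hypothetical drops at 200k and 400k steps:
print(step_lr(0.15, 100_000, [200_000, 400_000]))  # still 0.15
print(step_lr(0.15, 250_000, [200_000, 400_000]))  # dropped 10x to 0.015
```

The stepwise schedule holds the loss at a high, noisy plateau between drops, which is why the usual advice is to drop only once the loss has clearly flattened at the current rate.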
Daniel
Code:
# rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
# [st = 11114ms, mt = 29250ms , hply = 0 , moves_left 10]
63 8 111 2603 e2-e3 Nb8-c6 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3
64 8 223 6024 c2-c4 Nb8-c6 Nb1-c3 e7-e5 e2-e4 Ng8-f6 Ng1-f3 Bf8-e7 Bf1-e2
65 6 334 10070 d2-d4 d7-d5 f2-f3 Nb8-c6 Nb1-c3 Ng8-f6 e2-e4 d5xe4 f3xe4 Qd8-d7
66 6 446 14134 d2-d4 Ng8-f6 Ng1-f3 d7-d5 Nb1-c3 Nb8-c6 h2-h4 h7-h5 Nf3-g5 Qd8-d7 f2-f4
67 4 558 18339 Nb1-c3 e7-e5 e2-e4 Nb8-c6 Ng1-f3 Ng8-f6 Bf1-e2 Bf8-e7 Rh1-f1 Rh8-f8 h2-h3
68 4 670 22748 Ng1-f3 d7-d5 c2-c3 Nb8-c6 d2-d4 Ng8-f6 b2-b4 Nf6-e4 b4-b5 Nc6-a5
69 4 782 27090 d2-d4 d7-d5 g2-g3 Ng8-f6 Ng1-f3 Nb8-c6 Nb1-c3 h7-h5 h2-h4 Nf6-g4
70 4 894 31872 Ng1-f3 c7-c5 Nb1-c3 Nb8-c6 e2-e4 Ng8-f6 e4-e5 Nf6-d5 Nc3xd5
71 5 1005 36133 Ng1-f3 c7-c5 Nb1-c3 Nb8-c6 e2-e4 Ng8-f6 e4-e5 Nf6-d5 Nc3xd5
72 5 1065 38332 d2-d4 d7-d5 g2-g3 Ng8-f6 Ng1-f3 Nb8-c6 Nb1-c3 h7-h5 h2-h4 Nf6-g4 Bf1-h3
# Move Value=(V,P,V+P) Policy Visits PV
#----------------------------------------------------------------------------------
# 1 (0.501,0.500,0.501) 17.15 7185 Nb1-c3 d7-d5 Ng1-f3 Nb8-c6 d2-d4 Ng8-f6 h2-h4 h7-h5 Nf3-g5 Qd8-d7 f2-f4
# 2 (0.507,0.500,0.504) 15.32 10871 d2-d4 d7-d5 g2-g3 Ng8-f6 Ng1-f3 Nb8-c6 Nb1-c3 h7-h5 h2-h4 Nf6-g4 Bf1-h3
# 3 (0.500,0.500,0.500) 11.57 9395 Ng1-f3 d7-d5 c2-c3 Nb8-c6 d2-d4 Ng8-f6 b2-b4 Nf6-e4 b4-b5 Nc6-a5
# 4 (0.497,0.500,0.497) 6.20 1600 e2-e4 Nb8-c6 Nb1-c3 e7-e5 Ng1-f3 Ng8-f6 Bf1-e2 Bf8-e7 Rh1-f1 Rh8-f8 h2-h3
# 5 (0.498,0.500,0.498) 3.64 754 Ng1-h3 Nb8-c6 Nb1-c3 d7-d5 d2-d4 Ng8-f6 f2-f4 Nf6-e4 Nc3xe4 d5xe4
# 6 (0.486,0.500,0.486) 3.60 428 Nb1-a3 Nb8-c6 Ng1-f3 d7-d5 d2-d4 Ng8-f6 Qd1-d2 Nf6-e4 Qd2-e3
# 7 (0.499,0.500,0.499) 3.37 732 b2-b4 d7-d5 d2-d4 Nb8-c6 b4-b5 Nc6-a5 Ng1-f3 Ng8-f6 Nf3-e5 Qd8-d6 Nb1-c3
# 8 (0.500,0.500,0.500) 3.29 740 c2-c4 Nb8-c6 Nb1-c3 e7-e5 e2-e4 Ng8-f6 Ng1-f3 Bf8-e7 Bf1-e2 Rh8-f8 Rh1-f1
# 9 (0.491,0.500,0.491) 3.21 450 d2-d3 d7-d5 Nb1-c3 Nb8-c6 Ng1-f3 Ng8-f6 h2-h4 d5-d4
# 10 (0.497,0.500,0.497) 3.21 803 c2-c3 e7-e5 e2-e4 Nb8-c6 Ng1-f3 Ng8-f6 Bf1-e2 Bf8-e7 d2-d4
# 11 (0.494,0.500,0.494) 3.16 512 f2-f4 Nb8-c6 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 Nf6-e4
# 12 (0.491,0.500,0.491) 3.16 500 e2-e3 Nb8-c6 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 Nf6-e4 Bf1-d3
# 13 (0.495,0.500,0.495) 3.08 536 g2-g3 Nb8-c6 Nb1-c3 Ng8-f6 d2-d4 d7-d5 Ng1-f3 h7-h5 h2-h4 Nf6-g4
# 14 (0.486,0.500,0.486) 3.03 361 a2-a4 Nb8-c6 Nb1-c3 Ng8-f6 Ng1-f3 d7-d5 d2-d4 h7-h5
# 15 (0.492,0.500,0.492) 2.98 681 h2-h3 Nb8-c6 Nb1-c3 Ng8-f6 d2-d4 d7-d5 Ng1-f3 h7-h5 Qd1-d2 Qd8-d7
# 16 (0.499,0.500,0.499) 2.94 638 f2-f3 Nb8-c6 e2-e4 e7-e5 Nb1-c3 Ng8-f6 Qd1-e2 Bf8-e7 Qe2-d3 Nc6-b4
# 17 (0.496,0.500,0.496) 2.92 523 g2-g4 Nb8-c6 Nb1-c3 e7-e5 e2-e4 Ng8-f6 g4-g5 Nf6-h5 Ng1-f3
# 18 (0.501,0.500,0.500) 2.88 695 b2-b3 Nb8-c6 Ng1-f3 d7-d5 d2-d4 Ng8-f6 Nb1-c3 h7-h5 Qd1-d2
# 19 (0.499,0.500,0.499) 2.70 569 a2-a3 Nb8-c6 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 h7-h5 h2-h4
# 20 (0.488,0.500,0.488) 2.61 358 h2-h4 Nb8-c6 Nb1-c3 Ng8-f6 d2-d4 d7-d5 Ng1-f3 h7-h5 Nf3-g5
# nodes = 257747 <0% qnodes> time = 10703ms nps = 24081 eps = 0 nneps = 3598
# Tree: nodes = 51647 depth = 12 pps = 3596 visits = 38332
# qsearch_calls = 0 search_calls = 0
move d2d4
Bye Bye