Thanks for the test! This Stoofvlees seems too strong for Scorpio, or Scorpio doesn't perform as well on the GTX 1650.
I have always tested on a Volta, so I had no idea of its strength on anything else.
Btw, how much nps is Leela getting on the GTX 1650? The 2 knps that Scorpio is getting seems too low.
I may want to try INT8, which may double nps. From my tests at equal nps, it is only 38 Elo weaker
than half precision (FP16), so if nps doubles it may end up performing better than half precision.
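To make the INT8 trade-off concrete, here is a rough sketch of the break-even arithmetic. It assumes Elo grows linearly with each doubling of nps, which is only a modelling assumption, and the `elo_per_doubling` values are hypothetical placeholders, not measurements. Only the 38 Elo penalty comes from the tests mentioned above.

```python
import math

def net_elo(int8_penalty_elo: float, speedup: float, elo_per_doubling: float) -> float:
    """Estimated net Elo change when switching from FP16 to INT8.

    int8_penalty_elo: Elo lost at equal nps (38 in the tests quoted above).
    speedup: nps ratio INT8 / FP16 (2.0 if nps really doubles).
    elo_per_doubling: ASSUMED Elo gained per doubling of nps (hypothetical).
    """
    return elo_per_doubling * math.log2(speedup) - int8_penalty_elo

# Hypothetical gains per doubling, purely for illustration.
for gain in (30.0, 38.0, 60.0):
    print(f"elo_per_doubling={gain:.0f}: net {net_elo(38.0, 2.0, gain):+.0f} Elo")
```

Under this model INT8 only pays off if a doubling of nps is worth more than 38 Elo on the hardware in question.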
Edit: One more thing I forgot about my setup. You might want to try delay 1 and compare its speedup with delay 0.
This is most probably needed on a system with few CPU cores (1 or 2) when launching 128 threads like we do here.
Please try this first, before INT8; I think it may double nps.
This is probably a very important factor. I get about a 100x speedup with delay 1 on my system if I restrict
it to 1 core. Unbelievable, I know, but that is how Scorpio is designed: it launches many threads, unlike lc0.
I have always tested using all 32 cores, so I never had to use delay=1. Here I use Linux taskset to restrict it to 1 core.
I don't expect nps to increase by 100x on your systems, but maybe it will double ...
Here is a run with delay=0. The tree search rate is 161 nodes/s (the pps figure in the log; the nps counter reads 1427):
Code:
[07:13 dabdi@hsw221 bin] > taskset -c 0 ./scorpio mt 128 delay 0 go quit
feature done=0
Number of cores 1 of 32
ht 4194304 X 16 = 64.0 MB
eht 524288 X 8 = 8.0 MB
pht 32768 X 24 = 0.8 MB
treeht 1342169600 X 10 = 12799.9 MB
processors [64]
processors [128]
EgbbProbe 4.3 by Daniel Shawul
egbb_cache 4084 X 8216 = 32.0 MB
0 egbbs loaded !
Loading neural network : ../nets-maddex/net-maddex.uff
nn_cache 131072 X 1552 = 194.0 MB
Loading graph on /gpu:0
0. main_input 7168 = 112 8 8
1. policy_head 1858 = 1858 1 1
2. value_head 1 = 1 1 1
Neural network loaded !
loading_time = 8s
# rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
# [st = 11114ms, mt = 29250ms , hply = 0 , moves_left 10]
63 3 366 182 d2-d4 Ng8-f6 c2-c4 e7-e6
64 10 521 559 d2-d4 Ng8-f6 c2-c4 e7-e6 Ng1-f3 d7-d5 Nb1-c3
65 34 690 912 d2-d4 Ng8-f6 c2-c4 e7-e6 Ng1-f3 d7-d5 Nb1-c3 Bf8-e7
66 35 861 1161 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6
67 35 1116 1693 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6 Bc1-e3
68 35 1286 2079 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6 Bc1-e3
69 34 1457 2498 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6 Bc1-e3 e7-e5 Nd4-b3
70 34 1626 2626 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6 Bc1-e3 e7-e5 Nd4-b3
# Move Value=(V,P,V+P) Policy Visits PV
#----------------------------------------------------------------------------------
# 1 (0.547,0.527,0.537) 22.82 804 d2-d4 d7-d5 c2-c4 e7-e6 Nb1-c3 Ng8-f6 Ng1-f3 Bf8-e7 g2-g3 Ke8-g8
# 2 (0.549,0.527,0.538) 22.76 657 e2-e4 c7-c5 Ng1-f3 d7-d6 d2-d4 c5xd4 Nf3xd4 Ng8-f6 Nb1-c3 a7-a6 Bc1-e3 e7-e5 Nd4-b3
# 3 (0.539,0.527,0.533) 9.56 292 c2-c4 e7-e5 Nb1-c3 Ng8-f6 Ng1-f3 Nb8-c6 g2-g3 Bf8-b4 Bf1-g2
# 4 (0.533,0.527,0.530) 9.16 279 Ng1-f3 d7-d5 d2-d4 Ng8-f6 c2-c4 e7-e6 Nb1-c3 Bf8-e7 g2-g3 Ke8-g8
# 5 (0.520,0.527,0.520) 4.41 114 g2-g3 d7-d5 Ng1-f3 c7-c5 Bf1-g2 Ng8-f6 Ke1-g1 Nb8-c6 d2-d4
# 6 (0.524,0.527,0.524) 4.03 107 e2-e3 Ng8-f6 d2-d4 d7-d5 c2-c4 e7-e6 Ng1-f3
# 7 (0.497,0.527,0.497) 3.03 45 b2-b3 e7-e5 Bc1-b2 Nb8-c6 e2-e3 Ng8-f6 Ng1-f3 e5-e4
# 8 (0.510,0.527,0.510) 2.83 59 c2-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 e7-e6 Bc1-f4
# 9 (0.507,0.527,0.507) 2.69 50 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Bc1-f4 a7-a6 e2-e3 e7-e6 Ng1-f3
# 10 (0.496,0.527,0.496) 2.68 38 d2-d3 d7-d5 Ng1-f3 Ng8-f6 g2-g3 c7-c5 Bf1-g2 Nb8-c6
# 11 (0.458,0.527,0.458) 2.46 24 f2-f4 d7-d5 Ng1-f3 Ng8-f6 e2-e3 c7-c5
# 12 (0.473,0.527,0.473) 2.07 25 b2-b4 e7-e5 Bc1-b2 Bf8xb4 Bb2xe5 Ng8-f6 c2-c3 Bb4-e7 e2-e3
# 13 (0.492,0.527,0.492) 1.93 32 h2-h3 d7-d5 d2-d4 c7-c5 e2-e3 Ng8-f6 Ng1-f3 Nb8-c6
# 14 (0.498,0.527,0.498) 1.89 36 a2-a3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 g7-g6 c2-c4 Bf8-g7
# 15 (0.434,0.527,0.434) 1.60 12 Ng1-h3 d7-d5 g2-g3 e7-e5 d2-d4 e5xd4
# 16 (0.473,0.527,0.473) 1.40 17 a2-a4 e7-e5 e2-e4 Ng8-f6 Nb1-c3 Bf8-b4 Ng1-f3
# 17 (0.453,0.527,0.453) 1.26 12 h2-h4 d7-d5 d2-d4 c7-c5 d4xc5 Ng8-f6
# 18 (0.388,0.527,0.388) 1.24 6 g2-g4 d7-d5 h2-h3 e7-e5 Bf1-g2
# 19 (0.418,0.527,0.418) 1.20 8 f2-f3 e7-e5 Nb1-c3 Ng8-f6 d2-d4 e5xd4
# 20 (0.437,0.527,0.437) 0.99 8 Nb1-a3 e7-e5 e2-e3 d7-d5 d2-d4 e5-e4
# nodes = 23219 <0% qnodes> time = 16270ms nps = 1427 eps = 0 nneps = 178
# Tree: nodes = 3580 depth = 12 pps = 161 visits = 2626
# qsearch_calls = 0 search_calls = 0
move e2e4
Bye Bye
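And here is the same run with delay=1. The tree search rate rises to 16532 nodes/s (pps), with the nps counter at 155435:
Code: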
[07:14 dabdi@hsw221 bin] > taskset -c 0 ./scorpio mt 128 delay 1 go quit
feature done=0
Number of cores 1 of 32
ht 4194304 X 16 = 64.0 MB
eht 524288 X 8 = 8.0 MB
pht 32768 X 24 = 0.8 MB
treeht 1342169600 X 10 = 12799.9 MB
processors [64]
processors [128]
EgbbProbe 4.3 by Daniel Shawul
egbb_cache 4084 X 8216 = 32.0 MB
0 egbbs loaded !
Loading neural network : ../nets-maddex/net-maddex.uff
nn_cache 131072 X 1552 = 194.0 MB
Loading graph on /gpu:0
0. main_input 7168 = 112 8 8
1. policy_head 1858 = 1858 1 1
2. value_head 1 = 1 1 1
Neural network loaded !
loading_time = 8s
# rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
# [st = 11114ms, mt = 29250ms , hply = 0 , moves_left 10]
63 36 111 17518 e2-e4 Nb8-c6 d2-d4 d7-d5 e4-e5 Bc8-f5 c2-c3 e7-e6 Nb1-d2 f7-f6 f2-f4 f6xe5 f4xe5
64 36 223 35094 e2-e4 e7-e5 Ng1-f3 Nb8-c6 d2-d4 e5xd4 Nf3xd4 Ng8-f6 Nd4xc6 b7xc6 Bf1-d3 d7-d5 e4xd5 c6xd5 Ke1-g1 Bf8-e7 c2-c4
65 29 334 51851 d2-d4 Ng8-f6 c2-c4 c7-c6 Nb1-c3 d7-d5 Ng1-f3 e7-e6 Bc1-g5 h7-h6 Bg5xf6 Qd8xf6
66 27 446 71390 d2-d4 Ng8-f6 c2-c4 e7-e6 e2-e3 Bf8-e7 Ng1-f3 Ke8-g8 Nb1-c3 d7-d5 b2-b3 c7-c5 d4xc5
67 36 557 90129 e2-e4 e7-e6 d2-d4 d7-d5 Nb1-c3 Ng8-f6 Bc1-g5 d5xe4 Nc3xe4 Bf8-e7 Bg5xf6 Be7xf6 Ng1-f3 Ke8-g8 Qd1-d2 Nb8-d7 Ke1-c1 b7-b6
68 25 669 106481 e2-e4 e7-e6 d2-d4 d7-d5 Nb1-c3 Ng8-f6 Bf1-b5 c7-c6 Bb5-d3 c6-c5 d4xc5 Bf8xc5 Ng1-f3 d5xe4 Nc3xe4 Nf6xe4 Bd3xe4
69 24 781 125734 e2-e4 e7-e6 d2-d4 d7-d5 Nb1-c3 Ng8-f6 Bf1-b5 c7-c6 Bb5-d3 c6-c5 d4xc5 Bf8xc5 Ng1-f3 d5xe4 Nc3xe4 Nf6xe4 Bd3xe4 Qd8xd1 Ke1xd1
70 28 893 143024 d2-d4 Ng8-f6 Ng1-f3 e7-e6 g2-g3 c7-c5 Bf1-g2 c5xd4 Nf3xd4 d7-d5 Ke1-g1 e6-e5 Nd4-b3 Bc8-e6 Nb1-c3
71 28 1004 165996 d2-d4 d7-d5 c2-c4 e7-e6 g2-g3 d5xc4 Bf1-g2 c7-c5 Ng1-f3 Nb8-c6 Qd1-a4 c5xd4 Nf3xd4
72 29 1033 170871 d2-d4 d7-d5 c2-c4 e7-e6 Nb1-c3 a7-a6 c4xd5 e6xd5 Ng1-f3 Ng8-f6 Bc1-g5 Bc8-e6 e2-e3 Nb8-d7 Bf1-d3 Bf8-d6 Ke1-g1
# Move Value=(V,P,V+P) Policy Visits PV
#----------------------------------------------------------------------------------
# 1 (0.543,0.527,0.535) 22.82 73528 d2-d4 d7-d5 c2-c4 e7-e6 Nb1-c3 a7-a6 c4xd5 e6xd5 Ng1-f3 Ng8-f6 Bc1-g5 Bc8-e6 e2-e3 Nb8-d7 Bf1-d3 Bf8-d6 Ke1-g1
# 2 (0.537,0.527,0.532) 22.76 60350 e2-e4 e7-e6 d2-d4 d7-d5 Nb1-c3 Ng8-f6 e4-e5 Nf6-d7 f2-f4 a7-a6 Ng1-f3 c7-c5 Bc1-e3 Nb8-c6 Qd1-d2 Bf8-e7 d4xc5 Nd7xc5 Ke1-c1
# 3 (0.520,0.527,0.520) 9.56 13451 c2-c4 e7-e5 g2-g3 Ng8-f6 Bf1-g2 Bf8-c5 Nb1-c3 Ke8-g8 Ng1-f3 Nb8-c6 Ke1-g1 d7-d6 e2-e3 Rf8-e8
# 4 (0.530,0.527,0.529) 9.16 14483 Ng1-f3 d7-d5 d2-d4 Ng8-f6 e2-e3 c7-c5 c2-c4 c5xd4 e3xd4 g7-g6 Nb1-c3 Bf8-g7
# 5 (0.502,0.527,0.502) 4.41 1386 g2-g3 d7-d5 Ng1-f3 c7-c5 Bf1-g2 Ng8-f6 Ke1-g1 Nb8-c6 d2-d4 c5xd4 Nf3xd4 e7-e5
# 6 (0.512,0.527,0.512) 4.03 1680 e2-e3 e7-e6 c2-c4 d7-d5 Ng1-f3 Ng8-f6 d2-d4 Bf8-e7 Bf1-e2
# 7 (0.475,0.527,0.475) 3.03 802 b2-b3 e7-e5 Bc1-b2 Nb8-c6 e2-e3 Ng8-f6 Ng1-f3 e5-e4 Nf3-d4 Bf8-c5 Nd4xc6 d7xc6 d2-d4
# 8 (0.488,0.527,0.488) 2.83 804 c2-c3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 e7-e6 Bc1-f4 Bf8-d6 e2-e3 Bd6xf4
# 9 (0.495,0.527,0.495) 2.69 803 Nb1-c3 d7-d5 d2-d4 Ng8-f6 Bc1-f4 a7-a6 e2-e3 e7-e6 Ng1-f3 c7-c5 Bf1-e2 Nb8-c6 Ke1-g1
# 10 (0.497,0.527,0.497) 2.68 753 d2-d3 d7-d5 Ng1-f3 Ng8-f6 g2-g3 c7-c5 Bf1-g2 Nb8-c6 Ke1-g1 e7-e5 e2-e4
# 11 (0.468,0.527,0.468) 2.46 413 f2-f4 d7-d5 Ng1-f3 Ng8-f6 e2-e3 c7-c5 b2-b3 g7-g6 Bf1-b5
# 12 (0.475,0.527,0.475) 2.07 383 b2-b4 e7-e5 Bc1-b2 Bf8xb4 Bb2xe5 Ng8-f6 c2-c3 Bb4-e7 e2-e3 Ke8-g8 d2-d4
# 13 (0.495,0.527,0.495) 1.93 516 h2-h3 d7-d5 d2-d4 c7-c5 e2-e3 Ng8-f6 Ng1-f3 Nb8-c6 c2-c4 e7-e6
# 14 (0.504,0.527,0.504) 1.89 627 a2-a3 d7-d5 d2-d4 Ng8-f6 Ng1-f3 g7-g6 c2-c4 Bf8-g7 c4xd5
# 15 (0.431,0.527,0.431) 1.60 177 Ng1-h3 d7-d5 g2-g3 e7-e5 d2-d4 e5xd4 Qd1xd4 Nb8-c6
# 16 (0.475,0.527,0.475) 1.40 258 a2-a4 e7-e5 e2-e4 Ng8-f6 Nb1-c3 Bf8-b4 Ng1-f3 Ke8-g8 Nf3xe5
# 17 (0.449,0.527,0.449) 1.26 168 h2-h4 d7-d5 d2-d4 c7-c5 d4xc5 Ng8-f6 Ng1-f3 e7-e6 Bc1-e3
# 18 (0.375,0.527,0.375) 1.24 91 g2-g4 d7-d5 h2-h3 e7-e5 Bf1-g2 Nb8-c6 Nb1-c3 Ng8-e7
# 19 (0.418,0.527,0.418) 1.20 118 f2-f3 e7-e5 Nb1-c3 Ng8-f6 d2-d4 e5xd4 Qd1xd4 Nb8-c6
# 20 (0.444,0.527,0.444) 0.99 124 Nb1-a3 e7-e5 e2-e3 d7-d5 d2-d4 e5-e4 c2-c4 c7-c6
# nodes = 1608289 <0% qnodes> time = 10347ms nps = 155435 eps = 0 nneps = 16556
# Tree: nodes = 236447 depth = 21 pps = 16532 visits = 170916
# qsearch_calls = 0 search_calls = 0
move d2d4
Bye Bye
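If you want to compare your own two runs the same way, the speedup can be read off the pps field of the "# Tree:" summary line. A minimal sketch, using the two summary lines from the logs above; the helper name `parse_pps` is only illustrative and not part of Scorpio.

```python
import re

# "# Tree:" summary lines copied from the two runs above.
DELAY0 = "# Tree: nodes = 3580 depth = 12 pps = 161 visits = 2626"
DELAY1 = "# Tree: nodes = 236447 depth = 21 pps = 16532 visits = 170916"

def parse_pps(line: str) -> int:
    """Extract the integer after 'pps =' from a summary line."""
    match = re.search(r"\bpps\s*=\s*(\d+)", line)
    if match is None:
        raise ValueError("no pps field in line")
    return int(match.group(1))

speedup = parse_pps(DELAY1) / parse_pps(DELAY0)
print(f"delay=1 vs delay=0 speedup: {speedup:.1f}x")
```

For the logs above this gives a ratio of roughly 100x, which is where the figure quoted earlier comes from.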
regards,
Daniel