Leela Chess Zero 42565 vs Stockfish 140619

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by mwyoung »

Nordlandia wrote: Sun Jun 16, 2019 11:32 pm Then again, the question that arises is whether ponder is a good trade-off for more time?
Exactly.

Is a 6% CPU cost to Stockfish worth it for pondering? You tell me...

The move prediction rate in the current ponder match is 57.9%.

And Stockfish likes it very much.
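One rough way to frame that trade-off, as a sketch and not a measurement: assume each engine thinks about as long as its opponent, that a correct prediction lets the pondering engine reuse all of the opponent's thinking time, and that the 6% figure is a flat slowdown of the search. None of these assumptions come from the match itself, and `effective_time_factor` is a made-up helper name.

```python
def effective_time_factor(hit_rate: float, cpu_cost: float) -> float:
    """Effective thinking time with ponder on, relative to ponder off.

    Assumptions (not measured): the opponent thinks as long as we do, a
    ponder hit reuses all of that time, and cpu_cost is a flat slowdown.
    """
    if not 0.0 <= hit_rate <= 1.0 or not 0.0 <= cpu_cost < 1.0:
        raise ValueError("hit_rate must be in [0, 1] and cpu_cost in [0, 1)")
    return (1.0 + hit_rate) * (1.0 - cpu_cost)


if __name__ == "__main__":
    # Figures quoted in the post: 57.9% prediction rate, 6% CPU cost.
    print(f"{effective_time_factor(0.579, 0.06):.2f}x")
```

Under this toy model the quoted figures give roughly 1.48 times the ponder-off thinking time, so pondering pays off whenever (1 + hit rate) x (1 - cost) stays above 1.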
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by mwyoung »

hgm wrote: Sun Jun 16, 2019 11:51 pm Pondering makes sense when playing against Leela on a many-core machine, as Leela wouldn't use many CPU threads while thinking, and all other threads would then go to waste. Likewise the GPUs would be idle during Stockfish' turn when Leela is not pondering.

If Leela needs two unshared cores, you can just set an affinity for it for hyper-threads 0-3. You can then set affinity for Stockfish to the remaining HT. That way they won't compete for cores. They might still compete for memory bandwidth, though; not sure how important that is for Leela.

I don't see why Leela couldn't run its two threads on the same physical core. That could of course be a disadvantage compared to running 2 threads on 2 physical cores, but that also holds for Stockfish' threads sharing physical cores. Yet some people claim that hyper-threading is beneficial compared to running 1 thread per physical core, and testing both under conditions with 2 active HT per core does not seem particularly unfair. It is just like all HT are somewhat slower physical cores. This doesn't involve any scheduling, so it should not be noisy.

Of course if you don't also reserve some cores for the OS, that would cause noise.
Ding ding ding, a man with a brain. Happy Father's Day. And there are even more effective ways to configure this that also take care of the operating system.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by Laskos »

hgm wrote: Sun Jun 16, 2019 11:51 pm Pondering makes sense when playing against Leela on a many-core machine, as Leela wouldn't use many CPU threads while thinking, and all other threads would then go to waste. Likewise the GPUs would be idle during Stockfish' turn when Leela is not pondering.

If Leela needs two unshared cores, you can just set an affinity for it for hyper-threads 0-3. You can then set affinity for Stockfish to the remaining HT. That way they won't compete for cores. They might still compete for memory bandwidth, though; not sure how important that is for Leela.

I don't see why Leela couldn't run its two threads on the same physical core. That could of course be a disadvantage compared to running 2 threads on 2 physical cores, but that also holds for Stockfish' threads sharing physical cores. Yet some people claim that hyper-threading is beneficial compared to running 1 thread per physical core, and testing both under conditions with 2 active HT per core does not seem particularly unfair. It is just like all HT are somewhat slower physical cores. This doesn't involve any scheduling, so it should not be noisy.

Of course if you don't also reserve some cores for the OS, that would cause noise.
This creature is running 34 threads on a 16-core machine, and it IS heavily affecting the engines, Leela much more than SF. I don't know, I have to perform a plain experiment to show in black and white what is obvious:

4-core i7 CPU, 8 logical cores.
Leela on an RTX 2070 GPU using 2 threads.

1/ Leela to 1 million nodes on 2 threads (using one of the latest nets) without the interference from SF:

info depth 16 seldepth 52 time 35832 nodes 1021332 score cp 27 hashfull 481 nps 28503
info depth 16 seldepth 51 time 35866 nodes 1017115 score cp 27 hashfull 481 nps 28358
info depth 16 seldepth 52 time 36044 nodes 1027829 score cp 27 hashfull 483 nps 28515

Very high stability, with speeds deviating by only about 0.5%.


2/ SF on 8 threads AND Leela to 1 million nodes on 2 threads:

info depth 16 seldepth 53 time 43960 nodes 1058510 score cp 27 hashfull 498 nps 24078
info depth 16 seldepth 52 time 48555 nodes 1017708 score cp 27 hashfull 480 nps 20959
info depth 16 seldepth 52 time 43277 nodes 999935 score cp 27 hashfull 474 nps 23105
info depth 16 seldepth 52 time 58460 nodes 1063476 score cp 27 hashfull 498 nps 18191

On average the speeds are some 30% lower, and they are very erratic, with some 20-30% deviation from one run to another.
The issue would be even graver with his 2080 Ti, as it uses a bit more CPU resources than my 2070, and more severe in shorter runs, as there are bursts of slowdowns. All in all, this creature's tests and posts here are plain garbage.
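For anyone who wants to redo the arithmetic on the two sets of runs above, here is a small sketch that pulls the `nps` field out of the `info` lines and compares mean speed and run-to-run spread. The helper names are made up for illustration, and "spread" here is defined as (max - min) / mean, which is only one of several reasonable ways to express the deviation.

```python
import re
from statistics import mean

# The info lines exactly as posted above.
SOLO = """\
info depth 16 seldepth 52 time 35832 nodes 1021332 score cp 27 hashfull 481 nps 28503
info depth 16 seldepth 51 time 35866 nodes 1017115 score cp 27 hashfull 481 nps 28358
info depth 16 seldepth 52 time 36044 nodes 1027829 score cp 27 hashfull 483 nps 28515
"""

WITH_SF = """\
info depth 16 seldepth 53 time 43960 nodes 1058510 score cp 27 hashfull 498 nps 24078
info depth 16 seldepth 52 time 48555 nodes 1017708 score cp 27 hashfull 480 nps 20959
info depth 16 seldepth 52 time 43277 nodes 999935 score cp 27 hashfull 474 nps 23105
info depth 16 seldepth 52 time 58460 nodes 1063476 score cp 27 hashfull 498 nps 18191
"""


def nps_values(log: str) -> list[int]:
    """Extract every 'nps <number>' value from a block of info lines."""
    return [int(v) for v in re.findall(r"\bnps (\d+)", log)]


def summarize(log: str) -> tuple[float, float]:
    """Return (mean nps, relative spread), spread being (max - min) / mean."""
    values = nps_values(log)
    avg = mean(values)
    return avg, (max(values) - min(values)) / avg


if __name__ == "__main__":
    solo_avg, solo_spread = summarize(SOLO)
    busy_avg, busy_spread = summarize(WITH_SF)
    print(f"solo:    {solo_avg:.0f} nps, spread {solo_spread:.1%}")
    print(f"with SF: {busy_avg:.0f} nps, spread {busy_spread:.1%}")
    print(f"slowdown by mean nps: {1 - busy_avg / solo_avg:.1%}")
```

By mean nps the slowdown comes out at roughly a quarter; measured by wall-clock time to reach about 1 million nodes it is larger, so the exact percentage depends on which column you compare.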
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by hgm »

Well, it doesn't seem to serve any purpose to use more threads in total than the number of HT provided by the hardware. So with 16 cores, 32 HT, I would put Leela on 1 core (HT 0 and 1), Stockfish on 14 (HT 2-29), and reserve 1 core (HT 30-31) as a 'noise buffer' (as it seems to me it would make very little difference whether SF runs on 14 or 15 cores, and it is better to be safe than sorry). With only 8 cores a similar approach would just leave 2 cores / 4 HT for SF, and as adding a third core would probably make a significant difference I would be tempted to run without noise buffer. As I said, contention for memory bandwidth could still be a problem; I have really no idea whether CPU power or memory bandwidth would be the actual bottleneck with so many cores.
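The split described here can be sketched as follows. `plan_affinity` and `pin` are hypothetical helper names, and the sketch assumes that logical CPUs 2k and 2k+1 are the two hyper-threads of physical core k, which is not how every OS numbers them, so check the topology before pinning anything. `os.sched_setaffinity` is only available on Linux.

```python
import os


def plan_affinity(total_ht: int, leela_ht: int = 2, buffer_ht: int = 2) -> dict[str, list[int]]:
    """Split logical CPUs into Leela, Stockfish and 'noise buffer' groups.

    Assumes (unverified) that logical CPUs 2k and 2k+1 share physical core k.
    """
    if leela_ht < 1 or buffer_ht < 0 or leela_ht + buffer_ht >= total_ht:
        raise ValueError("no hyper-threads left for Stockfish")
    return {
        "leela": list(range(0, leela_ht)),
        "stockfish": list(range(leela_ht, total_ht - buffer_ht)),
        "buffer": list(range(total_ht - buffer_ht, total_ht)),
    }


def pin(pid: int, cpus: list[int]) -> None:
    """Restrict an already running process to the given logical CPUs (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    os.sched_setaffinity(pid, set(cpus))


if __name__ == "__main__":
    # 16 cores / 32 HT: Leela on HT 0-1, Stockfish on HT 2-29, buffer on HT 30-31.
    for name, cpus in plan_affinity(32).items():
        print(f"{name:9s} HT {cpus[0]}-{cpus[-1]}")
```

Running without a noise buffer is `plan_affinity(total_ht, buffer_ht=0)`; the engine process IDs passed to `pin` would come from whatever GUI or tournament manager started them.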

Of course there are other ways to provide noise immunity, like running by node count rather than by wall-clock time.

This gives me an interesting idea. In principle pondering is still an inefficient use of the CPU/GPU time that would be idle in a non-ponder game. It would be better to use it only for thinking. This is not possible in a single game, as there is no way to avoid having to wait for the opponent's move. But it would be possible if you played multiple (ponder-off) games in parallel. Then you could run the CPU and GPU in 'batch mode', having the engines that are on move just queue up for the resource they need (i.e. one Leela queue for the GPU + 2 or 4 HT, one Stockfish queue for 28 or 26 HT), and only let their clock count down when they are actually thinking. By running some 8 games in parallel it should be quite rare that one engine thinks so much longer than average that the other queue is completely serviced and its opponent would have to wait. The more games you run, the smaller that probability gets, and there really is no reason not to run all games of the match in parallel. (Well, perhaps hash memory.)
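To get a feel for how quickly the idle time shrinks, here is a toy simulation of the batch-mode idea: two first-come-first-served resources (a Leela/GPU queue and a Stockfish/CPU queue) shared by several ponder-off games. It is only a sketch: think times are drawn from an exponential distribution with the same mean for both engines, which is an assumption and not a measurement, and clocks, hash memory and game endings are not modelled. `simulate` is a made-up name.

```python
import heapq
import random


def simulate(games: int, moves: int = 2000, seed: int = 1) -> tuple[float, float]:
    """Return (Leela-queue utilization, Stockfish-queue utilization)."""
    rng = random.Random(seed)
    free_at = [0.0, 0.0]  # time at which each resource next becomes free
    busy = [0.0, 0.0]     # total thinking time served by each resource
    # Heap entries: (time the game is ready to move, tie-breaker, side to move).
    # Side 0 is Leela (GPU queue), side 1 is Stockfish (CPU queue).
    ready = [(0.0, g, 0) for g in range(games)]
    heapq.heapify(ready)
    tick = games
    for _ in range(moves):
        ready_time, _tie, side = heapq.heappop(ready)
        start = max(ready_time, free_at[side])  # wait if the resource is taken
        think = rng.expovariate(1.0)            # assumed think-time distribution
        free_at[side] = start + think
        busy[side] += think
        # After the move, the same game queues up for the other resource.
        heapq.heappush(ready, (free_at[side], tick, 1 - side))
        tick += 1
    makespan = max(free_at)
    return busy[0] / makespan, busy[1] / makespan


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        gpu, cpu = simulate(n)
        print(f"{n:2d} games: Leela queue {gpu:.0%} busy, Stockfish queue {cpu:.0%} busy")
```

With a single game the two utilizations add up to 100%, since one engine always waits for the other; with more games in flight both queues stay busy most of the time, which is the point of the proposal.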
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by Nordlandia »

Graham Banks mentioned that it's a good idea to place the EGTB on two different drives to avoid throttling in ponder matches. It's better than nothing.

In my case I have an i7-5960X at 4.5GHz, 8 cores (HT off), and a GTX 1070 Ti. One guy told me Leela needs only 1 thread for my card.

I allocated 6 cores to SF, 1 core to Lc0, and 1 core as a "noise buffer", with 5-man Syzygy tablebases on separate SSDs.

Time control: 30m+30s + ponder.

[pgn]
[Event "?"]
[Site "https://lichess.org/gRSAxQ6T"]
[Date "2019.06.16"]
[Round "?"]
[White "lc0"]
[Black "stockfish_19061408_x64_bmi2"]
[Result "0-1"]
[WhiteElo "?"]
[BlackElo "?"]
[Variant "From Position"]
[TimeControl "1800+30"]
[ECO "?"]
[Opening "?"]
[Termination "Unknown"]
[FEN "rnbqkb1r/pp3ppp/3ppn2/8/3NP3/2N5/PPP2PPP/R1BQKBR1 b Qkq - 0 6"]
[SetUp "1"]
[Annotator "lichess.org"]

6... e5 7. Bb5+ Nbd7 8. Nf5 a6 9. Bxd7+ Qxd7 10. g4 Qc6 11. Qf3 b5 12. g5 Nd7 13. Ne3 Nb6 14. Bd2 Be7 15. O-O-O a5 16. Ncd5 Nxd5 17. Nxd5 b4 18. Kb1 Be6 19. h4 Rc8 20. c3 bxc3 21. Bxc3 a4 22. a3 Qc4 23. Ka1 h6 24. gxh6 Rxh6 25. h5 f5 26. Rxg7 Bf8 27. Rg6 Rxg6 28. hxg6 Bg7 29. Nb4 fxe4 30. Qh1 d5 31. Re1 Qc5 32. Qh2 d4 33. Bd2 e3 34. fxe3 dxe3 35. Bc3 e4 36. Qc2 Bxc3 37. Qxc3 Qxc3 38. bxc3 Rxc3 39. Kb2 Rb3+ 40. Kc1 Bc4 41. Nc2 e2 42. Kd2 Kf8 43. Ne3 Bb5 44. Rg1 Rxa3 45. g7+ Kg8 46. Nf5 Rd3+ 47. Kc2 Bc4 48. Ne7+ Kh7 49. g8=Q+ Bxg8 50. Nxg8 Rf3 51. Re1 Kxg8 { Black wins. } 52. Rxe2 0-1
[/pgn]

[pgn]
[Event "?"]
[Site "https://lichess.org/faaHfMb4"]
[Date "2019.06.16"]
[Round "?"]
[White "stockfish_19061408_x64_bmi2"]
[Black "lc0"]
[Result "1-0"]
[WhiteElo "?"]
[BlackElo "?"]
[Variant "From Position"]
[TimeControl "1800+30"]
[ECO "?"]
[Opening "?"]
[Termination "Unknown"]
[FEN "rnbqkb1r/pp3ppp/3ppn2/8/3NP3/2N5/PPP2PPP/R1BQKBR1 b Qkq - 0 6"]
[SetUp "1"]
[Annotator "lichess.org"]

6... Nc6 7. g4 d5 8. exd5 Nxd5 9. Nxd5 exd5 10. Be3 Bd6 11. Qd2 Bxh2 12. Rg2 Bc7 13. O-O-O O-O 14. Bd3 g6 15. Rh1 Ne5 16. f4 Nxg4 17. f5 Nxe3 18. Qxe3 Qf6 19. Kb1 Be5 20. Nf3 Bxb2 21. fxg6 fxg6 22. Bxg6 hxg6 23. Qh6 Kf7 24. Rxg6 Qxg6 25. Ng5+ Qxg5 26. Qxg5 Bf6 27. Qxd5+ Ke8 28. Re1+ Be7 29. Qb5+ Kd8 30. Rd1+ Kc7 31. Qe5+ Kb6 32. Qxe7 Rf5 33. Rd6+ Ka5 34. Qe1+ Kb5 35. Qe4 Rb8 36. Qd4 Ra8 37. c4+ Ka4 38. Qc3 { White wins. } Rb5+ 1-0
[/pgn]
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by mwyoung »

hgm wrote: Mon Jun 17, 2019 7:50 am Well, it doesn't seem to serve any purpose to use more threads in total than the number of HT provided by the hardware. So with 16 cores, 32 HT, I would put Leela on 1 core (HT 0 and 1), Stockfish on 14 (HT 2-29), and reserve 1 core (HT 30-31) as a 'noise buffer' (as it seems to me it would make very little difference whether SF runs on 14 or 15 cores, and it is better to be safe than sorry). With only 8 cores a similar approach would just leave 2 cores / 4 HT for SF, and as adding a third core would probably make a significant difference I would be tempted to run without noise buffer. As I said, contention for memory bandwidth could still be a problem; I have really no idea whether CPU power or memory bandwidth would be the actual bottleneck with so many cores.

Of course there are other ways to provide noise immunity, like running by node count rather than by wall-clock time.

This gives me an interesting idea. In principle pondering is still an inefficient use of the CPU/GPU time that would be idle in a non-ponder game. It would be better to use it only for thinking. This is not possible in a single game, as there is no way to avoid having to wait for the opponent's move. But it would be possible if you played multiple (ponder-off) games in parallel. Then you could run the CPU and GPU in 'batch mode', having the engines that are on move just queue up for the resource they need (i.e. one Leela queue for the GPU + 2 or 4 HT, one Stockfish queue for 28 or 26 HT), and only let their clock count down when they are actually thinking. By running some 8 games in parallel it should be quite rare that one engine thinks so much longer than average that the other queue is completely serviced and its opponent would have to wait. The more games you run, the smaller that probability gets, and there really is no reason not to run all games of the match in parallel. (Well, perhaps hash memory.)
There is a purpose to using a 32+2 thread configuration. Remember, threads are not the same as utilization. And Lc0 is not an alpha-beta engine. And there is more than one way to manage threads. And it drives the trolls crazy.
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by Nordlandia »

Please explain why you choose to overload your system with 32+2 threads?
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by mwyoung »

Nordlandia wrote: Mon Jun 17, 2019 3:46 pm Please explain why you choose to overload your system with 32+2 threads?
First, I am not overloading my system, as you can see from the live stream. Let's see if you can figure this out on your own.

I will give you a hint: how many threads is your computer running right now?
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by Nordlandia »

mwyoung wrote: Mon Jun 17, 2019 3:52 pm
Nordlandia wrote: Mon Jun 17, 2019 3:46 pm Please explain why you choose to overload your system with 32+2 threads?
First, I am not overloading my system, as you can see from the live stream. Let's see if you can figure this out on your own.

I will give you a hint: how many threads is your computer running right now?
8-core at 4.5GHz.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Leela Chess Zero 42565 vs Stockfish 140619

Post by mwyoung »

Nordlandia wrote: Mon Jun 17, 2019 4:16 pm
mwyoung wrote: Mon Jun 17, 2019 3:52 pm
Nordlandia wrote: Mon Jun 17, 2019 3:46 pm Please explain why you choose to overload your system with 32+2 threads?
First, I am not overloading my system, as you can see from the live stream. Let's see if you can figure this out on your own.

I will give you a hint: how many threads is your computer running right now?
8-core at 4.5GHz.
That was not the question. Again, how many threads are you running right now? Find the answer...