why lczero needs more to beat sf


stavros
Posts: 165
Joined: Tue Dec 02, 2014 1:29 am

why lczero needs more to beat sf

Post by stavros »

1. According to Google's AlphaZero paper, 44M games were needed for the training stage.
2. But the hardest part is that the Google monster used 4 TPUs (4 x 45 TFLOPS = 180 TFLOPS, per Wikipedia)
to produce 80,000 nodes per second to PLAY the games.
Even if the training part is perfect, you need at least 2 NVIDIA Titan V100s (2 x $9,000)
to produce 240 TFLOPS and ??? nodes/sec.
And all this to beat SF8 by 100 Elo points (the latest asmbrainfish is +120 Elo vs SF8):
http://www.sp-cc.de/files/archive-rating.dat
I wonder why people are so impatient and so fanatical about LCZero crushing SF soon.
Remember again: not 8,000 n/s, nor 16,000 etc., but 80,000 nodes/sec with a big net!
Unless the LCZero project is better than AlphaZero...
Please correct me if I am wrong.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: why lczero needs more to beat sf

Post by Dann Corbit »

There are people who spend $100K on a chess machine with standard processors. So for a machine that can demolish that for (say) $20,000 they will be very happy.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
stavros
Posts: 165
Joined: Tue Dec 02, 2014 1:29 am

Re: why lczero needs more to beat sf

Post by stavros »


Yes, I agree. I imagine something like this:
https://www.anandtech.com/show/12587/nv ... -only-400k

I hope I'm not asking for too much.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: why lczero needs more to beat sf

Post by Milos »

They actually used Gen1 TPUs to play the matches against SF. Those are 92 TOPS each, i.e. 368 TOPS in total for 4 TPUs.
For training they used Gen2 TPUs, where each TPU consists of 4 chips, i.e. what they call a TPU actually delivers 180 TFLOPS.
And an NVIDIA V100 is at most 40-50 TFLOPS for 3x3 kernels (that means using the tensor cores at their maximum capacity, which is not achievable at the moment), so you need 2 V100s for a single Gen1 TPU.
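
The arithmetic behind these figures can be checked with a few lines of Python. This is only a rough sketch: the numbers are the ones quoted in this post, the function names are made up for illustration, and treating Gen1 TPU TOPS and V100 TFLOPS as directly comparable is the post's simplification rather than a measured equivalence.

```python
# Figures quoted in the post; nothing here is measured.
TPU_GEN1_TOPS = 92             # per Gen1 TPU
TPUS_IN_MATCH = 4              # TPUs used in the match against SF
V100_TFLOPS_RANGE = (40, 50)   # claimed peak range for 3x3 kernels


def total_match_throughput(tops_per_tpu=TPU_GEN1_TOPS, tpus=TPUS_IN_MATCH):
    """Aggregate throughput of the match hardware, in TOPS."""
    return tops_per_tpu * tpus


def v100s_per_tpu(tops_per_tpu=TPU_GEN1_TOPS, v100_range=V100_TFLOPS_RANGE):
    """Return (best case, worst case) number of V100s matching one Gen1 TPU.

    Assumes 1 TOPS is worth 1 TFLOPS, as the post does.
    """
    low, high = v100_range
    return tops_per_tpu / high, tops_per_tpu / low


if __name__ == "__main__":
    print(total_match_throughput(), "TOPS for the 4-TPU match setup")
    best, worst = v100s_per_tpu()
    print(f"{best:.2f} to {worst:.2f} V100s per Gen1 TPU")
```

Under those assumptions one Gen1 TPU corresponds to somewhere between about 1.8 and 2.3 V100s, which is where the "2 V100s per TPU" rule of thumb comes from.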
stavros
Posts: 165
Joined: Tue Dec 02, 2014 1:29 am

Re: why lczero needs more to beat sf

Post by stavros »

So theoretically, after 44M training games, we need 8x NVIDIA V100 = $72,000 to beat SF9?

EDIT: You are right: Google's 4x Gen1 TPUs (92 TOPS each) = 368 TOPS to produce 80,000 n/s
for match play.
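
As a back-of-the-envelope check of that estimate, here is a minimal Python sketch. The helper name and its defaults are hypothetical; the inputs are simply the thread's own figures (4 TPUs in the match, roughly 2 V100s per Gen1 TPU, about $9,000 per card), not real price or benchmark data.

```python
def gpus_and_cost(tpus=4, v100_per_tpu=2, price_per_v100_usd=9000):
    """Return (number of V100s, total price in USD) for a TPU-equivalent rig.

    All defaults are the rough figures quoted in the thread.
    """
    gpus = tpus * v100_per_tpu
    return gpus, gpus * price_per_v100_usd


if __name__ == "__main__":
    gpus, cost = gpus_and_cost()
    print(f"{gpus} x V100 = ${cost:,}")
```

With the thread's numbers this gives 8 cards and $72,000; changing the assumed V100-per-TPU ratio or the card price moves the total proportionally.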
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: why lczero needs more to beat sf

Post by JJJ »

But maybe Leela will gain more Elo with more games, so we won't need such powerful hardware to beat Stockfish.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: why lczero needs more to beat sf

Post by Milos »

That is highly improbable. Google is not stupid; they must have tried different net configurations. They didn't optimize MCTS much (it is more or less the same as in AGZ), but with the net they most probably did a lot of optimization, and a lot of trial and error, before they reached the current configuration.
It is therefore highly probable, especially considering Figure 1, that the net they used was indeed the optimal one for strength if one doesn't use domain-specific knowledge. That net was obviously saturated and couldn't yield more Elo.
Choosing a larger net would probably offer some Elo gain per fixed number of playouts but would lose more Elo through lower nps, so they didn't increase the network size any further. For them, experimenting with size would be extremely easy considering their resources. So it is more than probable that they tried larger nets, especially considering that the A0 net size is nowhere near saturating the capacity of TPU memory.
Once they reached the optimal net, they simply used as many TPUs as were required to beat SF8 in that handicapped match; in their case, 4 TPUs. If the final net had been weaker they would simply have used more TPUs, but they would never have published a result in which A0 didn't beat SF.

So, under the reasonable assumptions that the A0 net was optimal and that Google's training process was far better than LC0's, it is highly improbable that LC0 will surpass A0 on equal hardware, or even reach a level close to A0.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: why lczero needs more to beat sf

Post by JJJ »

Well, we'll have to wait and see for sure.
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: why lczero needs more to beat sf

Post by whereagles »

stavros wrote: yes i agree i imagine that :
https://www.anandtech.com/show/12587/nv ... -only-400k
10 kW power drain? That's gonna interfere with electric car charging. Gonna need a wiring upgrade :lol:
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: why lczero needs more to beat sf

Post by corres »

DeepMind has plenty of computing power, so it had no need to optimize its usage. Because of this, an optimized LC0 might run on weaker hardware. But it is obvious that to defeat Stockfish you need hardware more powerful than a GeForce GTX 1060 plus a four-core PC...
The repeated mention of the GTX 1060 is only a kind of advertisement:
an encouragement for the contributors.