why lczero needs more to beat sf

stavros · Post by **stavros** » Wed May 02, 2018 12:02 am

1. According to Google's AlphaZero paper, 44 million games were needed for the training stage.
2. But the hardest part is that Google's monster used 4 TPUs (4 x 45 TFLOPS = 180 TFLOPS, per Wikipedia)
to produce 80,000 nodes per second just to PLAY the games.
Even if the training part is perfect, you need at least 2 NVIDIA V100 cards (2 x $9,000)
to produce 240 TFLOPS, and who knows how many nodes/sec.
And all of this to beat SF8 by 100 Elo points (the latest asmbrainfish is +120 Elo vs SF8):
http://www.sp-cc.de/files/archive-rating.dat
I wonder why people are so impatient and so fanatical about LCZero crushing SF soon.
Remember again: 8,000 n/s is not enough, nor 16,000 etc., but 80,000 nodes/sec with a big net!
Unless the LCZero project is better than AlphaZero...
Please correct me if I am wrong.

Dann Corbit · Post by **Dann Corbit** » Wed May 02, 2018 12:16 am

There are people who spend $100K on a chess machine with standard processors. So for a machine that can demolish that for (say) $20,000 they will be very happy.

stavros · Post by **stavros** » Wed May 02, 2018 12:34 am

Yes, I agree. I imagine something like this:
https://www.anandtech.com/show/12587/nv ... -only-400k

I hope I'm not asking for too much.

Milos · Post by **Milos** » Wed May 02, 2018 12:52 am

stavros wrote: 2. But the hardest part is that Google's monster used 4 TPUs (4 x 45 TFLOPS = 180 TFLOPS, per Wikipedia) to produce 80,000 nodes per second just to PLAY the games.

They actually used Gen1 TPUs to play the matches against SF, which are 92 TOPS each, i.e. about 370 TOPS in total for 4 TPUs.
For training they used Gen2 TPUs, where each TPU consists of 4 chips, i.e. what they call a TPU actually delivers 180 TFLOPS.
And an NVIDIA V100 is at most 40-50 TFLOPS for 3x3 kernels (that means using the tensor cores at their maximum capacity, which is not achievable at the moment), so you need 2 V100s for a single Gen1 TPU.
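
A quick sanity check of the arithmetic above, as a minimal Python sketch. It uses only the peak figures quoted in this post and, crudely, treats 1 TOPS (8-bit integer) as comparable to 1 TFLOPS; the variable and function names are made up for illustration.

```python
import math

# Peak figures quoted in the post above (not measured throughput).
TPU_GEN1_TOPS = 92             # per Gen1 TPU
NUM_TPUS = 4                   # TPUs used in the match against SF
V100_TFLOPS_RANGE = (40, 50)   # quoted V100 peak for 3x3 kernels


def cards_needed(target: float, per_card: float) -> int:
    """Whole cards needed to reach `target`, rounding up."""
    return math.ceil(target / per_card)


# Assumption: TOPS and TFLOPS are compared one-to-one, which is a simplification.
total_tops = TPU_GEN1_TOPS * NUM_TPUS
per_tpu = [cards_needed(TPU_GEN1_TOPS, t) for t in V100_TFLOPS_RANGE]
for_match = [cards_needed(total_tops, t) for t in V100_TFLOPS_RANGE]

print(f"4 Gen1 TPUs: {total_tops} TOPS")
print(f"V100s per TPU (at 40 and 50 TFLOPS): {per_tpu}")
print(f"V100s for the 4-TPU setup (at 40 and 50 TFLOPS): {for_match}")
```

With these inputs the estimate lands at roughly 8 to 10 cards for the whole match setup, which fits the rough "2 V100s per Gen1 TPU" rule of thumb.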

stavros · Post by **stavros** » Wed May 02, 2018 2:00 am

So theoretically, after 44M training games, we need 8 x NVIDIA V100 = $72,000 to beat SF9?

EDIT: You are right: Google's 4 x Gen1 TPU (92 TOPS each) = 368 TOPS to produce 80,000 n/s
for match play.
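
The numbers in this post can be checked with a few lines of Python. This is only a sketch: the $9,000 card price and the 8,000 / 16,000 n/s figures come from the first post, the 8-card count from the "2 V100s per Gen1 TPU" estimate, and the variable names are illustrative.

```python
V100_PRICE_USD = 9_000         # per-card price quoted in the first post
CARDS = 2 * 4                  # 2 V100s per Gen1 TPU x 4 TPUs

TARGET_NPS = 80_000            # search speed quoted for AlphaZero in match play
SLOWER_NPS = (8_000, 16_000)   # speeds the first post calls insufficient

total_cost = V100_PRICE_USD * CARDS
# How many times faster the target is than each slower setup.
speed_gaps = [TARGET_NPS / nps for nps in SLOWER_NPS]

print(f"hardware cost: ${total_cost:,}")
print(f"speed gap vs 8,000 and 16,000 n/s: {speed_gaps}")
```

So the $72,000 figure follows directly from the per-card price, and a machine doing 8,000 n/s would be a factor of 10 short of the quoted match speed.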

JJJ · Post by **JJJ** » Wed May 02, 2018 3:00 am

But maybe Leela will reach a higher Elo with more games, so we won't need such powerful hardware to beat Stockfish.

Milos · Post by **Milos** » Wed May 02, 2018 3:41 am

That is highly improbable. Google is not stupid; they must have tried different net configurations. They didn't optimize MCTS much (it is more or less the same as in AGZ), but with the net they most probably did a lot of optimization and a lot of trial and error before they reached the current configuration.
It is therefore highly probable, especially considering Figure 1, that the net they used was indeed the optimal one for strength if one doesn't use domain-specific knowledge. That net was obviously saturated and couldn't yield more Elo.
Choosing a larger net would probably offer some Elo gain per fixed number of playouts, but it would lose more Elo through lower nps, so they didn't increase the network size any further. For them, experimenting with size would be extremely easy considering their resources. So it is more than probable that they tried larger nets, especially considering that the A0 net is nowhere near saturating the capacity of TPU memory.
Once they reached the optimal net, they simply used as many TPUs as were required to beat SF8 in that handicapped match; in their case, 4 TPUs. If the final net had been weaker, they would simply have used more TPUs, but they would never have published a result in which A0 didn't beat SF.

So, under the reasonable assumptions that the A0 net was optimal and that Google's training process was far better than LC0's, it is highly improbable that LC0 will reach a level close to A0 on equal hardware, let alone surpass it.

JJJ · Post by **JJJ** » Wed May 02, 2018 4:41 am

Well, we'll have to wait and see for sure.

whereagles · Post by **whereagles** » Wed May 02, 2018 9:24 am

stavros wrote: Yes, I agree. I imagine something like this:
https://www.anandtech.com/show/12587/nv ... -only-400k

10 kW power drain? That's gonna interfere with electric car charging. Gonna need a wiring upgrade.

corres · Post by **corres** » Wed May 02, 2018 10:57 am

DeepMind has a lot of computing power available, so it had no need to optimize its usage. Because of this, an optimized LC0 could perhaps run on weaker hardware. But it is obvious that to defeat Stockfish you need hardware more powerful than a GeForce GTX 1060 plus a four-core PC...
The repeated mention of the GTX 1060 is just a kind of advertisement.
An encouragement for the contributors.
