lczero faq

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

lczero faq

Post by duncan »

I don't understand much of what is going on, and I suspect there are many others in a similar position. Is anybody interested in creating or linking to an FAQ?
AdminX
Posts: 6339
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: lczero faq

Post by AdminX »

"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: lczero faq

Post by duncan »

Thanks for the links, although they don't help me understand what these networks are, how they work, or how each version improves. Or what buffers are, etc.
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: lczero faq

Post by Robert Pope »

Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: lczero faq

Post by duncan »

Robert Pope wrote:Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
Thanks. Why does it take longer to make moves as the network gets better, and why is a CPU not so suitable for play?
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Re: lczero faq

Post by MonteCarlo »

It doesn't take longer just because the network gets better.

It's slower now because the network size was recently increased substantially.

A larger network means evaluating the network is more computationally expensive: essentially, the NN is just a giant set of math operations, and now there are a lot more of them. With a fixed network size (which this should be now for a while), however, the speed will stay the same whether the network improves or regresses. Training doesn't increase the number of weights/operations; it just changes their values to ones that work better or worse.
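One way to see why size, not quality, sets the speed: a dense layer costs one multiply-add per stored weight, so its cost is fixed by its dimensions alone. A toy sketch (hypothetical, not LCZero code):

```python
# A dense layer does one multiply-add per stored weight, so its
# evaluation cost depends on its dimensions, not the weight values.

def matvec(W, x):
    # Matrix-vector product: the core operation inside an NN layer.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def op_count(rows, cols):
    return rows * cols  # multiply-adds needed, whatever the values are

# Changing the weight values changes the answer, not the cost:
print(matvec([[1, 2], [3, 4]], [10, 1]))   # [12, 34]
print(matvec([[5, 0], [0, 5]], [10, 1]))   # [50, 5]

# Doubling both dimensions quadruples the work:
print(op_count(128, 128) // op_count(64, 64))  # 4
```

Better weights and worse weights take exactly the same number of operations to evaluate; only a bigger matrix takes more.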

The operations performed by the NN basically require doing the same operation on a lot of different data independently; this makes them quite amenable to running on GPUs, which are designed for just such purposes. At a very abstract level, GPUs basically have thousands of units for doing math, so as long as you have thousands of math operations that can be done independently, they'll be well-suited to the task.

They're much less well-suited for things that involve a lot of branching or task switching, or where you have to figure out the result of calculation N before you can perform calculation N+1, exactly the sorts of things that CPUs are optimized for.

Evaluating an NN is just the sort of problem that lies in the GPU's sweet spot, and isn't in the CPU's. The problem is exacerbated by larger network sizes. The previous network was small enough that the gap between CPU and GPU was not too difficult to overcome by using a handful of CPU cores.
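The contrast between the two workloads can be sketched in a few lines. This is a toy illustration of the shape of the work, not how either processor is actually programmed:

```python
# GPU-friendly work: many identical, independent operations.
# CPU-friendly work: a dependent chain where step N must finish
# before step N+1 can start.

data = list(range(8))

# Each result depends only on its own input, so all eight could in
# principle run at the same time on separate units (GPU-style).
independent = [x * 2 + 1 for x in data]

# Each step consumes the previous result, so the work is inherently
# sequential no matter how many units you have (CPU-style).
acc = 0
for x in data:
    acc = acc * 2 + x

print(independent)
print(acc)
```

An NN evaluation is overwhelmingly made of the first kind of work, which is why thousands of simple GPU math units beat a handful of fast CPU cores at it.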

After the recent increase in network size, though, GPU users were only mildly affected, because the previous net was so small that decent GPUs weren't even being fully utilized.

For CPU users, however, NPS dropped by 2-3x (although the strength improvement from the new larger network basically made this a wash for gameplay).
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: lczero faq

Post by duncan »

MonteCarlo wrote:It doesn't take longer just because the network gets better.

It's slower now because the network size was recently increased substantially. [...]
Thanks for your reply.

I read that the network changed from 6x64 to 10x128. What do these figures mean?

And how is it decided when to change to a larger network?
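This isn't answered in the thread, but for context: in Leela-style nets those two figures are commonly read as residual blocks x filters, so 6x64 is a tower of 6 residual blocks with 64 filters each. Assuming each block is roughly two 3x3 convolutions over the 8x8 board (an assumption about the architecture, with illustrative constants rather than measured ones), a back-of-the-envelope sketch of how the compute scales:

```python
# Rough per-evaluation compute for a residual tower of `blocks`
# residual blocks with `filters` channels each. Assumes ~two 3x3
# convolutions per block over an 8x8 board; constants are
# illustrative, not measured from LCZero.

def relative_cost(blocks, filters, squares=64):
    # ~2 convs/block * 9 taps * filters_in * filters_out, per square
    return blocks * 2 * 9 * filters * filters * squares

old = relative_cost(6, 64)     # the 6x64 net
new = relative_cost(10, 128)   # the 10x128 net
print(round(new / old, 2))     # roughly 6.67x more raw math per eval
```

Under those assumptions the jump is several times more math per position, which is consistent with the NPS drop described above being much larger for CPUs than for underutilized GPUs.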