lczero faq

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

duncan
Posts: 10365
Joined: Mon Jul 07, 2008 8:50 pm

lczero faq

Post by duncan » Fri Apr 13, 2018 10:05 am

I don't understand much of what is going on, and I suspect there are many others in a similar position. Is anybody interested in creating or linking to an FAQ?

AdminX
Posts: 5157
Joined: Mon Mar 13, 2006 1:34 pm
Location: Acworth, GA

Re: lczero faq

Post by AdminX » Fri Apr 13, 2018 10:06 am

"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers

duncan
Posts: 10365
Joined: Mon Jul 07, 2008 8:50 pm

Re: lczero faq

Post by duncan » Fri Apr 13, 2018 5:27 pm

Thanks for the links, although they don't help me understand what these networks are, how they work and improve with each version, or what buffers are, etc.

Robert Pope
Posts: 510
Joined: Sat Mar 25, 2006 7:27 pm

Re: lczero faq

Post by Robert Pope » Fri Apr 13, 2018 5:36 pm

Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
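To make that concrete, here is a toy sketch (not LCZero's real architecture, and all names here are made up for illustration): the "network" is just arrays of numbers, and training changes those numbers while the structure stays fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_network():
    # One hidden layer: 8 input features -> 4 hidden units -> 1 score.
    # The "network" is nothing more than these weight arrays.
    return {"w1": rng.standard_normal((8, 4)),
            "w2": rng.standard_normal((4, 1))}

def evaluate(net, features):
    hidden = np.maximum(features @ net["w1"], 0.0)  # ReLU activation
    return (hidden @ net["w2"]).item()              # scalar evaluation

net = make_network()
position = rng.standard_normal(8)  # stand-in for encoded board features
score = evaluate(net, position)    # a single evaluation number
```

Learning would adjust the values inside `w1` and `w2` so that `evaluate` returns better scores; the code that runs the network never changes.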

duncan
Posts: 10365
Joined: Mon Jul 07, 2008 8:50 pm

Re: lczero faq

Post by duncan » Fri Apr 13, 2018 5:58 pm

Robert Pope wrote: Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
Thanks. Why does it take longer to make moves as the network gets better, and why is a CPU not so suitable for play?

MonteCarlo
Posts: 62
Joined: Sun Dec 25, 2016 3:59 pm

Re: lczero faq

Post by MonteCarlo » Fri Apr 13, 2018 6:23 pm

It doesn't take longer just because the network gets better.

It's slower now because the network size was recently increased substantially.

A larger network means evaluating the network is more computationally expensive (essentially, the NN is just a giant set of math operations, and now there are a lot more of them); however, with a fixed network size (which this should be now for a while), the speed will stay the same whether the network improves or regresses (that's not increasing the number of the weights/operations, just changing their value to values that work better/worse).
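A back-of-the-envelope sketch of that point (a toy model, not LCZero's actual layer shapes): the number of multiply-adds in a dense layer is fixed by its dimensions, so changing the weight *values* costs nothing extra, while enlarging the layer does.

```python
def matmul_op_count(n_in, n_out):
    # Multiply-adds needed to apply one dense layer of this shape.
    # The count depends only on the shape, never on the weight values.
    return n_in * n_out

small = matmul_op_count(64, 64)    # hypothetical "small net" layer
large = matmul_op_count(128, 128)  # hypothetical "large net" layer

# Doubling both dimensions quadruples the work per evaluation.
assert large == 4 * small
```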

The operations performed by the NN basically require doing the same operation on a lot of different data independently; this makes them quite amenable to running on GPUs, which are designed for just such purposes. At a very abstract level, GPUs basically have thousands of units for doing math, so as long as you have thousands of math operations that can be done independently, they'll be well-suited to the task.
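As a rough illustration of that data parallelism (using NumPy's vectorized form as a stand-in for GPU lanes), the same multiply-add is applied to every element independently, so the elements could in principle all be computed at once:

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.float64)

# Sequential view: one element at a time, the way a single CPU core
# would naively process them (shown for just the first five).
loop_result = [3.0 * v + 1.0 for v in x[:5]]

# Parallel view: one operation over all elements at once. Each output
# element depends only on its own input, so nothing has to wait on
# anything else -- the property that makes GPUs a good fit.
vec_result = 3.0 * x + 1.0

assert list(vec_result[:5]) == loop_result
```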

They're much less well-suited for things that involve a lot of branching or task switching, or where you have to figure out the result of calculation N before you can perform calculation N+1, exactly the sorts of things that CPUs are optimized for.

Evaluating an NN is just the sort of problem that lies in the GPU's sweet spot, and isn't in the CPU's. The problem is exacerbated by larger network sizes. The previous network was small enough that the gap between CPU and GPU was not too difficult to overcome by using a handful of CPU cores.

After the recent increase in network size, though, GPU users were only mildly affected, because the previous net was so small the decent GPUs weren't even getting 100% utilized.

For CPU users, however, NPS dropped by 2-3x (although the strength improvement from the new larger network basically made this a wash for gameplay).

duncan
Posts: 10365
Joined: Mon Jul 07, 2008 8:50 pm

Re: lczero faq

Post by duncan » Sun Apr 15, 2018 12:06 am

MonteCarlo wrote: It doesn't take longer just because the network gets better.

It's slower now because the network size was recently increased substantially.

A larger network means evaluating the network is more computationally expensive (essentially, the NN is just a giant set of math operations, and now there are a lot more of them); however, with a fixed network size (which this should be now for a while), the speed will stay the same whether the network improves or regresses (that's not increasing the number of the weights/operations, just changing their value to values that work better/worse).

The operations performed by the NN basically require doing the same operation on a lot of different data independently; this makes them quite amenable to running on GPUs, which are designed for just such purposes. At a very abstract level, GPUs basically have thousands of units for doing math, so as long as you have thousands of math operations that can be done independently, they'll be well-suited to the task.

They're much less well-suited for things that involve a lot of branching or task switching, or where you have to figure out the result of calculation N before you can perform calculation N+1, exactly the sorts of things that CPUs are optimized for.

Evaluating an NN is just the sort of problem that lies in the GPU's sweet spot, and isn't in the CPU's. The problem is exacerbated by larger network sizes. The previous network was small enough that the gap between CPU and GPU was not too difficult to overcome by using a handful of CPU cores.

After the recent increase in network size, though, GPU users were only mildly affected, because the previous net was so small the decent GPUs weren't even getting 100% utilized.

For CPU users, however, NPS dropped by 2-3x (although the strength improvement from the new larger network basically made this a wash for gameplay).
Thanks for your reply.

I read the network changed from 6x64 to 10x128. What do these figures mean?

And how is it decided when to change to a larger network?
