LCzero sacs a knight for nothing

carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: LCzero sacs a knight for nothing

Post by carldaman »

OneTrickPony wrote:
Not at all - I think you've raised some very interesting points! MCTS averaging does seem fundamentally mismatched to Chess. That's why I was so amazed A0 actually worked.
It seems it's not that good in Go either: recent Leela games show it blundering tactics in Go on a regular basis. Go is a bit different, as it's possible to kill humans there without much tactical awareness, but winning against humans in a board game isn't exactly a high bar these days. In engine vs. engine matches it's clear that MCTS in its pure form isn't working.
There is some kind of component missing. For example, right now you can get to 100k playouts, discover that the line is a total disaster (losing by force), and it will still take a long while for the search to prefer a different move.

I personally believe policy-guided search, with the policy trained on many games, will work, but the move selection itself will be more in line with alpha/beta in the end. Overall I am excited (and wish I had time away from programming engines for card games to participate). If anything, Leela plays like a naive, optimistic 2200-2300 Elo human, and that's really cool to have :)
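To put a rough number on the "100k playouts" remark above: if a move's value is a plain running average over its playouts, a refutation found late is diluted by everything averaged before it. The sketch below is a toy calculation, not Leela's actual search. The function name and all numbers are made up for illustration, values are on a [-1, 1] scale, and only average values are compared (an engine that picks its move by visit count would likely need longer still).

```python
import math

def extra_visits_to_switch(n_visits, q_old, q_alt, v_refuted):
    """Smallest number of further playouts, each returning v_refuted,
    after which the running mean of the move drops below q_alt."""
    if not (v_refuted < q_alt < q_old):
        raise ValueError("expected v_refuted < q_alt < q_old")
    # Mean after k more playouts: (n*q_old + k*v_refuted) / (n + k) < q_alt
    # which rearranges to k > n*(q_old - q_alt) / (q_alt - v_refuted).
    k = n_visits * (q_old - q_alt) / (q_alt - v_refuted)
    return math.floor(k) + 1

# Hypothetical situation: 100k playouts averaging +0.6, an alternative
# move worth +0.5, and every new playout now finds the forced loss (-1).
print(extra_visits_to_switch(100_000, 0.6, 0.5, -1.0))
```

With these assumed numbers the averaged value needs several thousand more playouts to fall below the alternative, whereas a minimax-style backup would adopt the refuted value as soon as the forced loss is seen.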
Hi Piotr,

Slightly off-topic, but I'm curious, what could you recommend as a good program/AI for hold'em (or poker in general)? :)

Thanks,
CL
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCzero sacs a knight for nothing

Post by Laskos »

Michel wrote:
And yes, the self-play match games between networks on the main site are terrible and misleading.
I wonder why that is. Now that the matches are no longer used for gating and there is much more opening variety, the graph should in principle be correct on average.

So it seems that Elo is not additive in this case.

One possible explanation might be that buggy engines do not satisfy the Elo model. This was an observation by HGM in a slightly different context. Of course it is a bit unclear how to define a buggy engine...
Weren't they, during the underpromotion bug, self-tested against another almost identical underpromoting engine? Even I would have seen progress, with a book and fixed time. Then, AFAIK, they only slowly re-trained the net (shouldn't they have just started over from ID124, or even from the "smallnet" ID122?) and tested non-buggy engine against non-buggy engine, which slowly started to promote to a queen more often, so progress was again assured (on average). There could or should have been one drop, but because the changes were slow it is barely visible; the graph has many small ups and downs all the way.

Probably one could imagine certain bugs (say, a rate of time losses) and a gedanken experiment showing that three engines don't satisfy the additivity underlying the Elo model.
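One way to make such a gedanken experiment concrete (a toy sketch with assumed numbers, not a measurement): let engines A and B be identical in strength except that B forfeits a fixed fraction of its games on time, and let C be much weaker. The 10% time-loss rate and the 0.99 expected score are invented for illustration; only the logistic Elo formula is standard.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

p_time_loss = 0.10      # assumed: B forfeits 10% of its games on time
strong_vs_weak = 0.99   # assumed: expected score of a bug-free engine vs C

# A vs B: equal strength, but A gets a free point whenever B forfeits.
a_vs_b = p_time_loss * 1.0 + (1.0 - p_time_loss) * 0.5
# B vs C: B scores like A would, except in the games it forfeits.
b_vs_c = (1.0 - p_time_loss) * strong_vs_weak
# A vs C: no bug involved.
a_vs_c = strong_vs_weak

chained = elo_diff(a_vs_b) + elo_diff(b_vs_c)
direct = elo_diff(a_vs_c)
print(f"A-B plus B-C: {chained:.0f} Elo, A-C measured directly: {direct:.0f} Elo")
```

Under these assumptions the chained difference comes out at roughly half of the direct one, because the forfeits cap B's score against C no matter how weak C is.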
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCzero sacs a knight for nothing

Post by Laskos »

CMCanavessi wrote:My tests arrive at very similar numbers to yours, Kai (though I've tested 150 as the strongest, and have not tested any newer network... maybe 156 will be the next one). And yes, the self-play match games between networks on the main site are terrible and misleading.


My gauntlet numbers: [image not available]

The calculated Elo: [image not available]
It seems ID156 makes a large jump, outside the error margins; maybe you can tell the devs. It is now the best net, by more than the error margins.

Code:

Games Completed = 200 of 200 (Avg game length = 98.619 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 2500cp for 3 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 5478 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID124            65.0/200   44-114-42     (L: m=114 t=0 i=0 a=0)   (D: r=31 i=4 f=4 s=2 a=1)   (tpm=947.8 d=12.50 nps=202) 
 2.  Jabba 1.0                   135.0/200   114-44-42     (L: m=44 t=0 i=0 a=0)   (D: r=31 i=4 f=4 s=2 a=1)   (tpm=802.9 d=8.98 nps=0) 


Games Completed = 200 of 200 (Avg game length = 88.613 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 5056 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID131            41.5/200   27-144-29     (L: m=144 t=0 i=0 a=0)   (D: r=27 i=0 f=1 s=0 a=1)   (tpm=947.0 d=12.50 nps=126) 
 2.  Jabba 1.0                   158.5/200   144-27-29     (L: m=27 t=0 i=0 a=0)   (D: r=27 i=0 f=1 s=0 a=1)   (tpm=803.9 d=8.82 nps=0) 


Games Completed = 200 of 200 (Avg game length = 92.903 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 5500cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 5264 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID139            39.5/200   22-143-35     (L: m=143 t=0 i=0 a=0)   (D: r=25 i=6 f=3 s=1 a=0)   (tpm=948.6 d=12.52 nps=175) 
 2.  Jabba 1.0                   160.5/200   143-22-35     (L: m=22 t=0 i=0 a=0)   (D: r=25 i=6 f=3 s=1 a=0)   (tpm=803.7 d=8.73 nps=0) 
  
  
Games Completed = 200 of 200 (Avg game length = 91.517 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 4855 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID147            45.0/200   35-145-20     (L: m=145 t=0 i=0 a=0)   (D: r=14 i=4 f=2 s=0 a=0)   (tpm=945.0 d=12.48 nps=178) 
 2.  Jabba 1.0                   155.0/200   145-35-20     (L: m=35 t=0 i=0 a=0)   (D: r=14 i=4 f=2 s=0 a=0)   (tpm=804.0 d=8.87 nps=0) 
  
  
Games Completed = 200 of 200 (Avg game length = 97.840 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 5223 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID152            58.0/200   42-126-32     (L: m=126 t=0 i=0 a=0)   (D: r=24 i=3 f=3 s=1 a=1)   (tpm=948.7 d=12.53 nps=242) 
 2.  Jabba 1.0                   142.0/200   126-42-32     (L: m=42 t=0 i=0 a=0)   (D: r=24 i=3 f=3 s=1 a=1)   (tpm=803.2 d=9.03 nps=0) 


Games Completed = 200 of 200 (Avg game length = 103.383 sec) 
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817) 
Time = 5473 sec elapsed, 0 sec remaining 
 1.  LCZero CPU ID154            63.0/200   40-114-46     (L: m=114 t=0 i=0 a=0)   (D: r=34 i=4 f=6 s=0 a=2)   (tpm=952.3 d=12.49 nps=282) 
 2.  Jabba 1.0                   137.0/200   114-40-46     (L: m=40 t=0 i=0 a=0)   (D: r=34 i=4 f=6 s=0 a=2)   (tpm=804.0 d=9.15 nps=0)




Games Completed = 200 of 200 (Avg game length = 97.010 sec)
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 5105 sec elapsed, 0 sec remaining
 1.  LCZero CPU ID156            79.0/200   62-104-34     (L: m=104 t=0 i=0 a=0)   (D: r=28 i=3 f=2 s=0 a=1)   (tpm=951.2 d=12.52 nps=171)
 2.  Jabba 1.0                   121.0/200   104-62-34     (L: m=62 t=0 i=0 a=0)   (D: r=28 i=3 f=2 s=0 a=1)   (tpm=803.3 d=9.01 nps=0)

I now want to see on test suites (tactical and positional) what happened.
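For readers who want to turn the win-loss-draw triples in the log above into Elo differences with error margins, here is a minimal sketch. It assumes the logistic Elo model and a normal approximation for the score, treats games as independent, and uses helper names of my own choosing; it is not necessarily how the figures in this thread were computed.

```python
import math

def to_elo(score):
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, lower, upper) for one match; z=1.96 gives roughly 95%."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * score ** 2) / n
    se = math.sqrt(variance / n)
    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# W, L, D of each net against Jabba 1.0, copied from the log above.
gauntlet = {
    "ID124": (44, 114, 42),
    "ID131": (27, 144, 29),
    "ID139": (22, 143, 35),
    "ID147": (35, 145, 20),
    "ID152": (42, 126, 32),
    "ID154": (40, 114, 46),
    "ID156": (62, 104, 34),
}

for net, wld in gauntlet.items():
    elo, lower, upper = elo_with_margin(*wld)
    print(f"{net}: {elo:+.0f} Elo vs Jabba 1.0 (interval {lower:+.0f} to {upper:+.0f})")
```

The intervals come out wide at 200 games, which is why single-net jumps need care before being called real.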
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: LCzero sacs a knight for nothing

Post by CMCanavessi »

I'm running the gauntlet right now with 156, and it's proving to be the best network so far, but not by much (early estimates of around 40 elo). Only 30% of the games played, so it has some way to go.
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
Jhoravi
Posts: 291
Joined: Wed May 08, 2013 6:49 am

Re: LCzero sacs a knight for nothing

Post by Jhoravi »

Is Leela really regaining the lost knowledge since the fix of the promotion bug?

I suspect that the newer training simply skips the positions leading to promotion instead of retraining on them, because the buggy network already tells the search that it's losing.

As a result, the succeeding networks seem to get better and better than the previous buggy one, because neither side has to deal with those positions when they play each other.

But against a network from before the bug, like ID125, it's not much better.

Just my humble theory.
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: LCzero sacs a knight for nothing

Post by CMCanavessi »

Yes, it is; it's pretty obvious from watching the matches live.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCzero sacs a knight for nothing

Post by Laskos »

CMCanavessi wrote:I'm running the gauntlet right now with 156, and it's proving to be the best network so far, but not by much (early estimates of around 40 elo). Only 30% of the games played, so it has some way to go.
Our error margins are large with only 200 games, but I can confirm that ID159 comes close to the high result of ID156, so that was not a 2.5-standard-deviation fluke. By now we can both confirm that the new nets are the best ones, and they will probably get better and better.

OTOH, I couldn't see a significant jump on either the opening positional or the middlegame tactical suite, just a small improvement over, say, ID147. I don't know why; maybe some other aspect of the gameplay improved, say endgames.
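How many standard deviations separate two 200-game gauntlet results can be estimated with a simple two-sample calculation. The sketch below is only a rough check under assumptions: games are treated as independent, and the choice of ID147 as the baseline is mine (the post does not say which baseline the 2.5 figure refers to). The W-L-D triples are the ones from the gauntlet log earlier in the thread.

```python
import math

def score_and_se(wins, losses, draws):
    """Mean score per game and its standard error for one match."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    second_moment = (wins + 0.25 * draws) / n   # E[x^2] for x in {1, 0.5, 0}
    return mean, math.sqrt((second_moment - mean ** 2) / n)

def z_between(match_a, match_b):
    """Score difference of two matches in units of its standard error."""
    mean_a, se_a = score_and_se(*match_a)
    mean_b, se_b = score_and_se(*match_b)
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

id156 = (62, 104, 34)   # W, L, D vs Jabba 1.0
id147 = (35, 145, 20)   # W, L, D vs Jabba 1.0 (assumed baseline)
print(f"ID156 vs ID147: {z_between(id156, id147):.2f} standard deviations")
```

Swapping in another net's triple as the baseline changes the answer considerably, so the figure is only meaningful once the baseline is stated.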
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: LCzero sacs a knight for nothing

Post by Werewolf »

Laskos wrote:
CMCanavessi wrote:I'm running the gauntlet right now with 156, and it's proving to be the best network so far, but not by much (early estimates of around 40 elo). Only 30% of the games played, so it has some way to go.
Our error margins are large with only 200 games, but I can confirm that ID159 comes close to the high result of ID156, so that was not a 2.5-standard-deviation fluke. By now we can both confirm that the new nets are the best ones, and they will probably get better and better.

OTOH, I couldn't see a significant jump on either the opening positional or the middlegame tactical suite, just a small improvement over, say, ID147. I don't know why; maybe some other aspect of the gameplay improved, say endgames.
Do you have a list of the results of different versions of LCZero in tactics?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCzero sacs a knight for nothing

Post by Laskos »

Werewolf wrote:
Laskos wrote:
CMCanavessi wrote:I'm running the gauntlet right now with 156, and it's proving to be the best network so far, but not by much (early estimates of around 40 elo). Only 30% of the games played, so it has some way to go.
Our error margins are large with only 200 games, but I can confirm that ID159 comes close to the high result of ID156, so that was not a 2.5-standard-deviation fluke. By now we can both confirm that the new nets are the best ones, and they will probably get better and better.

OTOH, I couldn't see a significant jump on either the opening positional or the middlegame tactical suite, just a small improvement over, say, ID147. I don't know why; maybe some other aspect of the gameplay improved, say endgames.
Do you have a list of the results of different versions of LCZero in tactics?
Yes, some sort of list. This is for the ECM200.epd middlegame tactical suite (200 positions), analyzed at 20s/position. At this time control and on my hardware, LC0 performs overall (Elo-wise) comparably to GreKo 6.5, a standard A/B engine rated 2330 Elo on CCRL, which fares much better tactically (but much worse positionally). And on this tactical middlegame suite, ID124 still seems to be the best of the nets.

Here is a short list:

Code:

ID124:
ECM200
score=75/200 [averages on correct positions: depth=13.4 time=2.92 nodes=930]

ID143:
ECM200
score=63/200 [averages on correct positions: depth=12.8 time=2.56 nodes=791]

ID148:
ECM200
score=67/200 [averages on correct positions: depth=11.9 time=1.84 nodes=567]

ID156:
ECM200
score=68/200 [averages on correct positions: depth=12.6 time=2.44 nodes=944]

==============================================

Compare with a standard A/B engine of similar strength:


GreKo 6.5 (2330 CCRL):
ECM200
score=143/200 [averages on correct positions: depth=7.3 time=1.91 nodes=4718200]
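To judge how much of the spread among the nets could be noise, the solve counts above can be turned into percentages with a rough binomial error bar. This is a sketch under assumptions: the error formula treats positions as independent trials, which is only approximate since every engine sees the same 200 positions, and the helper name is mine.

```python
import math

# (engine, solved, total, average nodes on solved positions), from the list above
results = [
    ("ID124", 75, 200, 930),
    ("ID143", 63, 200, 791),
    ("ID148", 67, 200, 567),
    ("ID156", 68, 200, 944),
    ("GreKo 6.5", 143, 200, 4718200),
]

def solve_rate(solved, total):
    """Fraction solved and its binomial standard error."""
    p = solved / total
    return p, math.sqrt(p * (1.0 - p) / total)

for name, solved, total, nodes in results:
    p, se = solve_rate(solved, total)
    print(f"{name:10s} {100 * p:5.1f}% +/- {100 * se:.1f}%   avg nodes {nodes}")
```

On this rough measure the differences among the nets are small relative to the error bars, while the gap to GreKo 6.5 is not.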
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: LCzero sacs a knight for nothing

Post by syzygy »

gladius wrote:But the entire process is designed to have it solve tactics. The policies are trained to match the output of an 800 node search, so it's being trained to take the tactics into account. Even modern chess evaluation features do this (with eg. huge penalties for queen under threat, and restricting queen mobility to "safe" squares).

Don't you think that the network can learn to predict tactics?
What I don't quite get is what the move probabilities are supposed to stand for.

If the move probabilities are supposed to single out "good" moves, then a move that simply looks bad but happens to have a deep (or even shallow) tactic behind it would score badly and would not guide the search to discover the tactic.

If the move probabilities are supposed to single out "unclear" moves, then things could work. But I don't really see how the whole updating process would work towards identifying "unclear" moves.
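For context on what the move probabilities do during search: in AlphaZero-style search as publicly described, the policy output enters as a prior in the PUCT selection rule, and the policy is then trained towards the visit distribution that the search produces. The sketch below is a toy illustration of that rule, not LCZero's code. The move names, the values, and the constant c_puct are assumptions.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Average value plus an exploration bonus scaled by the policy prior."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select(children, c_puct=1.5):
    """Pick the child move with the highest PUCT score."""
    parent_visits = sum(child["n"] for child in children.values())
    return max(
        children,
        key=lambda move: puct_score(children[move]["q"], children[move]["p"],
                                    parent_visits, children[move]["n"], c_puct),
    )

# Hypothetical node: a natural-looking move, and a sacrifice that looked bad
# on its single visit because the tactic behind it has not been seen yet.
children = {
    "quiet":     {"p": 0.70, "q": 0.10,  "n": 50},
    "sacrifice": {"p": 0.02, "q": -0.30, "n": 1},
}
print(select(children))

# Same node if the net had assigned the sacrifice a larger prior.
children["sacrifice"]["p"] = 0.30
print(select(children))
```

In this toy setup the low-prior sacrifice is starved of visits, which is the concern raised above; the prior only rises in later nets if earlier searches happened to spend visits on the move, since the visit distribution is the training target.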