LCzero sacs a knight for nothing

mirek · Post by **mirek** » Thu Apr 19, 2018 10:47 pm

Daniel Shawul wrote:
Here LC0 moved Ne5 on TCEC's 43-core hardware! Note that this blunder is probably not due to a bug as most other engines would have it, but that the algorithm is working as intended and can produce such tactical blunders even on this massive hardware.

Are you telling me that this is not a problem for L0 or A0, and that it can be solved with bigger net and more training !?

Indeed, that is exactly what we are telling you! Actually, an even bigger blunder is that after Ne5 is played on the board, ID125 doesn't see that fxe5 is winning and doesn't want to play it even after more than 500k playouts.

In the same position (with Ne5 on the board), ID149 sees fxe5 as best (with an advantageous score) in just 8k playouts, and as completely winning in about 20k.

And that's the difference after just a week of training, 4 days of which were affected by a serious under-promotion bug. So I guess if you wanted to show that shallow tactical blunders will never go away, you should have picked a better example.

Daniel Shawul wrote: That is a one ply tactic right there!

Have you even looked at the board? You make it sound like LC0 overlooked that Ne5 leaves the knight "en prise", while in reality it obviously overlooked the Re1+ move (so more like a 4-ply tactic). Without Re1+, Ne5 would trap the rook with f3, so it would actually be the best move in the position. If anything, it's very interesting to see how human-like LC0's blunders are; even a GM in serious blitz time pressure could blunder like that.
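
To see why the number of playouts needed before a move such as fxe5 or Re1+ is even considered can vary so much between nets, here is a minimal sketch of AlphaZero-style PUCT selection at a single node. It assumes every move has the same value estimate (Q = 0), so only the policy prior decides when a move gets its first visit. The function name `first_visit`, the constant `c_puct = 1.5` and the `sqrt(N + 1)` term are illustrative assumptions, not LC0's actual code or settings.

```python
import math

def first_visit(prior, c_puct=1.5, max_sims=1_000_000):
    """Return the simulation index at which a move with the given policy
    prior is first selected, when it competes with one alternative that
    holds the rest of the prior mass. All values are assumed to be 0,
    so selection is driven by the exploration term alone."""
    priors = [1.0 - prior, prior]   # index 1 is the "overlooked" move
    visits = [0, 0]
    for sim in range(1, max_sims + 1):
        total = sum(visits)
        # Exploration term: c_puct * P * sqrt(N + 1) / (1 + n)
        scores = [c_puct * p * math.sqrt(total + 1) / (1 + n)
                  for p, n in zip(priors, visits)]
        best = 0 if scores[0] >= scores[1] else 1
        if best == 1:
            return sim
        visits[best] += 1
    return None  # never selected within max_sims

for p in (0.1, 0.01, 0.001):
    print(p, first_visit(p))
```

Under these assumptions the first visit arrives after roughly (1 - p) / p simulations, so a move that the policy considers very unlikely can stay unexplored for a long time, and a modest change in its prior after further training can change that drastically.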

Daniel Shawul · Post by **Daniel Shawul** » Thu Apr 19, 2018 11:08 pm

mirek wrote:
Daniel Shawul wrote:
Here LC0 moved Ne5 on TCEC's 43-core hardware! Note that this blunder is probably not due to a bug as most other engines would have it, but that the algorithm is working as intended and can produce such tactical blunders even on this massive hardware.

Are you telling me that this is not a problem for L0 or A0, and that it can be solved with bigger net and more training !?
Indeed that is exactly what we are telling you ! Actually even bigger blunder is, that after Ne5 is played on the board ID125 doesn't see that fxe5 is winning and doesn't want to play it even after > 500k playouts.

The same position (with Ne5 on the board) but ID149 sees fxe5 as best (and with advantageous score) in just 8k playouts and as completely winning in about 20k

That must be due to luck then that ID149 solved it; ID150 could go back to the f3e5 move, and what will you then say it learned? You can't learn static details of tactics with an NN and expect it to work in all positions. Say you tune the policy network weights to favour Re1+ (checks) more, even though it leaves the rook hanging: this rule is certainly going to hurt you in other positions, don't you agree? It will give more simulations to useless checks. I would be amazed if you can even learn to SEE (static exchange evaluation) as accurately as alpha-beta engines do it. So many exceptions to cater for ...

And that's the difference in just week of training, 4 days of which have been affected by serious under-promotion bug. So I guess if you wanted to show how shallow tactical blunders will never go away you should have picked a better example

Keep on tuning to solve the tactics ... but then don't run it on massive hardware and say it improved tactics, we know that will help. It was a huge surprise for the developers too when TCEC tested L0 and showed it to be strong.

Have you even looked at the board? You make it sound like LC0 had overlooked that Ne5 leaves the knight "en prise", while in reality it obviously overlooked the Re1+ move (so more like 4 plies tactics) and without Re1, Ne5 would be trapping the rook with f3, so it would be actually the best move in the position. If anything it's very interesting to see how human-like blunders of LC0 are, even a GM in serious blitz time pressure could blunder like that.

OK, it is a 4-ply tactic.

nabildanial · Post by **nabildanial** » Thu Apr 19, 2018 11:35 pm

Your pessimism is really something else. I hope you will never touch the LC0 code in the future.

Laskos · Post by **Laskos** » Fri Apr 20, 2018 12:10 am

mirek wrote:
Daniel Shawul wrote:
Here LC0 moved Ne5 on TCEC's 43-core hardware! Note that this blunder is probably not due to a bug as most other engines would have it, but that the algorithm is working as intended and can produce such tactical blunders even on this massive hardware.

Are you telling me that this is not a problem for L0 or A0, and that it can be solved with bigger net and more training !?
Indeed that is exactly what we are telling you ! Actually even bigger blunder is, that after Ne5 is played on the board ID125 doesn't see that fxe5 is winning and doesn't want to play it even after > 500k playouts.

The same position (with Ne5 on the board) but ID149 sees fxe5 as best (and with advantageous score) in just 8k playouts and as completely winning in about 20k

And that's the difference in just week of training, 4 days of which have been affected by serious under-promotion bug. So I guess if you wanted to show how shallow tactical blunders will never go away you should have picked a better example

Daniel Shawul wrote: That is a one ply tactic right there!
Have you even looked at the board? You make it sound like LC0 had overlooked that Ne5 leaves the knight "en prise", while in reality it obviously overlooked the Re1+ move (so more like 4 plies tactics) and without Re1, Ne5 would be trapping the rook with f3, so it would be actually the best move in the position. If anything it's very interesting to see how human-like blunders of LC0 are, even a GM in serious blitz time pressure could blunder like that.

Still, the tactical weakness is pretty obvious, and the improvement rate is low. I randomly selected 200 of the 879 ECM.epd tactical middlegame positions, which top engines solve overwhelmingly. I ran LC0 on 4 cores for 20 s/position (about 15,000 playouts per position). Under these conditions, LC0 is similar in overall playing strength to GreKo 6.5, a standard AB engine (about 2300 CCRL Elo), running on one core. Performance on this 200-position suite:

GreKo 6.5:
143/200

LC0 v0.7

LC0 ID122 (the last "smallnet"):
59/200

LC0 ID 124 (the second "bignet", one of the strongest Elo-wise):
75/200

Since the bug and its elimination, it has started to progress again from a lower value:

LC0 ID148 (the last "bignet" I tested):
67/200

But it is still a long way from the level of GreKo 6.5, a standard AB engine at 2300 CCRL Elo. Let's see; it does not seem that easy to get there by training alone. In games, hardware and LTC (the scaling is very good) might be crucial.
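
To put the counts above in perspective, the following sketch converts them to solve rates with a rough standard error. It treats each position as an independent pass/fail trial, which is a simplifying assumption (all engines ran the same 200 positions, so the results are in fact paired). The helper names are illustrative.

```python
import math

SUITE_SIZE = 200

# Solved counts as reported in this post.
solved = {
    "GreKo 6.5": 143,
    "LC0 ID122": 59,
    "LC0 ID124": 75,
    "LC0 ID148": 67,
}

def solve_rate(correct, total=SUITE_SIZE):
    """Fraction of positions solved."""
    return correct / total

def std_error(correct, total=SUITE_SIZE):
    """Binomial standard error of the solve rate."""
    p = correct / total
    return math.sqrt(p * (1 - p) / total)

for name, correct in solved.items():
    rate = 100 * solve_rate(correct)
    err = 100 * std_error(correct)
    print(f"{name:10s} {rate:5.1f}% +/- {err:.1f}")
```

With 200 positions each rate carries an uncertainty of a bit more than 3 percentage points under this model, so the 4-point gap between ID124 and ID148 is not much larger than the noise, while the gap to GreKo 6.5 is many times larger.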

CMCanavessi · Post by **CMCanavessi** » Fri Apr 20, 2018 1:11 am

Laskos wrote:
Still, the tactical weakness is pretty obvious, and the improvement rate is low. I selected randomly 200 out of 879 ECM.epd tactical middlegame positions, which are solved overwhelmingly by top engines. I let LC0 on 4 cores for 20s/position (about 15,000 playouts per position). In these conditions, overall (strength-wise) LC0 is similar in strength to an AB standard engine GreKo 6.5 (about 2300 CCRL Elo) on one core. Performance on this 200 positions suite:

GreKo 6.5:
143/200

LC0 v0.7

LC0 ID122 (the last "smallnet"):
59/200

LC0 ID 124 (the second "bignet", one of the strongest Elo-wise):
75/200

since the bug and its elimination, it started to progress again from a lower value:

LC0 ID148 (the last "bignet" I tested):
67/200

But to a 2300 CCRL Elo level standard AB engine GreKo 6.5 it is still a long way. Let's see, it seems not that easy only by training, in games hardware and LTC (the scaling is very good) might be crucial.

The problem with that methodology is that it doesn't work that well with Leela, because you need to play at least 8 moves to fill her history planes. If you just feed her a random PGN, she won't evaluate it that well because of that.
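
The history issue described above can be pictured as follows. This is a minimal sketch assuming the network input holds the 8 most recent positions (as described for AlphaZero). `history_stack` and the `None` padding are hypothetical illustrations; how LC0 actually fills missing slots is not shown in this thread.

```python
HISTORY_LEN = 8  # assumed number of past positions fed to the network

def history_stack(positions, history_len=HISTORY_LEN):
    """Return the most recent positions, newest first, padded with None
    where no earlier position is known."""
    recent = list(positions)[-history_len:][::-1]
    return recent + [None] * (history_len - len(recent))

def filled_slots(positions, history_len=HISTORY_LEN):
    """Count how many history slots hold a real position."""
    return sum(p is not None for p in history_stack(positions, history_len))

# A test-suite entry supplies only the current position.
from_suite = ["position from the test suite"]
# A game in progress supplies every position reached so far.
from_game = [f"position after ply {i}" for i in range(12)]

print(filled_slots(from_suite))  # only the current position is known
print(filled_slots(from_game))   # all slots come from real game history
```

A net trained only on full game histories sees the first kind of input rarely or never, which is the concern raised here about feeding it isolated test positions.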

Laskos · Post by **Laskos** » Fri Apr 20, 2018 1:24 am

CMCanavessi wrote:
The problem with that methodology is that it doesn't work that well with Leela, because you need to play at least 8 moves to fill her history plane. If you just feed her a random pgn it won't evaluate it that good because of that fact.

You mean a random FEN? Even if that is the case, on a _positional_ "random FEN" opening suite the results are completely the opposite:

200 positions:

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1]

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile

Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro    (3335 CCRL)   :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3 (3216 CCRL)   :     119       200   59.5      1.8     20.0  openings200beta07.epd

LCZero  *************  ID124   :     118       200   59.0      2.7     20.0  openings200beta07.epd

Fire 5 x64       (3341 CCRL)   :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06       (3162 CCRL)   :     110       200   55.0      1.6     20.0  openings200beta07.epd

LCZero  *************  ID83    :     109       200   54.5      1.1     20.0  openings200beta07.epd

Fritz 15         (3227 CCRL)   :     102       200   51.0      1.9     20.0  openings200beta07.epd

LCZero  *************  ID148   :     101       200   50.5      3.0     20.0  openings200beta07.epd

Fruit 2.1        (2685 CCRL)   :      91       200   45.5      1.5     20.0  openings200beta07.epd

LCZero  *************  ID59    :      90       200   45.0      1.7     20.0  openings200beta07.epd

GreKo 6.5        (2336 CCRL)   :      78       200   39.0      1.6     20.0  openings200beta07.epd
Sjaak II 1.3.1   (2194 CCRL)   :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01    (2098 CCRL)   :      74       200   37.0      1.6     20.0  openings200beta07.epd
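
The contrast between the two suites can be made explicit by putting the solve counts reported in this thread side by side. The sketch below uses only those reported numbers; the dictionary layout and the `gap_points` helper are illustrative. The two suites contain different positions, so only the sign and rough size of the gap are meaningful.

```python
SUITE_SIZE = 200  # both suites have 200 positions

# Solved counts reported in this thread for engines tested on both suites.
results = {
    "GreKo 6.5": {"tactical": 143, "positional": 78},
    "LC0 ID124": {"tactical": 75, "positional": 118},
    "LC0 ID148": {"tactical": 67, "positional": 101},
}

def gap_points(name):
    """Positional minus tactical solve rate, in percentage points."""
    r = results[name]
    return 100 * (r["positional"] - r["tactical"]) / SUITE_SIZE

for name in results:
    print(f"{name:10s} positional - tactical = {gap_points(name):+.1f} points")
```

GreKo 6.5 does far better on the tactical suite than on the positional one, while both LC0 nets show the reverse pattern.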

Milos · Post by **Milos** » Fri Apr 20, 2018 1:25 am

CMCanavessi wrote:The problem with that methodology is that it doesn't work that well with Leela, because you need to play at least 8 moves to fill her history plane. If you just feed her a random pgn it won't evaluate it that good because of that fact.

So basically you are saying it is useless for any kind of analysis?

CMCanavessi · Post by **CMCanavessi** » Fri Apr 20, 2018 1:27 am

@Kai: yes, I meant a FEN position.

Milos wrote:
CMCanavessi wrote:The problem with that methodology is that it doesn't work that well with Leela, because you need to play at least 8 moves to fill her history plane. If you just feed her a random pgn it won't evaluate it that good because of that fact.
So basically you are saying it is useless for any kind of analysis?

Well, pretty much, for now. I read that there are plans to remove the whole history thing and make it much more compact.

mirek · Post by **mirek** » Fri Apr 20, 2018 2:06 am

Daniel Shawul wrote: That must be due to luck then that ID149 solved it, ID150 could go back to f3e5 move and what will you then say it learned?

Must be? "May be" would be more appropriate. And obviously I was trolling you a bit there, but OK, let's make a collection of shallow tactical blunders by ID125 from TCEC and see how ID149 handles them, and then the net a week later, and another week later, etc. Do you wanna bet?

Daniel Shawul wrote: You can't learn static details of tactics with a NN and expect it to work in all positions.

I don't expect it to work in all positions. I only expect it to make an overall positive contribution to strength. That is, even if it occasionally overlooks tactics that an alpha-beta engine would never miss, that should be more than compensated for by reaching greater depths (when done correctly).

Daniel Shawul wrote: Keep on tuning to solve the tactics ... but then don't run it on massive hardware and say it improved tactics, we know that will help.

Why this argument again? How many times have I already mentioned that A0 on a 1080Ti at 1 min/move should be of comparable strength to SF8@64 cores at 1 min/move?

And you still go on and on about how good tactics for A0 are only possible with 4xTPU, never addressing any of my points. So where's the problem?

Do you consider those graphs (Figure 2: Scalability of AlphaZero with thinking time) to be "fake"? Because otherwise I don't understand why you would write the things you do. Personally, I don't see any reason to doubt DeepMind's paper, so until there is new evidence that contradicts their findings, I base my conclusions on that scaling graph. At least that is hard evidence, compared to your arguments, which are based almost entirely on your "feeling" that xyz is not achievable with an NN (and you know it even before anyone besides DeepMind has tried it).

Therefore my conclusion is that at 1 min/move, A0@1080Ti plays at a level comparable to SF8@64 cores. So, at least in practical play, A0 can't be tactically vulnerable; otherwise it would be crushed by SF8.

Maybe if you took a specialized set of hard tactical puzzles, SF8 would score +300% compared to A0, who knows. But in the end, the best engine in practical play is not the one that is best at tactics, but the one that is best at scoring points with the given resources. And A0 is obviously very good at that.

mirek · Post by **mirek** » Fri Apr 20, 2018 2:31 am

Laskos wrote: You mean a random FEN? Even if this is the case, compared to _positional_ "random FEN" opening suite, the results are completely opposite:

I think it's no coincidence that LC0 is positionally stronger and tactically weaker than engines of comparable strength. Just imagine you were a self-learning chess AI: what do you think would help you score more points in the initial phases of learning, knowing the general rules of the game (strategy, e.g. exchanging a knight for a pawn is not good) or the exceptions to the rules (in some positions sacrificing a knight for a pawn is actually good)?

Which do you think is better to learn first? I would argue that you probably need to understand the general rules before you start to learn the exceptions to the rules.

(and that you probably also need more games to learn the exceptions than to learn the general rules)
