Leela outplays SF Dev in a position that SF evaluates as draw

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Leela outplays SF Dev in a position that SF evaluates as draw

Post by zullil »

MikeB wrote: Tue Jul 16, 2019 8:59 pm
Not only that, this SF derivative (with Lc0 scoring), at the critical moment, readily sacs the pawn to get into the position, and it sees the Ne4 Bxe4 dxe4 Rxe4 f5 maneuver almost instantly:

[d]r2q1rk1/pp3p2/1bn2n1p/3p1Bpb/8/1NP2NBP/PP3PP1/R2QR1K1 b - - 1 16

dep	score	nodes	time	(not shown:  tbhits	knps	seldep)
 44	+51.15 	5.52G	3:59.17	Ne4 Bxe4 dxe4 Rxe4 f5 Re2 Qxd1+ Rxd1 Rfe8 Rxe8+ Rxe8 Bd6 Re2 Rd2 Bxf2+ Kf1 Rxd2 Nfxd2 Bb6 Nc5 Bxc5 Bxc5 b6 Bd6 f4 Kf2 Bg6 Nf3 Kf7 g3 fxg3+ Kxg3 Be4 Ne5+ Nxe5 Bxe5 b5 a3 Ke6 Bd4 a5 h4 gxh4+ Kxh4 Kf5 Bg1 Bd5 Kh5 Ke4 Kxh6 Kd3 Kg5 
"this SF derivative"

Good plan. If you give it a name, someone might be triggered. :roll:

Have you abandoned the name "Burnzie" for some reason? Who could object to that one?
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Post by jp »

MikeB wrote: Tue Jul 16, 2019 8:59 pm
Not only that, this SF derivative (with Lc0 scoring), at the critical moment, readily sacs the pawn to get into the position, and it sees the Ne4 Bxe4 dxe4 Rxe4 f5 maneuver almost instantly:

r2q1rk1/pp3p2/1bn2n1p/3p1Bpb/8/1NP2NBP/PP3PP1/R2QR1K1 b - - 1 16

dep	score	nodes	time	(not shown:  tbhits	knps	seldep)
 44	+51.15 	5.52G	3:59.17	Ne4 Bxe4 dxe4 Rxe4 f5 Re2 Qxd1+ Rxd1 Rfe8 Rxe8+ Rxe8 Bd6 Re2 Rd2 Bxf2+ Kf1 Rxd2 Nfxd2 Bb6 Nc5 Bxc5 Bxc5 b6 Bd6 f4 Kf2 Bg6 Nf3 Kf7 g3 fxg3+ Kxg3 Be4 Ne5+ Nxe5 Bxe5 b5 a3 Ke6 Bd4 a5 h4 gxh4+ Kxh4 Kf5 Bg1 Bd5 Kh5 Ke4 Kxh6 Kd3 Kg5 
 
By "Lc0 scoring", do you mean it gives the side to move (Black) just a 51.15% chance of winning?

And which Lc0 scoring is this (old or new)?
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Post by MikeB »

By "Lc0 scoring", do you mean it gives the side to move (Black) just a 51.15% chance of winning?
Correct.
And which Lc0 scoring is this (old or new)?
Neither, actually. I should have said Leela-"like" scoring, but it's the same idea: scoring %. The Lc0-to-centipawn "formula" was reverse engineered and then tweaked so that, for my engine, the scoring % after 1.e4 was just under 55% for White after a 5-minute search (approx. 5 billion nodes). A roughly two-queen advantage is evaluated as a 100% win, so two-, three-, and four-queen advantages are all evaluated as a 100% win, which to me is more relevant than showing a 2000, 3000, or 4000 centipawn evaluation. Of course, if you prefer centipawns, there is an option to display that as well.

centipawns	score %
0		50.00%
10		51.31%
20		52.61%
30		53.91%
40		55.21%
50		56.50%
60		57.78%
70		59.04%
80		60.30%
90		61.55%
100		62.78%
110		63.99%
120		65.18%
130		66.36%
140		67.52%
150		68.65%
160		69.77%
170		70.86%
180		71.92%
190		72.97%
200		73.99%
210		74.98%
220		75.95%
230		76.89%
240		77.80%
250		78.69%
260		79.56%
270		80.39%
280		81.20%
290		81.99%
300		82.75%
310		83.48%
320		84.19%
330		84.87%
340		85.53%
350		86.17%
360		86.78%
370		87.36%
380		87.93%
390		88.47%
400		89.00%
410		89.50%
420		89.98%
430		90.44%
440		90.88%
450		91.31%
460		91.71%
470		92.10%
480		92.47%
490		92.83%
500		93.17%
510		93.49%
520		93.81%
530		94.10%
540		94.39%
550		94.66%
560		94.91%
570		95.16%
580		95.40%
590		95.62%
600		95.83%
610		96.04%
620		96.23%
630		96.42%
640		96.59%
650		96.76%
660		96.92%
670		97.07%
680		97.22%
690		97.36%
700		97.49%
710		97.61%
720		97.73%
730		97.84%
740		97.95%
750		98.05%
760		98.15%
770		98.24%
780		98.33%
790		98.41%
800		98.49%
810		98.57%
820		98.64%
830		98.71%
840		98.77%
850		98.84%
860		98.90%
870		98.95%
880		99.00%
890		99.05%
900		99.10%
910		99.15%
920		99.19%
930		99.23%
940		99.27%
950		99.31%
960		99.34%
970		99.38%
980		99.41%
990		99.44%
1000		99.47%
1100		99.68%
1200		99.81%
1300		99.89%
1400		99.93%
1500		99.96%
1600		99.98%
1700		99.99%
1800		99.99%
1900		100.00%
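
The rows above are closely matched by a plain logistic curve. The sketch below reproduces a few of them under that assumption. The scale constant 191.3 is fitted by hand to the posted values and is not taken from McCain's uci.cpp, so treat it as illustrative only.

```python
import math

# Hypothetical scale constant, fitted by hand to the table rows above.
# It is NOT taken from the engine's source code.
FITTED_SCALE_CP = 191.3


def score_percent(cp: float, scale: float = FITTED_SCALE_CP) -> float:
    """Map a centipawn score to an expected score in percent (logistic curve)."""
    return 100.0 / (1.0 + math.exp(-cp / scale))


if __name__ == "__main__":
    # Print a few rows in the same two-column layout as the table.
    for cp in (0, 50, 100, 200, 400, 1000):
        print(f"{cp}\t\t{score_percent(cp):.2f}%")
```

A curve of this shape also flattens out at the top, which matches the point that two, three, or four queens up all read as essentially 100%.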

MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Post by MikeB »

zullil wrote: Tue Jul 16, 2019 9:17 pm
"this SF derivative"

Good plan. If you give it a name, someone might be triggered. :roll:

Have you abandoned the name "Burnzie" for some reason? Who could object to that one?
Haha, no, but I haven't released it yet. I will at some point ...
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Post by jp »

MikeB wrote: Tue Jul 16, 2019 10:22 pm
And which Lc0 scoring is this (old or new)?
Neither, actually. I should have said Leela-"like" scoring, but it's the same idea: scoring %. The Lc0-to-centipawn "formula" was reverse engineered and then tweaked so that, for my engine, the scoring % after 1.e4 was just under 55% for White after a 5-minute search (approx. 5 billion nodes). A roughly two-queen advantage is evaluated as a 100% win, so two-, three-, and four-queen advantages are all evaluated as a 100% win, which to me is more relevant than showing a 2000, 3000, or 4000 centipawn evaluation.
I see... The Lc0 formula is claimed to be this:
Lc0 wiki wrote: cp = 111.714640912 * tan(1.5620688421 * Q),

where Q is the average expected score in the range [-1, 1].
I don't know if that's the old or the new.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Post by jp »

It looks like it's the new Leela formula. It was updated in Lc0 v0.21.2 (2019-06-09).

The old one was
pull/841 wrote: cp = 290.680623072 * tan(1.548090806 * Q),

12800 cp when Q = 1 (100% winrate).
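
As a rough check of the two formulas quoted in this thread, here is a minimal Python sketch. The function and constant names are my own; only the numeric coefficients come from the quotes. Both versions map Q = 1 to roughly 12800 cp, but the newer one gives much smaller centipawn values for moderate Q.

```python
import math

# Coefficients as quoted in the thread; the names are illustrative only.
NEW_FORMULA = (111.714640912, 1.5620688421)  # quoted from the Lc0 wiki
OLD_FORMULA = (290.680623072, 1.548090806)   # quoted from pull/841


def q_to_cp(q: float, coeffs: tuple[float, float] = NEW_FORMULA) -> float:
    """Convert an expected score Q in [-1, 1] to centipawns."""
    scale, slope = coeffs
    return scale * math.tan(slope * q)


if __name__ == "__main__":
    # Compare both formulas side by side for a few values of Q.
    for q in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(f"Q={q:+.1f}  new={q_to_cp(q, NEW_FORMULA):9.1f}  "
              f"old={q_to_cp(q, OLD_FORMULA):9.1f}")
```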
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Post by MikeB »

jp wrote: Tue Jul 16, 2019 10:43 pm
I see... The Lc0 formula is claimed to be this:
Lc0 wiki wrote: cp = 111.714640912 * tan(1.5620688421 * Q),

where Q is the average expected score in the range [-1, 1].
I don't know if that's the old or the new.
That's it in principle, so the output in scoring % will be "0 to 1": 0% means a certain loss, 50% means drawish, and 100% (or 1) means a certain win.

To simplify, this can be done in Excel; it's not my precise formula in McCain (see uci.cpp towards the bottom), but it's the same idea. This formula here, IMHO, is overconfident, so what I have in McCain shows less confidence in the scoring %. Because SF shows optimistic CP scores (my opinion), I lowered the confidence in the scoring % even further.
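
To make the "0 to 1" reading concrete, the quoted Lc0 formula can be inverted to turn a centipawn score into a scoring %. This is a sketch of that inversion only, not the formula in McCain. The function name is made up, and rescaling Q from [-1, 1] to a 0..1 score as (Q + 1) / 2 is an assumption based on the description above.

```python
import math

# Coefficients of the Lc0 formula quoted above: cp = A * tan(B * Q).
A = 111.714640912
B = 1.5620688421


def cp_to_score(cp: float) -> float:
    """Invert cp = A * tan(B * Q), then rescale Q from [-1, 1] to [0, 1]."""
    q = math.atan(cp / A) / B
    return (q + 1.0) / 2.0


if __name__ == "__main__":
    for cp in (-100, 0, 100, 300):
        print(f"{cp:+5d} cp -> {100 * cp_to_score(cp):.1f}%")
```

For +100 cp this gives a noticeably higher scoring % than the 62.78% in the table posted earlier, which fits the remark that the quoted formula is the more confident of the two.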
Image
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Post by jp »

George Tsavdaris wrote: Tue Jul 16, 2019 6:28 pm
zullil wrote: Mon Jul 15, 2019 8:44 pm
jp wrote: Mon Jul 15, 2019 5:05 pm
Well, what is Lc's evaluation for the position above? Maybe it's also about 0.5-1.0 towards Black.
128x10.T17.6-swa-20000 lc0's output:

info depth 23 seldepth 58 time 1557045 nodes 4665017 score cp 48 hashfull 1000 nps 2996 tbhits 0 pv a8d8 d1d2 g8h7 h3h4 h5f7 e6d6 g5g4 d6d8 c6d8 f3d4 f7c4 g1h1 d8e6 h2g1 e6d4 b3d4 c4a2 f2f3 g4g3 b2b3 b6d4 g1d4 a2b3 d4a7 b7b5 d2d7 h7g6 d7d6 g6h7 a7d4 f8a8 d6f6 a8a1 d4g1 h6h5 f6f4 h7g6 f4f8 a1c1 f8e8 c1c3 e8e5 c3c4 g1e3 b5b4 e5g5 g6f7 g5g3 c4h4 h1g1 b3e6 g3g5

Perhaps someone with a "real" lc0 system can provide better.
Lc0 v21.2 JH T8-swa-610000 (the TCEC sufi15 winner net):

 22/55	00:43	 1.457.324	33.764	-0,92	Ra8-d8 Rd1xd8 Rf8xd8 Nb3-d2 Bh5-f7 Re6-e2 Bf7xa2 g2-g3 Bb6-c7 g3xf4 g5xf4 Nf3-h4 Kg8-h7 Nh4-g2 Rd8-f8 b2-b3 f4-f3 Nd2xf3 Rf8xf3 Bh2xc7 Ba2xb3 Re2-d2 Rf3xc3 Rd2-d7+ Kh7-g8 Bc7-d6 b7-b5 Rd7-b7 b5-b4 Bd6xb4 Nc6xb4 Rb7xb4 a7-a5 Rb4-b6 a5-a4 Rb6-a6 Rc3-c1+ Kg1-h2 Rc1-a1 Ra6-b6 Ra1-b1 Rb6-a6 h6-h5 Ng2-e3 Rb1-a1 Ra6-b6 Bb3-f7 Rb6-b8+ Kg8-g7 Rb8-b7 a4-a3
Thanks, George. Here's the Lc0 output you got:
[pgn] [FEN "r4rk1/pp6/1bn1R2p/6pb/5p2/1NP2N1P/PP3PPB/3R2K1 b - - 0 7"] Ra8-d8 Rd1xd8 Rf8xd8 Nb3-d2 Bh5-f7 Re6-e2 Bf7xa2 g2-g3 Bb6-c7 g3xf4 g5xf4 Nf3-h4 Kg8-h7 Nh4-g2 Rd8-f8 b2-b3 f4-f3 Nd2xf3 Rf8xf3 Bh2xc7 Ba2xb3 Re2-d2 Rf3xc3 Rd2-d7+ Kh7-g8 Bc7-d6 b7-b5 Rd7-b7 b5-b4 Bd6xb4 Nc6xb4 Rb7xb4 a7-a5 Rb4-b6 a5-a4 Rb6-a6 Rc3-c1+ Kg1-h2 Rc1-a1 Ra6-b6 Ra1-b1 Rb6-a6 h6-h5 Ng2-e3 Rb1-a1 Ra6-b6 Bb3-f7 Rb6-b8+ Kg8-g7 Rb8-b7 a4-a3 [/pgn]

The SF & Leela outputs are 0.5-1.0 in Black's favor.
There's no big difference between what SF & Leela think.
They could both be wrong, but it looks like White is on the worse side of a draw.

In Leela's PV after 18. Rd2, SF's evaluation is -0.85, depth=34, and its PV leads to this tablebase draw:

[d]8/3k4/R3b3/8/p7/5r2/5P2/6K1 w - - 0 31

In Leela's PV after 25. Ra6, SF's evaluation is -1.25, depth=33, and its PV leads to this TB draw:

[d]8/8/4k3/R7/5r2/5N2/p2K4/1b6 w - - 0

At the end of Leela's PV after 32...a3, SF's PV at some depth leads to this TB draw,

[d]8/8/8/4k3/7p/5K1b/p4P2/2R5 w - - 0 42

and at depth 40 SF's evaluation is 0.00, and its PV leads to this TB draw:

[d]8/8/8/8/5K1p/1b5k/p4P2/R7 w - - 0 46
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Post by jp »

Having said all that, I don't think SF is totally off the hook yet. It just sounded like SF was being criticised for not having an evaluation of e.g. -2, when -2 is probably totally wrong.

SF (W) & Leela (B) play to this position:

[d]3r4/pp4k1/1bn1Rr1p/6p1/5p1P/1NP2P2/PP1R1P1B/6K1 w - - 1 11

SF thinks this is not Black's best line. At some depths, SF shows 0.00 here, but at other depths a small negative value.

Can someone tell me what Leela's evaluation in the diagram position is?

Then the question is what the "correct" evaluation of this position is. Is it really a draw?
Hai
Posts: 598
Joined: Sun Aug 04, 2013 1:19 pm

Post by Hai »

jp wrote: Sat Jul 20, 2019 1:56 am
Having said all that, I don't think SF is totally off the hook yet. It just sounded like SF was being criticised for not having an evaluation of e.g. -2, when -2 is probably totally wrong.

SF (W) & Leela (B) play to this position:

[d]3r4/pp4k1/1bn1Rr1p/6p1/5p1P/1NP2P2/PP1R1P1B/6K1 w - - 1 11

SF thinks this is not Black's best line. At some depths, SF shows 0.00 here, but at other depths a small negative value.

Can someone tell me what Leela's evaluation in the diagram position is?

Then the question is what the "correct" evaluation of this position is. Is it really a draw?


1.Re4 Rxd2 2.Nxd2 Rd6 3.Nc4 Rd1+ 4.Kg2 Bc7 5.Re8 Rd7 6.a4 Kf6 7.Rh8 Kg6 8.Rg8+ Kf5 9.Rf8+ Kg6 10.Rg8+ Kf6 11.Rh8 Kg7 12.Re8 Ne7 13.Kf1 Kf7 14.Rh8 Kg7 15.Re8 a6 16.Ke2 Ng6 17.hxg5 hxg5 18.Bg1 Kf7 19.Re4 Kf6 20.Nd2 Kf5 21.Re8 Bd6 22.Nb3
-1.17
depth: 28/73
214MN
tb=747
LC0 42783