Lco road to 8000 or 10000! self play elo.


mar
Posts: 2559
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Lco road to 8000 or 10000! self play elo.

Post by mar »

Milos wrote: Fri Nov 23, 2018 11:46 pm Your self play TC was 1min per game or similar, his games were 30sec per move. Due to draw rate you might have higher error margins with 20 super fast self-play games than with 4 30sec/move games.
In his case if nets were really of equal strength draw probability could be easily 80%. So probability of 3 wins out of 4 games for one engine in case of engines of equal strength would be like 0.1%.
So 4 games could be indeed more than sufficient to prove with almost 100% certainty that engine A is stronger than engine B.
Ofc one would need to have some knowledge of statistics which doesn't seem to be your case...
I can show a sample of CCRL 40/40 games (easily 30s/move) between equal-rated engines (2900) with a sequence of four 1-0 results for either one.
Self-play may have an even higher draw rate, and probably a much higher one for top engines, but that doesn't change the fact that 4 games are not enough, whether you like it or not. As for your "100% certainty", no comment :)
Martin Sedlak
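
The "3 wins out of 4" figure quoted above can be checked with a few lines of arithmetic. This is only a sketch under assumptions not stated in the thread: games are independent, the engines are exactly equal, colours are ignored, and the non-drawn share is split evenly between wins and losses. The function name `prob_at_least_k_wins` is made up for illustration.

```python
from math import comb

def prob_at_least_k_wins(n_games, k_wins, draw_rate):
    """Chance that one given engine wins at least k_wins of n_games.

    Assumes equal strength: P(win) = P(loss) = (1 - draw_rate) / 2,
    with every game independent of the others.
    """
    p_win = (1.0 - draw_rate) / 2.0
    return sum(
        comb(n_games, w) * p_win**w * (1.0 - p_win) ** (n_games - w)
        for w in range(k_wins, n_games + 1)
    )

# 80% draw rate, 4 games, at least 3 wins for one specific engine
one_side = prob_at_least_k_wins(4, 3, 0.80)
# Both engines cannot each win 3 of 4, so the two events are disjoint
either_side = 2 * one_side

print(f"specific engine: {one_side:.4%}")
print(f"either engine:   {either_side:.4%}")
```

Under these assumptions the result is about 0.37% for a specific engine and about 0.74% for either one, so the same order of magnitude as the quoted "like 0.1%" but several times larger.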
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Lco road to 8000 or 10000! self play elo.

Post by Milos »

mar wrote: Sat Nov 24, 2018 1:06 am I can show a sample of CCRL 40/40 games (easily 30s/move) between equal-rated engines (2900) with a sequence of four 1-0 results for either one.
Self-play may have an even higher draw rate, and probably a much higher one for top engines, but that doesn't change the fact that 4 games are not enough, whether you like it or not. As for your "100% certainty", no comment :)
CCRL 40/40 with 2900 Elo engines is at the level of SF self-play in the testing framework. You clearly don't understand the impact of draw rate on error margins.
Showing me an example of 4 wins in a row between equal engines that have a 70% draw rate would be enough; you can pick any TC and any engine strength you like.
But I'm pretty certain you can't, simply because it implies at least 4000 LTC games between high-quality opponents ;).
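
The "at least 4000 games" estimate can be reproduced with a rough calculation. This is a sketch under simplifying assumptions that are not in the post: equal engines, independent games, a 70% draw rate split into 15% wins for each side, and games grouped into non-overlapping blocks of four. The helper name `streak_block_probability` is hypothetical.

```python
def streak_block_probability(draw_rate, streak_len=4):
    """Chance that a block of streak_len games is a clean sweep for either engine.

    Assumes equal strength and independent games, so each side wins a
    single game with probability (1 - draw_rate) / 2.
    """
    p_win = (1.0 - draw_rate) / 2.0
    # Factor 2: the sweep can go to either of the two engines
    return 2.0 * p_win**streak_len

p_block = streak_block_probability(0.70)
expected_blocks = 1.0 / p_block        # mean number of blocks until a sweep
expected_games = 4 * expected_blocks   # each block is four games

print(f"sweep probability per 4-game block: {p_block:.4%}")
print(f"expected games until first sweep:   {expected_games:.0f}")
```

With these inputs a sweep happens in roughly 0.1% of four-game blocks, which works out to a little under 4000 games on average. Counting overlapping windows instead of fixed blocks would lower that number, so treat it as a ballpark only.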
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Lco road to 8000 or 10000! self play elo.

Post by Milos »

Nay Lin Tun wrote: Sat Nov 24, 2018 12:49 am
Milos wrote: Sat Nov 24, 2018 12:20 am
Nay Lin Tun wrote: Sat Nov 24, 2018 12:10 am
MikeB wrote: Fri Nov 23, 2018 8:38 pm
Nay Lin Tun wrote: Fri Nov 23, 2018 8:47 am
P.S. Testers say the 30xx network is still 100 to 150 Elo behind 11248.
Fake news....

anyone can verify or disprove your claims after even just a few games, I believe it's actually worse...

Rank Name             Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------
   1 Lc0 v0.19.0 11261   3198   0.0   84   84    50   38.5  77.0   33    6   11  66.0  22.0  3002 
   2 Lc0 v0.19.0 31493   3002 195.7   84   84    50   11.5  23.0    6   33   11  12.0  22.0  3198
---------------------------------------------------------------------------------------------------------
Custom openings that may exaggerate Elo differences (due to the unbalanced nature of the openings).
Hmm, I read this forum post just before posting this. According to his test on decent hardware (which would closely reflect performance in TCEC or CCCC), the estimate is -200 Elo below the latest SF, whereas the best net, 11248, is known to be below -100 Elo (speed ratio 1:1000). Also, on your slow GPU or at a very short time control, you are mostly testing the strength of the policy head, because the value net and MCTS don't have a good chance to correct the mistakes made by the policy head.
https://ibb.co/eq90FV
20 Mnodes per move is close to TCEC performance for SFdev???
Didn't know TCEC used TC of 10''+0.1''. :lol: :lol: :lol:

Among testers, his hardware setup is the most similar to those of TCEC/CCCC. I think the average speeds of Lc0 and SF in the last CCCC were around 40 knps vs 80 Mnps (1:2000), which would add another -50 Elo to the gap between Lc0 and SF.
What can I say, you clearly lack basic comprehension skills...
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Lco road to 8000 or 10000! self play elo.

Post by carldaman »

If self-play is totally broken and meaningless, why don't they test against engines of established known strength? :?:
chrisw
Posts: 4319
Joined: Tue Apr 03, 2012 4:28 pm

Re: Lco road to 8000 or 10000! self play elo.

Post by chrisw »

carldaman wrote: Sat Nov 24, 2018 1:46 am If self-play is totally broken and meaningless, why don't they test against engines of established known strength? :?:
The self-play testing is useful because it shows, at a minimum, that something is happening with the training. Contrast that with a situation where self-play got stuck.
alex67a
Posts: 50
Joined: Mon Sep 10, 2018 10:15 am
Location: Denmark
Full name: Alexander Spence

Re: Lco road to 8000 or 10000! self play elo.

Post by alex67a »

mar wrote: Sat Nov 24, 2018 1:06 am I can show a sample of CCRL 40/40 games (easily 30s/move) between equal-rated engines (2900) with a sequence of four 1-0 results for either one.
Self-play may have an even higher draw rate, and probably a much higher one for top engines, but that doesn't change the fact that 4 games are not enough, whether you like it or not. As for your "100% certainty", no comment :)
This is interesting...
But I tried NN 11149 against several NNs with Elo > 6000, and 11149 wins every time.
Is it possible that always getting the same result is just a consequence of playing only a few games?
I don't know much about statistics, but I have my doubts.
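
One way to put a number on that doubt is the "likelihood of superiority" (LOS) estimate often used in engine testing. The sketch below uses the common normal approximation based only on wins and losses (draws drop out). The function name is hypothetical, and the approximation is crude for very small samples, so the output is a rough indication rather than a proof.

```python
from math import erf, sqrt

def likelihood_of_superiority(wins, losses):
    """Approximate chance that the side with `wins` is really the stronger one.

    Normal approximation: LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))).
    Draws carry no information in this formula and are left out.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + erf((wins - losses) / sqrt(2.0 * decisive)))

# A short match with three wins and no losses
print(f"3 wins, 0 losses:  {likelihood_of_superiority(3, 0):.3f}")
# The 50-game result posted earlier in the thread (33 wins, 6 losses, 11 draws)
print(f"33 wins, 6 losses: {likelihood_of_superiority(33, 6):.5f}")
```

Three wins without a loss comes out at roughly 0.96 under this approximation, which is suggestive but not conclusive. A result like 33 wins to 6 losses is far stronger evidence. Repeating a short match against several different nets, as described above, also adds evidence that no single four-game match carries on its own.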
Nay Lin Tun
Posts: 708
Joined: Mon Jan 16, 2012 6:34 am

Re: Lco road to 8000 or 10000! self play elo.

Post by Nay Lin Tun »

Celebrating self play elo 9000+.
Btw, Kings Crusher said in Twitch chat that the newer Stockfish version's +50 Elo is also practically self-play Elo, because all AB engines have more or less similar search.

And one tester said that while SF 9 vs Lc0 in his tests is +50 for SF 9, SF 10 vs Lc0 is +70.

Practically, SF's +50 Elo is also bogus, because SF is playing against another 100 AB engines. If there were 50 AB engines and 50 NN engines, I think it would scale to only +25 or +30.
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Lco road to 8000 or 10000! self play elo.

Post by yanquis1972 »

That's an interesting idea; I hope someone (SF team?) tests it thoroughly. I'd be surprised if there isn't a similar Elo increase, though.