http://rebel13.nl/2-moves-new.html (6,728 positions) vs 2-moves-epd (404 positions). Not much different.
Remarkable: Lc0 still tops the list at 1 and 2 seconds while doing only 30-35 NPS. I also noticed that even at 100ms Lc0 would top it.
Remarkable, and that's putting it mildly.
Next 2 lists: 4-5 and 6-moves.
OKE - Opening Knowledge Engines
Moderators: hgm, Rebel, chrisw
-
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: OKE - Opening Knowledge Engines
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 12542
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: OKE - Opening Knowledge Engines
Laskos wrote: ↑Tue Jun 11, 2019 9:17 pm
Well Ed, why bother? We have the excellent "Strategic Test Suite" which "consists of series of themed test suites designed to evaluate chess engine's long term understanding of strategical and positional concepts" (self-description). Dann and his Rybka were the main contributors to it. Here are some current standings on "strategical and positional concepts" (have no patience testing more engines):
Code: Select all
Stuck to the solution from 1s to 2s of thinking
4 i7 cores at 3.80GHz
RTX 2070 GPU
Houdini 1.5a         score=1339/1500 [averages on correct positions: depth=8.4 time=0.15 nodes=1320734]
Komodo 13.02         score=1319/1500 [averages on correct positions: depth=10.6 time=0.16 nodes=1064120]
Stockfish dev        score=1284/1500 [averages on correct positions: depth=10.8 time=0.18 nodes=1111414]
Texel 1.07           score=1241/1500 [averages on correct positions: depth=8.9 time=0.22 nodes=1379498]
Arasan 21.0          score=1195/1500 [averages on correct positions: depth=9.5 time=0.24 nodes=1017178]
Lc0 v21.2 ID42524    score=1177/1500 [averages on correct positions: depth=3.8 time=0.14 nodes=1580]
Fruit 2.1            score= 993/1500 [averages on correct positions: depth=5.2 time=0.23 nodes=505132]
I guess your suite might converge to STS as standings go. Good!

Rebel wrote: ↑Tue Jun 11, 2019 9:38 pm
You can't compare STS with OKE. STS was created with the help of the top engines of that time; all the Rybka derivatives will top the STS list, not only Houdini 1.5.

I am given too much credit. The bulk of the work was by Swaminathan. I just pointed engines at it and let them pound. Of course the hour per position (and there were far more positions rejected than accepted) can literally be reproduced in one second with modern hardware and software. My current estimate is that ten percent of the STS positions are wrong. Be that as it may, I am still proud of the achievement. I did a calculation recently. From start to end we averaged one position per day.
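One possible reading of the "stuck to the solution from 1s to 2s of thinking" criterion in the STS table is that a position only counts as solved if the engine already shows the solution move at 1 second and does not switch away from it before 2 seconds. The sketch below assumes that reading; the tool actually used is not described in the thread, and the `(seconds, move)` record format and the function name are hypothetical.

```python
def stuck_to_solution(updates, solution, start=1.0, end=2.0):
    """Return True if the engine's best move equals `solution` throughout
    the window [start, end] (in seconds).

    `updates` is a list of (seconds, move) pairs, one per best-move change.
    This record format is an assumption made for the sketch.
    """
    updates = sorted(updates)
    # Moves reported up to the start of the window; the last one is in force at `start`.
    before = [move for t, move in updates if t <= start]
    # Moves reported inside the window; any deviation counts as a miss.
    during = [move for t, move in updates if start < t <= end]
    if not before:
        return False  # no best move reported by the start of the window
    return before[-1] == solution and all(m == solution for m in during)


# Hypothetical search logs for one position whose solution is "e4".
steady = [(0.1, "d4"), (0.6, "e4"), (1.5, "e4")]
wavering = [(0.1, "e4"), (1.2, "d4"), (1.8, "e4")]
print(stuck_to_solution(steady, "e4"), stuck_to_solution(wavering, "e4"))
```

Under this reading an engine that finds the move, drops it and finds it again inside the window still scores a miss, which is stricter than checking the move at a single time limit.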
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 12542
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: OKE - Opening Knowledge Engines
I should mention that I always used the three strongest engines (not just Rybka). By the end of the test set Rybka was long retired.
-
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: OKE - Opening Knowledge Engines
Laskos wrote: ↑Tue Jun 11, 2019 9:59 pm
Well, after seeing your methodology of building the suite, I think I will stick to my opening suite, which I didn't appreciate too much; maybe I lack a sense of being an "expert" (I always feel a patzer in Chess).
The opening suite was built manually and pretty slowly almost three years ago, before any Lc0, and the boost of confidence came with Lc0 too:
Code: Select all
4 i7 cores at 3.80GHz
RTX 2070 GPU
Openings1000.epd test-suite (stuck to the solution from 1s to 2s of thinking):
Lc0 v21.2 ID42524 757/1000
Lc0 v21.2 ID32930 727/1000
Lc0 v21.2 ID11261 723/1000
Stockfish_dev 574/1000
Houdini 6.03 558/1000
Komodo 13.02 556/1000
Xiphos 0.5 513/1000
Booot 6.3.1 494/1000
Andscacs 0.95 484/1000
Laser 1.7 480/1000
Ethereal 11.25 467/1000
Texel 1.07 419/1000
Rodent III 376/1000
Fruit 2.1 348/1000
BikJump 2.01 276/1000
Predateur 2.2.1 265/1000
Here is the suite:
http://s000.tinyupload.com/?file_id=854 ... 6503996473
The number of positions is 1000, 5 times the same 200 positions, for less noise. My methodology hardly allows for more than 200 positions in some reasonable amount of time (several weekends spent).

Thanks for sharing. We are definitely pioneering in different areas. My focus is strictly on the first 6 moves. In your 200, many positions have already finished the development phase or are close to that. Nevertheless the results are very interesting. With ChessPartner -> Analyze EPD I ran your 200 positions with some engines.
I7 cores at 2.80GHz
one second per move.
Code: Select all
Engine: Lc0 v0.21.2-rc1 125/200
Engine: Stockfish 10 93/200
Engine: Ethereal 11.25 86/200
Engine: Senpai 1.0 78/200
Engine: Mephisto Gideon 76/200
Engine: Xiphos 0.5 72/200
Engine: Laser 1.6 71/200
Engine: Rebel Century 66/200
Engine: Sting SF 9.6 66/200
Engine: Texel 1.06a45 65/200
Engine: Rodent III 64/200
Engine: Rybka 4.1 61/200
-
- Posts: 12542
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: OKE - Opening Knowledge Engines
The engines that are interesting here are those that outkicked their coverage. Rodent and Gideon stand out like sore thumbs. How do they differ in opening eval?
-
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: OKE - Opening Knowledge Engines
Dann Corbit wrote: ↑Wed Jun 12, 2019 9:01 am
I am given too much credit. The bulk of the work was by Swaminathan. I just pointed engines at it and let them pound. Of course the hour per position (and there were far more positions rejected than accepted) can literally be reproduced in one second with modern hardware and software. My current estimate is that ten percent of the STS positions are wrong. Be that as it may, I am still proud of the achievement. I did a calculation recently. From start to end we averaged one position per day.

You did the best you could, and likely (or hopefully) many engines profited and starters still can. Whatever OKE will be, surely it will be outdated 5-10 years later. My hope is on Lc0.
-
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: OKE - Opening Knowledge Engines
Dann Corbit wrote: ↑Wed Jun 12, 2019 9:22 am
The engines that are interesting here are those that outkicked their coverage. Rodent and Gideon stand out like sore thumbs. How do they differ in opening eval?

For Gideon one reason would be, as already stated earlier, no reductions, no futility pruning etc. --> no loss in the quality of eval.
-
- Posts: 12542
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: OKE - Opening Knowledge Engines
Rebel wrote: ↑Wed Jun 12, 2019 9:33 am
For Gideon one reason would be, as already stated earlier, no reductions, no futility pruning etc. --> no loss in the quality of eval.

That does not explain it for me.
Quite the opposite.
Early on, there are not many striking tactical shots to filter out. And why does the enormous depth increase not overshadow those few tactical oversights that do occur?
I think there is something that does not meet the eye.
Tragically we cannot study what lc0 does since it is a black box that spits out brilliant chess moves. And his turk keeps mum.
-
- Posts: 893
- Joined: Mon Jan 15, 2007 11:23 am
- Location: Warsza
Re: OKE - Opening Knowledge Engines
Rodent's score might well be an artifact caused by the fact that it likes fianchetto and finds/completes it more often than other engines.
Other opening-related bits and pieces are a penalty for developing the queen before minor pieces (but it is quite low and unlikely to influence the result), some code for pawn chains from King's Indian Defence/Attack (unlikely to kick in on move 4) and piece/square table asymmetry causing poor little mouse to dislike a pawn on c2 (this one can have some influence).
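To make the first of those terms concrete, here is a toy sketch of what a "queen developed before minor pieces" penalty could look like. It is not Rodent's code: the function name, the board representation (a dict from square to piece letter, White only) and the penalty size are all assumptions chosen for illustration.

```python
# Home squares of White's minor pieces.
WHITE_MINOR_HOME = {"b1": "N", "g1": "N", "c1": "B", "f1": "B"}


def early_queen_penalty(white_pieces, per_minor=2):
    """Toy evaluation term: penalise a developed queen for every minor
    piece still sitting on its home square.

    `white_pieces` maps a square name to a piece letter for White only.
    `per_minor` is a hypothetical penalty size in centipawns.
    """
    if white_pieces.get("d1") == "Q":
        return 0  # queen still at home, nothing to penalise
    if "Q" not in white_pieces.values():
        return 0  # queen already traded off
    undeveloped = sum(
        1 for square, piece in WHITE_MINOR_HOME.items()
        if white_pieces.get(square) == piece
    )
    return per_minor * undeveloped


# Hypothetical position: queen on h5, all four minor pieces still at home.
print(early_queen_penalty({"h5": "Q", "b1": "N", "g1": "N", "c1": "B", "f1": "B"}))
```

With a small `per_minor` value such a term only nudges move ordering among otherwise equal choices, which fits the remark that it is unlikely to influence the result.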
Pawel Koziol
http://www.pkoziol.cal24.pl/rodent/rodent.htm
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: OKE - Opening Knowledge Engines
Rebel wrote: ↑Wed Jun 12, 2019 9:16 am
Thanks for sharing. We are definitely pioneering in different areas. My focus is strictly on the first 6 moves. In your 200, many positions have already finished the development phase or are close to that. Nevertheless the results are very interesting. With ChessPartner -> Analyze EPD I ran your 200 positions with some engines.
I7 cores at 2.80GHz
one second per move.
Code: Select all
Engine: Lc0 v0.21.2-rc1 125/200
Engine: Stockfish 10 93/200
Engine: Ethereal 11.25 86/200
Engine: Senpai 1.0 78/200
Engine: Mephisto Gideon 76/200
Engine: Xiphos 0.5 72/200
Engine: Laser 1.6 71/200
Engine: Rebel Century 66/200
Engine: Sting SF 9.6 66/200
Engine: Texel 1.06a45 65/200
Engine: Rodent III 64/200
Engine: Rybka 4.1 61/200
I would say it's hard to believe a 2400-2500 rated engine (Gideon) can outperform many 3000+ Elo rated engines without admitting something is missing in modern engines. The pattern with OKE remains except for Rodent III.

Yes, Mephisto Gideon seems to perform well here too, but is not ranked as high as in your test:
Code: Select all
4 i7 cores at 3.80GHz
RTX 2070 GPU
Openings1000.epd test-suite (stuck to the solution from 1s to 2s of thinking):
Lc0 v21.2 ID42524 757/1000
Lc0 v21.2 ID32930 727/1000
Lc0 v21.2 ID11261 723/1000
Stockfish_dev 574/1000
Houdini 6.03 558/1000
Komodo 13.02 556/1000
Xiphos 0.5 513/1000
Booot 6.3.1 494/1000
Andscacs 0.95 484/1000
Laser 1.7 480/1000
Ethereal 11.25 467/1000
Texel 1.07 419/1000
Mephisto Gideon 387/1000
Rodent III 376/1000
Fruit 2.1 348/1000
BikJump 2.01 276/1000
Predateur 2.2.1 265/1000
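Because the two runs use different totals (1000 vs 200), the scores are easier to compare as percentages. The sketch below does that for the four engines that appear with the same version in both lists. The numbers are copied from the two tables in this thread; the variable and function names are illustrative only, and the two runs used different hardware and time settings, so the comparison is rough.

```python
# Hits out of 1000 (Openings1000.epd run, stuck to the solution from 1s to 2s).
laskos_1000 = {"Xiphos 0.5": 513, "Ethereal 11.25": 467,
               "Mephisto Gideon": 387, "Rodent III": 376}
# Hits out of 200 (ChessPartner run, one second per move).
rebel_200 = {"Xiphos 0.5": 72, "Ethereal 11.25": 86,
             "Mephisto Gideon": 76, "Rodent III": 64}


def percent(hits, total):
    """Score as a percentage, rounded to one decimal."""
    return round(100.0 * hits / total, 1)


def compare(a, a_total, b, b_total):
    """Return {engine: (percent in a, percent in b)} for engines in both runs."""
    return {name: (percent(a[name], a_total), percent(b[name], b_total))
            for name in a if name in b}


for name, (p1000, p200) in compare(laskos_1000, 1000, rebel_200, 200).items():
    print(f"{name:16s} {p1000:5.1f}%  {p200:5.1f}%")
```

Seen this way, Gideon's hit rate is almost the same in both runs; what changes is how the other engines score around it.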
Yes, my 200 positions are well into the openings, but only there did I find in databases some clear-cut positional "shots" or "decisions" (I have a single "bm" entry in each EPD line; it's either a hit or a miss). In your case, focusing on at most 6 initial moves and giving these moves "grades" of positional play is beyond my competency. Anyway, databases of strong human games are important in your case too, and engine analysis is useful only to trim out unwanted tactics. Positionally, aside from maybe Lc0 case by case, no engine can help in these early opening moves (no matter how long the engine "thinks").
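The hit-or-miss scoring with a single "bm" entry per EPD line can be sketched as below. This is a minimal illustration, not the tool used for the tables above: the function names are made up, the sample EPD line is hypothetical (it is not taken from Openings1000.epd), and the engine's move is assumed to be already available in the same notation as the "bm" operand.

```python
def parse_bm(epd_line):
    """Return the single 'bm' (best move) operand of an EPD line, or None."""
    # An EPD record is four position fields followed by ';'-terminated operations.
    fields = epd_line.strip().split(None, 4)
    if len(fields) < 5:
        return None  # no operations present
    for operation in fields[4].split(";"):
        parts = operation.split()
        if len(parts) == 2 and parts[0] == "bm":
            return parts[1]
    return None


def score_suite(epd_lines, engine_moves):
    """Count positions where the engine's move equals the 'bm' entry."""
    return sum(1 for line, move in zip(epd_lines, engine_moves)
               if parse_bm(line) == move)


# Hypothetical one-position suite.
suite = ['rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4; id "demo.001";']
print(score_suite(suite, ["e4"]), "/", len(suite))
```

Grading several candidate moves per position, as in the 6-move lists, would need a different record format than one "bm" operand, which is the difference between the two suites discussed here.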