Stockfish 10 x64 Level ELO's (WIP)

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Stockfish 10 x64 Level ELO's (WIP)

Post by mvanthoor »

Hi :) Thanks for having me in this forum.

After being away from chess for almost 20 years (with a short return between 2007 and 2009), I've now decided to drop some of my many hobbies permanently and start studying chess again. (Go being one of them... There's nothing like a wooden DGT board for Go.) 20 years ago, my over-the-board rating fluctuated between 1800 and 1835, with an all-time high of 1865, all without ever opening a book or reviewing a single game. Nowadays, I'll be surprised if my rating even reaches 1600 :p

Being a software engineer/programmer, and having written an (extremely basic) chess program in Borland Pascal in my teens, I've also been an avid computer chess fan. I was curious how strong Stockfish would be on levels other than level 20. A test has been run by Pohl4711, on talkchess:

viewtopic.php?t=69731

While it is a good test to make an educated guess as to the place of the SF levels in the CCRL rating list, that user didn't use the CCRL timings, used ORDO for ratings instead of BayesELO, and calibrated to a different list.

I am running my own test, using the CCRL 40/4 testing method, calculating ratings with BayesElo, and calibrating against CCRL:

Intel i7-6700K (crafty 19.17 bh: 17 seconds, or 40 moves in 85 seconds)
256 MB Hash per engine
Ponder off
Endgame Tablebase: 5 piece Syzygy
Book: Varied.bin, set to 8 moves deep
GUI: CuteChess, running 4 games at once (one per core, which is possible due to ponder being off.)
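The "40 moves in 85 seconds" figure above follows from scaling the 40/4 time control by the Crafty benchmark time. A minimal sketch of that arithmetic is below. The reference benchmark of 48 seconds is an assumption: it is not stated in this post, but is simply the value implied by the post's own numbers (240 × 17 / 85).

```python
def scaled_time_control(base_seconds, reference_bench, local_bench):
    """Scale a time budget by the ratio of local to reference benchmark time."""
    return base_seconds * local_bench / reference_bench


CCRL_BASE_SECONDS = 4 * 60  # 40 moves in 4 minutes
LOCAL_BENCH = 17            # crafty 19.17 bh on the i7-6700K, from the post
REFERENCE_BENCH = 48        # ASSUMPTION: inferred from 240 * 17 / 85, not stated in the post

seconds = scaled_time_control(CCRL_BASE_SECONDS, REFERENCE_BENCH, LOCAL_BENCH)
print(f"40 moves in {seconds:.0f} seconds")  # prints "40 moves in 85 seconds"
```

A faster machine (lower benchmark time) gets proportionally less clock time, so the engines should search roughly as much per move as they would on the reference hardware.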

Having all levels play one another is pointless: level 20 will defeat even level 19 about 99% of the time. Therefore I've decided to do it like this:

- Run a tournament among levels 20 to 16
- Run a tournament among levels 16 to 12

and so on.
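The grouping scheme can be sketched as follows. This is only an illustration: the function name is made up, and the figure of 400 games per level per tournament is read from the results table rather than stated as a rule. Each group shares its lowest level with the next group, which is what links the separate tournaments into one rating list.

```python
from collections import Counter


def overlapping_groups(top, bottom, size):
    """Split levels top..bottom into groups that share one boundary level."""
    groups = []
    start = top
    while start > bottom:
        end = max(start - (size - 1), bottom)
        groups.append(list(range(start, end - 1, -1)))
        start = end  # the last level of this group opens the next one
    return groups


groups = overlapping_groups(top=20, bottom=0, size=5)
for group in groups:
    print(group)

# ASSUMPTION: 400 games per level per tournament, as read from the table.
appearances = Counter(level for group in groups for level in group)
games = {level: 400 * count for level, count in appearances.items()}
```

With these settings the boundary levels 16, 12, 8 and 4 each appear in two groups and therefore end up with 800 games, while every other level has 400.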

Later, I'll sprinkle some other engines in between the larger ELO gaps and have them run a tournament against the two Stockfish levels above and below.
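To get a feel for why large rating gaps are a problem, here is a small sketch using the standard logistic Elo formula for the expected score. Note that this is the plain textbook formula, not necessarily the exact model BayesElo fits, so the numbers are only indicative. The example ratings are the ones reported for levels 20 and 19 in this post.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


level_20 = 3495  # from the rating list in this post
level_19 = 2855  # from the rating list in this post

score = expected_score(level_20, level_19)
print(f"Expected score of level 20 vs level 19: {score:.3f}")
```

When the expected score gets this close to 1, almost every game has the same result and carries very little information, which is why an engine rated in between helps to pin down both sides of the gap.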

Below are partial results. The Elo ratings have been calculated from the referenced PGN (posted on the CCRL forum, because at 720 kB it is too large to attach here), and the list has been calibrated against Stockfish 10 x64, single-threaded, which is rated 3495 in CCRL 40/4. I will add the last few levels in the coming days, and then start to use some other engines to fill the gaps. I'll then recalibrate the list against that "basket of engines" using the average ratings of those engines.
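Calibration here amounts to shifting the whole list by a constant so that the anchor engines land on their known ratings. The sketch below shows one way to do that for a single anchor or for a basket, assuming the offset is the average difference over all anchors. The function name and the raw input values are made up for illustration (the raw numbers are chosen so that the shifted values match the list), and this is not BayesElo's own code.

```python
def calibrate(raw, reference):
    """Shift raw ratings so the anchors match their reference ratings on average."""
    anchors = [name for name in reference if name in raw]
    if not anchors:
        raise ValueError("no anchor engine found in the raw rating list")
    offset = sum(reference[name] - raw[name] for name in anchors) / len(anchors)
    return {name: rating + offset for name, rating in raw.items()}


# HYPOTHETICAL raw ratings on an arbitrary scale.
raw = {"SF10 L20": 690, "SF10 L19": 50, "SF10 L18": 20}

# Single anchor: Stockfish 10 x64 1T at 3495 (CCRL 40/4), as in the post.
calibrated = calibrate(raw, {"SF10 L20": 3495})
for name, rating in calibrated.items():
    print(f"{name}: {rating:.0f}")
```

Passing several engines in `reference` gives the basket variant: the offset becomes the mean of the individual differences, so no single anchor has to match exactly.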

Rank Name                    Elo    +    - games score oppo. draws 
   1 Stockfish 10 x64 1T20  3495  102  102   400   99%  2805    2% 
   2 Stockfish 10 x64 1T19  2855   32   32   400   45%  2965   26% 
   3 Stockfish 10 x64 1T18  2825   32   32   400   41%  2972   27% 
   4 Stockfish 10 x64 1T17  2793   32   32   400   36%  2980   27% 
   5 Stockfish 10 x64 1T16  2745   23   23   800   50%  2788   24% 
   6 Stockfish 10 x64 1T15  2677   30   30   400   60%  2602   26% 
   7 Stockfish 10 x64 1T14  2652   30   30   400   56%  2608   26% 
   8 Stockfish 10 x64 1T13  2552   31   31   400   39%  2633   21% 
   9 Stockfish 10 x64 1T12  2458   24   24   800   50%  2444   14% 
  10 Stockfish 10 x64 1T11  2371   33   33   400   65%  2253   14% 
  11 Stockfish 10 x64 1T10  2269   32   32   400   49%  2279   12% 
  12 Stockfish 10 x64 1T09  2181   33   33   400   36%  2301   13% 
  13 Stockfish 10 x64 1T08  2105   25   25   800   49%  2108    8% 
  14 Stockfish 10 x64 1T07  2050   34   34   400   66%  1910    7% 
  15 Stockfish 10 x64 1T06  1951   34   34   400   51%  1935    5% 
  16 Stockfish 10 x64 1T05  1863   34   34   400   39%  1957    4% 
  17 Stockfish 10 x64 1T04  1723   38   38   400   21%  1992    4% 

The ratings feel believable. I can consistently beat levels 0-3, and I'm able to beat level 4 regularly at longer time controls if I put some effort in. Against level 5, I only manage an occasional draw at this point.

If you wish, you could include "Stockfish 10 x64 Level 19" and lower in your list as "derivatives" of Stockfish 10 x64 in the single-core CPU list.

I hope this is of some use to someone. (edit: added the games up until now, levels 20 up to and including 08.)
Author of Rustic, an engine written in Rust.
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Stockfish 10 x64 Level ELO's (WIP)

Post by Guenther »

mvanthoor wrote: Mon Jul 15, 2019 6:28 pm Hi :) Thanks for having me in this forum.

...
Marcel, welcome to the CCC forum. While I am not into SF level play and have always preferred to play one of the hundreds of available chess programs at full strength (or plain humans OTB and on chess servers), I guess some folks here will find your detailed post quite useful.

BTW, I guess that if you zip or 7z your PGN file, it might be small enough to be attached to a forum post (all kinds of archives are supported). (720 KB might sound like nothing today, but imagine dozens of people adding such files every day: it would add up like the famous rice grains on that ancient chess board.)

Guenther
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy

Re: Stockfish 10 x64 Level ELO's (WIP)

Post by mvanthoor »

Guenther wrote: Mon Jul 15, 2019 8:19 pm
mvanthoor wrote: Mon Jul 15, 2019 6:28 pm Hi :) Thanks for having me in this forum.
Marcel, welcome to the CCC forum. While I am not into SF level play and have always preferred to play one of the hundreds of available chess programs at full strength (or plain humans OTB and on chess servers), I guess some folks here will find your detailed post quite useful.
Thanks for the welcome, Guenther.

I've completed the Stockfish-only part of the run. The rating list is below, for all levels starting from 20 down to 00, 1 thread. The list has been calculated with BayesElo and calibrated against Stockfish level 20, 1 thread, which is 3495 in CCRL 40/4.

PS: the opening book is not Varied.bin, but Performance.bin, found here:
http://rebel13.nl/download/books.html

Full games archive at the CCRL forum: http://kirill-kryukov.com/chess/discuss ... p?id=44807

Rank Name                    Elo    +    - games score oppo. draws 
   1 Stockfish 10 x64 1T20  3495  102  102   400   99%  2804    2% 
   2 Stockfish 10 x64 1T19  2855   32   32   400   45%  2965   26% 
   3 Stockfish 10 x64 1T18  2825   32   32   400   41%  2972   27% 
   4 Stockfish 10 x64 1T17  2793   32   32   400   36%  2980   27% 
   5 Stockfish 10 x64 1T16  2745   23   23   800   50%  2788   24% 
   6 Stockfish 10 x64 1T15  2677   30   30   400   60%  2602   26% 
   7 Stockfish 10 x64 1T14  2652   30   30   400   56%  2608   26% 
   8 Stockfish 10 x64 1T13  2552   31   31   400   39%  2633   21% 
   9 Stockfish 10 x64 1T12  2458   24   24   800   50%  2444   14% 
  10 Stockfish 10 x64 1T11  2370   33   33   400   65%  2253   14% 
  11 Stockfish 10 x64 1T10  2269   32   32   400   49%  2279   12% 
  12 Stockfish 10 x64 1T09  2181   33   33   400   36%  2301   13% 
  13 Stockfish 10 x64 1T08  2105   25   25   800   49%  2108    8% 
  14 Stockfish 10 x64 1T07  2050   35   35   400   66%  1910    7% 
  15 Stockfish 10 x64 1T06  1950   34   34   400   51%  1935    5% 
  16 Stockfish 10 x64 1T05  1862   34   34   400   39%  1957    4% 
  17 Stockfish 10 x64 1T04  1722   29   29   800   52%  1683    3% 
  18 Stockfish 10 x64 1T03  1589   37   37   400   68%  1408    2% 
  19 Stockfish 10 x64 1T02  1427   36   36   400   48%  1449    2% 
  20 Stockfish 10 x64 1T01  1302   38   38   400   32%  1480    1% 
  21 Stockfish 10 x64 1T00  1181   42   42   400   19%  1510    0% 

Next, I'll add some other engines to the list, both for variety and to get a basket of engines to calibrate against.