A test idea without Elo, I think I start middle of Jan.26!

Frank Quisinsky
Posts: 7185
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

Each engine under test must play 50 games against every engine in the same group of “Standard” engines …
26 Standard Engines x 50 games = 1300 games per test-run, on a 16-core AMD Ryzen 5950X system.
Time control will be 7 + 2, 256 MB hash, 5-piece tablebases if possible!

The group of Standard engines covers different playing levels,
from 2775 Elo up to 3550 Elo.

The group includes tactically strong engines and engines that do not produce a high move average without resign mode.
Newer engines, older engines ... a good mix.

Here the group of standard engines:

01. Uralochka 3.42a JA             ~3550
02. Revenge 4.0                    ~3500
03. CSTal 2.0                      ~3475
04. Velvet 8.1.1 JA                ~3475
05. Igel 3.6.0 JA                  ~3475
06. Stockfish 200731 HCE           ~3425
07. SlowChess Blitz 2.9            ~3400
08. Texel 1.12                     ~3400
09. Wasp 7.00                      ~3350
10. Patricia 5.0 JA                ~3350
11. Leorik 3.1.3 JA                ~3275
12. Tcheran 9.0                    ~3275
13. Nemorino 6.11 JA               ~3250
14. Monty 251209 MCTS dev          ~3225
15. Booot 6.50 HCE                 ~3175
16. Xiphos 0.6.1 HCE JA            ~3175
17. Hiarcs 15.4 HCE                ~3150
18. Engine x                       ~3150
19. DanaSah 9.1 JA                 ~3075
20. Fizbo 2.0 JA                   ~3075
21. Petrel 3.1 JA                  ~3050
22. Vajolet2 2.8.0 HCE             ~2975
23. Critter 1.6a HCE               ~2975
24. Deep iCE 4.0.853 HCE           ~2950
25. Hakkapelitta TCEC v2 HCE       ~2875
26. Spark 1.0 HCE                  ~2775
Now I sort according to the result ...

1300 - 1000 points = 01. ***** General of the Engines
 999 -  950 points = 02. ****  General
 949 -  900 points = 03. ***   Lieutenant General
 899 -  850 points = 04. **    Major General
 849 -  800 points = 05. *     Brigadier General
--
 799 -  750 points = 06. Colonel
 749 -  700 points = 07. Lieutenant Colonel
 699 -  650 points = 08. Major
 649 -  600 points = 09. Captain
 599 -  550 points = 10. First Lieutenant
 549 -  500 points = 11. Second Lieutenant
--
 499 -  450 points = 12. Sergeant Major
 449 -  400 points = 13. Master Sergeant
 399 -  350 points = 14. Sergeant
 349 -  300 points = 15. Corporal
 299 -  250 points = 16. Specialist
 249 -  200 points = 17. Private First Class
 199 -    0 points = 18. Private
After a test-run …
Example: Wasp 8.0 is out …
Wasp 8.0 has to play 50 games vs. each of the 26 Standard Engines = 1300 games

1300 points = max. possible
Result for Wasp 8.0 = 772,5 points = Colonel

And in my engine-overview you can see an entry for Wasp 8.0 = 06. Colonel
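
The lookup from points to rank can be sketched in a few lines of Python. The lower bounds are copied from the table above; the function name `rank_for_points` is only illustrative, and putting a half-point result such as 999,5 into the lower band is an assumption, since the table lists whole points only.

```python
# Lower bound in points and rank name, copied from the rank table above.
RANKS = [
    (1000, "01. General of the Engines"),
    (950, "02. General"),
    (900, "03. Lieutenant General"),
    (850, "04. Major General"),
    (800, "05. Brigadier General"),
    (750, "06. Colonel"),
    (700, "07. Lieutenant Colonel"),
    (650, "08. Major"),
    (600, "09. Captain"),
    (550, "10. First Lieutenant"),
    (500, "11. Second Lieutenant"),
    (450, "12. Sergeant Major"),
    (400, "13. Master Sergeant"),
    (350, "14. Sergeant"),
    (300, "15. Corporal"),
    (250, "16. Specialist"),
    (200, "17. Private First Class"),
    (0, "18. Private"),
]


def rank_for_points(points):
    """Return the first rank whose lower bound the score reaches."""
    for lower, name in RANKS:
        if points >= lower:
            return name
    raise ValueError("points must not be negative")


print(rank_for_points(772.5))  # 06. Colonel
```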

So, I can give each engine a rank (without working with Elo).
Furthermore, you can see the move average, the number of games lost and won in under 50 moves, and the draw rate (no more stats than that).

Example:
Wasp 8.0 … Date of Test: 260215, 772,5 points, Rank: 06. Colonel, 83,4 move-average, 36 won, 2 lost games below 50 moves, 47,5% draws
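
A rough sketch of how such a summary line could be computed from the finished games. The input format (one `(result, moves)` pair per game, seen from the tested engine's side) and the function name `summarize` are assumptions made for the example; the four sample games are made up and are not test results.

```python
def summarize(games):
    """games: list of (result, moves); result is 1.0, 0.5 or 0.0 for the tested engine."""
    n = len(games)
    return {
        "points": sum(result for result, _ in games),
        "move_average": sum(moves for _, moves in games) / n,
        # Decisive games that ended in under 50 moves.
        "short_wins": sum(1 for result, moves in games if result == 1.0 and moves < 50),
        "short_losses": sum(1 for result, moves in games if result == 0.0 and moves < 50),
        "draw_rate": 100.0 * sum(1 for result, _ in games if result == 0.5) / n,
    }


# Made-up sample: a short win, a long draw, a short loss, a long win.
sample = [(1.0, 42), (0.5, 90), (0.0, 48), (1.0, 120)]
print(summarize(sample))
```

In a real run the pairs would come from the *.pgn file of the test-run, read with whatever PGN tool is at hand.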

This can be interesting and fun!
And my engine-overview will get some information about the playing level of the engines.
https://www.amateurschach.de/main/_engines.htm

😊

Test-runs will be available in live mode on a new test site.
I am also interested in testing the older engines, around the group of the TOP-250 available engines.
Games will be available as *.pgn after each test-run.
I am using my own modified FEOBOS book and I am sure the draw rate will be relatively low.

Best
Frank

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

Hm, perhaps I will need more stars later?

8 stars ******** First General Field Marshal ... strongest result
7 stars ******* Second General Field Marshal ... second strongest result
6 stars ****** Third General Field Marshal ... third strongest result

Best
Frank

At around 3000 Elo (strongest human Elo = 2875 Elo), a computer chess engine should be "Private First Class".
3000 Elo will be around 3125-3150 CCRL Elo.

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

Tomorrow evening I will start a test-run in LIVE mode with Stockfish 17.1.
Stockfish 17.1 should score no more than 88%, or I must make it harder again!

I made a correction:
Becoming a 4-star General must be much harder, I think!!

26 Standard Engines:

01. Uralochka 3.42a JA             3550
02. Revenge 4.0                    3500
03. CSTal 2.0                      3475
04. Velvet 8.1.1                   3475
05. Igel 3.6.0 JA                  3475
06. Stockfish 200731 HCE           3400
07. SlowChess Blitz 2.9            3400
08. Texel 1.12                     3400
09. Wasp 7.00                      3350
10. Patricia 5.0 JA                3350
11. Leorik 3.1.3 JA                3275
12. Tcheran 9.0                    3275
13. Nemorino 6.11 JA               3250
14. Monty 251209 MCTS dev          3225
15. Booot 6.50 HCE                 3200
16. Xiphos 0.6.1 HCE JA            3175
17. Engine X                       3175
18. Hiarcs 15.4 HCE                3150
19. DanaSah 9.1 JA                 3075
20. Fizbo 2.0 JA                   3075
21. Petrel 3.1 JA                  3050
22. Vajolet2 2.8.0 HCE             2975
23. Critter 1.6a HCE               2975
24. Deep iCE 4.0.853 HCE           2950
25. Hakkapelitta TCEC v2 HCE       2875
26. Spark 1.0 HCE                  2775
                                  -----
                                  83850 : 26 = 3225 Elo

1.300,0 (100,00%) - 1.144,0 ( 88,00%) points = 01. ***** General Field Marshal
1.143,5 ( 87,97%) - 1.105,0 ( 85,00%) points = 02. ****  General
1.104,5 ( 84,97%) - 1.066,0 ( 82,00%) points = 03. ***   Lieutenant General
1.065,5 ( 81,97%) - 1.027,0 ( 79,00%) points = 04. **    Major General
1.026,5 ( 78,97%) -   988,0 ( 76,00%) points = 05. *     Brigadier General
--
  987,5 ( 75,97%) -   910,0 ( 70,00%) points = 06. Colonel
  909,5 ( 69,97%) -   832,0 ( 64,00%) points = 07. Lieutenant Colonel
  831,5 ( 63,97%) -   754,0 ( 58,00%) points = 08. Major
  753,5 ( 57,97%) -   676,0 ( 52,00%) points = 09. Captain
  675,5 ( 51,97%) -   598,0 ( 46,00%) points = 10. First Lieutenant
  597,5 ( 45,97%) -   520,0 ( 40,00%) points = 11. Second Lieutenant
--
  519,5 ( 39,97%) -   481,0 ( 37,00%) points = 12. Sergeant Major
  480,5 ( 36,97%) -   442,0 ( 34,00%) points = 13. Master Sergeant
  441,5 ( 33,97%) -   403,0 ( 31,00%) points = 14. Sergeant
  402,5 ( 30,97%) -   364,0 ( 28,00%) points = 15. Corporal
  363,5 ( 27,97%) -   325,0 ( 25,00%) points = 16. Specialist
  324,5 ( 24,97%) -   273,0 ( 21,00%) points = 17. Private First Class
  272,5 ( 20,97%) -     0,0 (  0,00%) points = 18. Private
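
The point values in this table follow from the percentage cutoffs and the 1300 games. A small sketch, assuming each band starts at cutoff % of 1300 and ends half a point below the band above it (the names `CUTOFFS` and `bands` are only illustrative):

```python
TOTAL_GAMES = 1300

# Lower cutoff of each rank in percent, top rank first, as in the table above.
CUTOFFS = [88, 85, 82, 79, 76, 70, 64, 58, 52, 46, 40, 37, 34, 31, 28, 25, 21, 0]


def bands(cutoffs, total=TOTAL_GAMES):
    """Return (upper, lower) points for each rank, top rank first."""
    rows = []
    upper = float(total)
    for pct in cutoffs:
        lower = total * pct / 100
        rows.append((upper, lower))
        # The next band ends half a point below this band's lower bound.
        upper = lower - 0.5
    return rows


for rank, (upper, lower) in enumerate(bands(CUTOFFS), start=1):
    print(f"{rank:02d}. {upper:7.1f} - {lower:7.1f} points")
```

Half a point is about 0,04% of 1300 games, so the upper bounds work out to xx,96% when rounded to two decimals, marginally below the xx,97% shown; the point values are what decide the rank.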
Best
Frank

3225 Elo corresponds to a score of around 50%.
3225 Elo = ~3350 CCRL Elo.
The Elo values for the 26 Standard Engines will be more exact once 20 test-runs / round-robins are over.
The current values are averages from my own rating lists.
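
Both numbers can be checked with a short sketch. The ratings are the 26 values listed above. The expected-score part uses the standard logistic Elo formula, which is an assumption here and not necessarily how the 50% was estimated; the function names are illustrative.

```python
# Ratings of the 26 Standard Engines, in the order of the list above.
POOL = [
    3550, 3500, 3475, 3475, 3475, 3400, 3400, 3400, 3350, 3350,
    3275, 3275, 3250, 3225, 3200, 3175, 3175, 3150, 3075, 3075,
    3050, 2975, 2975, 2950, 2875, 2775,
]


def expected_score(own, opponent):
    """Standard logistic Elo expectation for one game (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def expected_pool_percent(own, pool):
    """Average expected score in percent against every engine in the pool."""
    return 100.0 * sum(expected_score(own, opp) for opp in pool) / len(pool)


average = sum(POOL) / len(POOL)
print(sum(POOL), average)  # 83850 3225.0
print(round(expected_pool_percent(average, POOL), 1))
```

Because the pool is not symmetric around its average, the expected score of a 3225 engine under this formula comes out close to 50% rather than exactly 50%.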

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

My new tourney idea has been given the name "ETOC-G"
= Engine Test Operating Center-Gutweiler

Test-01 is running now!
Stockfish 17.1 JA vs. 26 Standard Engines.

Some differences from the system I will use later:

- Laser 1.7 is playing, because the program I will use later is not available at the moment.
For this test only, I am using Laser 1.7 HCE JA dev.

- The test-run is on an Intel i9-10900K with 4.3 GHz and 10 cores ...
later I will use an AMD Ryzen 5950X with 4.2 GHz and 16 cores!

- I am using 6+2 for this test.
Later I will use the time control 8+2 or 7+2.

For the moment, the only interesting thing is adjusting the rank system.
I hope that all 26 standard engines work fine; I will check that during the test.
Stockfish 17.1 should score no more than 88%, or I must adjust the rank system.

Live mode:
Shredder *.sto file (tournament configuration, game plan)
https://www.amateurschach.de/fling/etoc-g_test-01.sto

Shredder *.html file (current results)
https://www.amateurschach.de/fling/etoc-g_test-01.html

Shredder *.pgn file (the games)
https://www.amateurschach.de/fling/etoc-g_test-01.pgn

Updates every 2 minutes with the FTP software Fling Plus 5.04.

Best
Frank

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

the rank system is working perfectly!
Before I started the test I was hoping for 83,5% (by my calculation).

In other words ...
After 259 games = 82,82% =
1.104,5 ( 84,97%) - 1.066,0 ( 82,00%) points = 03. *** Lieutenant General
Room for improvement; maybe the next official Stockfish will be higher in the rank system = 02. **** General
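
During a running test the rank can be read off the percentage instead of the points. A minimal sketch, assuming the cutoffs of the revised 18-rank table; 214,5 points is simply what 82,82% of 259 games works out to, and the function names are illustrative.

```python
# Lower cutoff in percent and rank name, from the revised 18-rank table.
BANDS = [
    (88, "01. General Field Marshal"),
    (85, "02. General"),
    (82, "03. Lieutenant General"),
    (79, "04. Major General"),
    (76, "05. Brigadier General"),
    (70, "06. Colonel"),
    (64, "07. Lieutenant Colonel"),
    (58, "08. Major"),
    (52, "09. Captain"),
    (46, "10. First Lieutenant"),
    (40, "11. Second Lieutenant"),
    (37, "12. Sergeant Major"),
    (34, "13. Master Sergeant"),
    (31, "14. Sergeant"),
    (28, "15. Corporal"),
    (25, "16. Specialist"),
    (21, "17. Private First Class"),
    (0, "18. Private"),
]


def percent(points, games):
    """Score in percent, rounded to two decimals like the results page."""
    return round(100.0 * points / games, 2)


def rank_for_percent(pct):
    """Return the first rank whose lower cutoff the percentage reaches."""
    for lower, name in BANDS:
        if pct >= lower:
            return name
    raise ValueError("percentage must not be negative")


running = percent(214.5, 259)
print(running, rank_for_percent(running))  # 82.82 03. Lieutenant General
```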

Next I will start Rybka (Fritz 16) in the rank-system test phase.
A former number 1 engine with very bad king safety when many pieces are on the board.
It has to play vs. a lot of tactically strong engines inside the group of "Standard Engines" and will lose many games very fast.

I am hoping for rank 17 of 18
324,5 ( 24,97%) - 273,0 ( 21,00%) points = 17. Private First Class

Today a "Private First Class"
If so, I need to make no changes to the rank system, and the 26 "Standard Engines" are perfect for a test vs. the TOP-250 available engines.

Best
Frank

At the moment everything runs perfectly, no crashes or other problems in the group of 26 Standard Engines.
That's good ... the move average also seems to be good (without resign mode).

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

I got some mails asking:
Why am I using the U.S. Army rank system and not the German Army rank system?

Easy ...
057 different flags in the group of TOP-250 engines!
051 out of 268 from 1. USA = 19,03%
032 out of 268 from 2. Germany = 11,94%

Have a look at my TOP-250 overview ...
https://www.amateurschach.de/main/_engines.htm

In this case ...

America first

:-)

Best
Frank

OK, the Field Marshal is a bit German I believe, the same as General of the Army in the U.S.A. ... a 5-star General.
It is a long way for Stockfish, Reckless, PlentyChess to become a 5-star General of the Army ...
but perhaps, sooner or later, Reckless will be able to avoid playing so many draws vs. clearly weaker engines.

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

82,66% after 972 / 1300 games.
Very stable SF results during the complete test-run.
I am sure that SF 18 will later get the rank: 4-star General.

I found a mistake in the test-run.
Igel dev is playing. I added the wrong Igel to the group of "Standard" Engines.
Perhaps that is the reason my calculation was wrong; normally SF 17.1 should score 83,5%.
Furthermore, Monty is stronger than I expected.
Not important for the moment.

Igel 3.6.0 is the right one.
Later Igel 3.6.0 will be one of the "Standard Engines".

Senpai 3.0 will replace Laser in the group of "Standard Engines".
I can't say anything about the engine before Senpai 3.0 is released.

OK, tomorrow I will start the test with Fritz 16 (Rybka).
Now with the correct group of 26 Standard Engines.

Again, I will start my new idea around January 15th, 2026.
I don't have the time for it before then.

Later ...
Tomorrow ...

Best
Frank

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Shit, Laser is really doing a great job vs. Stockfish.
The Laser compiles by Jim are fantastic!!

Furthermore, the engine is absolutely stable, with good king safety for its playing strength.
I should not delete Laser from the group of "Standard" Engines.

It is better to replace Hiarcs with Senpai 3.0.
It is better not to have a commercial engine in the group of "Standard" Engines.

It is possible that the test idea is good and that others would like to help but don't have commercial engines ...

That is the reason I put the name Gutweiler at the end of the test system's name.
If a user from Tokyo :-) would like to help ...
ETOC-T

Best
Frank

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

The Stockfish 17.1 test is over.

Now I have started the second test:
Fritz 16 HCE (Rybka) vs. the group of 26 "Standard Engines"

Same conditions as before!
But Senpai 3.0.1 HCE has replaced Hiarcs 15.4 HCE in the group of "Standard Engines".

Live mode:
Shredder *.sto file (tournament configuration, game plan)
https://www.amateurschach.de/fling/etoc-g_test-02.sto

Shredder *.html file (current results)
https://www.amateurschach.de/fling/etoc-g_test-02.html

Shredder *.pgn file (the games)
https://www.amateurschach.de/fling/etoc-g_test-02.pgn

Updates every 2 minutes with the FTP software Fling Plus 5.04.

The goal is to determine whether Rybka will achieve the rank of “Private First Class” (= 21,00% - 24,97%). If not, I will have to change the rank system.

  324,5 ( 24,97%) -   273,0 ( 21,00%) points = 17. Private First Class
Best
Frank

Re: A test idea without Elo, I think I start middle of Jan.26!

Post by Frank Quisinsky »

Hi there,

my math is bad ... Bob Hyatt has been telling me that on TalkChess for around 30 years.
Maybe he was right!

3000 Elo should be ... Private First Class
3000 Elo is the playing level of Fritz 16 (Rybka)
The first promotion at 3000 Elo ... that should be the goal.
That can't be very complicated.

I am working with 3%-6%-3% steps and 18 ranks.
I must think ...
I need 3%-4%-3%, a 3-4-3 system with 20 ranks, and have to calculate everything again.

Shit ...
Later ... I have to do it!

All I need would be two more Sergeant ranks.
Yes, yes ... this will solve the problem for today.

01. ***** General Field Marshal
02. **** General
03. *** Lieutenant General
04. ** Major General
05. * Brigadier General
--
06. Colonel
07. Lieutenant Colonel
08. Major
09. Captain
10. First Lieutenant
11. Second Lieutenant
--
12. Sergeant Major
13. Master Sergeant
14. Sergeant First Class
15. Staff Sergeant
16. Sergeant
17. Corporal
18. Specialist
19. Private First Class
20. Private

Later, I have to calculate everything again.
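
The recalculation could be sketched like this. The first call reproduces the cutoffs of the current 18-rank table from its step sizes. The second call is only one possible reading of a 3-4-3 system with 20 ranks: it assumes the top cutoff stays at 88% and that the lowest rank takes everything below the last cutoff, neither of which is decided yet.

```python
def lower_bounds(top, steps):
    """Lower cutoff in percent of every rank except the last, top rank first."""
    bounds = [top]
    for step in steps:
        bounds.append(bounds[-1] - step)
    return bounds


# Current table: 3% steps (generals), 6% steps (officers), 3% steps (sergeants
# and below), plus the single 4% step down to "Private First Class".
current = lower_bounds(88, [3] * 4 + [6] * 6 + [3] * 5 + [4])

# Hypothetical 20-rank variant: 3% steps, then 4% steps, then 3% steps.
proposed = lower_bounds(88, [3] * 4 + [4] * 6 + [3] * 8)

print(current)
print(proposed)
```

Whether a 3000 Elo engine then lands on "Private First Class" depends on where the top cutoff is placed, so the 88% would have to be tuned together with the steps.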

Best
Frank