Mayhem NNUE - New NN engine

JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Mayhem NNUE - New NN engine

Post by JohnWoe »

Yes, I copy-pasted the NNUE from here and there. And what a difference a good evaluation makes! It completely annihilates Sapeli 1.92 in Standard/Chess960 games!

Mayhem is Sapeli written in C++14 + SF NNUE evaluation. Thanks to Maksim for simplifying away the DirtyPiece/etc. crap!
Release: https://github.com/SamuraiDangyo/mayhem ... /tag/v0.42
Contains a fast Linux binary.

Roughly +250 to +350 Elo stronger than Sapeli 1.92 in the short matches below, with wide error bars. This proves that handcrafted evaluations are all crap.

Code:

5s
Score of Mayhem 0.42 NNUE vs Sapeli 1.92: 64 - 5 - 9  [0.878] 78
Elo difference: 343.19 +/- 109.68
Finished match

Score of Mayhem NNUE 0.42 vs Sapeli 1.92: 75 - 13 - 12  [0.810] 100
Elo difference: 251.89 +/- 80.80
Finished match

20s
Score of Mayhem NNUE 0.42 vs Sapeli 1.92: 7 - 1 - 2  [0.800] 10
Elo difference: 240.82 +/- nan
Finished match
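The "Elo difference" lines in the output above follow from the score fraction through the standard logistic Elo formula. A minimal sketch in Python, reading each result as wins - losses - draws; the function names are illustrative and not taken from the match tool:

```python
import math

def score_fraction(wins, losses, draws):
    """Points scored divided by games played (a draw counts as half a point)."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def elo_difference(wins, losses, draws):
    """Elo difference implied by a match score under the logistic Elo model."""
    s = score_fraction(wins, losses, draws)
    if s <= 0.0 or s >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / s - 1.0)

# The three results reported above, as (wins, losses, draws).
for w, l, d in [(64, 5, 9), (75, 13, 12), (7, 1, 2)]:
    print(f"{w} - {l} - {d}: score {score_fraction(w, l, d):.3f}, "
          f"Elo {elo_difference(w, l, d):+.2f}")
```

This reproduces the 343.19, 251.89 and 240.82 figures. The error margins are a separate calculation and are not covered here.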

I played a few test games on lichess. Mayhem totally annihilates sachy2, 7½-½, despite massive lag: https://lichess.org/k6wbI1sF

Mayhem NNUE gives nice analysis. Look how consistent 1. d4 is!

Code:

info depth 1 nodes 40 time 1 nps 20000 score cp 69 pv d2d4
info depth 2 nodes 269 time 4 nps 53800 score cp 24 pv d2d4
info depth 3 nodes 1552 time 22 nps 67478 score cp 45 pv d2d4
info depth 4 nodes 5933 time 46 nps 126234 score cp 34 pv d2d4
info depth 5 nodes 18196 time 78 nps 230329 score cp 60 pv d2d4
info depth 6 nodes 66327 time 157 nps 419791 score cp 39 pv d2d4
info depth 7 nodes 156246 time 288 nps 540643 score cp 46 pv d2d4
info depth 8 nodes 648186 time 836 nps 774415 score cp 32 pv d2d4
info depth 9 nodes 1803245 time 1892 nps 952585 score cp 51 pv d2d4
info depth 10 nodes 8442083 time 7493 nps 1126512 score cp 32 pv d2d4
info depth 11 nodes 27943335 time 26254 nps 1064305 score cp 46 pv d2d4
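For anyone who wants to post-process output like this, here is a rough sketch of a parser for these UCI info lines. It only handles the integer fields that appear in the log above (depth, nodes, time, nps, score cp/mate) plus pv, so treat it as a starting point rather than a full UCI parser. The nps check at the end is an observation from the numbers: the logged values match nodes * 1000 / (time + 1), rounded down. The guess that the +1 is there to avoid dividing by zero is an assumption, not something taken from the Mayhem source.

```python
def parse_info(line):
    """Parse one UCI 'info' line into a dict; the pv is kept as a list of moves."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        raise ValueError("not a UCI info line")
    info = {}
    i = 1
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            # Everything after "pv" is the principal variation.
            info["pv"] = tokens[i + 1:]
            break
        if key == "score":
            # "score cp <x>" or "score mate <y>"
            info["score_" + tokens[i + 1]] = int(tokens[i + 2])
            i += 3
        else:
            info[key] = int(tokens[i + 1])
            i += 2
    return info

LOG = """\
info depth 1 nodes 40 time 1 nps 20000 score cp 69 pv d2d4
info depth 11 nodes 27943335 time 26254 nps 1064305 score cp 46 pv d2d4
"""

for line in LOG.splitlines():
    info = parse_info(line)
    # Assumed formula, inferred from the logged numbers only.
    derived_nps = info["nodes"] * 1000 // (info["time"] + 1)
    print(info["depth"], info["score_cp"], info["pv"], derived_nps == info["nps"])
```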
BrendanJNorman
Posts: 2526
Joined: Mon Feb 08, 2016 12:43 am
Full name: Brendan J Norman

Re: Mayhem NNUE - New NN engine

Post by BrendanJNorman »

JohnWoe wrote: Thu Oct 22, 2020 11:40 pm This proves that handcrafted evaluations are all crap.
This will cause enough heads here to shake that it'll cause an earthquake. :lol:

Maybe a better way to express yourself is "this proves that copying and pasting Stockfish's evaluation into other engines is superior to using your own".

Something we already knew haha.

But even doing this is only sufficient if the goal is strength.

As an end-user, I'm getting more and more bored of the wave of Stockfish NNUEs being plugged into other engines.

And even Frank Q has switched to testing for style rather than strength.

I feel like if the trend for programmers is to copy/paste SF NNUE, most of the end-users will also move in another direction and test older or different engines.

Or perhaps I'm misunderstanding what exactly NNUE is...I dunno.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Mayhem NNUE - New NN engine

Post by carldaman »

Yeah, in many ways it's a sad state of affairs when you just plug in SF-NNUE as a shortcut. The worst part is that it's a black box that doesn't even let you set contempt, let alone other settings. It takes the fun out of it for guys like Brendan and me, and also for some conscientious programmers like Andrew G of Ethereal fame, who take pride in fine-tuning an engine.

My advice to developers that know they are not chasing after Stockfish is to program for style, and to create an enjoyable opponent or analysis partner. That would be something to be proud of, regardless of actual Elo rating. :)
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: Mayhem NNUE - New NN engine

Post by JohnWoe »

carldaman wrote: Fri Oct 23, 2020 9:00 am Yeah, in many ways it's a sad state of affairs when you just plug in SF-NNUE as a shortcut. The worst part is that it's a black box that doesn't even let you set contempt, let alone other settings. It takes the fun out of it for guys like Brendan and me, and also for some conscientious programmers like Andrew G of Ethereal fame, who take pride in fine-tuning an engine.

My advice to developers that know they are not chasing after Stockfish is to program for style, and to create an enjoyable opponent or analysis partner. That would be something to be proud of, regardless of actual Elo rating. :)
Mayhem is just an experimental engine.

Also, pext doesn't give any speedup, as the CPU is having a hard time with the NNUE instructions. My testing of NNUE-only evaluation shows that it shuffles too much in endgames. Maybe it wasn't trained on them? You still need a classical evaluation.

I see NNUE as an evaluation library, though it needs to be cleaned up/polished and made header-only, without all those hacks. It could use GPU/CPU/whatever, I don't care, as long as it works.

New version: Mayhem 0.43: https://github.com/SamuraiDangyo/mayhem ... /tag/v0.43

Sample game. See how effortless the NNUE is.
[pgn][Event "Computer Chess Game"]
[Site "pc"]
[Date "2020.10.23"]
[Round "-"]
[White "Mayhem NNUE 0.43"]
[Black "Sapeli 1.92"]
[Result "1-0"]
[TimeControl "40/60"]
[Annotator "1. +0,51 1... -0,06"]

1. d4 {+0,51/11} d5 {-0,06/10 2,0} 2. c4 {+0,46/11 1,9} dxc4 {+0,22/11 1,9}
3. e3 {+0,66/10 1,9} Be6 {+0,12/10 1,9} 4. Nf3 {+0,93/9 1,8} b5
{+0,28/10 1,8} 5. a4 {+1,78/10 1,7} c6 {+0,27/11 1,7} 6. axb5
{+2,26/10 1,7} cxb5 {+0,29/10 1,7} 7. b3 {+2,30/10 1,6} Nc6 {+0,27/10 1,6}
8. bxc4 {+3,03/10 1,6} bxc4 {-0,09/10 1,6} 9. e4 {+3,58/10 1,5} Nf6
{-0,29/10 1,5} 10. d5 {+4,62/10 1,5} Nxe4 {+0,11/10 1,5} 11. Bxc4
{+5,15/9 1,4} Nb8 {-0,20/9 1,4} 12. dxe6 {+10,67/9 1,4} Qxd1+
{-2,00/12 1,4} 13. Kxd1 Nxf2+ {-2,06/12 1,4} 14. Ke2 {+14,49/11 1,5} Nxh1
{-2,34/12 1,4} 15. Bd5 {+15,70/10 1,5} fxe6 {-2,95/12 1,4} 16. Bxa8
{+16,34/10 1,5} Nd7 {-3,35/11 1,4} 17. Bc6 {+24,20/9 1,5} Kd8
{-2,66/10 1,4} 18. Bxd7 {+22,67/9 1,5} Kxd7 {-3,41/12 1,4} 19. Rxa7+
{+26,13/9 1,5} Kc6 {-2,66/10 1,4} 20. Bf4 {+2,74/10 1,5} Kb6 {-2,87/10 1,4}
21. Be3+ {+3,59/10 1,5} Kb5 {-5,23/11 1,4} 22. Ne5 {+4,92/10 1,5} Ng3+
{-104,85/9 0,2} 23. hxg3 {+104,85/8 0,1} g6 {-104,85/7 0,1} 24. Rb7+
{+104,85/6 0,1} Ka4 {-104,85/6 0,1} 25. Nc3+ {+104,85/5 0,1} Ka5
{-104,85/5 0,1} 26. Nc4+ {+104,85/4 0,1} Ka6 27. Ra7# {+104,85/3 0,1}
{Xboard adjudication: Checkmate} 1-0[/pgn]
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Mayhem NNUE - New NN engine

Post by mvanthoor »

JohnWoe wrote: Thu Oct 22, 2020 11:40 pm Mayhem is Sapeli written in C++14 + SF NNUE evaluation. Thanks to Maksim for simplifying away the DirtyPiece/etc. crap!
My engine is quite fast with regard to search already. It can achieve depth 10 in under 15 seconds in many positions WITHOUT even having a TT or other search optimizations yet. The main weakness is that it only has material count + PSQT for an evaluation. I could implement some things such as LMR and other search tricks, a TT, and stick a NNUE to that search and call it a day. That engine would probably be at least 2500 Elo.
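For readers unfamiliar with the term, a "material count + PSQT" evaluation can be sketched roughly as follows. This is a toy Python illustration and not Rustic's code: the piece values, the single centre-weighted table shared by all non-king pieces, and the names `PIECE_VALUE`, `PSQT` and `evaluate` are all assumptions made for the example.

```python
# Illustrative centipawn values; real engines tune these.
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

def toy_table():
    """One 64-entry table (a1 = index 0) rewarding squares nearer the centre."""
    table = []
    for rank in range(8):
        for file in range(8):
            table.append(5 * min(file, 7 - file, rank, 7 - rank))
    return table

# Hypothetical PSQT: the same toy table for every piece except the king.
PSQT = {p: toy_table() for p in "PNBRQ"}
PSQT["K"] = [0] * 64

def evaluate(fen):
    """Material + PSQT score in centipawns from White's point of view."""
    score = 0
    for row, rank_str in enumerate(fen.split()[0].split("/")):
        rank = 7 - row  # FEN lists rank 8 first
        file = 0
        for ch in rank_str:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
                continue
            piece = ch.upper()
            if ch.isupper():
                score += PIECE_VALUE[piece] + PSQT[piece][rank * 8 + file]
            else:
                # Mirror the rank so Black reads the table from its own side.
                score -= PIECE_VALUE[piece] + PSQT[piece][(7 - rank) * 8 + file]
            file += 1
    return score

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(evaluate(START))
print(evaluate("n3k3/8/8/8/4N3/8/8/4K3 w - - 0 1"))
```

With these toy numbers the start position scores 0, and a white knight on e4 is worth 15 centipawns more than a black knight stuck on a8.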

God I hate NNUE. All those engines (and the GPU-engines as well) should be transferred to their own rating list(s), but I digress. There are other topics to discuss this.
Author of Rustic, an engine written in Rust.
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: Mayhem NNUE - New NN engine

Post by JohnWoe »

mvanthoor wrote: Fri Oct 23, 2020 10:52 am
JohnWoe wrote: Thu Oct 22, 2020 11:40 pm Mayhem is Sapeli written in C++14 + SF NNUE evaluation. Thanks to Maksim for simplifying away the DirtyPiece/etc. crap!
My engine is quite fast with regard to search already. It can achieve depth 10 in under 15 seconds in many positions WITHOUT even having a TT or other search optimizations yet. The main weakness is that it only has material count + PSQT for an evaluation. I could implement some things such as LMR and other search tricks, a TT, and stick a NNUE to that search and call it a day. That engine would probably be at least 2500 Elo.

God I hate NNUE. All those engines (and the GPU-engines as well) should be transferred to their own rating list(s), but I digress. There are other topics to discuss this.
Even if you plug in the NNUE, it doesn't take anything away from Rustic! You could use that classical evaluation to deliver a quick checkmate or when in time trouble.
Evaluation is the hardest part to get right. To me, it is the most boring job.
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Mayhem NNUE - New NN engine

Post by mvanthoor »

BrendanJNorman wrote: Fri Oct 23, 2020 5:26 am
...

But even doing this is only sufficient if the goal is strength.

As an end-user, I'm getting more and more bored of the wave of Stockfish NNUEs being plugged into other engines.

And even Frank Q has switched to testing for style rather than strength.

I feel like if the trend for programmers is to copy/paste SF NNUE, most of the end-users will also move in another direction and test older or different engines.

Or perhaps I'm misunderstanding what exactly NNUE is...I dunno.
I've finally installed Arena and looked into how it works.

There are actually some nice engines packaged with it. There are some that I remember from way back, such as SOS 5.1 and Ruffian 1.0.5. Those are 15 and 17 years old. Maybe I should look into collecting more older free and open source engines for testing.

With Stockfish, I'm sticking with version 10. In 11, they have recalibrated Skill and UCI_Strength to the CCRL scale, and now the engine is way too strong on the lowest 1350 setting. That is not 1350 Elo; it feels more like 1600, maybe even 1700. In the past I have always had these engines installed:

- Whatever the Fritz GUI installs, and Fritz 13 and 11. (I don't have 12, or anything before 11.)
- Stockfish (up to 10) and Texel 1.07 (which is my favorite engine because of its great Strength setting)
- Stockfish 11 and Ethereal for analyzing.

You're right; for playing against, it wouldn't hurt to look into older engines again especially if they have a strength/skill setting, and they'll make good sparring partners for my own engine.
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Mayhem NNUE - New NN engine

Post by mvanthoor »

JohnWoe wrote: Fri Oct 23, 2020 11:03 am Even if you plug in the NNUE, it doesn't take anything away from Rustic! You could use that classical evaluation to deliver a quick checkmate or when in time trouble.
Evaluation is the hardest part to get right. To me, it is the most boring job.
To me, it is the REASON to write the chess engine in the first place.

The evaluation is the place where the chess engine plays chess. It makes it a "chess engine", instead of an "apparatus that makes legal moves on an internal board representation, on the basis of values ascribed to those moves by algorithms of which we don't know how they actually work."

The only thing I'll not be doing by hand (eventually) is tuning the precise values of the evaluation. And even then, I might change some values after tuning to tweak the engine's personality, even if it decreases strength somewhat.
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: Mayhem NNUE - New NN engine

Post by JohnWoe »

I have released Mayhem NNUE 0.47.
Release: https://github.com/SamuraiDangyo/mayhem ... /tag/v0.47
Source code: https://github.com/SamuraiDangyo/mayhem

- Classical evaluation removed. Tiny mating helper added to finish games faster.
- Added null move for non-tactical moves.
- Against SF12 limited to 2300 Elo, it scores "5 to 1".
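A "mating helper" of the kind mentioned in the list above is often a small bonus that rewards pushing the losing king towards the edge and bringing the winning king closer. The sketch below shows that general idea in Python. It is not Mayhem's actual code, and the weights (10 and 4) and all function names are made up for illustration.

```python
def sq(name):
    """Convert a square name like 'g8' to an index 0..63 with a1 = 0."""
    return (ord(name[0]) - ord("a")) + 8 * (int(name[1]) - 1)

def center_manhattan_distance(square):
    """Manhattan distance (0..6) of a square from the four centre squares."""
    f, r = square % 8, square // 8
    return max(3 - f, f - 4) + max(3 - r, r - 4)

def manhattan_distance(a, b):
    """File distance plus rank distance between two squares."""
    return abs(a % 8 - b % 8) + abs(a // 8 - b // 8)

def mating_helper(winning_king, losing_king):
    """Bonus in centipawns: grows as the losing king is driven to the rim
    and as the winning king approaches it. Weights are hypothetical."""
    edge_term = 10 * center_manhattan_distance(losing_king)
    closeness_term = 4 * (14 - manhattan_distance(winning_king, losing_king))
    return edge_term + closeness_term

# King placements taken from the KPK sample game in this post:
# after 18. Kc5 Kg5 and after 24. Kg6 Kg8.
print(mating_helper(sq("c5"), sq("g5")))
print(mating_helper(sq("g6"), sq("g8")))
```

Added to the score of the side that is already winning, a term like this gives the search a gradient to follow when the mate is still beyond its horizon.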

The 22 MB eval file is needed too.

This NNUE black box seems to handle KPK endings perfectly. No need to have KPK code inside.
[pgn][Event "Computer Chess Game"]
[Site "pc"]
[Date "2020.10.29"]
[Round "-"]
[White "Mayhem NNUE 0.47"]
[Black "Stockfish 12"]
[Result "1-0"]
[TimeControl "40/60"]
[FEN "1k6/8/8/8/8/8/1P6/1K6 w - - 0 1"]
[SetUp "1"]

{--------------
. k . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. P . . . . . .
. K . . . . . .
white to play
--------------}
1. Ka2 {+0,99/16} Ka8 {-59,87/52 1,0} 2. Kb3 {+1,31/19 1,9} Ka7
{-59,87/50 0,6} 3. Kc4 {+1,31/17 1,9} Kb6 {-59,87/51 0,5} 4. Kb4
{+2,23/18 1,8} Kc7 {-59,87/50 0,7} 5. Ka5 {+2,14/17 1,7} Kc6
{-59,87/50 0,7} 6. b3 {+1,99/17 1,7} Kd5 {-59,87/51 2,9} 7. Kb6
{+4,81/18 1,6} Ke6 {-59,87/40 0,5} 8. Kc6 {+4,52/17 1,6} Ke7 {-60,16/41 7}
9. b4 {+4,52/15 1,5} Kd8 {-60,19/37 0,5} 10. Kb7 {+3,23/17 1,5} Kd7
{-60,21/38 0,5} 11. b5 {+5,67/16 1,4} Kd6 {-1000,15/48 1,8} 12. b6
{+20,65/14 1,4} Kd7 {-1000,13/45 0,6} 13. Ka7 {+3,23/13 1,4} Ke6
{-1000,12/45 0,7} 14. b7 {+22,56/11 1,4} Kf6 {-1000,11/45 1,4} 15. b8=Q
{+21,92/11 1,4} Kf5 {-1000,09/43 0,5} 16. Qb4 {+21,84/11 1,4} Kg5
{-1000,08/45 0,7} 17. Kb6 {+28,99/11 1,4} Kf5 {-1000,07/45 0,7} 18. Kc5
{+30,48/11 1,4} Kg5 {-1000,06/53 0,8} 19. Qe4 {+30,48/11 1,4} Kf6
{-1000,06/48 0,7} 20. Kd5 {+30,27/11 1,4} Kf7 {-1000,05/84 0,7} 21. Ke5
{+104,85/10 1,0} Kg7 {-1000,04/245 0,7} 22. Kf5 {+104,85/8 0,1} Kg8
{-1000,03/245 0,1} 23. Qe7 {+104,85/5 0,1} Kh8 {-56,29/2 0,1} 24. Kg6
{+104,85/4 0,1} Kg8 {-1000,01/43 0,1} 25. Qe8# {+104,85/3 0,1}
{Xboard adjudication: Checkmate} 1-0[/pgn]
jorose
Posts: 358
Joined: Thu Jan 22, 2015 3:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: Mayhem NNUE - New NN engine

Post by jorose »

BrendanJNorman wrote: Fri Oct 23, 2020 5:26 am
JohnWoe wrote: Thu Oct 22, 2020 11:40 pm This proves that handcrafted evaluations are all crap.
This will cause enough heads here to shake that it'll cause an earthquake. :lol:

Maybe a better way to express yourself is "this proves that copying and pasting Stockfish's evaluation into other engines is superior to using your own".

Something we already knew haha.

But even doing this is only sufficient if the goal is strength.

As an end-user, I'm getting more and more bored of the wave of Stockfish NNUEs being plugged into other engines.

And even Frank Q has switched to testing for style rather than strength.

I feel like if the trend for programmers is to copy/paste SF NNUE, most of the end-users will also move in another direction and test older or different engines.

Or perhaps I'm misunderstanding what exactly NNUE is...I dunno.
I feel like the logical next step for programmers to consider is copy-pasting Stockfish's search into their program as well. I am sure that is also worth a couple hundred Elo! Also, why not the board representation, move generation, UCI options, etc., since those are all just boilerplate anyways?

Perhaps it makes most sense to start with copy pasting the author list.
-Jonathan