Strategies for weaker play levels

Evert · Post by **Evert** » Tue Jun 28, 2016 5:54 pm

I've been looking a bit at implementing strength levels in SjaakII. This is particularly pertinent because it can play any number of obscure variants that humans who just want to try them out may not play well simply because they are unfamiliar with them.

Also, I watched my son play an old hand-held chess computer the other day. The type of mistakes a conventional alpha-beta searcher makes turns playing against it into an exercise in frustration, even at low depths. I noticed, however, that the hand-held plays much more appropriately, at least at lower levels: it sometimes forgets to evacuate an attacked piece, or it captures a defended pawn, but in ways that make it not blatantly obvious that the computer is making a blunder. In short, the mistakes it makes look human-ish, which is a great feature.

I've implemented a few options in SjaakII now, as follows:

Clueless - this plays a completely random legal move, so it's completely clueless. It doesn't look like it plays chess though.
Beal/random - Returns a completely random value from the evaluation, but uses a normal search. Looks like somewhat natural play, might be close to what I'm looking for (it drops the occasional piece), but my understanding is that using this to limit the playing strength does not scale well if you increase the search depth.
Static analysis - Very simple. Look at all legal moves in the root, score them according to whether they are SEE>0 captures, centralising moves or checking moves, and play the one that gets the highest score. Again, this actually looks like chess, but it's still clueless. Recognising some more patterns (threats against own pieces come to mind) may make this better.
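As a rough illustration of the static analysis option, the sketch below scores root moves from the three features mentioned and plays the highest-scoring one. The `RootMove` record, its fields and the weights (100, 50, 10) are assumptions made for this example, not SjaakII's actual code; a real engine would fill these fields from its own move generator and SEE routine.

```python
from dataclasses import dataclass

@dataclass
class RootMove:
    name: str          # move label, used for display only
    see: int           # static exchange evaluation in centipawns (0 for quiet moves)
    gives_check: bool  # True if the move checks the enemy king
    to_file: int       # destination file, 0..7
    to_rank: int       # destination rank, 0..7

def centralisation(file: int, rank: int) -> int:
    """Distance from the rim: 0 on the edge, 3 on the four central squares (8x8 board)."""
    return min(file, 7 - file, rank, 7 - rank)

def static_score(move: RootMove) -> int:
    """Score one root move without any search. The weights are arbitrary assumptions."""
    score = 0
    if move.see > 0:               # only captures that win material get a bonus
        score += 100 + move.see
    if move.gives_check:
        score += 50
    score += 10 * centralisation(move.to_file, move.to_rank)
    return score

def pick_static(moves):
    """Play the root move with the highest static score."""
    return max(moves, key=static_score)

moves = [
    RootMove("a3", 0, False, 0, 2),
    RootMove("Nf3", 0, False, 5, 2),
    RootMove("Bxf7+", -200, True, 5, 6),   # losing capture, but it gives check
    RootMove("exd5", 100, False, 3, 4),    # wins a pawn on a central square
]
print(pick_static(moves).name)  # exd5
```

Because nothing here looks at the opponent's replies, the picker has the same blind spots described for this mode: it neither sees threats against its own pieces nor aims for mate.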

This topic comes up every once in a while, but from the discussions it's not so obvious to me what most people consider to be a satisfactory algorithm. Is there anyone who has tried one or more of these ideas and is able to make some recommendations based on this?

Ferdy · Post by **Ferdy** » Tue Jun 28, 2016 7:35 pm

Have you tried weakening by multipv?

This gives a natural weakening: generate PVs (multipv = 5, for example) and sort them; at a given iteration depth the first one is the best and the last one is the worst.

PVs at iteration 1 are generally weaker than PVs at iteration 2, and so on. So you can vary the strength not only through the iteration depth but also through the position in the multipv ordering at a given iteration.

You can then introduce a top-move percentage per skill level. The strongest skill level will pick the top move most of the time; the 2nd move in the sorted multipv is picked with a certain percentage, higher for higher skill levels and lower for lower skill levels.

Then you can introduce a blunder percentage: a high skill level has a low blunder rate and a low skill level has a high blunder rate.

This is very natural; it is how most humans behave.

Then you can introduce an error margin per skill level. Strong players may blunder at times, but their blunders are not that bad. For example, say multipv 1 is +15cp, multipv 2 is +2cp, multipv 3 is -10cp and multipv 4 is -200cp. At a high skill level you may select the move in multipv 3, which has a score of -10cp. The move in multipv 4, with a score of -200cp, can be reserved for lower skill levels.
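One possible reading of this scheme, as a minimal sketch: each skill level gets a top-move percentage, a blunder percentage and an error margin, and the move is drawn from the sorted multipv list. The `SKILL` table, its numbers and the `pick_move` function are hypothetical and would need tuning; they are not taken from Deuterium or any other engine.

```python
import random

# Hypothetical settings per level:
# (chance to play the top move, chance to blunder, error margin in centipawns)
SKILL = {
    1: (0.30, 0.30, 400),
    5: (0.60, 0.10, 200),
    10: (0.95, 0.01, 30),
}

def pick_move(multipv, level, rng=random):
    """multipv is a list of (move, score_cp) pairs in any order."""
    lines = sorted(multipv, key=lambda ms: ms[1], reverse=True)
    top_pct, blunder_pct, margin = SKILL[level]
    best_move, best_score = lines[0]
    # Alternatives are limited to moves within the error margin of the best one.
    allowed = [(m, s) for m, s in lines[1:] if best_score - s <= margin]
    roll = rng.random()
    if not allowed or roll < top_pct:
        return best_move
    if roll < top_pct + blunder_pct:
        return allowed[-1][0]  # "blunder": worst move still inside the margin
    return allowed[0][0]       # otherwise play the second-best line

# The example scores from the post: +15, +2, -10 and -200 centipawns.
pv = [("a", 15), ("b", 2), ("c", -10), ("d", -200)]
print(pick_move(pv, 10))  # usually "a"; never "d", which is outside the 30cp margin
```

With these numbers, a blunder at level 10 lands on the -10cp move, while at level 1 the -200cp move becomes reachable.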

kbhearn · Post by **kbhearn** » Tue Jun 28, 2016 8:07 pm

Ferdy wrote: Have you tried weakening by multipv?

This gives a natural weakening: generate PVs (multipv = 5, for example) and sort them; at a given iteration depth the first one is the best and the last one is the worst.

PVs at iteration 1 are generally weaker than PVs at iteration 2, and so on. So you can vary the strength not only through the iteration depth but also through the position in the multipv ordering at a given iteration.

You can then introduce a top-move percentage per skill level. The strongest skill level will pick the top move most of the time; the 2nd move in the sorted multipv is picked with a certain percentage, higher for higher skill levels and lower for lower skill levels.

Then you can introduce a blunder percentage: a high skill level has a low blunder rate and a low skill level has a high blunder rate.

This is very natural; it is how most humans behave.

Then you can introduce an error margin per skill level. Strong players may blunder at times, but their blunders are not that bad. For example, say multipv 1 is +15cp, multipv 2 is +2cp, multipv 3 is -10cp and multipv 4 is -200cp. At a high skill level you may select the move in multipv 3, which has a score of -10cp. The move in multipv 4, with a score of -200cp, can be reserved for lower skill levels.

It has always bothered me to try to quantify the likelihood of a blunder by cp score. Strong players make potentially large blunders plenty - it's not the size of the blunder, it's how obvious it is that decreases with strength.

I.e., a complete beginner might commonly forget to move a piece out of en prise, but will still be making a move that appears to do something. Making the occasional -5 blunder will still look wrong if that -5 blunder is undeveloping a piece for no reason while another is under attack.

the next step up might count an exchange sequence wrong

the next step up might miss simple two step tactics like removing a defender

the next step up might miss attacks that begin with a sacrifice or other unnatural move

the next step up might struggle with combinations involving multiple areas of the board

and finally any strength of player may struggle with complex combinations mixing multiple sacrifices, unnatural moves, quiet moves to varying degrees.

etc

Then there's coupling that with missing or incorrect knowledge to cover the more subtle blunders (liking knights too much, not understanding pawn structure at all, excessive fear of doubled pawns, etc.). That would be best covered by exposing a large number of broad eval weights as options and coupling them with a GUI that has personalities set up to represent common failings. For these wrong weights to take hold, though, you'll also need to limit the search tree to an extreme degree, so that the consequences of the wrong knowledge often fall outside the search tree.

Natural blunders are NOT merely randomly intentionally losing some centipawns every couple moves.

Evert · Post by **Evert** » Tue Jun 28, 2016 9:27 pm

Ferdy wrote: Have you tried weakening by multipv?

Not yet. At the moment I don't actually store the moves and scores in multi-pv mode, I just print them (and it only works in analysis mode at the moment).

I'll give it a go, but I suspect it still leads to play that isn't very balanced for very weak players/beginners...

Evert · Post by **Evert** » Tue Jun 28, 2016 9:32 pm

kbhearn wrote: Natural blunders are NOT merely randomly intentionally losing some centipawns every couple moves.

Just this.

Doing a static move ordering (scored by captures/checking/centralisation) and picking the most likely move leads to play that looks reasonably natural, but it has some odd holes: it doesn't actually try to deliver mate, and without threat detection, it doesn't actually try to save pieces that are en prise. I think this is probably the most promising thing I have so far.

Ferdy · Post by **Ferdy** » Tue Jun 28, 2016 10:09 pm

Evert wrote:
Ferdy wrote: Have you tried weakening by multipv?
Not yet. At the moment I don't actually store the moves and scores in multi-pv mode, I just print them (and it only works in analysis mode at the moment).

I'll give it a go, but I suspect it still leads to play that isn't very balanced for very weak players/beginners...

I have made attempts in the past to improve its play, but the truth is that a human is sometimes stupid and will also blunder like crazy.

So it is allowable to let your engine play crazy moves at times.

Check out the latest Deuterium and play with it; it has limit-strength and Elo-value options based on the multipv method and other things. This kind of feature also needs tuning.

Here are some engines that you may try playing against. It was fun to watch these engines play against each other at a supposed Elo rating.

Rank Engine                          ELO    1    2    3    4    5    6    7    8    9   Score      Tie  White   ELO
-------------------------------------------------------------------------------------------------------------------
  1: Horizon v4.4 elo1200           1200   4B+  3W+  7W+  2B=  8B+  6B+  9W+  5B+ 10W+    8.5     48.5      4   +80
  2: Deuterium v14.3.34.21 elo1200  1200   5B+  6W=  8W+  1W=  3B=  4B+  7B+ 10W+ 11B+    7.5     49.0      4   +60
  3: Amyan v1.72 elo1200            1200  10B+  1B-  5W+  6W+  2W=  8B+  4B+  7W+ 12B+    7.5     48.5      4   +55
  4: Ufim v8.02 elo1200             1200   1W- 15B+  6B= 14W+  7B+  2W-  3W-  9W+ 13B+    5.5     45.0      5   +15
  5: Rybka v2.3.2a elo1200          1200   2W-  9B+  3B- 12W+  6B- 14B+ 13W+  1W-  8W+    5.0     47.0      5    +5
  6: Tornado v4.4 elo1200           1200   9B+  2B=  4W=  3B-  5W+  1W- 11W- 16W+ 14B+    5.0     46.0      5   +15
  7: BlackMamba v1.2c elo1200       1200  16W+ 11B+  1B- 13W+  4W- 10B+  2W-  3B- 15W+    5.0     41.5      5   +20
  8: Cheng4 v0.36c elo1200          1200  15W+ 13B+  2B- 10W+  1W-  3W- 12B= 11B+  5B-    4.5     44.0      4     0
  9: Hiarcs v14 elo1200             1200   6W-  5W- 15B+ 11B+ 14W= 12W+  1B-  4B- 16W+    4.5     36.0      5     0
 10: Houdini v4 elo1200             1200   3W- 14B+ 11W+  8B- 13B+  7W- 16B+  2B-  1B-    4.0     43.5      3   -10
 11: Hamsters v0.7.1 elo1200        1200  12B+  7W- 10B-  9W- 16W+ 15B+  6B+  8W-  2W-    4.0     35.5      5   -15
 12: SlowChess 2.960e elo1200       1200  11W- 16B+ 13W-  5B- 15W+  9B-  8W= 14B+  3W-    3.5     33.0      5   -15
 13: Arasan v17.1 elo1200           1200  14W=  8W- 12B+  7B- 10W- 16B+  5B- 15W=  4W-    3.0     32.0      5   -30
 14: MadChess v1.4 elo1200          1200  13B= 10W- 16W+  4B-  9B=  5W- 15B+ 12W-  6W-    3.0     32.0      5   -30
 15: Rodent v1.3 elo1200            1200   8B-  4W-  9W- 16B= 12B- 11W- 14W- 13B=  7B-    1.0     33.5      4   -70
 16: DanaSahZ v0.4 elo1200          1200   7B- 12W- 14B- 15W= 11B- 13W- 10W-  6B-  9B-    0.5     33.0      4   -80

Evert · Post by **Evert** » Tue Jun 28, 2016 10:11 pm

Here is a game between "static move ordering" (white) and "random evaluation" (black). These things are quite fun; even more so because you can make a trivial change in move ordering and see a large boost in playing strength.

[pgn][Event "Computer Chess Game"]
[Site "vivaine.local"]
[Date "2016.06.28"]
[Round "-"]
[White "Sjaak II 621 (static)"]
[Black "Sjaak II 621 (random/Beal)"]
[Result "1-0"]
[TimeControl "40/10"]
[Annotator "1... -0.56"]

1. Nf3 a5 {-0.56/9 0.2} 2. c4 b6 {+0.42/8 0.1} 3. Nc3 e6 {-1.35/8 0.1} 4.
e4 Qh4 {+0.77/8 0.3} 5. Nxh4 Kd8 {-1.40/9 0.3} 6. Be2 g6 {+0.13/8 0.1} 7.
O-O Bd6 {-0.07/9 0.4} 8. a4 Nh6 {+0.00/8 0.2} 9. f4 Nf5 {+2.09/8 0.2} 10.
exf5 Na6 {-0.71/7 0.1} 11. Kf2 Nb4 {+1.75/8 0.4} 12. g4 Ra7 {+1.39/7 0.2}
13. Ke3 Bb7 {+4.94/7 0.2} 14. Ra3 gxf5 {+2.09/7 0.2} 15. gxf5 Bc5+
{+2.26/7 0.2} 16. d4 Nc2+ {-0.01/7 0.1} 17. Qxc2 Bxd4+ {+0.22/8 0.3} 18.
Kxd4 e5+ {+1.27/8 0.2} 19. fxe5 c5+ {+0.00/8 0.2} 20. Ke3 Re8 {+0.53/8 0.2}
21. Rf4 Rxe5+ {+2.31/8 0.3} 22. Ne4 d5 {-0.12/8 0.2} 23. Rd3 b5
{-0.52/8 0.2} 24. axb5 d4+ {+0.00/8 0.3} 25. Kd2 Bd5 {-1.66/8 0.2} 26. cxd5
Rc7 {-8.29/7 0.2} 27. Nxc5 Ke8 {+0.02/8 0.2} 28. Rdxd4 h5 {-5.92/8 0.3} 29.
Bxh5 Rb7 {-8.55/8 0.3} 30. Nxb7 Kf8 {-159.88/9 1.1} 31. Nxa5 Re3
{-14.30/8 0.4} 32. Kxe3 Kg7 {-159.94/20 0.1} 33. Rg4+ Kh7 {-159.94/5 0.1}
34. Bxf7 Kh6 {-159.94/4 0.1} 35. Qc6+ Kh7 {-11.27/2 0.1} 36. b3 Kh8
{-159.98/2 0.1} 37. Qh6#
{Xboard adjudication: Checkmate} 1-0
[/pgn]

konsolas · Post by **konsolas** » Tue Jun 28, 2016 10:53 pm

Would doing a fixed depth 2 search without quiescence be such a bad idea? The engine would play moves that make sense, but would be vulnerable to tactical shots.
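To make the idea concrete, here is a small sketch of a fixed-depth negamax with no quiescence search, run on a hand-built toy tree. The `Node` type, the move labels and every evaluation number are invented for illustration; the point is only that a two-ply search accepts the static score at its horizon and so walks into a fork that a four-ply search avoids.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    static_eval: int                              # centipawns, from the side to move
    children: dict = field(default_factory=dict)  # move label -> Node

def negamax(node, depth):
    """Plain fixed-depth negamax; at depth 0 the static score is taken as is."""
    if depth == 0 or not node.children:
        return node.static_eval
    return max(-negamax(child, depth - 1) for child in node.children.values())

def best_move(root, depth):
    """Return (move, score) for the best root move at the given depth."""
    scored = {m: -negamax(c, depth - 1) for m, c in root.children.items()}
    move = max(scored, key=scored.get)
    return move, scored[move]

# Toy position, white to move. All evaluations are made up.
root = Node(0, {
    "Nc3": Node(0, {"Nf6": Node(10)}),
    "Bg5": Node(-30, {
        "h6": Node(30),
        "Qa5+": Node(30, {                        # check that also attacks g5
            "Nc3": Node(-30, {"Qxg5": Node(-300)}),
        }),
    }),
})

print(best_move(root, 2))  # ('Bg5', 30): the fork lies beyond the horizon
print(best_move(root, 4))  # ('Nc3', 10): two more plies reveal the lost bishop
```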

Vinvin · Post by **Vinvin** » Wed Jun 29, 2016 12:27 am

Evert wrote: I've been looking a bit at implementing strength levels in SjaakII.

Set a random bonus/malus value for each move at the root.

How?
At the beginning of the game, set the range (for example from -2 to +2).
Before each search (each move), set a random number for each root move: e4 -> random[-2, 2], d4 -> random[-2, 2], ...
The evaluation will add this value for the corresponding move.
The search could last 0.1 sec.
The advantage: it does not blunder too much.
The disadvantage: it will never blunder a mate.
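A minimal sketch of this root-noise idea follows. It assumes the search already returns one score per root move and that the range is expressed in pawns; both are assumptions for the example, as is the function name `pick_with_root_noise`. Adding a constant to every evaluation below a root move shifts that move's minimax score by the same constant, so the offset is applied directly to the root scores here.

```python
import random

def pick_with_root_noise(root_scores, noise_range, rng=random):
    """root_scores maps each root move to its search score (in pawns).

    A fresh offset in [-noise_range, +noise_range] is drawn for every root
    move, and the move with the best noisy score is played.
    """
    noisy = {
        move: score + rng.uniform(-noise_range, noise_range)
        for move, score in root_scores.items()
    }
    return max(noisy, key=noisy.get)

scores = {"e4": 0.3, "d4": 0.2, "a4": -0.5}
print(pick_with_root_noise(scores, 2.0))  # any of the three can come out on top

# A mate score dwarfs the noise, so a forced mate is always played.
with_mate = {"Qh7#": 1000.0, "Qh5": 0.3}
print(pick_with_root_noise(with_mate, 2.0))  # Qh7#
```

The same arithmetic explains both stated properties: a move that loses more than twice the range can never overtake the best move, and mate scores are out of reach of the noise.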

Ferdy · Post by **Ferdy** » Wed Jun 29, 2016 6:09 am

konsolas wrote: Would doing a fixed depth 2 search without quiescence be such a bad idea? The engine would play moves that make sense, but would be vulnerable to tactical shots.

This is true; weakening by reducing the search depth is fine, but the issue is scaling across skill1, skill2, skill3 and so on, where a higher skill number is stronger. It is not that easy to just assign skill1 = depth 1, skill2 = depth 2, etc.
