But isn't that the point here? The engine shouldn't be weakened down to completely random play, so don't make the random bonus / error term too high.bob wrote:I don't like just adding a fixed bonus for each root move. If you make the bonus too large, play becomes random...
Score Inaccuracy: An Engine Weakening Algorithm
Moderators: hgm, Rebel, chrisw
-
- Posts: 5647
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Score Inaccuracy: An Engine Weakening Algorithm
-
- Posts: 7208
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Score Inaccuracy: An Engine Weakening Algorithm
For each move the "Club Player" option in mine creates a random value for each root move between 0.00 - 2.00. The effect is surprising, it still plays its normal style but so now and then (2-3 times in a game) it blunders. Just like the average chess player.syzygy wrote:But isn't that the point here? The engine shouldn't be weakened down to completely random play, so don't make the random bonus / error term too high.bob wrote:I don't like just adding a fixed bonus for each root move. If you make the bonus too large, play becomes random...
-
- Posts: 439
- Joined: Thu Apr 26, 2012 1:51 am
- Location: Oak Park, IL, USA
- Full name: Erik Madsen
Re: Score Inaccuracy: An Engine Weakening Algorithm
First of all I'd like to thank all of you for your responses. It's quite amazing that I can ask an obscure technical question on this forum and within a couple days get advice from experts in the field. Many of you have been at this hobby for years, so I appreciate you taking some time to help a newcomer like myself.
Bob, I remember reading a post of yours a while back that explained this. It makes sense to me. Piece mobility is desired by an engine. Mobility = greater number of moves. Greater number of moves = greater chance a good move is selected, even with a random error applied. That's why I rejected the technique of added error to evaluation scores. Instead I add error only to root moves.If all you do is add some sort of random bias to your scores, and don't change anything else, you will have a "floor" on your performance that is much higher than you might imagine. Don Beal tested a program with a purely random eval and found it did not play foolishly at all.
Bob and Jon both mentioned the importance of slowing down the search. I have a parameter in MadChess called Evaluation Delay that does that. It enforces a delay every time the evaluation function is called. I have not spent any time calibrating this parameter to ELO. I'm hoping the combination of Score Inaccuracy and Evaluation Delay can create fairly believable weak chess personalities.And I also limit speed as Bob does: because at normal search speed you will get very high depths and if your score inaccuracy is not very large then the program will still find deep tactical shots that win material.
Ed, this sounds very similar to what I'm doing in my engine. I'll check out the personalities in ProDeo, maybe play a few games between ProDeo personalities and MadChess personalities. Though I'd have to weaken your engine much more than mine! ProDeo is much faster and stronger than my amateur engine.For each move the "Club Player" option in mine creates a random value for each root move between 0.00 - 2.00. The effect is surprising, it still plays its normal style but so now and then (2-3 times in a game) it blunders. Just like the average chess player.
Erik Madsen | My C# chess engine: https://www.madchess.net
-
- Posts: 5647
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Score Inaccuracy: An Engine Weakening Algorithm
Slowing down the search could also be done at the root: let the engine search for a fraction of the allocated time, then wait until the allocated time runs out.emadsen wrote:Bob and Jon both mentioned the importance of slowing down the search. I have a parameter in MadChess called Evaluation Delay that does that. It enforces a delay every time the evaluation function is called. I have not spent any time calibrating this parameter to ELO. I'm hoping the combination of Score Inaccuracy and Evaluation Delay can create fairly believable weak chess personalities.
-
- Posts: 895
- Joined: Mon Jan 15, 2007 11:23 am
- Location: Warsza
Re: Score Inaccuracy: An Engine Weakening Algorithm
I went a different path in Rodent:
there is a random factor in leaf eval, there is a slowdown at the root, but the following function constitutes the core of my weakening algorithm. Basically it tells the search whether a move should be omitted.
In short, stronger levels make less blunders and push them further away from the root. Also it is not ok to forget about specific moves. Not missing check evasions is extremely big contributor (levels that can miss it tend to play unsound sacrificial attacks).
I managed to tune this algorithm to perform like 2000 and 2200 Elo (CCRL scale, first data point requires internal Elo of about 1850). At first I thought that with enough data points some curve fit would be possible.
The real problem is that I am unable to tune it in such a way that strength becomes independent from evaluation style. In short, at weaker levels evaluation functions containing significant bias (i.e. geared towards King attack or King defense) clearly outperform their balanced counterparts.
there is a random factor in leaf eval, there is a slowdown at the root, but the following function constitutes the core of my weakening algorithm. Basically it tells the search whether a move should be omitted.
Code: Select all
int sSearcher::Blunder(sPosition *p, int ply, int depth, int flag, int move, int lastMove, int flagInCheck)
{
// we're playing the best game we can, so don't bother with weakening
// (speed optimization)
// MAX_ELO set at 2500
if (Data.elo == MAX_ELO || !Data.useWeakening ) return 0;
// Weaker levels progressively use more heuristics that disallow
// forgetting about certain moves.
// try all root moves
if (!ply) return 0;
// try to capture the last piece that moved
if (Tsq(move) == Tsq(lastMove) && Data.elo > 999) return 0;
// don't miss captures by a pawn
if ( p->pc[Tsq(move)] == P
&& File(Tsq(move)) != File(Fsq(move))
&& Data.elo > 1199) return 0;
// don't miss short captures
if ( Data.distance[Fsq(move)][Tsq(move)] > 9
&& flag > FLAG_KILLER_MOVE
&& flag != FLAG_HASH_MOVE
&& Data.elo > 1299 ) return 0;
// don't miss a queen promotion
if ( MoveType(move) == Q_PROM && Data.elo > 1399) return 0;
// don't miss check evasions
if (flagInCheck && Data.elo > 1499) return 0;
// Stronger levels progressively push blunders away from the root
// ( extra randomness is used to scale this property as smoothly as possible)
if ( (Data.elo > 1599 && ply < 3)
|| (Data.elo > 1899 && ply < 4)
|| (Data.elo > 2299 && ply < 5) ) return 0;
if (Data.elo > 1300 && Data.elo <= 1600 && ply < 3)
{
int saveRate = p->hashKey % 300;
if (saveRate > 1600 - Data.elo) return 0;
}
if (Data.elo > 1600 && Data.elo <= 1900 && ply < 4 )
{
int saveRate = p->hashKey % 300;
if (saveRate > 1900 - Data.elo) return 0;
}
if (Data.elo > 1900 && Data.elo <= 2300 && ply < 5 )
{
int saveRate = p->hashKey % 300;
if (saveRate > 2300 - Data.elo) return 0;
}
int blunderRate = p->hashKey % Data.elo;
if (flag == FLAG_KILLER_MOVE
|| flag == FLAG_HASH_MOVE)
{
blunderRate *= 2;
blunderRate *= Data.elo;
blunderRate /= MAX_ELO;
}
// calculate blunder probability (less blunders near the root in deeper search)
// depth is counted so that ONE_PLY == 4
if ( blunderRate + depth < MAX_ELO - Data.elo) return 1;
return 0;
}
I managed to tune this algorithm to perform like 2000 and 2200 Elo (CCRL scale, first data point requires internal Elo of about 1850). At first I thought that with enough data points some curve fit would be possible.
The real problem is that I am unable to tune it in such a way that strength becomes independent from evaluation style. In short, at weaker levels evaluation functions containing significant bias (i.e. geared towards King attack or King defense) clearly outperform their balanced counterparts.
Pawel Koziol
http://www.pkoziol.cal24.pl/rodent/rodent.htm
http://www.pkoziol.cal24.pl/rodent/rodent.htm
-
- Posts: 1235
- Joined: Thu May 10, 2007 2:49 pm
Re: Score Inaccuracy: An Engine Weakening Algorithm
I added the limit_ELO feature to some Engines. All I did is to reduce the nodes/second with this kind of formula:
npslimit = x * pow(2,(EloLimit-999)/65)
and
while (nps > npslimit) Sleep 10
This doubles the nps for each 65 ELO increase.
You have to tune the forumla a little bit for different engines, fast timecontrols and very low nps.
After much tuning I got the following nps values for SlowChess compared to ssdf ELO:
1000: 0,2nps
1200: 12nps
1500: 150nps
2000: 3200nps
2500: 313.000nps
And it seems to work pretty good. Here some NUNN matches vs. Dedicated machines with the same ELO as adjusted:
npslimit = x * pow(2,(EloLimit-999)/65)
and
while (nps > npslimit) Sleep 10
This doubles the nps for each 65 ELO increase.
You have to tune the forumla a little bit for different engines, fast timecontrols and very low nps.
After much tuning I got the following nps values for SlowChess compared to ssdf ELO:
1000: 0,2nps
1200: 12nps
1500: 150nps
2000: 3200nps
2500: 313.000nps
And it seems to work pretty good. Here some NUNN matches vs. Dedicated machines with the same ELO as adjusted:
Code: Select all
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mephisto III 1464 27,5 / 50 11011111011=10=11001=100=00=10011=1=0=10010001=011 1500 +25
2: Mysticum SlowChess ELO 1464 1464 22,5 / 50 00100000100=01=00110=011=11=01100=0=1=01101110=100 1428 -25
----------------------------------------------------------------------------------------------------------------
50 games: +24 =9 -17
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mephisto MM IV 1904 26,5 / 50 0010===10011=011011011=011101==01001001=1011=01==0 1925 +15
2: Mysticum SlowChess ELO 1904 1904 23,5 / 50 1101===01100=100100100=100010==10110110=0100=10==1 1883 -15
----------------------------------------------------------------------------------------------------------------
50 games: +22 =11 -17
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mysticum SlowChess ELO 1927 1927 26,5 / 50 0011100=01=11100001=0=11==1===1==11=001=1====01001 1948 +15
2: Mephisto Amsterdam 1927 23,5 / 50 1100011=10=00011110=1=00==0===0==00=110=0====10110 1906 -15
----------------------------------------------------------------------------------------------------------------
50 games: +22 =17 -11
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mephisto Dallas 16Bit 1971 25,0 / 50 ===11==110=1=1111=01==0=100100000011=0=011=0000110 1971 +0
2: Mysticum SlowChess ELO 1971 1971 25,0 / 50 ===00==001=0=0000=10==1=011011111100=1=100=1111001 1971 +0
----------------------------------------------------------------------------------------------------------------
50 games: +25 =14 -11
Rtng Score 1234567890123456789012345678901234567890123 Perf Chg
---------------------------------------------------------------------------------------------------------
1: Mephisto MM V 1974 22,0 / 43 1=1011101011=001010011=0110=0=000==11=00011 1981 +4
2: Mysticum SlowChess ELO 1974 1974 21,0 / 43 0=0100010100=110101100=1001=1=111==00=11100 1967 -4
---------------------------------------------------------------------------------------------------------
43 games: +18 =8 -17
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mysticum SlowChess ELO 2030 2030 29,0 / 50 ==10011=1011=0=01=1=101001=1111=1=1=1=0=11=0001010 2087 +40
2: Mephisto Roma 32Bit 2030 21,0 / 50 ==01100=0100=1=10=0=010110=0000=0=0=0=1=00=1110101 1973 -40
----------------------------------------------------------------------------------------------------------------
50 games: +16 =14 -20
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
----------------------------------------------------------------------------------------------------------------
1: Mysticum SlowChess ELO 2011 2011 25,5 / 50 0110==010===1=11010=010==1100101010100110=10=10110 2037 +5
2: Mephisto Roma 32Bit 2030 24,5 / 50 1001==101===0=00101=101==0011010101011001=01=01001 2023 -5
----------------------------------------------------------------------------------------------------------------
50 games: +22 =11 -17
Rtng Score 12345678901234567890123456789012345678901234567890 Perf Chg
-------------------------------------------------------------------------------------------------------------------
1: Mysticum SlowChess ELO 2445 2445 26,0 / 50 1001==01010=10=00110==0011=1=1=110==101=0=10=11==0 2459 +10
2: Resurrection Fruit 2.1 203 MHz 2445 24,0 / 50 0110==10101=01=11001==1100=0=0=001==010=1=01=00==1 2431 -10
-------------------------------------------------------------------------------------------------------------------
50 games: +22 =16 -12
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Score Inaccuracy: An Engine Weakening Algorithm
How do you define "too high"? Do you taper the random bonus so that moves ordered later are less likely to become the best move?syzygy wrote:But isn't that the point here? The engine shouldn't be weakened down to completely random play, so don't make the random bonus / error term too high.bob wrote:I don't like just adding a fixed bonus for each root move. If you make the bonus too large, play becomes random...
This is a REALLY coarse adjustment. I looked for an approach that provides a fairly scalable performance that doesn't feel out of balance. I don't like positional idiot / tactical genius. I don't like tactical idiot / positional genius. I wanted both "components" to match up somewhat reasonably, so that the thing just feels like a weaker program that still plays normal-looking chess moves.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Score Inaccuracy: An Engine Weakening Algorithm
Don't you see the effect where it plays several moves like a GM, then one like a patzer? That was what I worked so hard to avoid.Rebel wrote:For each move the "Club Player" option in mine creates a random value for each root move between 0.00 - 2.00. The effect is surprising, it still plays its normal style but so now and then (2-3 times in a game) it blunders. Just like the average chess player.syzygy wrote:But isn't that the point here? The engine shouldn't be weakened down to completely random play, so don't make the random bonus / error term too high.bob wrote:I don't like just adding a fixed bonus for each root move. If you make the bonus too large, play becomes random...
Humans don't really play like that.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Score Inaccuracy: An Engine Weakening Algorithm
Slowing things down makes your approach sound more reasonable, because you have to reduce tactical acuity, and the only sensible way is to simply reduce the search depth. I also tune back (or eventually completely turn off) the other tricks in the search (LMR, null-move, forward pruning, which is another effective way of slowing things down.)emadsen wrote:First of all I'd like to thank all of you for your responses. It's quite amazing that I can ask an obscure technical question on this forum and within a couple days get advice from experts in the field. Many of you have been at this hobby for years, so I appreciate you taking some time to help a newcomer like myself.
Bob, I remember reading a post of yours a while back that explained this. It makes sense to me. Piece mobility is desired by an engine. Mobility = greater number of moves. Greater number of moves = greater chance a good move is selected, even with a random error applied. That's why I rejected the technique of added error to evaluation scores. Instead I add error only to root moves.If all you do is add some sort of random bias to your scores, and don't change anything else, you will have a "floor" on your performance that is much higher than you might imagine. Don Beal tested a program with a purely random eval and found it did not play foolishly at all.
Bob and Jon both mentioned the importance of slowing down the search. I have a parameter in MadChess called Evaluation Delay that does that. It enforces a delay every time the evaluation function is called. I have not spent any time calibrating this parameter to ELO. I'm hoping the combination of Score Inaccuracy and Evaluation Delay can create fairly believable weak chess personalities.And I also limit speed as Bob does: because at normal search speed you will get very high depths and if your score inaccuracy is not very large then the program will still find deep tactical shots that win material.
Ed, this sounds very similar to what I'm doing in my engine. I'll check out the personalities in ProDeo, maybe play a few games between ProDeo personalities and MadChess personalities. Though I'd have to weaken your engine much more than mine! ProDeo is much faster and stronger than my amateur engine.For each move the "Club Player" option in mine creates a random value for each root move between 0.00 - 2.00. The effect is surprising, it still plays its normal style but so now and then (2-3 times in a game) it blunders. Just like the average chess player.
-
- Posts: 10486
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Score Inaccuracy: An Engine Weakening Algorithm
I disagree that humans don't play like that when we talk about the first part(several moves like a GM).bob wrote:Don't you see the effect where it plays several moves like a GM, then one like a patzer? That was what I worked so hard to avoid.Rebel wrote:For each move the "Club Player" option in mine creates a random value for each root move between 0.00 - 2.00. The effect is surprising, it still plays its normal style but so now and then (2-3 times in a game) it blunders. Just like the average chess player.syzygy wrote:But isn't that the point here? The engine shouldn't be weakened down to completely random play, so don't make the random bonus / error term too high.bob wrote:I don't like just adding a fixed bonus for each root move. If you make the bonus too large, play becomes random...
Humans don't really play like that.
I think that majority of the moves that I play in games is at GM level.
The minority of the moves that I do not play at GM level is the reason that I have a rating near 2000 and not a rating near 2600.
Note that it is rare that I play in games even one move
like a stupid patzer.
I may do a tactical mistake that lose material or the game but usually it is not a mistake that the computer can see that it is a mistake based on one ply search.