How much ELO should I expect to gain from killer moves?

mvanthoor · Post by **mvanthoor** » Fri Jul 16, 2021 9:24 pm

lithander wrote: ↑Fri Jul 16, 2021 6:35 pm You can do that quite easily. The part where you compute the error of *each* of the thousands or millions of positions is easily parallelized.

I'm counting on Rust's Rayon library to do it for me, even
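The reason this parallelizes so easily is that the total error is just a sum of independent per-position terms. Here is a minimal sketch of that split-and-sum structure in Python. The sigmoid form, the scaling constant `k`, and all function names are assumptions for illustration and not taken from any engine in this thread. Python threads will not speed up CPU-bound work, so a real tuner would use processes or, in Rust, something like Rayon's `par_iter`.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def sigmoid(eval_cp, k):
    """Map a centipawn score to an expected game result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * eval_cp / 400.0))

def chunk_error(chunk, k):
    """Sum of squared errors for one chunk of (eval_cp, result) pairs."""
    return sum((result - sigmoid(eval_cp, k)) ** 2 for eval_cp, result in chunk)

def mean_squared_error(positions, k, workers=4):
    """Score each chunk independently, then add up the partial sums.

    Assumes `positions` is a non-empty list of (eval_cp, result) pairs.
    """
    size = max(1, math.ceil(len(positions) / workers))
    chunks = [positions[i:i + size] for i in range(0, len(positions), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(lambda c: chunk_error(c, k), chunks))
    return total / len(positions)
```

No chunk depends on any other, so the only shared step is the final sum.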

At what point in development do you plan adding more evaluation terms? Maybe you can use a simple tuner for now and upgrade it once you need the speed? (of course you can do it properly right from the start but then I won't be able to provide much help)

One term I'd like to do before anything else is a mobility term. Version 5.0.0 will have null move, which isn't a lot of work to implement, so possibly that term will be in there already. Next would be a pawn hash table with pawn structure evaluation (open and semi-open files, doubled, tripled, isolated, hanging and backward pawns, passers... that will probably be version 6.0.0).

algerbrex · Post by **algerbrex** » Fri Jul 16, 2021 11:47 pm

Jakob Progsch wrote: ↑Fri Jul 16, 2021 8:57 pm Why is everyone writing their own tuner anyway? Just export the positions/features and scores and throw them at tensorflow or so. After all a PSQT is just a tiny single layer network and using a well optimized framework will have those converge within minutes instead of the hours people often quote for their DIY tuners?
Using these python based frameworks also makes for very convenient experimentation when adding more terms etc.

Interesting, I never thought of that. I've also been tempted to ditch PSQT and experiment with a NN approach to creating an evaluation function. Of course there are complications and whatnot, but I'm getting curious as to how good an engine I could make by using a machine learning approach and training it on data from grandmaster games and whatnot.
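To make the quoted "a PSQT is just a tiny single layer network" remark concrete, here is a rough sketch in plain Python: a weighted sum of features fed through a sigmoid and fitted by gradient descent on the squared error. The constant `K`, the learning rate, and the toy features are assumptions picked for illustration. In a real setup the features would be piece-on-square indicators and a framework would do the optimization.

```python
import math

K = 1.0 / 400.0  # assumed scaling constant; normally fitted to the data

def predict(weights, features):
    """Single layer: weighted sum of features, squashed to an expected result."""
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + 10.0 ** (-K * score))

def loss(weights, samples):
    """Mean squared error between predicted and actual game results."""
    return sum((r - predict(weights, f)) ** 2 for f, r in samples) / len(samples)

def train(weights, samples, learning_rate=2000.0, epochs=200):
    """Plain stochastic gradient descent on the squared error."""
    weights = list(weights)
    for _ in range(epochs):
        for features, result in samples:
            p = predict(weights, features)
            # d/dw of (result - p)^2, using dp/dscore = p * (1 - p) * ln(10) * K
            step = 2.0 * (result - p) * p * (1.0 - p) * math.log(10.0) * K
            for i, f in enumerate(features):
                weights[i] += learning_rate * step * f
    return weights
```

Each sample is a `(features, result)` pair, with the result given as 1.0, 0.5 or 0.0 from White's point of view.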

algerbrex · Post by **algerbrex** » Sat Jul 17, 2021 12:03 am

mvanthoor wrote: ↑Fri Jul 16, 2021 12:46 pm
Rustic gained about 56 Elo from the killer move feature, in self-play:

https://rustic-chess.org/progress/sprt_results.html

So since Lithander recommended I test more games, I'm running 2500+ games with a 1s + 0.08s time control, and so far it seems that killer moves aren't actually increasing the ELO. Either that, or I'm misunderstanding some aspect of testing. Here are the current results:

Rank Name                          Elo     +/-   Games   Score    Draw 
   1 MinimalChess 2.0               52      11    1772   57.4%   53.4% 
   2 Blunder 1.0.0                 -24      11    1772   46.6%   56.5% 
   3 Blunder 1.1.0                 -28      11    1772   46.0%   57.6% 

2658 of 6000 games finished.

I'm wondering whether or not I implemented killer moves incorrectly, so I'll go back and check that, but from all of my previous testing, that didn't seem to be the case.

And I'm probably not going to finish running 6000, as I'd imagine ~2500+ games would be enough.
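For reading a table like the one above, it can help to see how a score percentage maps to an Elo difference under the usual logistic model. This is a small sketch with a hypothetical function name. The testing tool may compute its numbers slightly differently, but this formula reproduces the Elo column from the Score column.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Score column from the table above
for name, score in [("MinimalChess 2.0", 0.574),
                    ("Blunder 1.0.0", 0.466),
                    ("Blunder 1.1.0", 0.460)]:
    print(f"{name:18s} {elo_from_score(score):+6.1f}")
```

Rounded, these come out at +52, -24 and -28. The two Blunder versions are 4 Elo apart with an error margin of 11 each, so this run cannot separate them in either direction yet.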

mvanthoor · Post by **mvanthoor** » Sat Jul 17, 2021 12:24 am

Is your code somewhere online?

abulmo2 · Post by **abulmo2** » Sat Jul 17, 2021 12:43 am

algerbrex wrote: ↑Fri Jul 16, 2021 3:14 am What kind of ELO gain should I be expecting with killer moves added? (I know every engine is different, I'm just more so asking generally). And depending on the minuteness of the ELO gain, how many games would I need to run to see a result?

When I remove a feature from Dumb, I get the following Elo differences:

   # PLAYER                        : RATING  ERROR   POINTS  PLAYED    (%)
   1 dumb-1.9-dev                  :    0.0    4.8   6970.0   11000   63.4%
   2 dumb-1.9-dev-no_razoring      :   -1.5    4.7   6947.5   11000   63.2%
   3 dumb-1.9-dev-no_fnco          :   -7.7    5.0   6853.0   11000   62.3%
   4 dumb-1.9-dev-no_killer        :  -10.3    4.5   6814.5   11000   62.0%
   5 dumb-1.9-dev-no_lmp           :  -43.9    4.7   6295.0   11000   57.2%
   6 dumb-1.9-dev-no_see           :  -51.3    4.6   6179.5   11000   56.2%
   7 dumb-1.9-dev-no_aspiration    :  -69.9    4.9   5885.5   11000   53.5%
   8 dumb-1.9-dev-no_nullmove      :  -84.5    4.6   5655.5   11000   51.4%
   9 dumb-1.9-dev-no_history       : -164.0    4.7   4413.5   11000   40.1%
  10 dumb-1.9-dev-no_lmr           : -174.5    4.8   4254.5   11000   38.7%
  11 dumb-1.9-dev-hash_bmo         : -210.0    4.9   3732.5   11000   33.9%
  12 dumb-1.9-dev-no_hash          : -348.7    5.8   1999.0   11000   18.2%
 * fnco = frontier node cut off, bmo = best move only, hash = transposition table, see = Static Exchange Evaluation, lmp = late move pruning, lmr = late move reduction.

For killer moves, you should see a bigger improvement in an engine that does not already try the transposition table best move first, as there is some redundancy between the two.
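On the quoted question of how many games are needed: a rough estimate is possible if you assume two nearly equal engines and a known draw ratio. The sketch below linearizes the Elo curve around a 50% score. The function names and the 95% confidence factor `z=1.96` are assumptions, and real testing tools may use a different error model.

```python
import math

# Slope of the Elo curve at a 50% score: 400 / (ln(10) * 0.5 * 0.5)
ELO_PER_SCORE = 1600.0 / math.log(10.0)

def elo_margin(games, draw_ratio, z=1.96):
    """Approximate +/- Elo margin after `games` games between near-equal engines."""
    variance = (1.0 - draw_ratio) / 4.0  # per-game score variance at a 50% score
    return z * math.sqrt(variance / games) * ELO_PER_SCORE

def games_needed(elo_diff, draw_ratio, z=1.96):
    """Approximate number of games before the margin shrinks to `elo_diff`."""
    per_game = z * math.sqrt((1.0 - draw_ratio) / 4.0) * ELO_PER_SCORE
    return math.ceil((per_game / elo_diff) ** 2)
```

With about 55% draws, `elo_margin(1772, 0.55)` comes out close to the ±11 reported earlier in the thread. Under the same assumptions, getting the margin down to 10 Elo takes roughly 2,100 games, and halving the margin takes about four times as many.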

algerbrex · Post by **algerbrex** » Sat Jul 17, 2021 3:17 am

mvanthoor wrote: ↑Sat Jul 17, 2021 12:24 am Is your code somewhere online?

Nope, but here's the code on GitHub now:

lithander · Post by **lithander** » Sat Jul 17, 2021 12:33 pm

Jakob Progsch wrote: ↑Fri Jul 16, 2021 8:57 pm Why is everyone writing their own tuner anyway? Just export the positions/features and scores and throw them at tensorflow or so. After all a PSQT is just a tiny single layer network and using a well optimized framework will have those converge within minutes instead of the hours people often quote for their DIY tuners?
Using these python based frameworks also makes for very convenient experimentation when adding more terms etc.

Why is everyone writing their own engines anyway?

I wanted to understand how it works before handing it off to a black box (for me) like tensorflow or Matlab/Octave. Also, I experimented with the implementation details, and that seemed to make a difference in the quality of the PSTs. For example, when calculating the error I suppressed the influence of positions that did not clearly fit the endgame or midgame table. But you're right. The not-invented-here syndrome comes to mind, of course.

algerbrex wrote: ↑Fri Jul 16, 2021 3:14 am What kind of ELO gain should I be expecting with killer moves added? (I know every engine is different, I'm just more so asking generally). And depending on the minuteness of the ELO gain, how many games would I need to run to see a result?

For me the main gain was that I'm using staged move generation and having a set of 4 killer moves I could try would often save me from generating all the non-captures which saved some time.
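A minimal sketch of that staged idea, written as a Python generator so that later stages only run if the search actually asks for more moves. The helper names (`gen_captures`, `gen_quiets`, `is_playable`) are hypothetical stand-ins for engine functions and are not taken from MinimalChess.

```python
def staged_moves(tt_move, killers, gen_captures, gen_quiets, is_playable):
    """Yield moves stage by stage; quiets are generated only if still needed."""
    seen = set()

    # Stage 1: hash move, if there is one
    if tt_move is not None and is_playable(tt_move):
        seen.add(tt_move)
        yield tt_move

    # Stage 2: captures
    for move in gen_captures():
        if move not in seen:
            seen.add(move)
            yield move

    # Stage 3: killers, tried before any quiet move has been generated
    for killer in killers:
        if killer is not None and killer not in seen and is_playable(killer):
            seen.add(killer)
            yield killer

    # Stage 4: remaining quiet moves, reached only without an earlier cutoff
    for move in gen_quiets():
        if move not in seen:
            yield move
```

If a killer causes the cutoff, the search stops pulling moves and `gen_quiets` is never called, which is where the time saving comes from.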

abulmo2 wrote: ↑Sat Jul 17, 2021 12:43 am When I remove a feature from Dumb, I get the following Elo differences:

   # PLAYER                        : RATING  ERROR   POINTS  PLAYED    (%)
   1 dumb-1.9-dev                  :    0.0    4.8   6970.0   11000   63.4%
   2 dumb-1.9-dev-no_razoring      :   -1.5    4.7   6947.5   11000   63.2%
   3 dumb-1.9-dev-no_fnco          :   -7.7    5.0   6853.0   11000   62.3%
   4 dumb-1.9-dev-no_killer        :  -10.3    4.5   6814.5   11000   62.0%
   5 dumb-1.9-dev-no_lmp           :  -43.9    4.7   6295.0   11000   57.2%
   6 dumb-1.9-dev-no_see           :  -51.3    4.6   6179.5   11000   56.2%
   7 dumb-1.9-dev-no_aspiration    :  -69.9    4.9   5885.5   11000   53.5%
   8 dumb-1.9-dev-no_nullmove      :  -84.5    4.6   5655.5   11000   51.4%
   9 dumb-1.9-dev-no_history       : -164.0    4.7   4413.5   11000   40.1%
  10 dumb-1.9-dev-no_lmr           : -174.5    4.8   4254.5   11000   38.7%
  11 dumb-1.9-dev-hash_bmo         : -210.0    4.9   3732.5   11000   33.9%
  12 dumb-1.9-dev-no_hash          : -348.7    5.8   1999.0   11000   18.2%
 * fnco = frontier node cut off, bmo = best move only, hash = transposition table, see = Static Exchange Evaluation, lmp = late move pruning, lmr = late move reduction.

That's interesting. LMR and History both seem like huge features in that list. But is the fact that removing either one of them (LMR or History) causes your engine to lose 160+ ELO maybe because they depend on each other and you break some kind of synergy between them? Or is each worth 160+ ELO in isolation?

Because when I tried history moves (never tried LMR) in isolation it didn't help me much and I removed it again because it didn't seem worth the added complexity in an engine that aims to stay simple. But when you want to reduce some "late" moves then doing that based on their history value is probably a good idea. Do you do it like that?

mvanthoor · Post by **mvanthoor** » Sat Jul 17, 2021 4:08 pm

lithander wrote: ↑Sat Jul 17, 2021 12:33 pm That's interesting. LMR and History both seem like huge features in that list. But is the fact that removing either one of them (LMR or History) causes your engine to lose 160+ ELO maybe because they depend on each other and you break some kind of synergy between them? Or is each worth 160+ ELO in isolation?

I think it's the synergy thing. LMR reduces not-so-good moves, and the history improves move ordering, so the engine doesn't accidentally reduce move X instead of Y, if X has a better history and is thus ordered higher.

So, if you have no history, or a bad history implementation, LMR doesn't work well, or at least not as well as it could.
And if you don't have LMR, just doing the tiny move ordering optimization by history doesn't gain a lot either.

But if LMR specifically uses the extra information from the history ordering, then they go together very well.
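One way the two could be tied together is sketched below: later moves get a larger reduction, and a move with a good history score gets one ply of it back. The formula, the thresholds, and `history_max` are assumptions for illustration only. This is not how Dumb or Rustic does it, and real engines tune these values.

```python
import math

def lmr_reduction(depth, move_index, history_score, history_max=1000):
    """Plies to reduce a late quiet move by (0 means search at full depth)."""
    # Never reduce near the leaves or for the first few moves in the ordering
    if depth < 3 or move_index < 3:
        return 0
    # Base reduction grows slowly with both depth and move number
    reduction = int(0.75 + math.log(depth) * math.log(move_index + 1) / 2.25)
    # A move that often caused cutoffs before is reduced one ply less
    if history_score > history_max // 2:
        reduction -= 1
    # Keep at least two plies of real search
    return max(0, min(reduction, depth - 2))
```

Without a history table, `history_score` is always zero and every late move is reduced by the same amount, so there is nothing to protect a good quiet move that happens to sit late in the list.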

mvanthoor · Post by **mvanthoor** » Sat Jul 17, 2021 5:40 pm

algerbrex wrote: ↑Sat Jul 17, 2021 3:17 am Nope, but here's the code on GitHub now:

You store the killer move in the beta-cutoff, which is correct. You also make sure they are unique, which is correct as well. What I don't see is you testing whether the killer move is a quiet move. You should only store a move as a killer move if it's not a capture.

The reason is that the PV-move / TT-move is ordered on top, all captures are already ordered behind that, and then there are unordered quiet moves that aren't captures. If one of those moves caused the beta-cutoff, it's a killer; so it's ordered right behind all the captures. (Later this can be refined by ordering it right behind the good captures, before neutral captures and bad captures.)
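Put together, the storing rule and the ordering described above might look like the following sketch. The two-slot table, the class and function names, and the score constants are assumptions for illustration. This is not Blunder's or Rustic's actual code.

```python
MAX_PLY = 64
KILLERS_PER_PLY = 2  # assumed slot count

class KillerTable:
    def __init__(self):
        self.slots = [[None] * KILLERS_PER_PLY for _ in range(MAX_PLY)]

    def store(self, ply, move, is_capture):
        """Call on a beta-cutoff; only quiet moves become killers."""
        if is_capture:
            return  # captures are already ordered by the capture stage
        slot = self.slots[ply]
        if slot[0] == move:
            return  # already the primary killer, keep the slots unique
        slot[1] = slot[0]
        slot[0] = move

def order_score(move, ply, tt_move, is_capture, mvv_lva, killers):
    """Higher is searched first: TT move, then captures, then killers, then quiets."""
    if move == tt_move:
        return 1_000_000
    if is_capture:
        return 100_000 + mvv_lva
    if move in killers.slots[ply]:
        return 90_000 - killers.slots[ply].index(move)
    return 0
```

In the search, `store` would be called at the point where `score >= beta` for the move just searched, using the current ply as the index.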

mvanthoor · Post by **mvanthoor** » Sat Jul 17, 2021 9:53 pm

The one thing I don't understand is that I've seen several engines in the past few weeks, claiming to get +70 Elo from aspiration windows, while I've also seen engines that claim to get nothing from it. I've been experimenting with those (on top of both fail-hard and fail-soft alpha/beta), and the results are, at this point, that they _might_ make a positive difference, but probably not more than +20 Elo, when on top of a fail-soft alpha/beta.
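For reference, here is the basic shape of an aspiration window as a small sketch: search with a narrow window around the previous iteration's score and re-search with an open bound if the result falls outside it. The window size of 25 and the `search(depth, alpha, beta)` callback are assumptions, and many engines widen the window gradually instead of opening it fully.

```python
INF = 1_000_000

def aspiration_search(search, depth, prev_score, window=25):
    """Run `search` with a narrow window, re-searching on fail-low or fail-high.

    Assumes `search` returns a score strictly between -INF and INF.
    """
    alpha, beta = prev_score - window, prev_score + window
    while True:
        score = search(depth, alpha, beta)
        if score <= alpha:
            alpha = -INF   # fail-low: open the lower bound and try again
        elif score >= beta:
            beta = INF     # fail-high: open the upper bound and try again
        else:
            return score   # score is inside the window, so it is exact
```

The gain depends on how often the first narrow search succeeds. Every re-search costs time, which may be part of why reported results vary so much between engines.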
