I tried to implement aspiration windows in Drofa twice; both attempts were unsuccessful.
On the first try I still had a TT bug in the engine that made the aspiration search very bad (it basically flooded the TT with worthless entries).
On the second try the engine searched far fewer nodes, but it was still about -7 Elo.
I suppose the devil is in the details here: you have to get everything right for it to work.
I'll try one or two more times to implement this.
With practically every top engine using it, I am more or less sure it is a working technique, but a tricky one.
Are Aspiration Windows Worthless?
-
- Posts: 105
- Joined: Thu Jun 18, 2020 3:21 pm
- Location: Moscow
- Full name: Alexander Litov
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Are Aspiration Windows Worthless?
With pruning and reductions (especially LMR), that has long since gone out of the window anyway.
Rasmus Althoff
https://www.ct800.net
-
- Posts: 434
- Joined: Thu Apr 26, 2012 1:51 am
- Location: Oak Park, IL, USA
- Full name: Erik Madsen
Re: Are Aspiration Windows Worthless?
Is that a bad pun? Window, ha ha. Well, there never was correctness anyway, because that assumes a perfect static eval, which is only true for draw-by-rule, stalemate, and checkmate. I see your point though. I just don’t see any advantage in aspiration windows over what I already get from PVS, whereas I see a massive advantage in LMR.
My C# chess engine: https://www.madchess.net
-
- Posts: 433
- Joined: Fri Dec 16, 2016 11:04 am
- Location: France
- Full name: Richard Delorme
Re: Are Aspiration Windows Worthless?
I just ran some experiments that strongly disagree with your findings.
The aspiration windows algorithm is definitely worthwhile in my engines. I got the following results, with self-play using an SPRT stop condition:
- Dumb : +54.2 +/- 7.3 Elo (834 games)
- Amoeba : +131.6 +/- 12.2 Elo (260 games)
Because of self-play and SPRT, the Elo differences are probably exaggerated, but obviously significant.
Are you sure your implementation is correct and optimal?
Richard Delorme
-
- Posts: 1221
- Joined: Wed Mar 08, 2006 8:28 pm
- Location: Florida, USA
Re: Are Aspiration Windows Worthless?
I've never had any luck with aspiration windows for Maverick. I've always assumed it was down to a poorly tuned evaluation function and a generally weak-ish engine.
I've always thought that when searching the first move, the hash table would guide the search down the previous PV and the window would quickly close.
One point: do you have a fixed width for your window after each re-search, or are you gradually opening the window? For example (see the sketch after this post)...
First search: alpha = pv_score - 25; beta = pv_score + 25
Second search on fail high: alpha = pv_score - 25; beta = pv_score + 125
Third search on fail high: alpha = pv_score - 25; beta = pv_score + 300
Fourth search on fail high: alpha = pv_score - 25; beta = +inf
Best regards,
Steve
http://www.chessprogramming.net - Maverick Chess Engine
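A minimal sketch of the gradual-widening scheme Steve describes, under some assumptions: centipawn scores, an INF mate-bound sentinel, an existing root search(depth, alpha, beta), and the example schedule from the post. This is illustrative, not Maverick's actual code.

Code: Select all
#include <algorithm>

constexpr int INF = 32000;                       // assumed mate-bound sentinel
int search(int depth, int alpha, int beta);      // the engine's root search (assumed)

int aspiration(int depth, int pv_score) {
    const int widen[4] = {25, 125, 300, INF};    // example widening schedule
    int lo = 0, hi = 0;                          // widening stage per side
    for (;;) {
        const int alpha = std::max(pv_score - widen[lo], -INF);
        const int beta  = std::min(pv_score + widen[hi], +INF);
        const int score = search(depth, alpha, beta);
        if (score <= alpha && lo < 3)
            ++lo;                                // fail low: open alpha further
        else if (score >= beta && hi < 3)
            ++hi;                                // fail high: open beta further
        else
            return score;                        // score fell inside the window
    }
}

A different schedule, such as the +/- 100, 500, infinite mentioned later in the thread, is the same loop with a different widen table.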
-
- Posts: 10297
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Are Aspiration Windows Worthless?
abulmo2 wrote: ↑Fri Dec 25, 2020 1:43 am
I just ran some experiments that strongly disagree with your findings.
The aspiration windows algorithm is definitely worthwhile in my engines. I got the following results, with self-play using an SPRT stop condition:
- Dumb : +54.2 +/- 7.3 Elo (834 games)
- Amoeba : +131.6 +/- 12.2 Elo (260 games)
Because of self-play and SPRT, the Elo differences are probably exaggerated, but obviously significant.
Are you sure your implementation is correct and optimal?

If you want to test a rating difference then SPRT is not the right test; you need to use a fixed number of games.
-
- Posts: 434
- Joined: Thu Apr 26, 2012 1:51 am
- Location: Oak Park, IL, USA
- Full name: Erik Madsen
Re: Are Aspiration Windows Worthless?
Interesting. Thank you, Richard, for running these tests. I am always willing to admit the possibility that I screwed up something in my code.

Steve Maughan wrote: ↑Fri Dec 25, 2020 11:15 am
I've never had any luck with aspiration windows for Maverick. I've always assumed it was down to a poorly tuned evaluation function and a generally weak-ish engine... Do you have a fixed width for your window after each re-search, or are you gradually opening the window?

I tried gradually opening the window by +/- 25, 50, 100, 200, 500, etc., but eventually settled on +/- 100, 500, infinite.
My C# chess engine: https://www.madchess.net
-
- Posts: 433
- Joined: Fri Dec 16, 2016 11:04 am
- Location: France
- Full name: Richard Delorme
Re: Are Aspiration Windows Worthless?
Uri Blass wrote: ↑Fri Dec 25, 2020 6:28 pm
If you want to test a rating difference then SPRT is not the right test; you need to use a fixed number of games.

I ran a gauntlet test with Dumb (aspiration on/off) and found +50.9 Elo (+/- 12, 100 games × 19 opponents) in favour of the engine with the aspiration windows, so the result is on par with the SPRT.
Code: Select all
# PLAYER : RATING ERROR POINTS PLAYED (%)
1 DiscoCheck 3.7.1 : 2596.8 44.8 163.0 200 81.5%
2 Glaurung 2.2 : 2548.6 39.5 154.0 200 77.0%
3 arasan-15.6 : 2472.6 36.6 137.0 200 68.5%
4 Mini Rodent 1.0 : 2456.6 35.1 133.0 200 66.5%
5 Zappa 1.1 : 2437.2 35.5 128.0 200 64.0%
6 Cheese 1.7 64 bits : 2412.8 34.8 121.5 200 60.8%
7 Cyrano 0.6b17 : 2407.2 33.8 120.0 200 60.0%
8 Fruit 2.1 : 2389.1 33.8 115.0 200 57.5%
9 dumb-1.6 : 2361.3 12.2 1088.5 1900 57.3%
10 EXchess v6.50b : 2360.5 33.7 107.0 200 53.5%
11 Sloppy-0.2.2 : 2325.2 34.1 97.0 200 48.5%
12 dumb-1.6 no-AW : 2310.4 12.1 976.5 1900 51.4%
13 Fridolin 2.00 : 2253.3 34.7 77.0 200 38.5%
14 Pepito_v1.59 : 2211.1 36.8 66.0 200 33.0%
15 Yace Paderborn : 2188.7 37.3 60.5 200 30.2%
16 OliThink 5.3.2 : 2184.5 36.9 59.5 200 29.8%
17 amundsen : 2144.7 39.1 50.5 200 25.2%
18 Fruit 1.0 : 2135.3 39.4 48.5 200 24.2%
19 Jazz 501 : 2094.5 42.8 40.5 200 20.2%
20 phalanx : 2091.8 41.9 40.0 200 20.0%
21 beowulf : 1917.9 61.7 17.0 200 8.5%
White advantage = 0.00
Draw rate (equal opponents) = 50.00 %
Richard Delorme
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Are Aspiration Windows Worthless?
I tested Deuterium's aspwin; I had not tested it in a long time. The version with aspwin won.
TC 15s+100ms
Code: Select all
Score of Deuterium_aw vs Deuterium: 89 - 60 - 155 [0.548] 304
... Deuterium_aw playing White: 49 - 23 - 80 [0.586] 152
... Deuterium_aw playing Black: 40 - 37 - 75 [0.510] 152
... White vs Black: 86 - 63 - 155 [0.538] 304
Elo difference: 33.2 +/- 27.4, LOS: 99.1 %, DrawRatio: 51.0 %
The algo is simple: if there is an early sign of score instability, reset the bounds to their original values as early as possible.
It starts with alpha = -inf, beta = +inf.
* If it fails low, set alpha to score - 100 and beta to score; but if the score is already losing or winning, reset alpha/beta to -inf/+inf. Then re-search (meaning: repeat the previous iteration depth).
* If it fails high, set beta to score + 100 and alpha to score; but if the score is already losing or winning, reset alpha/beta to -inf/+inf. Then re-search.
* Otherwise, set alpha = score - 30 and beta = score + 30; no re-search, just continue with the next iteration depth.
Next:
* If it fails low and the previous result was also a fail (low or high), reset alpha/beta to -inf/+inf and re-search. The same applies if the score is already losing or winning.
* If it fails high and the previous result was also a fail (low or high), reset alpha/beta to -inf/+inf and re-search. The same applies if the score is already losing or winning.
So in summary: on successive fails (low/low or high/high) or alternating fails (high/low or low/high), reset alpha/beta to -inf/+inf. A code sketch of this follows below.
I call 100 the BadWindow and 30 the GoodWindow.
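A sketch of the reset scheme just described, with some assumptions: MATE_BOUND stands in for the "already losing or winning" threshold, search() for the engine's root search, and the driver loop is illustrative rather than Deuterium's actual code.

Code: Select all
#include <cstdlib>

constexpr int INF        = 32000;
constexpr int MATE_BOUND = 30000;            // assumed "losing or winning" threshold
constexpr int BadWindow  = 100;              // window used after a single fail
constexpr int GoodWindow = 30;               // window used after an in-window score

int search(int depth, int alpha, int beta);  // the engine's root search (assumed)

void iterate(int max_depth) {
    int alpha = -INF, beta = +INF;
    bool prev_fail = false;                  // did the previous search fail low/high?
    for (int depth = 1; depth <= max_depth; ) {
        const int score = search(depth, alpha, beta);
        const bool fail = (score <= alpha || score >= beta);
        if (fail && (prev_fail || std::abs(score) >= MATE_BOUND)) {
            alpha = -INF; beta = +INF;                // instability: full reset, re-search
        } else if (score <= alpha) {
            alpha = score - BadWindow; beta = score;  // fail low: re-search same depth
        } else if (score >= beta) {
            alpha = score; beta = score + BadWindow;  // fail high: re-search same depth
        } else {
            alpha = score - GoodWindow;               // in window: narrow the bounds
            beta  = score + GoodWindow;               // and continue with the next depth
            ++depth;
        }
        prev_fail = fail;
    }
}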
I am trying to tune these two params with the optuna optimizer, at 100 games per trial for 100 trials at TC 15s+50ms, to see if the optimizer can improve it.
Code: Select all
python -u tuner.py --study-name deu_aspwindow_opt --sampler name=tpe --engine ./engines/deuterium/deuterium_17.exe --initial-best-value 0.55 --concurrency 6 --opening-file ./start_opening/ogpt_chess_startpos.epd --opening-format epd --input-param "{'AspWindowGood': {'default':30, 'min':5, 'max':100, 'step':1}, 'AspWindowBad': {'default':100, 'min':5, 'max':200, 'step':1}}" --games-per-trial 100 --trials 100 --base-time-sec 15 --inc-time-sec 0.05 --pgn-output deu_aspwindow_opt.pgn --threshold-pruner result=0.25 --plot
The best param it could find so far after 6 trials, with a 53% score from a 100-game match against the default (init) param, is:

Code: Select all
2020-12-27 14:58:05,825 | INFO | init param: {'AspWindowBad': 100, 'AspWindowGood': 30}
2020-12-27 15:05:29,205 | INFO | study best param: {'AspWindowBad': 171, 'AspWindowGood': 42}
2020-12-27 15:05:29,206 | INFO | study best value: 0.53
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Are Aspiration Windows Worthless?
Ferdy wrote: ↑Sun Dec 27, 2020 8:20 am
Tested Deuterium's aspwin; have not tested it in a long time. The version with aspwin won. [...]

I stopped the optimization after 40 trials (it can be resumed). It came up with the best param below, found at the 17th trial.
Code: Select all
2020-12-27 19:11:54,617 | INFO | study best param: {'AspWindowBad': 32, 'AspWindowGood': 29}
2020-12-27 19:11:54,625 | INFO | study best value: 0.550625
2020-12-27 19:11:54,632 | INFO | study best trial number: 17
I ran a verification match of 1000 games at TC 15s+100ms. The optimized version, using {'AspWindowBad': 32, 'AspWindowGood': 29}, won by +7 games vs the default {'AspWindowBad': 100, 'AspWindowGood': 30}:
Code: Select all
Score of Deuterium_17_opt vs Deuterium_17: 219 - 212 - 569 [0.503] 1000
... Deuterium_17_opt playing White: 125 - 96 - 280 [0.529] 501
... Deuterium_17_opt playing Black: 94 - 116 - 289 [0.478] 499
... White vs Black: 241 - 190 - 569 [0.525] 1000
Elo difference: 2.4 +/- 14.1, LOS: 63.2 %, DrawRatio: 56.9 %