Progress on Rustic

mvanthoor · Post by **mvanthoor** » Tue May 25, 2021 10:18 pm

Mike Sherwin wrote: ↑Tue May 25, 2021 9:32 pm All these type things are wonky.

...

I know you probably do not want to hear about Bricabrac, but I'm going to tell you anyway. Currently as I work on the evaluation there are no reductions in the search. I removed them. The search is only pvs negamax with a couple of move ordering tricks. And there are no killers or history tables and to be clear no check extension either.

9 131 59 5488667 d2d4 d7d5 c1f4 b8d7 g1f3 g8f6 b1d2 e7e6 e2e3

I don't think that killers or history tables are going to be of much use in Bricabrac. But, I'll at some point try them anyway.

You always have a penchant for doing things differently. I'm not brave enough for that. Doing things differently would cost me even more time than writing this chess engine and its book/docs already does, not even speaking about testing.

Forget about my previous post.... I made a mistake. Again.

Because the killers need to be sorted below the captures (and later, history below the killers), I couldn't fit everything into one byte for the sort_score. So, I extended the sort_score field in the move to 2 bytes. I updated the documentation in the file. I even changed updated the output type of the function to u16. Then I forgot to change the AND-part from 0xFF (1 byte) to 0xFFFF (2 bytes), so the function DID output two bytes, but because of the 0xFF, it actually only contained 1 byte of information.

So except for the hash move sorting (which is the most important one, and always has the highest value), my entire move sorting was on the fritz. No wonder it didn't gain any Elo. I tested it by creating an empty move in the main() function, use set_score() and get_score(), and the values were different. After the fix, they're the same (as they obviously should be).

Retesting in new gauntlet tonight...

In cases like this, I wish I had made a record/struct out of my Move type instead of a 64-bit int. That would prevent bugs like this. I'm not even sure that this would be slower, when passing a reference to the move (64-bit pointer) instead of a 64-bit int to functions that need the move. It could be that the one level of pointer indirection costs (too much) speed, but I have no way to test this short of rewriting parts of the move handling.

Even though I'm going slowly with the development and checking things over and over again before testing, it's still extremely easy for oversights like this to happen. It just makes another point for the case that testing each and every function separately (and retest it if you change it) is the way to go to make a strong chess engine.

Even if this did not cause me to lose any strength, it's a new function that also didn't GAIN any strength, and would therefore have been useless.

mvanthoor · Post by **mvanthoor** » Mon May 31, 2021 8:49 pm

The last couple of nights several versions of the engines have been running tests.

Rustic Alpha 2 is the current version.
2.3.99 is Alpha 2 with two killer moves.
2.4.99 is Alpha 2 with two killer moves + history heuristic indexed by [piece][to], on alpha cuts
2.5.99 is Alpha 2 with two killer moves + history heuristic indexed by [side][piece][to], on both alpha and beta cuts

The results, without a TT, start position:

Code: Select all

Alpha 2, no TT, no killers, no history
info score cp 15 depth 7 seldepth 24 time 3532 nodes 33252202 nps 9414553
info score cp 5 depth 8 seldepth 26 time 29272 nodes 261511445 nps 8933843

Alpha 2.3.99, two killers, no history
info score cp 15 depth 7 seldepth 20 time 145 nodes 1108358 nps 7643848
info score cp 5 depth 8 seldepth 23 time 998 nodes 7191252 nps 7205663

Alpha 2.4.99, two killers, history by [piece][to] (on alpha cutoff)
info score cp 15 depth 7 seldepth 18 time 123 nodes 828468 nps 6735512
info score cp 5 depth 8 seldepth 23 time 885 nodes 5111010 nps 5775153

Alpha 2.5.99, two killlers, history by [side][piece][to] (on both alpha & beta cutoffs)
info score cp 15 depth 7 seldepth 16 time 106 nodes 736443 nps 6947575
info score cp 5 depth 8 seldepth 22 time 741 nodes 4449979 nps 6005370

The killer moves have a massive impact: almost as big as hash move sorting. (Maybe I should have put them into Alpha 1 already.) History heuristic has a noticeable improvement in speed, at least in the start position. The version with the [side][piece][to] indexing (which is basically a combination between what seems to be called "butterfly" history heuristic and "traditional" history heuristic) and increasing the heuristic on both alpha and beta cuts seems to be the fastest.

With a 32MB TT, mainly to be able to use the TT Move sorting:

Code: Select all

Alpha 2, 32MB TT, no killers, no history
info score cp 15 depth 7 seldepth 21 time 204 nodes 1631280 nps 7996471 hashfull 49
info score cp 5 depth 8 seldepth 23 time 598 nodes 4131456 nps 6908789 hashfull 148
info score cp 15 depth 9 seldepth 27 time 3677 nodes 26807077 nps 7290475 hashfull 737

Alpha 2.3.99, 32MB TT, two killers, no history
info score cp 15 depth 7 seldepth 18 time 57 nodes 375646 nps 6590281 hashfull 16
info score cp 5 depth 8 seldepth 21 time 258 nodes 1528494 nps 5924395 hashfull 79
info score cp 15 depth 9 seldepth 25 time 1101 nodes 7273534 nps 6606298 hashfull 307

Alpha 2.4.99, 32MB TT, two killers, history [piece][to] on alpha-cutoff
info score cp 15 depth 7 seldepth 18 time 56 nodes 337747 nps 6031196 hashfull 14
info score cp 5 depth 8 seldepth 22 time 296 nodes 1619808 nps 5472324 hashfull 83
info score cp 15 depth 9 seldepth 23 time 1397 nodes 7862014 nps 5627784 hashfull 355

Alpha 2.5.99, 32MB TT, two killers, history [piece][piece][to] on both alpha & beta cutoffs
info score cp 15 depth 7 seldepth 18 time 56 nodes 323766 nps 5781536 hashfull 14
info score cp 5 depth 8 seldepth 22 time 287 nodes 1537274 nps 5356355 hashfull 77
info score cp 20 depth 9 seldepth 23 time 1255 nodes 6846554 nps 5455422 hashfull 323

Killers still have a big influence, but less than without the TT. The history heuristic gets a bit wonky: it can cause a larger time to depth, while still searching less nodes; at least in the starting positions. Chess Programming Wiki says that in current-day algorithms, the history heuristic may be useless and to be considered "random noise".

I tested these version on a very fast 5s+0.1 time control, in gauntlets running for 5500 games, always against the same pool:

Code: Select all

# PLAYER                 :  RATING  POINTS  PLAYED   (%)
1 MinimalChess 0.4.1     :  2417.3  1215.5    2000    61
2 Rustic Alpha 2.5.99    :  2355.5  3273.5    5500    60
3 Rustic Alpha 2.4.99    :  2351.9  3247.5    5500    59
4 Rustic Alpha 2.3.99    :  2348.3  3221.0    5500    59
5 Chareth 0.1.0          :  2340.0   999.0    2000    50
6 CDrill Build 4         :  2314.4   926.5    2000    46
7 Clueless 1.4           :  2305.8   902.0    2000    45
8 Rustic Alpha 2         :  2305.5  2900.5    5500    53
9 Bit Genie 1.0.0        :  2299.9   885.5    2000    44
10 Wukong JS 1.5          :  2281.4   834.0    2000    42
11 Zahak 0.3.0            :  2278.9   827.0    2000    41
12 Loki 1.2.0             :  2274.5   815.0    2000    41
13 Pigeon 1.5.1           :  2263.2   784.0    2000    39
14 Zahak 0.2.1            :  2233.0   704.0    2000    35
15 Wukong JS 1.4          :  2130.4   465.0    2000    23

(Rustic performed better than expected; it seems lots of engines aren't really that good for superfast time controls. In a 2m+1s time control, Rustic is somewhere just above the middle of the list.)

The 2.5.99 version with the side-piece-to indexing, on both alpha and beta cuts has the highest Elo-rating. OK; it's only 7 elo compared to the version without any history heuristics. As this heuristic is a last ditch effort to SOMEHOW sort moves if it had a cut ANYWHERE, that's not surprising. I didn't yet run a test for 2.5.99 on a longer time control, but I don't expect it to gain much Elo. Version 2.4.99 only gained 2 Elo.

Because the history heuristic seems to have a positive impact if the transposition table is disabled, or the when the time control is very fast, I'll leave it in for now, but I could see a point where I'll remove it from the engine.

The current development version of Rustic is about 40 Elo (+/- 10, depending on the opponents and how the list is calibrated) stronger than Alpha 2. I don't really think that's enough for a release, so I'll first implement aspiration windows, and PVS.

By the way: the killers and history information are in the SearchInfo struct, so they will be reset on each new search. I've tried to move both one level up into the Search struct, but because the search is already designed to be(come) multi-threaded, I'd have to put the heuristics in a Mutex.

First: having the heuristics available for the entire game without resetting dit not bring a significant reduction in node count compared to the resetting version. (On some positions, the node-count even was worse...)
Second: having to lock and unlock the mutex cost so much time that any positive change with regard to the availability of the heuristics was completely obliterated. (To be honest: as expected.) I have nuked this branch and I'm not even going to test it. A time-to-depth that is about 75% longer than the previous version can't be any good.

===

PS: I also updated a few pages in Rustic's book/documentation. The beginning with the technical stuff has been made.

mvanthoor · Post by **mvanthoor** » Mon May 31, 2021 9:35 pm

The test for the aspiration windows is also running; currently the window is being closed by 50cp per iteration.

In the KiwiPete position, it reduced the node count somewhat, so I've set this version to testing. I don't know if a super-fast time control is useful for features like this, that have to do re-searches if something goes wrong. Maybe I'll retest this version at a 1m+0.6s time control as well.

I remember Eric Madsen's post where he says that aspiration windows are useless, and he removed them from MadChess.

It seems that from the three features I've now added (killer moves, history, and aspiration windows), only killer moves are clearly convincing from the beginning. Without a TT, they have a HUGE impact; as said, almost as large as TT-move sorting. Even with the TT active, they gain at least another 30-40 Elo.

History heuristic seems to be OK without the TT, but with TT, the result is marginal, by gaining only about 7 Elo, and only on very fast time controls. It feels as if aspiration windows are in the same class. We'll see. I'll document both features, but they might end up being removed if they only prove to be beneficial at super-fast time controls.

Henk · Post by **Henk** » Mon May 31, 2021 9:42 pm

Can you show us a win against Stockfish?

mvanthoor · Post by **mvanthoor** » Mon May 31, 2021 9:53 pm

Henk wrote: ↑Mon May 31, 2021 9:42 pm Can you show us a win against Stockfish?

No. Stockfish isn't in my pool yet. I still have few Elo-points to go, but the gap is getting smaller every day. Rustic is catching up to the old fish!

mvanthoor · Post by **mvanthoor** » Mon May 31, 2021 10:38 pm

I may have underestimated the aspiration window. If the current test run holds (at 5s+0.1s), this could be a large Elo-boost, at least at slow time controls. Then I'll have to retest this version in the 1m+0.6 gauntlet.

I'm slowly moving towards a triple-test system:

SPRT old vs. new. If this holds...
5s+0.1s gauntlet against at least 10 engines (5000 games). If this holds
1m+0.6 minute gauntlet

That should give me a fair estimate of Rustic's progress, and be able to cut short tests that don't gain any strength earlier.

mvanthoor · Post by **mvanthoor** » Wed Jun 02, 2021 12:11 am

Rustic is running the 1m+0.6 with aspiration window this evening and through the night. Currently the score is at +37 against the field (average elo: 1840). I find it strange how the engine field shuffles around when I change my engine. Some engines don't give a damn if I make my engine faster/search deeper (MinimalChess 0.4.1: +100 Elo against Rustic dev), other engines completely tank compared to previous versions of Rustic (WukongJS 1.5, Zahak 0.2.1 and 0.3.0). Some engines even perform _better_ ?! (Bit-Genie 1.0.0)

I wonder what the final rating will be tomorrow. In the hyperfast time control of 5s+0.1, the gain by the aspiration window was +20 Elo. In that list, the current dev version of Rustic is +70 over Alpha 2.

Then PVS, and the four remaining basic features are in. Then I can start on the tapered/tuned evaluation. (And somewhere down the line/in between versions: add XBoard, then rewrite the Comm interface to consolidate some stuff, and then improve time management.)

This evening I wrote some updates in Rustic's docs, mainly on move ordering. I'd be glad if people would check out that stuff and tell me if anything could be improved/is horribly wrong. Thanks

emadsen · Post by **emadsen** » Wed Jun 02, 2021 1:31 am

mvanthoor wrote: ↑Mon May 31, 2021 9:35 pm I remember Eric Madsen's post where he says that aspiration windows are useless, and he removed them from MadChess.

My engine had PVS and aspiration windows. I removed aspiration windows (except for MultiPV > 1 and limit strength mode) and the engine plays slightly stronger. In other words, I found in my engine PVS is good enough on its own.

If I understand your post correctly, you've added aspiration windows but don't yet have PVS. I'm guessing you'll gain strength when you add PVS (so aspiration windows + PVS). However, I wonder at that point, if you disable aspiration windows (leaving only PVS), does your engine play stronger than when both are enabled?

When I posted my question a few months ago most engine authors said yes, AW + PVS together were strongest. But a few of us had found PVS alone is strongest in our engines.

mvanthoor · Post by **mvanthoor** » Wed Jun 02, 2021 9:55 am

emadsen wrote: ↑Wed Jun 02, 2021 1:31 am If I understand your post correctly, you've added aspiration windows but don't yet have PVS...

Does your engine play stronger than when both are enabled?

Yes, only aspiration windows, no PVS yet. I'll test without aspiration windows and without history later.

The slow time control for the aspiration windows had a disappointing result. Just as the history, it didn't gain any Elo in that list.

Even though I can see the engine search deeper than before, it's not getting stronger on the 1m+0.6s time control.

Alpha 2 would search 7-8 ply in the middle game. The current dev-version searches 7-10 ply in the middle game. It seems the extra depth doesn't have any impact if you only have PST's enabled. On the super-fast time control, the extra 1-2 ply has much more impact because the search depth is lower. In short, I think Rustic is becoming capable of making poor-mediocre decisions faster and faster, and it's limited by its evaluation function.

I think I'm going to reverse course on this: first do the eval tuning, release _that_ as Alpha 3, and then re-add and re-test killers, history, aspiration windows, and PVS, so the faster search has a better evaluation to work with.

emadsen · Post by **emadsen** » Wed Jun 02, 2021 8:47 pm

mvanthoor wrote: ↑Wed Jun 02, 2021 9:55 am Yes, only aspiration windows, no PVS yet.

PVS should provide a significant speedup and strength increase because it narrows everything but the principal variation to a zero-width search window. Though I don't have a specific measurement. I believe the first release of my engine had PVS so I'm uncertain of its contribution to playing strength.

mvanthoor wrote: ↑Wed Jun 02, 2021 9:55 am The slow time control for the aspiration windows had a disappointing result. Just as the history, it didn't gain any Elo in that list.

History is very important for LMR. You could argue the entire point of tracking history scores is to order moves for LMR. Not quite true- even without LMR, history should help find quiet moves that cause beta cutoffs. But it really shines when it's paired with LMR.

mvanthoor wrote: ↑Wed Jun 02, 2021 9:55 am I think I'm going to reverse course on this: first do the eval tuning, release _that_ as Alpha 3, and then re-add and re-test killers, history, aspiration windows, and PVS, so the faster search has a better evaluation to work with.

Once you have sensible PSTs, each of the above features should improve playing strength. Eh, AW maybe yes, maybe no. History should be paired with LMR.

Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic

Re: Progress on Rustic