Can principal variation search be worth no Elo?

mvanthoor · Post by **mvanthoor** » Wed Sep 22, 2021 12:18 pm

R. Tomasi wrote: ↑Sun Sep 19, 2021 1:09 am Assuming you're using iterative deepening and order TT moves first, then that should actually help with move ordering, I would think.

It does. I've refactored my alpha-beta function to be fail-soft. Now it also only has one point where it saves flags, values, and best move (if any) to the TT. This is just after the move loop. The move loop now breaks instead of returns on a bèta cutoff. This refactor didn't gain any strength, but the alpha-beta function is now much cleaner and more readable. Previously, it was a weird mix of fail-hard and fail-soft; it did everything you'd do for fail-soft, and then eventually perform a fail-hard, by returning alpha or beta, instead of best_score. That was a bit confusing.

But to be honest, that's a different topic.

mvanthoor · Post by **mvanthoor** » Wed Sep 22, 2021 8:26 pm

algerbrex wrote: ↑Wed Sep 22, 2021 5:41 am I think while I continue to debug this issue, I'll post updates here, and hopefully, they'll be useful to future newcomers like me.
...

I'll have a look at it myself as well. I've seen a lot of things in Blunder that are coded similar to how I did it in Rustic, (including the ASCII art at the console

), and as this is something that is already implemented in my own engine, I think I can look at the code without being influenced.

Because PVS gets enhanced by the TT (which is beneficial for the researches), I looked through that code first. I noticed a small difference with my code: the mate handling is different. For example, when storing:

Yours:

Code: Select all

	if score > Checkmate {
		score += int16(ply)
	}

	if score < -Checkmate {
		score -= int16(ply)
	}

Mine:

Code: Select all

	if score > Checkmate {
		score += int16(ply)
	}

	if score < Checkmate {
		score -= int16(ply)
	}

(Notice the missing minus sign in my code.)

After thinking about it, I decided that this was a bug in my engine: because of the missing minus sign, it may play into a variation, not seeing that it will be mated (because minus Checkmate is you being mated, and forgetting the minus sign results in the TT returning the wrong value).

This can happen in a small number of games, where the engine can play into a variation in which it will lose. So I changed this in my development version and started a test. At the moment, it's still running:

Code: Select all

Score of Rustic Alpha 3.13.100 vs Rustic Alpha 3.12.100: 574 - 489 - 416 [0.529]
...      Rustic Alpha 3.13.100 playing White: 331 - 189 - 220  [0.596] 740
...      Rustic Alpha 3.13.100 playing Black: 243 - 300 - 196  [0.461] 739
...      White vs Black: 631 - 432 - 416  [0.567] 1479
Elo difference: 20.0 +/- 15.0, LOS: 99.5 %, DrawRatio: 28.1 %
1479 of 2000 games finished.

A +20 Elo difference, already outside the margin of error. So I'm gaining 5-35 Elo, as it stands now. (I would need to run 5000 games or so to be completely sure.)

So... by looking into _your_ problem, I discovered a bug in my own engine, which could potentially gain +35 Elo when fixed. So the mate handling seems to NOT be the problem in your engine... and I think I should put you in my engine's credits even though you didn't actually "do" anything

I'll clone the Git repo and take a look at the rest of the code shortly.

algerbrex · Post by **algerbrex** » Wed Sep 22, 2021 9:07 pm

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm I'll have a look at it myself as well. I've seen a lot of things in Blunder that are coded similar to how I did it in Rustic, (including the ASCII art at the console ), and as this is something that is already implemented in my own engine, I think I can look at the code without being influenced.

Thanks, Marcel, a second pair of eyes is always much appreciated! And you and the author of Little Wing indeed influenced me to spice up my engine's CLI with some ASCII art

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm Because PVS gets enhanced by the TT (which is beneficial for the researches), I looked through that code first. I noticed a small difference with my code: the mate handling is different. For example, when storing:

Yours:
Code: Select all
	if score > Checkmate {
		score += int16(ply)
	}

	if score < -Checkmate {
		score -= int16(ply)
	}
Mine:
Code: Select all
	if score > Checkmate {
		score += int16(ply)
	}

	if score < Checkmate {
		score -= int16(ply)
	}
(Notice the missing minus sign in my code.)

After thinking about it, I decided that this was a bug in my engine: because of the missing minus sign, it may play into a variation, not seeing that it will be mated (because minus Checkmate is you being mated, and forgetting the minus sign results in the TT returning the wrong value).

This can happen in a small number of games, where the engine can play into a variation in which it will lose.

Yes, that was basically my reasoning. I wanted to be sure that I adjusted the mate scores correctly, even when being mated. That was my basic reasoning, but I never actually tested to see whether it was strictly necessary, but it made sense to me, and doing otherwise seemed like it'd be wrong.

I remember when I was doing a sanity check against other engines to make sure my TT code made sense, I was you were missing a minus sign, and I couldn't quite figure out why, and it bugged me. So thanks for the rationale.

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm So I changed this in my development version and started a test. At the moment, it's still running:
Code: Select all
Score of Rustic Alpha 3.13.100 vs Rustic Alpha 3.12.100: 574 - 489 - 416 [0.529]
...      Rustic Alpha 3.13.100 playing White: 331 - 189 - 220  [0.596] 740
...      Rustic Alpha 3.13.100 playing Black: 243 - 300 - 196  [0.461] 739
...      White vs Black: 631 - 432 - 416  [0.567] 1479
Elo difference: 20.0 +/- 15.0, LOS: 99.5 %, DrawRatio: 28.1 %
1479 of 2000 games finished.
A +20 Elo difference, already outside the margin of error. So I'm gaining 5-35 Elo, as it stands now. (I would need to run 5000 games or so to be completely sure.)

So... by looking into _your_ problem, I discovered a bug in my own engine, which could potentially gain +35 Elo when fixed. So the mate handling seems to NOT be the problem in your engine... and I think I should put you in my engine's credits even though you didn't actually "do" anything

I'll clone the Git repo and take a look at the rest of the code shortly.

That's excellent to hear! I'm very glad that Blunder's code helped improve your engine, even if it was pretty indirect

And thanks again for taking a look at the code.

With that said, I very recently just discovered a pretty embarrassing bug. This code from here:

size, err := strconv.Atoi(value)
if err != nil {
search.TransTable.Unitialize()
search.TransTable.Resize(uint64(size))
}

Should be:

algerbrex · Post by **algerbrex** » Wed Sep 22, 2021 9:23 pm

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm I'll have a look at it myself as well. I've seen a lot of things in Blunder that are coded similar to how I did it in Rustic, (including the ASCII art at the console ),

Thanks, Marcel, another pair of eyes is always appreciated! And yes, you and the Author of Little Wing inspired me to spice things up with some ASCII art

.

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm and as this is something that is already implemented in my own engine, I think I can look at the code without being influenced.

Good, glad that won't be a problem, as this is always something I've worried about myself. I mentioned elsewhere in a previous thread that I usually try to only take a close look at other people's engines only as a sanity check against the code I've written, and leave the details up to myself. Or if I'm completely lost on a certain idea, I'll try to just skim over the code and get a rough outline of what implementing something would look like. I've felt like this has been a nice mixture for me of benefitting from the community while still creating something original.

Although I also very much understand why others avoid looking at other's code entirely.

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm Because PVS gets enhanced by the TT (which is beneficial for the researches), I looked through that code first. I noticed a small difference with my code: the mate handling is different. For example, when storing:

Yours:
Code: Select all
	if score > Checkmate {
		score += int16(ply)
	}

	if score < -Checkmate {
		score -= int16(ply)
	}
Mine:
Code: Select all
	if score > Checkmate {
		score += int16(ply)
	}

	if score < Checkmate {
		score -= int16(ply)
	}
(Notice the missing minus sign in my code.)

After thinking about it, I decided that this was a bug in my engine: because of the missing minus sign, it may play into a variation, not seeing that it will be mated (because minus Checkmate is you being mated, and forgetting the minus sign results in the TT returning the wrong value).

This can happen in a small number of games, where the engine can play into a variation in which it will lose.

Yes, my reasoning was basically that I wanted to always adjust checkmate scores correctly, even scores where the engine was checkmated. I didn't want to introduce false mate scores into the table and cause issues with the engine recognizing mates.

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm So I changed this in my development version and started a test. At the moment, it's still running:
Code: Select all
Score of Rustic Alpha 3.13.100 vs Rustic Alpha 3.12.100: 574 - 489 - 416 [0.529]
...      Rustic Alpha 3.13.100 playing White: 331 - 189 - 220  [0.596] 740
...      Rustic Alpha 3.13.100 playing Black: 243 - 300 - 196  [0.461] 739
...      White vs Black: 631 - 432 - 416  [0.567] 1479
Elo difference: 20.0 +/- 15.0, LOS: 99.5 %, DrawRatio: 28.1 %
1479 of 2000 games finished.
A +20 Elo difference, already outside the margin of error. So I'm gaining 5-35 Elo, as it stands now. (I would need to run 5000 games or so to be completely sure.)

So... by looking into _your_ problem, I discovered a bug in my own engine, which could potentially gain +35 Elo when fixed. So the mate handling seems to NOT be the problem in your engine... and I think I should put you in my engine's credits even though you didn't actually "do" anything

That's excellent to hear. I'm glad Blunder's code helped you improve, even if very indirectly

mvanthoor wrote: ↑Wed Sep 22, 2021 8:26 pm I'll clone the Git repo and take a look at the rest of the code shortly.

Thanks again. With that being said, I myself just discovered quite an embarrassing bug recently. This code from here:

Code: Select all

size, err := strconv.Atoi(value)
if err != nil {
	search.TransTable.Unitialize()
	search.TransTable.Resize(uint64(size))
}

Should be:

Code: Select all

size, err := strconv.Atoi(value)
if err == nil {
	search.TransTable.Unitialize()
	search.TransTable.Resize(uint64(size))
}

Essentially Blunder has been playing all of its game using the default size of the TT, and has never actually read in any TT sizes from the command line

I'm sure this has caused some silly issues with testing...

mvanthoor · Post by **mvanthoor** » Wed Sep 22, 2021 10:12 pm

algerbrex wrote: ↑Wed Sep 22, 2021 9:23 pm That's excellent to hear. I'm glad Blunder's code helped you improve, even if very indirectly

I also nicked your comment with regard to the mate handling, because it was better than mine. I also updated my credits. Because in Rust, if's are faster than match statements if you need to only compare 1-2 things, I removed a match from the TT and replaced it with two if's; same as in your code. I'm now testing again, but the effect seems to be minimal (+5 Elo at this point; but hey, it's another 5 Elo. Possibly.)

edit: now it's a loss; I'll wait for the end of the test. It might be that only the comparative match (= a match which also has a guard or a comparison in one of the arms) is slower, and that a "normal" match is already faster than if's when comparing more than two items. If the test is negative, I´ll revert the change.

Essentially Blunder has been playing all of its game using the default size of the TT, and has never actually read in any TT sizes from the command line I'm sure this has caused some silly issues with testing...

At what time control do you test? If you play really fast games like I do, at 10s+0.1s (at this time), it could be that a 64 MB TT doesn't get filled up during a game. Then it doesn't actually matter how large it is. To make sure the TT and its replacement scheme are fully used, I test with only a 16 MB TT. That fills up in about about 30 moves on my computer. (That is intentional, as 30 moves is about half the average game length. On a faster computer that can see farther in the same time, it may fill up faster, so I'd make the TT a bit larger.)

===

Cloned the github repo. I'll see if I can take a look at it before I go to bed.

Mergi · Post by **Mergi** » Thu Sep 23, 2021 12:23 am

mvanthoor wrote: ↑Wed Sep 22, 2021 10:12 pm At what time control do you test? If you play really fast games like I do, at 10s+0.1s (at this time), it could be that a 64 MB TT doesn't get filled up during a game. Then it doesn't actually matter how large it is. To make sure the TT and its replacement scheme are fully used, I test with only a 16 MB TT. That fills up in about about 30 moves on my computer. (That is intentional, as 30 moves is about half the average game length. On a faster computer that can see farther in the same time, it may fill up faster, so I'd make the TT a bit larger.)

Note that TT "filling up" during the game with old entries shouldn't be a problem (unless there's a a buggy replacement scheme and old entries dont get overwritten, or multiple entries for one hash key occur in the table). If a program doesn't have a way of distinguishing old entries from new ones, and clogs up it's TT with the old entries, it would probably be better to just clear the whole TT before every go command, as even the largest transpostion table will get filled very fast during any reasonable time control.

mvanthoor · Post by **mvanthoor** » Thu Sep 23, 2021 12:45 am

Mergi wrote: ↑Thu Sep 23, 2021 12:23 am Note that TT "filling up" during the game with old entries shouldn't be a problem (unless there's a a buggy replacement scheme and old entries dont get overwritten, or multiple entries for one hash key occur in the table). If a program doesn't have a way of distinguishing old entries from new ones, and clogs up it's TT with the old entries, it would probably be better to just clear the whole TT before every go command, as even the largest transpostion table will get filled very fast during any reasonable time control.

This can be mitigated by using buckets: this stores multiple values in one TT entry. (For example, by making the TT-entry itself an array of 4 elements.) If you use a replacement scheme where you always replace the bucket with the lowest depth within an entry, you will always be able to store new information. But yes, you can end up with only one bucket being useful if the others contain entries with really high depths. Next optimization step for the TT will be aging, to clear out old buckets.

At first glance the TT does not seem the cause of the non-working PVS. I'll look at it more closely tomorrow evening, and take a look at the rest of the engine as well. PVS should be gaining about 50 Elo in self-play.

algerbrex · Post by **algerbrex** » Thu Sep 23, 2021 12:55 am

mvanthoor wrote: ↑Wed Sep 22, 2021 10:12 pm At what time control do you test? If you play really fast games like I do, at 10s+0.1s (at this time), it could be that a 64 MB TT doesn't get filled up during a game. Then it doesn't actually matter how large it is. To make sure the TT and its replacement scheme are fully used, I test with only a 16 MB TT. That fills up in about about 30 moves on my computer. (That is intentional, as 30 moves is about half the average game length. On a faster computer that can see farther in the same time, it may fill up faster, so I'd make the TT a bit larger.)

I usually test at 3+0.08s and 15+0.1s time controls, so the larger TT size I'm using may be underperforming. I'll try reducing the size, although I remember the Elo gain was fine when I manually set the TT size before each game to 128 MB. I got around 55 Elo from the TT-cuts, 100 Elo from the move order, and 160 Elo from both combined.

Strangely enough, now that I've rewritten my search routine to no longer have a dedicated negamax function for the root, the Elo gain from enabling vs disabling the TT is pretty awful, like ~15 ELo, so I'll need to look into that too, as I think going forward it'll be easier to just have one negamax function.

Mergi · Post by **Mergi** » Thu Sep 23, 2021 1:07 am

algerbrex wrote: ↑Thu Sep 23, 2021 12:55 am Strangely enough, now that I've rewritten my search routine to no longer have a dedicated negamax function for the root, the Elo gain from enabling vs disabling the TT is pretty awful, like ~15 ELo, so I'll need to look into that too, as I think going forward it'll be easier to just have one negamax function.

I took a quick glance at your TT code, and while this probably has nothing to do with your root position troubles, you do have a bug in your TT. When probing the TT and comparing the stored score to your current alpha/beta values, you need to adjust the stored score for mate first, before comparing them (currently you only adjust afterwards).

Mergi · Post by **Mergi** » Thu Sep 23, 2021 1:17 am

Code: Select all

TTEntrySize = 14

I'm not a go programmer, but are you sure this constant is correct? Usually structs would be memory-aligned regardless of programming language. The one you are using would be usually aligned to 16 bytes.

Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?