It's indeed insignificant; the point was that with close moves, tiny changes far down in the tree can sometimes affect the move choice indirectly. But that's because the moves are extremely close in the first place, and I agree that Elo-wise changing the choice between two equal moves is irrelevant.

hgm wrote: ↑Sat Feb 29, 2020 9:20 am
Note that I am not convinced by the effect of 7-men EGT in Stockfish. There should not be any lines of chessic relevance from the opening position that reach the 7-men stage, meaning that the scores obtained in these probes should never be able to propagate to the root. It sounds more like an accidental effect of replacing different TT entries because you don't search exactly the same tree, similar to what you would have if you run with a TT of a different size, or use a different set of Zobrist keys. Or that EGT probing slightly slows down the NPS. If root moves are so close in score that the choice between them is basically random, such insignificant things can affect the choice. (And Elo-wise you would not care at all.)
Measuring Hash Collisions (once again)
Moderators: hgm, Rebel, chrisw
- Posts: 550
- Joined: Tue Nov 19, 2019 8:48 pm
- Full name: Alayan Feh
Re: Measuring Hash Collisions (once again)
- Posts: 690
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: Measuring Hash Collisions (once again)
That would indeed reduce the number of redundant bits, but it would likely not completely eliminate them, depending on your engine.

hgm wrote: ↑Sat Feb 29, 2020 3:03 pm
That pretty much confirms that 16 (non-redundant) bits in the signature is already good enough from an Elo perspective. And with full verification of the hash move, Elo would be the only concern.
So XOR the signature (or index) with the incremental eval, to make those bits non-redundant!

petero2 wrote: ↑Sat Feb 29, 2020 1:22 pm
Some of the bits in the Zobrist key are used to determine which bucket a position belongs to. Each entry also stores the whole 64 bit key, even though some of the bits are redundant, since they were used to determine the bucket and are thus the same for all positions belonging to that bucket.
I added code to Texel to estimate how many bits of redundancy this would remove, by logging for each evaluation score how many times that score occurred during search, then computing the entropy of the distribution of scores. I got the following results by searching for about a minute from the starting position:
Code:
Material only score               : ~ 3 bits (*)
PST score only                    : ~ 8 bits
Material + PST                    : ~ 10 bits
PST_MG + PST_EG                   : ~ 16 bits
(MTRL+PST)_MG + (MTRL+PST)_EG     : ~ 17.5 bits
Separate PST score per piece type : almost as good as 64 bit Zobrist hash
If the engine keeps track of the total piece square table (PST) score incrementally, you would gain about 8 bits.
If the engine incrementally keeps track of both middle game and end game PST scores, you would gain about 16 bits.
(*) For more tactical positions, the material score would work better though. For WAC71 I got about 6.3 bits.
In Texel I actually keep track of PST scores for each piece type, since I use different MG/EG interpolation factors for different piece types. If I combine all PST scores using a decent mixing function, I get a hash value that is almost as good as a 64 bit Zobrist hash value. I think computing this value would be at least as expensive as computing an additional 64 bit Zobrist hash value though, so it is probably not useful in practice.
I think this trick would work better in an engine that has been automatically tuned, since that tends to add a small amount of "noise" to the piece square tables.
For an engine using 64-bit Zobrist keys, I don't think this trick is worthwhile: even if you use a huge 256 GB hash table (2^32 buckets) you would still have 32 non-redundant bits to store in the hash entries, which according to my previous test is more than enough to avoid any Elo loss.
I think this trick would be useful for a 32 bit engine using 32 bit Zobrist keys though, especially if the incremental eval includes both base piece values and PST corrections.
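The "decent mixing function" above isn't spelled out; purely as an illustration, a splitmix64-style finalizer folded over the per-piece-type PST scores would have the stated effect. Everything here (names, NUM_PIECE_TYPES, the layout) is my own sketch, not Texel code:

```c
#include <stdint.h>

#define NUM_PIECE_TYPES 6

/* splitmix64 finalizer: a well-known 64-bit bit mixer. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Fold the per-piece-type PST scores into one 64-bit value.
 * Each score is mixed into the running hash so that every
 * output bit ends up depending on every score. */
uint64_t pst_hash(const int16_t pst_score[NUM_PIECE_TYPES])
{
    uint64_t h = 0;
    for (int pt = 0; pt < NUM_PIECE_TYPES; pt++)
        h = mix64(h ^ (uint16_t)pst_score[pt]);
    return h;
}
```

As noted in the post, the chain of multiplies is comparable in cost to updating a second Zobrist key, which is why the trick is of mostly theoretical interest.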
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Measuring Hash Collisions (once again)
The incremental evaluation in KingSlayer is a 32-bit quantity that encodes both the end-game and the opening score for PST plus material (so that a multiply by the game phase followed by a right-shift can do the linear interpolation). The PSTs used for the incremental update likewise contain combined values. (The upper bits contain the difference of the opening and end-game values, so that the piece base value mostly cancels there and fewer bits are needed.)
So XORing this into the index should give me about 16 extra bits, at the expense of only a single XOR instruction. I also gain an extra bit by XORing the stm key only into the index. (I also do that for the e.p. status, but the entropy of that should be close to 0.) This doesn't cost anything extra, as I would have to XOR the stm into the key somehow anyway, and then duplicate the key to derive the index from it. It doesn't matter (speed-wise) whether I do the XOR before or after the duplication.
With effectively 80+ bits key length I felt I did not have to bother at all with hash-move verification or crash resistance in the face of faulty hash moves.
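KingSlayer's exact bit layout isn't given here, so the following sketch uses an assumed layout (end-game score in the low 16 bits, opening-minus-endgame difference in the high 16) just to illustrate the multiply-and-shift interpolation and the XOR into the index. All names are hypothetical:

```c
#include <stdint.h>

/* Assumed packing: eg score in the low 16 bits, (mg - eg) in the
 * high 16.  Assumes |mg - eg| fits in 16 bits; as noted above,
 * the piece base values mostly cancel in the difference. */
static inline int32_t pack_eval(int16_t mg, int16_t eg)
{
    uint32_t diff = (uint16_t)(mg - eg);
    return (int32_t)((diff << 16) | (uint16_t)eg);
}

/* Linear interpolation: eg + (mg - eg) * phase / 256, with
 * phase = 0 for pure endgame and 256 for pure opening.
 * Arithmetic right shift on negatives is assumed, as on all
 * mainstream compilers. */
static inline int interpolate(int32_t packed, int phase)
{
    int eg   = (int16_t)(packed & 0xFFFF);
    int diff = (int16_t)((uint32_t)packed >> 16);
    return eg + ((diff * phase) >> 8);
}

/* XOR the packed incremental eval into the TT index, so the
 * index bits become non-redundant with the stored key. */
static inline uint32_t tt_index(uint64_t zobrist, int32_t packed_eval,
                                uint32_t index_mask)
{
    return ((uint32_t)zobrist ^ (uint32_t)packed_eval) & index_mask;
}
```

The XOR costs a single instruction per probe, which is the whole appeal: the extra entropy is essentially free.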
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Measuring Hash Collisions (once again)
Update:

Rebel wrote: ↑Sat Feb 29, 2020 3:23 pm
With Zobrist 8-bit checksum, 48 vs 56 bit hash key.

Not a single collision with a 56 bit key.

Code:
            KEY = 48 bit    KEY = 56 bit
POS   Time   Coll    TP      Coll    TP
100   0:30     26    17         0     0
100   1:00     41    23         0     0
100   2:30    136    98         0     0
100   5:00    263   211         0     0
100  10:00    438   355         0     0
Last step: playing games at fixed depth, to measure move and/or score differences due to collisions.
Code:
              KEY = 48 bit    KEY = 56 bit
Depth  Time    Coll   TPE      Coll   TPE    Moves  Scores  Played
   14    5s      40     6         0     0        0       0    1418
   15   10s      68    10         0     0        0       0    2225
   16   20s     115    28         0     0        0       0    1445
   17   40s     368    68        13     7        1       2    1527
Coll - number of collisions.
TPE - transposition errors.
Moves - number of different moves played due to collisions.
Scores - number of different scores due to collisions.
Played - number of moves played.
Remarks
1. Measuring the effect on collisions, there seems to be a difference between loading positions from EPD with a clean hash table, and playing games where the HT is never cleared and each position is close to the previous one. On 100 positions from EPD not a single collision was reported at 10 minutes thinking time, while in games with an average time of 40 seconds per move (although containing many more positions) 13 collisions were discovered.
2. It's surprising to notice how well a 48-bit key already does despite the transposition errors. For an SP program like ProDeo doing 2.5 million NPS it's safe to conclude that a 56 bit hash key is satisfactory, and due to the lack of an SMP version it makes no sense (time wise) to extend this test to a 56 vs 64 bit key comparison.
3. Based on these results it's hard to estimate how many collisions a 64 bit hash key would produce on a system doing 75-100 million NPS at long time control.
4. Finally, at depth=17 the 56 bit version gives 2 score collisions and 1 move collision:
48. Qe2 {+0.21/17 309s}
48. Qe2 {+0.19/17 286s}
and
46. Qg4 {+0.58/17 111s}
46. Qe4 {+0.85/17 62s}
90% of coding is debugging, the other 10% is writing bugs.
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Measuring Hash Collisions (once again)
I am going to do a 64 bit hash key check after all, self-play games on a fixed node count.
But a question first, on TCEC hardware:
1. how many NPS is Stockfish (or another engine) doing ?
2. what is the TCEC time control nowadays ?
On an average middle game position, hash key 56 bit, NPS ± 2.6 million, no collision discovered.
Code:
Position: r1bk1n1r/pp1n1q1p/2p2p1R/3p4/3PpN2/2NB2Q1/PPP2PP1/2K1R3 w - -

Nodes      Time   Probes         Found Keys          Coll  TPE
1 billion   6:07    250.064.018  150.353.406 (60%)      0    0   00:05:04 20.00 1.41 1.Bxe4 dxe4 2.Nxe4 Rg8 3.Nd6 Qe7 4.Rxe7 Rxg3
2 billion  12:31    511.038.816  304.021.955 (59%)      0    0   00:05:11 20.00 1.41 1.Bxe4 dxe4 2.Nxe4 Rg8 3.Nd6 Qe7 4.Rxe7 Rxg3
4 billion  25:07  1.018.575.173  600.861.953 (58%)      0    0   00:14:49 21.00 1.41 1.Bxe4 dxe4 2.Nxe4 Rg8 3.Nd6 Qe7 4.Rxe7 Rxg3
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Measuring Hash Collisions (once again)
I have the 64-bit hash key collision check running now, 5 games, 1 billion nodes, average time per move 7-8 minutes. So far after 5 x 50 = 250 moves no collision reported, that is, not yet.
The math: my Stockfish on 20 cores does 25 million NPS, so doing 1 billion nodes with ProDeo equals about 40 seconds of Stockfish running on 20 cores.
Now I have heard an experienced programmer say, Stockfish doing 75 million NPS will generate hundreds of collisions each second. Not my experience, ProDeo produces zero.
From Bob's and Anthony's study I have understood about 1 collision per one million, again not my experience.
I am wondering how these huge differences in numbers can arise. One reason could be the creation of the random keys at program start. Unlike most of you (I think) I don't create 64 bit hash keys in one go. I create a first set of 32 bit keys with formula X and a second set of 32 bit keys with formula Y, then combine them. Does this create so much better keys that I get far fewer collisions?
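The "formula X" and "formula Y" above are not specified; purely as a stand-in, the sketch below combines two unrelated 32-bit generators (a xorshift32 and an LCG) into one 64-bit key, which is the shape of the construction described. Generator choices and names are mine:

```c
#include <stdint.h>

static uint32_t xorshift32(uint32_t *s)     /* stand-in for "formula X" */
{
    uint32_t x = *s;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *s = x;
}

static uint32_t lcg32(uint32_t *s)          /* stand-in for "formula Y" */
{
    return *s = *s * 1664525u + 1013904223u;
}

/* Glue two independent 32-bit streams into one 64-bit Zobrist key. */
uint64_t make_zobrist_key(uint32_t *sx, uint32_t *sy)
{
    uint32_t hi = xorshift32(sx);
    uint32_t lo = lcg32(sy);
    return ((uint64_t)hi << 32) | lo;
}
```

Worth noting: with any statistically reasonable keys, the observed false-hit rate is governed by how many key bits are stored and compared, not by which generator produced them, so the construction above should not by itself explain a large difference in collision counts.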
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Measuring Hash Collisions (once again)
Stockfish stores just 16 bits in a TT entry. With 3 entries per bucket, this means 1 false hit per 2^16/3 ≈ 21845 probes that should not have hit.
It is pretty easy to calculate the number of false hits if the hashing scheme is known (number of bits stored in a TT entry, number of TT entries per TT bucket). This is all assuming that Zobrist keys follow an ideal uniform distribution (which they pretty much do if nothing is terribly wrong).
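The calculation can be written down directly. Under the idealized assumption that signatures are uniform and every probe considered has no genuine hit, the expected rates are (function names are mine):

```c
#include <stdint.h>

/* With `sig_bits` of signature stored per entry and `entries`
 * entries compared per probe, each probe without a genuine hit
 * matches some entry by accident with probability
 * entries / 2^sig_bits. */
double probes_per_false_hit(int sig_bits, int entries)
{
    return (double)((uint64_t)1 << sig_bits) / (double)entries;
}

/* Upper-bound estimate of false hits per second, assuming every
 * node probes and finds no genuine hit (real rates are lower). */
double false_hits_per_second(double nps, int sig_bits, int entries)
{
    return nps * (double)entries / (double)((uint64_t)1 << sig_bits);
}
```

At 75 million NPS with a 16-bit signature and 3-way buckets this bounds the false-hit rate at a few thousand per second; actual numbers are lower because many probes do find a genuine hit or probe an empty slot.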
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Measuring Hash Collisions (once again)
How can that function, any idea?
- Posts: 27817
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Measuring Hash Collisions (once again)
Well, as Bob pointed out, collisions have virtually zero impact on Elo (even when they are what you call 'transpositions', so that their score is used). Unless they are more frequent than one every 10K nodes. Which with a 16-bit signature and 3-way bucket they aren't.
It can be added that even if the collisions would cause a measurable Elo decrease, this is not without compensation. By storing a 16-bit signature instead of a 48-bit key you can nearly double the number of hash entries for the same TT size, and that in itself should be worth about 8 Elo. So if the collisions by themselves cause an Elo drop smaller than 8, this actually is the superior design. As the FishTest apparently demonstrated.
- Posts: 6997
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Measuring Hash Collisions (once again)
That's quite a fun thing to do next, a hash key of 16 bits, wow...

hgm wrote: ↑Tue Mar 03, 2020 2:36 pm
Well, as Bob pointed out, collisions have virtually zero impact on Elo (even when they are what you call 'transpositions', so that their score is used). Unless they are more frequent than one every 10K nodes. Which with a 16-bit signature and 3-way bucket they aren't.

It can be added that even if the collisions would cause a measurable Elo decrease, this is not without compensation. By storing a 16-bit signature instead of a 48-bit key you can nearly double the number of hash entries for the same TT size, and that in itself should be worth about 8 Elo. So if the collisions by themselves cause an Elo drop smaller than 8, this actually is the superior design. As the FishTest apparently demonstrated.