hash collisions

Dann Corbit · Post by **Dann Corbit** » Tue Feb 18, 2020 5:25 pm

If your code base has half a million lines in it, there will definitely be bugs.
I think that the proper goal is to attempt to eliminate all known bugs.
It might seem like things such as undefined behavior (e.g. integer overflow) are really not important, but such things have caused Arienne rockets to crash and x-ray machine to give a fatal dose of x-rays.

Undefined behavior in your program means that the program can literally do any thing unexpected, including allowing hostile code injections.

It is one thing to say we cannot write perfect code (and indeed, we imperfect humans are not very good at attempting anything perfectly). It is another thing to make no attempt to fix easily fixable well known and serious bugs.

I think that the computer chess community has largely taken the "who cares?" position, and that position is irresponsible. We can cause real damage even when "nothing serious" happens.

Consider the ubiquitous core dump. Utterly harmless snapshots of what was going on in memory seem benign enough. But a giant collection of them can fill up all the disk space, and now the computer cannot obtain the virtual memory it needs to perform some critical operation. Suppose we happen to be in the middle of a systems update when we freeze. We might cause an unrecoverable error that will require a system format.

My personal opinion is that we should not just assume that since our license may say we are not responsible for any harm, there is no need to fix critical known errors. Not doing so is negligent. You can do someone real harm and not caring about it is callous.

I have seen some very polar opinions on software quality. I think that Chris has an unreachable goal, but nonetheless, it is one that we should all strive for. Quality software that performs the wanted task without endangering anything is obviously a laudable goal. In industry, that goal is necessary, and we spend enormous effort in attempting to approach it. I realize that for the most part, computer chess is a hobby and the time allowed to increase software quality is limited. But it is an important lesson to put the customer first. I have always believed that the best goal in creation of software is, "Make the customer prosper." I believe this is true for every form of software development.

Ovyron · Post by **Ovyron** » Tue Feb 18, 2020 5:33 pm

chrisw wrote: ↑Tue Feb 18, 2020 12:53 pm Just one would be enough.

If one was enough, why did you ask for a list of all bugs?

Anyway, thanks Alayan for saving me the job. I'm surprised the list of bugs is so short, that's like 0.75 bugs per month while I expected like 1.5, so about half of the bugs I expected. But nine times more than what would have been enough.

My point was that this would have been valid even if Stockfish releases happened at radically different points in time, because releasing an engine with minimal bugs has never been a priority (it has only been a priority to fix known bugs ASAP, but you can't fix unknown bugs, and it'd be ridiculous to freeze the code to look for bugs instead of looking for improvements, as that'd have hurt the engine's development much more than any uncovered bug it might still have.)

hgm · Post by **hgm** » Tue Feb 18, 2020 5:42 pm

mar wrote: ↑Tue Feb 18, 2020 4:42 pm I've just realized that my code (I haven't touched it for the past 5 years except for minor bugfixes - no pun intended) relies on move flags
(validation) so I'll probably stick with 4-entry buckets, bob's xor trick and 64-bit keys.
I still like your idea a lot - certainly worth trying, but maybe for a different project in the future.

(score could be reduced by a couple of bits for sure)

I need to search for what exactly you mean with undercut replacement. So far I found one thread here where you proposed it as a replacement scheme several years ago.
As for replacement scheme, I simply compute a replacement score such that entries with mismatching age are the hottest candidates, then depth and finally I penalize replacing PV nodes based on bound.
Since I treat all of them equally, there's certainly a lot of room for experiments and improvement, especially because I hash qsearch too.
I could certainly try to fit your replacement ideas to the 4-bucket TT to see how it does,
this is very tempting (especially considering that my default TT size is 4MB [bytes, not entries]).

'Undercut replacement' is an intermediate between depth-preferred (replacing only by higher or equal depth) and always-replace, where you replace an entry when the new arrival has a depth that is larger, equal or exactly one less. Because a one-lower depth that can replace it occurs more frequently than the depth that filled it, this tends to push the depth hold by the item towards zero (where in always-replace entries it perpetually grows, until nothing manages to replace it anymore). The higher the depth of the stored item, the longer it will survive, though, as the depth needed to replace it will arrive less frequently. In always-replace entries all depths live equally long, namely until the next QS node arrives to replace them.

I am not sure what you mean by 'move flags'. But if you are used to 4 entries per bucket, I would think that 6 is already a significant improvement. So why the need to squeeze in 7? With 6 entries per bucket you have 85 bits per entry, 12 more than with 7 entries. That should enable you to store a lot of extra info.

If all the depth fields were packed in one word, it could also be possible to use SIMD for deciding where to replace.

With "shallowest-of-N" replacement one of the slots in the bucket serves as always-replace, and contains a very low depth almost all the time, while the other slots hold high and ever-increasing depth. Only in the very rare case that something comes in that had a higher depth than the shallowest other entry, it would elevate the content of the always-replace slot into the ranks of the preserved entries, and the shallowest of the latter would then take over the always-replace function. The common case is that depth 0 or 1 needs to be stored, and is deferred to the slot that currently functions as always-replace, and thus also holds depth 0 or 1 more than 90% of the time. If you do an SIMD compare of all depths to the incoming depth (or the incoming depth + 1), almost always only one of the slot will have a lower depth, and you can store it there without doing a brute-force determination of the minimum depth in the bucket.

Aging flags could also be treated in SIMD fashion, to detect whether there are any aged entries in the bucket (which, I think, will also rarely happen, as especially with a small hash table buckets will be used many times per search). If you alternate the aging flags and the depth in one word (3 bit aging + 7 bit depth, for 6 entries = 60 bit), it should be possible to let the SIMD comparison of the age result in a mask that can squash the depth of the stale entries, before you compare those to the incoming depth. The latter SIMD comparison will then tell you which entries in the bucket could be candidate for replacement (usually only 1, and in general very few), and you only would have to consider those.

Btw, my key-comparison trick worked because the initial OR sets all 'difference keys' to 1xxxxxxx, and the only 1xxxxxxx that drops below 10000000 on decrementing (to 01111111) is 10000000 itself. In all other cases the MSB will remain 1. So effectively the MSB will be set to the OR of all the lower bits, no matter how many there were. Only the MSB itself of the difference key has to be ORed in 'by hand' afterwards.

chrisw · Post by **chrisw** » Tue Feb 18, 2020 6:38 pm

Ovyron wrote: ↑Tue Feb 18, 2020 5:33 pm
chrisw wrote: ↑Tue Feb 18, 2020 12:53 pm Just one would be enough.
If one was enough, why did you ask for a list of all bugs?

are we playing a literal/metaphorical interpretation game?

Anyway, thanks Alayan for saving me the job. I'm surprised the list of bugs is so short, that's like 0.75 bugs per month while I expected like 1.5, so about half of the bugs I expected. But nine times more than what would have been enough.

My point was that this would have been valid even if Stockfish releases happened at radically different points in time, because releasing an engine with minimal bugs has never been a priority (it has only been a priority to fix known bugs ASAP, but you can't fix unknown bugs, and it'd be ridiculous to freeze the code to look for bugs instead of looking for improvements, as that'd have hurt the engine's development much more than any uncovered bug it might still have.)

chrisw · Post by **chrisw** » Tue Feb 18, 2020 6:40 pm

Alayan wrote: ↑Tue Feb 18, 2020 5:01 pm With a moderately large definition of bug...

Bugs shipped with SF11 and fixed since then :
https://github.com/official-stockfish/S ... dd694a3cec
https://github.com/official-stockfish/S ... 655300ce64
Bugs shipped with SF10 and fixed in SF11 :
https://github.com/official-stockfish/S ... 352a288f07
https://github.com/official-stockfish/S ... 2a0b0e618a
https://github.com/official-stockfish/S ... 0b750b3dda
https://github.com/official-stockfish/S ... 51cf6cacf5
https://github.com/official-stockfish/S ... c005b03df9
https://github.com/official-stockfish/S ... 6d3abc3cbc
https://github.com/official-stockfish/S ... dc1268e4c6
https://github.com/official-stockfish/S ... 86354bf9ae
https://github.com/official-stockfish/S ... a773f82595

Anybody familiar with the Stockfish's development process knew that the claim of no bug in release versions was ridiculous.

Cool. Some data points.

chrisw · Post by **chrisw** » Tue Feb 18, 2020 7:03 pm

Dann Corbit wrote: ↑Tue Feb 18, 2020 5:25 pm If your code base has half a million lines in it, there will definitely be bugs.
I think that the proper goal is to attempt to eliminate all known bugs.
It might seem like things such as undefined behavior (e.g. integer overflow) are really not important, but such things have caused Arienne rockets to crash and x-ray machine to give a fatal dose of x-rays.

Undefined behavior in your program means that the program can literally do any thing unexpected, including allowing hostile code injections.

It is one thing to say we cannot write perfect code (and indeed, we imperfect humans are not very good at attempting anything perfectly). It is another thing to make no attempt to fix easily fixable well known and serious bugs.

I think that the computer chess community has largely taken the "who cares?" position, and that position is irresponsible. We can cause real damage even when "nothing serious" happens.

Exactly. Good response. I think the only one here that tackles the issues, even if it did take a load of Hegelian dialect to get there.

Consider the ubiquitous core dump. Utterly harmless snapshots of what was going on in memory seem benign enough. But a giant collection of them can fill up all the disk space, and now the computer cannot obtain the virtual memory it needs to perform some critical operation. Suppose we happen to be in the middle of a systems update when we freeze. We might cause an unrecoverable error that will require a system format.

My personal opinion is that we should not just assume that since our license may say we are not responsible for any harm, there is no need to fix critical known errors. Not doing so is negligent. You can do someone real harm and not caring about it is callous.

Exactly. A mandatory standard of input EPD integrity testing would be a start. How many people do that?

I have seen some very polar opinions on software quality. I think that Chris has an unreachable goal, but nonetheless, it is one that we should all strive for.

Hegelian dialectic, comrade. You were the only one to step beyond the polar positions to look for synthesis.

Quality software that performs the wanted task without endangering anything is obviously a laudable goal. In industry, that goal is necessary, and we spend enormous effort in attempting to approach it. I realize that for the most part, computer chess is a hobby and the time allowed to increase software quality is limited. But it is an important lesson to put the customer first. I have always believed that the best goal in creation of software is, "Make the customer prosper." I believe this is true for every form of software development.

Ovyron · Post by **Ovyron** » Tue Feb 18, 2020 7:21 pm

chrisw wrote: ↑Tue Feb 18, 2020 6:40 pm
Alayan wrote: ↑Tue Feb 18, 2020 5:01 pm With a moderately large definition of bug...

Bugs shipped with SF11 and fixed since then :
https://github.com/official-stockfish/S ... dd694a3cec
https://github.com/official-stockfish/S ... 655300ce64
Bugs shipped with SF10 and fixed in SF11 :
https://github.com/official-stockfish/S ... 352a288f07
https://github.com/official-stockfish/S ... 2a0b0e618a
https://github.com/official-stockfish/S ... 0b750b3dda
https://github.com/official-stockfish/S ... 51cf6cacf5
https://github.com/official-stockfish/S ... c005b03df9
https://github.com/official-stockfish/S ... 6d3abc3cbc
https://github.com/official-stockfish/S ... dc1268e4c6
https://github.com/official-stockfish/S ... 86354bf9ae
https://github.com/official-stockfish/S ... a773f82595

Anybody familiar with the Stockfish's development process knew that the claim of no bug in release versions was ridiculous.
Cool. Some data points.

That refute everything you were saying about it. Any other example of "bug-free software" besides Stockfish?

bob · Post by **bob** » Tue Feb 18, 2020 7:49 pm

chrisw wrote: ↑Tue Feb 18, 2020 12:48 pm
bob wrote: ↑Tue Feb 18, 2020 3:37 am So, as usual, you want to just hand-wave the discussion away.

I prefer it to Bob dense paragraph waving. My rule for which has been, for a long time now, the longer you post and the more examples using cars included, the more is the one central point (being usually lost by then anyway) wrong.

This is a major problem. The entire field of software engineering sprang up to make progress. And today, the best SE can do is REDUCE the number of bugs that slip by. It is impossible to eliminate them completely.

your argument is circular. You assume there are always more bugs, therefore you can never get to none. Is like the maths proof about infinity and largest number. If I say N, you say N+1.
If I say N+2, you say N+3 and so on.

There are not an infinite number of bugs in a developing chess engine, obviously. What you are trying to tell us is just wrong, you are telling us that we can’t sit down and methodically test a finite engine of relatively low complexity in a bounded 64 square environment, finding and removing bugs one by one until we found them all. All N of them.

I say it is possible. Signed contracts to do it. Created the infrastructure to do it. Done it. Seen others do it. Got product through the most destructive possible testing environments. Many times.

If we could, that would be the holy grail of computer science.

misuse of concept. You just mean it would be a good thing if stuff worked.

Do some reading. Learn some facts. Then you will begin to understand the problems.

Here we go again. You just can’t stop yourself with the NPD put downs, can you?
Is there a technical argument anywhere that exists where you are not personally insulting the person who technically argues with you?

Having some foreign hot-shot come and yell is NOT going to eliminate all bugs.

actually, it did, and it provided the model by which we did it ourselves. Japanese style quality control. Which created Value, proven by Market.
Only way to do that is to eliminate all programmers, which would then eliminate all programs, which is the only way to eliminate all bugs.

You see anything about cars in my post?

I have not said there is an infinite number of bugs. I have said there is a LARGE number of potential bugs. And there are a significant number of bugs that slip past any type of testing you can throw at it reasonably and be able to complete it without taking years or decades to complete the testing.

What in the above is insulting. You obviously are not familiar with the discipline of software engineering.

Here are a few famous quotes from software engineering:

1. It’s hard enough to find an error in your code when you’re looking for it; its even harder when you’ve ASSUMED your code is ERROR-FREE. (sound familiar?) --Steve McConnell

2. Even the best planning is not so omniscient as to get it right the first time. -- Fred Brooks

3. Program testing can be used to show the presence of bugs, but never to show their absence! --Edsger Dijkstra

4. Before software can be reusable it first has to be usable. --Ralph Johnson

5. Every big computing disaster has come from taking too many ideas and putting them in one place. --Gordon Bell

6. Hiring people to write code to sell is not the same as hiring people to design and build durable, usable, dependable software. --Larry Constantine

7. If you don't know that a bug exists yet, is it still a bug? You seem to think "no" here... I do not.

8. Bottom line: Can you write "Bug free software"? NO Anyone who tells you otherwise is clueless. (note that is ANOTHER quote, not MY words). I presume you will call that an insult. I call it a quote and statement of fact.

9. There is no such thing as a perfect software. Zero bug development is a myth that should be dispensed with.

I found an interesting discussion about bugs. You run on computing hardware. That can have bugs. (remember the pentium floating point divide fiasco for just one example?) You run on an operating system that can have bugs that affect your program. You use multiple programming libraries (IE C library) that can have bugs. You use a compiler to convert your source code to an executable binary. That can have bugs. And, of course YOUR code will always have bugs, particularly if you use threads which adds several more orders of magnitude of potential bugs.

Now, to the above quotes, from some GIANTS of computer science, software development, and software engineering. You seem to believe that you know more than all/any of them. Think about that for a minute, and it might sink in. The GIANTS agree that bug-free software is a dream, not a reality.

Your statement "actually it did..." is utter nonsense. EVERYONE (but you, apparently) agrees that testing can only discover bugs, not prove that they don't exist. To believe otherwise is naive given all the evidence supporting this concept.

I do realize that actual data seems to you to always be an insult, because that is the only way you can escape from a discussion with you having zero supporting evidence to support you other than "your statements. Which have to be factual since you wrote em?" As I said, you should stop, read a bit, and get a better grasp on what is going on. Right now you are "way out there" compared to a bunch of people I have very high respect for.

For your request about EPD testing, _I_ did it. I have code that tests for everything I could think of. No more than 9 queens + pawns, no more than 10 knights, rooks and bishops + pawns. Valid EP status. Valid castling status (at least rooks/kings on original squares, no way to know if they have moved previously and moved back. Side not on move can't be in check. Total pawns + pieces can not be > 15. I ran millions of lines of EPD through my EPD input, then examined the resulting board positions to be sure they were valid, AFTER the EPD input code had checked for every possible error I could think of. And testing showed a couple I missed and had to fix. But once again, testing can not prove the absence of bugs, it can only prove their presence. So I won't claim that my FEN parsing is bug-free. I can only claim that massive testing found no bugs. I know of no holes, but can not guarantee there is a special case I missed. I will add that this checking caused lots of user complaints. They had test positions with 9 queens + 3 or more rooks or other pieces plus pawns still on board. I elected to ignore the complaints and treat EPD as a purely legal position description rather than something off-the-wall. But that is a TINY piece of code compared to the complex parts of Crafty. TINY.

So many of us DO try to test our code as thoroughly as possible. But no code will ever be perfect.

petero2 · Post by **petero2** » Tue Feb 18, 2020 8:15 pm

Ovyron wrote: ↑Tue Feb 18, 2020 7:21 pm Any other example of "bug-free software"

TeX version 3.14159265 might come pretty close. The latest bug fix release is 6 years old and the bounty for finding a bug is $327.68.

Ovyron · Post by **Ovyron** » Tue Feb 18, 2020 8:26 pm

Oh.

In that case the chess equivalent might be Sting, an engine based on Stockfish 2.1.1 where all known Stockfish 2.1.1 bugs would have been fixed so any bug that it'd have would need to have survived since that version.

So a bug-free version might be possible, but at what cost? (in this case, no improvements from 2.1.1 to version 11.)

hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions

Re: hash collisions