hash collisions

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: hash collisions

Post by hgm »

chrisw wrote: Mon Feb 17, 2020 8:50 pm
hgm wrote: Mon Feb 17, 2020 8:31 pm He even considers it a bug, and reason to be fired over, if someone comes up with a more efficient algorithm than he himself would have used. :lol:
You two mutually massaging each other is funny.
You are quite right about that! And guess who is the laughing stock! :lol:
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

Chris, YOU are the one who looks a little funny here. Here's why: this concept of "program bug" is a moving target with you. You now want to define a bug as "the program does not hang or cause the system to hang." That is an OK definition of acceptable software quality, and I'm not going to debate it; software QC people have lots of ideas about what is acceptable. For example, the program produces the right answers for any valid input but fails/crashes on bogus input; or it produces the right answers for valid input and rejects bogus input. The latter can be problematic.

You should look in a good computer science text and find the term "decidability". It is an interesting one. An algorithm is decidable if it will always tell you "yes" or "no". For example: is this input valid according to the grammar defined for the language in question? If it is always possible to answer yes or no, the problem is considered decidable. But there are PLENTY of algorithms that can answer "yes" for valid input and loop forever if the input does not meet the grammar specification. Some would call that a bug, some would not; the computer scientist would simply say "not decidable" and move on.
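
A minimal sketch of such a "can answer yes, may never answer no" procedure (a toy example using the Collatz iteration as a stand-in; my illustration, not something from the post above):

Code: Select all

#include <stdio.h>

/* Semi-decision procedure: prints "yes" if n reaches 1 under the
   Collatz map, but simply loops forever if it never does (nobody
   knows whether such an n exists).  An algorithm that can say
   "yes" but cannot always say "no". */
int main(void)
{
    unsigned long long n = 27;   /* arbitrary starting value */
    while (n != 1)
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    printf("yes, reached 1\n");
    return 0;
}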

That doesn't apply to chess, obviously. But if it did, would it be a bug in the program or a property of the program? Now it becomes less clear. I consider a bug to be: (a) anything that causes any misbehavior at all (crashes, bogus moves, illegal RVs, scores > infinity, leaving the king in check, etc.); (b) anything that unintentionally affects program performance/speed, such as searching moves more than once with the same alpha/beta bounds and depth (this might be prevented by a hash hit, but maybe not, due to hash entry overwrites); (c) anything in the code that does not do exactly what I intended, even if it rarely or never affects program results. Say that after 100M games the program has not been found to be measurably worse with the unintentional behavior than without it; so far it is not affecting performance in a measurable way. But in analyzing the code you might find edge conditions (almost never encountered) that trigger the behavior and make the program crash, produce bogus output, etc. That is a bug in hiding, but it is still a bug, even if you can't find a way to trigger it.
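
To make the duplicated-search point in (b) concrete, here is a minimal sketch of a transposition-table probe that spares a repeat search at the same depth and bounds. All names (Entry, tt_cutoff) and values are hypothetical, not Crafty's actual code:

Code: Select all

#include <stdio.h>

typedef struct {
    unsigned long long key;   /* full signature stored for verification */
    int depth;                /* draft this entry was searched to       */
    int score;
    enum { EXACT, LOWER, UPPER } flag;  /* bound type of stored score   */
} Entry;

/* Returns 1 (and sets *score) when the stored result makes re-searching
   this position at the same depth and bounds redundant. */
static int tt_cutoff(const Entry *e, unsigned long long key,
                     int depth, int alpha, int beta, int *score)
{
    if (e->key != key || e->depth < depth)
        return 0;                        /* no usable entry: search again */
    if (e->flag == EXACT ||
        (e->flag == LOWER && e->score >= beta) ||
        (e->flag == UPPER && e->score <= alpha)) {
        *score = e->score;
        return 1;                        /* a re-search would repeat work */
    }
    return 0;
}

int main(void)
{
    Entry e = { 0xDEADBEEFULL, 12, 55, LOWER };
    int score;
    /* same position, draft 10 <= 12, fail-high bound 55 >= beta 50 */
    if (tt_cutoff(&e, 0xDEADBEEFULL, 10, 20, 50, &score))
        printf("cutoff with score %d, no re-search needed\n", score);
    return 0;
}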

I take all three of those as equally serious, though I do accept some as "the fix is worse than the problem." For example, a parallel search regularly produces a different best move than the serial search. Aggravating as hell, but fixing it would kill parallel performance. You could call that a bug; I would call it an acceptable bug. A simple change to move ordering can greatly affect a serial program too, and cause it to produce different best moves as well, thanks to transposition-table grafting. Does changing the move ordering cause a bug? Was the bug there waiting for you to change the move ordering and expose it? Or is it simply an artifact (bug) of using a transposition table, which affects the search in unexpected ways?

So I go with my definition. I accept the move-ordering quirks, which are magnified by parallel search. I don't accept code that obviously (to me) doesn't do what I intended, even if it works. You can look at the latest Crafty source and the revision log in the comments: I found a very subtle bug that had no harmful effect other than a minuscule performance reduction. I had never seen it previously, noticed it completely by accident, and then fixed it.

I just can't use the term "no bugs" with a chess program, with a straight face. I know it doesn't happen. Wish it did, but wishing doesn't count.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: hash collisions

Post by mar »

hgm wrote: Mon Feb 17, 2020 8:46 pm I don't think this works. You can also get zero during squashing when all key extensions differ from the sought one, but all in different bits. So you would get a lot of false potential hits.

I often regret that there doesn't exist something like a 'logical multiply', which does a binary multiplication using OR on the partial products instead of adding them. A logical multiply of the signature-key XOR with 0xFF would then only leave 0 on the 0x8080808080808080ull bits if the corresponding byte was all 0 (i.e. a match). It could be emulated by

Code: Select all

bhash ^= xunp;        /* bytes matching the sought extension become 0x00 */
bhash |= bhash << 1;  /* OR-smear: any set bit in a byte propagates...   */
bhash |= bhash << 2;
bhash |= bhash << 4;  /* ...up to that byte's bit 7 (bits spilling in    */
                      /* from the byte below reach bit 6 at most)        */
return (bhash & 0x8080808080808080ull) != 0x8080808080808080ull;
You're right. I just did a quick test and got about 970k false positives out of 1M random probes that were expected to fail. That's a disaster.
I tried what you propose instead, but it seemed to suffer from lots of false negatives, unless I implemented it in the wrong way.
EDIT: correction: your version indeed works, I messed up the test itself.
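
For anyone who wants to reproduce the check, here is a minimal self-contained harness around hgm's snippet. Variable names and test values are mine; it assumes eight one-byte key extensions packed into a 64-bit word, with the sought extension replicated into every byte of xunp:

Code: Select all

#include <stdio.h>
#include <stdint.h>

/* Returns nonzero iff at least one byte of bhash equals the byte
   replicated throughout xunp (hgm's emulated "logical multiply"). */
static int any_byte_match(uint64_t bhash, uint64_t xunp)
{
    bhash ^= xunp;               /* matching bytes become 0x00 */
    bhash |= bhash << 1;
    bhash |= bhash << 2;
    bhash |= bhash << 4;
    return (bhash & 0x8080808080808080ull) != 0x8080808080808080ull;
}

int main(void)
{
    uint64_t packed = 0x1122334455667788ull;        /* 8 stored extensions */
    uint64_t hit    = 0x55 * 0x0101010101010101ull; /* 0x55 in every byte  */
    uint64_t miss   = 0x99 * 0x0101010101010101ull; /* 0x99 in every byte  */
    printf("%d\n", any_byte_match(packed, hit));    /* prints 1: found     */
    printf("%d\n", any_byte_match(packed, miss));   /* prints 0: not found */
    return 0;
}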
Last edited by mar on Mon Feb 17, 2020 10:03 pm, edited 1 time in total.
Martin Sedlak
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: hash collisions

Post by chrisw »

bob wrote: Mon Feb 17, 2020 7:30 pm
chrisw wrote: Mon Feb 17, 2020 4:00 pm
Dann Corbit wrote: Mon Feb 17, 2020 12:34 pm
chrisw wrote: Sun Feb 16, 2020 10:18 pm
hgm wrote: Sun Feb 16, 2020 7:22 pm You don't like it when your stupidity receives extra attention?
Hahahaha! 100% no bugs for chess engines is entirely possible. The mechanics of moving the six piece types in defined ways over 64 squares and performing a recursive search are quite trivial. Programmers who won't work to 100% bug-free methods are a liability and unemployable. Yes, that's my opinion, and it's pretty much all I wrote on the topic. Somehow this has tripped your ego and you resort to silly put-downs; well okay, I'm well used to dealing with that too.
The goal is not necessarily writing bug free code, for instance by program proving, because it is quite expensive and difficult to do that. The goal is to remove serious known bugs so that the program operates safely.

Failure to perform the above is simple incompetence.
Well, this is true, but.
If you want to publish, say through a Japanese publisher (because they have the most rigorous QC imaginable), then you have to make sure there is 100% no possibility your product will hang their hardware and force an end user to do a power-off/power-on restart. For Sony (e.g.) that is totally unacceptable, and they won't accept your product. They'll put your product through every imaginable thing an end user could do. I had one fail once because the play test turned every piece into a queen, asked for a move, and the search got so lost in extensions it didn't return quickly enough. For them that's a bug. Well, we fixed it and the product was accepted, but there was a huge scene with the intermediary wrap-into-graphics-game publisher, who was now under reputational threat from Sony for submitting buggy product and tried stopping final payment. Lawyers, two court hearings, I won plus costs. But even so, it's the final publisher who owns the gold and who gets to define "bug" and what constitutes 100% no bugs.
By the time you've done some work with high-QC customers (one of whom even sent their managers in to mine to oversee our first work for them, the programmer sat on day after day by a manager who checked and tested everything he did, until no bugs), you know it can be done. Programmers used to being a little careless and not being checked too much really don't like it, but it works.
OK, so NOW you change the definition of a bug and limit it to the case where it can't hang or crash the machine? Nothing wrong with that goal, but that is NOT the definition of a bug that almost 100% of programmers would accept. What about generating an extra king and rook when you make/unmake a castling move when the king and rook have moved from their original squares? Will that crash the program? If so, is it a bug? What if the program continues to play? Is it now not a bug? I can recall the 1976 ACM event where a program pushed a pawn to the 8th rank but did not promote it. The program didn't crash. It didn't blow up the game, since the opponent considered the pawn a queen, but the first program just let it sit there. David ruled that the game would continue until this caused a problem that prevented the game from continuing. Was that a bug? To me, absolutely. Based on the "crashes" standard, it would not be.

I don't think anyone agrees that some third party can define what 100% no bugs means.
Doesn't matter: if your customer says it's a bug, it's a bug. If it's a large enough customer and getting paid is important to you, you'll agree and fix it.
Bugs are defined by you, the programmer (some self-respect, presumably), by testers, QC, customers, and, if you release buggy product, by the final users. Okay, once upon a time I thought the same way as you: there are always bugs, whatever. Then our Japanese customer came over. He said he liked our products, he wanted to pay us lots of money and make contracts, but on one condition: 100% no bugs. I said that's impossible; he shrugged and said "100% no bugs". I said we can try; he said no, 100% no bugs, we send in our manager from Japan to work over your programmers. So I said OK and signed the contracts. The manager was like a whirlwind. It worked, I recognised it worked, all the programmers were replaced by ones with a good attitude, turnover tripled, contracts were falling out of our ears, everything was going out on time and thoroughly 100% no bugs, and you know the rest of the story.
It is simply not true that release software (games, chess) needs to go out bugged.

A bug has a precise definition (it violates program specifications in any way, no matter how small). So once again we are discussing something where you have a non-traditional definition of a word, and you want to use that non-traditional definition to argue about bugs. How can a resolution EVER be reached? So, back to the ACTUAL definition of "bug".
Who cares? It's just a programmer's arguing point for why he'll work to no standards other than his own. Not an interesting argument. Well, it's an academic argument really. Do you want to get paid, or do you get paid anyway?
Or as the Japanese guy used to say: Fix it. No, now. You fix now. No bugs. You work. Now.
Back that up with rigorous testing and QC in a separate department and that's it: 100% no bugs. And I'm bored with arguing with negative programmers. Everything you know about bugs is wrong.
In the case of an SMP program, an expectation could legitimately be that N processors produce the same result as the one-processor search (which takes T time) but in T/N time. No SMP program meets that specification. So we change the specification to not require T/N speed, but at least something less than the original T. Can't meet that goal either, because an occasional search will take more than T time. Not very often, but we are back to the definition of bug once again. Is this a bug? Can it be fixed? Should it be fixed? The first answer is "yes". The second answer is "yes, but at a great performance loss for all the OTHER, more normal positions." The third answer is "no". Is it reasonable to fix the specifications so that the behavior no longer falls outside them? No doubt. But given the original specifications, there is definitely a bug present.

I am MUCH more concerned about all the bugs that likely exist in Crafty that DO affect moves played (and hence affect Elo) that I do NOT know about. Because if I don't know about 'em, I have no idea how they affect the games being played. THOSE are the bugs I am addressing, not the obvious things. Crafty does a pure minimax search to order root moves. Give it a position with 9 queens on each side and that ordering search might take over a minute. A bug? Probably, since it would appear to hang at a game/1-minute time control and would lose on time. Something I am going to fix? No. It is an unnatural position that won't ever occur in a real game.

Your turn.
I don't care. I've had enough negative shite from programmers telling me XYZ is impossible to last a lifetime. "XYZ is possible. This is how. You do it." is, surprisingly, a very powerful management expression which gets things done. Just do it. You can do it.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: hash collisions

Post by Alayan »

You are making yourself ridiculous through dogmatism.

You can make bugs rare, and you can avoid egregious bugs, but 100% no-bug in a complex enough program (especially a multi-threaded program) is a pipe dream. There WILL be bugs that remain hidden no matter what QC and unit testing you apply. You can't prove there are no bugs you missed, and there will be bugs you missed.

Besides this ridiculous contention, you're also making yourself a laughing stock through your redefinition of the word "bug" and repeating how you would "fire" whoever doesn't agree with it.

I don't like hgm's method; it's a design choice I dislike and would avoid. But you don't get to call an internal behaviour you dislike a bug.

Null-move pruning leads to searching potentially impossible subtrees. Any forward pruning loses the theoretical convergence of alpha-beta to the correct result. Should a program not use these techniques because they go against a lousy definition of a "bug"? Are SMP non-deterministic results a "bug"?
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: hash collisions

Post by xr_a_y »

Code: Select all

int main(int, char**){return 0;}
Bug free ... probably...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

The problem here is that now we are using a completely new definition of bug: "any software defect detected by the end user is a bug, nothing else counts."

Question 1: given that it is impossible to produce a linear speedup with alpha/beta, if a program is 50% efficient in parallel search, meaning 8 processors search 4x faster than one, do we call that a bug? If so, we are already in a bug-free-is-impossible situation.

Question 2: a program produces a sub-optimal (by its capabilities) principal variation. Is this a bug? How would the end user or QC person even know? It is a bug to me, and to software engineering. The end user and QC people are oblivious to anything incorrect happening, so it isn't a bug to them. A middle ground: a computer chess programmer might well notice the problem, so it is a bug to him. A bug to the author, a bug to a technically competent end user, not a bug to a naive end user.

Question 3: I wrote a paper analyzing parallel alpha/beta over 35 years ago. It used traditional Knuth/Moore analysis for a serial program, plus some actual statistics from Cray Blitz (the now commonly measured "if a move fails high, how often is it the first move (good move ordering) vs. the second or later move?"; typical numbers were 90% back then, and that remains true today). So we can come up with an actual estimate of the total number of nodes a parallel search ought to traverse compared to its serial equivalent. Suppose the program exceeds this by 50%, 1 out of 20 times. Is this a bug? 1 out of 10 times? 1 out of 2 times? Suppose instead of 50% it is 100% (the tree size doubles). Is this a bug?

Some of the questions in #3 could be answered yes or no. Which means a definition of "bug" that considers only external output, output which by the nature of the game is difficult to understand completely, is pretty ambiguous. We try to avoid the #1 software killer: ambiguous or outright wrong specifications. And now we use an ambiguous definition of "correct"?
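
For concreteness, the Knuth/Moore baseline in question 3 can be computed directly. A toy calculation of that node-count estimate (a sketch with assumed numbers: b=35, d=10, and the 50% overhead threshold are illustrative, not figures from the paper):

Code: Select all

#include <stdio.h>
#include <math.h>

/* Leaf count of the Knuth/Moore minimal alpha-beta tree for uniform
   branching factor b and depth d: b^ceil(d/2) + b^floor(d/2) - 1. */
int main(void)
{
    double b = 35.0;   /* rough chess branching factor (assumed) */
    int d = 10;
    double minimal = pow(b, (d + 1) / 2) + pow(b, d / 2) - 1.0;
    double full    = pow(b, d);
    printf("minimal tree leaves: %.3g\n", minimal);
    printf("full minimax leaves: %.3g\n", full);
    printf("parallel search 50%% over baseline: %.3g leaves\n", 1.5 * minimal);
    return 0;
}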

As I have stated, the word 100% directly implies that there is a proof of no bugs. In a traditional proof you have to show that the claim is true 100% of the time, no exceptions; we prove theorems in such a way, and if there is any exception, the proof is invalid. I don't see how it is even possible to declare that a chess program has no errors. The only possible approach would be to provide every possible input and verify that every output is exactly correct, that is, that there are no exceptions to the hypothesis that no bug is present. That can't be done. Hence the claim can't be proven, and there, Q.E.D., the proof fails and the claim is invalid. Note that the program MIGHT be bug-free. But until that is actually verified by testing against all possible inputs and timing variations, the "proof" is actually just an opinion.

One example: one good way to test a change is to compare the output of the new program against the old, to see if the change broke anything (the GCC folks do this with lots of programs to catch regressions). But now let's take this to a parallel search program. With a frequency higher than you might expect, the parallel program will produce a different BEST move. In some cases this is due to transpositions that lead to the same position: the serial program always searches the first move in the list first, then the second; if you split at the root, the race is on to see which thread finishes first and establishes alpha, and the other will then fail low when trying to back up over this move. Not a big deal. But with some frequency we see the score change each time we run the same position to the same depth. And sometimes we see a completely different best move displayed.

Here's an example: a position (wac230) searched to depth=30 on my MacBook, SMP search enabled for both runs. Everything is identical between the two runs:

30-> 6.73/18.00 -1.91 1. ... Rh7 2. Rb1 Bd7 3. Ba3 Be8 4. Kg2
Rh4 5. f3 Rh5 6. f4 Rh4 7. Kg3 Rh7 8. Kg2
Rh4 9. Kg3 Rh8 10. Kg2 Rh4 11. Kg3 Rh5
12. Kg2 Rh4 13. Kg3 Rh7 14. Kg2 Rb7
15. Bb2 Rf7 16. Kf3 Rh7 17. Kg2 Rh4
18. Kg3 Rh5 19. Kg2 a4 20. Rh1 Rxh1
21. Kxh1


30-> 8.29/18.90 -1.85 1. ... Rh7 2. Rb1 Rh5 3. f4 Rh8 4. Ba3 Ba6
5. Kg2 Rb8 6. Kf3 a4 7. Kg3 Rh8 8. Kg2 Rh4
9. Rf1 Bc8 10. Bc1 Rh7 11. Rh1 Ra7 12. Ba3
Rb7 13. Bb2 Bd7 14. Kf3 Rb5 15. Ke3 Ra5
16. Ba3 Rb5

Notice that (a) both runs chose the same move, (b) the PVs diverge at move 3, and (c) the scores differ. If you look closely, you will see some moves transposed further out in the PVs. Do we call this a bug or not? I do not, as I understand the issues of searching moves out of order and hash-table grafting, and I am used to seeing this. What about the end user? It would probably be considered a bug, since the same input should not produce different outputs unless there is an intentional design that includes some sort of randomness.

This is a classic pedantic argument.

There is a third type of "bug". Back in the 80's, Sun released a version of sendmail with DEBUG enabled. It was a feature they added to make it easy to test sendmail rules and such: sendmail could be sent a script that the other end would execute, normally used to send an email back. A couple of students discovered this and thought it would be cute to produce a ripple on the internet. The script they wrote simply grabbed the /etc/hosts file and then sent an email to each of the remote hosts, expecting them to pass the emails on to the next "layer", one level removed. This would ripple throughout the internet (the machines running Sun would participate, and the machines not running Sun would still have to handle the messages). Simple idea, with one MAJOR failing. Suppose you are connected to a host that has connections to ten hosts, ONE of which is you. You send out 10 emails, one to each host. You get TEN back, and when you run the script, you now send 10 emails to each of the hosts you can reach, while all of the hosts you can reach are busy with the same task. Rather than a single ripple, it produced a tsunami. It basically brought the internet down due to the incredible sendmail traffic, and it took a few days to clean up: all it took was missing one host, and the problem would start over. Here's the question: was this a bug? An unintended consequence that should have been recognized and avoided? To me it wasn't a bug, since sendmail did exactly what it was told to do. Sun screwed up by releasing a debug version of sendmail, and the CS students screwed up by ignoring the reflection problem. They could have written the script to NOT send it back to the "From:" host, and that would have fixed it. To me: bug=no, stupidity=yes.
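
A toy back-of-the-envelope model of that amplification (a sketch only; the host count and generation count are illustrative, with no claim about the real 1980s topology):

Code: Select all

#include <stdio.h>

/* Every script run mails all N hosts in /etc/hosts, including the
   sender, so each delivered message triggers N more. */
int main(void)
{
    const double N = 10.0;  /* hosts known to each machine   */
    double msgs = 1.0;      /* generation 0: the first script */
    for (int gen = 1; gen <= 5; gen++) {
        msgs *= N;
        printf("generation %d: %.0f messages in flight\n", gen, msgs);
    }
    return 0;  /* 10 -> 100 -> 1000 ... a tsunami, not a ripple */
}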

Another type of "bug": Dodge released the infamous Hellcat Challenger. It comes with two keys, a black one and a red one. The red key unlocks the Hellcat's 707 horsepower; the black key limits it to 500. Either is incredibly powerful. Now take someone who has never driven anything near that performance level. They use the red key, but suppose Chrysler had a bug where either key enabled only 500 horsepower. This inexperienced driver gets in, punches it, and says WOW, this 700-horsepower thing is a monster. No idea he is 200 horses shy. Is this a bug? The end user has no idea, but obviously it is a flaw in the software that needs fixing. (This is hypothetical, of course; it never happened. The red/black key is completely factual, there is just no such bug in the car's software.)

There are many ways for the testing/verification process to fall short. Can you be sure you have a set of testers that can recognize every type of failure? Hard to do when you don't know what the failures will be.

<etc>
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

xr_a_y wrote: Mon Feb 17, 2020 11:27 pm

Code: Select all

int main(int, char**){return 0;}
Bug free ... probably...
Yep, and you can prove it, with a precise specification and a software verification package. That's the fly in this ointment: you can verify small programs to 100% reliability (that the program exactly meets the specification). But the key word is "small", and forget about "parallel".

BTW, that might not produce the result you expect. I have worked on a couple of computers where a return of zero indicated failure and non-zero indicated success when doing this stuff from some sort of shell script. And even in C programs we have quirks with this (see strcmp(): 0 = match, non-zero = no match).
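
A minimal example of that strcmp() convention:

Code: Select all

#include <stdio.h>
#include <string.h>

/* strcmp() returns 0 on a match, so treating its result as a
   boolean "equal" inverts the test. */
int main(void)
{
    const char *a = "yes", *b = "yes";
    if (strcmp(a, b) == 0)
        printf("correct: strings match when strcmp returns 0\n");
    if (!strcmp(a, b))
        printf("also idiomatic, but easy to misread as \"not equal\"\n");
    return 0;
}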
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: hash collisions

Post by chrisw »

bob wrote: Mon Feb 17, 2020 11:58 pm The problem here, is that now we are using a completely new definition of bug, "any software defect detected by the end user is a bug, nothing else counts." <snip>
If you put as much effort into producing a 100% bug-free product as you do into convincing everybody and yourself that this is impossible, you'd be well on the way to a 100% bug-free product.

I would guess that both you and hgm are heavily invested in the idea of the chess engine as a complex, difficult beast that requires great creativity and smartness to even begin to tame. Bugs are natural because of the great complexity, bla-di-bla.
Really, we should see computer chess as a religion.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

So, as usual, you want to just hand-wave the discussion away. This is a major problem. The entire field of software engineering sprang up to address it, and today the best SE can do is REDUCE the number of bugs that slip by. It is impossible to eliminate them completely; if we could, that would be the holy grail of computer science.

Do some reading. Learn some facts. Then you will begin to understand the problems. Having some foreign hot-shot come in and yell is NOT going to eliminate all bugs. The only way to do that is to eliminate all programmers, which would eliminate all programs, which is the only way to eliminate all bugs.