hash collisions

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

Personally, I have NEVER released a piece of code with known bugs, or with even suspected bugs (testing simply continued, to try to track them down). That being said, I have certainly released things with unknown bugs.

The most painful to me was in the 1970's. I had written about 100,000 lines of assembly language for the Xerox Sigma 9, deep in the operating system. The main goal was to move ALL of the character-oriented hardware handling out of the operating system and into an attached computer, to offload all the interrupt handling (each character typed produced two interrupts, one when the character was received in the Sigma 9 and one when it was echoed back to the user). I had tested and tested this code, night after night, 10pm to 8am, for several months. We decided to "go live" on the campus computer on a Monday, to test in a real user environment before handing it over to Telefile (which had taken over the Xerox computer product line).

Damn thing was booted at 8am, and by 8:15am it had crashed. A couple more attempts, a couple more crashes, a couple more HUGE core dumps, and off I went to test. The problem was simple. We had 64 serial lines on the I/O processor; we had only had 32 on the original Sigma 9 hardware. Nobody noticed, when we generated the operating system (anyone remember the term sysgen, or system generation, from the early IBM days?), that my helper had left the number of serial connections at 32, which dictated the size of lots of data structures. Meanwhile, back at the ranch, the code for the front-end processor was set to 64 serial lines. When the 33rd user logged in, big problem, as things started to get overwritten when data structures were indexed beyond the end of their data. We had tested night after night after night. Lesson learned.
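
To make the failure mode concrete, here's a minimal C sketch of two components disagreeing on one configuration constant (hypothetical names and types, of course; the real code was Sigma 9 assembly):

Code: Select all

/* sysgen left the OS at the old line count... */
#define OS_NUM_LINES   32
/* ...while the front-end processor was built for 64 */
#define FEP_NUM_LINES  64

struct line_state { int echo_pending; };

static struct line_state lines[OS_NUM_LINES];   /* sized by the OS constant */

/* line_no arrives from the front end, so it ranges 0..FEP_NUM_LINES-1 */
void on_char_interrupt(int line_no)
{
    /* no bounds check: the 33rd line silently overwrites whatever
       happens to live in memory after lines[] */
    lines[line_no].echo_pending = 1;
}

One shared header (or one sysgen parameter feeding both builds) removes that whole class of bug.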

After I moved to UAB, we ran into a similar problem. Except this time we used a second computer to generate commands and such, to simulate 64 simultaneous users and make sure no surprises lurked. We could leave this running all night, as opposed to trying to find 64 users to sit at terminals all night and do different things. Not good enough. The "load simulator" was not random enough, even though I had put in a number of random delays. The load simulator would run fine; real users crashed the system as load picked up. A couple of race conditions in an interrupt handler required really unusual conditions before they would cause a problem.

Cray Blitz. Solid chess program; by 1993-1994 it was over 25 years old already. The parallel search stuff had been done back in 1983. So 11 years of never screwing up in a game. UNTIL we moved to the new Cray Y-MP in the late 80's. That machine had both 8 processors AND a clock about twice as fast as the original Cray. When we used it in the ACM tournament, we almost lost a game; only some fortuitous debugging code saved us. A race condition let the search go much deeper than intended. Deep enough to run out the end of arrays indexed by ply, which should never have exceeded 64. We "added kings" due to that problem, and CB was NOT happy with multiple kings. But some debugging code left in by accident looked at the board after an iteration, and if it found something wrong, it simply restored the position to what it was at the start of the search to allow debugging to continue. Our first hint was when we started seeing things like Kg1-h1, Kb8-a8, Kf4-f5, etc. We were wondering where all the kings came from. The race condition had never exposed itself, mainly because 2 or 4 processors made it very unlikely. But with 8, suddenly it was more likely than not that the bug would appear. Even though the program had been tested over and over, had played in a dozen ACM events, etc. Race conditions are nasty, hard to find, and often lie dormant until the worst possible time.
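
Races like that are easy to state and miserable to catch. A minimal sketch (hypothetical illustration only, NOT the Cray Blitz code) of the check-then-act pattern that can blow past a limit, and of why more processors make it more likely:

Code: Select all

#include <pthread.h>
#include <stdio.h>

#define LIMIT   64
#define THREADS 8          /* try 2 vs. 8: the overshoot gets far likelier */

static volatile int depth = 0;   /* shared, deliberately unprotected */

static void *extend(void *arg)
{
    (void)arg;
    while (depth < LIMIT)        /* check ...                         */
        depth = depth + 1;       /* ... then act: not atomic together */
    return NULL;
}

int main(void)
{
    pthread_t t[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, extend, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    printf("final depth: %d\n", depth);   /* can print more than 64 */
    return 0;
}

Two threads can both pass the check at 63, and one of them then re-reads a value the other already bumped, so the final value exceeds the limit. Run it thousands of times with 2 threads and it may never misbehave; that's the "dormant" part.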

As Dijkstra said, "testing can show the presence of bugs, but never the absence of bugs."

I should add, ALL of the above bugs were fixed and tested. But they were there in spite of due diligence in testing. Ed has an odd definition of "no bugs". I assume ALL of the above would qualify as "bug-free" by his definition, since the bugs were definitely fixed. But that is not _my_ definition of bug-free. I would simply refer to them as "after testing, no known bugs exist." Which is a LONG way from saying "after testing, NO bugs exist."

I never tolerate bugs of any kind. And I also do NOT write code that is buggy by nature. But I, like most everyone else, realize that our code can STILL contain undiscovered bugs no matter how much we test.

BTW, here's a funny (but true) story. You decide whether it was a bug or not. When I was working on my computer science degree, a friend of mine was one year ahead of me, but we worked together all the time. One day he came in smiling and I asked him "what's with the Cheshire cat grin?" (a reference to Alice in Wonderland). He replied "I've found a source of funding that is easy to generate." I asked "explain?"

He said OK, I joined the Book of the Month Club. Ever heard of those? I said yep, you join, and every month they send you a book that you can keep and pay for, or else send back. He said "right". He said that was step 1. Then he asked "Do you remember in the COBOL class back in the summer we discovered that the "zone punch" (an overpunch over the least significant digit) was the way to enter a negative number on the IBM systems using packed decimal numbers, which were about all that were used in Data Processing at the time?"

He said "OK, here's what happened. When the books arrived, they came with an IBM-type punched card (holes in 'em but blank otherwise), no printing. I simply ran back to the card punch room, and ran it through the 029 interpreting keypunch machine. Which made a duplicate AND printed the character in each column of punches along the top of the card. I discovered that it contained my account number (as shown on mail that came from the club from time to time.) It also contained the title of the book. AND the price. I took the card, put it in a keypunch machine, turned printing off so there would be no indication of a change, and then I over punched the "zone code" to make the price of the book negative. I kept the book, sent the card back to the club, and next month I received (a) a new book and (b) a check for the amount of the book I had supposedly bought the previous month. But since it was negative, thanks to that zone overpunch, they gave me a credit balance and refunded it no questions asked. I then sold the book to friends for ½ price and ended up making 150% of the price of the book.

Obviously that was a pretty stupid approach. But it was the late 60's. No online databases. No huge customer databases, back when 7MB disks were expensive and small. Bug or not? The program worked exactly as it was supposed to. Nobody ever considered that the price could come back as a negative number, courtesy of a clever programmer. Design flaw? Just one of those things?

I did tell him "this is going to haunt you. They are not going to remain idiots forever." It did, and the last I heard before he graduated was that they were "in negotiations about this." They ended up hiring him. So the story had a happy ending. Bugs are bugs. There have been bugs since early life started on planet Earth. Bugs remain today, both software and real bugs. There is no future in sight where they will no longer exist, until Sol becomes a supernova.
Ras
Posts: 2495
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: hash collisions

Post by Ras »

bob wrote: Sun Feb 23, 2020 5:09 am Deep enough to run out the end of arrays indexed by ply, which should never have exceeded 64.
Defensive coding practice would have prevented this using some check of depth vs. MAX_DEPTH in the recursion. Was that left out deliberately to squeeze some last drop of performance?
Rasmus Althoff
https://www.ct800.net
chrisw
Posts: 4346
Joined: Tue Apr 03, 2012 4:28 pm

Re: hash collisions

Post by chrisw »

bob wrote: Sun Feb 23, 2020 5:09 am

BTW, here's a funny (but true) story. You decide whether it was a bug or not. When I was working on my computer science degree, a friend of mine was one year ahead of me, but we worked together all the time. One day he came in smiling and I asked him "what's with the Cheshire cat grin?" (a reference to Alice in Wonderland). He replied "I've found a source of funding that is easy to generate." I asked "explain?"

He said OK, I joined the Book of the Month Club. Ever heard of those? I said yep, you join, and every month they send you a book that you can keep and pay for, or else send back. He said "right". He said that was step 1. Then he asked "Do you remember in the COBOL class back in the summer we discovered that the "zone punch" (an overpunch over the least significant digit) was the way to enter a negative number on the IBM systems using packed decimal numbers, which were about all that were used in Data Processing at the time?"

He said "OK, here's what happened. When the books arrived, they came with an IBM-type punched card (holes in 'em but blank otherwise), no printing. I simply ran back to the card punch room, and ran it through the 029 interpreting keypunch machine. Which made a duplicate AND printed the character in each column of punches along the top of the card. I discovered that it contained my account number (as shown on mail that came from the club from time to time.) It also contained the title of the book. AND the price. I took the card, put it in a keypunch machine, turned printing off so there would be no indication of a change, and then I over punched the "zone code" to make the price of the book negative. I kept the book, sent the card back to the club, and next month I received (a) a new book and (b) a check for the amount of the book I had supposedly bought the previous month. But since it was negative, thanks to that zone overpunch, they gave me a credit balance and refunded it no questions asked. I then sold the book to friends for ½ price and ended up making 150% of the price of the book.

Obviously that was a pretty stupid approach. But it was the late 60's. No online databases. No huge customer databases, back when 7MB disks were expensive and small. Bug or not? The program worked exactly as it was supposed to. Nobody ever considered that the price could come back as a negative number, courtesy of a clever programmer.
Revealing, that you describe an act of uttering a forged document and fraud for pecuniary advantage as a "bug", "funny", and the work of "a clever programmer". That sort of white-collar crime is a serious imprisonable offence. It's not at all clever; it's just forgery, using a card punch rather than a pen or a printer. And "a source of funding that is easy to generate" is a serial offence.

Obviously, in your world "programmers, friends of yours" can do no wrong. Even in the face of clear fraud. You even boast about it. Character assessment and ethics not your strong point.

Design flaw? Just one of those things?
Criminal activity.

I did tell him "this is going to haunt you. They are not going to remain idiots forever." It did, and the last I heard before he graduated was that they were "in negotiations about this." They ended up hiring him. So the story had a happy ending. Bugs are bugs.
Yeah, right. It's the book publisher's fault, for having a "bug". Actually they had a criminal user, one of your friends. Did you try the same thing yourself?

There have been bugs since early life started on planet Earth. Bugs remain today, both software and real bugs. There is no future in sight where they will no longer exist, until Sol becomes a supernova.
User avatar
hgm
Posts: 27846
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: hash collisions

Post by hgm »

You seem to miss the point. Resistance to fraud should be one of the key specifications of any system for dealing with money. It is a well-known fact that criminals do exist, no matter how much we might dislike that. Failing to care about fraud resistance is negligence.

There are societies in which actions that encourage crime are a punishable offense themselves. I think one can be fined here for leaving one's car unlocked, because that encourages theft. (I am not sure if that still applies; I never had a car.) That doesn't make ransacking unlocked cars legal here.

And yeah, criminals can be clever. So we'd better make sure we are clever in protecting ourselves from them. Making it easy for them, just because they are not supposed to do what they do, would be really stupid. So yes, the book publisher is at fault. You could argue that it is his own money he is putting up for grabs (like you would be putting your own unlocked car on offer...), so that it is OK for him to do this. But of course the money lost this way doesn't really come out of the pocket of the person who designed this system. It is the one who set this up that is guilty of criminal negligence w.r.t. the publisher (be it his client or employer).
User avatar
Rebel
Posts: 7025
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: hash collisions

Post by Rebel »

abulmo2 wrote: Sat Feb 22, 2020 11:08 pm
Rebel wrote: Sat Feb 22, 2020 9:56 pm As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release.
In writing Amoeba, as a lone individual, I spent 99.9% of my time testing and 0.1% writing code. I have limited financial resources, but I have an invaluable resource: time.
Can you elaborate on what the testing is about?
90% of coding is debugging, the other 10% is writing bugs.
User avatar
Rebel
Posts: 7025
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: hash collisions

Post by Rebel »

bob wrote: Sun Feb 23, 2020 3:06 am
Rebel wrote: Sat Feb 22, 2020 9:56 pm As far as I am concerned the discussion is about the possibility of bug-free software. From recent previous posts:

Me:
Rebel wrote: Fri Feb 21, 2020 11:04 pm 2. Unless I misunderstand, you pretend it's impossible to write 100% bug free code. I think you are wrong.
You:
bob wrote: Sat Feb 22, 2020 12:08 am 2) You are correct, except that (a) I don't pretend, I know ...
As such it doesn't make sense to list examples of inferior software; bugs are part of programming, and to assume (or in your words, to know) that not all of them can be fixed is a lack of imagination or an underestimation of the capabilities of others. As lone individuals writing chess engines we don't have the resources for expensive, time-consuming test methodologies and beta-test procedures before we release. (Y)our experiences as lone individuals are not a good measuring rod.
Not sure how to proceed.
start:

Me neither.

You think it's impossible that 100% bug-free software exists. How odd.
You said "banks don't tolerate errors" or something to that effect. So I gave some counter-examples where really BIG banks had been bitten by really serious bugs.
So what?

They fix it.

What's so hard to understand?

I've already given quotes showing that the consensus is that you can't test away all bugs.
Don't call cherry-picked quotes consensus; you, me and the rest of us have written bug-free code playing legal moves. Or do you think Crafty in theory can play an illegal move?

Do you think make_move and undo_move are bug-free? Or could Crafty in theory crash?
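
That claim is at least mechanically checkable. A minimal sketch of a make/undo round-trip test; Board, Move, generate_moves, make_move and undo_move here are hypothetical stand-ins for an engine's own types and routines, not Crafty's:

Code: Select all

#include <stdio.h>
#include <string.h>

typedef struct { int squares[64]; int side; } Board;  /* stand-in */
typedef int Move;                                     /* stand-in */

/* supplied by the engine under test */
int  generate_moves(const Board *b, Move *out);
void make_move(Board *b, Move m);
void undo_move(Board *b, Move m);

/* For every legal move: make it, undo it, and demand the position
   comes back bit-identical. Run this over millions of positions. */
int check_make_undo(Board *b)
{
    Move moves[256];
    int n = generate_moves(b, moves);
    Board before = *b;                    /* snapshot the full position */
    for (int i = 0; i < n; i++) {
        make_move(b, moves[i]);
        undo_move(b, moves[i]);
        if (memcmp(b, &before, sizeof *b) != 0) {
            printf("make/undo mismatch on move %d\n", i);
            return 0;
        }
    }
    return 1;
}

A perft run against known node counts catches the move generator the same way; neither proves the absence of bugs, but both shrink the space where one can hide.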


BUT, just to come back around full circle, "debugged chess program." NOW you say "As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release. (Y)our experiences as lone individuals are not a good measuring rod." So which is it? Chess programs have bugs? Or they don't have bugs. Can't possibly be both. I claim they do. You just explained why they do. So why the argument???
Proof we are talking about different things.

goto start;
90% of coding is debugging, the other 10% is writing bugs.
User avatar
hgm
Posts: 27846
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: hash collisions

Post by hgm »

Yes, YOU are talking about different things. The issue (between Bob and Chris) was whether release versions of complex software are always 100% bug free.

You are talking about whether bugs will eventually be fixed. Not relevant to their discussion.
chrisw
Posts: 4346
Joined: Tue Apr 03, 2012 4:28 pm

Re: hash collisions

Post by chrisw »

hgm wrote: Sun Feb 23, 2020 2:08 pm Yes, YOU are talking about different things. The issue (between Bob and Chris) was whether release versions of complex software are always 100% bug free.
Then you're not following.
The issue is about whether chess engines are so complex that they cannot be rendered free of bugs. Bob argues they are sufficiently complex that it is not possible for ANY chess engine to be 100% bug free.

Chris is arguing that the domain of chess engines is sufficiently non-complex that it is possible, with the right attitude and QC and methods, to render ONE chess engine 100% bug free within a meaningful timeframe. Chris is also arguing that if it can be done for ONE engine then it can be done for ALL.

You are talking about whether bugs will eventually be fixed. Not relevant to their discussion.
Of course it is relevant. Once there were bugs (a feature of developing any chess engine), appropriate QC and methodology will remove those bugs (aka chess engine bugs will, eventually or quickly, depending, be fixed) until 100% bug-free status is attained. Computer chess tends to use the end users as the "final" beta test before "release"; often there is no "release" at all, just future versions, each also "released" with effectively beta-test status.

Both Ed and I, also others, have different development experiences to what you are perhaps familiar with. Ed's engines that went into portable electronic devices needed a way higher standard of testing than beta-test. Engines from my company that were externally published on hard media, likewise. Nobody wants thousands, or tens of thousands, or hundreds of thousands of manufactured, boxed, wrapped and shipped products to be recalled because of some stupid fault. Publishing companies that ship their own product by CD or download can afford to be a little more cavalier, because the cost of recall is not so high. Lone programmers who ship free software can be more cavalier still. Some of us HAVE to make sure that stuff works, or else we pay a big price. Thus, we do make sure. Other programmers rely on assuring their customers/end users that "bugs are inevitable".
User avatar
hgm
Posts: 27846
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: hash collisions

Post by hgm »

So your point is really that all bugs that exist in released versions of important software can be fixed, once they are discovered.

I don't think that is really worth arguing about; it seems self-evident. Bob (and others) are only arguing here that complex software in general will contain undiscovered bugs, and it is equally self-evident that you cannot fix what you do not know.

As I already pointed out, the claim that all chess engines must contain bugs has already been refuted by the micro-Max example. Micro-Max is simple enough. So this is no longer a point of discussion; there do exist very simple chess programs.

Problems for which no known fix exists are not considered bugs, but "known limitations". A compiler does not in general warn you about program constructs that would loop indefinitely; it has been mathematically proven that no algorithm exists that could do that (the halting problem for Turing machines). So we just have to live with the fact that our programs sometimes unexpectedly hang at run time, if we goofed.
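
A classic concrete case (a sketch; whether this loop halts for every input is the still-open Collatz conjecture, so no compiler diagnostic could decide it in general):

Code: Select all

#include <stdio.h>

int main(void)
{
    unsigned long n;
    if (scanf("%lu", &n) != 1 || n == 0)
        return 1;
    /* nobody has proven this loop terminates for all n, and no
       algorithm can decide termination for programs in general */
    while (n != 1)
        n = (n % 2) ? 3 * n + 1 : n / 2;
    printf("reached 1\n");
    return 0;
}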
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

Ras wrote: Sun Feb 23, 2020 7:01 am
bob wrote: Sun Feb 23, 2020 5:09 am Deep enough to run out the end of arrays indexed by ply, which should never have exceeded 64.
Defensive coding practice would have prevented this using some check of depth vs. MAX_DEPTH in the recursion. Was that left out deliberately to squeeze some last drop of performance?
Yes. We had such a check, but at a place where it didn't impact performance (just as Crafty has it today). I think that in the assembly version of search we did not do this check, because a memory reference was anything but free for the first reference to a specific area of memory; successive words would be chained in by the vector hardware. I remember testing, and we could come nowhere near that limit. Until the search screwed up. :)
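
For the curious, a minimal sketch of the kind of guard under discussion; Position, evaluate and MAX_PLY here are hypothetical stand-ins, not Cray Blitz or Crafty code:

Code: Select all

#define MAX_PLY 64

typedef struct Position Position;
int evaluate(const Position *pos);

/* One compare at the top of the recursion keeps every ply-indexed
   array in bounds; if even that is too costly, the check can be
   hoisted out to the iteration level, as described above. */
int search(Position *pos, int alpha, int beta, int depth, int ply)
{
    if (ply >= MAX_PLY || depth <= 0)
        return evaluate(pos);
    /* ... generate moves, recurse with ply + 1, update alpha ... */
    return alpha;
}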