hash collisions

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

chrisw wrote: Sun Feb 23, 2020 10:39 am
bob wrote: Sun Feb 23, 2020 5:09 am Personally, I have NEVER released a piece of code with known bugs. Or with even suspected bugs (testing simply continued to try to track them down.) That being said I have certainly released things with unknown bugs.

The most painful to me was in the 1970's. I had written about 100,000 lines of assembly language for the Xerox Sigma 9. Deep in the operating system. Main goad was to move ALL of the character-oriented hardware stuff out of the operating system and out into an attached computer to offload all the interrupt handling (each character types produced two interrupts, one for when the character was received in the Sigma 9 and one for when it was echoed back to the user. I had tested and tested this code, night after night, 10pm to 8am. For several months. We decided to "go live" on the campus computer on a Monday to test in a real user environment before handing it over to Telefile (which had taken over the xerox computer product line.

Damn thing was booted at 8am, and by 8:15 am it had crashed. A couple more attempts, a couple more crashes, couple more HUGE core dumps and off I went to test. Problem was simple. We had 64 serial lines on the I/O processor. We had only 32 on the original sigma 9 hardware. Nobody notice when we generated the operating system (anyone remember the term sysgen or system generation from the early IBM days?) my helper had left the number of serial connections at 32, which dictated the size of lots of data structures. Meanwhile, back at the ranch, the code for the front-end processor was set to 64 serial lines. When the 33rd user logged in, big problem as things started to get overwritten as data structures were indexed beyond the end of their data. We had tested night after night after night. Lesson learned.

After I moved to UAB, we ran into a similar problem. Except this time we used a second computer to generate commands and such to simulate 64 simultaneous users, to make sure no surprised lurked. We could leave this running all night as opposed to trying to find 64 users to sit at a terminal all night and do different things. Not good enough. The "load simulator" was not random enough, even though I had a number of random delays. Load simulator would run fine, real users crashed the system as load picked up. Couple of race conditions in an interrupt handler that took really unusual conditions before they would cause a problem.

Cray Blitz. Solid chess program, in 1993-1994 it was over 25 years old already. The parallel search stuff had been done back in 1983. So 11 years of never screwing up in a game. UNTIL we moved to the new Cray YMP in the late 80's. That machine had both 8 processors, AND a clock about twice as fast as the original Cray. When we used it in the ACM tournament, we almost lost a game if not for fortuitous debugging code. A race condition let the search go much deeper than intended. Deep enough to run out the end of arrays indexed by ply, which should never have exceeded 64. We "added kings" due to that problem, and CB was NOT happy with multiple kings. But some debugging code left in by accident looked at the board after an iteration, and if it found something wrong, it simply restored the position to what it was at the start of the search to allow debugging to continue. Our first hint was when we starting seeing things like Kg1-h1, Kb8-a8, Kf4-f5, etc. We were wondering where all the kings came from. The race condition had never exposed itself, mainly due to 2 or 4 processors making it very unlikely. But with 8, suddenly it was much more likely than unlikely the bug would appear. Even though the program had been tested over and over, had played in a dozen ACM events, etc. Race conditions are nasty, hard to find, and often lay dormant until the worst possible time.

As Dijkstra said, "testing can show the presence of bugs, but never the absence of bugs."

I should add, ALL of the above bugs were fixed and tested. But they were there in spite of due diligence in testing. Ed has an odd definition of "no bugs". I would assume ALL of the above would qualify as "bug-free" since the bugs were definitely fixed. But that is not _my_ definition of bug-free. I simply would refer to them as "after testing, no known bugs exist." Which is a LONG way from saying "after testing, NO bugs exist."

I never tolerate bugs of any kind. And I also do NOT write code that is buggy by nature. But I, like most everyone else, realize that our code can STILL contain undiscovered bugs no matter how much we test.

BTW, here's a funny (but true) story. You decide whether it was a bug or not. When I was working on my computer science degree, a friend of mine was one year ahead of me, but we worked together all the time. One day he came in smiling and I asked him "what's with the cheshire cat grin?" (reference to Alice in Wonderland.) He replied "I've found a source of funding that is easy to generate. I asked "explain?"

He said OK, I joined the book of the month club. Ever heard of those? I said yep, you join, every month they send you a book that you can keep and pay for, or else send it back. He said "right". He said that was step 1. Then he asked "Do you remember in the COBOL class back in the summer we discovered that the "zone punch" (an over punch over the least significant digit" was the way to enter a negative number on the IBM systems using packed decimal numbers, which were about all that were used in Data Processing at the time.)

He said "OK, here's what happened. When the books arrived, they came with an IBM-type punched card (holes in 'em but blank otherwise), no printing. I simply ran back to the card punch room, and ran it through the 029 interpreting keypunch machine. Which made a duplicate AND printed the character in each column of punches along the top of the card. I discovered that it contained my account number (as shown on mail that came from the club from time to time.) It also contained the title of the book. AND the price. I took the card, put it in a keypunch machine, turned printing off so there would be no indication of a change, and then I over punched the "zone code" to make the price of the book negative. I kept the book, sent the card back to the club, and next month I received (a) a new book and (b) a check for the amount of the book I had supposedly bought the previous month. But since it was negative, thanks to that zone overpunch, they gave me a credit balance and refunded it no questions asked. I then sold the book to friends for ½ price and ended up making 150% of the price of the book.

Obviously that was a pretty stupid approach. But it was the late 60's. No online data bases. No huge customer databases when 7mb disks were expensive and small. Bug or not? program worked exactly as it was supposed to. Nobody ever considered that the price could come back as a negative number by a clever programmer.
Revealing that you describe an act of uttering a forged document and fraud for pecuniary advantage as "bug", "funny" and "by a clever programmer". That sort of white collar crime is a serious imprisonable offence. It's not at all clever, it's just forgery using a punch card writer rather than a pen or a printer. "Source of funding, easy to generate" is a serial offence.

Obviously, in your world "programmers, friends of yours" can do no wrong. Even in the face of clear fraud. You even boast about it. Character assessment and ethics not your strong point.

Design flaw? Just one of those things?
Criminal activity.

I did tell him "this is going to haunt you. They are not going to remain idiots forever." It did and the last I heard before he graduated was that they were "in negotiations about this." They ended up hiring him. So the story had a happy ending. Bugs are bugs.
Yeah, right. It's the book publishers fault, for having a "bug". Actually they had a criminal user, one of your friends. Did you try the same thing yourself?

There have been bugs since early life started on planet earth. Bugs remain today. Both software and real bugs. No future in sight where they will no longer exist, until Sol becomes a super-nova.
You REALLY one of those that can't see the forest for the trees, eh? I was NOT commenting on the guy that exploited the hole. I was commenting on the hole. JUST the hole. I'd bet 99% of the people here would realize that and also realize that there was no judgement about the student that pulled this off. NONE.

Would you not call hacking into web sites to expose user data or obtain credit card numbers as "criminal activity?" MANY bugs allow criminal activity. So what? They are STILL bugs produced at some point in the process from inception to delivery.

The issue was "was this a software bug?" was it a design flaw? was it an incomplete/ambiguous specification document? How you get off into criminal statutes and such is a complete mystery.

Can't see where I "boasted about it" at all. I explained a pretty serious issue. That was in that software for 4-5 years. Testing did not expose it. Walk-throughs did not catch it. Design reviews did not catch it. Specification validation did not catch it.

No wonder every conversation with you degenerates into an argument. It is apparently what YOU want...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

hgm wrote: Sun Feb 23, 2020 11:08 am You seem to miss the point. Resistance to fraud should be one of the key specifications of any system for dealing with money. It is a well-known fact that criminals do exist, no matter how much we might dislike that. Failing to care about fraud resistance is neglicence.

There are societies in which actions that encourage crime are a punishable offense themselves. I think one can be fined here for leaving one's car unlocked, because that encourages theft. (I am not sure if that still applies; I never had a car.) That doesn't make ransacking unlocked cars legal here.

And yeah, criminals can be clever. So we'd better make sure we are clever in protecting ourselves from them. Making it easy for them, just because they are not supposed to do what they do would be really stupid. So yes, the book publisher is at fault. You could argue that it is his own money he is putting up for grabs (like you would be putting your own unlocked car on offer...), so that it is OK for him to do this. But of course the money lost this way doesn't really come out of the pocket of the person that designed this system. It is the one that set this up that is guilty of criminal neglicience w.r.t. the publisher (be it his client or employer).
"Miss the point?" :) :) :)

He doesn't even recognize the existence of any "point" whatsoever. He is only looking for something to use to insult, besmirch, etc...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

Rebel wrote: Sun Feb 23, 2020 12:39 pm
bob wrote: Sun Feb 23, 2020 3:06 am
Rebel wrote: Sat Feb 22, 2020 9:56 pm As far as I am concerned the discussion is about the possibility of bug-free software. From recent previous posts:

Me:
Rebel wrote: Fri Feb 21, 2020 11:04 pm 2. Unless I misunderstand, you pretend it's impossible to write 100% bug free code. I think you are wrong.
You:
bob wrote: Sat Feb 22, 2020 12:08 am 2) You are correct, except that (a) I don't pretend, I know ...
As such it doesn't make sense to list examples of inferior software, bugs are part of programming, to assume (or in your words to know) not all can't fixed is a lack of imagination or underestimation of the capabilities of others. As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release. (Y)our experiences as lone individuals are not a good measuring rod.
Not sure how to proceed.
start:

Me neither.

You think it's impossible 100% bug free software exists, how odd.
You said "banks don't tolerate errors" or something to that effect. So I gave some counter-examples where really BIG banks had been bitten by really serious bugs.
So what?

They fix it.

What's so hard to understand?

I've already given quotes showing that the consensus is that you can't test away all bugs.
Don't call cherry picked quotes consensus, you, me and the rest of us have wriiten bug-free code playing legal moves. Or do you think Crafty in theory can play an illegal move?

Do you think, make_move and undo_move are bug-free? Or could Crafty in theory crash?


BUT, just to come back around full circle, "debugged chess program." NOW you say "As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release. (Y)our experiences as lone individuals are not a good measuring rod." So which is it? Chess programs have bugs? Or they don't have bugs. Can't possibly be both. I claim they do. You just explained why they do. So why the argument???
Proof we are talking about different things.

goto start;
Let's get this right. Not just I think that bug-free code doesn't exist. A GREAT MANY computer scientists agree. In fact I can't think of a single person that has not agreed with this. It came to a head when "A software crisis" was published. If you google "software crisis" you will discover that this was first recognized as a problem and discussed publicly somewhere around 1968. We talked about this in my undergraduate curriculum, although at the time there were no specific proposed solutions. Same holds true today. There are software design tools (Rational Rose from IBM is one, Texas Instruments developed a very good one although I don't remember what it was called - but not today's "SimpleLink" system). All of these kinds of tools do pretty well in mitigating software failures. There's been a lot of discussion on "runnable specifications" but the same problem still exists... and error in the specs produces an error in the emitted source code.

So exactly how are thousands of computer scientists "odd"? Just because Chris and you say so and say you can develop bug-free code. Wish I were that good, and I AM pretty good at writing code...

You simply have no clue about what the expression "bug free code" actually means. It does not mean software after all known bugs are fixed" just so you know. And it didn't include all previous versions of a software system, prior to the current, because in your words "they found the bugs and fixed them."

Assuming they found ALL the bugs. Why didn't they find them all in the first version released? Or the second? Or the third. But somehow, magically the last version was perfect. Until another bug shows up and they fix it. Bug-free code has no bugs of any kind. We can't even prove that is the case for ANY sizable software program (say > 5,000 lines of code to be generous. More like 500 lines of code.)

So how about coming back to reality. You can't call something bug-free if it is continually being updated to eliminate bugs. Actually that is the definition of lousy software.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

Rebel wrote: Sun Feb 23, 2020 11:19 am
abulmo2 wrote: Sat Feb 22, 2020 11:08 pm
Rebel wrote: Sat Feb 22, 2020 9:56 pm As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release.
In writing Amoeba, as a lone individual, I spent 99.9% of my time testing and 0.1% writing code. I have limited financial resources, but I have an invaluable resource: time.
Can you elaborate what the testing is about?
What do you think? Maybe finding bugs so they can be fixed? That's an odd question.

This is easy. "Testing verifies that the system meets the different requirements including, functional, performance, reliability, security, usability and so on."

Sounds perfectly straightforward to me. No other definition of testing applies to software.

Is Crafty bug free? Absolutely not. Nor is Rebel. Either might have no existing known bugs. But unknown? Come on. Can Crafty crash? I hope not but I would not think about guaranteeing that. Too much code. Way too much room for potential errors when doing a parallel search. Way too high a probability of a simple typo. I've seen a simple "1" changed to a "l" (one to ell) typo cause headaches. If "l" is undefined, you would catch it. In FORTRAN you would not since variables don't have to be declared, there is implicit typing in the language. Never had a typo in your code? I've just been working on FEN parsing due to the discussion here. I had a couple of typos. One the C compiler from Apple caught. One it did not. Comes with the territory. Unless you also believe you can type without making a single error. This discussion is simply unrealistic.
Ras
Posts: 2579
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: hash collisions

Post by Ras »

bob wrote: Sun Feb 23, 2020 7:00 pmI think that in the assembly version of search we did not do this check because a memory reference was anything but free
Similar in the old dedicated 6502 units where the programmers fought for every byte of code space. The result is that these engines may crash if run on modern PCs in emulators since they can reach depths that were not possible with the original hardware. Or with positions like mate in 1 with
[d]1QqQqQq1/r6Q/Q6q/q6Q/B2q4/q6Q/k6K/1qQ1QqRb w - -
where e.g. the Mephisto Briquette 3 will just hang and display random garbage because there are too many moves. The MM5 with Ed's famous engine decides on Kh2xh1?, and after Qg8xg1# even replies with the illegal move Kh1-h2.
Rasmus Althoff
https://www.ct800.net
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

chrisw wrote: Sun Feb 23, 2020 3:54 pm
hgm wrote: Sun Feb 23, 2020 2:08 pm Yes, YOU are talking about dfferent things. The issue (between Bob and Chris) was whether release versions of complex software are always 100% bug free.
Then you're not following.
The issue is about whether chess engines are so complex that they cannot be rendered free of bugs. Bob argues they are sufficiently complex that it is not possible for ANY chess engine to be 100% no bugs.

Chris is arguing that the domain of chess engines is sufficiently non-complex that it is possible, with the right attitude and QC and methods, to render ONE chess engine 100% bug free within a meaningful timeframe. Chris is also arguing that if it can be done for ONE engine then it can be done for ALL.

You are talking about whether bugs will eventually be fixed. Not relevant to their discussion.
Of course it is relevant. Once there were bugs (feature of developing any chess engine), then appropriate QC and methodology will remove those bugs (aka chess engine bugs will eventually or quickly, depending, be fixed) until 100% bug free status attained. Computer chess tends to use the end users as the "final" beta test, before "release", often there being no "release", just future versions, also "released" with effectively beta-test status each time.

Both Ed and I, also others, have different development experiences to what you are perhaps familiar with. Ed's engines that went into portable electronic devices needed a way higher standard of testing than beta-test. Engines from my company that were externally published on hard media, likewise. Nobody wants thousands or tens of thousand or hundreds of thousands of manufactured, boxed, wrapped and shipped products to be recalled because of some stupid fault. Publishing companies that ship their own product by CD or download can afford to be a little more cavalier because cost of recall is not so high. Lone programmers who ship free software can be more cavalier still. Some of us, HAVE to make sure that stuff works, else we pay a big price. Thus, we do make sure. Other programmers rely on assuring their customers/end users that "bugs are inevitable".
Here's an interesting quote from Ed circa 2019 (last year): "90% of coding is debugging, the other 10% is writing bugs." Hardly sounds like writing bug-free code.

As far as writing bug-free code of any size, "you are a legend in your own mind." NOBODY else agrees with you. And haven't since the late 60's...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: hash collisions

Post by bob »

bob wrote: Sun Feb 23, 2020 7:27 pm
Rebel wrote: Sun Feb 23, 2020 11:19 am
abulmo2 wrote: Sat Feb 22, 2020 11:08 pm
Rebel wrote: Sat Feb 22, 2020 9:56 pm As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release.
In writing Amoeba, as a lone individual, I spent 99.9% of my time testing and 0.1% writing code. I have limited financial resources, but I have an invaluable resource: time.
Can you elaborate what the testing is about?
What do you think? Maybe finding bugs so they can be fixed? That's an odd question.

This is easy. "Testing verifies that the system meets the different requirements including, functional, performance, reliability, security, usability and so on."

Sounds perfectly straightforward to me. No other definition of testing applies to software.

Is Crafty bug free? Absolutely not. Nor is Rebel. Either might have no existing known bugs. But unknown? Come on. Can Crafty crash? I hope not but I would not think about guaranteeing that. Too much code. Way too much room for potential errors when doing a parallel search. Way too high a probability of a simple typo. I've seen a simple "1" changed to a "l" (one to ell) typo cause headaches. If "l" is undefined, you would catch it. In FORTRAN you would not since variables don't have to be declared, there is implicit typing in the language. Never had a typo in your code? I've just been working on FEN parsing due to the discussion here. I had a couple of typos. One the C compiler from Apple caught. One it did not. Comes with the territory. Unless you also believe you can type without making a single error. This discussion is simply unrealistic.
How can this be? Ed writes bug-free code?
User avatar
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: hash collisions

Post by Rebel »

hgm wrote: Sun Feb 23, 2020 2:08 pm Yes, YOU are talking about dfferent things.
Not my bad, look at post http://talkchess.com/forum3/viewtopic.p ... 29#p831029

Bob is saying he knows bug-free software does not exist. Nonsense, bug-free software that does what it is supposed to do what the programmer had in mind hardly gets any attention. The bugs, especially the juicy ones, do get it all.
hgm wrote: Sun Feb 23, 2020 2:08 pm The issue (between Bob and Chris) was whether release versions of complex software are always 100% bug free.
I doubt Chris used the word always. Can be he said, depending on how high you set the bar. And as ex commercials there is (was) that demand. During the time of dedicated computers I can't remember I had to make a bug fix version, maybe on one occassion, I really don't remember, but the bar was high by the manufactory and rightly so. Same for the software of Chris. Our common past perhaps is the reason we agree on the matter.
You are talking about whether bugs will eventually be fixed. Not relevant to their discussion.
Well, my own discussion developed different, see above.
90% of coding is debugging, the other 10% is writing bugs.
chrisw
Posts: 4457
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: hash collisions

Post by chrisw »

bob wrote: Sun Feb 23, 2020 3:12 am
abulmo2 wrote: Sat Feb 22, 2020 11:08 pm
Rebel wrote: Sat Feb 22, 2020 9:56 pm As lone individuals writing chess engines we don't have the resources to do expensive, time consuming test methodologies, beta test procedures before we release.
In writing Amoeba, as a lone individual, I spent 99.9% of my time testing and 0.1% writing code. I have limited financial resources, but I have an invaluable resource: time.
I also have a lot of time. But I also have access to huge computing resources. I have literally played many millions of games during testing, also. And in spite of all the effort, and all the games played, I STILL find the occasional bug. And will continue to do so based on experience. I'm sure Ed/Chris will just say I am a lousy programmer and they can do much better. I'd first like to see their program(s) that can beat mine; and then have time to carefully examine their programs for the presence of bugs.

It's easy to talk a good game.
Hilarious. Talk is all you do. Volume in inverse proportion to reality. Most if not all with one intention of bigging yourself and belittling everybody else.

MUCH harder to actually write that bug-free code. And convince anybody that is actually IS bug-free.
Well, Ed was convincing in the market place and in the demand for his chess engine code by electronics manufacturers, who by definition, are going to be very concerned that they are taking on bug-free code else they end up having to bin thousands of manufactured units. Me likewise, We get voted for by hard earned money, contracts for supply and capital wanting to invest, over a very extended period. You?
User avatar
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: hash collisions

Post by Rebel »

Ras wrote: Sun Feb 23, 2020 7:29 pm
bob wrote: Sun Feb 23, 2020 7:00 pmI think that in the assembly version of search we did not do this check because a memory reference was anything but free
Similar in the old dedicated 6502 units where the programmers fought for every byte of code space. The result is that these engines may crash if run on modern PCs in emulators since they can reach depths that were not possible with the original hardware. Or with positions like mate in 1 with
[d]1QqQqQq1/r6Q/Q6q/q6Q/B2q4/q6Q/k6K/1qQ1QqRb w - -
where e.g. the Mephisto Briquette 3 will just hang and display random garbage because there are too many moves. The MM5 with Ed's famous engine decides on Kh2xh1?, and after Qg8xg1# even replies with the illegal move Kh1-h2.
:D

You already gave the hint, 157 possible (legal) moves (likely for both sides) too much to handle for only 8kb RAM, it was known this could happen. Maybe the MM4 does better, it doesn't do check extensions in Q-search.
90% of coding is debugging, the other 10% is writing bugs.