Is 8/16 bit computing inappropriate for chess programming?

Bill Rogers · Post by **Bill Rogers** » Fri Apr 25, 2008 7:46 am

Hey Steven my first pc was an Elf with 256 bytes of ram, hand wired. It lasted until the TRS-80 Model I came out with a wopping 4k of ram.
Oops, think I revealing my age.
Bill

sje · Post by **sje** » Fri Apr 25, 2008 3:39 pm

Bill Rogers wrote:Hey Steven my first pc was an Elf with 256 bytes of ram, hand wired. It lasted until the TRS-80 Model I came out with a wopping 4k of ram.

My first computer I built was an IMSAI 8080 (2 MHz 8080, 4 KB RAM) back in 1976. Before that I had an HP-25 (49 bytes of program memory and eight storage registers). I feel older than I look and I expect the vultures to start circling soon.

sje · Post by **sje** » Fri Apr 25, 2008 3:48 pm

Zach Wegner wrote:It's just that I'd probably rather spend the money on an FPGA if I were to go in that direction. Which I've actually been thinking about, even earlier today before you posted.

I've thought about doing an FPGA project. Here's the starting point for Xilinx evaluation kits:

http://www.xilinx.com/products/boards/s3_sk_promo.htm

Xilinx also has a product that has a decent 32 bit RISC processor core implemented on an FPGA with plenty of extra cells for implementing a custom coprocessor. This would be ideal for a bitboard accelerator.

Bill Rogers · Post by **Bill Rogers** » Fri Apr 25, 2008 3:53 pm

Although this is beyond me it looks really great. Maybe now some of the advanced programmers and hard code their engines and kick Deep Blue in the royal butt.
Bill

rfadden · Post by **rfadden** » Fri Apr 25, 2008 6:50 pm

sje wrote:
Bill Rogers wrote:Hey Steven my first pc was an Elf with 256 bytes of ram, hand wired. It lasted until the TRS-80 Model I came out with a wopping 4k of ram.
My first computer I built was an IMSAI 8080 (2 MHz 8080, 4 KB RAM) back in 1976. Before that I had an HP-25 (49 bytes of program memory and eight storage registers). I feel older than I look and I expect the vultures to start circling soon.

I also started on an Altair 8800 in 1976 and interestingly the one we bought was from "The Byte Shop" in Andover MA which I remember as the world's first computer store. At this store they had this guru young man who was actually assembling the Altair units in the back room, and so they offered the computer already built. This was a great deal as it turns out because then you take home the computer and it had been tested and run, and it worked great. (I think it would have been a tough combination to have your first computer be one that you built, potentially have a problem with it and then have to diagnose your own faults while having no real experience).

Strangely my other prior experience has a similarity with what you have written. My prior experience was with an HP-65 / HP-67.
I had been very fortunate in that I had been using an HP-65 programmable calculator (with the magnetic cards) since about 1973. My stepdad (a new step-dad at the time) bought two HP-65 and he gave me one of them and we both learned programming at that time. He later used an HP trade-in offer and he traded in both of these HP-65 for similar but improved HP-67. So we both switched to using the HP-67 at around the 1974 time frame. The HP-67 looked almost the same except it had something like 250 program instructions whereas the HP-65 had 99.
My stepdad Jim and I did the most amazing things with 99 instructions. Many of our tough program had exactly this number of instructions since we had to work to use less to "fit" the whole program in the calculator. Once we had the HP-67 with 250+ instructions, I remember that all of our projects became easy in comparison.

One thing my Stepdad did that was smart, he bought the Altair with a new high-tech 16K memory board. I think this board had the new (at the time) 4K bit static RAM chips, and these were a bit expensive, but the 16K MITS board was great (if you could afford it).

So we had 16K RAM and we bought Bill Gate's 8K Basic (for $100), and that combination of RAM, Basic, and that computer worked great!

Most importantly I still have my HP-67 and I still have this Altair. I bought the computer for $400. from my Stepdad many years after it was no longer in use.

I am very glad that I bought and kept this computer. This computer has tremendous value to me, and I take good care of it, and I keep it running and in good shape.

-----------

By the way, a related pet project of mine is to get a chess program running on this computer. The key concept is to use an old, antique, or specialty computing device to serve some sort of useful or new purpose. I believe this is a unique and worthwhile hobby independent of any other interests.

I used to think that I should use my Altair to "Control the house", as in monitor things or control things, but then I noticed a rather obvious negative factor. I have to use an Uninterruptable Power Supply with this computer because when the power goes out I have to load in the bootstrap using the front panel. We never bought a PROM board and so at power-on the computer has nothing in it.

I went on eBay and I bought antique PROM boards that are Altair compatible but unfortunately each old board that I bought requires restoration.

So anyway, once I do get this PROM capability working with my old Altair I would like to put *something* in it, and here's the crazy idea...

I write (or load) a chess program and then via an RS232 interface I use an attached PC as a sort of gateway where I place the Altair "On the Internet" for other people to use or play against. Going further with this idea I could also attempt to have the Altair as one of the always-on chess engines available for matches on ICC.

See the key point is not that the Altair would play well but that it can play at all. Making this available on ICC would be fun.

The key is the "Interest" of the thing. If your interest is in the best playing engine then of course we would all stick with Rybka running on an 8 processor PC. Once you get away from wanting the best playing program then all sorts of interesting options come up.

This is also why I like Bill's approach of writing the chess engine in Basic. They key is to do something a little different, and then enjoy!

Thanks.

rfadden · Post by **rfadden** » Fri Apr 25, 2008 7:43 pm

I'm a hardware architect and mostly I design special purpose graphics hardware but essentially all of this work could be called the design of special purpose computers.

Yes, Belle, Hitech, then Deep Blue used this same technology of special purpose hardware for chess position evaluation, or for evaluation and for the last levels of search.

We have been using devices with FPGA's as the prototype units for testing our logic in the development of graphics chips and this still is one of the most important tools used in the field that I'm in. The FPGA chips have of course been getting more and more capable.

One of the key ideas is something like this: First take the computing problem that you have and try to use the fastest inexpensive or reasonable cost general purpose computer to solve the problem. If your goal is to compute faster than this, then look first toward the portion of your program that takes the majority of the calculation time.

For a chess engine, start with the Eval function and the use of the Transposition table within the innermost loop, and consider how fast your search engine would run if these operations took essentially zero time. (Not that the special purpose hardware would produce a result in zero time, but this is a rule-of-thumb way of considering the potential speed of special purpose hardware.)

So first decide if this is fast enough for what you want. We could talk about the actual computation time of Eval or the Transposition table use as a separate topic.

In some cases you may not be happy even if Eval essentially took zero time.

Imagine one chess position per clock going into Eval, and imagine at the end of a pure hardware pipeline, one Eval result per clock coming out of the special purpose hardware. That by the way is so fast that the general purpose computer might have a hard time even sustaining this rate of generating positions that are worth evaluating. Also note that any pipeline delay of such hardware would cause a different style of problem. There is a separation in time between when the positions are submitted and when the "results per clock" answers come back and are available for further use. In the world of special purpose hardware this "pipeline delay" issue or "transport delay" sometime ends up being the Tail that Wags the Dog. A chess program would typically prefer to get the results of Eval immediately so that the next move chosen for Eval can be based upon this Eval. That's a key characteristic of a typical alphabeta search.

So anyway decide if "Zero time for Eval" is fast enough for you.

If not, then the special purpose logic needs to include the last N levels of tree search as part of it's logic.

Roughly speaking I would say that the special purpose hardware is much easier to construct if you stay away from tree search and the sequential application of the AlphaBeta algorithm in hardware. For example do you want a hardware based Null Move algorithm in there? Well anyway, much of this kind of logic is placed into a state machine, and the logic cycles through chess board positions rather like a typical software solution, except the hardware units would be good at sustaining a "One position per clock" type of rate. Internally the special purpose hardware still has to feed the "one result per clock" style of Eval hardware, so that rate of operation roughly still limits the speed of the special purpose hardware.

Yes, in hardware you can have multiple copies of this Eval Unit, so then the ideas go off into a wide open area of many possibilities so for now I think we should talk about initially having one ultra-fast parallel Eval unit (to prime the pump and to start this discussion).

------------

I think for most applications you would be better off to first concentrate solely on Placing all of Eval and the use of the Transposition table into special purpose hardware, and then see how far you can go with "Zero Time Eval." Note that as I mention this all of the rest of the problem should be handled quite well on these ultra fast 8 processor computers.

So can all of Eval be done on an FPGA? Yes.

The clock rate of the FPGA will be significantly lower than the clock rate of the very latest special purpose chips, but on the other hand the clock rate of FPGA that are available today is about as fast as the special purpose chips were a handful of years back (like 5 years). This kind of clock rate seems fast enough to me.

-----------------

Here is an experiment that I have been thinking about constructing. Take one of your favorite chess engines that is available in source code and modify the engine as follows:

At the Eval function write new software routines to save the result of Eval, and also arrange that during an actual search all tree search is forced to follow exactly the same path in a repeatable fashion (avoid any randomness, avoid having the search controlled by elapsed time, since this will disrupt the experiment).

Save the results of Eval in an array in RAM. Add a small piece of logic that will optionally grab the Eval results from RAM instead of going through the calculation. Then run the chess engine in this "Speed Test" mode where for Eval you get an immediate "Zero calculation delay" result.

During this search you should get the same answer as before, and this would be used to confirm that the technique is working. Then simply measure this speed of search on your computer.

In other words you can simulate what it is like to have awesome special purpose hardware that is added to the computer for chess search. You can see the speed of search on your current computer.

Measure the speed-up. Use this technique to see the factor of speed improvement or to see how deep the engine could search given a position and given a desired time constraint.

Note that even if you want the special purpose hardware to do more than Eval, the above described "simulation" approach can still be used. Simply select the portion of the chess engine that you intend to put into hardware, and create a software "black box" that simulates the presense of this hardware.

Using the above described technique then run your chess engine and watch how fast it can search.

You can only use this technique while *repeating* a search. The first time through you use the software to record the answers that the hardware would otherwise produce, and in the second pass through, the timing pass, you simulate the hardware giving answers as fast as your software can accept these answers. You then see the speed of your chess engine which if finished would eventually be using special purpose hardware for part of the search.

As I said, this is the technique that I would like to try...

I mention this in the hopes of stirring up some interest or at least some interesting discussion...

Thanks.

sje · Post by **sje** » Fri Apr 25, 2008 8:17 pm

I think that the Byte Shop may have been number two; I bought my IMSAI kit through, I think, "The Computer Store" in Waltham Massachusetts. They did not sell any Altair products because of some MITS exclusivity clause. But they had stuff for just about every other brand. It was all frighteningly expensive for me, just a teenager at the time and living at the poverty level.

I sold my HP-25 a year or so later to help fund a purchase of an HP-67 that was US$400 at the time. I wrote some fairly nifty software for that 224 byte calculator including a checkers program.

If you're doing an 8080 program of any kind, I'd suggest getting or writing an emulator and do the development an a real system. That's the approach I'm using for the W65C02S system that's slowly coming together on a breadboard in my workshop. And it's got a 32 KB RAM along with an 8 KB EEPROM, so I don't have to worry about cramming things into 4 KB.

An annoyance from my Z80 chess programming experience was the attempt to do all the scoring with eight bit arithmetic. The better way is to set up a small scale 16 bit emulator along the lines of Wozniak's "Sweet Sixteen" for the Apple II. But I'd implement it as an 4 element stack RPN calculator. For my W65C02S effort, I'll probably use a 32 bit RPN emulator along with a 16 bit version.

On an 20 MHz W65C02C, bitboard operations are feasible, if just barely. It can handle basic bitboard operations at a rate of about 100 KHz. The slowest routine is Card() (a.k.a, PopCount) which takes maybe two or three times as long as a basic boolean operation.

bob · Post by **bob** » Sun Apr 27, 2008 6:08 pm

sje wrote:
Bill Rogers wrote:Hey Steven my first pc was an Elf with 256 bytes of ram, hand wired. It lasted until the TRS-80 Model I came out with a wopping 4k of ram.
My first computer I built was an IMSAI 8080 (2 MHz 8080, 4 KB RAM) back in 1976. Before that I had an HP-25 (49 bytes of program memory and eight storage registers). I feel older than I look and I expect the vultures to start circling soon.

Sure is wasn't an altair 8800? The Imsai was pre-assembled and of higher quality than the old MITS original Altair box...

sje · Post by **sje** » Sun Apr 27, 2008 6:45 pm

bob wrote:Sure is wasn't an altair 8800? The Imsai was pre-assembled and of higher quality than the old MITS original Altair box...

The IMSAI 8080, the first of the model line, appeared in kit form although later systems may have been available assembled. I put together my stripped down system containing four circuit boards from parts. I had a rather early version of the kit that included a number of errors in the assembly manual. I recall one for the power supply board that could have been nasty had I not done a thorough reading before heating up the old soldering iron. The other boards all plugged into the S-100 bus: the CPU card, the 4 KB static memory, and the front panel.

http://en.wikipedia.org/wiki/IMSAI_8080

The IMSAI 8080, although certainly possessing some faults, was superior in almost every way to the MITS Altair 8800/8800a/8800b.

There was an attempt eight years ago to revive the IMSAI 8080 as a real product with some improvements and shipped assembled. No product has been delivered yet. An alternative avenue to nostalgia is an Altair 8800 kit (see: http://www.altairkit.com); it's really shipping (although slowly), and it's quite faithful to the original except for the use of a significantly better, safer, and modern power supply.

hgm · Post by **hgm** » Sun Apr 27, 2008 7:53 pm

sje wrote:Speeds are from 8 MHz to 32 MHz and the architecture runs most instructions in a single clock cycle. RAM is somewhat limited, but EEPROM runs up to 256 KB.

...

Now consider the possibility of using 4,096 of these microcontrollers. That's one controller for each pair of squares, so there's a controller for each possible move (promotions are a special case). With some creative wiring and coding, one could build a HiTech clone of sorts.

If the clock speed is only 32MHz, why bother? An Intel core, doing 4 instructions per clock at 3GHz, is 400 times faster. So even with 4096 of these things, you can only do the equivalent of 10 cores. If you would not have any communication overhead, that is. IMO it would already be extremely optimistic to suppose you could do the work of 4 cores.

So just buy an off the shelf Core 2 Quad...

Is 8/16 bit computing inappropriate for chess programming?

Re: RCA 1802

Re: RCA 1802

FPGA alternatives

Re: FPGA alternatives

Re: RCA 1802

Re: Sample microcontroller array application

Early computer days

Re: RCA 1802

IMSAI 8080

Re: Is 8/16 bit computing inappropriate for chess programmin