The future of computer chess

smatovic · Post by **smatovic** » Mon Apr 22, 2024 11:26 am

Viz wrote: ↑Mon Apr 22, 2024 11:21 am
smatovic wrote: ↑Mon Apr 22, 2024 11:05 am ....well, maybe first define the timeframe what oldschool and newschool is, Discord members mention that CPW and TC are oldschool, and that "we" are missing the point of modern chess programming, so, I myself am interested, what I am missing in context of "what modern chess programming is about", or alike, what is modern chess programming about, and, what was old chess programming about?

--
Srdja
Actual statistical testing to get actual statistical signifficance is the most important part which wasn't the case prior to rybka more or less and why rybka was so dominant at it prime.
Ofc other things - SPSA, details of how to implement heuristics that gain - like LMR can be implemented to gain 5 elo or to gain 100 elo depending on implementation, but this is true for almost any known heuristic like null move pruning or futility pruning.

So,

- use SPRT for engine-engine self-play testing
- use SPSA for tuning of search heuristics parameters in LMR, null move pruning, futility pruning and alike

--
Srdja

Viz · Post by **Viz** » Mon Apr 22, 2024 11:31 am

smatovic wrote: ↑Mon Apr 22, 2024 11:26 am
Viz wrote: ↑Mon Apr 22, 2024 11:21 am
smatovic wrote: ↑Mon Apr 22, 2024 11:05 am ....well, maybe first define the timeframe what oldschool and newschool is, Discord members mention that CPW and TC are oldschool, and that "we" are missing the point of modern chess programming, so, I myself am interested, what I am missing in context of "what modern chess programming is about", or alike, what is modern chess programming about, and, what was old chess programming about?

--
Srdja
Actual statistical testing to get actual statistical signifficance is the most important part which wasn't the case prior to rybka more or less and why rybka was so dominant at it prime.
Ofc other things - SPSA, details of how to implement heuristics that gain - like LMR can be implemented to gain 5 elo or to gain 100 elo depending on implementation, but this is true for almost any known heuristic like null move pruning or futility pruning.
So,

- use SPRT for engine-engine self-play testing
- use SPSA for tuning of search heuristics parameters in LMR, null move pruning, futility pruning and alike

--
Srdja

Use SPSA also for eval, because you will have HCE.
Know details of implementation - how much to reduce moves in LMR (general look of formula, not -1-2-3), how much to reduce after a null move (the same), what margin to have in RFP, what heuristics are worth the most in hancrafted evaluation (aka what brings most elo to kingdanger, passers, etc) - this is extremely important if you are size/memory limited, you would want to implement from "big to small". Etc. Efficient way to do this was completely unknown in 1960 (at least this is what I assume or it was lost somewhere in the process of chess engines development to 2010).
Good basic logic + good tuning on top is literally worth hundreds of elo compared to what you can do "by feel". It's not even close.

towforce · Post by **towforce** » Mon Apr 22, 2024 4:22 pm

Viz wrote: ↑Mon Apr 22, 2024 11:31 am
smatovic wrote: ↑Mon Apr 22, 2024 11:26 am
Viz wrote: ↑Mon Apr 22, 2024 11:21 am
smatovic wrote: ↑Mon Apr 22, 2024 11:05 am ....well, maybe first define the timeframe what oldschool and newschool is, Discord members mention that CPW and TC are oldschool, and that "we" are missing the point of modern chess programming, so, I myself am interested, what I am missing in context of "what modern chess programming is about", or alike, what is modern chess programming about, and, what was old chess programming about?

--
Srdja
Actual statistical testing to get actual statistical signifficance is the most important part which wasn't the case prior to rybka more or less and why rybka was so dominant at it prime.
Ofc other things - SPSA, details of how to implement heuristics that gain - like LMR can be implemented to gain 5 elo or to gain 100 elo depending on implementation, but this is true for almost any known heuristic like null move pruning or futility pruning.
So,

- use SPRT for engine-engine self-play testing
- use SPSA for tuning of search heuristics parameters in LMR, null move pruning, futility pruning and alike

--
Srdja
Use SPSA also for eval, because you will have HCE.
Know details of implementation - how much to reduce moves in LMR (general look of formula, not -1-2-3), how much to reduce after a null move (the same), what margin to have in RFP, what heuristics are worth the most in hancrafted evaluation (aka what brings most elo to kingdanger, passers, etc) - this is extremely important if you are size/memory limited, you would want to implement from "big to small". Etc. Efficient way to do this was completely unknown in 1960 (at least this is what I assume or it was lost somewhere in the process of chess engines development to 2010).
Good basic logic + good tuning on top is literally worth hundreds of elo compared to what you can do "by feel". It's not even close.

I know I'm biased, but I'm going to say that this is "Old school with features" rather than "modern".

I remember a time when authors were discarding knowledge from the eval, because it slowed down the eval and reduced search, which proved to be more valuable than knowledge at that time.

NNs are able to get a higher Elo rating with more knowledge and slower eval: this tells us that authors, writing their hand-coded evals, missed some really important chess knowledge!

So now we need to go, to borrow a film title, ...drum roll... back to the future!

The job for "future of computer chess" people is not to add features to the old school model, but to move ahead with a new paradigm: work out, or devise, the truly important knowledge about chess, and then GO BACK (the the future!) to where we were 30 years ago, and start discarding all the knowledge that doesn't contribute enough value to justify its CPU time!

Viz · Post by **Viz** » Mon Apr 22, 2024 5:03 pm

The thing is yes. Most of techniques in principle were known long time ago/known in other black & white games.
Problem is simple - proper implementation of this techniques differs by literally thousand of elo to what it was 30 years ago. And let me tell you 1000 elo is somewhat of a big deal.

towforce · Post by **towforce** » Mon Apr 22, 2024 5:35 pm

The future of computer chess: uncover (or devise) core knowledge that will enable anyone to master the game in a short period of time, and feel that they truly, deeply understand it better than all the humans who have lived before.

Since the 1960s, chess has been an important part of AI, and it still is - for when the above is achieved, we'll then be able to do it for other subjects, and we will then have a new frontier in teaching/training humans new skills.

petero2 · Post by **petero2** » Tue Apr 23, 2024 10:40 pm

Viz wrote: ↑Mon Apr 22, 2024 4:48 am The only reason why this programs will always remain the best on this hardware is because no one would ever bother to make a better program for it.
For times of their creation it was an epic achievement, of course. But with modern day knowledge this things would be beaten pretty easily after literally weeks of work of someone who is capable of writing smth for this hardware.
Like 4ku on 1 core has almost the same rating as crafty 25.2 while having 4kb weight executable. This is difference between modern knowledge and ancient knowledge.

It seems very unlikely that you could beat an old program for the 6502 CPU using "modern techniques" in "weeks of work", considering how primitive the 6502 is. The 4ku engine does not seem suitable for the following reasons:

* 4ku uses 64 bit integers. Those would have to be emulated on the 6502, eg (LDA + ORA + STA) for each byte, for a total of (4+4+4)*8=96 clock cycles. At 3.6 MHz (the speed of my Novag Constellation) this would mean ~37000 arithmetic operations per second. This is around 1e5 times slower than on my current CPU, where 4ku gets around 2Mnps, so an estimate is that 4ku would run at around 20nps on a 6502.

* It is unclear if the compiled program would fit into 64KB, but lets assume for now that it does and that there is even 16KB free for hash tables.

* 4ku has a history table that is 64KB large. This obviously has to be replaced with something else, but assume for now that this replacement would somehow not reduce the strength of the program.

I modified the 4ku source code to play at 20 nps and use a 16KB hash table. I then played a game vs my Novag Constellation 3.6 mhz at level 2 (40 moves in 5 minutes). The time control for 4ku was 1 minut + 7 seconds / move. Here is the game:
[pgn]
[Event "Computer Chess Game"]
[Site "zen4.localdomain"]
[Date "2024.04.23"]
[Round "-"]
[White "4ku 20nps"]
[Black "Novag Constellation 3.6 mhz"]
[Result "0-1"]
[TimeControl "60+7"]
[Annotator "1. +0.32"]

1. Nc3 {+0.32/3} g6 2. Nf3 {+0.85/2 4} Nf6 3. d3 {+0.72/2 5} Bg7 4. Be3
{+0.91/2 6} d5 5. h3 {+1.10/2 26} Nc6 6. Rb1 {+0.71/2 18} d4 7. Bf4
{-2.38/1 6} dxc3 8. bxc3 {-2.38/1 2.9} Nd5 9. Bg5 {+0.00/1 11} Nxc3 10. Qc1
{+0.00/1 9} Nxb1 11. Qxb1 {-5.56/1 2.8} b6 12. c4 {-5.19/1 7} h6 13. Bf4
{-5.40/1 6} Bb7 14. a4 {-5.21/1 8} Bc3+ 15. Nd2 {-5.80/1 5} e5 16. Bg3
{-5.93/2 10} Bxd2+ 17. Kxd2 {-5.72/2 9} O-O 18. Ke1 {-5.66/1 2.6} Qd6 19.
e3 {-5.91/1 21} Qb4+ 20. Qxb4 {-5.64/1 1.6} Nxb4 21. Bxe5 {-5.52/2 5} Bxg2
22. Bxg2 {-3.28/2 10} Nxd3+ 23. Ke2 {-6.62/2 5} Nxe5 24. Bxa8 {-6.62/2 3}
Rxa8 25. Rc1 {-6.86/2 5} Nd7 26. Rb1 {-6.38/2 6} a5 27. Rb5 {-6.58/2 7} Nc5
28. h4 {-7.54/1 5} Nxa4 29. Kd3 {-7.72/1 12} Rd8+ 30. Kc2 {-8.29/2 7} Kg7
31. h5 {-7.83/2 7} c6 32. Re5 {-8.20/2 7} Nc5 33. hxg6 {-7.94/2 2.4} Kf6
34. Rh5 {-7.58/2 7} Kxg6 35. Rh4 {-8.34/2 7} a4 36. Rg4+ {-8.04/2 9} Kf5
37. Rg7 {-7.83/2 7} Kf6 38. Rh7 {-7.51/2 5} Kg6 39. Rxh6+ {-14.14/2 3} Kxh6
40. Kc3 {-14.62/2 4} a3 41. Kc2 {-15.26/2 5} a2 42. Kb2 {-16.07/2 4} Rd2+
43. Kc3 {-19.86/2 8} Ne4+ 44. Kb3 {-26.28/2 4} a1=Q 45. c5 {-27.78/2 1.5}
Qc3+ 46. Ka4 {-299.98/2 5} Nxc5#
{Xboard adjudication: Checkmate} 0-1
[/pgn]
Just one game result is obviously not statistically significant, but if you also look at the played moves and the reported search depth it is obvious that 4ku is extremely week under these conditions. The game was basically over after move 6.

Now lets assume that it was somehow possible to manually optimize the program to make it 10 times faster without reducing the playing strength for a given number of nodes. The speed would then be around 200 nps, which is about the same speed as the handwritten assembly programs got on the 6502.

I played another game at 200 nps using the same time control:
[pgn]
[Event "Computer Chess Game"]
[Site "zen4.localdomain"]
[Date "2024.04.23"]
[Round "-"]
[White "4ku 200nps"]
[Black "Novag Constellation 3.6 mhz"]
[Result "0-1"]
[TimeControl "60+7"]
[Annotator "1. +0.29"]

1. Nc3 {+0.29/4} g6 2. d3 {+0.66/5 3} Bg7 3. Nf3 {+0.89/4 3} Nf6 4. h3
{+0.52/4 4} d5 5. Be3 {+0.09/5 13} Nc6 6. Nb5 {-0.01/5 4} a6 7. Nbd4
{-0.26/5 2.9} Bd7 8. Nxc6 {+0.40/5 6} Bxc6 9. Ne5 {+0.42/4 2.8} Nd7 10.
Nxc6 {+0.50/6 6} bxc6 11. c3 {+0.40/6 9} O-O 12. Qc2 {+0.23/4 4} Rb8 13. d4
{+0.15/4 5} c5 14. dxc5 {+0.93/4 4} e5 15. c6 {+1.25/6 8} Nf6 16. Ba7
{+1.92/6 9} Ra8 17. Bc5 {+1.04/6 8} Re8 18. f3 {+1.03/6 4} Re6 19. Qa4
{+0.99/4 4} Qe8 20. O-O-O {+0.37/4 4} Rxc6 21. Qa5 {+0.16/4 5} Qe6 22. Kb1
{+0.43/3 5} Rb8 23. g4 {+0.60/2 4} Rb5 24. Qa3 {-3.52/5 7} Rbxc5 25. h4
{-3.65/4 5} Bf8 26. g5 {-3.94/5 10} Rxc3 27. Qa4 {-5.59/5 5} R3c4 28. Qb3
{-5.21/5 4} Rb6 29. Qd3 {-5.83/5 4} Rd4 30. Qc2 {-6.07/7 10} Rc6 31. Qb3
{-5.42/7 11} Rb6 32. Qc2 {+0.00/12 4} Rxd1+ 33. Qxd1 {-6.41/6 11} Nh5 34.
Bh3 {-6.41/6 5} Qc6 35. Qc1 {-6.39/6 7} Qxc1+ 36. Rxc1 {-5.88/5 6} c5 37.
Bd7 {-6.26/5 6} d4 38. Kc2 {-7.54/6 14} c4 39. b3 {-7.03/4 4} Rb7 40. Bc6
{-6.09/4 5} Rc7 41. Bd5 {-8.30/5 4} cxb3+ 42. Kb1 {-8.03/5 5} Rxc1+ 43.
Kxc1 {-8.36/6 5} bxa2 44. Bxa2 {-7.87/7 6} Kg7 45. Bc4 {-8.04/6 7} a5 46.
Kc2 {-8.17/6 4} Nf4 47. Kb3 {-8.10/6 6} Bb4 48. Bb5 {-8.97/6 4} Be1 49. Ka4
{-9.86/7 6} Kf8 50. Bc4 {-9.59/7 7} Ke7 51. Kb5 {-8.96/5 4} Ne6 52. Ka4
{-9.51/7 14} Kd6 53. Kb5 {-9.32/5 4} Kc7 54. Ka4 {-8.76/6 7} Kb6 55. Bxe6
{-10.51/7 9} fxe6 56. Kb3 {-11.45/8 5} Bxh4 57. Kc4 {-11.61/7 4} Bxg5 58.
Kd3 {-12.64/8 6} h5 59. Ke4 {-13.20/8 6} Bf4 60. e3 {-15.92/10 5} h4 61.
exf4 {-9.42/10 9} h3 62. f5 {-14.67/8 5} exf5+ 63. Kxe5 {-18.42/7 5} h2 64.
Kxd4 {-18.89/8 9} h1=Q 65. Kd5 {-19.61/7 12} Qxf3+ 66. Ke5 {-20.40/6 9}
Qe4+ 67. Kf6 {-27.90/7 5} f4 68. Kg7 {-28.71/9 6} f3 69. Kh7 {-29.46/9 5}
f2 70. Kg7 {-30.00/8 4} f1=Q 71. Kh6 {-34.46/8 5} Qf8+ 72. Kh7
{-299.98/12 8} g5#
{Xboard adjudication: Checkmate} 0-1
[/pgn]
4ku now played a bit better but it is still obvious that it is very weak and the game was basically over after move 23 where it lost a piece due to a 3-ply tactic.

So I am not conviced it would be easy to create a stronger engine for the 6502 than what already existed 35 years ago.

It could be that 4ku would scale better than the Constellation so it would win at very long time controls, but this is also not obvious if you only have some KBs of RAM to use for both the history tables and the transposition tables.

If on the other hand the target was a 68000 CPU I think it would be a lot easier to port a modern engine to that hardware. (Although NNUE would still be unfeasible.)

Link to instruction set documentation for the 6502: https://www.masswerk.at/6502/6502_instruction_set.html

towforce · Post by **towforce** » Wed Apr 24, 2024 12:08 am

petero2 wrote: ↑Tue Apr 23, 2024 10:40 pmIt seems very unlikely that you could beat an old program for the 6502 CPU using "modern techniques" in "weeks of work", considering how primitive the 6502 is...

Very good points - and thank you for providing the sample games! The developers of the 8-bit CPU period did a really good job of pulling chess performance out of what is, by today's standards, a very limited microprocessor. As well as things we take for granted like 64 bit integer arithmetic in a single instruction, the hardware constraints are everywhere you look: for example, after executing an instruction and obtaining the address of the next instruction, guess what it had to do next? It had to take control of the bus, and fetch that next instruction from memory - which was in another chip! Maybe that next instruction needed data to calculate? Yep - take control of the bus and fetch the first 8 bits of that data - from another chip! None of this cache memory nonsense or instruction pipelining in the 6502 - just a few registers!

Also, of course, the instruction set of the 6502 was very limited: it would often take many clock cycles to do something "simple and easy" - even without the above-mentioned memory fetches.

If the computer clock was 5 Mhz, you wouldn't get anything like the same amount of useful work done with those 5 Mhz in one second as you would do on an SOC running at 5 Mhz - and, of course, today's SOCs run MASSIVELY faster than 5 Mhz!

I believe that if they took the time needed, and tried hard, today's author's would be able to produce a better chess engine than the classic ones of the period - but as you said, it would not be a trivial undertaking.

The following bullet points are speculation on my part:

* you might need to write the code in 6502 assembly language to compete with the old masters (and, at this time, you won't get any useful help from ChatGPT or any other LLM to do that)

* you might also need a version of the same code in a high level language to do the auto-tuning

* an author who is young in 2024 is initially going to be shocked at how restricted the 6502 assembly language is, and how little memory they have to play with

As I said before, nobody is actually going to do the job of writing a new program for the 6502 in 2024 - but if they did, it would probably be a lot more difficult than they would be expecting it to be.

mclane · Post by **mclane** » Wed Apr 24, 2024 5:33 am

That was my point that the software progress is part of the hardware progress.

Without the limitations of that era not beeing taken away, software progress could not really have happened.

petero2 · Post by **petero2** » Wed Apr 24, 2024 8:30 am

towforce wrote: ↑Wed Apr 24, 2024 12:08 am * you might need to write the code in 6502 assembly language to compete with the old masters (and, at this time, you won't get any useful help from ChatGPT or any other LLM to do that)

ChatGPT may be able to give some help. I tried asking it:

How can I implement selection sort in assembly language for a 6502 CPU?

It gave the following answer:

ChatGPT wrote: Implementing selection sort in assembly language for a 6502 CPU involves breaking down the algorithm into its basic steps and translating each step into 6502 assembly instructions. Here's a simplified version of selection sort algorithm in assembly language for 6502 CPU:

Code: Select all

; Selection Sort implementation in 6502 assembly language

    ; Constants
ARRAY_SIZE = 10          ; Define the size of the array
ARRAY_START = $0200      ; Memory location where the array starts

    ; Main program
    .org $0800           ; Starting address of the program

start:
    LDX #0              ; Initialize outer loop index
outerLoop:
    LDA ARRAY_START,X   ; Load the value at ARRAY[X] into accumulator
    STA MIN_INDEX       ; Store the current index as the index of minimum value

    LDX #1              ; Initialize inner loop index
innerLoop:
    CMP ARRAY_START,X   ; Compare with the next element
    BCC updateMin       ; If the next element is smaller, update MIN_INDEX
    DEX                 ; Decrement inner loop index
    CPX #ARRAY_SIZE     ; Check if reached end of inner loop
    BNE innerLoop       ; If not, continue inner loop

updateMin:
    STA TEMP_INDEX      ; Store the current index temporarily
    LDA MIN_INDEX       ; Load the index of minimum value
    STA MIN_VALUE       ; Store the value at that index temporarily
    LDA ARRAY_START,X   ; Load the value at ARRAY[X] into accumulator
    STA ARRAY_START,TEMP_INDEX  ; Swap the current element with the minimum element
    LDA MIN_VALUE       ; Load the minimum value
    STA ARRAY_START,X   ; Store the minimum value at ARRAY[X]

    INX                 ; Increment outer loop index
    CPX #ARRAY_SIZE     ; Check if reached end of outer loop
    BNE outerLoop       ; If not, continue outer loop

end:
    BRK                 ; End of program

    ; Variables
MIN_INDEX: .byte 0      ; Index of minimum value
TEMP_INDEX: .byte 0     ; Temporary index storage
MIN_VALUE: .byte 0      ; Minimum value storage

    ; Array Data
    .fill ARRAY_SIZE, 0  ; Fill the array with initial values

This assembly code assumes you have a memory region starting from $0200 where your array is stored. You can adjust the constants ARRAY_SIZE and ARRAY_START to fit your specific array size and memory location. The program starts at address $0800. It compares adjacent elements in the array and swaps them if they are in the wrong order. Finally, it iterates through the array until it is sorted.

I did not try to verify if the suggested code actually works though.

Uri Blass · Post by **Uri Blass** » Wed Apr 24, 2024 9:04 am

mclane wrote: ↑Wed Apr 24, 2024 5:33 am That was my point that the software progress is part of the hardware progress.

Without the limitations of that era not beeing taken away, software progress could not really have happened.

I diagree.

1)Facts are that there is also software progress for the same hardware.

It is not that stockfish16.1 is better than Stockfish8 only if you use the newest hardware and you can use hardware from the time of Stockfish8 and it is still better.

2)It may be not so easy to generate something better for the very old hardware but it does not mean that it is impossible.

I do not believe the knowledge of today cannot help to modify the software of the past to become better at least in long time control.

For example I believe old hardware should be enough to implement LMR in the way engines use it today.

The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess

Re: The future of computer chess