Big new ideas in chess programming


Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Big new ideas in chess programming

Post by Henk »

Futility pruning does not give a noticeable increase in search depth for my chess program. Aren't the lowest levels skipped most of the time due to null-move or LMR reductions?
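For readers who haven't implemented it, here is a minimal sketch of the futility pruning under discussion, written as a fail-soft negamax search. The Position/Move API, qsearch, and MATE_SCORE are hypothetical stand-ins, and the margin values are illustrative placeholders, not tuned numbers from any engine.

Code:

int search(Position& pos, int alpha, int beta, int depth) {
    if (depth <= 0)
        return qsearch(pos, alpha, beta);   // drop into quiescence search

    // Futility condition: at frontier depths, if the static eval plus an
    // optimistic margin still cannot reach alpha, quiet moves are hopeless.
    static const int margin[4] = {0, 150, 300, 500};   // placeholder values
    bool futile = depth <= 3 && !pos.inCheck()
               && pos.staticEval() + margin[depth] <= alpha;

    int best = -MATE_SCORE;
    for (const Move& m : pos.legalMoves()) {
        // Skip quiet moves under the futility condition, but always
        // search at least one move so a real score is returned.
        if (futile && best > -MATE_SCORE && pos.isQuiet(m))
            continue;
        pos.make(m);
        int score = -search(pos, -beta, -alpha, depth - 1);
        pos.unmake(m);
        if (score > best)  best = score;
        if (best > alpha)  alpha = best;
        if (alpha >= beta) break;           // beta cutoff
    }
    return best;
}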
rtitle
Posts: 33
Joined: Wed Aug 14, 2013 7:23 pm

Re: Big new ideas in chess programming

Post by rtitle »

You pose an interesting question. I am a software developer by profession and a chessplayer/chess-engine-developer as a hobby. One thing I've observed from my professional life (software development) is that anytime someone says something like "this solution X is the best possible solution to a given problem Y; all that remains is tweaks on solution X", in the long run that statement is always proved wrong. Non-chess example: We thought how to build database systems was a solved problem. Textbooks said how and everyone was tweaking the same basic design for at least 20 years. And then a revolution happened in our understanding of how to build database systems, and that revolution is still going on today in a variety of "big data" projects at various research labs and companies.

So getting back to chess engines: basically they are solving a specialized kind of search problem (minimax search). If you want to ask "what would the ultimate solution to chess search look like?", you have to ask how current solutions scale up. Because: individual computers will continue to get more and more cores and more and more memory, and with the "cloud", clusters of many such computers will become available and cheap. So, how does the current "standard" chess engine design (e.g. as implemented by Stockfish) do at scaling up? Well, the standard design does a good job of using many cores in one machine, e.g. with PVS-style parallel search algorithms. It also does a pretty good job of utilizing memory, with a large transposition table. Scaling up to use clusters effectively is a challenge, because it's expensive to distribute the transposition table. My intuition is that a better use of memory than the transposition table would be to explicitly store a subset of the search tree. Of course you can't store the whole tree - it would take too much memory - but if you could somehow keep the "critical" subtree in memory, it would serve the same function as the transposition table (remembering what you previously computed about a position), would be less random about which subset of positions it stored, and would distribute better. Just one idea from left field; I have not implemented it.
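To make the "stored subtree" idea concrete, here is one possible shape it could take - entirely speculative, with every type and field invented for illustration (Move is a placeholder type):

Code:

#include <cstdint>
#include <vector>

struct TreeNode {
    uint64_t zobristKey;              // identity of the position
    int16_t  score;                   // last backed-up search score
    int16_t  depth;                   // remaining depth the score was computed to
    Move     bestMove;                // move that produced the score
    std::vector<TreeNode*> children;  // the retained "critical" subset
};

// Unlike a transposition table, which overwrites entries more or less at
// random, eviction here can follow the tree's own structure (e.g. drop
// subtrees that were refuted long ago), and whole subtrees can be pinned
// to one machine of a cluster, reducing cross-machine traffic.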

Or maybe someone will come up with a totally different and superior design someday... neural nets or g*d-knows-what.

Bottom line is, I think it's always wrong to say "This is it. It's the best that can be done ever." New and better ideas *will* be found.

Rich Title
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Big new ideas in chess programming

Post by Don »

Henk wrote:Futility pruning does not give a noticeable increase in search depth for my chess program. Aren't the lowest levels skipped most of the time due to null-move or LMR reductions?
But nothing seems to work in your program. Futility pruning gives a massive speedup for Komodo.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Big new ideas in chess programming

Post by Don »

rtitle wrote:You pose an interesting question. I am a software developer by profession and a chessplayer/chess-engine-developer as a hobby. One thing I've observed from my professional life (software development) is that anytime someone says something like "this solution X is the best possible solution to a given problem Y; all that remains is tweaks on solution X", in the long run that statement is always proved wrong. Non-chess example: We thought how to build database systems was a solved problem. Textbooks said how and everyone was tweaking the same basic design for at least 20 years. And then a revolution happened in our understanding of how to build database systems, and that revolution is still going on today in a variety of "big data" projects at various research labs and companies.
At a company I used to work for, we formed groups to discuss interesting ways to solve problems we were having. I ended up in a group figuring out how to deal with a problem involving a lot of data - data that we had to organize and get to each customer. This was before "big data", and I was a little voice that said, "we need to forget about SQL databases and do something a lot more scalable." It was obvious to me that just incrementally improving this was not going to scale. And we were supposed to be exploring innovative (out of the box) ideas, as it was a kind of retreat for brainstorming. However, I was shouted down, and everyone else was mentally locked into variations of the old way of doing things. Ironically, the big SQL database solution was not actually very useful - it was not as if we needed the ability to present information in hundreds of potential ways with dozens of tables.

That whole experience (along with many other things) proved something I have been discovering all my life - that all of us have a pretty limited imagination, even if we personally view ourselves as exceptions in this regard (we are not). We cannot go very far outside the box we live in at the moment. Engineers can be extremely conservative - even the ones we consider brilliant.

With chess we have seen that for decades now. It started 40 years ago with proclamations that computers could never attain the expert level of chess (later changed to "master" level), complete with all the dutiful comparisons to the age of the universe and the number of electrons in it, offered as proof that chess was too big for computers to play much beyond the 1800 level.

You can see a replay in computer go. Just a few years ago the strongest go programs were not even single-digit kyu strength, and many people were spewing wisdom about how 1 dan was virtually unattainable and masters were gods, untouchable. Few had the "vision" to imagine programming go outside the old and boring way: evaluating each move (basically a 1-ply search), combined with some sort of life-and-death analysis of the groups on the board and a big database of patterns to solve problems. It was an extremely static, completely unscalable approach, and the best programs were the ones that fine-tuned and tweaked this same basic approach the best.

Many times I said that some kind of highly selective tree search was going to be absolutely required in go, even if we did not yet know how to do it. I was laughed at for this, and then people patiently did the universe comparison for me again, like they used to do in chess. But look at what is being done now: a highly selective tree search in go!

Chess does get new big ideas every few years and it will continue to happen. There are many who will say that is not the case, but they say this because they cannot imagine what those ideas are. They are victims of their own chicken-and-egg problem. Every generation pretty much believes it has figured out all the big things and is especially wise.

Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Big new ideas in chess programming

Post by Henk »

Don wrote:
Henk wrote:Futility pruning does not give a noticeable increase in search depth for my chess program. Aren't the lowest levels skipped most of the time due to null-move or LMR reductions?
But nothing seems to work in your program. Futility pruning gives a massive speedup for Komodo.
"Nothing" is a bit too much. Check extensions do not work in my chess program either. It looks like LMR might work even in my chess program, though it is a burden to tune.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Big new ideas in chess programming

Post by mcostalba »

Don wrote: Chess does get new big ideas every few years and it will continue to happen.
Every few years!?!?!?

Could you please put a "first seen" date alongside each item of the list below (your list)?

1. quiescence "stand pat" search
2. hash tables
3. check extensions
4. futility pruning
5. null move pruning
6. LMR


I am interested to understand what you think did not exist a few years ago.
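For context, item 1 on the list - quiescence search with its "stand pat" cutoff - is sketched below against the same hypothetical Position/Move API used above: only captures are searched, and the side to move may "stand pat" on the static evaluation.

Code:

int qsearch(Position& pos, int alpha, int beta) {
    // Stand pat: the static eval serves as a lower bound, since the
    // side to move can usually decline all captures.
    int standPat = pos.staticEval();
    if (standPat >= beta)
        return standPat;
    if (standPat > alpha)
        alpha = standPat;

    for (const Move& m : pos.legalCaptures()) {
        pos.make(m);
        int score = -qsearch(pos, -beta, -alpha);
        pos.unmake(m);
        if (score >= beta)
            return score;
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}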
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: Big new ideas in chess programming

Post by Robert Pope »

mcostalba wrote:
Don wrote: Chess does get new big ideas every few years and it will continue to happen.
Every few years!?!?!?

Could you please put a "first seen" date alongside each item of the list below (your list)?

1. quiescence "stand pat" search
2. hash tables
3. check extensions
4. futility pruning
5. null move pruning
6. LMR


I am interested to understand what you think did not exist a few years ago.
Well, "few" doesn't necessarily have to mean three or four.

I don't think LMR gained much traction until about ten years ago.

20 years ago, null move wasn't universal by any means (Deep Blue, for instance, didn't use it), and its implementation was generally pretty static (R=2 or R=3). I think there have been several refinements to it since then.
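One such refinement: moving from a fixed reduction to one that grows with remaining depth. A sketch with an invented formula and the same hypothetical helpers as above (this fragment would sit inside search(), before the main move loop):

Code:

if (!pos.inCheck() && depth >= 2 && pos.hasNonPawnMaterial()) {
    // Classic static scheme: const int R = 2;  (or 3)
    // Adaptive scheme: reduce more as the remaining depth grows.
    int R = 2 + depth / 6;
    pos.makeNullMove();
    int score = -search(pos, -beta, -beta + 1, depth - 1 - R);
    pos.unmakeNullMove();
    // If we fail high even after handing the opponent a free move,
    // the node is almost certainly good enough: prune it.
    if (score >= beta)
        return score;
}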

The idea of endgame tablebases isn't new, but bitbases, on-the-fly generation, and much-improved compression/decompression schemes have all come along since 2000.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Big new ideas in chess programming

Post by gladius »

JohnS wrote:I've been amazed at the rapid strength increase of Stockfish and compliment the development team for providing us with a super-strong and free engine. Looking at the development log at abrok.eu/stockfish, it's interesting to see that most changes involve only a few lines of code and give around 2-5 Elo each.

So my question is: Is this the future for strength increases - extensive testing with small improvements (discounting increases due to hardware speed improvements)? Have we seen the last of the big new ideas in chess programming that could result in major Elo increases (say greater than 50 for a single idea)?

In asking the question I'm in no way trying to downplay the success of the Stockfish team or any other chess programmer. I have huge respect for all of them and their skills. This is a philosophical question.
I think there are definitely ideas out there remaining to be found that will be very impactful, even within the current alpha-beta framework - and that's before talking about things like MC searches.

As others have mentioned, you can have a strong idea like LMR, but the small tweaks on it really make a big difference, and over time that's where a lot of the strength comes from. Looking at LMR now, it's a relatively small piece of code, but calling it one idea is glossing over history. The specific LMR formulation in use today has taken tons of experimentation and tweaks to arrive at its current form.
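To illustrate how much tuning surface a "relatively small piece of code" hides, here is a generic LMR fragment (the thresholds and the log formula are invented for the example, not any engine's actual rule) - every constant in it is a knob that took games to settle:

Code:

#include <cmath>

// Inside the move loop of search(); moveCount is the index of move m.
int reduction = 0;
if (depth >= 3 && moveCount > 3 && pos.isQuiet(m) && !pos.inCheck())
    reduction = int(0.5 + std::log(depth) * std::log(moveCount) / 2.0);

pos.make(m);
// Reduced zero-window search first; if it unexpectedly beats alpha,
// re-search at full depth to confirm.
int score = -search(pos, -alpha - 1, -alpha, depth - 1 - reduction);
if (reduction > 0 && score > alpha)
    score = -search(pos, -beta, -alpha, depth - 1);
pos.unmake(m);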

One of the big advances of the last 10 years has been the scientific measurement of changes, by running many thousands of games. This has resulted in a period of "optimization wars", where the engines are pushed to their limits within the current alpha-beta framework. We still have quite a ways to go here. If someone were magically able to tweak all the parameters "perfectly" in SF, for example (no code changes other than changing numbers), I'd guess there would be at least a 100 Elo gain.
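A quick back-of-the-envelope calculation shows why "many thousands of games" is no exaggeration. Under a simplified win/loss-only model (real matches have draws, which shrink the variance further), the error bars on a measured Elo difference work out as follows:

Code:

#include <cmath>
#include <cstdio>

int main() {
    double n = 50000;   // games played
    double s = 0.52;    // observed score of the new version
    double elo   = 400.0 * std::log10(s / (1.0 - s));
    double se    = std::sqrt(s * (1.0 - s) / n);   // std. error of the score
    double seElo = se * 400.0 / (std::log(10.0) * s * (1.0 - s));
    // Prints roughly: 13.9 Elo +/- 1.6 (95%: +/- 3.0)
    std::printf("%.1f Elo +/- %.1f (95%%: +/- %.1f)\n",
                elo, seElo, 1.96 * seElo);
}

So in this model even 50,000 games resolve a difference only to about +/- 3 Elo at 95% confidence, which is why patches worth 2-5 Elo need runs of that scale to be accepted or rejected reliably.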
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Big new ideas in chess programming

Post by Don »

gladius wrote:
JohnS wrote:I've been amazed at the rapid strength increase of Stockfish and compliment the development team for providing us with a super-strong and free engine. Looking at the development log at abrok.eu/stockfish, it's interesting to see that most changes involve only a few lines of code and give around 2-5 Elo each.

So my question is: Is this the future for strength increases - extensive testing with small improvements (discounting increases due to hardware speed improvements)? Have we seen the last of the big new ideas in chess programming that could result in major Elo increases (say greater than 50 for a single idea)?

In asking the question I'm in no way trying to downplay the success of the Stockfish team or any other chess programmer. I have huge respect for all of them and their skills. This is a philosophical question.
I think there are definitely ideas out there remaining to be found that will be very impactful, even within the current alpha-beta framework - and that's before talking about things like MC searches.

As others have mentioned, you can have a strong idea like LMR, but the small tweaks on it really make a big difference, and over time that's where a lot of the strength comes from. Looking at LMR now, it's a relatively small piece of code, but calling it one idea is glossing over history. The specific LMR formulation in use today has taken tons of experimentation and tweaks to arrive at its current form.
That is true of hash tables, null move pruning, and everything else - in fact there is no single specific LMR formulation, or hash table formulation, or anything else.

LMR is a very specific concept, and it is called "Late Move Reductions" for a reason. It doesn't matter that there are many possible ways to implement it, any more than it matters that no two programs do exactly the same extensions. It's like alpha-beta pruning - it's a specific concept whether you use mtd(f), PVS search, or something else.

In my opinion you cannot credibly say it's not really one idea - unless you want to say the same thing about extensions, alpha-beta, hash tables, history, and everything else.

One of the big advances of the last 10 years has been the scientific measurement of changes, by running many thousands of games. This has resulted in a period of "optimization wars", where the engines are pushed to their limits within the current alpha-beta framework. We still have quite a ways to go here. If someone were magically able to tweak all the parameters "perfectly" in SF, for example (no code changes other than changing numbers), I'd guess there would be at least a 100 Elo gain.
I agree completely. It's not something that is actually IN a chess program and that is why I didn't put it on my list of big things - but I was tempted to do so and perhaps it belongs there.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Big new ideas in chess programming

Post by gladius »

Don wrote:
gladius wrote:As others have mentioned, you can have a strong idea like LMR, but the small tweaks on it really make a big difference, and over time that's where a lot of the strength comes from. Looking at LMR now, it's a relatively small piece of code, but calling it one idea is glossing over history. The specific LMR formulation in use today has taken tons of experimentation and tweaks to arrive at its current form.
That is true of hash tables, null move pruning, and everything else - in fact there is no single specific LMR formulation, or hash table formulation, or anything else.

LMR is a very specific concept, and it is called "Late Move Reductions" for a reason. It doesn't matter that there are many possible ways to implement it, any more than it matters that no two programs do exactly the same extensions. It's like alpha-beta pruning - it's a specific concept whether you use mtd(f), PVS search, or something else.

In my opinion you cannot credibly say it's not really one idea - unless you want to say the same thing about extensions, alpha-beta, hash tables, history, and everything else.
Certainly, there was the key insight of reducing the depth of moves late in the move list, with a re-search if necessary. But reducing it down to Elo terms, how much comes from the idea vs. the implementation choices?