Questions for the Stockfish team

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Questions for the Stockfish team

Post by Houdini »

Milos wrote:Evaluation is just a dead end. There are so many different positional things that you need to implement to noticeably increase Elo, and all that under the assumption that you don't lose any speed. However, with each new thing you lose more in speed than you gain from the positional knowledge. Or simply put, evaluation as it is in the strongest programs is really in its local optimum, and you can't improve it any more without drastically changing the search.
On the other hand, you can improve search almost for free ;).
I disagree.
Houdini 1.03's 15-20 Elo increase is entirely due to improved evaluation.
Miguel, I'm on your side :).

Robert
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Questions for the Stockfish team

Post by jwes »

Chan Rasjid wrote:
Conclusion: Everything they do fits together very well. That's why Stockfish is so strong.
This answer is not helpful.

Michael Sherwin asked only a simple question - "What is the secret?" - and those who know are not willing to give a simple answer.

Rasjid
You forgot the smiley. If people didn't know better, they might think you were serious.
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Questions for the Stockfish team

Post by Chan Rasjid »

I will try to address the original post, which is really asking what the important principles are when it comes to writing a top engine.

I) Bugs - the influence of bugs might be more serious than most programmers suspect. It may be the single most important reason an engine falls behind despite having implemented every element known in chess programming.

II) A principle - don't do anything wrong.
Why is Houdini so strong? Because the author did everything right! He started from the Ipp* code, so there was nothing wrong to begin with. Robert Houdart then made changes, and he only kept changes that proved to be right, i.e. after testing. It is the same with Yuri Osipov and Vasik Rajlich. Vasik rewrote Fruit in bitboards, so again he started with nothing seriously wrong. He then added many right things, and thus Rybka was born.

There are many ways a chess program "can do wrong" that are serious enough to ensure it will never be a 3000 Elo program. Let me try to give some concrete examples:
a) What do you do when you reach a node that is down in material?
I remember the SF programmers mentioning in passing that they have a way to deal with nodes that are down in material. Ed of Rebel also mentioned in his article how to handle QS nodes that are far up or down in material before calling evaluation; a rough sketch of that idea is given below. There is always the temptation to implement a solution without thorough testing, but testing requires hardware resources as well as time. Without testing, it is better not to do anything.
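Purely as an illustration of what such a scheme might look like - this is not Stockfish's or Rebel's actual code, and quiesce(), material_balance(), full_eval() and LAZY_MARGIN are invented names - one could cut the node short when the material balance alone already decides it:

Code:
// Hypothetical lazy cut-off in quiescence when material is far outside the
// alpha-beta window. All names and the margin are assumptions for this sketch.
struct Position;                        // engine-specific board state
int material_balance(const Position&);  // cheap incremental material score
int full_eval(const Position&);         // expensive positional evaluation
int search_captures(Position&, int alpha, int beta);

const int LAZY_MARGIN = 200;            // about two pawns; needs real testing

int quiesce(Position& pos, int alpha, int beta) {
    int material = material_balance(pos);

    // Far ahead: even a pessimistic full eval would still fail high,
    // so return without paying for the full evaluation.
    if (material - LAZY_MARGIN >= beta)
        return beta;

    // Far behind: the full eval cannot reach alpha anyway,
    // so use the cheap material score as the stand-pat value.
    int stand_pat = (material + LAZY_MARGIN <= alpha) ? material
                                                      : full_eval(pos);
    if (stand_pat >= beta)
        return beta;
    if (stand_pat > alpha)
        alpha = stand_pat;

    return search_captures(pos, alpha, beta);   // normal capture search
}

Whether such a margin is safe, and whether skipping the full eval here gains or loses Elo, is exactly the kind of thing only testing can decide.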
b) Assume that pawn mobility is very important - that all top chess programs will always have it. I will try to describe a way to implement it:
i) Add up all the push squares the pawns can make, plus the number of captures the pawns could make, weighted at 5-10 cp per square or capture. We could improve on this: eliminate all PxP, since these always balance out for both sides, and also eliminate all pawn captures of bigger pieces, since there should already be sufficiently high bonuses for pawns attacking bigger pieces.
ii) So now just add up all the push squares the pawns can make. We could refine this by counting only "safe mobility", i.e. squares that are not attacked or that have our own protection. Is this good enough? No! There is a great weakness, and such an implementation of pawn mobility might even be counterproductive. It ignores something that might be critical: pawn mobility should exclude the pawns covering a king that has castled into a corner - the three shelter pawns. In some situations it is important that the squares in front of these pawns be occupied by our own pieces, especially a knight or bishop, and even more so when both sides have castled into the same corner. Counting mobility for these pawns would discourage our own pieces from placing themselves in front of them - a serious evaluation bug! (A small bitboard sketch of this idea follows below.)
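To make the pawn-mobility idea in i) and ii) concrete, here is a minimal self-contained bitboard sketch; the 5 cp weight, the "safe square" rule and the shelter exclusion are illustrative assumptions, not tested values from any engine:

Code:
#include <cstdint>

// Count set bits (portable; compilers also provide intrinsics for this).
int popcount64(uint64_t b) {
    int n = 0;
    while (b) { b &= b - 1; ++n; }
    return n;
}

// White pawn single pushes: shift one rank up, keep only empty squares.
// Bit layout assumed: bit 0 = a1 ... bit 63 = h8.
uint64_t white_pawn_pushes(uint64_t white_pawns, uint64_t occupied) {
    return (white_pawns << 8) & ~occupied;
}

// Score only "safe" pushes (target squares not attacked by enemy pawns) and
// skip the pawns shielding the castled king, so the term does not discourage
// parking a knight or bishop in front of them.
int white_pawn_mobility(uint64_t white_pawns, uint64_t occupied,
                        uint64_t enemy_pawn_attacks, uint64_t king_shelter) {
    const int PUSH_BONUS = 5;                        // centipawns, untested
    uint64_t counted = white_pawns & ~king_shelter;  // ignore shelter pawns
    uint64_t pushes  = white_pawn_pushes(counted, occupied)
                       & ~enemy_pawn_attacks;        // "safe" squares only
    return PUSH_BONUS * popcount64(pushes);
}

The same routine mirrored for black (shifting down a rank) would complete the term; whether it gains Elo is, again, something only testing can tell.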

There are many factors in evaluation, and the above example is just one way we could "do wrong" in evaluation! If a single bug independently costs 10 Elo and a program has a total of 10 such errors, the total cost might be greater than 100 Elo if the errors compound. So the principle is that "anything wrong" can be costly once the errors add up.

Then there is the question about whether search or evaluation is more important. It is like asking which part of the human anatomy is the most important - it cannot be answered.

A chess program is made up of elements that we can categorize, and two important elements are evaluation and search. Both are fundamentally essential; no program can do without either. A program without an evaluation only makes random moves - I am not talking about returning material, but about an eval() that returns a constant. A program without a search is even worse - it does not make a move at all! As both are essential, it seems we cannot ask which is more important.

What we know is that a program makes an evaluation at QS nodes. Returning material alone is sufficient if a program can somehow far out-search another. Returning a reversed eval() is not allowed, as that "does something grossly wrong" and then out-searching is of no use.

Although it seems easier to gain Elo through search, a bad evaluation could still hold a program back in the end. Evaluation could still be one of the reasons there is a fairly large gap between Crafty and Stockfish.

Rasjid
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Joost Buijs wrote:I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1 ply search will be sufficient as well. This is just theoretical.

It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
OK, some background. It turns out that if you replace Crafty's evaluation with a pure random number, it plays well above 2000 Elo. If you also disable all the search extensions, reductions, null-move and such, you still can't get it below 1800. There has been a long discussion about this, something I call "the Beal effect", since Don Beal first reported this particular phenomenon many years ago. So a basic search + random eval gives an 1800 player. Full search + full eval adds 1,000 to that. How much comes from each? Unknown. But I have watched many, many Stockfish vs Crafty games, and the deciding issue does not seem to be evaluation. We seem to get hurt by endgame search depth more than anything...
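For readers wondering what "replace the evaluation with a pure random number" looks like in code, a minimal sketch could be something like this (not Crafty's actual code; the range and seed are arbitrary assumptions):

Code:
#include <random>

// Stand-in evaluation for the experiment: return a small random score
// instead of computing anything about the position.
int random_eval() {
    static std::mt19937 rng(12345);                             // arbitrary fixed seed
    static std::uniform_int_distribution<int> dist(-100, 100);  // roughly +/- one pawn
    return dist(rng);
}

The explanation usually given for the Beal effect is that minimaxing random leaf values behaves like an implicit mobility term: positions with more reachable leaves offer more chances to draw a high random score, so the search drifts toward lines that keep more options open, even though each individual evaluation is meaningless.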
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Chan Rasjid wrote:
Joost Buijs wrote:I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1 ply search will be sufficient as well. This is just theoretical.

It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
To write a 2850 engine is not easy, as everything must fit together well and it needs a lot of testing.

I am more inclined to agree that evaluation is important or even very important.

Rasjid

EDIT: BTW, can someone just explain what is meant by "tactical" and by searching deeper for a "tactical" advantage? I don't understand.
Tactics is about winning material. Evaluations don't recognize "mate in 20" and such; the search finds that by shuffling wood around on the board. Ditto for winning a pawn or more. Positional judgement is about the placement of pieces on the board, without any regard to potential tactical issues.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Houdini wrote:
Milos wrote:Evaluation is just a dead end. There are so many different positional things that you need to implement to noticeably increase Elo, and all that under the assumption that you don't lose any speed. However, with each new thing you lose more in speed than you gain from the positional knowledge. Or simply put, evaluation as it is in the strongest programs is really in its local optimum, and you can't improve it any more without drastically changing the search.
On the other hand, you can improve search almost for free ;).
I disagree.
Houdini 1.03's 15-20 Elo increase is entirely due to improved evaluation.
Miguel, I'm on your side :).

Robert
This is basically irrelevant. Before you wrote a single line of code, you already had a _very_ good search written for you by someone else, so of course improving the eval can help. Start from scratch and you discover that search comes first; the eval can stay simple for a good while and still produce good results (again, see Fruit).
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Questions for the Stockfish team

Post by jwes »

Milos wrote:
Joost Buijs wrote:It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
Evaluation is just a dead end. There are so many different positional things that you need to implement to noticeably increase Elo, and all that under the assumption that you don't lose any speed. However, with each new thing you lose more in speed than you gain from the positional knowledge. Or simply put, evaluation as it is in the strongest programs is really in its local optimum, and you can't improve it any more without drastically changing the search.
On the other hand, you can improve search almost for free ;).
Do you have any evidence to support your statements?
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Questions for the Stockfish team

Post by Chan Rasjid »

jwes wrote:
Milos wrote:
Joost Buijs wrote:It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
Evaluation is just a dead end. There are so many different positional things that you need to implement to noticeably increase Elo, and all that under the assumption that you don't lose any speed. However, with each new thing you lose more in speed than you gain from the positional knowledge. Or simply put, evaluation as it is in the strongest programs is really in its local optimum, and you can't improve it any more without drastically changing the search.
On the other hand, you can improve search almost for free ;).
Do you have any evidence to support your statements?
What Milos (and Bob) say is generally correct. eval() is called many times, and adding any lines of code to it can be costly in terms of the speed trade-off. Also, very often we don't know whether the extra eval() code makes the engine weaker or stronger unless we test.

Adding lines to search() involves almost no speed trade-off.

Rasjid
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions for the Stockfish team

Post by Daniel Shawul »

You can always use your eval to guide your search, by looking at the value before and after a move to make reduction/pruning decisions; a rough sketch of this is given below. That doesn't necessarily require you to write the eval first, except that you may need to evaluate at every internal node. Specific evaluation features like king safety, passed pawns or other significant positional terms are sometimes used to trigger search extensions and guide the engine toward or away from such positions. But I honestly can't think of more.
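As a hypothetical illustration of "looking at the value before and after the move" - the names, depth limit and threshold below are invented for the sketch, not taken from any particular engine:

Code:
struct Position;                   // engine-specific board state
int static_eval(const Position&);  // score from the side to move's point of view

// Decide an extra reduction for a quiet move by comparing the static eval
// before the move with the (sign-flipped) eval after it.
int extra_reduction(const Position& before, const Position& after,
                    int depth, bool is_quiet_move) {
    if (!is_quiet_move || depth < 3)
        return 0;                               // only reduce quiet moves
    // After the move it is the opponent's turn, hence the sign flip.
    int improvement = -static_eval(after) - static_eval(before);
    return (improvement <= 0) ? 1 : 0;          // reduce non-improving moves more
}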

Also, the question is not which part to do first but which gives more benefit in terms of Elo per time spent. All parts of the engine are of course important and should be designed to complement one another. But the importance of search over eval has already been demonstrated by engines like OliThink, Fruit etc., while I have yet to see the reverse (a very good eval with an average search).
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions for the Stockfish team

Post by Daniel Shawul »

When you say random eval, do you mean tactics (material eval) + a random score for the positional terms, OR just a random score for everything? In the latter case I can't see how it can reach 1800 Elo. All the engine will do is make random moves, throwing away material. It may try to avoid mates when it sees them through the search, but how can it possibly think intelligently when it is fed garbage at the end?