Questions for the Stockfish team


bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Chan Rasjid wrote:
Stockfish is open because it is a book on chess engine development, but instead of being presented as a collection of papers or as "how to" documentation, it is presented in the form of actual source code, which, in my personal opinion, is the best way to present / teach software-related stuff.
The source code book of Stockfish is the worst way to teach the "secrets of top chess engines". After reading through all the volumes of the Encyclopedia Britannica, what we get in the end is the end.

The Ippolit authors know how to write a top chess engine but they code it in an almost inscrutable style of C. It is nothing evil. They are just playing a game - "if you want to know the secrets you don't have it easy".

Why is Komodo a top engine? Don writes a new chess program and it easily makes it to the top because he knows the "secrets". It is the same with Naum and some others.

Bob Hyatt said that there is a gap between Crafty and Stockfish but "not an insurmountable gap". I say the gap is insurmountable. The gap exists because Crafty is NOT Stockfish, and as long as Crafty is "not" Stockfish it may never close the gap. Why?

I'll ask Bob Hyatt: "Is the evaluation of Crafty significantly different from that of Stockfish?" If he says it is a clone of Stockfish's evaluator, then I quit - I don't know why there is a gap. But if he says "yes, the differences are many and substantial", I will then say "There you are, you have answered for yourself where the gap is". A top engine's evaluation is not simple, and all the many different aspects must fit, as in "...Everything should fit together perfectly". If Crafty wants an Elo 3000 evaluation that is at the same time substantially different from that of Stockfish, then it must be "substantially different in a smarter manner".
It is very difficult to beat the evaluation of Stockfish, as these top engines have pushed chess programming to its limit, starting from Rybka. BB mentioned in a post somewhere that (probably) what Vasik contributed to computer chess is scientific testing - but he must know fairly clearly what to test.

If Michael Sherwin were to ask "Are you sure you got it right - that evaluation is this important?", my answer is a question: "If you reverse the sign of the evaluation of Stockfish and play it against TSCP, which is the stronger program?".

Rasjid.
Evaluation is not _that_ important. To understand this, you should try to take Stockfish (or any other program) and "dumb it down" so that it is more fun to play for less skilled players. Just tweaking the eval does not make this easy. I suspect the evaluations are not that different, in fact. There is a known set of important positional concepts that most programs cover, and then a few that are different for each program. Eval covers every situation from the start of the game to the end, but the program is not in all of those situations for the entire game. If you break passed pawn scoring, you don't lose 100 Elo, for example.

I think looking at the eval is the wrong place to look for the difference. If my primary goal was to catch Stockfish 1.8, we could do that in 3 months or less. But I don't want a "copy". As far as the difference between the two programs in Elo, one important fact overlooked is that the numbers I quote are for _current_ Crafty, not the released version, and the current version is around +60 Elo stronger than the released version. This version will be out just as soon as I finish testing the "skill" command changes on the cluster to see what happens to Elo as skill level is reduced.
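For a sense of scale, the +60 Elo figure above can be turned into an expected score with the standard logistic Elo formula. This is only a minimal sketch; the function name is made up for illustration and is not taken from Crafty or any testing tool.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the side rated elo_diff higher.

    Uses the standard logistic Elo model: 1 / (1 + 10^(-diff / 400)).
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Print the expected score for a few rating differences.
    for diff in (0, 60, 100):
        print(f"{diff:+4d} Elo -> expected score {expected_score(diff):.3f}")
```

Under this model a +60 Elo edge corresponds to scoring roughly 58.5% against the older version.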

Some seem happy to copy other programs. Somehow that is not a satisfying approach for me. And I don't plan on doing it myself.
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Questions for the Stockfish team

Post by Joost Buijs »

In my opinion eval is very important. Why do you use all kinds of search tricks like razoring, nullmove, singular extensions and LMR? Just to look deeper. You can look as deep as you want, but if your eval is off you won't get anywhere.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions for the Stockfish team

Post by bob »

Joost Buijs wrote:In my opinion eval is very important. Why do you use all kinds of search tricks like razoring, nullmove, singular extensions and LMR? Just to look deeper. You can look as deep as you want, but if your eval is off you won't get anywhere.
That's simply the wrong answer. We do razoring, null-move, etc to look deeper to find _tactics_. You should go back and read the discussions about trying to reduce the strength of a program by whacking up the evaluation only. It is _not_ easy to reduce the strength when you fiddle with positional stuff but leave the tactical genius in place...

Look at Fruit. Very strong. Very simple evaluation.

As far as your last statement, if I look deep enough I don't need _any_ evaluation at all...
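The last point can be illustrated on a toy game, since nobody can search chess to the end. A minimal sketch, assuming a subtraction game (take 1 to 3 stones, whoever takes the last stone wins) that is not from this thread: the search below has no heuristic evaluation at all and scores only terminal positions.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # stones a player may remove per turn (toy rules, assumed)


@lru_cache(maxsize=None)
def solve(stones: int) -> int:
    """Return +1 if the side to move wins with best play, else -1.

    No heuristic evaluation is used: every score comes from a terminal
    position, which the full-depth search always reaches.
    """
    if stones == 0:
        return -1  # the opponent took the last stone, so the side to move lost
    return max(-solve(stones - m) for m in MOVES if m <= stones)


def best_move(stones: int) -> int:
    """Pick the move whose resulting position is worst for the opponent."""
    legal = [m for m in MOVES if m <= stones]
    return max(legal, key=lambda m: -solve(stones - m))


if __name__ == "__main__":
    print(solve(21), best_move(21))
```

The catch, of course, is that this only works because the toy tree is tiny.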
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Questions for the Stockfish team

Post by Joost Buijs »

I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1 ply search will be sufficient as well. This is just theoretical.
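The second half of that statement can be shown on a toy game where a perfect evaluation is actually known. A minimal sketch, assuming a subtraction game (take 1 to 3 stones, taking the last stone wins) that is not from this thread; in that game a pile that is a multiple of 4 is lost for the side to move, so a 1-ply search that trusts the eval always picks a winning move when one exists.

```python
MOVES = (1, 2, 3)  # stones a player may remove per turn (toy rules, assumed)


def perfect_eval(stones: int) -> int:
    """Exact value for the side to move: -1 if stones is a multiple of 4, else +1."""
    return -1 if stones % 4 == 0 else 1


def one_ply_move(stones: int) -> int:
    """Choose a move by looking exactly one ply ahead and trusting the eval."""
    legal = [m for m in MOVES if m <= stones]
    # The eval is from the opponent's point of view after our move, so negate it.
    return max(legal, key=lambda m: -perfect_eval(stones - m))


if __name__ == "__main__":
    for pile in (5, 7, 10):
        print(pile, "->", one_ply_move(pile))
```

No such exact evaluation is known for chess, which is why the statement stays theoretical there.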

It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at a 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Questions for the Stockfish team

Post by Chan Rasjid »

Mangar wrote:Hi,

sorry the answer is not helpful for you. IMHO there is no "secret". Take all known eval and search features. Implement a chess engine without bugs and tune the features so that they fit together well. Then you get an engine with the strength of Stockfish.
It's hard work, it needs millions of test games to tune, and maybe a little luck that you don't get stuck in a local maximum too early.

When Fruit came out, the trick behind its early strength was late move reduction. This was easy to find out and implement in other chess engines. (Even though it was very hard for me to accept that it could work.)
In my tests I wasn't able to find a single trick that improves the strength of our chess engine.

Thus you are trying to find something that isn't there. Kind of frustrating, I know it well. I tried hard to find a trick.
Greetings Volker
Of course I am not saying you are wrong.

Many of us know the basics of chess programming, and Michael Sherwin was asking about some principles to follow to make a top engine. He knows that cloning Stockfish blindly will not make him understand why it is at the top.

Rasjid
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Questions for the Stockfish team

Post by Chan Rasjid »

Uri Blass wrote:
Chan Rasjid wrote:
Stockfish is open because it is a book on chess engine development, but instead of being presented as a collection of papers or as "how to" documentation, it is presented in the form of actual source code, which, in my personal opinion, is the best way to present / teach software-related stuff.
The source code book of Stockfish is the worst way to teach the "secrets of top chess engines". After reading through all the volumes of the Encyclopedia Britannica, what we get in the end is the end.

The Ippolit authors know how to write a top chess engine but they code it in an almost inscrutable style of C. It is nothing evil. They are just playing a game - "if you want to know the secrets you don't have it easy".

Why is Komodo a top engine? Don writes a new chess program and it easily makes it to the top because he knows the "secrets". It is the same with Naum and some others.

Bob Hyatt said that there is a gap between Crafty and Stockfish but "not an insurmountable gap". I say the gap is insurmountable. The gap exists because Crafty is NOT Stockfish, and as long as Crafty is "not" Stockfish it may never close the gap. Why?

I'll ask Bob Hyatt: "Is the evaluation of Crafty significantly different from that of Stockfish?" If he says it is a clone of Stockfish's evaluator, then I quit - I don't know why there is a gap. But if he says "yes, the differences are many and substantial", I will then say "There you are, you have answered for yourself where the gap is". A top engine's evaluation is not simple, and all the many different aspects must fit, as in "...Everything should fit together perfectly". If Crafty wants an Elo 3000 evaluation that is at the same time substantially different from that of Stockfish, then it must be "substantially different in a smarter manner".
It is very difficult to beat the evaluation of Stockfish, as these top engines have pushed chess programming to its limit, starting from Rybka. BB mentioned in a post somewhere that (probably) what Vasik contributed to computer chess is scientific testing - but he must know fairly clearly what to test.

If Michael Sherwin were to ask "Are you sure you got it right - that evaluation is this important?", my answer is a question: "If you reverse the sign of the evaluation of Stockfish and play it against TSCP, which is the stronger program?".

Rasjid.
It is clear that an intentionally bad evaluation can cause significant damage, but the question is not what the value of the evaluation is relative to an intentionally bad evaluation, but what its value is relative to a simple evaluation.

I did not try it with Stockfish, but I found in the past that Strelka with only a piece-square table evaluation can easily beat Joker, and Joker is significantly stronger than TSCP, so I believe that search is important and it is not obvious that the relative advantage of Stockfish is in the evaluation.

I think that it is possible to test this: you need to change the evaluation function of Crafty to give the same results as Stockfish's and test the new program against Crafty.

Uri
I think there is a simple reason why Strelka with PST only can easily beat Joker. It is what all of us understand by "out-searching" another engine.

Joker is relatively weak and the search of Strelka is relatively very strong, so it simply out-searches Joker. If the PST of Strelka accounts for some basic differences between normal game and endgame then, in itself, the PST is a "good enough" static evaluation. There is even a possibility that with just PST, Strelka could beat Stockfish if it is given significantly more time per move. Out-searching always works as long as a program "doesn't do anything wrong".
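To make "PST only, but aware of normal game versus endgame" concrete, here is a minimal sketch of a tapered piece-square evaluation. Everything in it is an assumption for illustration: the material values, the phase weights, and the formula-based tables are invented and are not Strelka's. The board is a plain dict from square index (a1 = 0, h8 = 63) to a piece letter, uppercase for White.

```python
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}
PHASE_WEIGHT = {"N": 1, "B": 1, "R": 2, "Q": 4}  # full board sums to 24
MAX_PHASE = 24


def centrality(sq: int) -> int:
    """0 on the rim, rising to 3 on the four centre squares."""
    file, rank = sq % 8, sq // 8
    return min(file, 7 - file, rank, 7 - rank)


def pst(kind: str, sq: int, endgame: bool) -> int:
    """Hypothetical piece-square bonus in centipawns, from White's viewpoint."""
    c = centrality(sq)
    if kind == "K":
        return 10 * c if endgame else -10 * c  # hide early, centralise late
    if kind == "P":
        return 5 * (sq // 8 - 1) * (2 if endgame else 1)  # reward advancement
    return 5 * c


def evaluate(board: dict) -> int:
    """Tapered PST-only score in centipawns; positive means good for White."""
    mg = eg = phase = 0
    for sq, piece in board.items():
        white = piece.isupper()
        kind = piece.upper()
        view_sq = sq if white else sq ^ 56  # mirror the ranks for Black
        sign = 1 if white else -1
        mg += sign * (PIECE_VALUE[kind] + pst(kind, view_sq, endgame=False))
        eg += sign * (PIECE_VALUE[kind] + pst(kind, view_sq, endgame=True))
        phase += PHASE_WEIGHT.get(kind, 0)
    phase = min(phase, MAX_PHASE)
    # Blend the two scores by how much non-pawn material is still on the board.
    return (mg * phase + eg * (MAX_PHASE - phase)) // MAX_PHASE


if __name__ == "__main__":
    # White king on e4 against a black king on a8, with and without queens.
    print(evaluate({28: "K", 56: "k"}))
    print(evaluate({28: "K", 56: "k", 3: "Q", 59: "q"}))
```

The same centralised king is scored as an asset with bare kings and much less so with queens on, which is about the smallest "normal game versus endgame" distinction a PST can make.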

Rasjid
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Questions for the Stockfish team

Post by Chan Rasjid »

Joost Buijs wrote:I do understand that with an infinite depth you don't need eval at all. With a perfect evaluation function a 1 ply search will be sufficient as well. This is just theoretical.

It is my feeling that everything depends on the quality of the evaluation. When i look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at 2850 level just because it is very good at tactics. I'm pretty sure that when i'm able to improve the evaluation function to a higher level it's elo will go up.
To write a 2850 engine is not easy, as everything must fit together well and it needs a lot of testing.

I am more inclined to agree that evaluation is important or even very important.

Rasjid

EDIT: BTW, can someone explain just once what is meant by "tactical" and by searching deeper for "tactical" advantage? I don't understand.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions for the Stockfish team

Post by Daniel Shawul »

To expect WDL tips as in the case of Tic Tac Toe is unrealistic. However I think it is much more important to concentrate your efforts on search than evaluation for faster progress. For the same search tree better eval should always give improvement if it doesn't reduce nps a lot. However that is not the case and most strong engines have a trick or two in them to beat you down in tactics. Trying to counter that by improving eval is very difficult if not impossible as that is a static feature. Once you get the search fixed though your eval could give you the advantage.

This discussion has led me to think a bit regarding probabilistic evaluation of positions. I wonder if there are past studies that measure the uncertainty of evals and use a Monte Carlo type search? Just a random thought.
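One way to picture such a probabilistic evaluation is a sketch like the following, which is only an illustration and not a reference to any existing study or engine. It estimates a win probability from random playouts and reports the standard error of that estimate as the uncertainty. The toy subtraction game (take 1 to 3 stones, taking the last stone wins) and all function names are assumptions.

```python
import math
import random

MOVES = (1, 2, 3)  # stones a player may remove per turn (toy rules, assumed)


def random_playout(stones: int, rng: random.Random) -> int:
    """Play random legal moves to the end; return 1 if the original side to move wins."""
    side = 0  # 0 = the side to move in the position being evaluated
    while stones > 0:
        stones -= rng.choice([m for m in MOVES if m <= stones])
        if stones == 0:
            return 1 if side == 0 else 0
        side ^= 1
    return 0  # empty pile at the start: the side to move has already lost


def monte_carlo_eval(stones: int, playouts: int = 2000, seed: int = 0):
    """Return (estimated win probability, standard error of the estimate)."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, rng) for _ in range(playouts))
    p = wins / playouts
    stderr = math.sqrt(p * (1.0 - p) / playouts)
    return p, stderr


if __name__ == "__main__":
    for pile in (1, 4, 10):
        p, se = monte_carlo_eval(pile)
        print(f"pile {pile:2d}: win probability ~ {p:.3f} +/- {se:.3f}")
```

Random playouts measure how a position fares under random play, not best play, so the estimate can disagree with the true game value; a real Monte Carlo search would steer the playouts.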

Daniel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Questions for the Stockfish team

Post by michiguel »

Daniel Shawul wrote:To expect WDL tips as in the case of Tic Tac Toe is unrealistic. However I think it is much more important to concentrate your efforts on search than evaluation for faster progress. For the same search tree better eval should always give improvement if it doesn't reduce nps a lot. However that is not the case and most strong engines have a trick or two in them to beat you down in tactics. Trying to counter that by improving eval is very difficult if not impossible as that is a static feature. Once you get the search fixed though your eval could give you the advantage.
I believe in the opposite approach: get a good eval before optimizing search, because many search tricks could depend on eval(). Of course, I have no proof to back this up and I may be in the minority.

In any case, I do not believe it is "either one or the other"

Miguel
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Questions for the Stockfish team

Post by Milos »

Joost Buijs wrote:It is my feeling that everything depends on the quality of the evaluation. When I look at my own engine, it has an evaluation function comparable to a 1600 player, but it plays at a 2850 level just because it is very good at tactics. I'm pretty sure that when I'm able to improve the evaluation function to a higher level, its Elo will go up.
Evaluation is just a dead end. There are so many different positional things that you need to implement to noticeably increase Elo, and all that assumes you don't lose any speed. However, with each new thing you lose more in speed than you gain in positional knowledge. Simply put, evaluation as it is in most strong programs is really at its local optimum, and you can't improve it further without drastically changing search.
On the other hand, you can improve search almost for free ;).