stockfish fail high fail low

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: stockfish fail high fail low

Post by Uri Blass »

mcostalba wrote:
zamar wrote:I understand, but we need a clean solution to fix this. About the practical cases: I don't know :(
Joona, I think we should give a kind of "clean solution" definition to our friends here ;-)

Clean solution: a solution is considered clean _only_ when

1) It removes more lines of code then it adds

2) Is supported by strong and serious testing evidence


It is interesting to note that Uri's suggestions although always imaginative and creative consistently fail both the above points. I have since long abandoned the wishful thinking of point 2, but at least point 1 _could_ be honored once in a while.

Uri, just to be clear. Read my lips: "No Hacks !" :-)

The secret of SF quick improvments is that we try our best to stay away from tricks, hacks, shortcuts and all that kind of garbage because they are no way out roads, you go along that line hoping it brings you sometime and suddendly you find out the engine is a mess and you cannot no more progress on that.

If implementation is simple and clear it can be improved further, otherwise you are at the end of the road.
I have the following comments:

1)Note that in stockfish1.7 you added more lines then the lines that you deleted relative to stockfish1.6 because
I read the following:
58 files changed, 2136 insertions(+), 1901 deletions(-) so it seems that you do not do only changes that removes more lines of code then adding lines of codes.

2)My exeperience with the fail high fail low behaviour is clearly unstable.
In one of my correspondence games I got the following scores(with the same first pv move) and in previous analysis that I did stockfish simply crashed after more than 24 hours of search and I do not know if it is related to fail low fail high problem(in this case I decided to stop stockfish and make it analyze from clear hash after the first move because I suspect that there may be a bug).

I do not give the moves because this games is still not over.

36 08:12 1.531.788.082 3.109.219 -0.64
37- 08:42 1.622.035.970 3.105.456 -0.76
37- 10:12 1.891.115.130 3.087.439 -0.88
37+ 13:21 2.473.535,745 3.085.694 -0.52
37 14:01 2.603.596,766 3.093.676 -0.64
38- 19:00 3.523.314,284 3.090.093 -0.72
38- 23:17 4.320.447,937 3.092.595 -0.80
38- 25:58 4.808.657,047 3.085.470 -0.96
38+ 27:44 5.126.212,642 3.080.585 -0.56
38+ 33:08 6.124.192,272 3.080.214 -0.48
38 35:43 6.592.299,159 3.075.585 -0.76
39- 41:54 7.724.662,632 3.071.714 -0.88
39- 44:57 8.271.452,842 3.066.131 -1.01
39+ 49:13 9.025.704,344 3.056.351 -0.64
39+ 51:36 9.472.362,932 3.059.438 -0.52
39- 59:40 10.973.719,229 3.064.694 -1.25
39+ 1:17:41 14.236.335,495 3.054.021 -0.28

3)thinking about what you can change without adding more lines
I have no good idea but I think that maybe it is better to use
researchCountFH and researchCountFL in the size of the aspiration window
In the relevant case the score at depth 36 and at depth 37 is the same
but depth 37 had fail low and fail high.
I suggest that you may try to define
researchCountFH and researchCountFL as global variable like aspiration_delta when you can add to aspiration_delta some number like
1<<(researchCountFL+researchCountFH)
It seems to be slightly more code but not by much.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: stockfish fail high fail low

Post by Sven »

mcostalba wrote:
michiguel wrote: You cannot attach virtue or vice to the number of lines of code, IMHO.
Hi Miguel,

improving strength you know is not easy, improving strength reducing lines of code is even more difficult and _normally_ it means you have a clear understanding of the problem and of all its side-effects: when this conditions is true your code _happens_ to be also clean and elegant without any visible effort.

It is a consequence of having a clean and complete vision of the problem.

When your code is a mess it means you have not understood the big picture and you are blindly handwaving here and there.

This is IMHO, of course ;-)
I think it is the other way round. Clean and "elegant" code usually implies having understood the big picture. Not having understood it often results in "messy" code. When reading your statement I get the impression that it reverses these implications, A => B is not necessarily the same as B => A (but as !B => !A). Please correct me if I have misunderstood you.

A good understanding of the problem does not always mean that you have or create "good" code, for different possible reasons including

- lack of time or other resources,
- lack of coding/debugging experience,
- too many unfinished ideas paired with inability to complete one thing before starting another (I am expert here :-( ),
- a long history of development with several different programmers and styles being involved.

Another point, I see no strong connection between "good" or "elegant" code and the number of lines of code. Improvements of code elegance (if objectively measurable at all) are also possible by adding some lines. "Local readability" may be an important goal IMO, and sometimes you can achieve it by shortening some complex code here, "locally", and adding another function there which is needed for the shortening, resulting in an overall increase of code lines which does not really matter.

Sven
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: stockfish fail high fail low

Post by mcostalba »

Sven Schüle wrote: I think it is the other way round. Clean and "elegant" code usually implies having understood the big picture. Not having understood it often results in "messy" code. When reading your statement I get the impression that it reverses these implications, A => B is not necessarily the same as B => A (but as !B => !A). Please correct me if I have misunderstood you.

A good understanding of the problem does not always mean that you have or create "good" code, for different possible reasons including

- lack of time or other resources,
- lack of coding/debugging experience,
- too many unfinished ideas paired with inability to complete one thing before starting another (I am expert here :-( ),
- a long history of development with several different programmers and styles being involved.

Another point, I see no strong connection between "good" or "elegant" code and the number of lines of code. Improvements of code elegance (if objectively measurable at all) are also possible by adding some lines. "Local readability" may be an important goal IMO, and sometimes you can achieve it by shortening some complex code here, "locally", and adding another function there which is needed for the shortening, resulting in an overall increase of code lines which does not really matter.

Sven
Ok, you are very nitpicking here ;-)

When I talk about reducing lines of code I give as assumed keeping the same code style: of course I can remove all the comments and shrink, but I didn't mean that.

This is for the "Local readability" point you raised.

..regarding the implication I understand your point....but I still keep my definition ;-) ..but also yours stand: I was just talking from the developer point of view, you from the code reader/reviewer point of view.

Regarding the points that could lead to bad code, I assume we are talking of a developer that _can_ write good code...not a beginner or someone that publish half cooked code (note that having few time does not imply to publish half cooked or hurry up code, just to delay the relase).

I also answer to Uri when says that 1.7 adds more lines that it removes: I never said that we write _only_ elegant code :-) ...ok...when you add functionality it is normal to add also code, but my point was another and I think you got it.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Clean solution (Re: stockfish fail high fail low)

Post by Sven »

mcostalba wrote:
Sven Schüle wrote:I think it is the other way round. Clean and "elegant" code usually implies having understood the big picture. Not having understood it often results in "messy" code. When reading your statement I get the impression that it reverses these implications, A => B is not necessarily the same as B => A (but as !B => !A). Please correct me if I have misunderstood you.

A good understanding of the problem does not always mean that you have or create "good" code, for different possible reasons including

- lack of time or other resources,
- lack of coding/debugging experience,
- too many unfinished ideas paired with inability to complete one thing before starting another (I am expert here :-( ),
- a long history of development with several different programmers and styles being involved.

Another point, I see no strong connection between "good" or "elegant" code and the number of lines of code. Improvements of code elegance (if objectively measurable at all) are also possible by adding some lines. "Local readability" may be an important goal IMO, and sometimes you can achieve it by shortening some complex code here, "locally", and adding another function there which is needed for the shortening, resulting in an overall increase of code lines which does not really matter.
Ok, you are very nitpicking here ;-)

When I talk about reducing lines of code I give as assumed keeping the same code style: of course I can remove all the comments and shrink, but I didn't mean that.

This is for the "Local readability" point you raised.

..regarding the implication I understand your point....but I still keep my definition ;-) ..but also yours stand: I was just talking from the developer point of view, you from the code reader/reviewer point of view.

Regarding the points that could lead to bad code, I assume we are talking of a developer that _can_ write good code...not a beginner or someone that publish half cooked code (note that having few time does not imply to publish half cooked or hurry up code, just to delay the relase).

I also answer to Uri when says that 1.7 adds more lines that it removes: I never said that we write _only_ elegant code :-) ...ok...when you add functionality it is normal to add also code, but my point was another and I think you got it.
There was zero nitpicking intended.

You gave your opinion that understanding the big picture were already sufficient to create good code, so that "messy" code would mean a lack of that understanding. I replied with my opinion that there are several practical reasons why "understanding" could be insufficient to also get well-written code (of course it helps a lot, agreed on that). Based on my experience, the reality is different from the "understood == well-coded" picture you are drawing. Especially when it comes to releasing a new version it is very common practice (and there is nothing to criticize about it) to give higher priority on getting and keeping stable functionality than on code elegance. And you know well the "never change a running system" advice :-)

My point of view is that of a developer. I would not separate that from the point of view of a code reader/reviewer as you did. Both will be reading the source code to understand how it works, or why it does not work as expected or required. A reviewer needs fairly good knowledge about development himself to produce qualified review results. So I see no different viewpoints. A possible advantage of the reviewer is that he is not involved and can be more neutral.

Regarding "lines of code", I do not understand how the idea of "removing comments" is related to what I wrote. My point was different. Improving "local readability" can be done by replacing a complex piece of code by a call to a newly created function, but the total lines of code will usually increase by that. In your measurement this kind of change could be dropped since it makes the whole program slightly bigger. That was my point.
mcostalba wrote:Clean solution: a solution is considered clean _only_ when

1) It removes more lines of code then it adds

2) Is supported by strong and serious testing evidence
So to come back on your "what is a clean solution" advice, for me it is defined differently regarding point 1. It is a solution that is

- consistent with the overall idea of the rest of the program,
- complies to the rules and goals which (may) have been defined for the project, and
- does not contradict to commonly accepted principles of writing good software. "Clean" in this context is often understood as not being a "hack" that is introduced for a very special purpose but may have implications which have not been thought through completely, or which are known but regarded as "not important enough to fix now" or even as "we can live with it".

The definition of "accepted principles" is difficult because you won't find *the one and only* item 1..20 cookbook for it on the net or in any document, but instead thousands of different advices. So these principles are subject to both experience and individual style to a great amount.

Therefore I think one should better not focus too much on "lines of code" as indicator for the "cleanness" of a solution.

Being in your place, I would give a try at least to Uri's proposal. Perhaps you find a "clean" way to implement it, in order to see whether it improves Stockfish.

Sven
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Clean solution (Re: stockfish fail high fail low)

Post by mcostalba »

Sven Schüle wrote: Being in your place, I would give a try at least to Uri's proposal. Perhaps you find a "clean" way to implement it, in order to see whether it improves Stockfish.
Thanks for your clarification, I agree on what you said.

Regarding Uri (or someone else that comes with an idea) I am not going to test it if _he_ is not willing to invest his time in this, my principle is that neither I will do.

The only exception is if I found an idea extremely interesting...but this is not the case ;-)
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Clean solution (Re: stockfish fail high fail low)

Post by Sven »

mcostalba wrote:
Sven Schüle wrote: Being in your place, I would give a try at least to Uri's proposal. Perhaps you find a "clean" way to implement it, in order to see whether it improves Stockfish.
Thanks for your clarification, I agree on what you said.

Regarding Uri (or someone else that comes with an idea) I am not going to test it if _he_ is not willing to invest his time in this, my principle is that neither I will do.

The only exception is if I found an idea extremely interesting...but this is not the case ;-)
O.k., I can fully accept that because you are the one to decide about your time.

Sometimes people are indeed willing to test their own proposals but do not have an appropriate environment for that, for instance when it comes to SMP or cluster related stuff. I guess in such cases you might think differently.

Sven
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Clean solution (Re: stockfish fail high fail low)

Post by mcostalba »

Sven Schüle wrote: Sometimes people are indeed willing to test their own proposals but do not have an appropriate environment for that, for instance when it comes to SMP or cluster related stuff. I guess in such cases you might think differently.
I won't argue on hypothetical cases, if someone steps in saying: "I have this wonderful idea (patch attached) but I cannot test myself because of XXX" then for me it is good whenever is the XXX part as long as is realistic.

I would like to note that to test a patch you don't need _any_ special software as long as you have a computer, all the other needed software is free: compiler / SF sources / tournament manager (cutechess is free and very good).

If it is single core is good the same, more then a quarter of commits that went in SF from Glaurung times have been tested on a single core (my old PC). Only very specific thread (SMP) code requires SMP PC, but in that case also a Dual Core is enough. Anyhow this are all void words just to keep the discussion on: Let's present a real case and we will see what to do.
User avatar
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am
Full name:   

Re: Clean solution (Re: stockfish fail high fail low)

Post by Eelco de Groot »

A real case, no offense I hope to any of the Stockfish authors and it is maybe something more for the Stockfish team as a whole to discuss, at least I don't think it should be avoided, I also do not really want to start a big thread in this forum, I do not even expect a reply, but the real case I think is what to with ideas from decompiled private or commercial programs.


A few points that would come to mind, just personally speaking and as I am using Stockfish code it is something I also should consider for myself when I would want to publish something derived form Stockfish...
  • Stockfish has such a solid codebase by now, I think it should stand on its own now as it certainly can.
  • There is no commercial interest, or plans to go commercial with Stockfish, so there is also no real need to incorporate every idea available if the source is in some doubt.
  • GPL questions, in case there is some doubt, even if just using ideas, in spirit GPL code should be free for everyone and not use proprietary ideas i.e. from possibly illegally published code.
  • You can't and don't need to avoid learning these ideas "sideways", there is no stopping that I know, and I know I myself advocated translating the Ippolit code to more readable sources, as that would still be preferred in the end, so that would be the other side of this debate and even an opposite view to my own points given above. The question is maybe also: do we just continue as before if another case like this happens?
I don't mean to offend anyone and I don't want to start a flame war. I just wondered what will happen with the next decompiled program that is "translated" to "open source". I mean for sure we don't need to beat Ivanhoe, I couldn't care less about that.

Sorry to be off topic.
Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
QED
Posts: 60
Joined: Thu Nov 05, 2009 9:53 pm

Re: stockfish fail high fail low

Post by QED »

Stockfish idea is that score from transposition table is not trusted at pv node. But in search.cpp in the function search_pv() there is

Code: Select all

        // Step 15. Full depth search
        if &#40;doFullDepthSearch&#41;
        &#123;
            ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
            value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;

            // Step extra. pv search &#40;only in PV nodes&#41;
            if &#40;value > alpha && value < beta&#41;
                value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
        &#125;
The strange thing is that if -search() makes value >= beta, -search_pv() would not be called, and we return bound that can be based on TT value. That bound should not be trusted.

First simple solution (that reduces the length of code) is to drop value < beta condition

Code: Select all

        // Step 15. Full depth search
        if &#40;doFullDepthSearch&#41;
        &#123;
            ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
            value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;

            // Step extra. pv search &#40;only in PV nodes&#41;
            if &#40;value > alpha&#41;
                value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
        &#125;
but I thought this would slow down search a bit in cases where -search_pv() would end up with value <= alpha, so I have tested "pv zero width verification" instead

Code: Select all

        // Step 15. Full depth search
        if &#40;doFullDepthSearch&#41;
        &#123;
            ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
            value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;

            // PV zero-width verification
            if &#40;value > alpha&#41;
                value = -search_pv&#40;pos, ss, -alpha - 1, -alpha, newDepth, ply+1, threadID&#41;;

            // Step extra. pv search &#40;only in PV nodes&#41;
            if &#40;value > alpha && value < beta&#41;
                value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
        &#125;
And this got -1 (bayes)elo with error bars +-13 (or so, I accidentaly deleted the .pgn file) after 1000 games (time control 30 seconds/game, 32MB hash, 1 thread). There are more tests under way. So it does not change playing strength and it could return more reliable score.

Of course, there is also sp_search_pv(), I have not fully used my usual method "look at output to see if it feels right" and I still have other ideas to make solution more clean, so this solution is certainly not final.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: stockfish fail high fail low

Post by mcostalba »

QED wrote: First simple solution (that reduces the length of code) is to drop value < beta condition

Code: Select all

        // Step 15. Full depth search
        if &#40;doFullDepthSearch&#41;
        &#123;
            ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
            value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;

            // Step extra. pv search &#40;only in PV nodes&#41;
            if &#40;value > alpha&#41;
                value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
        &#125;
but I thought this would slow down search a bit in cases where -search_pv() would end up with value <= alpha
hmmmm, actually I like your idea, is the same already used in root_search()

Code: Select all

from root_search&#40;)&#58;

                    // Step 15. Full depth search
                    if &#40;doFullDepthSearch&#41;
                    &#123;
                        // Full depth non-pv search using alpha as upperbound
                        ss&#91;0&#93;.reduction = Depth&#40;0&#41;;
                        value = -search&#40;pos, ss, -alpha, newDepth, 1, true, 0&#41;;

                        // If we are above alpha then research at same depth but as PV
                        // to get a correct score or eventually a fail high above beta.
                        if &#40;value > alpha&#41;
                            value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, 1, 0&#41;;
                    &#125;
do you plan to test your first idea ? Otherwise I'll do.

Thanks
Marco