Singular extensions

zd3nik · Post by **zd3nik** » Sun Mar 22, 2015 12:05 am

I found this thread started by Mr Hyatt back in 2006:

http://talkchess.com/forum/viewtopic.ph ... at&start=0

The thread seemed to evolve into more of a discussion on best practices for determining ELO. But the main point Robert makes in the very beginning is pretty similar to what I'm experiencing: SE is a wash at best.

So today I'm trying something a little different.

Code: Select all

if &#40;tt_move.score >= beta&#41; and &#40;tt_move.depth >= depth - N&#41;
   mark tt_move as candidate for SE
   of course, if tt_move.depth >= depth just return beta as usual

... perform normal pruning stuff ...
... perform normal search on tt_move, if it causes beta cutoff, great ...

if &#40;tt_move is candidate&#41; and &#40;depth not extended&#41; and &#40;pc_count > 1&#41;
  do depth+1 search on tt_move
  if &#40;causes beta cutoff&#41; awesome
  if &#40;increases alpha&#41;, great, increase alpha &#40;can only happen in PV nodes&#41;
  if &#40;depth+1 score is much worse than previous score&#41;
    possible threat in this position, do depth+1 on remaining moves

I think this may be something between PV extensions and typical SE. The number of moves extended is pretty small due to the candidate pre-requisite. In my tests I'm seeing typically anywhere from 0 to a couple hundred extensions in the 10 to 15 ply search range. And only a small fraction of those cause a depth+1 search on remaining moves.

I'd appreciate any feedback on this idea. I'd also appreciate reports on success with other SE techniques.

Thanks,
STC

bob · Post by **bob** » Mon Mar 23, 2015 1:42 am

zd3nik wrote:I found this thread started by Mr Hyatt back in 2006:

http://talkchess.com/forum/viewtopic.ph ... at&start=0

The thread seemed to evolve into more of a discussion on best practices for determining ELO. But the main point Robert makes in the very beginning is pretty similar to what I'm experiencing: SE is a wash at best.

So today I'm trying something a little different.
Code: Select all
if &#40;tt_move.score >= beta&#41; and &#40;tt_move.depth >= depth - N&#41;
   mark tt_move as candidate for SE
   of course, if tt_move.depth >= depth just return beta as usual

... perform normal pruning stuff ...
... perform normal search on tt_move, if it causes beta cutoff, great ...

if &#40;tt_move is candidate&#41; and &#40;depth not extended&#41; and &#40;pc_count > 1&#41;
  do depth+1 search on tt_move
  if &#40;causes beta cutoff&#41; awesome
  if &#40;increases alpha&#41;, great, increase alpha &#40;can only happen in PV nodes&#41;
  if &#40;depth+1 score is much worse than previous score&#41;
    possible threat in this position, do depth+1 on remaining moves
I think this may be something between PV extensions and typical SE. The number of moves extended is pretty small due to the candidate pre-requisite. In my tests I'm seeing typically anywhere from 0 to a couple hundred extensions in the 10 to 15 ply search range. And only a small fraction of those cause a depth+1 search on remaining moves.

I'd appreciate any feedback on this idea. I'd also appreciate reports on success with other SE techniques.

Thanks,
STC

I've saved the code this time around. But over the last year, I experimented with two specific extensions that had worked for me in the past. One was the full Hsu-like SE as explained in the JICGA, not the "sorta-singular extension in vogue today in engines like Stockfish and others. I got it to a break-even point, but never anything better. After spending almost 6 full months of doing nothing else (related to crafty, anyway) but playing with SE. The sticky transposition table? Check. Flagging a move as singular and keeping it for 1-2 iterations to avoid instabilities? Check. I used the Hsu/Campbell ICGA paper just as I did when I added this to Cray Blitz.

My impression after all the work was this: If you JUST look at tactical positions, it looks ingenious. You get crazy long PVs in positions where things are forced. And you think "this has got to be good." But then I try it against a gauntlet and the wheels come off over the next 30,000 games. I used it in Cray Blitz because our testing was pretty limited and test positions gave us the best chance of finding things that were good or bad. I now suspect the gains were Minimal. Hsu/Campbell initially reported +200 or +300 Elo (I don't remember which) based on deepthought+SE playing 20 long games against deep thought-noSE, and winning something like 15/20 games or something. They later tested it more rigorously and scaled that back a tad. All the way back to maybe 10-20 Elo. (I don't have the ICGA article here at home so don't have the actual numbers).

While I was going, I took Chrilly Donninger's null-move threat extension and tried it again (I had used it in the 8.x or so versions, but later removed it. The basic idea is this. If a move fails high, before you return beta, you do a quick null-move with a downward off set window. If that fails low, you can make the following observation: OK, if I do nothing, things go REALLY bad (fail low on something like alpha - 1.5 or whatever, I don't remember the actual optimal margin without my notes at the office), but I am failing high on this move, so I am suspicious. This move might be the ONLY move that saves me here (it did fail high) and when I reach such a forced position, I need to be very careful, as if this move fails low later, I am dead. So I extend this supposed fail-high move by one ply. If it fails high, I now accept the fail high since it was "verified" by a one ply deeper search. If it fails low, I am immediately suspicious of this fail high move and I keep searching as if the fail high did not happen.

I also managed to tune this to break-even by fine-tuning things, just as I did in SE. And again, it will show an eye-popping PV here and there where this triggers deeper and deeper searches. But I could not make it show an Elo gain. I have saved the code to experiment more with it in the future, but for the time being...

I eventually rationalized it as "OK, there are two ways to skin a rabbit, you can extend the right moves to see the key stuff, or you can reduce the other moves so that the right moves end up getting searched deeper in an inverse way. Whether that is the explanation or not, I don't know. I am nearly positive there were no programming problems with either, as I dumped a lot of trees and followed the analysis through large tree traces. And it worked just as expected, except that for some things you go way deep, but for everything else, you have to give up some depth. If you want to try any of this, I'll be happy to chime in with my results or the things I tried as you get into it. It seems like it ought to work, but the tradeoff between the overhead to recognize a singular move or a threat can overwhelm the actual gain, very easily.

If you want to do it right, I would certainly recommend Hsu's article on singular extensions. There are lots of tricks and traps along the way. Hsu defined PV singular and fail-high singular. You can get tricked. For example, if I play something like QxQ, and you have only one recapture, you have to be careful. Extending an easy move (the only recapture) is probably a waste, yet it sure looks singular since it is better than all the other moves by a significant margin. One has to watch passed pawn endgames as well if you do real singular extensions. If you want to know more about SE, and by SE I mean the real HSU SE, not the SE that is triggered by the ttable best move, then feel free to ask.

What you are describing really doesn't sound much like SE as I know it, at all.

For example, a PV singular test goes like this. First move is searched and you get a score S. All remaining moves are searched to normal depth, but with an offset window new_alpha = alpha - margin; new_beta = alpha - margin +1. If all of those fail low, they are significantly worse than the first move you tried, so you re-search that first move to D+1 and return that score rather than the original score.

For fail high singular it is more messy. You search the first move and it fails high. Normally you would return beta here, but not with SE. You search all the remaining moves (which would normally be skipped) with a reduced depth (this is tunable... reduce by 2, 3, or use depth/2 or whatever) AND the offset window. If all fail low, we re-search this fail high move to depth + 1. But the messiness comes in when one of those remaining moves fails high. You could say "OK, two good moves, this isn't singular." Or you could do as Hsu did and say "OK, this second move might be better than the first and it might be enough better to make IT singular. You have to resolve this by researching both. And once you resolve that, another move may fail high again. This is messy, and even worse, you have to avoid the potential loop since you are overwriting hash stuff that affects these searches.

As I said, messy. VERY VERY messy..

zd3nik · Post by **zd3nik** » Tue Mar 24, 2015 7:34 am

What you are describing really doesn't sound much like SE as I know it, at all.

For example, a PV singular test goes like this. First move is searched and you get a score S. All remaining moves are searched to normal depth, but with an offset window new_alpha = alpha - margin; new_beta = alpha - margin +1. If all of those fail low, they are significantly worse than the first move you tried, so you re-search that first move to D+1 and return that score rather than the original score.

Correct, it's not SE, at least not according to well established definitions. It's probably more like PV extensions than SE, but it's not PV extensions either because it sometimes does extensions on non-PV nodes. It changes the definition "singular" from "moves that are singularly good", to "extending a single move". Probably a better name for this would be "first move extensions" or "TT move extensions".

My experiments with SE yielded pretty much the same results as your trials, though I don't claim to have been nearly as rigorous in my tests. SE only seems to provide benefit in highly tactical positions. In most other positions it's simply wasted processing. Making positional play weaker because of the time wasted doing reduced depth searches with an offset window. Net result: about a wash.

The different tact I'm trying, which I described in the original post, is to simply extend all TT moves that show some likelihood of producing a beta cutoff (fail high). But I only do this singular extension on the TT move if it doesn't produce a beta cutoff with a normal depth search first. And if the extended depth search on the TT move shows a significant drop in score as compared to the normal depth search, there may be a tactical threat or a horizon effect at the position so I extend the depth of all siblings.

I also disable these TT move extensions if the side to move doesn't have at least 2 non-pawn pieces (I've also bumped this up to >= 3). This is to protect against the problems you touched upon in doing singular extensions in pawn endings. I also don't think there's much need for extensions when there very few pieces on the board because normal search can achieve very great depths in those cases anyway, and you want to look at as many possibilities as you can with equal level of depth when there are very few pieces on the board because the best moves are often much more subtle in those cases.

My testing with this so far has produced mixed results. Sometimes this approach actually allows a significant overall depth increase (probably due to PV extension-esque TT priming). Other times the extra searching provides no benefit at all and only decreases overall search depth due to wasted processing. But the net result seems to be at least equal in terms of ELO, usually slightly better. Unfortunately I don't have the resources to do thorough testing. So my test results could just be deduced from noise.

I've also been experimenting with using values from the history heuristic table to tag first moves as candidates for singular extension. As you might expect, the results are also + and - only more so than when using TT data only.

This heuristic is very simple to implement, yet has many possible variations. I wonder if you would be inclined to give it a quick try in crafty just to see what happens?

I've already spent enough time, as you have, with "standard" SE. This is just the exploration of a tangential idea.

STC

petero2 · Post by **petero2** » Tue Mar 24, 2015 8:28 pm

zd3nik wrote:I'd also appreciate reports on success with other SE techniques.

In texel 1.05 I implemented hash table based singular extensions. At that time I measured approximately a 20 elo increase in hyper bullet (1s+0.08s/move) self tests. To see if this result was an anomaly caused by self test, I ran a tournament containing the following engines:

1. Texel106a4 : The current development version of texel.
2. Stockfish6_t/4 : Stockfish 6, playing with a 1 to 4 time handicap.
3. Texel106a4_noSE : The current development version of texel with SE code removed.

The time control was 4+0.32 for texel and 1+0.08 for stockfish. I used one fourth the thinking time for stockfish to make the ratings approximately equal. Each engine played 20000 games against the other two engines. The result was:

Code: Select all

Rank Name                Elo      +      - games   score   oppo.   draws 
   1 Texel106a4         15.6    3.6      4 20000   55.3%   -15.6   51.6% 
   2 Texel106a4_noSE   -15.6      4    3.6 20000   44.7%    15.6   51.6% 

Rank Name                Elo      +      - games   score   oppo.   draws 
   1 Texel106a4          5.6    2.7    2.7 20000   51.7%    -5.6   34.2% 
   2 Stockfish6_t/4     -5.6    2.7    2.7 20000   48.3%     5.6   34.2% 

Rank Name                Elo      +      - games   score   oppo.   draws 
   1 Stockfish6_t/4        8      3    2.6 20000   52.4%      -8   34.6% 
   2 Texel106a4_noSE      -8    2.6      3 20000   47.6%       8   34.6%

So singular extensions in texel are worth approximately 27 elo against a time-handicapped stockfish and 31 elo in self play.

I think the implementation is quite standard. It works roughly like this:

If the hash table probe or internal iterative deepening (IID) produced an exact score or a lower bound, the remaining depth is big enough, and the hash/IID move would not otherwise have been extended, perform a "singular search" for all other moves.

The "singular search" uses half the search depth and alpha+1 = beta = hashOrIIDScore-25.

If the "singular search" fails low, that is if all other moves are more than 25cp worse than the hash/IID score, extend the hash/IID move by one ply.

bob · Post by **bob** » Tue Mar 24, 2015 9:47 pm

First obvious question: What suggests that extending a tt move is a good thing? It was best in another position, but you searched it to a normal depth. Why extend it HERE but not on the original search? That's where SE is supposed to work. If you extend it the first time because it tested singular, you store that in the so-called "sticky trans/ref table" and you extend it the next time you reach the same position WITHOUT the singular test (offset window) being necessary at all.

The very first version of the tt-singular approach I saw, I think was in robbolito or ippolit, and it simply extended the tt best move. Did not work for me. I even removed it (it was inserted into an early version of stockfish) and it made zero difference in Elo in my gauntlet testing, confirming my zero improvement when trying it in Crafty.

More recent versions of stockfish do something more rational, but still short of Hsu's approach.

I tried about a zillion variations of Hsu's code. I explained how complex a singular test was, because moves can fail high on the offset window and they themselves might actually be singular. I stopped that pretty quickly as it is serious overhead. My best (zero gain) approach was to simply start the singular test (at PV nodes a normal search with offset window, at FH nodes a reduced depth search with offset window) and ANY fail high on that offset window said "this is not singular" and got rid of the repeated researching that happens is significantly pathological positions.

Some might ask, break-even, why not keep it? The answer was it makes the code significantly messier. I prefer to keep things simple unless something actually improves things.

bob · Post by **bob** » Tue Mar 24, 2015 10:25 pm

BTW, as you work on this, feel free to ask questions OR report results. It is an interesting topic and the last word has NOT been spoken yet. My results have been neutral. That doesn't mean there are not better options somewhere to raise that bar a bit.

I saved the code for that reason, because it is incredibly messy to write the first time around.

I think the sticky trans/ref table might have other uses as well, so that's a future topic. Perhaps things like rather than saying "this move is really good, always extend it", one could say "this move REALLY sucks, always reduce it the max."

zd3nik · Post by **zd3nik** » Thu Mar 26, 2015 3:50 am

Thanks for the feedback.

If the hash table probe or internal iterative deepening (IID) produced an exact score or a lower bound, the remaining depth is big enough, and the hash/IID move would not otherwise have been extended, perform a "singular search" for all other moves.

I've tried this as well. But I didn't do very much testing on it. After watching a few self-play games and performing a very quick gauntlet test it seemed to be reducing the average search depth due to all the reduced depth test searches. And I didn't witness any games where this implementation showed it's strength.

That's when I decided to just see what would happen if I simply extended every TT move that had an exact or lowerbound score >= beta and sufficient depth without doing the reduced depth test searches to verify it was "singular".

Update: Now that I've had time to let a reasonable number of gauntlet games churn, I'm seeing that my idea isn't working out well at all. It's quite terrible in fact! Oh well, hat's what happens when you just "try it" without first trying to deduce whether it makes any sense.

zd3nik · Post by **zd3nik** » Thu Mar 26, 2015 4:02 am

First obvious question: What suggests that extending a tt move is a good thing?

I didn't have a "solid" reason for doing this. It was more trial and error than anything. See my response to Peter. But I did see some similarity to PV extensions (which I've never tried and I may be completely misunderstanding what PV extensions are). The idea there, I believe, is that you already know the move you're looking at is probably the best move, so search it a little deeper to either

1) discover more quickly that it's even stringer than initial lower depth searches found, allowing you to increase alpha sooner, which should help reduce search time on other lines.
2) discover a horizon effect a little sooner, triggering another line to take it's place, which will also get extended by the same logic and either hit the same horizon effect or turn out to be immune to whatever tactic the first PV fell victim to.

You're end up putting a lot more time into the PV lines this way, but inversely reducing time spent in the non-PV lines (if the PV remains stable), or more quickly detecting horizon effects.

The very first version of the tt-singular approach I saw, I think was in robbolito or ippolit, and it simply extended the tt best move.

Sounds like this has already been tried then. Pretty much the story of every thing Ive ever tried in my chess programs.

Anyway, as I stated in my response to Peter, further testing in my engines with this technique have shown it is no good at all (in my engines at least). So I think I'll either go back to the more traditional SE approach, or finally give "real" PV extensions a try.

Thanks for the feedback! It's nice to see there's an active community of people interested in this stuff.[/quote]

bob · Post by **bob** » Thu Mar 26, 2015 8:27 pm

zd3nik wrote:
First obvious question: What suggests that extending a tt move is a good thing?
I didn't have a "solid" reason for doing this. It was more trial and error than anything. See my response to Peter. But I did see some similarity to PV extensions (which I've never tried and I may be completely misunderstanding what PV extensions are). The idea there, I believe, is that you already know the move you're looking at is probably the best move, so search it a little deeper to either

1) discover more quickly that it's even stringer than initial lower depth searches found, allowing you to increase alpha sooner, which should help reduce search time on other lines.
2) discover a horizon effect a little sooner, triggering another line to take it's place, which will also get extended by the same logic and either hit the same horizon effect or turn out to be immune to whatever tactic the first PV fell victim to.

You're end up putting a lot more time into the PV lines this way, but inversely reducing time spent in the non-PV lines (if the PV remains stable), or more quickly detecting horizon effects.

The very first version of the tt-singular approach I saw, I think was in robbolito or ippolit, and it simply extended the tt best move.
Sounds like this has already been tried then. Pretty much the story of every thing Ive ever tried in my chess programs.

Anyway, as I stated in my response to Peter, further testing in my engines with this technique have shown it is no good at all (in my engines at least). So I think I'll either go back to the more traditional SE approach, or finally give "real" PV extensions a try.

Thanks for the feedback! It's nice to see there's an active community of people interested in this stuff.

[/quote]

Note that searching PV lines deeper than non-PV lines tends to make the PV stable, because you don't have enough depth elsewhere to spot the critical tactics. LMR works, but it does have its drawbacks.

Singular extensions

Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions

Re: Singular extensions