I found this thread started by Mr Hyatt back in 2006:
http://talkchess.com/forum/viewtopic.ph ... at&start=0
The thread seemed to evolve into more of a discussion on best practices for determining ELO. But the main point Robert makes in the very beginning is pretty similar to what I'm experiencing: SE is a wash at best.
So today I'm trying something a little different.
Code: Select all
if (tt_move.score >= beta) and (tt_move.depth >= depth - N)
mark tt_move as candidate for SE
of course, if tt_move.depth >= depth just return beta as usual
... perform normal pruning stuff ...
... perform normal search on tt_move, if it causes beta cutoff, great ...
if (tt_move is candidate) and (depth not extended) and (pc_count > 1)
do depth+1 search on tt_move
if (causes beta cutoff) awesome
if (increases alpha), great, increase alpha (can only happen in PV nodes)
if (depth+1 score is much worse than previous score)
possible threat in this position, do depth+1 on remaining moves
I think this may be something between PV extensions and typical SE. The number of moves extended is pretty small due to the candidate pre-requisite. In my tests I'm seeing typically anywhere from 0 to a couple hundred extensions in the 10 to 15 ply search range. And only a small fraction of those cause a depth+1 search on remaining moves.
I'd appreciate any feedback on this idea. I'd also appreciate reports on success with other SE techniques.
I've saved the code this time around. But over the last year, I experimented with two specific extensions that had worked for me in the past. One was the full Hsu-like SE as explained in the JICGA, not the "sorta-singular extension in vogue today in engines like Stockfish and others. I got it to a break-even point, but never anything better. After spending almost 6 full months of doing nothing else (related to crafty, anyway) but playing with SE. The sticky transposition table? Check. Flagging a move as singular and keeping it for 1-2 iterations to avoid instabilities? Check. I used the Hsu/Campbell ICGA paper just as I did when I added this to Cray Blitz.
My impression after all the work was this: If you JUST look at tactical positions, it looks ingenious. You get crazy long PVs in positions where things are forced. And you think "this has got to be good." But then I try it against a gauntlet and the wheels come off over the next 30,000 games. I used it in Cray Blitz because our testing was pretty limited and test positions gave us the best chance of finding things that were good or bad. I now suspect the gains were Minimal. Hsu/Campbell initially reported +200 or +300 Elo (I don't remember which) based on deepthought+SE playing 20 long games against deep thought-noSE, and winning something like 15/20 games or something. They later tested it more rigorously and scaled that back a tad. All the way back to maybe 10-20 Elo. (I don't have the ICGA article here at home so don't have the actual numbers).
While I was going, I took Chrilly Donninger's null-move threat extension and tried it again (I had used it in the 8.x or so versions, but later removed it. The basic idea is this. If a move fails high, before you return beta, you do a quick null-move with a downward off set window. If that fails low, you can make the following observation: OK, if I do nothing, things go REALLY bad (fail low on something like alpha - 1.5 or whatever, I don't remember the actual optimal margin without my notes at the office), but I am failing high on this move, so I am suspicious. This move might be the ONLY move that saves me here (it did fail high) and when I reach such a forced position, I need to be very careful, as if this move fails low later, I am dead. So I extend this supposed fail-high move by one ply. If it fails high, I now accept the fail high since it was "verified" by a one ply deeper search. If it fails low, I am immediately suspicious of this fail high move and I keep searching as if the fail high did not happen.
I also managed to tune this to break-even by fine-tuning things, just as I did in SE. And again, it will show an eye-popping PV here and there where this triggers deeper and deeper searches. But I could not make it show an Elo gain. I have saved the code to experiment more with it in the future, but for the time being...
I eventually rationalized it as "OK, there are two ways to skin a rabbit, you can extend the right moves to see the key stuff, or you can reduce the other moves so that the right moves end up getting searched deeper in an inverse way. Whether that is the explanation or not, I don't know. I am nearly positive there were no programming problems with either, as I dumped a lot of trees and followed the analysis through large tree traces. And it worked just as expected, except that for some things you go way deep, but for everything else, you have to give up some depth. If you want to try any of this, I'll be happy to chime in with my results or the things I tried as you get into it. It seems like it ought to work, but the tradeoff between the overhead to recognize a singular move or a threat can overwhelm the actual gain, very easily.
If you want to do it right, I would certainly recommend Hsu's article on singular extensions. There are lots of tricks and traps along the way. Hsu defined PV singular and fail-high singular. You can get tricked. For example, if I play something like QxQ, and you have only one recapture, you have to be careful. Extending an easy move (the only recapture) is probably a waste, yet it sure looks singular since it is better than all the other moves by a significant margin. One has to watch passed pawn endgames as well if you do real singular extensions. If you want to know more about SE, and by SE I mean the real HSU SE, not the SE that is triggered by the ttable best move, then feel free to ask.
What you are describing really doesn't sound much like SE as I know it, at all.
For example, a PV singular test goes like this. First move is searched and you get a score S. All remaining moves are searched to normal depth, but with an offset window new_alpha = alpha - margin; new_beta = alpha - margin +1. If all of those fail low, they are significantly worse than the first move you tried, so you re-search that first move to D+1 and return that score rather than the original score.
For fail high singular it is more messy. You search the first move and it fails high. Normally you would return beta here, but not with SE. You search all the remaining moves (which would normally be skipped) with a reduced depth (this is tunable... reduce by 2, 3, or use depth/2 or whatever) AND the offset window. If all fail low, we re-search this fail high move to depth + 1. But the messiness comes in when one of those remaining moves fails high. You could say "OK, two good moves, this isn't singular." Or you could do as Hsu did and say "OK, this second move might be better than the first and it might be enough better to make IT singular. You have to resolve this by researching both. And once you resolve that, another move may fail high again. This is messy, and even worse, you have to avoid the potential loop since you are overwriting hash stuff that affects these searches.
As I said, messy. VERY VERY messy..