Time Management and Move Selection

odomobo · Post by **odomobo** » Tue Jul 10, 2018 1:08 am

I've been thinking about the following, and was wondering if there are some practices for them. My questions are specifically with regards to alpha-beta search.

What's the best approach to finding singular best moves -- that is, when there's only one good move? For example, when your opponent initiates a queen trade, and there are no good zwischenzugs. Obv. under a fail-hard system, you'll never know how much worse your alternatives lines are to the PV. My idea was to allocate a small fraction of the search time (maybe somewhere between 1 and 10%) to search for such a move, by setting alpha = PV score - margin (maybe 200 cp). If no alternative line is able to raise alpha by the end of the singular move search, then return PV as the best line.

Is it safe to use a move which raises alpha over the PV, but before the search at that depth is finished? That is, when a non-PV move raises alpha just before we run out of time. I understand that if the PV lost value vs the previous search depth, that it makes sense to allocate extra "panic" time to find a good move (since we have been working with incomplete information up to this point). But what about the case where the PV score stayed nearly the same? Logically, I think it should be safe to use the new move, since we are using the best information we have, but intuitively I feel like there might be some mistake in doing this.

jdart · Post by **jdart** » Tue Jul 10, 2018 2:11 am

Many programs try to detect "easy moves" such as simple recaptures and spend less time searching on those than other moves. One way to do that is to search a few iterations with a wide window, so you get scores for all moves, not just the PV.

This is a good optimization in general. But there are cases where an obvious recapture is bad: accepting a gambit pawn or piece for example.

Re accepting moves that have scored above alpha but have not had their scores resolved: you can get a fail-high followed by a fail-low. So in general this is not safe IMO.

--Jon

bob · Post by **bob** » Wed Jul 25, 2018 5:01 am

I think you need to give up on "easy move" itself. Leads to problems. What if your opponent finds a deep tactic that lets it hang a piece, but it takes a deep search to realize this? Easy move code can make you do the capture and get blown out. There are things you can try, with risks. For example, if you don't change your mind after an iteration ends, you can reduce the target time a bit. Every time you complete without a change, reduce further. Any change to a different move resets to original target time or even longer to make sure you are not making a mistake. How far you reduce the target time is the factor that controls the risk. Reduce it zero and there is no risk, nor reward. Did a ton of testing this back during some offline discussions with Don D / Komodo several years ago. There are elo to be had if you tune it carefully, there are elo to lose if you don't.

jdart · Post by **jdart** » Wed Jul 25, 2018 5:11 am

Kind of hard to do that with UCI since it doesn't really provide a coherent record of the game to the engine.

--Jon

hgm · Post by **hgm** » Wed Jul 25, 2018 6:33 am

It depends on if you want to do this in the root or elsewhere. In the latter case you would probably want to extend the singular move, while in the root you want to reduce the thinking time of such 'easy moves'.

In Shokidoki I implemented 'easy move' in the way you say: before every search in the root I set a variable 'margin', and raise alpha to bestScore - margin rather than alpha. Whenever another move fails high w.r.t. this, I reset margin to 0, so that in the remaining iterations it searches normally. The decision whether to stop thinking is then based (among other things) on whether margin is still non-zero or not.

When an iteration takes unexpectedly long I usually terminate it (playing the best move so far) when the score is not too much below that of the previous iteration. This means it gets at least extra time to search the PV move; if that takes unexpectedly long, it is usually a sign its score is going to change a lot, and you want to make sure it will not change for the worse before you play it. If the score does drop, you want to allocate extra time to find some alternative. You might not find the best alternative, but it is unlikely that one of the other moves will score much above its value of the previous iteration (and hence the PV score of the previous iteration). So taking the first that comes close seems a good compromise.

Ras · Post by **Ras** » Wed Jul 25, 2018 5:29 pm

bob wrote: ↑Wed Jul 25, 2018 5:01 amWhat if your opponent finds a deep tactic that lets it hang a piece, but it takes a deep search to realize this?

Easy solution: keep track of what your eval in the game is. If it suddenly changes too drastically for the better (more than 50 cp), then the opponent has either blundered or set up a trap. Disable "easy move detection" in this case.

hgm · Post by **hgm** » Wed Jul 25, 2018 6:46 pm

I don't think this is really something to worry about. The opponent could also set up a trap involving deep tactics without sacrificing a piece. This would be much easier, as he then does not have to earn back any investment before the trap starts to pay off. And any time you use to think about the easy move goes at the expense of time you can think in other moves, where traps are much more likely to be found (because easier to construct). It is really the same trade-off as for applying search reductions: you reduce the branches where statistically there is little hope to find something, to have more nodes left on the branches that seem more promising.

The case of a sudden score jump is really an independent issue. It can just as easily occur when you have several moves that are close in score, e.g. because the opponent did a move that left him with a bad Pawn structure. If could indeed pay to think somewhat longer in such a case. It might already pay to think longer after your opponent thought very long.

Ras · Post by **Ras** » Wed Jul 25, 2018 10:30 pm

jdart wrote: ↑Wed Jul 25, 2018 5:11 amKind of hard to do that with UCI since it doesn't really provide a coherent record of the game to the engine.

After every move, I set up a table that contains the hashes for the positions after every legal answer move from the opponent. If after the next go, the position hash matches with any of the stored ones, it's a continued game.

Ras · Post by **Ras** » Wed Jul 25, 2018 10:41 pm

hgm wrote: ↑Wed Jul 25, 2018 6:46 pmThe opponent could also set up a trap involving deep tactics without sacrificing a piece.

In real games, that's not an issue because the "easy move" usually comes into play when there was an exchange and there is only one way to recapture.

And any time you use to think about the easy move goes at the expense of time you can think in other moves

I don't follow. You don't think about the easy move, you make a regular search. First a 1 ply plus QS search over all legal moves, then you note the score drop between best and second best move. If the difference is more than a threshold (I use 200 cp), and if the score after the best move is within a +/- 50 cp window from last' turns eval, the top move is a candidate for an "easy move". Then regular search starts, and if the "easy move" is still at the top after 6 plies depth, then I cancel the search and return the "easy move".

The case of a sudden score jump is really an independent issue.

No, it's an important point because otherwise, the engine would mistakenly accept sacrifices without taking a closer look. If the game is at let's say +0.25, and in the next move, the 1 ply search suggests I'm at +300 because the opponent sacrificed a knight, then this looks suspicious enough to drop the "easy move" quick reply.

jdart · Post by **jdart** » Thu Jul 26, 2018 5:06 am

Looking at Bob's suggestion again, it really can be done within a single search, not needing the previous score (although that might be helpful).

It is really an extension of what I now do, which is extend time if the PV is unstable and/or the score is dropping. This is the opposite: possibly shaving away time from the nominal target if the PV is stable and the score seems ok. I do this now but I take the decision early on in the search and can reduce time quite a lot if the conditions are met; this may be too aggressive.

Due to the horizon effect you will always take a risk whenever choose you terminate the search; the next iteration might be very different. But all this logic is assuming that on the average you will get some gain.

--Jon

Time Management and Move Selection

Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection

Re: Time Management and Move Selection