How can a program beat itself? Where is the randomness?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

JayRod
Posts: 18
Joined: Mon Aug 11, 2014 7:36 am

Re: How can a program beat itself? Where is the randomness?

Post by JayRod »

kbhearn wrote:There is generally no explicit randomness in the search. There's semi-random effects caused by timing, particularly in the case of an SMP search where some nodes might get searched under one timing and not under another due to threads getting preempted by the system at different points resulting in different nodes visited, different splits, even more different search order, different hash entries available at a given node, etc.


Threads imply parallel processing. Is that done nowadays? I assume so, and, if so, since threads can take different times to execute, I can easily see how this will create randomness akin (in my mind) to the same programs searching different parts of the same tree at the same time.

Also, it's not 100% clear from Robert Hyatt's answer: do most engine searches nowadays truncate the tree to save time in a blitz game, or not? I'm not referring to alpha-beta pruning, but to simply deciding that certain moves in the chess tree will not even be considered beyond a minimum ply depth at which all moves are examined. When playing at 5 seconds a move I don't see how you can avoid doing this, there is so little time, unless your program thinks on the opponent's time, which I believe is not done by most engines, in that they have a parameter to turn this think-on-opponent feature off.

kbhearn wrote: The most common case for the weaker engine winning though is where a line is bad for the stronger engine but neither the strong nor weaker engine see it and the weaker engine just falls into 'oh, i'm winning!', the stronger engine likely even seeing it first but not being able to do anything about it by then.


Yes, thank you, I understood this 'ignorance is bliss' argument, and in fact when I play stronger human players, I sometimes win the same way (they see too much for their own good, while I stick to some straightforward tactical theme and sometimes get lucky and beat them).
kbhearn wrote: The less common case, where seeing more is bad, should be very uncommon these days. It would involve a long line where a large unavoidable advantage occurs at the cost of a disadvantage, where it takes an even longer line to see how to defend it and have the disadvantage you gave up become the relevant issue. At current search depths the chance that the root evaluation relies on such leaves is... rare.
Yes, but I wish somebody would collect these oddities so we could examine them. I once saw a thread where each strong engine was given 24 hours a move on fast hardware, and though the game ended in a draw, it was quite fascinating (as I recall it involved a queen for two rooks trade, which is, next to a queen for three minor pieces trade, interesting to watch).

JayRod
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: How can a program beat itself? Where is the randomness?

Post by kbhearn »

JayRod wrote: Threads imply parallel processing. Is that done nowadays? I assume so, and, if so, since threads can take different times to execute, I can easily see how this will create randomness akin (in my mind) to the same programs searching different parts of the same tree at the same time.
All of the top engines include parallel search capabilities. By default they tend to use only a single thread and are pretty close to deterministic, but you can change that in their UCI settings to more fully use your PC.
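For example (an illustration of the standard UCI mechanism, not any particular engine's documentation; Threads and Hash are the conventional option names, but engines may differ), a GUI or script could send:

Code:

setoption name Threads value 4
setoption name Hash value 512
go movetime 5000

With more than one thread the node counts, and occasionally even the chosen move, will differ from run to run in the same position.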
Also it's not 100% clear from Robert Hyatt's answer: do most engine searches nowadays truncate the tree for the sake of saving time in a blitz game, or not? And I'm not referring to Alpha–beta pruning, but simply deciding that certain moves in the chess tree will not even be considered beyond a minimum ply depth where all moves are considered; since, when playing at 5 seconds a move, I don't see how you can avoid doing this, since so little time is present, unless your program thinks on the opponent's time, which I believe is not done by most engines, in that they have a parameter to turn this think on opponent feature off.
There are a number of methods used to truncate or shorten the search. Given infinite time all of these reduced/truncated lines would be searched in later iterations; they are just methods of prioritising what needs to be checked first or to greater depth, which, among other benefits, lets the engine see the PV much deeper and helps with the horizon effect. This is not blitz-specific and is important to the general performance of modern engines.
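To illustrate one such method, here is a rough sketch of late move reductions (LMR) in an alpha-beta search. This is not any particular engine's code; Position, Move, MoveList, generate_and_order and qsearch are hypothetical placeholders, and real engines apply many more conditions (checks, killer moves, history, etc.):

Code:

// Sketch of late move reductions: later, quiet moves are searched to reduced
// depth first and re-searched at full depth only if they unexpectedly do well.
int search(Position& pos, int depth, int alpha, int beta) {
    if (depth <= 0)
        return qsearch(pos, alpha, beta);        // quiescence search at the horizon

    MoveList moves = generate_and_order(pos);    // best guesses first
    int moveCount = 0;
    for (Move m : moves) {
        pos.make(m);
        int d = depth - 1;
        if (++moveCount > 3 && depth >= 3 && m.isQuiet())
            d -= 1 + moveCount / 8;              // reduce more the later the move comes

        int score = -search(pos, d, -beta, -alpha);
        if (d < depth - 1 && score > alpha)      // surprise: verify at full depth
            score = -search(pos, depth - 1, -beta, -alpha);
        pos.unmake(m);

        if (score >= beta) return beta;          // fail high: remaining moves are cut off
        if (score > alpha) alpha = score;
    }
    return alpha;
}

Given unlimited iterations of iterative deepening, any line reduced here would eventually be searched to full depth anyway; the reductions only change the order in which the effort is spent.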
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How can a program beat itself? Where is the randomness?

Post by Sven »

Hi Jay,

I propose that you try to find the explanation for a different scenario first: two identical engine instances play each other with identical time control. Of course we expect a result of 50% with a sufficient number of games. But will all games end in a draw? No, they won't. Instance A will win some games and B as well. Why?

The answer is quite easy, and it is indeed related to the horizon effect. During a game of N plies length an engine takes N move decisions (let's ignore opening books here). P percent of these decisions are "perfect" (i.e., the engine plays the optimal move) but (100-P) percent are not optimal, and among these non-optimal move decisions there may be at least one that loses the game where a better move would not lose. Nevertheless the engine plays that move since it does not see the losing continuation within its current search horizon.

So in a match "A vs. B" with equal time you get some wins on either side due to playing sub-optimal moves.
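To put purely illustrative numbers on this (the real percentages are unknown; the figures only show the mechanism): if each side makes about 60 move decisions per game and 98% of them are "perfect", the chance that a side makes at least one sub-optimal decision in a game is 1 - 0.98^60 ≈ 70%. Not every sub-optimal move loses, of course, but over many games some of them will, which is why a self-play match never consists of 100% draws.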

Now return to your case: A vs. B, where instance B gets twice the search time of instance A but is otherwise identical to A. Some of B's move decisions will remain the same as before (i.e., the same as A would play) but some other decisions will be different. Among these different moves some (probably most) will be better than before due to the deeper search. A will still win some games due to B playing sub-optimal moves, but the ratio becomes smaller. Now double B's thinking time again, and the story continues: the ratio of sub-optimal moves of B becomes even smaller, so B will win more games again.
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How can a program beat itself? Where is the randomness?

Post by hgm »

That you can look twice as far doesn't mean you can see everything. You can still take a dead-end street. It merely means that the stronger engine will know that the game is lost before the weaker one will realize it is winning, but if there are no side streets, that is not much help.
JayRod
Posts: 18
Joined: Mon Aug 11, 2014 7:36 am

Re: How can a program beat itself? Where is the randomness?

Post by JayRod »

Sven Schüle wrote:
I propose that you try to find the explanation for a different scenario first: two identical engine instances play each other with identical time control. Of course we expect a result of 50% with a sufficient number of games. But will all games end in a draw? No, they won't. Instance A will win some games and B as well. Why?

The answer is quite easy, and it is indeed related to the horizon effect. During a game of N plies length an engine takes N move decisions (let's ignore opening books here). P percent of these decisions are "perfect" (i.e., the engine plays the optimal move) but (100-P) percent are not optimal, and among these non-optimal move decisions there may be at least one that loses the game where a better move would not lose. Nevertheless the engine plays that move since it does not see the losing continuation within its current search horizon.

So in a match "A vs. B" with equal time you get some wins on either side due to playing sub-optimal moves.

Now return to your case: A vs. B, where instance B gets twice the search time of instance A but is otherwise identical to A. Some of B's move decisions will remain the same as before (i.e., the same as A would play) but some other decisions will be different. Among these different moves some (probably most) will be better than before due to the deeper search. A will still win some games due to B playing sub-optimal moves, but the ratio becomes smaller. Now double B's thinking time again, and the story continues: the ratio of sub-optimal moves of B becomes even smaller, so B will win more games again.
Thanks for the reply. The answer I was looking for was given, indirectly, by Kevin Hearn ("There are a number of methods used to truncate or shorten the search. Given infinite time all of these reduced/truncated lines would be searched in later iterations; they are just methods of prioritising what needs to be checked first or to greater depth, which, among other benefits, lets the engine see the PV much deeper and helps with the horizon effect.")

I say indirectly because it is left implied what is going on when Instance A (weaker) beats Instance B (stronger) of the same chess program: Instance A, given its limited search time (a few seconds), is 'stuck in' a truncated part of the tree that nevertheless puts at the top of the principal variation the move that is in fact the "correct" or "best" move in the initial position. Instance B ends up in a different part of the tree because: (1) the tree is searched in exactly the same order by exactly the same algorithm (this 'same order' was not clear to me when I first posted; I thought the tree was searched randomly, but apparently it is not); (2) certain 'bad parts' of the tree, when searched deeper, feed values up through the alpha-beta and move-ordering algorithms that look like a 'better answer' than the weaker engine's answer, even though those parts of the chess tree are not where the "correct" or "best" move lies, which by chance is where the weaker engine happens to be looking; hence (3) the stronger Instance B will report these 'bad part' moves as best, play them, and the weaker Instance A will win because these 'bad parts' lead to, say, a forced win for the weaker engine.

Given 'infinite' (i.e. long but finite) time, however, the stronger chess engine should beat the weaker one (or at least draw) almost every time. Even then you'll still get the effect I mentioned above, just less of it, since chess is not so bizarre that a reversal of fortune commonly happens 20 moves into a materially superior position (of course I've seen it, with say a "positional sacrifice", but often the reversal of fortune is only about five or ten moves deep).

BTW, in this context "search horizon" is a bit misleading to me. It's not really the search horizon at work, it seems, since the search horizon as I understand it is the number of half-moves or plies searched in a chess tree, and both instances A and B can have the same search horizon*. It's more the sub-branches of the chess tree within this search horizon, and how these branches are searched, that give rise to the anomalies described above and in this thread.

JayRod


* If I'm wrong about search horizon, then consider the scenario where the same instance of a program is limited to searching, say, 20 plies on an average PC (that is, a tree you cannot exhaustively search every portion of with an ordinary PC... I am guessing 20 plies is such a tree). If you give instance B twice as much time as instance A, will instance A never win over B in a blitz game? I think not, which proves the point: the tree is being searched to the same search horizon of 20 plies by both engines (they go all the way down, 10 moves deep), but the weaker engine A sometimes wins over the stronger B because it 'blunders into victory' as per the above, since B is searching a different part of the tree and finds a 'false' superior move in a 'bad' part of the tree.
BTW, in my original post I am talking about giving programs the same 'average time', which is not the same as a fixed number of plies, though there's a correlation between the two, I'm sure.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How can a program beat itself? Where is the randomness?

Post by Sven »

Hi Jay,

some parts of your post may be right, but I think in reality it's even simpler than that. A weaker engine winning against a stronger one does not need to "blunder into victory" in order to win; it is much more likely that the stronger engine itself blunders and the weaker engine finds the refutation.
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How can a program beat itself? Where is the randomness?

Post by hgm »

You seem to have a wrong concept of what 'horizon' means. By definition it is always the boundary, in the game tree from the current position, between what was searched and what wasn't. The instance of the program that gets less time can only search a smaller part of the game tree, its horizon will on average be nearer, and there is stuff behind it that it did not see, which the longer-thinking instance could. This has absolutely nothing to do with the shape of the tree (i.e. reductions or extensions). Programs that use strictly fixed-depth searches will sometimes also lose to themselves when they think one ply less deep. In fact I have heard of people beating Rybka by always playing the second move of its PV (which by definition was searched 1 ply less deep).

The most common reason the stronger program unwittingly plays losing moves that are trivial enough to exploit that even a weaker program would not miss them is imperfect static evaluation. E.g. the famous case of a Bishop capturing Pa7 and getting trapped by b6. If the evaluation does not recognize that pattern (wBa7, bPb6) and strongly penalize it, programs will think it is a good deal that gains a Pawn, as the actual loss of the Bishop might be 20 ply away, too deep for both the weak and the strong version to see. But the weak version might not see in time that the strong version threatens to 'gain' this Pawn, so it won't take measures to prevent it, and it finally 'loses' the Pawn. But it will trap the Bishop, and keep it trapped, not because it thinks it can gain it, but for simple-minded mobility or piece-square reasons that it could see even in a 1-ply search.
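To make the pattern concrete, here is a minimal sketch of such an evaluation term (not taken from any real engine; Board, has() and the square constants are hypothetical placeholders):

Code:

// Illustrative 'trapped bishop' evaluation term. Without something like this,
// the search happily grabs the a7 pawn and only discovers many plies later
// that the bishop never comes back.
const int TRAPPED_BISHOP_PENALTY = 150;          // centipawns, order of magnitude only

int trappedBishopTerm(const Board& b) {
    int score = 0;
    // White bishop boxed in on a7 by a black pawn on b6 (mirror case on h7/g6).
    if (b.has(WHITE_BISHOP, A7) && b.has(BLACK_PAWN, B6)) score -= TRAPPED_BISHOP_PENALTY;
    if (b.has(WHITE_BISHOP, H7) && b.has(BLACK_PAWN, G6)) score -= TRAPPED_BISHOP_PENALTY;
    // Symmetric cases for a black bishop trapped on a2/h2 by a white pawn on b3/g3.
    if (b.has(BLACK_BISHOP, A2) && b.has(WHITE_PAWN, B3)) score += TRAPPED_BISHOP_PENALTY;
    if (b.has(BLACK_BISHOP, H2) && b.has(WHITE_PAWN, G3)) score += TRAPPED_BISHOP_PENALTY;
    return score;                                // added to the eval, from White's point of view
}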

Static evaluation is full of such 'pitfalls'. The stronger program is more likely to achieve its goals as stipulated by the eval, but that holds for really worthwhile goals just as much as for the occasional pitfall.

Also note that looking deeper isn't always an advantage. When B can see that A has a forced mate, and starts to delay it by sacrificing Queen and Rooks with spite checks, the weaker side, which likely hasn't seen the mate at all, will happily see that it can gain a Queen and Rooks (it would often be the only legal move to take them...), and then win the game easily in its own way.