Adjudication issues


Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Adjudication issues

Post by Sven »

In a CCRL gauntlet that is currently run by Graham, the following game was played:

[pgn][Site "GrahamCCRL.dyndns.org\Jumbo Gauntlet.ele"]
[Date "12-29-2016"]
[White "NGplay 9.85 64-bit "]
[Black "Jumbo 0.4.0 64-bit "]
[Result "1-0"]
[TimeControl "4032:000 "]

1. c4 e5 2. Nc3 d6 3. Nf3 Nf6 4. e3 Nc6 5. d4 Bf5 6. d5 Ne7 7. Qb3
b6 8. h3 h6 9. Bd2 g5 10. g4 Bh7 11. O-O-O Bg7 12. Qb5 Qd7 13. Qxd7
Kxd7 14. Be1 Bg6 15. Rg1 Rab8 16. h4 h5 17. hxg5 Nxg4 18. Nh4 Bh7
19. Bh3 e4 20. Bxg4 hxg4 21. Rxg4 f5 22. Rg1 Bg8 23. Ng2 a6 24. Nf4
b5 25. cxb5 axb5 26. Nce2 b4 27. g6 Rb5 28. Kb1 Nxd5 29. Nxd5 Rxd5
30. Rxd5 Bxd5 31. Bxb4 Rh2 32. Be1 Bc4 33. Nf4 c6 34. b3 Be6 35. Kc2
d5 36. a4 c5 37. a5 Bg8 38. Rg5 Rh1 39. Bd2 d4 40. Rxf5 d3 41. Nxd3
exd3 42. Kxd3 Rb1 43. Rxc5 Rxb3 44. Ke4 Kd6 45. Rg5 Bc4 46. f4 Ke7
47. Kf3 Ke6 48. Rc5 Bd5 49. Kg4 Kd6 50. Rc1 Kd7 51. f5 Rd3 52. Be1
Rxe3 53. Rd1 Re4 54. Kg5 Rd4 55. Rxd4 Bxd4 56. f6 Be3 57. Kf5 Be6
58. Ke4 Bc5 59. a6 Kc6 60. f7 Ba2 61. Kf5 Bc4 62. Bc3 Be7 63. a7 Kb7
64. Bd4 Bb3 65. Ke5 Ba2 66. Bb6 Ka8 67. Bg1 Bc4 68. Bf2 Bb3 69. Bd4
Kb7 70. Bb6 Bb4 71. Be3 Be7 72. Bg1 Bc4 73. Bb6 Bb3 74. Kf5 Ka8 75. Be3
Bc4 76. Bg1 Bb3 77. Bb6 Bd5 78. Bd4 Bc4 79. Ke5 Kb7 80. Ke4 Bb3 81. Kf5
Ba2 82. Kf4 Bf8 83. Be3 Bb4 84. Ke5 Be7 85. Kf5 Bd5 86. Kf4 Bc4 87. Bd4
Bd5 88. Bg1 Ka8 89. Be3 Bb4 90. Ke5 Bc4 91. Kf6 Bc3 92. Ke7 Bb4 93. Kd8
Kb7 94. Bg5 Bf8 95. Ke8 Bg7 96. Bf6 Bh6 97. Bd4 Ka8 98. Ke7
1-0[/pgn]

Final position:
[d]k7/P3KP2/6Pb/8/2bB4/8/8/8 b - - 70 98

The game was adjudicated by the GUI (ChessGUI) as a win for White after both engines had displayed a score of more than +6.00 in favor of White for a while. Obviously neither engine is aware of the "wrong bishop" pattern (White's dark-squared bishop cannot control the promotion square a8), which would save the draw for Black. In fact there is no winning plan for White, since Black can capture both the f- and g-pawns with his two bishops. There were even a couple of situations where Black missed a chance to force the draw, e.g. after 85.Kf5 with 85... Bb1+ 86.Ke6 Bxg6 87.Kxe7 Bxf7 draw, or after 95.Ke8 Bd3! 96.Kxf8 Bxg6 followed by 97... Bxf7 draw. But even without forcing the draw like that, and even considering that SF displays a "winning score" of about +2.1 for White in the final position (and had even displayed +3.66 for a longer period before), the fact remains that without the adjudication the game would very probably have ended in a draw.

Now my questions:

1) What do you think, should this game be adjudicated as a win for White in the final position based on the scores displayed by both engines?

2) If you answered "no" then how should the situation be handled by the GUI?
- Only adjudicate if the material on the current board already reflects the score within +/- 2.0 pawns?
- Only adjudicate if the fifty-moves counter is below a certain threshold, e.g. 20? (In the game above it was 35 in the end since White made no progress for a long time.)
- Only adjudicate if the "winning" side displays a PV that either leads to a tablebase win or to a position where the material reflects the score within +/- 2.0 pawns?
- Anything else, or any combination of the points above?
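
To make the first two suggestions concrete, here is a minimal Python sketch of how a GUI might combine them. It is not how ChessGUI or any other GUI actually works; the function names, the simple 1/3/3/5/9 piece values, and the default thresholds are assumptions for illustration only. The PV-based rule is left out because it would need engine output and tablebase access.

```python
# Hypothetical adjudication guard; names, piece values and defaults are assumptions.
PIECE_VALUES = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0, "k": 0.0}

# Final position of the game above.
FINAL_FEN = "k7/P3KP2/6Pb/8/2bB4/8/8/8 b - - 70 98"


def material_balance(fen):
    """Material in pawns from White's point of view (positive = White is ahead)."""
    board = fen.split()[0]
    balance = 0.0
    for ch in board:
        if ch.isalpha():  # skip digits (empty squares) and '/' separators
            value = PIECE_VALUES[ch.lower()]
            balance += value if ch.isupper() else -value
    return balance


def fifty_move_counter(fen):
    """Full moves since the last capture or pawn move (FEN stores half-moves)."""
    return int(fen.split()[4]) // 2


def may_adjudicate_win(fen, white_score, threshold=6.0,
                       material_tolerance=2.0, max_counter=20):
    """Allow a win adjudication only if the score is backed by material
    and the fifty-move counter is still low."""
    if abs(white_score) < threshold:
        return False  # score not decisive enough
    if abs(white_score - material_balance(fen)) > material_tolerance:
        return False  # score is not reflected by the material on the board
    if fifty_move_counter(fen) >= max_counter:
        return False  # no progress for too long
    return True
```

For the final position, `material_balance(FINAL_FEN)` is 0.0 (bishop plus three pawns against two bishops) and `fifty_move_counter(FINAL_FEN)` is 35, so `may_adjudicate_win(FINAL_FEN, 6.5)` returns `False` on both counts.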

Looking forward to your opinions ... Please note that my main interest in this case is not to change the result into something "better" for my engine Jumbo. I would like to know how others think about adjudication in special cases like the one above, and if there is anything we could learn from it, possibly also something to be improved.
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Adjudication issues

Post by hgm »

In any case you should never write engines that display scores below -1.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Adjudication issues

Post by Sven »

hgm wrote:In any case you should never write engines that display scores below -1.
That would indeed solve it :lol:
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Adjudication issues

Post by MikeB »

hgm wrote:In any case you should never write engines that display scores below -1.
Bingo +1!
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Adjudication issues

Post by carldaman »

Sven Schüle wrote:
Now my questions:

1) What do you think, should this game be adjudicated as a win for White in the final position based on the scores displayed by both engines?

2) If you answered "no" then how should the situation be handled by the GUI?
- Only adjudicate if the material on the current board already reflects the score within +/- 2.0 pawns?
- Only adjudicate if the fifty-moves counter is below a certain threshold, e.g. 20? (In the game above it was 35 in the end since White made no progress for a long time.)
- Only adjudicate if the "winning" side displays a PV that either leads to a tablebase win or to a position where the material reflects the score within +/- 2.0 pawns?
- Anything else, or any combination of the points above?

Looking forward to your opinions ... Please note that my main interest in this case is not to change the result into something "better" for my engine Jumbo. I would like to know how others think about adjudication in special cases like the one above, and if there is anything we could learn from it, possibly also something to be improved.
1) The displayed scores forced the adjudication under the adj rules, so the result must stand.

2) These are all great suggestions, but is there any GUI that would allow setting up such clever rules? I'm not aware of one. AFAIK, GUIs are still in a semi-primitive state in this area of adjudication.

CL
Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 5:33 am

Re: Adjudication issues

Post by Mike S. »

This is a case where human intervention is required, and it has to be adjudicated as a draw.
Regards, Mike
Colin-G
Posts: 191
Joined: Mon Oct 31, 2016 6:30 pm
Location: England

Re: Adjudication issues

Post by Colin-G »

Mike S. wrote:This is a case where human intervention is required, and it has to be adjudicated as a draw.
I agree.
If this game had been played in one of my tournaments, it would have carried on and probably ended in a draw by the 50-move rule.
This is because I use a resign/difference setting of 9.0 pawns and not the 6.0 that was used in this game.
Any engine that thinks it is 9 pawns ahead should be able to see the winning moves and play them.
Stockfish 8 (with 5-man Syzygy bases) on this slowish computer evaluates the final position shown as White winning by 3.72 pawns at depth 65 ply after about a minute or so.
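
As a rough illustration of why the threshold matters, here is a small Python sketch of a score-agreement rule. The function name, the number of evaluations that must agree, and the sample scores are all made up for illustration; they are not taken from any GUI or from the actual game log.

```python
def agreed_win(white_scores_a, white_scores_b, threshold=9.0, evaluations=4):
    """Return True if both engines reported at least `threshold` pawns
    (from White's point of view) on each of their last `evaluations` moves."""
    if len(white_scores_a) < evaluations or len(white_scores_b) < evaluations:
        return False  # not enough history yet
    recent = white_scores_a[-evaluations:] + white_scores_b[-evaluations:]
    return all(score >= threshold for score in recent)


# Hypothetical score histories, both engines a little above +6.00.
engine_a = [6.2, 6.5, 6.4, 6.8]
engine_b = [6.1, 6.3, 6.6, 6.7]

stopped_at_6 = agreed_win(engine_a, engine_b, threshold=6.0)
stopped_at_9 = agreed_win(engine_a, engine_b, threshold=9.0)
```

With these made-up numbers the game would be stopped at a 6.0 setting but played on at 9.0.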
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Adjudication issues

Post by Rebel »

Sven Schüle wrote:Looking forward to your opinions ... Please note that my main interest in this case is not to change the result into something "better" for my engine Jumbo. I would like to know how others think about adjudication in special cases like the one above, and if there is anything we could learn from it, possibly also something to be improved.
Since neither engine has the proper knowledge, statistically Jumbo could profit from the next 3 such cases. Also make sure Jumbo doesn't end up in such positions :wink:

In my active time I used a threshold of -5.00 | +5.00, no matter whether there were queens on the board or not. So every now and then there were incorrect adjudications, but statistically it doesn't matter and, well, if the engine made it to a +5.xx score it deserved to win anyway.

Alright, in tournaments and even in rating lists such cases should not happen. I don't envy Graham right now :lol:
Modern Times
Posts: 3548
Joined: Thu Jun 07, 2012 11:02 pm

Re: Adjudication issues

Post by Modern Times »

I think Graham is sleeping well. In an ideal world, if you had unlimited time and hardware, you'd turn adjudication off. In the real world, probably 99.9% of adjudications are correct, and the price for the 0.1% of wrong adjudications is worth paying for a ratings list: far more games get played because they are shorter, with more robust ratings due to smaller statistical margins of error. And the 0.1% almost certainly does not affect the ratings at all. Of course it is impossible for every game to be checked manually when you are running more than 1,000 per week.
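
The point about margins of error can be put in numbers with a small Python sketch. It uses the plain standard error of the mean score per game (win = 1, draw = 0.5, loss = 0); the function name is an assumption, and real rating tools use more elaborate models.

```python
import math


def score_standard_error(wins, draws, losses):
    """Standard error of the mean game score for a win/draw/loss record."""
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    mean = (wins + 0.5 * draws) / n
    # Variance of a single game result around the mean score.
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / n
    return math.sqrt(variance / n)
```

The error shrinks with the square root of the number of games: `score_standard_error(50, 0, 50)` is about 0.05, while four times as many games, `score_standard_error(200, 0, 200)`, gives about 0.025.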

Personally I set adjudication at 8.50. For a while I followed the TCEC settings, but they cut the games too short for my liking.
Harvey Williamson
Posts: 2010
Joined: Sun May 25, 2008 11:12 pm
Location: Whitchurch. Shropshire, UK.
Full name: Harvey Williamson

Re: Adjudication issues

Post by Harvey Williamson »

Modern Times wrote:In the real world, probably 99.9% of adjudications are correct

Personally I set adjudication at 8.50.
Sounds about right to me.