Hi Frank, thanks again for that.
Despite what I wrote at CSS, I find the idea behind the tbs positions really interesting for testing engines (without the use of tbs, of course): each has one single solution that wins right at the edge of the 50-move rule, while the next-weaker move is very close to it but wins only as a cursed win.
But let's have a look at the three example positions I already talked about at CSS, starting with nr. 81:
8/q7/6P1/4K2Q/8/8/8/k7 w - - 0 1
bm Qh1+
The second-best move, Qd1+, takes just over 50 moves to reset the counter and so only draws under this rule; DTM is only 2 moves longer for the one than for the other. Of course you can hope that modern engines separate the two moves from each other by search without tbs, but here's SF dev. with 30 threads of 16.5GHz CPU, 8G hash (too little; I began with it only because it's the standard setting at your site) and MultiPV=2 (the GUI output uses German piece letters: D = queen, T = rook, L = bishop):
8/q7/6P1/4K2Q/8/8/8/k7 w - -
Engine: Stockfish250602 (8192 MB)
von the Stockfish developers (see AUTHORS f
52 8:13 +0.69 1.Dd1+ Ka2 2.De2+ Kb3 3.Df3+ Kc2
4.Df5+ Kc1 5.Df1+ Kc2 6.Db5 Dg7+
7.Kf5 Df8+ 8.Ke4 Da8+ 9.Kf4 Da7
10.De5 Df2+ 11.Kg5 Dd2+ 12.Kf6 Dd8+
13.Kf5 Dd7+ 14.De6 (33.880.629.084) 68646
51 8:13 +0.65 1.Dh1+ Ka2 2.Dd5+ Kb1 3.De4+ Ka2
4.Dg2+ Kb1 5.Df1+ Kc2 6.Db5 Dg7+
7.Kf5 Df8+ 8.Ke4 Da8+ 9.Kf4 Dd8
10.De5 Dd2+ 11.Kg4 Dd1+ 12.Kf5 Dd7+
13.De6 Db5+ 14.Kf6 (33.880.629.084) 68646
For a position so near the end of the game with so little material on board, that eval is practically a draw, isn't it? So which of the two moves the engine chooses would be pretty much pure chance, wouldn't it?
Next one, nr. 82, same hardware setting, just with 32G hash this time, in line with the LTC:
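The 50-move arithmetic that separates a win from a cursed win in positions like this one can be sketched as follows. This is only a minimal illustration: the function name and the DTZ numbers are hypothetical (they are not the tablebase values of nr. 81), DTZ is assumed to be given in plies, and rounding subtleties of real tablebase formats are ignored.

```python
# Sketch: a tablebase win only counts as a real win if the next zeroing move
# (capture or pawn move) arrives before the 50-move counter runs out.
FIFTY_MOVE_LIMIT_PLIES = 100  # 50 moves by each side


def classify_win(dtz_plies: int, halfmove_clock: int = 0) -> str:
    """Classify a theoretically won move as 'win' or 'cursed win'.

    dtz_plies: plies until the next zeroing move with best play (assumed input).
    halfmove_clock: plies already played since the last zeroing move.
    """
    if dtz_plies < 0 or halfmove_clock < 0:
        raise ValueError("ply counts must be non-negative")
    if halfmove_clock + dtz_plies <= FIFTY_MOVE_LIMIT_PLIES:
        return "win"
    return "cursed win"


# Hypothetical DTZ values, chosen only to illustrate the boundary.
print(classify_win(99))   # win
print(classify_win(101))  # cursed win
```

Two moves whose distances differ by a couple of plies can therefore land on opposite sides of the limit, which is exactly what an engine without tbs has to find out by search alone.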
8/8/8/1r6/3K1B2/8/7R/k7 w - - 0 1
bm Kc3; this time Kc4 and the given solution sit about as close to the boundary between win and cursed win as the two moves of the previous position did:
8/8/8/1r6/3K1B2/8/7R/k7 w - -
Engine: Stockfish250602 (32768 MB)
von the Stockfish developers (see AUTHORS f
76 6:57 0.00 1.Kc3 Ta5 2.Ld6 Tf5 3.Lb4 Tf3+ 4.Kc2 Ka2
5.Lc5 Tg3 6.Tf2 Th3 7.Ld6 Te3 8.Tf8 Te2+
9.Kd3 Tb2 10.Ta8+ Kb1 11.Le5 Td2+
12.Kc4 Kc2 13.Ta1 Te2 14.Ta2+ (29.369.207.763) 70312
76 6:57 0.00 1.Kc4 Ta5 2.Kb3 Tb5+ (29.369.207.763) 70312
You see, this time SF evaluates both candidate moves at 0.00, so what now?
Next one, nr. 84:
8/8/7Q/1r4K1/2p5/8/3k4/8 w - - 0 1
bm Kf6+ wins, staying just inside the 50-move limit as far as DTZ is concerned; next-best Kg4+ is just over 50 and a cursed win again:
8/8/7Q/1r4K1/2p5/8/3k4/8 w - -
Engine: Stockfish250602 (32768 MB)
von the Stockfish developers (see AUTHORS f
55 9:40 +1.36 1.Kf6+ Kc2 2.Dg6+ Kb2 3.Dg2+ Kb3
4.Dg8 Kb4 5.Dg4 Td5 6.De6 Td2 7.Db6+ Kc3
8.Da5+ Kd3 9.Df5+ Kc3 10.Ke6 Kb4
11.Db1+ Kc5 12.Dg1+ Kb4 13.Db6+ Ka4
14.Dc5 (45.871.022.887) 79083
54 9:40 +1.12 1.Kg4+ Kc2 2.Da6 Tb4 3.Da2+ Kd3
4.Da3+ Tb3 5.Da6 Tb2 6.Kf5 Te2 7.Db5 Te3
8.Dd7+ Kc3 9.Kf4 Te2 10.Da4 Kd3
11.Dd1+ Td2 12.Db1+ Kd4 13.Kf5 Kc3
14.Ke5 (45.871.022.887) 79083
And after some more pondering time:
8/8/7Q/1r4K1/2p5/8/3k4/8 w - -
Engine: Stockfish250602 (32768 MB)
von the Stockfish developers (see AUTHORS f
57 15:54 +1.40 1.Kf6+ Kc2 2.Dg6+ Kc1 3.Dg1+ Kb2
4.Dg2+ Kb3 5.Dg8 Th5 6.Db8+ Kc2
7.De8 Th4 8.Dg6+ Kb2 9.Dg2+ Kb3
10.Dd5 Th3 11.Db5+ Kc3 12.Ke6 Td3
13.Db1 Kd2 14.Da2+ (78.511.122.412) 82229
56 15:54 +1.33 1.Kg4+ Kc2 (78.511.122.412) 82229
Again we see SF evaluate the two moves very close to each other, and neither with a clearly winning eval; again the choice between them without tbs will be pure chance, won't it? At least here the better one comes a little nearer to +-, yet the gap between the two evals is not as big as what I'm used to seeing with classical single best moves.
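To put numbers on how small these gaps are, here is a short sketch that takes the evals from the MultiPV=2 outputs quoted above (converted to centipawns) and compares the gap against a threshold. The threshold of 50 centipawns and the function names are my own assumptions for illustration, not any established standard.

```python
# Evals in centipawns for (tablebase best move, second-best move),
# copied from the MultiPV=2 outputs quoted in this post.
EVALS_CP = {
    "nr. 81": (65, 69),             # Qh1+ vs Qd1+
    "nr. 82": (0, 0),               # Kc3 vs Kc4
    "nr. 84 (9:40)": (136, 112),    # Kf6+ vs Kg4+
    "nr. 84 (15:54)": (140, 133),   # Kf6+ vs Kg4+
}

MIN_GAP_CP = 50  # assumption: arbitrary threshold for a "clear" single best move


def eval_gap(bm_cp: int, alt_cp: int) -> int:
    """Eval of the best move minus eval of the second-best move."""
    return bm_cp - alt_cp


def is_discriminated(bm_cp: int, alt_cp: int, min_gap_cp: int = MIN_GAP_CP) -> bool:
    """True if the best move is ahead of the alternative by at least min_gap_cp."""
    return eval_gap(bm_cp, alt_cp) >= min_gap_cp


for name, (bm_cp, alt_cp) in EVALS_CP.items():
    gap = eval_gap(bm_cp, alt_cp)
    print(f"{name}: gap {gap:+d} cp, discriminated: {is_discriminated(bm_cp, alt_cp)}")
```

With these figures no position reaches the assumed threshold, and for nr. 81 the gap is even negative, meaning the engine ranks the cursed win first.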
The reason the positions fooled me at first glance was that I looked at DTM only, with Nalimovs in the Shredder GUI and Syzygys switched off; seeing the engine evaluate two moves very close to each other each time, I wrongly thought the positions were cursed wins anyhow.
To say it one more time, as I already did at CSS: these are (as are all positions whose best follow-up lines you know well enough) test positions to be evaluated by engines while looking at the LTC output. But without also seeing this output, and thus knowing the reasons for the evaluations (the lines belonging to the evals), adjudication by GUI or tool of the one move as solved and the other as not is selection bias, so they cannot be used well alongside quite different positions in a suite for short to medium TC.
From my personal pov, I'd rather use them with Forward-Backward only, to see the points (plies) where the engines' evals start to change, with much shorter hardware time but with standalone pondering only.
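As a rough sketch of what "the points where the evals start to change" could mean in practice, assume one has already collected one eval per ply while stepping through a known best line. The function name, the jump threshold, and the eval list below are all hypothetical; the numbers are not engine output.

```python
from typing import Optional, Sequence


def first_eval_change(evals_cp: Sequence[int], min_jump_cp: int = 100) -> Optional[int]:
    """Return the index (ply) of the first eval that differs from the
    previous one by at least min_jump_cp, or None if there is no such jump."""
    for ply in range(1, len(evals_cp)):
        if abs(evals_cp[ply] - evals_cp[ply - 1]) >= min_jump_cp:
            return ply
    return None


# Hypothetical evals in centipawns, one per ply along a known line.
hypothetical_evals_cp = [65, 70, 68, 75, 310, 900]
print(first_eval_change(hypothetical_evals_cp))  # 4
```

The earlier along the line such a jump shows up, the sooner the engine has seen through the position on its own.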
As I already said at CSS, if I used such positions for a suite at all, I'd at least make a suite of its own out of them only (80 of them at such hardware time are a lot in terms of the overall time to be invested anyhow; only the statistical relevance will of course be very low then, with such big error bars). And without tbs (with them it doesn't make sense) engines would need even much more hardware time than they'd need on average for the first 81.
Regards,
Peter.