Vinvin wrote:
Conditions:
Intel-i7-4GHz, 1 core, 30 min per position, 2GB hashtables
Syzygy 6 pcs on SSD
Interface : Arena 3.5
116 positions from Hard-Talkchess-2015-Beta-2
SF 6b1 solved 60 of 116!
I think the following position (3rd of 116) is not a good position for testing, as there are several equally good winning moves: Bc1, Bd2, and other bishop moves.
[d]8/2b2k1K/1pPp1p2/1P1P1P2/5B2/8/8/8 w - -
Now I see your point. But be careful with this kind of testing method: the engine may see the capture on the third move, but once you advance the position by playing Be3, for example, it may no longer see the capture, or it may delay it for as long as neither the twofold repetition nor the 50-move rule is about to kick in.
I plan to create a Python script for these tests, so at least I know that this third position is different. I will also report total points based on the difficulty of the positions: solving a more difficult position will earn more points than solving an easier one.
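The difficulty-weighted total could be computed roughly like this. It is only a sketch: it assumes each EPD line carries its difficulty as c1 "diff=N"; (the form used elsewhere in the suite), and the function names and the default weight of 1 are made up for illustration.

```python
import re


def difficulty(epd_line, default=1):
    """Return N from a c1 "diff=N"; operation, or `default` if it is absent.

    The default of 1 is an assumption for lines without a difficulty tag.
    """
    match = re.search(r'c1\s+"diff=(\d+)"', epd_line)
    return int(match.group(1)) if match else default


def weighted_score(results):
    """Sum difficulty points over solved positions.

    `results` is an iterable of (epd_line, solved) pairs.
    Returns (points_earned, points_available).
    """
    earned = available = 0
    for epd_line, solved in results:
        points = difficulty(epd_line)
        available += points
        if solved:
            earned += points
    return earned, available
```

With this, an engine that solves one diff=3 position and misses one diff=9 position gets 3 of 12 points, instead of 1 of 2 solved.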
A script, nice! It should also check for a winning or drawing score on some positions (e.g. position #57).
Also, I'll probably change the difficulty values for a couple of positions after the beta 2 phase.
My script has a limitation in how it reads a move: the move must be in long algebraic notation, like e2e4, as in the UCI standard, so that I can easily compare the bestmove sent by the engine with the bm equivalent in long algebraic. The bm in your EPD will not be changed, but I plan to add c3 "e2e4"; as the equivalent of bm e4, for example. Once you have finalized the EPD, I will add the bm in long algebraic as comment c3. It would be good if you could reserve c3 for the bm in long algebraic.
Regarding pos 57, how do you define a draw eval? Maybe a score within +/-80, which means that if
(bestscore <= 80 && bestscore >= -80) then this position is solved.
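To illustrate the c3 idea, here is a minimal sketch of how the comparison might look. It assumes EPD operations have the form opcode operand; with optionally quoted operands, and that c3 holds one or more space-separated long algebraic moves. The function names are hypothetical.

```python
import re


def parse_epd_ops(epd_line):
    """Split an EPD line into (position, ops), where ops maps opcode -> operand."""
    if epd_line.startswith("[d]"):  # forum diagram tag, not part of the EPD
        epd_line = epd_line[3:]
    fields = epd_line.split(None, 4)
    position = " ".join(fields[:4])  # board, side to move, castling, en passant
    rest = fields[4] if len(fields) > 4 else ""
    ops = {}
    # Each operation is: opcode, then a quoted or bare operand, then ';'
    pattern = r'(\w+)\s+(?:"([^"]*)"|([^;]*));'
    for opcode, quoted, bare in re.findall(pattern, rest):
        ops[opcode] = quoted if quoted else bare.strip()
    return position, ops


def engine_bestmove(uci_line):
    """Extract the move from a UCI 'bestmove <move> [ponder <move>]' line."""
    tokens = uci_line.split()
    if len(tokens) >= 2 and tokens[0] == "bestmove":
        return tokens[1]
    return None


def is_solved(uci_line, ops):
    """True if the engine's bestmove is one of the moves listed in c3."""
    expected = ops.get("c3", "").split()
    return engine_bestmove(uci_line) in expected
```

For a line such as `... bm Kc4; c3 "d4c4";` the script would then compare d4c4 directly with the engine's bestmove, without having to interpret the SAN in bm.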
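In Python the draw check might look like the sketch below. It assumes the score is read from a UCI info line (score cp N or score mate N), that the 80 is in centipawns, and that a mate score never counts as a draw; the function names are made up.

```python
def score_from_info(info_line):
    """Return ('cp', n) or ('mate', n) from a UCI info line, or None."""
    tokens = info_line.split()
    if "score" in tokens:
        i = tokens.index("score")
        if i + 2 < len(tokens) and tokens[i + 1] in ("cp", "mate"):
            return tokens[i + 1], int(tokens[i + 2])
    return None


def draw_solved(info_line, window=80):
    """True if the score is a centipawn score within +/-window.

    Assumption: the window is in centipawns and mate scores are not draws.
    """
    score = score_from_info(info_line)
    return score is not None and score[0] == "cp" and abs(score[1]) <= window
```

The window is a parameter so that it can be tuned per position if a single value turns out to be too loose or too strict.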
What is your plan on no. 99?
[d]7k/8/4N3/4NN2/n2K4/8/8/3n4 w - - bm Kc4; c0 "mate in 21"; c1 "diff=9";
Are you going to replace this? According to the 7-man tablebase, this is just a draw.
Two pieces of bad news:
1) I ran SF6-RC1 with 3 best moves on the suite, but Arena 3.5 doesn't handle the automatic test with more than 1 line. I have the big log file and have to browse it manually to get the timing for each move :-/
2) I let Sting 4.7 analyze the suite overnight (6 cores, 10 min/position), but in the morning I saw that it had crashed after 42 positions (backward from the end). I resumed the suite from the beginning. More news in 8 hours.
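For point 1, the manual browsing of the log might be avoided with something like the sketch below. It assumes only that each relevant log line contains the raw UCI info text (with time in milliseconds and a pv); the exact layout of an Arena log is not assumed, and any prefix before info is ignored. The function name is hypothetical.

```python
def first_seen_times(log_lines):
    """Map each move that heads a PV line to the earliest 'time' (ms) it appeared.

    Works for MultiPV output because every PV line is inspected, whatever its
    multipv index. It records the first appearance only, not whether the
    engine kept the move afterwards.
    """
    first_seen = {}
    for line in log_lines:
        tokens = line.split()
        if "info" not in tokens or "pv" not in tokens or "time" not in tokens:
            continue
        try:
            elapsed = int(tokens[tokens.index("time") + 1])
            move = tokens[tokens.index("pv") + 1]
        except (IndexError, ValueError):
            continue  # truncated or malformed line
        if move not in first_seen or elapsed < first_seen[move]:
            first_seen[move] = elapsed
    return first_seen
```

The log would have to be split per position first (for example at each position or go command), and the result looked up with the expected move in long algebraic.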