WAC test

Evert · Post by **Evert** » Sun Jul 17, 2011 4:22 pm

lucasart wrote: PS: here's a position that seems relatively easy for humans but tricky for computers (my engine never finds Rxb2! and Houdini only finds it at depth 20, although it reaches that depth in a couple of seconds...)

[D]8/7p/5k2/5p2/p1p2P2/Pr1pPK2/1P1R3P/8 b - -

That's because humans know that two connected pawns on the sixth rank can't be stopped by a rook. If you add that knowledge (be careful not to award the bonus when it isn't warranted: there must be no pieces other than the rook on the board, and the enemy king must not be able to enter the square), your engine will find it easily.
I don't think it does much for Elo, but it's still neat to see it when it does come up.
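
As a rough illustration of how such knowledge might be tested for, here is a minimal sketch. Everything in it is an assumption rather than anyone's actual engine code: the square numbering (a1 = 0, h8 = 63), the function names, and the reading of "enter the square" as the plain rule of the square. Material and passedness are passed in as flags because the sketch has no board representation.

```cpp
#include <algorithm>

// Squares are numbered 0..63 with a1 = 0 and h8 = 63 (an assumed convention).
constexpr int file_of(int sq) { return sq & 7; }
constexpr int rank_of(int sq) { return sq >> 3; }
constexpr int abs_diff(int a, int b) { return a > b ? a - b : b - a; }

// Number of king moves between two squares.
constexpr int king_distance(int a, int b) {
    return std::max(abs_diff(file_of(a), file_of(b)),
                    abs_diff(rank_of(a), rank_of(b)));
}

// Rank seen from the pawn owner's side: 1 = own back rank, 8 = promotion rank.
constexpr int relative_rank(int sq, bool white_pawns) {
    return white_pawns ? rank_of(sq) + 1 : 8 - rank_of(sq);
}

// Plain rule of the square: true if the defending king can still catch the pawn.
constexpr bool king_in_square(int pawn, bool white_pawns, int king, bool defender_to_move) {
    const int promo = file_of(pawn) + (white_pawns ? 56 : 0);
    const int pawn_moves = 8 - relative_rank(pawn, white_pawns);
    const int king_moves = king_distance(king, promo) - (defender_to_move ? 1 : 0);
    return king_moves <= pawn_moves;
}

// Hypothetical test for "two connected passers on the sixth beat a lone rook".
// The caller supplies the facts this sketch cannot see (material, passedness).
constexpr bool connected_passers_beat_rook(int pawn_a, int pawn_b, bool white_pawns,
                                           int defender_king, bool defender_to_move,
                                           bool defender_has_only_rook, bool both_passed) {
    return defender_has_only_rook && both_passed
        && abs_diff(file_of(pawn_a), file_of(pawn_b)) == 1
        && relative_rank(pawn_a, white_pawns) >= 6
        && relative_rank(pawn_b, white_pawns) >= 6
        && !king_in_square(pawn_a, white_pawns, defender_king, defender_to_move)
        && !king_in_square(pawn_b, white_pawns, defender_king, defender_to_move);
}
```

By this plain test the white king on f3 is still inside the square after 1...Rxb2 2.Rxb2 c3, so a version this cautious would stay silent in the position above and leave the work to the search. How loosely to define the king condition is a tuning choice the sketch does not settle.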

By the way, Jazz first finds it at depth 3, but then goes off it until depth 7:

[ 1]   -3.75    0.00        0   (EBR =  0.00)     0 / 0 Rxa3
[ 2]   +1.09    0.00      178   (EBR =  8.90)     2 / 1 c3     bxc3   Rxc3
< 3>            0.00      595   +1.09 ->  +1.56   3 / 0 Rxb2   Rxb2   c3
[ 3]   +1.56    0.00      595   (EBR =  1.94)     3 / 0 Rxb2   Rxb2   c3
[ 4]   +1.48    0.01     1087   (EBR =  1.49)     4 / 0 Rxb2   Rxb2   c3     Rb8
< 5>            0.02     4645   +1.20 ->  +1.52   6 / 0 c3     bxc3   Rxc3   Ra2    Rc2
[ 5]   +1.52    0.02     5614   (EBR =  3.69)     5 / 0 c3     bxc3   Rxc3   Ra2    Rc2
[ 6]   +1.52    0.04     8832   (EBR =  1.28)     6 / 0 c3     bxc3   Rxc3   Rb2    Rxa3   e4
< 7>   ---      0.05    13523   +1.52 ->  +0.87   8 / 0 c3     bxc3   Rxc3   e4     Rxa3   e5     Ke7    Rb2
< 7>   !!!      0.07    26605   +0.87 ->  +1.52   8 / 0 Rxb2   Rxb2   c3     Rb6    Kf7    Rb7    Ke6    Rb6    Kd5    Rb7
[ 7]   +1.52    0.08    29520   (EBR =  2.29)    10 / 0 Rxb2   Rxb2   c3     Rb6    Kf7    Rb7    Ke6    Rb6    Kd5    Rb7
[ 8]   +1.51    0.11    51877   (EBR =  0.66)    11 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rh6    c2     Rxh7   Kd6    Rh6    Kd5
< 9>   !!!      0.15    99107   +1.51 ->  +2.24  11 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd8    Rb8    Kc7    Rb4    c2     <H>
[ 9]   +2.24    0.16   104265   (EBR =  1.92)    11 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd8    Rb8    Kc7    Rb4    c2
[10]   +2.24    0.19   137645   (EBR =  0.70)    14 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd8    Rb8    Kc7    Rb4    c2     Rc4    Kd6    e4
[11]   +2.58    0.28   225587   (EBR =  2.22)    15 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Ke8    Rb8    Kd7    Rb7    Kc6    Rb1    c2     Rc1    Kd5
[12]   +2.60    0.36   325703   (EBR =  1.12)    16 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Ke8    Rb8    Kd7    Rb7    Kc6    Rb1    c2     Rc1    Kd5    Kg3
<13>   !!!      0.89   951019   +2.60 ->  +4.43  17 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd6    Rb6    Kc7    Rb4    c2     Rc4    Kb6    Kg3    d2     Rxc2   d1Q
[13]   +4.43    0.96  1028692   (EBR =  6.66)    17 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd6    Rb6    Kc7    Rb4    c2     Rc4    Kb6    Kg3    d2     Rxc2   d1Q
[14]   +4.39    1.33  1448118   (EBR =  0.62)    18 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rb7    Kd6    Rb6    Kc7    Rb4    c2     Rc4    Kb6    Kg3    d2     Rxc2   d1Q    Rc8
[15]   +4.49    2.04  2270682   (EBR =  1.91)    17 / 0 Rxb2   Rxb2   c3     Rb6    Ke7    Rc6    c2     Kf2    Kd7    Rc3    d2     Rxc2   d1Q    Rc5    Qd2    Kf3    Ke6

I'm sure it used to do better than that on this position. Of course, you don't tune your program so that it efficiently solves one particular position...

Ralph Stoesser · Post by **Ralph Stoesser** » Sun Jul 17, 2011 4:33 pm

lucasart wrote:
lucasart wrote: My current score is only WAC 65/300... 8MB Hash and 1 second per position. And I still have some illegal moves in the PV... still a lot of work to do
And 226/300 *without* htable. clearly something wrong with my htable...

Yes, with a working HT you should be able to solve this one

[D]8/7p/5k2/5p2/p1p2P2/Pr1pPK2/1P1R3P/8 b - -

I'm also in the process of writing a chess engine. I have no search extensions besides the check extension, and a very basic eval without a pawn_on_7th bonus, a connected_passers bonus or anything like that. But I have a working HT and can solve it at depth 23 after 1 minute.

lucasart · Post by **lucasart** » Sun Jul 17, 2011 4:50 pm

lucasart wrote:
lucasart wrote: My current score is only WAC 65/300... 8MB Hash and 1 second per position. And I still have some illegal moves in the PV... still a lot of work to do
And 226/300 *without* htable. clearly something wrong with my htable...

=> Great news! I found the htable bug.

Now my engine scores 226/300 on wac.epd, using either no hash or 8 MB hash, and searching only 1 second per move.

I don't expect a big improvement from increasing the search time or the hash size, but I will do it to check that nothing goes wrong (like no hash scoring better than htable with more time and more hash).

Clearly it can't rival any serious engine yet, but I am quite happy with that first result.

bob · Post by **bob** » Sun Jul 17, 2011 6:10 pm

lucasart wrote: hello

I would like to submit my engine (under development) to the famous WAC.epd test. Do you know of a (UCI + Linux) interface that allows that?
Basically I'm looking for a feature like:
* load positions sequentially
* computer analyzes for X seconds
* compares the best move found by computer to the one indicated in the EPD
* output a score (like 200/300 meaning 200 correct moves out of the 300 WAC.epd positions)

Thank you

PS: here's a position that seems relatively easy for humans but tricky for computers (my engine never finds Rxb2! and Houdini only finds it at depth 20, although it reaches that depth in a couple of seconds...)

[D]8/7p/5k2/5p2/p1p2P2/Pr1pPK2/1P1R3P/8 b - -

Most programs solve almost all WAC positions in under a second. Here's Crafty on my laptop, using just one CPU:

17 0.26 -1 1. ... Rxb2!
17 0.28 -3 1. ... Rxb2!
17 0.40 -M 1. ... Rxb2!
17 0.57 -5.28 1. ... Rxb2 2. Rxb2 c3 3. Rb6+ Ke7
4. Rc6 c2 5. Kf2 d2 6. Rxc2 d1=Q 7.
Rc7+ Kd6 8. Rxh7 Qd2+ 9. Kf3 Qd5+ 10.
Kf2 Qa2+ 11. Kf3 Qxa3 12. Rg7

I wrote the code myself to run these kinds of tests. That is always the best solution to this since you can collect whatever statistics you want.
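
For anyone rolling their own runner along these lines, the scoring part can be quite small. The sketch below is only an illustration under stated assumptions: it reads the `bm` operand of an EPD record, and it assumes the engine's chosen move has already been converted to SAN (a UCI engine reports coordinate moves, and that conversion is not shown). The function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Return the moves listed after the "bm" opcode of one EPD record (empty if none).
std::vector<std::string> best_moves(const std::string& epd) {
    std::vector<std::string> moves;
    std::istringstream in(epd);
    std::string tok;
    // Skip the four position fields: placement, side to move, castling, en passant.
    for (int i = 0; i < 4; ++i) in >> tok;
    bool in_bm = false;
    while (in >> tok) {
        if (!in_bm) {
            if (tok == "bm") in_bm = true;
            continue;
        }
        const bool last = tok.back() == ';';  // the operand list ends at ';'
        if (last) tok.pop_back();
        if (!tok.empty()) moves.push_back(tok);
        if (last) break;
    }
    return moves;
}

// Drop trailing check, mate and annotation marks so "Rxb2!" matches "Rxb2".
std::string strip_annotations(std::string san) {
    while (!san.empty() && std::string("+#!?").find(san.back()) != std::string::npos)
        san.pop_back();
    return san;
}

// True if the engine's move (assumed to be SAN already) is one of the best moves.
bool solved(const std::string& epd, const std::string& engine_san) {
    for (const std::string& bm : best_moves(epd))
        if (strip_annotations(bm) == strip_annotations(engine_san)) return true;
    return false;
}

// Count solved positions; engine_moves[i] is the engine's answer to epd_lines[i].
int count_solved(const std::vector<std::string>& epd_lines,
                 const std::vector<std::string>& engine_moves) {
    assert(epd_lines.size() == engine_moves.size());
    int n = 0;
    for (std::size_t i = 0; i < epd_lines.size(); ++i)
        if (solved(epd_lines[i], engine_moves[i])) ++n;
    return n;
}
```

Printing `count_solved(...)` over the number of records gives the "200/300" style score. Per-position statistics such as depth or time to solution would have to come from the engine's own output.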

lucasart · Post by **lucasart** » Sun Jul 17, 2011 6:22 pm

bob wrote: Most programs solve almost all WAC positions in under a second. Here's Crafty on my laptop, using just one CPU:

17 0.26 -1 1. ... Rxb2!
17 0.28 -3 1. ... Rxb2!
17 0.40 -M 1. ... Rxb2!
17 0.57 -5.28 1. ... Rxb2 2. Rxb2 c3 3. Rb6+ Ke7
4. Rc6 c2 5. Kf2 d2 6. Rxc2 d1=Q 7.
Rc7+ Kd6 8. Rxh7 Qd2+ 9. Kf3 Qd5+ 10.
Kf2 Qa2+ 11. Kf3 Qxa3 12. Rg7

I wrote the code myself to run these kinds of tests. That is always the best solution to this since you can collect whatever statistics you want.

Yes, I also wrote a small function to run an EPD test. However, it's not fully automatic: I post-process the output in a shell script to get the WAC result. Anyway, it does the job, and I have the engine's search output in a dump.txt file, which is useful.
So I fixed a couple of things and reran with 16 MB hash and 2 seconds per move
=> 243/300 !
Big improvement (but a long way to go).

lucasart · Post by **lucasart** » Sun Jul 17, 2011 6:37 pm

lucasart wrote: So I fixed a couple of things and reran with 16 MB hash and 2 seconds per move
=> 243/300 !
big improvement (but a long way to go)

And without hash, still at 2 seconds per move: 237/300. So my hash table code is probably OK(ish).

lucasart · Post by **lucasart** » Sun Jul 17, 2011 6:46 pm

There are 2 interesting positions on which my engine output "NoMove", as it couldn't even finish searching depth 1!

[D]1r1r1qk1/p2n1p1p/bp1Pn1pQ/2pNp3/2P2P1N/1P5B/P6P/3R1RK1 w - - 0 1

and

[D]qn1kr2r/1pRbb3/pP5p/P2pP1pP/3N1pQ1/3B4/3B1PP1/R5K1 w - - 0 1

Anyway, it's getting late, so I'll have a look tomorrow to see what happened. The problem is there with or without hash, so debugging shouldn't be too hard.

Evert · Post by **Evert** » Sun Jul 17, 2011 7:23 pm

lucasart wrote: There are 2 interesting positions on which my engine output "NoMove", as it couldn't even finish searching depth 1!

Do you ever extend by more than one ply, deliberately or by accident?
If so, that's probably causing a massive (and pointless) explosion of the search tree.
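
One common way to rule this out is to cap the total extension at one ply per node, so a child is never searched deeper than its parent. A minimal sketch of that idea follows; the names `kMaxExtension` and `child_depth` are hypothetical and not taken from any engine in this thread.

```cpp
#include <algorithm>

// Hypothetical cap: total extension granted at one node, in plies.
constexpr int kMaxExtension = 1;

// Depth to search a child with, given the extensions requested for this move.
// With the cap in place the result is never larger than the parent's depth.
constexpr int child_depth(int depth, int check_ext, int other_ext) {
    const int ext = std::min(check_ext + other_ext, kMaxExtension);
    return depth - 1 + ext;
}
```

Without the cap, two one-ply extensions on the same move would give the child depth + 1, and a line that keeps triggering both would never run out of depth.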

hgm · Post by **hgm** » Sun Jul 17, 2011 10:32 pm

lucasart wrote:
Roman Hartmann wrote:Hi,
Polyglot does all you're looking for just nicely.
Can Polyglot run a small engine-vs-engine tournament and output a PGN and a score?
For example, I'd want it to play 100 games at 1'+1" against another UCI engine, and see the score and the PGN.

No, Polyglot is a single-engine client.

lucasart · Post by **lucasart** » Mon Jul 18, 2011 4:49 pm

Evert wrote:
lucasart wrote: There are 2 interesting positions on which my engine output "NoMove", as it couldn't even finish searching depth 1!
Do you ever extend by more than one ply, deliberately or by accident?
If so, that's probably causing a massive (and pointless) explosion of the search tree.

After a lot of debugging on the first position, I realized that I was facing a quiescence search explosion!
So I read my qsearch again and again, and I finally found the solution in Stockfish. I was just pruning SEE < 0 without distinction, whilst SF does the following SEE pruning in the qsearch:

      // Detect non-capture evasions that are candidate to be pruned
      evasionPrunable =   inCheck
                       && bestValue > VALUE_MATED_IN_PLY_MAX
                       && !pos.move_is_capture(move)
                       && !pos.can_castle(pos.side_to_move());

      // Don't search moves with negative SEE values
      if (   !PvNode
          && (!inCheck || evasionPrunable)
          &&  move != ttMove
          && !move_is_promotion(move)
          &&  pos.see_sign(move) < 0)
          continue;

So if my search exploded, it's probably because of all the useless non-capturing check evasions. I was also blindly pruning SEE < 0 moves at PV nodes, which is dangerous (I was even pruning the TT move...).
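
To make the conditions easier to test in isolation, the same decision can be written as a stand-alone predicate. This is only a sketch that mirrors the snippet quoted above: the structs and field names are hypothetical, and `best_avoids_mate` stands in for the `bestValue > VALUE_MATED_IN_PLY_MAX` comparison.

```cpp
// Hypothetical per-node facts, as seen inside the quiescence search.
struct QNode {
    bool pv_node;           // searching a principal-variation node
    bool in_check;          // side to move is in check
    bool best_avoids_mate;  // a move already searched avoids being mated
    bool can_castle;        // side to move still has castling rights
};

// Hypothetical per-move facts.
struct QMove {
    bool is_capture;
    bool is_promotion;
    bool is_tt_move;  // move suggested by the transposition table
    int see;          // static exchange evaluation of the move
};

// True if the move may be skipped because its SEE value is negative.
constexpr bool prune_negative_see(const QNode& n, const QMove& m) {
    // Non-capture evasions are prunable only once some non-losing reply exists.
    const bool evasion_prunable = n.in_check && n.best_avoids_mate
                               && !m.is_capture && !n.can_castle;
    return !n.pv_node
        && (!n.in_check || evasion_prunable)
        && !m.is_tt_move
        && !m.is_promotion
        && m.see < 0;
}
```

Written this way, the cases mentioned above are easy to check one by one: nothing is pruned at a PV node, the TT move is never pruned, and a capture that answers a check is always searched.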

So after adapting this code (adapting in particular because my SEE, unlike SF's, handles promotions), and adding 2 killer moves and a mate killer, my WAC score is now 253/300 at 2 seconds per move with 16 MB of hash.

Just out of curiosity, I wonder who invented this SEE pruning refinement. Was it first seen in SF, or were there earlier open-source programs doing that? Perhaps Crafty?
