Request for thoughts: SEE "Test Set"

AndrewGrant · Post by **AndrewGrant** » Tue May 24, 2022 4:41 am

I would like to generate // mine from existing games, a list of a few million positions with an associated move. Each entry into this big list will have the following: <fen> <some_move> <expected_see_score>. (What about <fen> <square> <score>?) The score will follow the normal conventions, where the pieces are { +1, +3, +3, +5, +9 }. These SEE scores that get computed need to be "perfect". They should be the scores that any of our engines would return, if no one was concerned about the run-time cost of the SEE() functions.

A few topics:
1. How to generate such a list? How to select from existing PGNs?
2. What does "perfect" really mean for SEE? Blockers? Pinners? Checks?
3. What is the interface for this via UCI?
4. Engines don't use 1/3/3/5/9. What to do about that?

Would be nice to know "what % of SEE positions do you compute correctly", with various amounts of effort.
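
To make the proposed record format and the "% correct" measurement concrete, here is a minimal C++ sketch. It assumes one whitespace-separated entry per line, the move in coordinate notation (e.g. `h5g4`), and the score in pawn units; the names `SeeEntry`, `parseEntry` and `accuracyPercent` are illustrative only and not taken from any engine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One record of the proposed test set: "<fen> <move> <score>".
// Assumption: fields are separated by single spaces and there is no
// trailing whitespace; the score is an integer in pawn units.
struct SeeEntry {
    std::string fen;
    std::string move;
    int score;
};

// A FEN itself contains spaces, so split from the right:
// last token = score, second-to-last = move, the remainder = FEN.
inline SeeEntry parseEntry(const std::string& line) {
    const std::size_t scorePos = line.find_last_of(' ');
    assert(scorePos != std::string::npos && scorePos > 0);
    const std::size_t movePos = line.find_last_of(' ', scorePos - 1);
    assert(movePos != std::string::npos);
    return SeeEntry{
        line.substr(0, movePos),
        line.substr(movePos + 1, scorePos - movePos - 1),
        std::stoi(line.substr(scorePos + 1)),
    };
}

// Percentage of entries for which a given SEE implementation returns
// exactly the expected score. `see` is any callable (fen, move) -> int.
template <typename SeeFn>
double accuracyPercent(const std::vector<SeeEntry>& entries, SeeFn see) {
    if (entries.empty()) return 0.0;
    std::size_t correct = 0;
    for (const SeeEntry& e : entries)
        if (see(e.fen, e.move) == e.score) ++correct;
    return 100.0 * static_cast<double>(correct) / static_cast<double>(entries.size());
}
```

Running `accuracyPercent` once per SEE variant (with and without pin handling, for instance) over the same entries would give the comparison asked for above.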

Henk · Post by **Henk** » Tue May 24, 2022 10:18 am

Each time I used SEE in my chess program there was no gain: computing SEE was too expensive. I am unlikely to try it again.
The same holds for the pawn table. Maybe my evaluation is too cheap.

dangi12012 · Post by **dangi12012** » Tue May 24, 2022 11:09 am

AndrewGrant wrote: ↑Tue May 24, 2022 4:41 am I would like to generate // mine from existing games, a list of a few million positions with an associated move. Each entry into this big list will have the following: <fen> <some_move> <expected_see_score>. (What about <fen> <square> <score>?) The score will follow the normal conventions, where the pieces are { +1, +3, +3, +5, +9 }. These SEE scores that get computed need to be "perfect". They should be the scores that any of our engines would return, if no one was concerned about the run-time cost of the SEE() functions.

A few topics:
1. How to generate such a list? How to select from existing PGNs?
2. What does "perfect" really mean for SEE? Blockers? Pinners? Checks?
3. What is the interface for this via UCI?
4. Engines don't use 1/3/3/5/9. What to do about that?

Would be nice to know "what % of SEE positions do you compute correctly", with various amounts of effort.

What helped me with a similar project was pgn-extract. With it you can take any big PGN (like the Lichess open database dump) and print every FEN in it, line by line.
With the full FEN, it's trivial to calculate what you are asking for by piping into a small custom application that prints the SEE.

Code:

pgn-extract "file.pgn" | SEE_calc > out.txt
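
A small filter such as the hypothetical `SEE_calc` above would begin by decoding each FEN line it reads. The following is only a sketch of that first step, assuming a plain 64-entry board array; `Board`, `parsePlacement` and `squareIndex` are made-up names, and the SEE computation itself is left out.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Board as 64 chars indexed a1 = 0, b1 = 1, ..., h8 = 63; '.' marks an empty square.
using Board = std::array<char, 64>;

// Parse only the piece-placement field (the first field) of a FEN string.
inline Board parsePlacement(const std::string& fen) {
    Board board;
    board.fill('.');
    int rank = 7, file = 0;  // FEN lists rank 8 first, files a..h within a rank
    for (char c : fen) {
        if (c == ' ') break;                       // end of the placement field
        if (c == '/') { --rank; file = 0; continue; }
        if (std::isdigit(static_cast<unsigned char>(c))) { file += c - '0'; continue; }
        if (rank < 0 || file > 7) throw std::invalid_argument("malformed FEN: " + fen);
        board[rank * 8 + file] = c;                // piece letter, e.g. 'P' or 'q'
        ++file;
    }
    return board;
}

// Convert a square name such as "g4" to its 0..63 index.
inline int squareIndex(const std::string& sq) {
    assert(sq.size() == 2);
    return (sq[1] - '1') * 8 + (sq[0] - 'a');
}
```

From this array a tool can collect the attackers and defenders of a target square and feed them to whatever SEE routine is being tested.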

Luecx · Post by **Luecx** » Tue May 24, 2022 11:33 am

dangi12012 wrote: ↑Tue May 24, 2022 11:09 am
AndrewGrant wrote: ↑Tue May 24, 2022 4:41 am I would like to generate // mine from existing games, a list of a few million positions with an associated move. Each entry into this big list will have the following: <fen> <some_move> <expected_see_score>. (What about <fen> <square> <score>?) The score will follow the normal conventions, where the pieces are { +1, +3, +3, +5, +9 }. These SEE scores that get computed need to be "perfect". They should be the scores that any of our engines would return, if no one was concerned about the run-time cost of the SEE() functions.

A few topics:
1. How to generate such a list? How to select from existing PGNs?
2. What does "perfect" really mean for SEE? Blockers? Pinners? Checks?
3. What is the interface for this via UCI?
4. Engines don't use 1/3/3/5/9. What to do about that?

Would be nice to know "what % of SEE positions do you compute correctly", with various amounts of effort.
What helped me with a similar project was pgn-extract. With it you can take any big PGN (like the Lichess open database dump) and print every FEN in it, line by line.
With the full FEN, it's trivial to calculate what you are asking for by piping into a small custom application that prints the SEE.
Code:
pgn-extract "file.pgn" | SEE_calc > out.txt

I have used the following for Koivisto. I excluded promotions:

Code:

    // clang-format off
    verifySEECase("4R3/2r3p1/5bk1/1p1r3p/p2PR1P1/P1BK1P2/1P6/8 b - -", genMove(H5,G4,CAPTURE, BLACK_PAWN, WHITE_PAWN), 0);
    verifySEECase("4R3/2r3p1/5bk1/1p1r1p1p/p2PR1P1/P1BK1P2/1P6/8 b - -", genMove(H5,G4,CAPTURE, BLACK_PAWN, WHITE_PAWN), 0);
    verifySEECase("4r1k1/5pp1/nbp4p/1p2p2q/1P2P1b1/1BP2N1P/1B2QPPK/3R4 b - -", genMove(G4, F3, CAPTURE, BLACK_BISHOP, WHITE_KNIGHT), 0);
    verifySEECase("2r1r1k1/pp1bppbp/3p1np1/q3P3/2P2P2/1P2B3/P1N1B1PP/2RQ1RK1 b - -", genMove(D6, E5, CAPTURE, BLACK_PAWN, WHITE_PAWN), 100);
    verifySEECase("7r/5qpk/p1Qp1b1p/3r3n/BB3p2/5p2/P1P2P2/4RK1R w - -", genMove(E1, E8, QUIET, WHITE_ROOK), 0);
    verifySEECase("6rr/6pk/p1Qp1b1p/2n5/1B3p2/5p2/P1P2P2/4RK1R w - -", genMove(E1, E8, QUIET, WHITE_ROOK), -500);
    verifySEECase("7r/5qpk/2Qp1b1p/1N1r3n/BB3p2/5p2/P1P2P2/4RK1R w - -",  genMove(E1, E8, QUIET, WHITE_ROOK), -500);
//    verifySEECase("6RR/4bP2/8/8/5r2/3K4/5p2/4k3 w - -", genMove(F7, F8, QUEEN_PROMOTION, WHITE_PAWN), 225);
//    verifySEECase("6RR/4bP2/8/8/5r2/3K4/5p2/4k3 w - -", genMove(F7, F8, KNIGHT_PROMOTION, WHITE_PAWN), 225);
//    verifySEECase("7R/5P2/8/8/8/3K2r1/5p2/4k3 w - -", genMove(F7, F8, QUEEN_PROMOTION, WHITE_PAWN), 900);
//    verifySEECase("7R/5P2/8/8/8/3K2r1/5p2/4k3 w - -", genMove(F7, F8, BISHOP_PROMOTION, WHITE_PAWN), 225);
//    verifySEECase("7R/4bP2/8/8/1q6/3K4/5p2/4k3 w - -", genMove(F7, F8, ROOK_PROMOTION, WHITE_PAWN), -100);
    verifySEECase("8/4kp2/2npp3/1Nn5/1p2PQP1/7q/1PP1B3/4KR1r b - -", genMove(H1, F1, CAPTURE, BLACK_ROOK, WHITE_ROOK), 0);
    verifySEECase("8/4kp2/2npp3/1Nn5/1p2P1P1/7q/1PP1B3/4KR1r b - -", genMove(H1, F1, CAPTURE, BLACK_ROOK, WHITE_ROOK), 0);
    verifySEECase("2r2r1k/6bp/p7/2q2p1Q/3PpP2/1B6/P5PP/2RR3K b - -", genMove(C5, C1, CAPTURE, BLACK_QUEEN, WHITE_ROOK), 2*500-1000);
    verifySEECase("r2qk1nr/pp2ppbp/2b3p1/2p1p3/8/2N2N2/PPPP1PPP/R1BQR1K1 w kq -", genMove(F3,E5,CAPTURE,WHITE_KNIGHT,BLACK_PAWN), 100);
    verifySEECase("6r1/4kq2/b2p1p2/p1pPb3/p1P2B1Q/2P4P/2B1R1P1/6K1 w - -", genMove(F4, E5, CAPTURE, WHITE_BISHOP, BLACK_BISHOP), 0);
//    verifySEECase("3q2nk/pb1r1p2/np6/3P2Pp/2p1P3/2R4B/PQ3P1P/3R2K1 w - h6", genMove(G5, H6, EN_PASSANT, WHITE_PAWN, BLACK_PAWN), 0);
//    verifySEECase("3q2nk/pb1r1p2/np6/3P2Pp/2p1P3/2R1B2B/PQ3P1P/3R2K1 w - h6", genMove(G5, H6, EN_PASSANT, WHITE_PAWN, BLACK_PAWN), 100);
    verifySEECase("2r4r/1P4pk/p2p1b1p/7n/BB3p2/2R2p2/P1P2P2/4RK2 w - -",  genMove(C3, C8, CAPTURE, WHITE_ROOK, BLACK_ROOK), 500);
    verifySEECase("2r5/1P4pk/p2p1b1p/5b1n/BB3p2/2R2p2/P1P2P2/4RK2 w - -", genMove(C3, C8, CAPTURE, WHITE_ROOK, BLACK_ROOK), 500);
    verifySEECase("2r4k/2r4p/p7/2b2p1b/4pP2/1BR5/P1R3PP/2Q4K w - -", genMove(C3, C5, CAPTURE, WHITE_ROOK, BLACK_BISHOP), 325);
    verifySEECase("8/pp6/2pkp3/4bp2/2R3b1/2P5/PP4B1/1K6 b - -", genMove(E5, C3, CAPTURE, BLACK_BISHOP, WHITE_PAWN), 100-325);
    verifySEECase("4q3/1p1pr1k1/1B2rp2/6p1/p3PP2/P3R1P1/1P2R1K1/4Q3 b - -", genMove(E6, E4, CAPTURE, BLACK_ROOK, WHITE_PAWN), 100-500);
    verifySEECase("4q3/1p1pr1kb/1B2rp2/6p1/p3PP2/P3R1P1/1P2R1K1/4Q3 b - -", genMove(H7, E4, CAPTURE, BLACK_BISHOP, WHITE_PAWN), 100);

Rebel · Post by **Rebel** » Tue May 24, 2022 1:05 pm

[fen]rn1qkb1r/p1p2pp1/3p1n1p/1B2p1B1/1p2P1b1/2NP1N1P/PPPQ1PP1/R3K2R b KQkq - -[/fen]

Somewhere I have a PGN utility that reports hanging pieces for White and Black. For the position above it produces something like this:

rn1qkb1r/p1p2pp1/3p1n1p/1B2p1B1/1p2P1b1/2NP1N1P/PPPQ1PP1/R3K2R b KQkq - w1=b5xe8=inf; w2=h3xg4=3; b1=h6xg5=2; b2=b4xc3=2;

Code:

w1=b5xe8=inf; // white attacks the black king, see value inf
w2=h3xg4=3;   // white attacks the black bishop on g4, see value 3
b1=h6xg5=2;   // black attacks the white bishop on g5, see value 2 since it is defended by the white queen
b2=b4xc3=2;   // black attacks the white knight on c3, see value 2 since the knight is defended.

The util does not take pins into consideration. I could generate xx million such entries and offer them for download.
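
For reference, a pin-agnostic exchange value of this kind can be computed with the usual swap list. The sketch below is not the utility described above; it assumes the attackers and defenders of the square have already been collected as plain piece values (1/3/3/5/9), ignores pins, checks and x-rays, and uses the made-up name `seeSwap`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain swap evaluation on one square.
// target    : value of the piece standing on the square
// attackers : values of the pieces of the side making the first capture
// defenders : values of the opponent's pieces that can recapture
// Each side captures with its least valuable piece first and may stop
// whenever continuing would lose material. The first capture is forced.
inline int seeSwap(int target, std::vector<int> attackers, std::vector<int> defenders) {
    assert(!attackers.empty());
    std::sort(attackers.begin(), attackers.end());
    std::sort(defenders.begin(), defenders.end());

    std::vector<int> gain;          // gain[i]: speculative balance after capture i
    gain.push_back(target);
    int onSquare = attackers[0];    // piece now standing on the square
    std::size_t a = 1, d = 0;       // next unused attacker / defender
    bool defenderToMove = true;

    while (true) {
        std::vector<int>& side = defenderToMove ? defenders : attackers;
        std::size_t& idx = defenderToMove ? d : a;
        if (idx >= side.size()) break;           // nobody left to recapture
        gain.push_back(onSquare - gain.back());  // take the piece on the square
        onSquare = side[idx++];
        defenderToMove = !defenderToMove;
    }

    // Walk back: at each step the side to move may decline the capture.
    for (std::size_t i = gain.size() - 1; i > 0; --i)
        gain[i - 1] = -std::max(-gain[i - 1], gain[i]);
    return gain[0];
}
```

With these inputs, `seeSwap(3, {1}, {3, 9})` (pawn takes the bishop on g5, defended by the knight on f3 and the queen) and `seeSwap(3, {1}, {1, 9})` (pawn takes the knight on c3, defended by the b2 pawn and the queen) both evaluate to 2, matching the b1 and b2 entries.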

Sopel · Post by **Sopel** » Tue May 24, 2022 1:11 pm

1. Since search visits more, and different kinds of, nodes than PGNs contain, it may be better to collect the positions from the search tree itself.
4. If you want to measure how good the engine's approximation is, you should use the same piece values the engine uses in its SEE.

abulmo2 · Post by **abulmo2** » Tue May 24, 2022 4:12 pm

The difficult part is getting the right SEE value. For example:
[fen]5rk1/5pp1/2r4p/5b2/2R5/6Q1/R1P1qPP1/5NK1 b - - 0 1[/fen]
For the move Bf5xc2, the SEE value depends on which white rook takes the bishop first. In the sequence Bf5xc2 [Rc4xc2 Rc6xc2 Ra2xc2 Qe2xc2] the capture would be scored +1 and erroneously considered a good capture. In the sequence Bf5xc2 Ra2xc2 [Qe2xc2 Rc4xc2 Rc6xc2] the capture would be scored -2 and correctly considered a bad capture. It is difficult to have a SEE that is both fast and precise on all kinds of positions.
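
The order dependence can be checked by evaluating each capture line separately. The sketch below takes the values of the pieces captured along one fixed line and lets either side stop as soon as continuing would lose material; `bestFrom` and `seeOfLine` are made-up names, and the piece values are passed in directly so that different scales can be compared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Best result for the side about to make capture number i on a fixed
// capture line, given that it may also decline and stop the exchange.
inline int bestFrom(const std::vector<int>& captured, std::size_t i) {
    if (i >= captured.size()) return 0;
    return std::max(0, captured[i] - bestFrom(captured, i + 1));
}

// Value of the first capture for the side making it. The first capture is
// forced because it is the move being scored; captured[i] is the value of
// the piece taken by the i-th capture of the line.
inline int seeOfLine(const std::vector<int>& captured) {
    assert(!captured.empty());
    return captured[0] - bestFrom(captured, 1);
}
```

With 1/3/3/5/9, the Rc4-first line captures P, B, R, R, R, so `seeOfLine({1, 3, 5, 5, 5})` gives +1: White does better not recapturing at all. The Ra2-first line captures P, B, R, Q, R, and `seeOfLine({1, 3, 5, 9, 5})` gives -1, because giving the queen for two rooks still nets a point. The -2 quoted above results when the queen is worth at least two rooks, e.g. `seeOfLine({1, 3, 5, 10, 5})`, which is what the `2*500-1000` expression in the Koivisto cases appears to assume. Either way the capture is bad only if White recaptures with the a2 rook first.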

JVMerlino · Post by **JVMerlino** » Tue May 24, 2022 5:38 pm

Perhaps I misunderstand your goal, but you shouldn't need the "expected SEE score", but rather just "bad capture" and "not a bad capture".

AndrewGrant · Post by **AndrewGrant** » Tue May 24, 2022 9:53 pm

abulmo2 wrote: ↑Tue May 24, 2022 4:12 pm The difficult part is getting the right SEE value. For example:
[fen]5rk1/5pp1/2r4p/5b2/2R5/6Q1/R1P1qPP1/5NK1 b - - 0 1[/fen]
For the move Bf5xc2, the SEE value depends on which white rook takes the bishop first. In the sequence Bf5xc2 [Rc4xc2 Rc6xc2 Ra2xc2 Qe2xc2] the capture would be scored +1 and erroneously considered a good capture. In the sequence Bf5xc2 Ra2xc2 [Qe2xc2 Rc4xc2 Rc6xc2] the capture would be scored -2 and correctly considered a bad capture. It is difficult to have a SEE that is both fast and precise on all kinds of positions.

What you said is exactly why I want this test set. I would not expect any engine to figure out the rook interplay here. I'm not trying to get a set and hit it with 100% accuracy. I just want to see what % you get right with various methods. If I exclude pinned pieces, do I go from 90% to 95%? Or is it simply 90% to 91%?

I have no interest in a perfect SEE for an engine -- just for marking positions for a later suite of tests.

jdart · Post by **jdart** » Mon May 30, 2022 6:01 pm

I have some unit tests for see in this file: https://github.com/jdart1/arasan-chess/ ... c/unit.cpp
