STS results and errata

jesper_nielsen · Post by **jesper_nielsen** » Mon Aug 31, 2009 11:44 am

Hi Again!

Here are some more STS results from Pupsi2 0.07 on a 1.86 GHz Intel Pentium(R) M laptop.

Fast = 10 seconds per position.
Medium = 1 minute per position.
Slow = 7 minutes per position.

Code:

Solved
       Fast Medium Slow
1.0    57    67    75
2.0    68    79    78
3.0    54    62    71
4.0    64    71    --
5.0    56    61    --

Points
       Fast Medium Slow
1.0    656   724   809
2.0    745   846   863
3.0    664   736   802
4.0    745   806   ---
5.0    629   687   ---

Note: "Solved" is the number of positions where a 10-point move was selected.

Weird that the 2.0 solved count actually goes down with more time, while the points still go up! I wonder how that should be interpreted.

Too bad I already proclaimed 2.0 as Pupsi's favorite test set!
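
One possible reading, sketched in Python: "solved" counts only the positions where a 10-point move was picked, while "points" also credits partial-score alternatives, so the two can move in opposite directions. The function name `score_suite` and the three-position data are made up for illustration; they are not Pupsi output.

```python
def score_suite(choices):
    """Return (solved, points) for a list of (chosen_move, {move: points}) pairs."""
    solved = sum(1 for move, table in choices if table.get(move, 0) == 10)
    points = sum(table.get(move, 0) for move, table in choices)
    return solved, points

# Hypothetical toy suite: three positions with made-up score tables.
tables = [
    {"Nf5": 10, "Nd5": 7},
    {"Re1": 10, "Rd1": 8},
    {"Qb3": 10, "Qc2": 6},
]

# Short search: one 10-point move, two moves that are not in the table at all.
fast = list(zip(["Nf5", "h3", "a4"], tables))
# Long search: no 10-point move, but a partial score in every position.
slow = list(zip(["Nd5", "Rd1", "Qc2"], tables))

print(score_suite(fast))  # (1, 10)
print(score_suite(slow))  # (0, 21)
```

With more time the toy engine trades one full solution for three decent second choices, so solved drops while points rise.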

STS2.0 where the points awarded do not match the "bm" tag:

Code:

2b2n2/2q2p1k/1r6/2pPP1QB/2p1P3/2P4P/1P4P1/2B3K1 b - - bm Ra6; c0 "Rb6g6=10, Kh7h8=3, Qc7b7=5, Rb6a6=4"; id "Open Files & Diagnonals.017";
2r3k1/p1pb2pp/3p1p1r/1p6/4P2q/1PQB2R1/PBP1N1P1/6K1 w - - bm Bc1; c0 "Qa5=10, Bc1=10"; id "Open Files & Diagnonals.026"; 
3r3k/p1q2ppp/5b2/3Pp3/2p4P/P4BQ1/1PP3P1/1K5R w - - bm Be4; c0 "Rh1f1=10, Bf3e4=10, Rh1e1=4, h4h5=1"; id "Open Files & Diagnonals.042"; 
q4rk1/5ppp/p1r4b/2p1p3/Q3P2P/1R4P1/5PK1/3R1B2 w - - bm Bc4; c0 "Rd7=10, Bc4=7, Rf3=3"; id "Open Files & Diagnonals.081"; 
r6k/p3Npb1/q5pp/1p2p3/1B6/PPP3P1/3R1P1P/3R2K1 b - - bm Qe6; c0 "Qf6=10, Qb7=8, Qe6=10"; id "Open Files & Diagnonals.098";
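
A check like this can be automated. The sketch below is only an illustration, not the tool used for the list above: the helper names (`parse_record`, `move_key`, `bm_mismatch`) are hypothetical, and moves are compared by piece letter and destination square only, so "Ra6" and "Rb6a6" count as the same move. Castling, promotions and disambiguation are ignored.

```python
import re

def parse_record(line):
    """Split one STS EPD line into its bm moves and its c0 {move: points} table."""
    bm = re.search(r"\bbm\s+([^;]+);", line).group(1).split()
    c0 = re.search(r'c0\s+"([^"]*)"', line).group(1)
    scores = {}
    for item in c0.split(","):
        move, pts = item.strip().split("=")
        scores[move] = int(pts)  # a move listed twice keeps its last value
    return bm, scores

def move_key(move):
    """Reduce 'Ra6' and 'Rb6a6' to the same (piece, destination) pair."""
    piece = move[0] if move[0].isupper() else "P"
    return piece, move[-2:]

def bm_mismatch(line):
    """True when the bm moves differ from the set of 10-point moves in c0."""
    bm, scores = parse_record(line)
    top = {move_key(m) for m, pts in scores.items() if pts == 10}
    return {move_key(m) for m in bm} != top

record = ('2b2n2/2q2p1k/1r6/2pPP1QB/2p1P3/2P4P/1P4P1/2B3K1 b - - bm Ra6; '
          'c0 "Rb6g6=10, Kh7h8=3, Qc7b7=5, Rb6a6=4"; '
          'id "Open Files & Diagnonals.017";')
print(bm_mismatch(record))  # True
```

Running `bm_mismatch` over every line of a suite file would list the records worth a manual look.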

Kind regards,
Jesper

swami · Post by **swami** » Mon Aug 31, 2009 2:54 pm

It's good to know that Pupsi is really good at open files and diagonals; I reckon that's the toughest of all the test suites for most engines.

But STS 5.0, Bishop vs Knight, happens to be the easiest of all the test suites for many top engines. Pupsi does very well on the toughest suite but only mediocre on the easiest.

STS 4.0, Square Vacancy, is the 2nd toughest of all the test suites for many engines, but for Pupsi it's the 2nd easiest. Also interesting: STS 3.0, Knight Outposts, is the 2nd easiest for many engines, but for Pupsi it's the 2nd toughest.

My observations about Pupsi's results:

I think changing the values for bishop and knight might help a great deal, or implementing code that deals with the knight, since that's where I spot the weakness. It's essential knowledge, and Pupsi seems to be weak at handling the knight.

Pupsi has an exceptional understanding of opening up files and diagonals for rook, queen and bishop, of mobility, of posting queens on vacant squares, etc.

Thanks for the bug reports on STS 2.0. Did you happen to notice any bugs in STS 3.0, 4.0 and 5.0? The points distribution on the thousand-point scale is also interesting.

Regards,
Swami

jesper_nielsen · Post by **jesper_nielsen** » Mon Aug 31, 2009 3:14 pm

Thanks for your observations!

I will definitely look into the knight evaluation of Pupsi. I desperately need a positive result soon!

I have tried many, many things, but none that gave any kind of improvement.

I have not noticed any "best move vs points" bugs in the 3.0 set. And I will go through the 4.0 and 5.0 when the slow results are done.

Kind regards,
Jesper

jesper_nielsen · Post by **jesper_nielsen** » Wed Sep 02, 2009 9:13 am

STS 4.0 results are done.

Pupsi2 0.07 on a 1.86 GHz Intel Pentium(R) M laptop.

Fast = 10 seconds per position.
Medium = 1 minute per position.
Slow = 7 minutes per position.

Code:

Solved
       Fast Medium Slow
1.0    57    67    75
2.0    68    79    78
3.0    54    62    71
4.0    64    71    73
5.0    56    61    --

Points
       Fast Medium Slow
1.0    656   724   809
2.0    745   846   863
3.0    664   736   802
4.0    745   806   829
5.0    629   687   ---

Note: "Solved" is the number of positions where a 10-point move was selected.

I could not find any point vs best move problems in STS 4.0.

Kind regards,
Jesper

swami · Post by **swami** » Wed Sep 02, 2009 9:34 am

jesper_nielsen wrote:STS 4.0 results are done.

Pupsi2 0.07 on a 1.86 GHz Intel Pentium(R) M laptop.

Fast = 10 seconds per position.
Medium = 1 minute per position.
Slow = 7 minutes per position.

Solved
       Fast Medium Slow
1.0    57    67    75
2.0    68    79    78
3.0    54    62    71
4.0    64    71    73
5.0    56    61    --

Slow - Fast:

Improvement in

1.0 = 18
2.0 = 10
3.0 = 17
4.0 = 9
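
The same differences can be recomputed from the posted tables. This is just the arithmetic written out as a small Python sketch; the dictionary layout and variable names are my own choice, and the numbers are copied from the Solved and Points tables above.

```python
# (fast, medium, slow) per suite, copied from the posted tables.
solved = {"1.0": (57, 67, 75), "2.0": (68, 79, 78),
          "3.0": (54, 62, 71), "4.0": (64, 71, 73)}
points = {"1.0": (656, 724, 809), "2.0": (745, 846, 863),
          "3.0": (664, 736, 802), "4.0": (745, 806, 829)}

def slow_minus_fast(table):
    """Gain from the 10-second run to the 7-minute run for each suite."""
    return {suite: slow - fast for suite, (fast, _medium, slow) in table.items()}

print(slow_minus_fast(solved))  # {'1.0': 18, '2.0': 10, '3.0': 17, '4.0': 9}
print(slow_minus_fast(points))  # {'1.0': 153, '2.0': 118, '3.0': 138, '4.0': 84}
```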

Pupsi needs to get faster on v1.0 and v3.0 so that it can solve those positions quickly.

The v4.0 score range seems consistent, but it could be improved at longer time controls. Pupsi has shown a good understanding of open files and diagonals, on par with Spike, Rybka, Bright, Glaurung, etc.

jesper_nielsen wrote:I could not find any point vs best move problems in STS 4.0.

Thanks for trying to find bugs in the tests. Your inputs are much appreciated.

swami · Post by **swami** » Wed Sep 02, 2009 9:39 am

jesper_nielsen wrote:Thanks for your observations!

I will definitely look into the knight evaluation of Pupsi. I desperately need a positive result soon! I have tried many, many things, but none that gave any kind of improvement.

I have not noticed any "best move vs points" bugs in the 3.0 set. And I will go through the 4.0 and 5.0 when the slow results are done.

Kind regards,
Jesper

Pupsi will improve a great deal if it manages to handle the knight well. The results clearly show that its weakness lies in knight management. I hope you find ways to improve the code for placing knights on good squares, repositioning the knight at the right time (meaning when the position is closed rather than open), justified trades of knight for bishop, etc.

Best Regards.
Swami

jesper_nielsen · Post by **jesper_nielsen** » Wed Sep 02, 2009 10:22 am

swami wrote:
Pupsi will improve a great deal if it manages to handle the knight well. The results clearly show that its weakness lies in knight management. I hope you find ways to improve the code for placing knights on good squares, repositioning the knight at the right time (meaning when the position is closed rather than open), justified trades of knight for bishop, etc.

Best Regards.
Swami

I am going to run some tests adjusting the knight value and perhaps the bishop pair bonus this weekend.

I have tried adding outpost bonus for knights, but that actually seemed to hurt performance.

But the knight needs looking into!
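
For reference, here is a minimal sketch of one common outpost definition, not Pupsi's actual code. Everything in it is an assumption made for illustration: a white knight counts as being on an outpost when it stands on rank 4, 5 or 6, is defended by a white pawn, and no black pawn on an adjacent file is still in front of it and able to chase it away. The function names are hypothetical.

```python
def parse_board(fen):
    """Map (file 0-7, rank 1-8) to the piece letter from a FEN string."""
    board = {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        rank, file = 8 - row_index, 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)      # run of empty squares
            else:
                board[(file, rank)] = ch
                file += 1
    return board

def white_knight_outposts(fen):
    """List squares of white knights that satisfy the assumed outpost rule."""
    board = parse_board(fen)
    found = []
    for (f, r), piece in board.items():
        if piece != "N" or r not in (4, 5, 6):
            continue
        # Defended by a white pawn one rank behind on an adjacent file.
        defended = any(board.get((f + df, r - 1)) == "P" for df in (-1, 1))
        # A black pawn ahead on an adjacent file attacks the square now or later.
        attackable = any(board.get((f + df, rr)) == "p"
                         for df in (-1, 1) for rr in range(r + 1, 8))
        if defended and not attackable:
            found.append("abcdefgh"[f] + str(r))
    return sorted(found)

print(white_knight_outposts("4k3/8/8/3N4/2P5/8/8/4K3 w - - 0 1"))     # ['d5']
print(white_knight_outposts("4k3/4p3/8/3N4/2P5/8/8/4K3 w - - 0 1"))   # []
```

How large a bonus such a square deserves, if any, is a tuning question that only testing can answer.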

Keep up the good work!

Kind regards,
Jesper

swami · Post by **swami** » Wed Sep 02, 2009 10:47 am

jesper_nielsen wrote:I am going to run some tests adjusting the knight value and perhaps the bishop pair bonus this weekend.

While you're at it, you may find this resource useful:

http://home.comcast.net/~danheisman/Art ... alance.htm

It may give you some ideas regarding the material values.

Also, repositioning the knight in closed positions is actually tougher to tackle than outposts. The bishop-knight trade-off is easy to improve, I think.

Good luck and have fun experimenting with the changes this weekend!

Best Regards,
Swami

jesper_nielsen · Post by **jesper_nielsen** » Thu Sep 03, 2009 12:41 pm

STS 5.0 results are done.

Pupsi2 0.07 on a 1.86 GHz Intel Pentium(R) M laptop.

Fast = 10 seconds per position.
Medium = 1 minute per position.
Slow = 7 minutes per position.

Code:

Solved
       Fast Medium Slow
1.0    57    67    75
2.0    68    79    78
3.0    54    62    71
4.0    64    71    73
5.0    56    61    66

Points
       Fast Medium Slow
1.0    656   724   809
2.0    745   846   863
3.0    664   736   802
4.0    745   806   829
5.0    629   687   737

Note: "Solved" is the number of positions where a 10-point move was selected.

I could not find any point vs best move problems in STS 5.0 either.

I already have the Larry Kaufman material evaluation values. So I think it must be something else messing up the knight evaluation?!

Kind regards,
Jesper

Dann Corbit · Post by **Dann Corbit** » Fri Sep 04, 2009 8:27 am

jesper_nielsen wrote:
STS2.0 where the points awarded do not match the "bm" tag:
Code:
2b2n2/2q2p1k/1r6/2pPP1QB/2p1P3/2P4P/1P4P1/2B3K1 b - - bm Ra6; c0 "Rb6g6=10, Kh7h8=3, Qc7b7=5, Rb6a6=4"; id "Open Files & Diagnonals.017";
2r3k1/p1pb2pp/3p1p1r/1p6/4P2q/1PQB2R1/PBP1N1P1/6K1 w - - bm Bc1; c0 "Qa5=10, Bc1=10"; id "Open Files & Diagnonals.026"; 
3r3k/p1q2ppp/5b2/3Pp3/2p4P/P4BQ1/1PP3P1/1K5R w - - bm Be4; c0 "Rh1f1=10, Bf3e4=10, Rh1e1=4, h4h5=1"; id "Open Files & Diagnonals.042"; 
q4rk1/5ppp/p1r4b/2p1p3/Q3P2P/1R4P1/5PK1/3R1B2 w - - bm Bc4; c0 "Rd7=10, Bc4=7, Rf3=3"; id "Open Files & Diagnonals.081"; 
r6k/p3Npb1/q5pp/1p2p3/1B6/PPP3P1/3R1P1P/3R2K1 b - - bm Qe6; c0 "Qf6=10, Qb7=8, Qe6=10"; id "Open Files & Diagnonals.098"; 
Kind regards,
Jesper

There are clearly problems with those records. Thanks for your investigations. I have been slowly removing the manual steps from my QA process, but there are still things that can go wrong even now.

The first record is just plain wrong and should be removed from the set.
The other three records have alternate solutions and should probably be replaced as well.

2b2n2/2q2p1k/1r6/2pPP1QB/2p1P3/2P4P/1P4P1/2B3K1 b - - bm Rg8c8; c0 "Rg8c8=10, Kh8=3, Ng6=5, Qb7=5, Ra6=4, Rg6=4"; id "Open Files & Diagnonals.017";
2r3k1/p1pb2pp/3p1p1r/1p6/4P2q/1PQB2R1/PBP1N1P1/6K1 w - - bm Bc1 Qa5; c0 "Qa5=10, Bc1=10"; id "Open Files & Diagnonals.026";
3r3k/p1q2ppp/5b2/3Pp3/2p4P/P4BQ1/1PP3P1/1K5R w - - bm Be4 Rf1; c0 "Rf1=10, Be4=10, Re1=4, Rf1=5, h5=1"; id "Open Files & Diagnonals.042";
q4rk1/5ppp/p1r4b/2p1p3/Q3P2P/1R4P1/5PK1/3R1B2 w - - bm Bc4 Rd7; c0 "Rd7=10, Bc4=10, Rf3=7"; id "Open Files & Diagnonals.081";
r6k/p3Npb1/q5pp/1p2p3/1B6/PPP3P1/3R1P1P/3R2K1 b - - bm Qe6 Qf6; c0 "Qf6=10, Qb7=8, Qe6=10"; id "Open Files & Diagnonals.098";
