I have recently added checks to the q-search, since several people have reported better results. I have run some tests (not on the cluster) using three versions:
1. old non-check version
2. new qsearch check version, everything else the same
3. new qsearch check version with null-move R=3 rather than the adaptive 2-3 I have been using for many years.
I am looking for a reasonable number of test positions to see how this behaves tactically. I have tried WAC but had no luck: Crafty gets all but a couple in 1 second per move, and at 0.5 seconds per move it is hardly worse. At that time resolution, the "jitter" becomes an issue. I'd like to have a set where the normal Crafty might get 50 out of 100 at something reasonable like 10-15 seconds per move, so that I can determine whether either of the new versions solves more in the same time limit, or solves the same 50 in less time.
Anybody have any favorites that do not have all these "easy for today's programs" positions that need a fraction of a second to find? And no, no Nolot positions as I'd like to experiment and get answers back more quickly than a day per position.
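A minimal sketch of the comparison described above, assuming each run has been reduced to a mapping from position id to time-to-solution in seconds (None when unsolved). The function names and the sample numbers are hypothetical and are not Crafty output.

```python
def solved_within(times, limit):
    """Return {position: seconds} for positions solved within the limit."""
    return {p: t for p, t in times.items() if t is not None and t <= limit}


def compare_runs(old, new, limit):
    """Compare two versions on the same test set and time limit."""
    old_ok = solved_within(old, limit)
    new_ok = solved_within(new, limit)
    common = sorted(set(old_ok) & set(new_ok))
    return {
        "old_solved": len(old_ok),
        "new_solved": len(new_ok),
        # Time spent on positions both versions solve: lower is better.
        "old_time_common": sum(old_ok[p] for p in common),
        "new_time_common": sum(new_ok[p] for p in common),
    }


# Made-up example data, for illustration only.
old = {"pos1": 4.0, "pos2": 12.5, "pos3": None}
new = {"pos1": 3.0, "pos2": 14.0, "pos3": 9.0}
print(compare_runs(old, new, limit=15.0))
```

Reporting both the solved count and the time on commonly solved positions covers the two outcomes of interest: more solutions in the same limit, or the same solutions in less time.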
testing of a different sort
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: testing of a different sort
It seems likely that mate positions will benefit from checks in qsearch.
Here are some mate position sets:
http://cap.connx.com/EPD/Les_Fernandez_ ... th.epd.bz2
http://cap.connx.com/EPD/M20.EPD.bz2
http://cap.connx.com/EPD/MATEIN2.EPD.bz2
http://cap.connx.com/EPD/MATESRCH.EPD.bz2
http://cap.connx.com/EPD/dm001.epd.bz2
http://cap.connx.com/EPD/dm002.epd.bz2
http://cap.connx.com/EPD/dm003.epd.bz2
http://cap.connx.com/EPD/dm004.epd.bz2
http://cap.connx.com/EPD/dm005.epd.bz2
http://cap.connx.com/EPD/dm006.epd.bz2
http://cap.connx.com/EPD/dm007.epd.bz2
http://cap.connx.com/EPD/dm008.epd.bz2
http://cap.connx.com/EPD/dm009.epd.bz2
http://cap.connx.com/EPD/dm010.epd.bz2
http://cap.connx.com/EPD/dm011.epd.bz2
http://cap.connx.com/EPD/dm012.epd.bz2
http://cap.connx.com/EPD/dm013.epd.bz2
http://cap.connx.com/EPD/dm014.epd.bz2
http://cap.connx.com/EPD/dm015.epd.bz2
http://cap.connx.com/EPD/dm016.epd.bz2
http://cap.connx.com/EPD/dm017.epd.bz2
http://cap.connx.com/EPD/dm018.epd.bz2
http://cap.connx.com/EPD/dm019.epd.bz2
http://cap.connx.com/EPD/dm020.epd.bz2
http://cap.connx.com/EPD/dm021.epd.bz2
http://cap.connx.com/EPD/dm022.epd.bz2
http://cap.connx.com/EPD/dm023.epd.bz2
http://cap.connx.com/EPD/dm024.epd.bz2
http://cap.connx.com/EPD/dm025.epd.bz2
http://cap.connx.com/EPD/dm026.epd.bz2
http://cap.connx.com/EPD/dm027.epd.bz2
http://cap.connx.com/EPD/dm028.epd.bz2
http://cap.connx.com/EPD/dm029.epd.bz2
http://cap.connx.com/EPD/dm030.epd.bz2
http://cap.connx.com/EPD/dm031.epd.bz2
http://cap.connx.com/EPD/dm032.epd.bz2
http://cap.connx.com/EPD/dm033.epd.bz2
http://cap.connx.com/EPD/dm034.epd.bz2
http://cap.connx.com/EPD/dm035.epd.bz2
http://cap.connx.com/EPD/dm036.epd.bz2
http://cap.connx.com/EPD/dm037.epd.bz2
http://cap.connx.com/EPD/dm038.epd.bz2
http://cap.connx.com/EPD/dm039.epd.bz2
http://cap.connx.com/EPD/dm040.epd.bz2
http://cap.connx.com/EPD/dm041.epd.bz2
http://cap.connx.com/EPD/dm042.epd.bz2
http://cap.connx.com/EPD/dm043.epd.bz2
http://cap.connx.com/EPD/dm044.epd.bz2
http://cap.connx.com/EPD/dm045.epd.bz2
http://cap.connx.com/EPD/dm046.epd.bz2
http://cap.connx.com/EPD/dm047.epd.bz2
http://cap.connx.com/EPD/dm048.epd.bz2
http://cap.connx.com/EPD/dm050.epd.bz2
http://cap.connx.com/EPD/dm051.epd.bz2
http://cap.connx.com/EPD/dm052.epd.bz2
http://cap.connx.com/EPD/dm053.epd.bz2
http://cap.connx.com/EPD/dm054.epd.bz2
http://cap.connx.com/EPD/dm055.epd.bz2
http://cap.connx.com/EPD/dm056.epd.bz2
http://cap.connx.com/EPD/dm057.epd.bz2
http://cap.connx.com/EPD/dm058.epd.bz2
http://cap.connx.com/EPD/dm060.epd.bz2
http://cap.connx.com/EPD/dm061.epd.bz2
http://cap.connx.com/EPD/dm062.epd.bz2
http://cap.connx.com/EPD/dm063.epd.bz2
http://cap.connx.com/EPD/dm064.epd.bz2
http://cap.connx.com/EPD/dm065.epd.bz2
http://cap.connx.com/EPD/dm066.epd.bz2
http://cap.connx.com/EPD/dm067.epd.bz2
http://cap.connx.com/EPD/dm069.epd.bz2
http://cap.connx.com/EPD/dm070.epd.bz2
http://cap.connx.com/EPD/dm071.epd.bz2
http://cap.connx.com/EPD/dm072.epd.bz2
http://cap.connx.com/EPD/dm074.epd.bz2
http://cap.connx.com/EPD/dm075.epd.bz2
http://cap.connx.com/EPD/dm077.epd.bz2
http://cap.connx.com/EPD/dm082.epd.bz2
http://cap.connx.com/EPD/dm087.epd.bz2
http://cap.connx.com/EPD/dm089.epd.bz2
http://cap.connx.com/EPD/dm092.epd.bz2
http://cap.connx.com/EPD/dm093.epd.bz2
http://cap.connx.com/EPD/dm096.epd.bz2
http://cap.connx.com/EPD/dm1.epd.bz2
http://cap.connx.com/EPD/dm100.epd.bz2
http://cap.connx.com/EPD/dm101.epd.bz2
http://cap.connx.com/EPD/dm102.epd.bz2
http://cap.connx.com/EPD/dm103.epd.bz2
http://cap.connx.com/EPD/dm104.epd.bz2
http://cap.connx.com/EPD/dm105.epd.bz2
http://cap.connx.com/EPD/dm110.epd.bz2
http://cap.connx.com/EPD/dm119.epd.bz2
http://cap.connx.com/EPD/dm120.epd.bz2
http://cap.connx.com/EPD/dm121.epd.bz2
http://cap.connx.com/EPD/dm125.epd.bz2
http://cap.connx.com/EPD/dm126.epd.bz2
http://cap.connx.com/EPD/dm130.epd.bz2
http://cap.connx.com/EPD/dm135.epd.bz2
http://cap.connx.com/EPD/dm14.epd.bz2
http://cap.connx.com/EPD/dm2.epd.bz2
http://cap.connx.com/EPD/dm255.epd.bz2
http://cap.connx.com/EPD/dm3.epd.bz2
http://cap.connx.com/EPD/dm4.epd.bz2
http://cap.connx.com/EPD/dm5.epd.bz2
http://cap.connx.com/EPD/dm9.epd.bz2
http://cap.connx.com/EPD/dmt.epd.bz2
http://cap.connx.com/EPD/dm001.epd.bz2
http://cap.connx.com/EPD/m1.epd.bz2
http://cap.connx.com/EPD/m10.epd.bz2
http://cap.connx.com/EPD/m10a.epd.bz2
http://cap.connx.com/EPD/m11.epd.bz2
http://cap.connx.com/EPD/m12.epd.bz2
http://cap.connx.com/EPD/m15.epd.bz2
http://cap.connx.com/EPD/m16.epd.bz2
http://cap.connx.com/EPD/m2.epd.bz2
http://cap.connx.com/EPD/m2t.epd.bz2
http://cap.connx.com/EPD/m3.epd.bz2
http://cap.connx.com/EPD/m30.epd.bz2
http://cap.connx.com/EPD/m3a.epd.bz2
http://cap.connx.com/EPD/m3t.epd.bz2
http://cap.connx.com/EPD/m7.epd.bz2
http://cap.connx.com/EPD/m8.epd.bz2
http://cap.connx.com/EPD/ma.epd.bz2
http://cap.connx.com/EPD/many.epd.bz2
http://cap.connx.com/EPD/mate.epd.bz2
http://cap.connx.com/EPD/mates.epd.bz2
http://cap.connx.com/EPD/matesm.epd.bz2
http://cap.connx.com/EPD/matetest2.epd.bz2
http://cap.connx.com/EPD/tmat.epd.bz2
http://cap.connx.com/EPD/tmate2.epd.bz2
http://cap.connx.com/EPD/tmates.epd.bz2
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Changing time value resolution
Perhaps you might consider changing the time value resolution in Crafty to allow for very short searches with consistent duration. Symbolic uses microsecond resolution as that's common enough on decent platforms. The CIL Toolkit also uses microsecond resolution when that's supported by the underlying Lisp processor.
With microsecond resolution, elapsed time values from the beginning of the Thompson Epoch (1970.01.01) fit into 64 bit integers with room to spare.
Probably, millisecond resolution should be sufficient.
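A quick check of the 64-bit claim, as a sketch: it uses Python's standard `time.time_ns()` and assumes a signed 64-bit counter; the helper names are made up for illustration.

```python
import time

INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def now_microseconds():
    """Microseconds since 1970-01-01 UTC as an integer."""
    return time.time_ns() // 1_000


def years_representable(ticks_per_second):
    """Years of elapsed time a signed 64-bit tick counter can hold."""
    return INT64_MAX / ticks_per_second / SECONDS_PER_YEAR


print(now_microseconds() < INT64_MAX)
print(round(years_representable(1_000_000)))
```

At one tick per microsecond the counter covers on the order of hundreds of thousands of years, so there is indeed room to spare.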
-
- Posts: 1922
- Joined: Thu Mar 09, 2006 12:51 am
- Location: Earth
Re: Changing time value resolution
Oh no, not this again...
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: testing of a different sort
The ones that are particularly interesting are the positions where late null moves break things. For example positions where we end up with a queen at f6 and a pawn at h6, with the unstoppable mate threat of Qg7#, but if I don't move (null-move) I collapse the search into the quiescence phase and evaluate a lost position as equal. The report has been that checks in the q-search reduce those kinds of errors allowing more aggressive null-move settings. But the mates are also interesting, just so they are not too easy... Measuring fractions of a second is problematic.
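The null-move failure described in this post can be illustrated with a toy tree rather than a real chess position. This is only a sketch under stated assumptions: `Node`, `qsearch`, and the `MATE` constant are hypothetical names, scores are from the side to move's point of view, and the tree models just the position after the null move, where the attacker has no captures but has one checking move that mates.

```python
from dataclasses import dataclass, field

MATE = 10_000


@dataclass
class Node:
    static_eval: int                              # side to move's view
    captures: list = field(default_factory=list)  # child Nodes
    checks: list = field(default_factory=list)    # child Nodes
    in_checkmate: bool = False


def qsearch(node, alpha, beta, use_checks):
    """Fail-soft quiescence search over the toy tree."""
    if node.in_checkmate:
        return -MATE
    best = node.static_eval          # stand pat
    if best >= beta:
        return best
    alpha = max(alpha, best)
    moves = node.captures + (node.checks if use_checks else [])
    for child in moves:
        score = -qsearch(child, -beta, -alpha, use_checks)
        if score > best:
            best = score
            if best >= beta:
                return best
            alpha = max(alpha, best)
    return best


# After the defender's null move: attacker to move, static eval "equal",
# no captures available, but one checking move that delivers mate.
mated = Node(static_eval=0, in_checkmate=True)
after_null = Node(static_eval=0, checks=[mated])

print(qsearch(after_null, -MATE, MATE, use_checks=False))
print(qsearch(after_null, -MATE, MATE, use_checks=True))
```

Without checks the attacker stands pat on the equal evaluation, so the null move looks safe for the defender; with checks the mate is found and the null-move score fails low instead of producing a cutoff.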
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Changing time value resolution
If you use CPU time, you might pull that off. But for elapsed time, which is all I measure, that won't work, since elapsed time is not that accurate. If I am going to compare two things, I need very accurate measurements, and operating system interference can affect very short time measurements. Hence my wanting positions where 15-30 seconds is enough, so that the runs are long enough to wash out the "jitter effect".
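To see the effect for yourself, a rough sketch along these lines measures both elapsed and CPU time for a repeated workload and reports the run-to-run spread. It uses the standard `time.perf_counter()` and `time.process_time()` calls; the `busy` workload is a stand-in assumption, and a real test would run a search instead.

```python
import time


def time_once(work):
    """Return (elapsed_seconds, cpu_seconds) for one call of work()."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    work()
    return time.perf_counter() - wall0, time.process_time() - cpu0


def relative_spread(samples):
    """(max - min) / min: a crude measure of run-to-run jitter."""
    lo, hi = min(samples), max(samples)
    return (hi - lo) / lo


def busy(n):
    """Stand-in workload; a real test would run a search here."""
    return sum(i * i for i in range(n))


for n in (20_000, 2_000_000):
    runs = [time_once(lambda: busy(n)) for _ in range(5)]
    wall = [w for w, _ in runs]
    print(n, "elapsed spread:", round(relative_spread(wall), 3))
```

The numbers depend entirely on the machine and its load, so no particular output should be expected; the point is only to compare the spread of short runs against longer ones.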
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: testing of a different sort
There are also plenty of other tests there besides:
http://cap.connx.com/EPD/
Pull down any that you like and give them a go. Perhaps some of them will have the characteristics you are after.
If zugzwang positions are what you are after, here are some specifics:
zug.epd.bz2
zugged.epd.bz2
zughard.epd.bz2
zugzwang.epd.bz2
It seems to me that the property most desired in this case is to perform general tests without any harm, and to solve problems where a check sequence would reveal trouble faster. I do not think I have ever built an EPD test specifically for that purpose but it does sound useful.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Changing time value resolution
Zach Wegner wrote: Oh no, not this again...
Nope. Said all I intend to say about time jitter here.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Changing time value resolution
How about using a fixed node count limit instead of a wall clock limit?
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Changing time value resolution
sje wrote: How about using a fixed node count limit instead of a wall clock limit?
I don't know how to make it fair. With two different versions, NPS varies differently because of the q-search checks and check evasions... I tried it, but when I compared against the times, each position basically needs a different number of nodes, which was a pain to deal with.
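The fairness problem with a fixed node budget comes down to simple arithmetic. In the sketch below the NPS figures are invented for illustration and the helper names are hypothetical.

```python
def nodes_in(seconds, nps):
    """Nodes searched in a given time at a given speed."""
    return int(seconds * nps)


def seconds_for(nodes, nps):
    """Time needed to search a node budget at a given speed."""
    return nodes / nps


# Hypothetical speeds: the check-extending version is assumed slower per node.
old_nps = 2_000_000
new_nps = 1_600_000

budget = nodes_in(10, old_nps)          # budget calibrated on the old version
print(seconds_for(budget, old_nps))     # time the old version needs
print(seconds_for(budget, new_nps))     # time the new version needs
```

With the same node budget, the slower-per-node version effectively gets more wall-clock time, and because NPS also varies from position to position, a single budget cannot be calibrated once for the whole set.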
I think I have nearly reached a conclusion. The new version finds some things 1 or 2 plies quicker. That's good. But when it does, it still takes as much time as the old version did, except the old version got 1-2 plies deeper. A cluster test is, so far, showing no rating change. If that holds up for 40K games, this is getting the axe in favor of simplicity again, although I may try a little more tweaking here and there to see if there is anything more to be gained. I certainly need to try null R=3 everywhere. That used to be worse; I want to see if the checks make it safe enough to use.
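For readers unfamiliar with the two null-move settings being compared, here is a minimal sketch, assuming a simple depth-threshold rule for the adaptive scheme. The threshold of 6 plies and the function names are assumptions for illustration, not Crafty's actual code.

```python
def null_move_reduction(depth, adaptive=True, threshold=6):
    """Return the null-move reduction R for the remaining depth in plies."""
    if not adaptive:
        return 3                      # "R=3 everywhere"
    return 3 if depth > threshold else 2


def depth_after_null(depth, r):
    """Remaining depth for the null-move search; 0 means drop into q-search."""
    return max(depth - 1 - r, 0)


for depth in (3, 4, 6, 8):
    r_adaptive = null_move_reduction(depth, adaptive=True)
    r_fixed = null_move_reduction(depth, adaptive=False)
    print(depth, depth_after_null(depth, r_adaptive), depth_after_null(depth, r_fixed))
```

Under these assumptions, fixed R=3 reaches the quiescence search one ply earlier than the adaptive scheme at shallow depths, which is where checks in the q-search would have to catch threats the null move hides.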