Oscar will need the capability of processing work unit files made from FEN records, so to that end I've installed a first attempt at automated FEN batch file processing into the program. The initial test file is WAC.fen as referenced above, and the test is to calculate the perft(n) for each position and report the total of these calculations for each depth.
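The batch-processing loop described above can be sketched as follows. This is a minimal illustration, not Oscar's actual code: the perft callable is a hypothetical stand-in for the engine's move-generator-backed perft routine, and the function name is invented for this sketch.

```python
def process_fen_batch(lines, depth, perft):
    """Run perft(depth) on each FEN record in an iterable of text lines
    and return the grand total across all positions.

    `perft` is a caller-supplied callable taking (fen, depth); here it
    stands in for the engine's real perft routine.
    """
    total = 0
    for line in lines:
        fen = line.strip()
        if not fen:
            continue  # skip blank lines in the batch file
        total += perft(fen, depth)
    return total
```

Running this once per depth over the same file yields the per-depth totals to report.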
Yes, I did recall my earlier perft() runs on WAC but couldn't locate the results.
Checking and debugging differences between the outputs of different programs requires a common format. I will be using the existing work unit format; it's easy to generate and easy to parse. Because all the data is generated by the same formatting routine, a simple text comparison between two files is sufficient to spot any differences.
Work unit input (7 fields): FEN occurrences
Work unit output (9 fields): FEN occurrences perft(7) occurrences*perft(7)
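The field counts work out because a FEN record itself occupies six whitespace-separated fields; the input adds an occurrence count (seven fields) and the output appends the perft value and the occurrences*perft product (nine fields). A hedged sketch of a parser and formatter for these records, assuming whitespace-separated fields (the function names are illustrative, not Oscar's):

```python
def parse_work_unit(line):
    """Split an input work-unit record into (fen, occurrences).

    The first six whitespace-separated fields form the FEN;
    the seventh is the occurrence count.
    """
    fields = line.split()
    fen = " ".join(fields[:6])
    occurrences = int(fields[6])
    return fen, occurrences

def format_work_unit_result(fen, occurrences, perft_n):
    """Emit the nine-field output record:
    FEN, occurrences, perft(n), occurrences * perft(n)."""
    return f"{fen} {occurrences} {perft_n} {occurrences * perft_n}"
```

Since both programs would emit records through the same kind of formatter, a plain `diff` of the two output files is enough to localize any disagreement.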
Soon, I'll have Oscar process the 68-record work unit wu7.964; the output should match Symbolic's output byte-for-byte.
Since Oscar will be made open source under the GPL, others may use it to debug their programs -- or to spot bugs in Oscar itself.
sje wrote:Yes, I did recall my earlier perft() runs on WAC but couldn't locate the results.
I checked the first 30 positions of WAC with JetChess (depth 3 only) and all the results agree with Richard's. Assuming JetChess is right, I recommend you check each of the first 30 positions; if your results do not match Richard's first 30 depth-3 results (which are also my results), then you have a problem. Ideally it would be better to check all the positions, but I have better things to do than check each position manually (the only way I know how to do it).
Furthermore, I isolated Richard's depth 5 results and summed them, obtaining the same total he did. That is, he did not go wrong in the sum.
I am not claiming anybody is right, only that the first 30 depth-3 results in Richard's post seem correct. Good luck, Steven.
Back in the Old Days, some workers would edit the position data in the test suites for various reasons, but neglect to inform others of the changes. Different versions of the same suite made comparing calculation results hazardous. Now we have the web and can be more sure of using canonical data.