Perft(14) revisited

sje · Post by **sje** » Fri Aug 08, 2014 4:32 pm

Back on 2013-04-02, Peter Österlund reported that perft(14) = 61,885,021,521,585,529,237.

I have not yet seen a confirmation of this, nor any intermediate results which would constitute a proof.

My thought is that any calculation of this magnitude needs some sort of independent validation. To this end, I've started a perft(14) effort.

Notes:

There are 96,400,068 unique ply 7 positions from the initial array. Many of these occur only once among the 3,195,901,860 perft(7) tree terminal nodes, but some appear much more often. The maximal occurrence count is 2,600 and there are two positions which occur this often:

rnbqkbnr/pppp1ppp/4p3/8/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4
rnbqkbnr/pppp1ppp/8/4p3/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4

To calculate perft(14), one way is to list all of the 96,400,068 unique(7) positions and their occurrence counts, then calculate perft(7) for each of these, then sum the products of the occurrence count and perft(7) for each position.

For the above two positions (FEN, occurrence count, perft(7), product):

rnbqkbnr/pppp1ppp/4p3/8/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4 2600 43928908948 114215163264800
rnbqkbnr/pppp1ppp/8/4p3/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4 2600 44948569301 116866280182600

The next six most frequent positions and their [occurrence count, perft(7), product] triad:

rnbqkbnr/pppp1ppp/4p3/8/4P3/3P4/PPP2PPP/RNBQKBNR b KQkq - 2 4 2166 30198759357 65410512767262
rnbqkbnr/pppp1ppp/8/4p3/4P3/3P4/PPP2PPP/RNBQKBNR b KQkq - 2 4 2166 25540143800 55319951470800
rnbqkbnr/ppp1pppp/3p4/8/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4 2248 31150577721 70026498716808
rnbqkbnr/pppp1ppp/4p3/8/3P4/4P3/PPP2PPP/RNBQKBNR b KQkq - 2 4 2258 33424915802 75473459880916
rnbqkbnr/pppp1ppp/8/4p3/3P4/4P3/PPP2PPP/RNBQKBNR b KQkq - 2 4 2258 39795827246 89858977921468
rnbqkbnr/ppp1pppp/8/3p4/3PP3/8/PPP2PPP/RNBQKBNR b KQkq - 2 4 2368 39377074544 93244912520192

Adding all of the 96,400,068 products will give perft(14), which I estimate to be in the neighborhood of 6x10^19 and will overflow a 64 bit counter, so a special summation will be needed.

I have divided the 96,400,068 unique ply 7 positions into 964 work unit files of 100,000 positions each and a 965th file with 68 positions. Each work unit record has a position FEN followed by its occurrence count. The gzip (".gz" suffix) versions of these can be found at:

https://dl.dropboxusercontent.com/u/316 ... wu7.000.gz
https://dl.dropboxusercontent.com/u/316 ... wu7.001.gz
...
https://dl.dropboxusercontent.com/u/316 ... wu7.964.gz

Result files will be posted as well. For every record in a work unit file, there is a corresponding record in the result file. Each result record is an echo of the input record followed by the perft(7) of the position and the product of it and the occurrence count. Currently, only the result file for the very last workunit is available. It's at:

https://dl.dropboxusercontent.com/u/316 ... 964.sum.gz

The 965 work unit files have passed tests which used them to calculate perft(n), n = 7 through 10.

Calculating perft(14) with my current setup will take years. My idea for posting the workunit links is to stimulate interest for assistance from those with fast machines and a program which can read FEN and can do 100,000 perft(7) calculations in a reasonable time. Another idea is for me to try and squeeze some old perft() code into something OpenCL can digest and run it with my NVIDIA GeForce GTX 775M (2 GB GDDR5 memory, 1,344 CUDA cores). The code I have now uses recursion and that's not allowed with OpenCL kernels.

sje · Post by **sje** » Fri Aug 08, 2014 4:42 pm

I have machines working on work units wu7.000 through wu7.004. Current progress (completed perft(7) calculation counts; each work unit has 100,000 records):

wu7.000: 2,734
wu7.001: 54,461
wu7.002: 1,522
wu7.003: 2,490
wu7.004: 555

The wu7.001 run began about twelve days ago and is running on a dual core box. The others are all running on quad core boxes and were just stared within the past few hours.

sje · Post by **sje** » Fri Aug 08, 2014 4:46 pm

I calculated perft(7) = 3,195,901,860 back about 1989 on a Sun Microsystems 3/180 workstation which had a 16 MHz Motorola 68020 CPU. The run took about 36 hours. My bitboard program, developed on a 1986 Mac Plus and written in C, did not use transposition table assistance nor did it have any kind of 64 bit support. I used signed integer four byte variables, so I had to manually add the 20 perft(6) subtotals to get the final answer.

Today I have a quad core 64 bit machine with 32 GB RAM, and it can calculate perft(7) in about 1.6 seconds, a speed-up factor of about 80,000. Where will we be in another quarter century?

zullil · Post by **zullil** » Fri Aug 08, 2014 6:13 pm

sje wrote:Back on 2013-04-02, Peter Österlund reported that perft(14) = 61,885,021,521,585,529,237.

I have not yet seen a confirmation of this, nor any intermediate results which would constitute a proof.

My thought is that any calculation of this magnitude needs some sort of independent validation. To this end, I've started a perft(14) effort.

Forgive me, but why is this tabulation worth doing at all?

Will new techniques be used? I mean, what's the point?

(This coming from someone whose published research focuses on abstruse topics in low-dimensional topology and mathematical knot theory.

)

sje · Post by **sje** » Fri Aug 08, 2014 6:29 pm

zullil wrote:Forgive me, but why is this tabulation worth doing at all?

Why do some people climb mountains?

Establishing correct perft(n) values supplies data for benchmarking and for program verification. It also allows for comparing performance of different algorithms along with comparisons of different hardware.

A perft() routine is a good place to have hooks for test code, and such can be used to gather timing statistics over a repeatable, known set of positions. One example here is to do deep perft() runs starting with an endgame positions to time tablebase access code. Other examples include selective activation of differing versions of transposition table code to generate comparative timing performance data.

Calculating perft(14) today may be somewhat onerous, but in ten years it might be easy. But those coders ten years from now doing testing will need to know what the value of perft(14) truly is, so someone here and now must provide the answer.

Brunetti · Post by **Brunetti** » Fri Aug 08, 2014 8:07 pm

sje wrote:To calculate perft(14), one way is to list all of the 96,400,068 unique(7) positions and their occurrence counts, then calculate perft(7) for each of these, then sum the products of the occurrence count and perft(7) for each position.

Why exactly 7+7? I mean, why shouldn't it take less effort with, say, 8+6? And couldn't it be even better to do that with a sort of iterative deepening? You calculate perft 1 of your 96M positions, then group them and so on. Just asking, never tested such a thing

Alex

zullil · Post by **zullil** » Fri Aug 08, 2014 8:42 pm

sje wrote:
zullil wrote:Forgive me, but why is this tabulation worth doing at all?
Why do some people climb mountains?

Establishing correct perft(n) values supplies data for benchmarking and for program verification. It also allows for comparing performance of different algorithms along with comparisons of different hardware.

A perft() routine is a good place to have hooks for test code, and such can be used to gather timing statistics over a repeatable, known set of positions. One example here is to do deep perft() runs starting with an endgame positions to time tablebase access code. Other examples include selective activation of differing versions of transposition table code to generate comparative timing performance data.

Calculating perft(14) today may be somewhat onerous, but in ten years it might be easy. But those coders ten years from now doing testing will need to know what the value of perft(14) truly is, so someone here and now must provide the answer.

Seems like there are better mountains to climb, but to each his own. Good luck.

Adam Hair · Post by **Adam Hair** » Fri Aug 08, 2014 9:13 pm

zullil wrote:
sje wrote:
zullil wrote:Forgive me, but why is this tabulation worth doing at all?
Why do some people climb mountains?

Establishing correct perft(n) values supplies data for benchmarking and for program verification. It also allows for comparing performance of different algorithms along with comparisons of different hardware.

A perft() routine is a good place to have hooks for test code, and such can be used to gather timing statistics over a repeatable, known set of positions. One example here is to do deep perft() runs starting with an endgame positions to time tablebase access code. Other examples include selective activation of differing versions of transposition table code to generate comparative timing performance data.

Calculating perft(14) today may be somewhat onerous, but in ten years it might be easy. But those coders ten years from now doing testing will need to know what the value of perft(14) truly is, so someone here and now must provide the answer.
Seems like there are better mountains to climb, but to each his own. Good luck.

Most of us around here are wasting electrons in some manner, homfly!

Sorry. When I saw that you had an interest in knot theory I could not resist making a joke

sje · Post by **sje** » Fri Aug 08, 2014 10:01 pm

Brunetti wrote:
sje wrote:To calculate perft(14), one way is to list all of the 96,400,068 unique(7) positions and their occurrence counts, then calculate perft(7) for each of these, then sum the products of the occurrence count and perft(7) for each position.
Why exactly 7+7? I mean, why shouldn't it take less effort with, say, 8+6? And couldn't it be even better to do that with a sort of iterative deepening? You calculate perft 1 of your 96M positions, then group them and so on.

There is a trade-off arising from the combination of work unit size, the number of work units, the need to share work units (and results), the yield from the use of a two-way associative transposition table of a particular size (which needs to be reset with each position for maximum efficiency), typical CPU throughput, ease of checkpoint/restart, etc.

zullil · Post by **zullil** » Fri Aug 08, 2014 10:32 pm

Adam Hair wrote:
zullil wrote:
sje wrote:
zullil wrote:Forgive me, but why is this tabulation worth doing at all?
Why do some people climb mountains?

Establishing correct perft(n) values supplies data for benchmarking and for program verification. It also allows for comparing performance of different algorithms along with comparisons of different hardware.

A perft() routine is a good place to have hooks for test code, and such can be used to gather timing statistics over a repeatable, known set of positions. One example here is to do deep perft() runs starting with an endgame positions to time tablebase access code. Other examples include selective activation of differing versions of transposition table code to generate comparative timing performance data.

Calculating perft(14) today may be somewhat onerous, but in ten years it might be easy. But those coders ten years from now doing testing will need to know what the value of perft(14) truly is, so someone here and now must provide the answer.
Seems like there are better mountains to climb, but to each his own. Good luck.
Most of us around here are wasting electrons in some manner, homfly!

Sorry. When I saw that you had an interest in knot theory I could not resist making a joke

And I've certainly put my share of electrons to dubious use.

Perft(14) revisited

Perft(14) revisited

Re: Perft(14) revisited

Trivia: The very first perft(7)

Re: Perft(14) revisited

Re: Perft(14) revisited

Re: Perft(14) revisited

Re: Perft(14) revisited

Re: Perft(14) revisited

Re: Perft(14) revisited

Re: Perft(14) revisited