yet another attempt on Perft(14)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

ankan
Posts: 77
Joined: Sun Apr 21, 2013 3:29 pm
Full name: Ankan Banerjee

yet another attempt on Perft(14)

Post by ankan »

Here is my latest attempt in computing perft(14) using my GPU perft program:

Code: Select all

Perft(03):                8902, time:    0.201 s
Perft(04):              197281, time:    0.001 s
Perft(05):             4865609, time:    0.001 s
Perft(06):           119060324, time:    0.002 s
Perft(07):          3195901860, time:    0.016 s
Perft(08):         84998978956, time:    0.206 s
Perft(09):       2439530234167, time:    1.414 s
Perft(10):      69352859712417, time:   15.278 s
Perft(11):    2097651003696806, time:  179.091 s
Perft(12):   62854969236701747, time:  2932.64 s
Perft(13): 1981066775000396239, time:  42792.7 s

..b1-a3     1988096314966752669
..g1-f3     2585178594020547799
..b1-c3     2638654908094167513
..a2-a3     1707945110805813310
..g1-h3     2022544576756325295
..b2-b3     2286406377370528139
..c2-c3     2666334464745721380
..f2-f3     1359948551007941806
..d2-d3     4459055881158216540
..g2-g3     2388698250037891063
..h2-h3     1691016253163166371
..a2-a4     2493115507450124588
..b2-b4     2297640117530334747
..e2-e3     7478732807317823667
..c2-c4     3067868595132318179
..f2-f4     1905689484049474095
..g2-g4     2050748957027119617
..h2-h4     2546461593049574382
..d2-d4     6636612996377425812
..e2-e4     7614272181524252025
Perft(14):61885021521585518997, time:   795802 s
Complete output of the program (divided perfts till first 4 levels) is here:
https://raw.githubusercontent.com/ankan ... erft14.txt

After Steven Edwards reported mismatches due to false transposition table positives caused by 64 bit hash signatures, I updated my program to use 128 bit hashes. I also made some performance improvements and changes to utilize multiple GPUs in the machine if available. This run was performed on a machine with 3 Nvidia Titan X GPUs.

Instead of Steven's approach of computing perft(7) of all 96400068 unique(7) positions, I decided to compute perft using the regular method which even though is slightly slower it has the advantage of obtaining divided counts.

Unfortunately my result doesn't match the value reported by Peter Österlund back in 2013. With 128 bit hashes, it's unlikely to be caused by a false positive/hash collision but I can't rule out bugs in my program or random bit flips due to hardware failure/instability.

I am going to try one more time using unique(8) or unique(9) positions - or use the same regular method but using different set of random zobrist keys to rule out possibility of hardware instability.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: yet another attempt on Perft(14)

Post by Dann Corbit »

ankan wrote:Here is my latest attempt in computing perft(14) using my GPU perft program:

Code: Select all

Perft(03):                8902, time:    0.201 s
Perft(04):              197281, time:    0.001 s
Perft(05):             4865609, time:    0.001 s
Perft(06):           119060324, time:    0.002 s
Perft(07):          3195901860, time:    0.016 s
Perft(08):         84998978956, time:    0.206 s
Perft(09):       2439530234167, time:    1.414 s
Perft(10):      69352859712417, time:   15.278 s
Perft(11):    2097651003696806, time:  179.091 s
Perft(12):   62854969236701747, time:  2932.64 s
Perft(13): 1981066775000396239, time:  42792.7 s

..b1-a3     1988096314966752669
..g1-f3     2585178594020547799
..b1-c3     2638654908094167513
..a2-a3     1707945110805813310
..g1-h3     2022544576756325295
..b2-b3     2286406377370528139
..c2-c3     2666334464745721380
..f2-f3     1359948551007941806
..d2-d3     4459055881158216540
..g2-g3     2388698250037891063
..h2-h3     1691016253163166371
..a2-a4     2493115507450124588
..b2-b4     2297640117530334747
..e2-e3     7478732807317823667
..c2-c4     3067868595132318179
..f2-f4     1905689484049474095
..g2-g4     2050748957027119617
..h2-h4     2546461593049574382
..d2-d4     6636612996377425812
..e2-e4     7614272181524252025
Perft(14):61885021521585518997, time:   795802 s
Complete output of the program (divided perfts till first 4 levels) is here:
https://raw.githubusercontent.com/ankan ... erft14.txt

After Steven Edwards reported mismatches due to false transposition table positives caused by 64 bit hash signatures, I updated my program to use 128 bit hashes. I also made some performance improvements and changes to utilize multiple GPUs in the machine if available. This run was performed on a machine with 3 Nvidia Titan X GPUs.

Instead of Steven's approach of computing perft(7) of all 96400068 unique(7) positions, I decided to compute perft using the regular method which even though is slightly slower it has the advantage of obtaining divided counts.

Unfortunately my result doesn't match the value reported by Peter Österlund back in 2013. With 128 bit hashes, it's unlikely to be caused by a false positive/hash collision but I can't rule out bugs in my program or random bit flips due to hardware failure/instability.

I am going to try one more time using unique(8) or unique(9) positions - or use the same regular method but using different set of random zobrist keys to rule out possibility of hardware instability.
Re: "Unfortunately my result doesn't match the value reported by Peter Österlund back in 2013."

It is also possible the original 2013 result is wrong.
If the SJE result matches perfectly when completed, then we will have a better reference.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: yet another attempt on Perft(14)

Post by Dann Corbit »

Sorting the divide by mobility, there is an interesting result:

7614272181524252025 : e2-e4
7478732807317823667 : e2-e3
6636612996377425812 : d2-d4
4459055881158216540 : d2-d3
3067868595132318179 : c2-c4
2666334464745721380 : c2-c3
2638654908094167513 : b1-c3
2585178594020547799 : g1-f3
2546461593049574382 : h2-h4
2493115507450124588 : a2-a4
2388698250037891063 : g2-g3
2297640117530334747 : b2-b4
2286406377370528139 : b2-b3
2050748957027119617 : g2-g4
2022544576756325295 : g1-h3
1988096314966752669 : b1-a3
1905689484049474095 : f2-f4
1707945110805813310 : a2-a3
1691016253163166371 : h2-h3
1359948551007941806 : f2-f3

e4 and d4, the most important (at least most frequently played moves) are one and three in the ranking.

I wonder if e3 is also excellent due to the mobility it offers, and is being overlooked.
Similarly for 1.d3

It appears that 1.c4 also offers excellent mobility

And as far as mobility goes, 1.f3 really stinks
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: yet another attempt on Perft(14)

Post by Edmund »

Impressive speed. Assuming a branching factor of 32.5 it would take you "only" roughly 10 months for the perft 15 numbers. Do you consider there are still efficiency gains possible with your software? If so, by what factor?
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: yet another attempt on Perft(14)

Post by Dann Corbit »

Edmund wrote:Impressive speed. Assuming a branching factor of 32.5 it would take you "only" roughly 10 months for the perft 15 numbers. Do you consider there are still efficiency gains possible with your software? If so, by what factor?
If he had 20 boxes with 10 cards each, he could snap it off in a jiffy.

I would like to see statistics about the move types generated.
What is the percentage of captures at each phase?
What is the percentage of e.p. captures at each phase?
What is the number of checkmates at each phase?
How many single reply moves are there at each phase?
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
ankan
Posts: 77
Joined: Sun Apr 21, 2013 3:29 pm
Full name: Ankan Banerjee

Re: yet another attempt on Perft(14)

Post by ankan »

Edmund wrote:Impressive speed. Assuming a branching factor of 32.5 it would take you "only" roughly 10 months for the perft 15 numbers. Do you consider there are still efficiency gains possible with your software? If so, by what factor?
The other approach of first listing down unique positions (and occurrence counts) till certain depth and then computing perfts of those unique positions for remaining depth should be much faster depending on for what depth the unique positions are computed.
This is the method Peter used in his original Perft(14) computation. He managed to compute unique(11) - which is more than 726 billion positions! followed by those many perft(3) computations. I don't have access to hard disks big enough to even store so many positions.

Steven's Perft(14) project is also using the same method but with just ~96 million unique(7) positions - followed by perft(7) of all those positions.

For my program, I don't expect much gains from unique(7) positions (as my regular transposition tables are already big enough to hold most of them) but with unique(8) or higher, the gains could be much more.

In my program, I never clear the transposition tables (I have separate transposition tables for most levels and they get overwritten whenever a newer position at same slot is found). Which means for my program it also depends on how the unique positions are sorted. If nearby positions in the list of unique positions are 'close' to each other (i.e, expected to result in duplicate positions in a few plies), transposition table hit rate would be better when computing perfts of these positions. I am trying to figure out what is the best way to sort a given set of positions for best transposition table utilization.

With all unique(9) or unique(10) positions available and sorted in a reasonable order, I expect about 10X gain in performance - but maybe I am being too optimistic given that I still haven't even computed unique(8) myself :).
Last edited by ankan on Sat Aug 13, 2016 9:49 pm, edited 1 time in total.
ankan
Posts: 77
Joined: Sun Apr 21, 2013 3:29 pm
Full name: Ankan Banerjee

Re: yet another attempt on Perft(14)

Post by ankan »

Dann Corbit wrote: I would like to see statistics about the move types generated.
What is the percentage of captures at each phase?
What is the percentage of e.p. captures at each phase?
What is the number of checkmates at each phase?
How many single reply moves are there at each phase?
Currently my program doesn't save these statistics, but it should be quite easy to add them for the regular perft approach (without too much overhead). I will try to get them in my next run.
thomasahle
Posts: 94
Joined: Thu Feb 27, 2014 8:19 pm

Re: yet another attempt on Perft(14)

Post by thomasahle »

Do you use zobrist transposition tables, or something like a regular chaining hash table with correctness guarantees?
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Peter Österlund's Perft(14) result

Post by sje »

From Peter Österlund 2013.04.02: perft(14) = 61,885,021,521,585,529,237
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Peter Österlund's Perft(14) result

Post by sje »

sje wrote:From Peter Österlund 2013.04.02: perft(14) = 61,885,021,521,585,529,237
Which is 10,240 more than Ankan's result.

Which is THREE bits different (XOR) out of 66 bits.

Which could be a triplet of single bit random flips (cosmic rays).

Which shows that independent verification is so important.

Code: Select all

Peter: 110101101011010011110000110011000101100111100000100110000110010101
Ankan: 110101101011010011110000110011000101100111100000100011100110010101