Fastest perft

Anton · Post by **Anton** » Thu Sep 01, 2011 9:46 am

All compiled with gcc, Linux 32bit, on an antique machine.

+++++++++++++++++++++++++++++++++

#q2perft -O3 16mb hash.#

perft( 1)=           20 ( 0.000 sec)
perft( 2)=          400 ( 0.000 sec)
perft( 3)=         8902 ( 0.000 sec)
perft( 4)=       197281 ( 0.000 sec)
perft( 5)=      4865609 ( 0.090 sec)
perft( 6)=    119060324 ( 1.340 sec)
perft( 7)=   3195901860 (22.080 sec)

+++++++++++++++++++++++++++++++++

#q2perft -03 with profiling, 16mb hash.#

perft( 1)=           20 ( 0.000 sec)
perft( 2)=          400 ( 0.000 sec)
perft( 3)=         8902 ( 0.000 sec)
perft( 4)=       197281 ( 0.010 sec)
perft( 5)=      4865609 ( 0.080 sec)
perft( 6)=    119060324 ( 1.260 sec)
perft( 7)=   3195901860 (21.560 sec)

+++++++++++++++++++++++++++++++++

#q2perft -03 with profiling, 64mb hash.#

perft( 1)=           20 ( 0.000 sec)
perft( 2)=          400 ( 0.000 sec)
perft( 3)=         8902 ( 0.000 sec)
perft( 4)=       197281 ( 0.010 sec)
perft( 5)=      4865609 ( 0.070 sec)
perft( 6)=    119060324 ( 1.310 sec)
perft( 7)=   3195901860 (18.050 sec)

++++++++++++++++++++++++++++++++++++

#Original Oliperft:#

 1     0      0          20
 2     0      0         400
 3     0      0        8902
 4     0      1      197281
 5     0     27     4865609
 6     0    441   119060324
 7     0   7060  3195901860

Nodes: 3320034396 cs: 7060 knps: 47023

70.60 seconds.

+++++++++++++++++++++++++++++++++++

#Original Oliperft with profiling.#

  1     0      0          20
 2     0      0         400
 3     0      0        8902
 4     0      1      197281
 5     0     24     4865609
 6     0    362   119060324
 7     0   6223  3195901860

Nodes: 3320034396 cs: 6223 knps: 53344

62.23 seconds.

++++++++++++++++++++++++++++++++++++

#Modified Oliperft:#

 1     0      0          20
 2     0      0         400
 3     0      0        8902
 4     0      1      197281
 5     0     28     4865609
 6     0    402   119060324
 7     0   6828  3195901860

Nodes: 0 cs: 6828 knps: 0

68.28seconds.

+++++++++++++++++++++++++++++++++

#Modified Oliperft with profiling:#

 1     0      0          20
 2     0      0         400
 3     0      0        8902
 4     0      1      197281
 5     0     23     4865609
 6     0    379   119060324
 7     0   6332  3195901860

Nodes: 0 cs: 6333 knps: 0


63.33 seconds.

+++++++++++++++++++++++++++++++++

#Jetchess, run via 'wine' abstraction layer.(64mb hash)# 
 
  1  Nb1-a3   120142144
  2  Nb1-c3   148527161
  3  Ng1-f3   147678554
  4  Ng1-h3   120669525
  5   a2-a3   106743106
  6   a2-a4   137077337
  7   b2-b3   133233975
  8   b2-b4   134087476
  9   c2-c3   144074944
 10   c2-c4   157756443
 11   d2-d3   227598692
 12   d2-d4   269605599
 13   e2-e3   306138410
 14   e2-e4   309478263
 15   f2-f3   102021008
 16   f2-f4   119614841
 17   g2-g3   135987651
 18   g2-g4   130293018
 19   h2-h3   106678423
 20   h2-h4   138495290

Total:       3195901860

16.262 seconds!!

+++++++++++++++++++++++++++++++++

hgm · Post by **hgm** » Thu Sep 01, 2011 10:12 am

I guess q2perft currently makes poor use of hash memory: It divides up the given memory into 8 sections, section 1-7 used for storing perft(2)-perft(8) results, while the 8th section is sub-sectioned to contain all higher depths (half of it for perft(9), one quarter of it for perft(10), 1/8 for perft(11) etc.). This means that for perft(7) only three of the eight sections are fully used, for storing perft(2)-perft(3) results. (Perft(7) needs some 70,000 perft(3) results, while even with 16MB there are 128K entries per section. So the other sections will remain virtually empty.)

Sven · Post by **Sven** » Thu Sep 01, 2011 11:42 am

One general remark on "perft performance": I think it would be better to measure NPS as "total visited nodes per second", not as "counted leaf nodes per second". A "visited node" would be a node where move generation is performed. That would certainly allow a fair comparison of different perft implementations on the same system platform (hardware+OS+...), even for different positions (which is impossible with "counted leaf nodes" due to the different branching factors at the level of frontier nodes where bulk-counting is applied). Also "hashing vs. non-hashing" would not influence the NPS number.

To go one step further, some kind of platform abstraction would be nice. frcperft does this based on the number of elapsed CPU cycles and prints "ticks/move" (but still based on "counted leaf nodes", see my remark above). Another approach would use some "perft speed coefficients" per system platform based on benchmarking, and would divide the absolute NPS by that coefficient.

Of course, one simple way to compare speed of two perft implementations is to compare their total time spent on perft(N) for the same position on the same system platform, if the limitations of this approach are accepted.

Sven

Ajedrecista · Post by **Ajedrecista** » Thu Sep 01, 2011 12:00 pm

Hi Muller:

hgm wrote:OK, I made a Windows compile of the new qperft. It is now on my regular download page, replacing the qperft binary that was there before. Strange thing is that the Windows version is slower without hash (perft(7) takes 94 sec in stead of 75 sec), but it is faster with hash (233 sec vs 245 sec)!!!

Former qperft executable worked fine under my system; now, with your new recompile, it does not run (missing cygwin1.dll). What can I do? Thanks in advance.

Regards from Spain.

Ajedrecista.

hgm · Post by **hgm** » Thu Sep 01, 2011 12:03 pm

Agreed, most nps figures by perfts that print those are pure nonsense, as they represent not nodes per second, but moves per second. And then sometimes even moves that were not made, but only counted.

Qperft does not make the final ply, because its original purpose was to time the move generator for an engine under realistic circumstances. (Just putting a loop around it to execute it a billion timeswas not very satisfactory, as very quickly the branch prediction learns to be perfect.) And in an engine the leaf nodes typically do a move generation, but do not make any of the generated moves (or they would not be leaf nodes).

Pablo Vazquez · Post by **Pablo Vazquez** » Thu Sep 01, 2011 3:42 pm

Ajedrecista wrote:Hi Muller:

hgm wrote:OK, I made a Windows compile of the new qperft. It is now on my regular download page, replacing the qperft binary that was there before. Strange thing is that the Windows version is slower without hash (perft(7) takes 94 sec in stead of 75 sec), but it is faster with hash (233 sec vs 245 sec)!!!
Former qperft executable worked fine under my system; now, with your new recompile, it does not run (missing cygwin1.dll). What can I do? Thanks in advance.

Regards from Spain.

Ajedrecista.

Try this: http://www.mediafire.com/?7dieekrhramw6ui

Ajedrecista · Post by **Ajedrecista** » Thu Sep 01, 2011 3:58 pm

¡Hola Pablo!

Pablo Vazquez wrote:
Ajedrecista wrote:Hi Muller:

hgm wrote:OK, I made a Windows compile of the new qperft. It is now on my regular download page, replacing the qperft binary that was there before. Strange thing is that the Windows version is slower without hash (perft(7) takes 94 sec in stead of 75 sec), but it is faster with hash (233 sec vs 245 sec)!!!
Former qperft executable worked fine under my system; now, with your new recompile, it does not run (missing cygwin1.dll). What can I do? Thanks in advance.

Regards from Spain.

Ajedrecista.
Try this: http://www.mediafire.com/?7dieekrhramw6ui

Thank you very much for providing this link! Yesterday, I was unable to compile by myself (the system did not recognize gcc instruction and also did not find the source perft.c, which I copy in a Notepad and I changed its path trying that MinGW could find it, but it was impossible). It now runs without problem... but MUCH faster than older version of qperft. In my system (32-bit) JetChess is still a little faster (Perft(7) of starting position in ~ 8.7 seconds while qperft counts it in ~ 10 seconds). It is a huge improvement. I am curious on seeing multi-core perft counters (hopefully meleechess will support this in the near future).

Regards from Spain.

Ajedrecista.

hgm · Post by **hgm** » Thu Sep 01, 2011 4:11 pm

Ajedrecista wrote:Former qperft executable worked fine under my system; now, with your new recompile, it does not run (missing cygwin1.dll). What can I do? Thanks in advance.

Oops! Forgot to include the -mno-cygwin flag in the compile command.

I fixed that now, so that the version on my web page is now Cygwin-independent again.

The first perft(10) I tried on my Core-2-Duo (for the position after 1. h3 h6) took 4:20. Still too slow to compete with Steven Edward's monster; I really need that extra factor 10.

smrf · Post by **smrf** » Fri Sep 02, 2011 9:43 am

When Perft has to be a tool for to test several qualities of a move generator,
it should not only count leafs, but also provide relevant local information.
Because speed is not the only intension of writing a move generator.

See an old posting of mine:

http://www.talkchess.com/forum/viewtopi ... 18&t=17107

Regards, Reinhard.

Ajedrecista · Post by **Ajedrecista** » Fri Sep 02, 2011 12:13 pm

Hello everybody:

Just for seeing the difference between processors, I run Perft(9) of the initial position with JetChess, using 1024 MB of hash in an Intel Core i5-760 (32-bit OS) of the university; remember that JetChess uses only one core. With my Pentium D930, the same Perft(9) run with the same hash size took 2720.81 seconds, and now 761.432 seconds (0:12:41.432)! If I divide the number of nodes by the spent time (I know that it is not the best way to measure speed according to Sven Schüle and others) I get around 3203.87 MN/s... the difference is more than obvious.

[d]

2,439,530,234,167 (move pathes after 9 half moves).

Code: Select all

  1  Nb1-a3  85849641909
  2  Nb1-c3  109418317145
  3  Ng1-f3  108393009416
  4  Ng1-h3  86659653631
  5   a2-a3  74950758099
  6   a2-a4  101265301849
  7   b2-b3  96577095997
  8   b2-b4  97442160946
  9   c2-c3  108697368719
 10   c2-c4  120549219832
 11   d2-d3  176976245463
 12   d2-d4  227220482342
 13   e2-e3  259522947791
 14   e2-e4  263561543780
 15   f2-f3  68094899093
 16   f2-f4  84792070664
 17   g2-g3  99646370024
 18   g2-g4  92281289941
 19   h2-h3  74778417365
 20   h2-h4  102853440161

Total:    2439530234167

Time: 761.432 s

Regards from Spain.

Ajedrecista.

Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft

Re: Fastest perft - made comparable

Re: Fastest perft