lazy smp questions

Discussion of chess software programming and technical issues.

hgm
Posts: 23790
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: lazy smp questions

Post by hgm » Thu Sep 10, 2015 6:02 pm

The funny thing is that for me this never worked. When I have two processes running using the same hash, time-to-depth for the searching one is approximately 50% longer than when there is no second process analyzing with the same hash table.

JVMerlino
Posts: 1003
Joined: Wed Mar 08, 2006 9:15 pm
Location: San Francisco, California

Re: lazy smp questions

Post by JVMerlino » Thu Sep 10, 2015 6:20 pm

hgm wrote:The funny thing is that for me this never worked. When I have two processes running using the same hash, time-to-depth for the searching one is approximately 50% longer than when there is no second process analyzing with the same hash table.
Strange. Well, you can run Myrddin yourself and see the difference. It supports the "cores" command, so the command line will work fine. But here's a comparison on the initial position between 1 core (first listing) and 4 cores (second listing). It should be no surprise that the nps with 4 cores is more than 4x higher than with 1 core, since all processes are essentially doing the same work and the percentage of hash hits goes up considerably.

Code: Select all

 1     15      0            2 b1a3
 1     61      0            4 b1c3
 1     67      0           14 d2d4
 1     79      0           17 e2e4
 2     63      1           31 e2e4? b8c6 (2 KNPS)
 2     47      1           71 e2e4? b8c6 (4 KNPS)
 2     15      3          128 e2e4? d7d5 (4 KNPS)
 2      0      3          191 e2e4 e7e5 (6 KNPS)
 3     16      3          305 e2e4! (9 KNPS)
 3     32      4          359 e2e4! (7 KNPS)
 3     64      4          486 e2e4! (10 KNPS)
 3     67      6          625 e2e4 e7e5 d2d4 (10 KNPS)
 4     51      6          818 e2e4? e7e5 (13 KNPS)
 4     35      6         1356 e2e4? e7e5 (21 KNPS)
 4      6      7         2100 e2e4 e7e5 d2d4 b8c6 d1d4 (26 KNPS)
 5     22      9         3588 e2e4! (38 KNPS)
 5     36      9         5439 e2e4 d7d5 b1c3 g8f6 d1f3 d5e4 (58 KNPS)
 5     38      9         6663 b1c3! d7d5 b1c3 g8f6 d1f3 d5e4 (71 KNPS)
 5     46      9         7662 b1c3 e7e5 e2e4 b8c6 g1f3 (82 KNPS)
 5     58     10         9342 d2d4 d7d5 b1c3 g8f6 (85 KNPS)
 6     42     10        10438 d2d4? d7d5 (95 KNPS)
 6     26     12        15012 d2d4? d7d5 (121 KNPS)
 6      0     14        22505 d2d4 d7d5 b1c3 g8f6 g1f3 b8c6 (160 KNPS)
 6     10     14        25837 b1c3 d7d5 e2e4 g8f6 f1b5 b8c6 d2d3 (184 KNPS)
 7     23     17        43469 b1c3 d7d5 e2e4 d5d4 c3d5 g8f6 d1f3 f6d5 e4d5 (254 KNPS)
 8      7     23        70955 b1c3? d7d5 (303 KNPS)
 8     24     28       112400 e2e4 d7d5 e4d5 g8f6 d2d4 f6d5 f1b5 b8c6 g1f3 (401 KNPS)
 9      8     43       187633 e2e4? d7d5 (430 KNPS)
 9     11     62       279935 e2e3 b8c6 b1c3 d7d5 f1b5 (448 KNPS)
10     -5     71       345802 e2e3? b8c6 (482 KNPS)
10     19     85       421152 b1c3 b8c6 d2d4 d7d5 g1f3 g8f6 d1d3 c8g4 h2h3 c6b4 h3g4 (490 KNPS)
10     20    107       514840 e2e4 b8c6 b1c3 g8f6 g1f3 d7d5 e4e5 d5d4 e5f6 d4c3 f6g7 f8g7 d2c3 d8d1 e1d1 (478 KNPS)
11     22    170       876820 e2e4 b8c6 d2d4 d7d5 e4e5 c8f5 f1b5 e7e6 b5c6 b7c6 g1f3 f8e7 (515 KNPS)
12     38    363      1909648 e2e4! (525 KNPS)
12     33    424      2328605 e2e4 d7d5 e4d5 g8f6 d2d4 d8d6 g1e2 f6d5 c2c4 d5f6 c1f4 d6c6 f4d6 (548 KNPS)
13     35    778      4526696 e2e4 b8c6 d2d4 e7e6 d4d5 e6d5 e4d5 c6e5 f1b5 d8h4 b1c3 f8c5 g1h3 (581 KNPS)
14     25   1396      8356166 e2e4 d7d5 e4d5 g8f6 f1b5 b8d7 d2d4 a7a6 b5d3 f6d5 c2c4 d5b4 d3e4 d7f6 b1c3 f6e4 c3e4 (598 KNPS)
15     14   2815     16878162 e2e4 e7e5 b1c3 g8f6 g1f3 b8c6 f1b5 f8b4 a2a3 b4c3 d2c3 d7d6 b5c6 b7c6 c1g5 c8e6 (599 KNPS)
16     13   6252     37291645 e2e4 e7e5 b1c3 g8f6 g1f3 f8b4 f1b5 e8g8 a2a3 b4c3 d2c3 c7c6 b5c4 d7d5 e4d5 c6d5 (596 KNPS)
17     21  14483     86252048 e2e4 e7e5 b1c3 g8f6 g1f3 f8b4 a2a3 b4c3 d2c3 d7d6 f1c4 e8g8 c1g5 b8c6 d1d3 c8g4 e1g1 (595 KNPS)

Code: Select all

 1     15      0            2 b1a3
 1     61      1          237 b1c3 (14 KNPS)
 1     67      1         7005 d2d4 (437 KNPS)
 1     79      1        10588 e2e4 (661 KNPS)
 2     63      1        16569 e2e4? b8c6 (1035 KNPS)
 2     47      3        23244 e2e4? d7d5 (749 KNPS)
 2     36      3        30816 e2e4 d7d5 (994 KNPS)
 2     45      3        37492 g1f3 d7d5 (1209 KNPS)
 3     45      3        45850 g1f3 d7d5 (1479 KNPS)
 4     45      4        52637 g1f3 d7d5 (1119 KNPS)
 5     45      4        60318 g1f3 d7d5 (1283 KNPS)
 6     45      6        66600 g1f3 d7d5 (1057 KNPS)
 7     45      6        73528 g1f3 d7d5 (1167 KNPS)
 8     29      6        93948 g1f3? d7d5 (1491 KNPS)
 8     32     12       222652 g1f3 d7d5 b1c3 g8f6 d2d4 d8d6 (1781 KNPS)
 9     32     14       258295 g1f3 d7d5 d2d4 d5e4 (1831 KNPS)
10     19     20       423268 g1f3 d7d5 d2d4 g8f6 b1c3 b8c6 d1d3 c8g4 h2h3 g4e6 (2085 KNPS)
10     22     31       700960 d2d4 d7d5 b1c3 d7d5 d1d3 b8c6 g1f3 (2246 KNPS)
11     25     49      1232234 d2d4 d7d5 g1f3 b8c6 c1f4 c8f5 b1c3 e7e6 e2e3 f8b4 d1d2 (2469 KNPS)
11     26     64      1638297 e2e4 d7d5 e4d5 g8f6 f1b5 c8d7 b5c4 b7b5 c4d3 (2559 KNPS)
12     31     99      2587188 e2e4 d7d5 e4d5 g8f6 f1b5 c8d7 b5c4 b7b5 (2589 KNPS)
13     33    216      5804057 e2e4 e7e5 g1f3 g8f6 b1c3 b8c6 c3e4 (2675 KNPS)
14     18    399     10608837 e2e4 e7e5 g1f3 g8f6 b1c3 b8c6 f1b5 f8b4 a2a3 b4c3 d2c3 e8g8 c1g5 d7d6 b5c6 b7c6 (2656 KNPS)
15     23   1082     28333071 e2e4 e7e5 g1f3 b8c6 b1c3 g8f6 (2616 KNPS)
16     12   2563     66873714 e2e4 e7e5 g1f3 g8f6 b1c3 f8b4 f1c4 d7d6 a2a3 b4c3 d2c3 (2609 KNPS)
17     12   4959    127695303 e2e4 e7e5 (2574 KNPS)
jm
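
[Editor's note] The two depth-17 lines above can be compared directly. The short calculation below takes the first listing as the 1-core run and the second as the 4-core run (inferred from the NPS figures), and assumes the third column is elapsed time; the unit looks like centiseconds, but the ratios do not depend on it. The variable names are only illustrative.

```python
# Depth-17 figures copied from the two listings: elapsed time, nodes, KNPS.
# Assumption: column 3 is elapsed time (unit irrelevant for ratios).
one_core = {"time": 14483, "nodes": 86252048, "knps": 595}
four_cores = {"time": 4959, "nodes": 127695303, "knps": 2574}

# How much faster the node counter runs with 4 cores.
nps_ratio = four_cores["knps"] / one_core["knps"]
# How much sooner depth 17 is reached with 4 cores (time-to-depth speedup).
ttd_speedup = one_core["time"] / four_cores["time"]
# How many more nodes the 4-core run counted to reach the same depth.
node_ratio = four_cores["nodes"] / one_core["nodes"]

print(f"NPS ratio:             {nps_ratio:.2f}")
print(f"Time-to-depth speedup: {ttd_speedup:.2f}")
print(f"Node count ratio:      {node_ratio:.2f}")
```

So for this one position the node rate rises by more than 4x, while time-to-depth improves by a bit under 3x, because the 4-core run counts roughly half again as many nodes to reach the same depth.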

bob
Posts: 20642
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: lazy smp questions

Post by bob » Thu Sep 10, 2015 6:44 pm

JVMerlino wrote:
Strange. Well, you can run Myrddin yourself and see the difference. It supports the "cores" command, so the command line will work fine. But here's a comparison on the initial position between 1 core and 4 cores. It should be no surprise that the nps with 4 cores is more than 4x higher than with 1 core, since all processes are essentially doing the same work and the percentage of hash hits goes up considerably.

Some of your numbers look a bit odd. I.e., NPS is more than 4x faster? Is that second number the eval? It doesn't come anywhere close to matching between serial and parallel, which seems odd. They are usually close (if not exact) for the same depth.

JVMerlino
Posts: 1003
Joined: Wed Mar 08, 2006 9:15 pm
Location: San Francisco, California

Re: lazy smp questions

Post by JVMerlino » Thu Sep 10, 2015 6:54 pm

bob wrote:
Some of your numbers look a bit odd. I.e., NPS is more than 4x faster? Is that second number the eval? It doesn't come anywhere close to matching between serial and parallel, which seems odd. They are usually close (if not exact) for the same depth.
As I mentioned, the more-than-4x increase in NPS is due to all processes essentially doing the same work, so the average time to visit a node gets shorter because of the increased percentage of hash hits.

As for the differing scores, I've never bothered to investigate -- for the same reason I chose the laziest of "lazy smp" implementations. :D

jm
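
[Editor's note] For readers new to the term, here is a minimal sketch of the idea being discussed: several workers run the same iterative-deepening search and communicate only through a shared hash table. This is not Myrddin's code. The "game" is a made-up toy (positions are integers, with hypothetical `children` and `evaluate` functions), and Python threads are used only to show the structure; because of the interpreter lock they will not give a real speedup.

```python
import threading

MODULUS = 101  # hypothetical toy game: a position is an integer mod 101


def children(pos):
    """Three made-up successor positions."""
    return [(3 * pos + 1) % MODULUS, (5 * pos + 2) % MODULUS, (pos + 7) % MODULUS]


def evaluate(pos):
    """Made-up static evaluation."""
    return pos % 21 - 10


def negamax(pos, depth, table):
    """Full-width negamax that stores exact scores in the shared table."""
    key = (pos, depth)
    cached = table.get(key)
    if cached is not None:
        return cached  # another worker (or this one) already solved this node
    if depth == 0:
        score = evaluate(pos)
    else:
        score = max(-negamax(child, depth - 1, table) for child in children(pos))
    table[key] = score
    return score


def worker(root, max_depth, table, results, worker_id):
    """Every worker runs the same iterative-deepening loop on the same root."""
    for depth in range(1, max_depth + 1):
        results[worker_id] = negamax(root, depth, table)


def lazy_smp(root, max_depth, num_workers):
    table = {}  # the only thing the workers share
    results = [None] * num_workers
    threads = [
        threading.Thread(target=worker, args=(root, max_depth, table, results, i))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0]


print(lazy_smp(root=1, max_depth=6, num_workers=4))
```

Because this toy search has no pruning and stores only exact scores, every worker arrives at the same result. A real alpha-beta engine stores bounds as well, so the order in which workers fill the table can change which lines are searched.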

hgm
Posts: 23790
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: lazy smp questions

Post by hgm » Thu Sep 10, 2015 9:02 pm

Are you using threads or processes? I was using processes. Although I cannot see why this would matter, I cannot exclude it either. When I made the hash mask (the one that extracts the table index from the key) process-dependent, so that each process used a separate part of the shared memory, the speed went back to normal. When both processes used the full table, the nps dropped and time-to-depth increased.
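
[Editor's note] The two indexing schemes described here can be sketched as follows. hgm did not post code, so this is only one way to do it: `TABLE_BITS`, `shared_index` and `partitioned_index` are hypothetical names, and the sketch assumes the table size and the number of processes are both powers of two.

```python
TABLE_BITS = 20                 # hypothetical: 2**20 entries in the shared table
TABLE_SIZE = 1 << TABLE_BITS


def shared_index(key: int) -> int:
    """Every process maps a hash key onto the full table."""
    return key & (TABLE_SIZE - 1)


def partitioned_index(key: int, process_id: int, num_processes: int) -> int:
    """Each process maps a hash key into its own slice of the table.

    Assumes num_processes is a power of two so the slice size is too.
    """
    slice_size = TABLE_SIZE // num_processes
    return process_id * slice_size + (key & (slice_size - 1))


key = 0x9E3779B97F4A7C15        # arbitrary 64-bit example key
print(shared_index(key))
print(partitioned_index(key, 0, 2), partitioned_index(key, 1, 2))
```

With the partitioned scheme the processes never read or write each other's entries, so they neither help nor disturb one another; each one simply searches with a smaller private table.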

cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: lazy smp questions

Post by cdani » Thu Sep 10, 2015 9:20 pm

hgm wrote: Are you using threads or processes? I was using processes. Although I cannot see why this would matter, I cannot exclude it either. When I made the hash mask (the one that extracts the table index from the key) process-dependent, so that each process used a separate part of the shared memory, the speed went back to normal. When both processes used the full table, the nps dropped and time-to-depth increased.
Cheng, Andscacs and, I suppose, Nirvana use threads, with the hash table shared between the threads.

bob
Posts: 20642
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: lazy smp questions

Post by bob » Thu Sep 10, 2015 9:25 pm

JVMerlino wrote:
As I mentioned, the more-than-4x increase in NPS is due to all processes essentially doing the same work, so the average time to visit a node gets shorter because of the increased percentage of hash hits.

As for the differing scores, I've never bothered to investigate -- for the same reason I chose the laziest of "lazy smp" implementations. :D

jm
I'm not quite sure how increased hash hits speed NPS up. For me it is typically the exact opposite.

hgm
Posts: 23790
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: lazy smp questions

Post by hgm » Thu Sep 10, 2015 9:30 pm

It does if you count nodes where you have hash hits.
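
[Editor's note] A small sketch of the counting convention in question. If the node counter is incremented on entry, before the hash probe, then a node that ends in a hash hit still counts as a node even though it costs almost nothing, so a higher hit rate pushes NPS up. The toy game below (`children`, `evaluate`, `MODULUS`) is invented for illustration and is not taken from any engine in this thread.

```python
MODULUS = 50  # hypothetical toy game: a position is an integer mod 50


def children(pos):
    """Three made-up successor positions."""
    return [(3 * pos + 1) % MODULUS, (5 * pos + 2) % MODULUS, (pos + 7) % MODULUS]


def evaluate(pos):
    """Made-up static evaluation."""
    return pos % 21 - 10


def search(pos, depth, table, stats):
    stats["visited"] += 1               # counted on entry, before the probe
    cached = table.get((pos, depth))
    if cached is not None:
        stats["hash_hits"] += 1         # a counted node that did no real work
        return cached
    stats["expanded"] += 1              # a counted node that was actually searched
    if depth == 0:
        score = evaluate(pos)
    else:
        score = max(-search(c, depth - 1, table, stats) for c in children(pos))
    table[(pos, depth)] = score
    return score


def run(table):
    stats = {"visited": 0, "hash_hits": 0, "expanded": 0}
    search(1, 6, table, stats)
    return stats


table = {}
cold = run(table)   # table starts empty
warm = run(table)   # table already filled, as if by another process
print("cold:", cold)
print("warm:", warm)
```

In the warm run every counted node is a hash hit. An engine that only increments its counter for expanded nodes would report no such NPS gain.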

Dann Corbit
Posts: 10205
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA

Re: lazy smp questions

Post by Dann Corbit » Thu Sep 10, 2015 9:31 pm

Do you count a hash hit as a node?

mar
Posts: 2015
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: lazy smp questions

Post by mar » Thu Sep 10, 2015 9:36 pm

cdani wrote: Cheng, Andscacs and, I suppose, Nirvana use threads, with the hash table shared between the threads.
Honestly I don't see the point: "lazy smp" doesn't pretend to be the best way to do smp (hence LAZY - just to clarify to some individuals ;)
- even though 100+elo compared to mostly _crappy_ YBW implementations that do nothing but wait seems fine to me :)
Of course we have evangelists here who love to spread rumors.
"this doesn't work coz I did it 30 years ago".
Good riddance.
If lazy smp were so lousy we wouldn't get so many negative reactions from stars. Who gives a damn? I don't.
In fact this "community" starts to annoy me.
