SMP: on same branch instead splitting?

Highluder
Posts: 8
Joined: Fri Jun 28, 2013 7:03 am
Location: Germany

SMP: on same branch instead splitting?

Post by Highluder » Fri Jan 23, 2015 9:09 am

Hi,

Does anybody have experience with what happens when (all?) SMP threads analyse the same position?

I've read that one engine uses this successfully. Because the other processes store evaluated positions in the same hash table, this is supposed to speed up the (main) process.

Has anyone tested this? How do you avoid different processes doing redundant work by analysing the same position at the same time, before it is stored in the hash table?

What happens when the hash table gets full? Does the search slow down or not?

How does it work with many (e.g. 16) CPUs? Is its scalability better than that of normal SMP?

Highluder
Posts: 8
Joined: Fri Jun 28, 2013 7:03 am
Location: Germany

Re: SMP: on same branch instead splitting?

Post by Highluder » Fri Jan 23, 2015 10:35 am

The original text said:

"So, it seems instead of parking threads when they have no work to do, it's better to spin them on the same search tree, since the hashtable was already mostly filled out."

This may improve SMP performance.

Perhaps this can be generalized in some way I don't know of. Maybe somebody will find a clever trick for doing SMP other than classical splitting. Maybe searching a few knodes again is cheaper than the splitting overhead.

Komodo 8 has a very effective SMP. Everyone is wondering how they do it.


Frank

bob
Posts: 20562
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: SMP: on same branch instead splitting?

Post by bob » Fri Jan 23, 2015 12:11 pm

Highluder wrote:
Hi,

Does anybody have experience with what happens when (all?) SMP threads analyse the same position?

I've read that one engine uses this successfully. Because the other processes store evaluated positions in the same hash table, this is supposed to speed up the (main) process.

Has anyone tested this? How do you avoid different processes doing redundant work by analysing the same position at the same time, before it is stored in the hash table?

What happens when the hash table gets full? Does the search slow down or not?

How does it work with many (e.g. 16) CPUs? Is its scalability better than that of normal SMP?

Speedup is lousy for more than 2 processors. In fact, it is lousy for 2 as well... There are lots of old threads on this topic...

mar
Posts: 2001
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: SMP: on same branch instead splitting?

Post by mar » Fri Jan 23, 2015 12:47 pm

Hard to say. I'm using lazy SMP as described by Dan Homan, i.e. each other thread starts crunching on depth+1.
Whenever one of the helpers (or slaves if you wish) finishes the iteration, all the others are aborted immediately.
The good thing is that there is zero synchronization/copying overhead except at the start of each iteration, there is no need to specify a minimum split depth and, most importantly, the implementation is trivial compared to YBW.
I don't have enough data on this, but judging from CCRL I get about 100 Elo for 4 cores vs 1.
When I look at YBW engines, they get about the same.
Also, from what I've seen in TCEC, it had no problems competing with state-of-the-art SMP implementations (could have been luck, of course).
I also suspect that (some?) YBW engines don't scale at all above 8 cores, but I have no data on how (if at all) lazy SMP scales above 4 cores.
I certainly don't plan to switch to anything else.
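
For readers who want to see the shape of the scheme, here is a minimal sketch of the idea described above. It is not taken from any engine. The toy tree (`children`, `evaluate`), the names `negamax` and `lazy_smp`, and the choice to send odd-numbered threads to depth+1 are all assumptions made for illustration, and the last of these is only one reading of "each other thread". To keep every stored score exact, the sketch uses plain negamax without alpha-beta and keys the shared table on (position, remaining depth); a real engine would use alpha-beta with bound flags and a fixed-size table.

```python
import threading

BRANCHING = 3  # toy tree: every position has three children (assumption)

def children(pos):
    # Hypothetical move generator: derive child positions arithmetically.
    return [(pos * 31 + k * 17 + 7) % 1_000_003 for k in range(BRANCHING)]

def evaluate(pos):
    # Hypothetical static evaluation in the range -50..50.
    return (pos * 2654435761 % 101) - 50

class Aborted(Exception):
    """Raised inside a thread once another thread has finished the iteration."""

def negamax(pos, depth, table, stop):
    if stop.is_set():
        raise Aborted()
    if depth == 0:
        return evaluate(pos)
    key = (pos, depth)
    cached = table.get(key)          # shared table: may have been filled by any thread
    if cached is not None:
        return cached
    best = max(-negamax(c, depth - 1, table, stop) for c in children(pos))
    table[key] = best                # exact score, so safe for every thread to reuse
    return best

def worker(root, depth, table, stop, results, lock):
    try:
        score = negamax(root, depth, table, stop)
    except Aborted:
        return
    with lock:
        results.append((depth, score))
    stop.set()                       # the first finisher aborts all the others

def lazy_smp(root, max_depth, n_threads=4):
    table = {}                       # the only state shared between threads
    depth, best = 1, None
    while depth <= max_depth:
        stop, lock, results = threading.Event(), threading.Lock(), []
        threads = []
        for i in range(n_threads):
            # Odd-numbered threads start one ply deeper (capped at max_depth).
            d = min(depth + (i % 2), max_depth)
            t = threading.Thread(target=worker,
                                 args=(root, d, table, stop, results, lock))
            threads.append(t)
            t.start()
        for t in threads:
            t.join()
        best = results[0]            # (depth, score) of the first thread to finish
        depth = best[0] + 1          # skip a ply if a depth+1 thread won the race
    return best

if __name__ == "__main__":
    print(lazy_smp(root=1, max_depth=8))
```

The only synchronization is the abort flag and the start and join around each iteration, which matches the "zero overhead except at the start of each iteration" point. The threads at depth+1 pay off mainly by leaving entries in the table that the next iteration picks up.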
