SMP: on same branch instead splitting?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Highluder
Posts: 8
Joined: Fri Jun 28, 2013 9:03 am
Location: Germany

SMP: on same branch instead splitting?

Post by Highluder »

Hi,

Does anybody have experience with what happens when (all?) SMP threads analyse the same position?

I've read that one engine uses this successfully. Because the other processes store the positions they evaluate in the same hash table, this is supposed to speed up the (main) process.

Has anyone tested this? How do you prevent different processes from doing redundant work, i.e. analysing the same position at the same time before it has been stored in the hash table?

What happens when the hash table gets full? Does it slow down or not?

How does it work with many (e.g. 16) CPUs? Does it scale better than normal SMP?
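
To make the question concrete, here is a minimal sketch of the idea: several threads search the same root and share one hash table. Everything in it is an assumption for illustration: the toy subtraction game (take 1 to 3 stones, the player who cannot move loses), the names `search` and `search_same_root`, and a plain dict standing in for the hash table. It shows the control flow only; it is not how any particular engine does it, and Python threads will not give a real speedup.

```python
import threading

def search(stones, table, stats):
    """Full-width negamax for a toy game: take 1-3 stones, no move left = loss."""
    cached = table.get(stones)
    if cached is not None:
        stats["hits"] += 1           # another thread (or this one) already stored it
        return cached
    stats["nodes"] += 1              # this thread has to do the work itself
    if stones == 0:
        score = -1                   # side to move cannot move and loses
    else:
        score = max(
            -search(stones - take, table, stats)
            for take in (1, 2, 3)
            if take <= stones
        )
    table[stones] = score            # publish the result for all threads
    return score

def search_same_root(root, num_threads):
    """Start every thread on the same root; they only share the hash table."""
    table = {}                       # shared by all threads (hypothetical stand-in)
    results = [None] * num_threads
    stats = [{"nodes": 0, "hits": 0} for _ in range(num_threads)]

    def worker(i):
        results[i] = search(root, table, stats[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, stats
```

Nothing here stops two threads from expanding the same position before either has stored it, which is exactly the redundancy asked about; the per-thread `nodes` counters make it visible. Each thread expands a given position at most once, so the total work lies somewhere between one search and `num_threads` searches.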
Highluder
Posts: 8
Joined: Fri Jun 28, 2013 9:03 am
Location: Germany

Re: SMP: on same branch instead splitting?

Post by Highluder »

The original said:

"So, it seems instead of parking threads when they have no work to do, it's better to spin them on the same search tree, since the hashtable was already mostly filled out."

This may improve SMP performance.

Perhaps this can be generalized in some way I don't know of. Maybe somebody will find a clever trick for doing SMP other than classical splitting. Maybe searching some knodes again is faster than the splitting overhead.

Komodo 8 has very effective SMP. Everyone is wondering how they do it.


Frank
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: SMP: on same branch instead splitting?

Post by bob »

Highluder wrote:
Hi,

Does anybody have experience with what happens when (all?) SMP threads analyse the same position?

I've read that one engine uses this successfully. Because the other processes store the positions they evaluate in the same hash table, this is supposed to speed up the (main) process.

Has anyone tested this? How do you prevent different processes from doing redundant work, i.e. analysing the same position at the same time before it has been stored in the hash table?

What happens when the hash table gets full? Does it slow down or not?

How does it work with many (e.g. 16) CPUs? Does it scale better than normal SMP?

Speedup is lousy for more than 2 processors. In fact, it is lousy for 2 as well... there are lots of old threads on this topic...
mar
Posts: 2559
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: SMP: on same branch instead splitting?

Post by mar »

Hard to say. I'm using lazy SMP as described by Dan Homan, i.e. every other thread starts crunching on depth+1.
Whenever one of the helpers (or slaves if you wish) finishes the iteration, all others are aborted immediately.
The good thing is that there is zero synchronization/copying overhead except at the start of each iteration, no need to specify a minimum split depth and, most importantly, the implementation is trivial compared to YBW.
I don't have enough data on this, but judging from CCRL I get about 100 Elo for 4 cores vs 1.
When I look at YBW engines, they get about the same.
Also, from what I've seen in TCEC, it had no problems competing with state-of-the-art SMP implementations (could have been luck, of course).
I also suspect that (some?) YBW engines don't scale at all above 8 cores, but I have no data on how (if at all) lazy SMP scales above 4 cores.
I certainly don't plan to switch to anything else.
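
For readers who have not seen the scheme, the iteration loop described above can be sketched roughly as follows. This is a hedged illustration, not code from any engine: the toy `children` and `evaluate` functions, the dict used as a shared hash table, and the names `negamax` and `lazy_smp` are all assumptions, and "every other thread" is read here as "threads with an odd index search one ply deeper". Python threads do not run the search in parallel, so this shows the control flow only.

```python
import threading

class Aborted(Exception):
    """Raised inside the search when another thread has finished the iteration."""

def children(pos):
    # Hypothetical toy move generator: three successors per position.
    return [(pos * 3 + k) % 1000003 for k in (1, 2, 3)]

def evaluate(pos):
    # Hypothetical static evaluation in the range -100..100.
    return (pos * 7919) % 201 - 100

def negamax(pos, depth, table, stop):
    if stop.is_set():
        raise Aborted                # unwind without storing partial results
    key = (pos, depth)
    cached = table.get(key)
    if cached is not None:
        return cached
    if depth == 0:
        score = evaluate(pos)
    else:
        score = max(-negamax(c, depth - 1, table, stop) for c in children(pos))
    table[key] = score               # only complete results reach the shared table
    return score

def worker(root, depth, table, stop, lock, finished):
    try:
        score = negamax(root, depth, table, stop)
    except Aborted:
        return
    with lock:
        if not finished:             # first thread to finish wins the iteration
            finished.append((depth, score))
            stop.set()               # abort all the others immediately

def lazy_smp(root, max_depth, num_threads=4):
    table = {}                       # shared across threads and iterations
    best = None
    depth = 1
    while depth <= max_depth:
        stop, lock, finished = threading.Event(), threading.Lock(), []
        threads = []
        for i in range(num_threads):
            d = min(depth + (i % 2), max_depth)   # odd threads go one ply deeper
            t = threading.Thread(target=worker,
                                 args=(root, d, table, stop, lock, finished))
            threads.append(t)
            t.start()
        for t in threads:
            t.join()
        best = finished[0]
        depth = best[0] + 1          # continue from whatever depth was completed
    return best                      # (depth reached, score)
```

The only synchronization is the start of each iteration plus the stop flag, which matches the "trivial compared to YBW" point; whether it scales is a separate question that the sketch does not answer.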