hgm wrote:
> Are you using threads or processes? I was using processes. Although I cannot see why this would matter, I cannot exclude it either. When I made the hash mask that isolates the index from the key process-dependent so that each process used a separate part of the shared memory, the speed went back to normal. If both processes used the full table, the nps drops and time-to-depth increases.
I use processes which are all launched at main app launch and pointed to the shared memory.
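A minimal sketch of the process-dependent hash mask described in the quote, assuming a power-of-two table and a power-of-two number of processes. The names `TABLE_BITS`, `table_index`, `process_id` and `process_bits` are hypothetical; how hgm actually implemented it is not stated in the thread.

```python
TABLE_BITS = 20  # assumed table size: 2**20 entries in the shared memory


def table_index(key, process_id=0, process_bits=0):
    """Map a hash key to a slot in the shared table.

    With process_bits == 0 every process indexes the full table.
    With process_bits == k the top k index bits are taken from the
    process id, so each process only touches its own 1/2**k slice.
    """
    local_bits = TABLE_BITS - process_bits
    local_mask = (1 << local_bits) - 1
    return (process_id << local_bits) | (key & local_mask)


if __name__ == "__main__":
    key = 0x9E3779B97F4A7C15
    print(table_index(key))                                # full shared table
    print(table_index(key, process_id=0, process_bits=1))  # lower half only
    print(table_index(key, process_id=1, process_bits=1))  # upper half only
```

Partitioning this way removes any sharing of entries between processes, which is useful for isolating whether the slowdown comes from contention on the table rather than from the search itself.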
lazy smp questions
Moderators: hgm, Rebel, chrisw
-
- Posts: 1357
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: lazy smp questions
Dann Corbit wrote:
> Do you count a hash hit as a node?
I do. Not sure about Crafty. Note that I do not probe the hash tables in qsearch, though.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: lazy smp questions
Dann Corbit wrote:
> Do you count a hash hit as a node?
I count each new node reached as a "node". When I make a move, I increment the counter, as that is certainly a new node. What happens at the next recursive search level I don't know. It could be a repetition, a 50-move draw, a hash hit, or a search.
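The counting convention described above can be sketched on a toy game graph. This is only an illustration, not Crafty's code: the `Searcher` class, the `POSITIONS` table and the plain negamax without pruning are all assumptions made for the example.

```python
# Toy position graph: a name maps to a list of child names or to a leaf score
# (from the point of view of the side to move). "t" is reachable by two paths.
POSITIONS = {"root": ["a", "b"], "a": ["t"], "b": ["t"], "t": 3}


class Searcher:
    def __init__(self, positions):
        self.positions = positions
        self.table = {}      # stand-in for the hash table
        self.nodes = 0       # incremented every time a move is made
        self.hash_hits = 0

    def make_move_and_search(self, child):
        # The node is counted here, before we know what the child turns out to be.
        self.nodes += 1
        return self.search(child)

    def search(self, name):
        if name in self.table:
            # Hash hit: already counted as a node by the caller.
            self.hash_hits += 1
            return self.table[name]
        entry = self.positions[name]
        if isinstance(entry, int):
            score = entry
        else:
            score = max(-self.make_move_and_search(c) for c in entry)
        self.table[name] = score
        return score


if __name__ == "__main__":
    s = Searcher(POSITIONS)
    print(s.search("root"), s.nodes, s.hash_hits)
```

In this toy run the second visit to "t" is a hash hit, yet it still contributes to `nodes`, because the increment happens at make-move time.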
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: lazy smp questions
mar wrote:
> cdani wrote:
>> Cheng, I suppose Nirvana, and Andscacs use threads, and with shared hash between threads.
> Honestly I don't see the point: "lazy smp" doesn't pretend to be the best way to do smp (hence LAZY - just to clarify to some individuals)
> - even though 100+elo compared to mostly _crappy_ YBW implementations that do nothing but wait seems fine to me
> Of course we have evangelists here who love to spread rumors.
> "this doesn't work coz I did it 30 years ago".
> Good riddance.
> if lazy smp was so lousy we wouldn't get so many negative reactions from stars. who gives a damn. I don't.
> In fact this "community" starts to annoy me.
What's with the hostility? (a) It is certainly known to be a poor algorithm. Those that choose to use it certainly have the right to do so and it doesn't matter to me one bit. (b) I simply asked a question about numbers that looked a bit odd, nothing more, nothing less. Sometimes such comments lead to a bug being discovered and fixed.
To date, there is only one optimal way, perhaps another one or two that are fairly close to optimal approaches, and then the rest fall farther down the performance ladder. There are certainly better approaches than this that were used 30 years ago, not that that means anything in particular.
this:
> if lazy smp was so lousy we wouldn't get so many negative reactions from stars.
I don't begin to understand. I think you might mean the opposite, that you WOULD get negative reactions.
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: lazy smp questions
bob wrote:
> (a) it is certainly known to be a poor algorithm.
Well, I get +136 on CEGT 40/4, 4 cores vs 1. Error bars are very high, sure.
Until your 25.0 is out with alleged "linear speedup", I don't see a single engine that would perform significantly better on 4 cores doing smp "right".
Feel free to neglect it, say what you want, I really don't care.
Yet somehow independent tests show something else than what you claim...
Using capital letters won't change it a bit.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: lazy smp questions
mar wrote:
> bob wrote:
>> (a) it is certainly known to be a poor algorithm.
> Well, I get +136 on CEGT 40/4, 4 cores vs 1. Error bars are very high, sure.
> Until your 25.0 is out with alleged "linear speedup", I don't see a single engine that would perform significantly better on 4 cores doing smp "right".
> Feel free to neglect it, say what you want, I really don't care.
> Yet somehow independent tests show something else than what you claim...
> Using capital letters won't change it a bit.
Error bars are meaningless at that level. 136 Elo means 4x faster. I can tell you THAT is not happening. My 25.0 does NOT support linear speedup if you mean 16x faster on 16 cores. I have certainly been getting 13x on 20 cores or better, but linear would be 20x, which is not so likely.
I don't need to know "results" when I know the theory behind parallel search. 4x is NOT going to consistently happen with lazy SMP. 2x would be a remarkable result when tested in a reasonable way...
There are good papers to read that clue you in on the overhead, and the limitations on speedup for various approaches.
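The arithmetic behind "136 Elo means 4x faster" can be sketched as below. It assumes, purely for illustration, that Elo gain is proportional to the number of speed doublings, and it derives the per-doubling constant from the post's own equivalence rather than from any measurement. The function and constant names are hypothetical.

```python
import math

# Assumption taken from the post: +136 Elo corresponds to a 4x speedup,
# i.e. two doublings, which gives 68 Elo per doubling.
ELO_PER_DOUBLING = 136 / math.log2(4)


def elo_from_speedup(speedup, elo_per_doubling=ELO_PER_DOUBLING):
    """Elo gain implied by an effective speedup under the log-linear assumption."""
    return elo_per_doubling * math.log2(speedup)


def speedup_from_elo(elo, elo_per_doubling=ELO_PER_DOUBLING):
    """Effective speedup implied by an Elo gain under the same assumption."""
    return 2.0 ** (elo / elo_per_doubling)


if __name__ == "__main__":
    print(speedup_from_elo(136))  # the 4x claimed in the post
    print(elo_from_speedup(2))    # what a 2x speedup would be worth
```

The per-doubling value depends heavily on time control and engine strength, so the constant here is a placeholder for whatever a given test setup actually yields.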
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: lazy smp questions
Martin Sedlak wrote:
> Well, I get +136 on CEGT 40/4, 4 cores vs 1. Error bars are very high, sure.
The error bars are not so high. Combining them I get +-26. So still 110 elo in the worst case (with 95% confidence).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
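One common way to combine two error bars is addition in quadrature, which assumes the two rating estimates are independent. Whether that is exactly how the +-26 above was obtained is not stated in the post. The individual bars used below (18 and 19) are hypothetical values chosen only to land near 26; they are not CEGT's numbers.

```python
import math


def combined_error(err_a, err_b):
    """Error bar of a rating difference, assuming independent errors."""
    return math.hypot(err_a, err_b)


def worst_case_gap(diff, err_a, err_b):
    """Lower end of the interval for the rating difference."""
    return diff - combined_error(err_a, err_b)


if __name__ == "__main__":
    # Hypothetical individual 95% error bars for the 4-core and 1-core entries.
    print(combined_error(18, 19))       # a little over 26
    print(worst_case_gap(136, 18, 19))  # just under 110
```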
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: lazy smp questions
Michel wrote:
> Martin Sedlak wrote:
>> Well, I get +136 on CEGT 40/4, 4 cores vs 1. Error bars are very high, sure.
> The error bars are not so high. Combining them I get +-26. So still 110 elo in worst case (with 95% confidence).
You have to look at all the data. For example, look at the average opponent rating for cheng 4cpu vs cheng 1cpu. Cheng 1cpu played against opponents averaging about 50 Elo stronger than those of cheng 4cpu. What would you expect that to cause? Make cheng 1cpu look weaker? You can't compare Elo numbers between partially or fully disjoint sets of opponents...
I wouldn't guess at either the rating difference or the Elo difference given that data. After it has been running a while, perhaps. But until the average opponent ratings get closer, you already have a 50 Elo error potential.
I've said this MANY times. To measure parallel performance, or even programming changes, you need a really stable test environment. Same opponents, same everything except for the changes to your own program. Then you get some pretty accurate data. Here, there are so many degrees of freedom in the test that comparison is difficult to impossible.
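To see how the opponents' ratings enter a rating estimate at all, here is a sketch using the standard logistic Elo model. It is a simplified performance-rating calculation, not what a rating list actually runs (those fit all results jointly), and the score and rating values are made up for the example.

```python
import math


def expected_score(rating, opponent_rating):
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def performance_rating(score_fraction, avg_opponent_rating):
    """Rating at which the expected score equals the observed score fraction."""
    return avg_opponent_rating + 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


if __name__ == "__main__":
    # Same score fraction against opponent pools whose assumed averages differ by 50.
    print(performance_rating(0.60, 2700))
    print(performance_rating(0.60, 2750))
```

Under this model the estimate is anchored to the opponents' ratings, so any uncertainty in those ratings carries over directly into the estimate for the engine under test.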
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: lazy smp questions
Robert Hyatt wrote:
> You have to look at all the data. For example, look at the average opponent rating for cheng 4cpu vs cheng 1cpu. 1cpu played against an opponent average about 50 Elo stronger than cheng 4cpu. What would you expect that to cause? Make cheng 1cpu look weaker? You can't compare Elo numbers between partially or fully disjoint sets of opponents...
Why not? As long as the graph is connected the comparison is fine.
If A plays B and B plays C and C plays D you can still compare A and D. The comparison via intermediate engines just blows up the error bars.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
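The A-B-C-D argument can be sketched as a walk over the graph of pairings. The sketch assumes each pairing has its own independent error bar and follows only the path with the fewest links; real rating tools fit all results at once, so treat this as an illustration of the idea rather than how any rating list computes its numbers. The function name and the error values are hypothetical.

```python
import math
from collections import deque


def chained_error(matches, start, goal):
    """Error bar for comparing start and goal through intermediate engines.

    matches: iterable of (engine_a, engine_b, error_bar) pairings.
    Returns None when the two engines are not connected at all.
    """
    graph = {}
    for a, b, err in matches:
        graph.setdefault(a, []).append((b, err))
        graph.setdefault(b, []).append((a, err))

    queue = deque([(start, 0.0)])  # (engine, accumulated variance)
    seen = {start}
    while queue:
        node, variance = queue.popleft()
        if node == goal:
            return math.sqrt(variance)
        for nxt, err in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, variance + err * err))
    return None


if __name__ == "__main__":
    chain = [("A", "B", 20), ("B", "C", 20), ("C", "D", 20)]
    print(chained_error(chain, "A", "D"))  # larger than any single link's 20
    print(chained_error([("A", "B", 20), ("C", "D", 20)], "A", "D"))  # None
```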
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: lazy smp questions
Michel wrote:
> Robert Hyatt wrote:
>> You have to look at all the data. For example, look at the average opponent rating for cheng 4cpu vs cheng 1cpu. 1cpu played against an opponent average about 50 Elo stronger than cheng 4cpu. What would you expect that to cause? Make cheng 1cpu look weaker? You can't compare Elo numbers between partially or fully disjoint sets of opponents...
> Why not? As long as the graph is connected the comparison is fine.
> If A plays B and B plays C and C plays D you can still compare A and D. The comparison via intermediate engines just blows up the error bars.
That was my point. If the average rating of player A's opponents is X, and the average rating of player B's opponents is X+50, it is going to be VERY difficult to compare their ratings with any accuracy and use the resulting Elo numbers to predict the outcome between the two versions. The two versions of the original program are different, the average opponents are different, so WHICH is responsible for the Elo gain or loss?
So in that specific CEGT comparison, the error bars are not +/-26. They are more like +/- 75...
In this case, saying A is +130 better than B is quite inaccurate. It is most likely better, to be sure. But how much better is much harder to determine without more data points.