Hi.
I had some bad scaling with Andscacs with more than 4 cores, as maybe you have seen in the general section:
2 threads: +61
4 threads: +108
6 threads: +117
7 threads: +119
8 threads: +96
Initially I thought that maybe there was a bug, or that some computer limitation like some cache collapse was kicking in, or maybe that my careless disregard of locking and other stuff was finally catching up with me.
Of course any of this is still possible even with the change I explain here, but I hope it is not the case. I can only hope, because I don't have experience with this.
So I thought that it was not very logical that so many threads were searching at the same two depths:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1)
Initially I changed this to
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 5)
and I obtained
8 threads: +115
Not bad.
I tried to be a little more aggressive:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 4) + (Depth > 6)
and I obtained
8 threads: +120
Now I'm trying something more aggressive. I will report.
I don't know if this has happened to some of you with lazy SMP, or if someone has tried something like what I'm trying now.
After those attempts, I will try to get access to a 12 or 16 core machine, to see how this must be modified to scale well on those machines. Maybe some of you have experience with a provider that offers such services. I will pay for just a month, because I suppose it will not be cheap, but I think it will be enough.
Trying to improve lazy smp
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Trying to improve lazy smp
Daniel José - http://www.andscacs.com
-
- Posts: 12541
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Trying to improve lazy smp
Didn't Dan Homan do something like this with ExChess?
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Trying to improve lazy smp
Dann Corbit wrote: Didn't Dan Homan do something like this with ExChess?
Yes. This:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1)
So I extended it.
Daniel José - http://www.andscacs.com
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Trying to improve lazy smp
I don't have an 8-core machine here, but something seems odd.
Are you sure you tested on a real 8-core machine (not on a quad with HT on)?
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Trying to improve lazy smp
mar wrote: I don't have an 8-core machine here, but something seems odd. Are you sure you tested on a real 8-core machine (not on a quad with HT on)?
Of course. AMD FX-8350.
If you want I can test your engine to see if there is similar behavior.
Daniel José - http://www.andscacs.com
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Trying to improve lazy smp
New improvement:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 2) + (Depth > 4) + (Depth > 6)
4 threads: from +108 to +117
8 threads: from +120 to +134
So of course I'm trying something even more aggressive.
The updated version is here, only for 64-bit with popcnt:
http://www.andscacs.com/andscacs074024.zip
I tried to find information about how this compares to other engines and I found this thread:
http://talkchess.com/forum/viewtopic.php?t=55563
It's important to say that my tests are against a gauntlet, not against Andscacs itself.
So with 4 threads Andscacs gains 117 Elo against a gauntlet, and Zappa Mexico II, "known to scale particularly well", better than Stockfish, obtains 114 but in self-play.
I know that it is more difficult for a stronger engine to gain against itself because of diminishing returns.
Do you think this holds or compares well? Is there any chance that lazy SMP is better than other ways of doing MP?
My intuition (I rely on it because I have no experience with all this) is that lazy SMP, since it needs no synchronization between threads, will at least be more lightweight, and that the more threads there are, the better the gains can be. Of course I will continue testing all this.
Daniel José - http://www.andscacs.com
-
- Posts: 855
- Joined: Sun May 23, 2010 1:32 pm
Re: Trying to improve lazy smp
I'm having similar problems with lazy SMP in Vajolet.
I was thinking about letting some threads search the second-best move, something like a parallel MultiPV, but so far I have never tried it.
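One way to picture the "parallel MultiPV" idea above, as an untested sketch only: each helper gets the root move list with the current top moves removed, the way MultiPV line k+1 would. The function name `root_moves_for_thread`, the `lines` parameter, and the thread-id scheme are all hypothetical and not taken from Vajolet.

```python
def root_moves_for_thread(ordered_root_moves, thread_id, lines=2):
    """Return the root moves one thread should search.

    ordered_root_moves: root moves sorted best-first by the previous iteration.
    lines: hypothetical number of MultiPV-style lines spread over the threads.
    Thread ids 0, lines, 2*lines, ... search every move; the others skip the
    current top move(s), so they look for the second-best, third-best, ... line.
    """
    if not ordered_root_moves:
        return []
    # Never skip so many moves that nothing is left to search.
    skip = min(thread_id % lines, len(ordered_root_moves) - 1)
    return ordered_root_moves[skip:]


moves = ["e2e4", "d2d4", "g1f3"]
for thread_id in range(4):
    print(thread_id, root_moves_for_thread(moves, thread_id))
```

With `lines=2`, even thread ids search the full list and odd thread ids search everything except the current best move. Whether this gains anything would have to be measured.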
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Trying to improve lazy smp
cdani wrote: If you want I can test your engine to see if there is similar behavior.
Thanks, you don't have to. Peter did some testing recently and I know that cheng scales beyond 4 cores (somehow).
As for your formula:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 2) + (Depth > 4) + (Depth > 6)
Where is the thread id? This would mean you have the same depth for each helper thread.
It seems to me that your problem might be that you don't terminate the iteration when a helper finishes before the "master", but I may be wrong.
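The point about terminating the iteration can be sketched roughly as follows. This is a toy simulation and not code from any engine: the "search" is only a timed wait, the per-ply costs are artificial, and `run_iteration` and the "odd helpers search one ply deeper" rule are hypothetical. The idea shown is that the first thread to complete its depth publishes its result and raises a shared stop flag, so every other thread, master included, abandons the iteration.

```python
import threading


def run_iteration(depth, ply_cost_by_thread):
    """Simulate one iteration; return (thread_id, depth) of the first finisher."""
    stop = threading.Event()
    lock = threading.Lock()
    result = {}

    def worker(thread_id, target_depth, ply_cost):
        for _ in range(target_depth):
            # Event.wait is an interruptible sleep standing in for search work.
            if stop.wait(ply_cost):
                return  # another thread already finished this iteration
        with lock:
            if not result:
                result["thread_id"] = thread_id
                result["depth"] = target_depth
                stop.set()  # tell every other thread, master included, to stop

    threads = []
    for thread_id, ply_cost in enumerate(ply_cost_by_thread):
        # Hypothetical rule: odd helpers search one ply deeper than the master.
        target = depth + (thread_id & 1)
        t = threading.Thread(target=worker, args=(thread_id, target, ply_cost))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return result["thread_id"], result["depth"]


# Thread 0 (master) is artificially slow, thread 1 (helper) artificially fast.
print(run_iteration(depth=4, ply_cost_by_thread=[1.0, 0.001]))
```

Here the helper completes depth 5 long before the master completes depth 4, and the master stops instead of spending the rest of its time on a shallower search. In a real engine the helper would typically finish early because of entries already in the shared hash table, not because of a fixed cost per ply.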
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Trying to improve lazy smp
mar wrote: As for your formula:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 2) + (Depth > 4) + (Depth > 6)
Where is thread id? this would mean you have same depth for each helper thread.
It seems to me that your problem might be that you don't terminate iteration when a helper finishes before "master", but I may be wrong.
Ooops!
Because I translated it from Catalan to English I made a mistake. The correct one is:
NewDepth = Depth + (((thread_id + 1) & 1) ^ 1) + (thread_id > 2) + (thread_id > 4) + (thread_id > 6)
The same applies to all the other code I have posted on this page.
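To see what the corrected formula does per thread, here is a small sketch that tabulates the extra depth for thread ids 0 to 7. It assumes thread ids start at 0 (the thread does not say), and the names `depth_offset` and `thresholds` are made up for illustration. The `thresholds` tuple reproduces the variants posted earlier: `()` for the original, `(5,)`, `(4, 6)` and `(2, 4, 6)`.

```python
def depth_offset(thread_id, thresholds=(2, 4, 6)):
    """Extra plies added to Depth for one thread (thread ids assumed 0-based)."""
    parity = ((thread_id + 1) & 1) ^ 1  # 1 for odd thread ids, 0 for even ones
    steps = sum(int(thread_id > t) for t in thresholds)
    return parity + steps


def offsets(thresholds, threads=8):
    """Offsets for thread ids 0..threads-1 under one variant of the formula."""
    return [depth_offset(i, thresholds) for i in range(threads)]


for variant in [(), (5,), (4, 6), (2, 4, 6)]:
    print(variant, offsets(variant))
```

With the original formula the eight threads only cover offsets 0 and 1. With `(2, 4, 6)` they are spread over offsets 0 to 4, with the highest thread ids searching deepest.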
Daniel José - http://www.andscacs.com
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Trying to improve lazy smp
mar wrote: It seems to me that your problem might be that you don't terminate iteration when a helper finishes before "master", but I may be wrong.
Yes, I have to try this also.
Daniel José - http://www.andscacs.com