Trying to improve lazy smp
Posted: Sat Apr 11, 2015 9:09 pm
Hi.
I had some bad scaling with Andscacs on more than 4 cores, as you may have seen in the general section:
2 threads: +61
4 threads: +108
6 threads: +117
7 threads: +119
8 threads: +96
Initially I thought that maybe there was a bug, or that some computer limitation like a cache collapse was kicking in, or maybe that my careless disregard for locking and other such things was finally catching up with me.
Of course, any of these is still possible even with the change I explain here, but I hope that is not the case. I can only hope, because I have no experience with this, of course.
So I thought that it was not very logical that so many threads were searching at the same two depths:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1)
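Read literally, and assuming Depth is a non-negative integer, the expression above adds 1 when Depth is odd and 0 when it is even, so it is the same as Depth + (Depth & 1). Here is a minimal Python sketch to check that. The function name `new_depth_base` is made up for illustration, and the post does not say how the formula is applied per thread, so this only evaluates the arithmetic:

```python
def new_depth_base(depth: int) -> int:
    """Original formula from the post, evaluated literally."""
    # ((depth + 1) & 1) is 1 for even depth and 0 for odd depth;
    # XOR with 1 flips that, so the added term is 1 only for odd depth.
    return depth + (((depth + 1) & 1) ^ 1)


if __name__ == "__main__":
    for depth in range(1, 9):
        print(depth, "->", new_depth_base(depth))
```

So each odd Depth and the even Depth right after it map to the same NewDepth.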
Initially I changed this to
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 5)
and I obtained
8 threads: +115
Not bad.
I tried to be a little more aggressive:
NewDepth = Depth + (((Depth + 1) & 1) ^ 1) + (Depth > 4) + (Depth > 6)
and I obtained
8 threads: +120
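To see what the two changes do to the depths, here is a small Python sketch that tabulates the original formula and both modified versions. It assumes the comparisons count as 0 or 1, as they would in C; the function names are hypothetical and only the arithmetic from the post is evaluated, not how the engine uses it per thread:

```python
def base(depth: int) -> int:
    # Original formula: adds 1 only when depth is odd.
    return depth + (((depth + 1) & 1) ^ 1)


def first_change(depth: int) -> int:
    # One extra ply once depth is above 5.
    return base(depth) + int(depth > 5)


def second_change(depth: int) -> int:
    # One extra ply above depth 4, and another above depth 6.
    return base(depth) + int(depth > 4) + int(depth > 6)


if __name__ == "__main__":
    print("Depth  original  first  second")
    for depth in range(1, 11):
        print(f"{depth:5d}  {base(depth):8d}  "
              f"{first_change(depth):5d}  {second_change(depth):6d}")
```

For Depth = 7, for example, the three versions give 8, 9 and 10.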
Now I'm trying something more aggressive. I will report.
I don't know if this has happened to any of you with lazy SMP, or if someone has tried something like what I'm trying now.
After these attempts, I will try to get access to a 12- or 16-core machine, to see how this must be modified to scale well on such machines. Maybe some of you have experience with an ISP that offers such services. I would pay for just one month, because I suppose it will not be cheap, but I think that will be enough.