I found this table on Bob's website.
+---------------+-----+-----+-----+-----+------+
|# processors | 1 | 2 | 4 | 8 | 16 |
+---------------+-----+-----+-----+-----+------+
|speedup | 1.0 | 2.0 | 3.7 | 6.6 | 11.1 |
+---------------+-----+-----+-----+-----+------+
Where can I find such data for Stockfish and recent Crafty?
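One way to read the table is as parallel efficiency, i.e. speedup divided by the number of processors. The short Python sketch below uses only the figures from the table; the function name `parallel_efficiency` is just an illustrative choice.

```python
# Speedup figures copied from the table above.
PROCESSORS = [1, 2, 4, 8, 16]
SPEEDUP = [1.0, 2.0, 3.7, 6.6, 11.1]


def parallel_efficiency(speedups, processors):
    """Return speedup / processor count for each column of the table."""
    return [s / p for s, p in zip(speedups, processors)]


if __name__ == "__main__":
    for p, e in zip(PROCESSORS, parallel_efficiency(SPEEDUP, PROCESSORS)):
        print(f"{p:2d} processors: {e:.1%} efficiency")
```

With these numbers, efficiency stays at 100% for 2 processors and falls to roughly 69% at 16.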
question on performance of DTS
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: question on performance of DTS
liuzy wrote: I found this table on Bob's website.
+---------------+-----+-----+-----+-----+------+
|# processors | 1 | 2 | 4 | 8 | 16 |
+---------------+-----+-----+-----+-----+------+
|speedup | 1.0 | 2.0 | 3.7 | 6.6 | 11.1 |
+---------------+-----+-----+-----+-----+------+
Where can I find such data for Stockfish?

Nowhere.

Nobody has ever built such a table for SF, as far as I know.
BTW, although I concede that such a table, based on nodes/sec on a given hardware, has some validity, I also think it could be misleading because it does not reflect a corresponding Elo increase.
Indeed, Elo is much more complex than counting nodes, because what matters is not only how many nodes you search but which nodes you search.
For instance, you could have an SMP implementation with a very good speedup but a slow mechanism for stopping the running threads when one of them finds a cut-off. In that case it remains to be demonstrated that such an implementation is better than one with a smaller speedup but more advanced inter-thread synchronization.
This is just one example; there are others. They have convinced me that such tables are very misleading and naive, just like comparing CPUs of different architectures only by their clock frequency.

-
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: question on performance of DTS
I think you got it wrong. The numbers mentioned are time to complete a task (e.g. fixed depth = 20), not NPS scaling.
To the OP: Neither Stockfish nor Crafty does DTS. I know ZCT uses it.
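To make the distinction concrete, here is a minimal Python sketch of a time-to-depth speedup: run the same positions to the same fixed depth with 1 thread and with N threads, then divide the total times. The timings below are invented placeholders, not measurements of any engine, and `time_to_depth_speedup` is a hypothetical helper name.

```python
def time_to_depth_speedup(single_times, parallel_times):
    """Speedup = total 1-thread time / total N-thread time, same positions and depth."""
    if len(single_times) != len(parallel_times):
        raise ValueError("both runs must cover the same positions")
    return sum(single_times) / sum(parallel_times)


# Hypothetical seconds to reach a fixed depth on three positions (made-up values).
one_thread = [40.0, 90.0, 30.0]
four_threads = [10.0, 30.0, 10.0]

print(f"time-to-depth speedup: {time_to_depth_speedup(one_thread, four_threads):.2f}x")
```

Dividing total times is only one way to aggregate; averaging the per-position ratios is another, and the two can give different answers.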
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: question on performance of DTS
Daniel Shawul wrote: I think you got it wrong. The numbers mentioned are time to complete a task (e.g. fixed depth = 20), not NPS scaling.
To the OP: Neither Stockfish nor Crafty does DTS. I know ZCT uses it.

OK. Sorry for the noise then.

Re: question on performance of DTS
Marco Costalba, can you run some tests for Stockfish?
I cannot test it myself, because I don't have a 16-core CPU.
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: question on performance of DTS
liuzy wrote: Marco Costalba, can you run some tests for Stockfish?
I cannot test it myself, because I don't have a 16-core CPU.

Neither do I.
Re: question on performance of DTS
Why don't Stockfish and Crafty use DTS, given that its performance is very good?
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: question on performance of DTS
liuzy wrote: I found this table on Bob's website.
+---------------+-----+-----+-----+-----+------+
|# processors | 1 | 2 | 4 | 8 | 16 |
+---------------+-----+-----+-----+-----+------+
|speedup | 1.0 | 2.0 | 3.7 | 6.6 | 11.1 |
+---------------+-----+-----+-----+-----+------+
Where can I find such data for Stockfish and recent Crafty?

Crafty data has been posted on CCC several times. Unfortunately, most of the testing has been with 8 cores, although I have posted some 16-core data. In general, the current Crafty data is worse for smaller numbers of processors, but is reasonably close for 16. The last run I had on 16 cores was actually 11.5x, but the numbers are hard to compare, since CB was doing 10-ply searches and Crafty is well beyond 20, which helps parallel search (deeper = better, particularly if you split at the root).
I hardly ever see anyone else post parallel speedup data. I think there is some raw 8-CPU data (just large log files for various numbers of processors, all tested on exactly the same set of positions that came from the CB DTS paper).
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: question on performance of DTS
mcostalba wrote:
liuzy wrote: I found this table on Bob's website.
+---------------+-----+-----+-----+-----+------+
|# processors | 1 | 2 | 4 | 8 | 16 |
+---------------+-----+-----+-----+-----+------+
|speedup | 1.0 | 2.0 | 3.7 | 6.6 | 11.1 |
+---------------+-----+-----+-----+-----+------+
Where can I find such data for Stockfish?
Nowhere.
Nobody has ever built such a table for SF, as far as I know.
BTW, although I concede that such a table, based on nodes/sec on a given hardware, has some validity, I also think it could be misleading because it does not reflect a corresponding Elo increase.

So if a program runs 1.7x faster on 2 processors, that won't affect the Elo the same way as running on a single CPU that is 1.7x faster? None of my speedup data is about NPS. It is all about time to a specific depth, which is a real performance measurement that does predict Elo accurately.

mcostalba wrote: Indeed, Elo is much more complex than counting nodes, because what matters is not only how many nodes you search but which nodes you search.

Now if only the data he gave was counting nodes. But it wasn't.

mcostalba wrote: For instance, you could have an SMP implementation with a very good speedup but a slow mechanism for stopping the running threads when one of them finds a cut-off. In that case it remains to be demonstrated that such an implementation is better than one with a smaller speedup but more advanced inter-thread synchronization.

That makes absolutely no sense to anyone familiar with parallel search. Time-to-depth is comparable to time-to-depth, whether the search is done in parallel or on faster hardware. We are _not_ measuring raw NPS and using that. If I did, both CB and Crafty would weigh in with a 16x speedup on 16 processors. But we don't measure parallel performance in such a flawed way. Never have, in fact. It is useful to compare NPS to see how much performance is lost to pure parallel issues (such as cache coherency, memory conflicts, etc.), but we don't consider any of that when reporting parallel speedup. At least no one I know of does.

mcostalba wrote: This is just one example; there are others. They have convinced me that such tables are very misleading and naive, just like comparing CPUs of different architectures only by their clock frequency.

You are _very_ naive to make that statement about something being naive.
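The gap between NPS scaling and time-to-depth speedup can be put into numbers. Assuming search time is simply nodes searched divided by NPS (a simplification), the time-to-depth speedup equals the NPS scaling divided by the factor by which the parallel search tree grows. The Python sketch below applies that to the 16-processor figures quoted in this thread (11.1x time-to-depth, and the 16x NPS figure mentioned in this post); `implied_tree_growth` is a made-up helper name.

```python
def implied_tree_growth(nps_scaling, time_to_depth_speedup):
    """Factor by which the parallel search visits more nodes than the serial one.

    Assumes time = nodes / NPS, so
    speedup = nps_scaling / tree_growth  =>  tree_growth = nps_scaling / speedup.
    """
    if time_to_depth_speedup <= 0:
        raise ValueError("speedup must be positive")
    return nps_scaling / time_to_depth_speedup


# 16x NPS and 11.1x time-to-depth are the 16-processor figures from the thread.
growth = implied_tree_growth(16.0, 11.1)
print(f"implied tree growth: {growth:.2f}x")
```

Under that assumption, the 16-processor search would be visiting about 1.44 times as many nodes as the serial search to reach the same depth.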

-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: question on performance of DTS
liuzy wrote: Why don't Stockfish and Crafty use DTS, given that its performance is very good?

Current Crafty is pretty good as well. DTS eliminates recursive search, which I didn't want to give up. So I implemented YBW in a way that is fairly close to DTS, but without having to go to a pure iterative (non-recursive) search. I may do this one day, but the recursive search certainly is cleaner and easier to understand.
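For readers unfamiliar with YBW (Young Brothers Wait), the core idea is to search the first move at a node serially and only then hand the remaining moves to other threads, which fits naturally into a recursive search. The Python sketch below is a toy illustration on a hand-built tree, not Crafty's code: it splits only at the root, never aborts siblings after a cutoff, and the names `negamax`, `ybw_root_search` and `TREE` are my own. Python threads will not give a real speedup here because of the GIL; the point is only the control flow.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def negamax(node, alpha, beta):
    """Plain recursive fail-soft alpha-beta on a toy tree.

    A node is either an int (leaf score for the side to move)
    or a list of child nodes.
    """
    if isinstance(node, int):
        return node
    best = -math.inf
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: remaining siblings cannot change the result
    return best


def ybw_root_search(root, workers=4):
    """Young Brothers Wait, applied at the root only (a simplification)."""
    if isinstance(root, int):
        return root
    eldest, *younger = root
    # Step 1: the eldest brother is searched serially to establish a bound.
    best = -negamax(eldest, -math.inf, math.inf)
    if not younger:
        return best
    # Step 2: the younger brothers are searched in parallel, all using
    # the bound from the eldest brother as their window.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(negamax, child, -math.inf, -best)
                   for child in younger]
        for future in futures:
            best = max(best, -future.result())
    return best


# Hypothetical two-ply tree: three root moves, two replies each.
TREE = [[3, 5], [2, 9], [0, 7]]
print(ybw_root_search(TREE))
```

A real engine would also split deeper in the tree and stop helper threads as soon as one of them produces a cutoff, which is where most of the implementation effort goes.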