I tested the latest Komodo, Houdini and Stockfish for their MP implementation. I played each engine on 4 threads against the same engine on 1 thread, both at fixed depth: depth 11 for Komodo and Houdini, depth 12 for Stockfish.
4 physical i7 cores.
   Program               Score        %      Elo
1  Komodo 4 threads  :   183.0/300    61.0   3039
2  Komodo 1 thread   :   117.0/300    39.0   2961

   Program               Score        %      Elo
1  Houdini 4 threads :   181.0/360    50.3   3001
2  Houdini 1 thread  :   179.0/360    49.7   2999

   Program               Score        %      Elo
1  SF 4 threads      :   221.5/434    51.0   3004
2  SF 1 thread       :   212.5/434    49.0   2996
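For readers who want to check the Elo columns, here is a minimal sketch that converts each match score into an Elo difference. It assumes the standard logistic Elo model; the rating tool used for the tables is not stated, so its figures may differ by a point or two. The function name `elo_diff` is only illustrative.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

# (points scored by the 4-thread side, games played), taken from the tables.
results = {
    "Komodo": (183.0, 300),
    "Houdini": (181.0, 360),
    "SF": (221.5, 434),
}

for name, (points, games) in results.items():
    diff = elo_diff(points / games)
    print(f"{name}: {diff:+.1f} Elo for 4 threads vs 1 thread")
```

For Komodo this gives a gap of a little under 80 Elo, in line with the 3039 vs 2961 shown in the table.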
As can be seen, Houdini and Stockfish on 4 threads and on 1 thread are equal within error margins at constant depth. However, Komodo 5.1MP shows an increase of about 80 Elo points for 4 threads compared to 1 thread at fixed depth 11. So time-to-depth is an incorrect way of calculating Komodo's MP efficiency. It seems that with more threads it increases the width of the tree as much as it increases the depth.
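The "within error margins" claim can be checked roughly as follows. This is a sketch under stated assumptions: games are treated as independent, a normal approximation is used, and because the draw counts are not given the per-game variance is taken at its upper bound, so the margins are somewhat too wide. The helper `score_interval` is a hypothetical name.

```python
import math

def score_interval(points, games, z=1.96):
    """Conservative ~95% interval for the true score fraction."""
    p = points / games
    # Per-game variance of a result in {0, 0.5, 1} is at most p * (1 - p);
    # draws only make it smaller, so this overstates the margin.
    margin = z * math.sqrt(p * (1.0 - p) / games)
    return p - margin, p + margin

for name, points, games in (("Komodo", 183.0, 300),
                            ("Houdini", 181.0, 360),
                            ("SF", 221.5, 434)):
    low, high = score_interval(points, games)
    verdict = "excludes" if low > 0.5 or high < 0.5 else "includes"
    print(f"{name}: {low:.3f} to {high:.3f} ({verdict} 50%)")
```

Even with this pessimistic margin, only the Komodo interval stays clear of 50%.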
yanquis1972 wrote: interesting... is this a unique (in the literal sense) implementation of MP in a chess engine?
It is the first time I have encountered it, but I have only tested very few engines. Some folks, like Bob Hyatt, have even stated bluntly that time-to-depth is the only way to measure MP efficiency. Not for Komodo.
Yes, the implementation of MP in Komodo is nothing like that in other top chess engines, and the observation is correct that Komodo MP is much stronger than SP at the same depth. But the speedup is also much smaller than for other engines, so the end result is fairly close. I suspect that our method would be useless for a traditional full-width engine but works for highly selective engines like ours. I'll leave it to Don to decide what he wants to say about the MP implementation.
This is misleading because Komodo probably compensates for a poor parallel implementation by searching wider; otherwise it shouldn't be gaining any Elo in a fixed-depth test. You should run the test with a time control and see how much each engine gains.
Daniel Shawul wrote: This is misleading because Komodo probably compensates for a poor parallel implementation by searching wider; otherwise it shouldn't be gaining any Elo in a fixed-depth test. You should run the test with a time control and see how much each engine gains.
I already wrote that it probably searches wider as the number of threads increases. But it's different from other top engines, which at a given depth are pretty much the same strength on different numbers of threads. And the rule (as stated by some) that time-to-depth determines MP efficiency does not apply to Komodo.
I will leave it to test groups and individuals to test with time; there are plenty of volunteers. I was just curious about this aspect.
Daniel Shawul wrote: This is misleading because Komodo probably compensates for a poor parallel implementation by searching wider; otherwise it shouldn't be gaining any Elo in a fixed-depth test. You should run the test with a time control and see how much each engine gains.
"poor parallel implementation?"
The target of a parallel implementation is not to reach a greater depth but to play better.
If the implementation is good at helping Komodo play better, then I think it is wrong to call it poor.
Maybe it is the opposite: Komodo compensates for poor pruning, of lines that it should not prune, with a parallel implementation that prevents it from pruning good moves in the relevant lines.
Daniel Shawul wrote: This is misleading because Komodo probably compensates for a poor parallel implementation by searching wider; otherwise it shouldn't be gaining any Elo in a fixed-depth test. You should run the test with a time control and see how much each engine gains.
I already wrote that it probably searches wider as the number of threads increases. But it's different from other top engines, which at a given depth are pretty much the same strength on different numbers of threads. And the rule (as stated by some) that time-to-depth determines MP efficiency does not apply to Komodo.
I will leave it to test groups and individuals to test with time; there are plenty of volunteers. I was just curious about this aspect.
I saw it, but I felt that stating the obvious once again would help consumers in making decisions.
I said you should do a timed test, not time-to-depth. Simply play a match of 4 threads vs 1 thread at, say, 40/1. That should really tell how much each engine gains from parallelization.
Daniel Shawul wrote: This is misleading because Komodo probably compensates for a poor parallel implementation by searching wider; otherwise it shouldn't be gaining any Elo in a fixed-depth test. You should run the test with a time control and see how much each engine gains.
"poor parallel implementation?"
The target of a parallel implementation is not to reach a greater depth but to play better.
If the implementation is good at helping Komodo play better, then I think it is wrong to call it poor.
Maybe it is the opposite: Komodo compensates for poor pruning, of lines that it should not prune, with a parallel implementation that prevents it from pruning good moves in the relevant lines.
Uri, I don't think you understood the test well. It was a _fixed depth_ test, which is meaningless unless the parallel and serial versions search more or less similar trees.
If I decided to use alpha-beta only for the parallel search while using LMR + null move + other pruning for the sequential search, which do you think would win a fixed-depth test? Get it?
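To make the point concrete, here is a toy sketch (not any engine's actual search) of how two searches can report the same nominal depth while examining very different trees. Everything in it is an assumption for illustration: the synthetic tree, the `BRANCHING` constant, the pseudo-random `evaluate`, and the crude "search every move after the first one ply shallower" rule that stands in for LMR-style reductions.

```python
import random

BRANCHING = 4  # hypothetical branching factor of the toy tree

def evaluate(path):
    """Deterministic pseudo-random static evaluation of a toy position."""
    return random.Random(repr(path)).randint(-100, 100)

def search(path, depth, reduce_late_moves, counter):
    """Plain negamax; optionally search moves after the first one ply shallower."""
    counter[0] += 1  # count every visited node
    if depth <= 0:
        return evaluate(path)
    best = -10**9
    for move in range(BRANCHING):
        child_depth = depth - 1
        if reduce_late_moves and move > 0:
            child_depth -= 1  # crude stand-in for LMR-style reductions
        score = -search(path + (move,), child_depth, reduce_late_moves, counter)
        best = max(best, score)
    return best

for label, selective in (("full width", False), ("selective", True)):
    nodes = [0]
    value = search((), 5, selective, nodes)
    print(f"{label}: nominal depth 5, nodes={nodes[0]}, root value={value}")
```

Both runs are "depth 5", yet the selective one visits only a small fraction of the nodes, so a fixed-depth match between two such searches compares different amounts of work rather than the quality of the parallel implementation.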
yanquis1972 wrote: interesting... is this a unique (in the literal sense) implementation of MP in a chess engine?
It is the first time I have encountered it, but I have only tested very few engines. Some folks, like Bob Hyatt, have even stated bluntly that time-to-depth is the only way to measure MP efficiency. Not for Komodo.
I remember the thread, and I did not agree with that. What matters, for any engine, is the Elo gained. Time-to-depth for a selected number of representative (are they?) positions may or may not be an accurate way to predict how much strength is gained, for several reasons. There are too many assumptions in the process.