Peculiarity of Komodo 5.1MP

Daniel Shawul · Post by **Daniel Shawul** » Wed Jun 19, 2013 6:13 pm

michiguel wrote:
Laskos wrote:
yanquis1972 wrote:interesting...is this a unique (in the literal sense) implementation of MP in a chess engine?
First time I encounter, but I only tested very few engines. Some folks like Bob Hyatt even stated bluntly that time-to-depth is the only way to measure MP efficiency. Not for Komodo.
I remember the thread and I did not agree with that. What matters, for any engine, it is the elo gained. Time to depth for a selected number of representative (are they?) positions may or may not be an accurate way to predict how much strength is gained, for several reasons. There are too many assumptions in the process.

Miguel

This is making a mountain out of a molehill. When you say 'parallel efficiency', it is assumed you are comparing the efficiency of the 'parallel implementation' alone and not 'algorithmic efficiency'. Speedup is rightly defined as the time the parallel search needs to complete the same work compared to the sequential search, so time-to-depth is more or less a correct measure. If I suddenly decide that my parallel search uses plain alpha-beta while my sequential search uses LMR + null move, then you might as well have done the comparison between two sequential searches, because the difference has nothing to do with parallel search.
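The definition of speedup given above can be written down as a minimal sketch. The function names and the timings are hypothetical, chosen only to illustrate the arithmetic; they are not measurements of any engine discussed in this thread.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Time-to-depth speedup: serial time divided by parallel time,
    both measured for reaching the same depth on the same position."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, threads: int) -> float:
    """Parallel efficiency: speedup divided by the number of threads."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return speedup(t_serial, t_parallel) / threads


# Hypothetical timings in seconds: 1 thread vs 4 threads to the same depth.
print(speedup(120.0, 40.0))        # 3.0
print(efficiency(120.0, 40.0, 4))  # 0.75
```

This only makes sense if both runs search more or less the same tree, which is exactly the assumption under dispute for Komodo.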

michiguel · Post by **michiguel** » Wed Jun 19, 2013 6:32 pm

Daniel Shawul wrote:
michiguel wrote:
Laskos wrote:
yanquis1972 wrote:interesting...is this a unique (in the literal sense) implementation of MP in a chess engine?
First time I encounter, but I only tested very few engines. Some folks like Bob Hyatt even stated bluntly that time-to-depth is the only way to measure MP efficiency. Not for Komodo.
I remember the thread and I did not agree with that. What matters, for any engine, it is the elo gained. Time to depth for a selected number of representative (are they?) positions may or may not be an accurate way to predict how much strength is gained, for several reasons. There are too many assumptions in the process.

Miguel
This is making a mountain out of a molehill. When you say 'parallel efficiency' it is assumed you are comparing the efficiency of the 'parallel implementation' alone and not 'algorithmic efficiency'. Speed up is rightly defined as the time needed to complete the same work by the parallel search compared to the sequential search so time-to-depth is more or less a correct measure. If i suddenly decide my parallel uses alpha-beta while sequential uses lmr+nullmove, then you might as well have done the comparison on sequential search because that has nothing to do with parallel search.

What matters is the Elo gained. The question is whether the apparent speedup measured with tests (as we know them) predicts it well. I do not think that is necessarily true for every engine. The tests may be good approximations, but they are approximations nonetheless. Approximations could be more precise but less accurate.

Miguel
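For reference, "Elo gained" is normally estimated from a match score with the standard logistic Elo formula. The sketch below assumes that formula; the helper names and the win/draw/loss counts are hypothetical and are not results from this thread.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical match: the 4-thread version against the 1-thread version.
score = match_score(wins=50, draws=30, losses=20)
print(round(score, 2))                  # 0.65
print(round(elo_from_score(score), 1))
```

A timed match evaluated this way measures the whole engine, including any change in search shape, whereas time-to-depth measures only how fast the same work gets done.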

michiguel · Post by **michiguel** » Wed Jun 19, 2013 6:44 pm

michiguel wrote:
Daniel Shawul wrote:
michiguel wrote:
Laskos wrote:
yanquis1972 wrote:interesting...is this a unique (in the literal sense) implementation of MP in a chess engine?
First time I encounter, but I only tested very few engines. Some folks like Bob Hyatt even stated bluntly that time-to-depth is the only way to measure MP efficiency. Not for Komodo.
I remember the thread and I did not agree with that. What matters, for any engine, it is the elo gained. Time to depth for a selected number of representative (are they?) positions may or may not be an accurate way to predict how much strength is gained, for several reasons. There are too many assumptions in the process.

Miguel
This is making a mountain out of a molehill. When you say 'parallel efficiency' it is assumed you are comparing the efficiency of the 'parallel implementation' alone and not 'algorithmic efficiency'. Speed up is rightly defined as the time needed to complete the same work by the parallel search compared to the sequential search so time-to-depth is more or less a correct measure. If i suddenly decide my parallel uses alpha-beta while sequential uses lmr+nullmove, then you might as well have done the comparison on sequential search because that has nothing to do with parallel search.
What matters is the elo gained. The question is if the apparent speed up measured with tests (as we know them) predict it well. I do not think it is necessary true for every engine. They may be good approximations, but approximations nonetheless. Approximations could be more precise but less accurate.

Miguel

What matters is the theoretical speedup in the positions that matter. Not every position in a game matters equally. For instance, do certain implementations speed up equally well in positions with well-defined PVs and in positions where the PV changes a lot?

Miguel

Uri Blass · Post by **Uri Blass** » Wed Jun 19, 2013 6:53 pm

Daniel Shawul wrote:
Uri Blass wrote:
Daniel Shawul wrote:This is misleading because komodo probably compensates for poor parallel implementation by searching wider, otherwise it shouldn't be getting any elos for fixed depth test. You should do the test with time and see how much each gain.
"poor parallel implementation?"

The target of parallel implementation is not to get bigger depth but to play better.

If the implementation is good in helping komodo to play better than I think that it is wrong to call it poor.

Maybe it is the opposite and komodo compensates for poor pruning of lines that it should not prune by parallel implementation that prevent it to prune good moves in the relevant lines.
Uri, I don't think you understood the test well. It was a _fixed depth_ test that is meaningless unless both parallel and serial versions search more or less similar tree.
If i decided to use alpha-beta only for the parallel search while using lmr+nullmove+other pruning for the sequential search who would you think would win for a fixed depth test. Get it?

I understood the test well.
Fixed depth is meaningless if the target is to find a strength improvement, but that was clearly not the target of the test.

The target was simply to compare the behaviour of Komodo with the behaviour of other programs(not to compare strength).

My point was that the conclusion that the parallel implementation is poor is simply wrong.

Note that I did not claim anything based on this specific test.
I believe that Komodo's implementation of parallel search is good, based on what I have read about its results against other programs: with many cores (more than 4) it did better against other programs than in one-core tests.

Daniel Shawul · Post by **Daniel Shawul** » Wed Jun 19, 2013 6:58 pm

Well, I would go for average-case behavior, but some might argue that worst-case behavior is a good measure. The best case is definitely not a good measure, because it is probably a super-linear speedup that occurs by mere chance. So testing speedup on many test positions, or playing many games, measures average-case performance.
For many algorithms the average case (expected behavior) is usually the most important, but for sequential alpha-beta the best case is what matters, because move ordering is close to optimal. I am not sure about parallel search, since getting the most help from it in critical positions is arguably just as important, but I would still go for the average performance improvement.
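The distinction between average, worst, and best case can be made concrete with a small sketch that summarizes per-position speedups. The function name and the sample values are hypothetical; a real test would use time-to-depth measurements over many positions.

```python
from statistics import mean


def summarize_speedups(speedups, threads):
    """Summarize per-position speedups measured with a given thread count."""
    if not speedups:
        raise ValueError("need at least one speedup")
    return {
        "average": mean(speedups),
        "worst": min(speedups),
        "best": max(speedups),
        # A speedup above the thread count is super-linear.
        "superlinear": [s for s in speedups if s > threads],
    }


# Hypothetical 4-thread speedups on four test positions.
summary = summarize_speedups([2.0, 3.0, 4.0, 7.0], threads=4)
print(summary["average"], summary["worst"], summary["best"])  # 4.0 2.0 7.0
print(summary["superlinear"])                                 # [7.0]
```

In this made-up sample a single lucky super-linear position pulls the average up noticeably, which is why the best case alone says little.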

Joerg Oster · Post by **Joerg Oster** » Wed Jun 19, 2013 7:01 pm

What settings for 'Min Split Depth' did you use for Houdini and Stockfish? Same for both?

Depending on the setting you compared a 1-core to an almost 1-core engine. More or less.

Interestingly enough, Komodo 5.1MP doesn't have such a parameter ...

Daniel Shawul · Post by **Daniel Shawul** » Wed Jun 19, 2013 7:11 pm

Well then you are hammering your own straw man, because I said "_probably_ to compensate for poor parallel search".
If a parallel search benefited from going wider, then the sequential search should too. If we forget about the parallel search and just compare a sequential searcher with 2x more time against a 1x searcher, the claim would be that going wider helps the 2x searcher more than going deeper. Hence the sequential search should have used that scheme too, which it doesn't. So this is a 'proof by contradiction' that the parallel search is implemented poorly. If not, it at least justifies the 'probably'.

Laskos · Post by **Laskos** » Wed Jun 19, 2013 8:02 pm

Joerg Oster wrote:What settings for 'Min Split Depth' did you use for Houdini and Stockfish? Same for both?

Depending on the setting you compared a 1-core to an almost 1-core engine. More or less.

Interesting enough, Komodo5.1MP doesn't have such a parameter ...

8 for Houdini, the default 7 for SF. I don't think it's an issue.

bob · Post by **bob** » Wed Jun 19, 2013 8:11 pm

Laskos wrote:
Daniel Shawul wrote:This is misleading because komodo probably compensates for poor parallel implementation by searching wider, otherwise it shouldn't be getting any elos for fixed depth test. You should do the test with time and see how much each gain.
I already wrote that it probably searches wider with the number of threads. But it's different from other top engines, which to given depth are pretty much the same strength on different number of threads. And the rule (as stated by some) that time-to-depth is determining MP efficiency does not apply to Komodo.
I will leave to test groups and individuals to test with time, there are plenty of volunteers. I was just curious about this aspect.

Time-to-depth IS the correct measure. If Komodo searches wider, then it searches less efficiently. One wants to measure the COMPLETE SMP implementation, not just how the tree is split.

I cannot imagine wanting to search wider in a parallel search. If it is a good idea, why not search wider in the sequential implementation as well?

bob · Post by **bob** » Wed Jun 19, 2013 8:14 pm

Daniel Shawul wrote:
Laskos wrote:
Daniel Shawul wrote:This is misleading because komodo probably compensates for poor parallel implementation by searching wider, otherwise it shouldn't be getting any elos for fixed depth test. You should do the test with time and see how much each gain.
I already wrote that it probably searches wider with the number of threads. But it's different from other top engines, which to given depth are pretty much the same strength on different number of threads. And the rule (as stated by some) that time-to-depth is determining MP efficiency does not apply to Komodo.
I will leave to test groups and individuals to test with time, there are plenty of volunteers. I was just curious about this aspect.
I saw it but I felt like stating the obvious once again should help consumers in making decisions.
I said you should do a timed test not time-to-depth. Simply play a game of 4 threads vs 1 thread for say 40/1. That should really tell how much each engine gain from parallelization.

This is going far beyond apples-to-oranges and is now using a basket full of different types of fruit.

What do you want to measure, in isolation from everything else?

1. SMP efficiency? Time-to-depth is THE way to do this. Nothing else works.

2. Search implementation details? Fixed depth will highlight differences, but comparing strength is completely and utterly meaningless here. A full-width search with no reductions, run to 12 plies, will simply crush a selective program with reductions and forward pruning. So what? The selective program will reach significantly deeper depths in the same time, so the comparison is meaningless...

One should NOT compare Elo at fixed depth, just search efficiency (speedup).
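The point about selective programs reaching deeper in the same time can be illustrated with a crude model in which the tree size grows as EBF^depth, where EBF is the effective branching factor. This is only a sketch: the EBF values of 6.0 and 2.0 are hypothetical round numbers, not measurements of any engine, and real searches do not follow the model exactly.

```python
import math


def nodes_for_depth(ebf: float, depth: float) -> float:
    """Approximate tree size for a search with the given effective branching factor."""
    return ebf ** depth


def reachable_depth(node_budget: float, ebf: float) -> float:
    """Depth reachable within a node budget under the same model."""
    return math.log(node_budget) / math.log(ebf)


# Hypothetical: a full-width search (EBF 6.0) spends this many nodes on 12 plies.
budget = nodes_for_depth(6.0, 12)

# A selective search (EBF 2.0) given the same node budget gets much deeper.
print(round(reachable_depth(budget, 2.0), 1))
```

Under these assumptions equal depth means very unequal work, which is why Elo at fixed depth says little about either program.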
