Some fun with Komodo 8

cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Some fun with Komodo 8

Post by cdani »

mjlef wrote:As others have discovered, the methods used in Komodo lead to a strength gain at a given depth, on top of the shorter time to completion of that depth.
For example, one idea I have, and have tried only superficially, is to analyze better what to reduce more and what to reduce less. The effect would be exactly this: spending more time at each depth, but gaining strength.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Some fun with Komodo 8

Post by bob »

mjlef wrote:Unfortunately, I will have to keep what Komodo does as a kind of "trade secret" for now, since it seems to give us an advantage over other programs. As others have discovered, the methods used in Komodo lead to a strength gain at a given depth, on top of the shorter time to completion of that depth. We changed part of it for Komodo 8 over the scheme Don came up with, and these changes seem to have improved efficiency and scaling.

I think you would agree that traditional schemes scale more and more poorly as the number of processors increases. At some point, doubling the processors gives only a few more Elo. We want something better than that.

I make no claim that what we do is optimal, and we hope to make further improvements in the future. Better MP use is definitely something we need to work on.
While I agree that more and more CPUs add less and less Elo, I don't think that going from 1 to 2 or from 2 to 4 reaches that point, unless one uses the shared-TT way of doing a parallel search, which is certainly not very good. That suggests 1-2-4-8 and even 16 would be pretty efficient in the traditional sense, so there is no need to say "OK, I can't improve the efficiency, so I will use some of the horsepower to try to make the search a bit more accurate with less pruning or reducing." At the low end of processor counts, the speedups are typically so good that this trade is not very effective. Or if it is, doing the same thing with just one processor would also be a gain (i.e. scaling back the aggressiveness of pruning or reducing).
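
A rough sketch of the diminishing-returns argument above, in Python. Everything numeric here is an assumption for illustration only: the Amdahl-style speedup model, the `PARALLEL_FRACTION` of 0.95 and the `ELO_PER_DOUBLING` of 60 are hypothetical and are not measurements from Crafty, Komodo or any other engine.

```python
import math

# Hypothetical parameters, chosen only to illustrate the shape of the curve.
PARALLEL_FRACTION = 0.95   # assumed share of the search that parallelizes
ELO_PER_DOUBLING = 60.0    # assumed Elo gained per doubling of effective speed


def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Amdahl's law: speedup on `cores` when only part of the work parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def elo_gain(speedup: float, elo_per_doubling: float) -> float:
    """Elo gain over one core, assuming a fixed gain per doubling of speed."""
    return elo_per_doubling * math.log2(speedup)


def marginal_gains(max_cores: int = 64):
    """Return (cores, doubled_cores, extra_elo) for each doubling up to max_cores."""
    gains = []
    cores = 1
    while cores < max_cores:
        before = elo_gain(amdahl_speedup(cores, PARALLEL_FRACTION), ELO_PER_DOUBLING)
        after = elo_gain(amdahl_speedup(2 * cores, PARALLEL_FRACTION), ELO_PER_DOUBLING)
        gains.append((cores, 2 * cores, after - before))
        cores *= 2
    return gains


for lo, hi, gain in marginal_gains():
    print(f"{lo:>2} -> {hi:>2} cores: about {gain:.1f} Elo under these assumptions")
```

Under this toy model each doubling adds less than the one before, but the first doublings (1-2, 2-4) still capture most of the ideal gain, which is the point being argued.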
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Some fun with Komodo 8

Post by bob »

cdani wrote:
mjlef wrote:As others have discovered, the methods used in Komodo lead to a strength gain at a given depth, on top of the shorter time to completion of that depth.
For example, one idea I have, and have tried only superficially, is to analyze better what to reduce more and what to reduce less. The effect would be exactly this: spending more time at each depth, but gaining strength.
You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Some fun with Komodo 8

Post by cdani »

bob wrote:You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
Yes, I understood this. But maybe you can use some time on another CPU to prepare things that make the main search more efficient, without interfering with it. Of course, you can again do this with a single thread doing all the work, but probably with a lot more interference or context switches than with a separate thread. It's just an idea.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Some fun with Komodo 8

Post by Laskos »

bob wrote:
cdani wrote:
mjlef wrote:As others have discovered, the methods used in Komodo lead to a strength gain at a given depth, on top of the shorter time to completion of that depth.
For example, one idea I have, and have tried only superficially, is to analyze better what to reduce more and what to reduce less. The effect would be exactly this: spending more time at each depth, but gaining strength.
You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
Things like LMR can screw up fixed-depth strength without changing fixed-time strength much. The question then is what "depth" even means with such aggressive use of LMR.
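
To make the "what is depth" question concrete, here is a minimal Python sketch of a late move reduction rule. The logarithmic formula, the thresholds and the `aggressiveness` knob are assumptions for illustration (a commonly seen general form); this is not Komodo's scheme, which is not public, nor any specific engine's code.

```python
import math


def lmr_reduction(depth: int, move_number: int, aggressiveness: float = 1.0) -> int:
    """Plies to reduce for the move_number-th move (1-based) at the given depth.

    Hypothetical rule: no reduction for shallow nodes or the first few moves,
    otherwise a reduction that grows with log(depth) * log(move_number).
    """
    if depth < 3 or move_number < 4:
        return 0
    raw = aggressiveness * math.log(depth) * math.log(move_number) / 2.0
    # Never reduce so far that the child depth would go negative.
    return min(int(raw), depth - 1)


def child_depth(depth: int, move_number: int, aggressiveness: float = 1.0) -> int:
    """Depth actually handed to the child search after the reduction."""
    return depth - 1 - lmr_reduction(depth, move_number, aggressiveness)


# The same nominal "depth 20" means different things for late moves
# depending on how aggressive the reduction rule is.
for aggr in (0.5, 1.0, 2.0):
    depths = [child_depth(20, m, aggr) for m in (1, 5, 15, 30)]
    print(f"aggressiveness {aggr}: child depths for moves 1, 5, 15, 30 = {depths}")
```

With a more aggressive rule an iteration labelled "depth 20" completes sooner but looks at late moves far less deeply, so comparing two engines at equal nominal depth says little about equal search effort.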
reflectionofpower
Posts: 1610
Joined: Fri Mar 01, 2013 5:28 pm
Location: USA

Re: Some fun with Komodo 8

Post by reflectionofpower »

bob wrote:You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
Bob,
What NPS are you getting from your 12-core on ICC? 30M+ NPS? I suppose Linux squeezes more out of it too.
"Without change, something sleeps inside us, and seldom awakens. The sleeper must awaken." (Dune - 1984)

Lonnie
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Some fun with Komodo 8

Post by mjlef »

Bob,

Have you measured the Elo gain for 1-2-4-8-16 processors for Crafty?

I think that since CPU speed in GHz has not been increasing much recently, processor makers have turned to getting more out of each CPU cycle and to putting more cores on each chip. So it would be nice to see the Elo gain for each increase in the number of processors. There seems to be a pretty big difference between programs here, with some posts suggesting that adding another processor at some point actually hurts Elo, although I do not know the specifics. With, say, a limited hash table size for storing best moves and cutoffs, and multiple cores trying to access the same shared memory, perhaps at some point the slowdown due to external memory access makes extra cores unproductive. Also, it seems to me that memory speeds have kind of plateaued, with newer machines just not having faster external memory. At least the on-chip caches are getting bigger.

Mark
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Some fun with Komodo 8

Post by bob »

reflectionofpower wrote:
bob wrote:You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
Bob,
What NPS are you getting from your 12-core on ICC? 30M+ NPS? I suppose Linux squeezes more out of it too.
Typical number is 40-44M. It does drop some in endgames. And this is a pretty old box as well...

The processor is an ES5650 at 2.67 GHz, dual chip, 6 cores per chip. My newer iMac with a 4-core chip runs past 20M easily.
reflectionofpower
Posts: 1610
Joined: Fri Mar 01, 2013 5:28 pm
Location: USA

Re: Some fun with Komodo 8

Post by reflectionofpower »

bob wrote:
reflectionofpower wrote:
bob wrote:You can do the SAME thing with just one CPU. That's a key point. The speedup for 1-2 and 2-4 for most programs is pretty good or pretty easy to make it good for a new program. If your "less selectivity" idea is good for 2-4 cpus, it should also work for just one.
Bob,
What NPS are you getting from your 12-core on ICC? 30M+ NPS? I suppose Linux squeezes more out of it too.
Typical number is 40-44M. It does drop some in endgames. And this is a pretty old box as well...

The processor is an ES5650 at 2.67 GHz, dual chip, 6 cores per chip. My newer iMac with a 4-core chip runs past 20M easily.
Nice
Lonnie
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Some fun with Komodo 8

Post by bob »

mjlef wrote:Bob,

Have you measured the Elo gain for 1-2-4-8-16 processors for Crafty?

I think that since CPU speed in GHz has not been increasing much recently, processor makers have turned to getting more out of each CPU cycle and to putting more cores on each chip. So it would be nice to see the Elo gain for each increase in the number of processors. There seems to be a pretty big difference between programs here, with some posts suggesting that adding another processor at some point actually hurts Elo, although I do not know the specifics. With, say, a limited hash table size for storing best moves and cutoffs, and multiple cores trying to access the same shared memory, perhaps at some point the slowdown due to external memory access makes extra cores unproductive. Also, it seems to me that memory speeds have kind of plateaued, with newer machines just not having faster external memory. At least the on-chip caches are getting bigger.

Mark
No disagreement with any of that. I have not seen any negative issues for my search through 16 cores. I've run on up to 64, but not recently, and never saw any case where 64 was actually worse than 32, so long as you don't look at an individual position where anything can happen.

I have done some cluster testing in the past, not specifically to measure Elo improvement but more commonly just to stress-test the parallel search. I've been puzzling over imperfect NPS scaling on this box for months, with no luck yet. I can run 12 copies of Crafty at 5M NPS per copy (60M in total), but running one copy with 12 threads hits around 48M or so. That is a missing 12M, or 20% of the total processing power. I'm going to find it. 99% of the time this is a cache issue, but so far nothing has helped.
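
The arithmetic behind that "missing 12M" can be checked with a few lines of Python. The numbers are the ones quoted in the post above; the function name `scaling_efficiency` is just an illustrative helper, not anything from Crafty.

```python
def scaling_efficiency(per_copy_nps: int, copies: int, threaded_nps: int):
    """Compare N independent copies against one N-thread search.

    Returns (ideal_nps, missing_nps, efficiency), where ideal_nps is what
    the box delivers when the copies do not share anything.
    """
    ideal_nps = per_copy_nps * copies
    missing_nps = ideal_nps - threaded_nps
    efficiency = threaded_nps / ideal_nps
    return ideal_nps, missing_nps, efficiency


# Figures from the post: 12 copies at 5M NPS each vs. one 12-thread run at ~48M.
ideal, missing, eff = scaling_efficiency(5_000_000, 12, 48_000_000)
print(f"ideal {ideal / 1e6:.0f}M, missing {missing / 1e6:.0f}M, efficiency {eff:.0%}")
```

Note this measures raw NPS scaling only; it says nothing about how much of the extra node throughput turns into useful search speedup.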