Peculiarity of Komodo 5.1MP

Post by **Joerg Oster** » Sat Jun 29, 2013 10:52 am

syzygy wrote:
(Maybe Kai should have taken the elo gain per core observed in other threads rounded to one decimal and based on that reconstruct the times to say 3 decimals? Would have made a fine scientific paper. However, this is just a forum thread.)

Laskos wrote:
The problem is the times are not universal, they depend on TC or depth. I tried to give time benefit to 1 core Houdini and Komodo to match in strength 4 cores Houdini and Komodo respectively at ultra-short control 5''+0.1''. The benefit is 2.3 for Houdini and 2.0 for Komodo at this TC, not quite close to ~3 at some 1 minute/move TC. So, these time-to-depth ratios and the gain at the same depth (of Komodo) depend on TC and depth.

Don wrote:
I have tried to explain something similar to non-technical people and it's difficult to make it clear.

So it's incorrect to say, "program 1 gains 120 ELO going from one to 4 cores" because the gain will be very much dependent on the time control and hardware.

I agree, though I doubt the differences will be that huge.
Maybe I will run a test with 2 or 3 different time controls ...

Don wrote:If you are doing a 3 ply search you will get 200 ELO or more going to 4 ply but you will only get perhaps 10 or 20 ELO going from 30 to 31 ply.

That's a bit exaggerated, isn't it?
And isn't it only for practical reasons? Because no one will play a match with a tc of 1 week + 1 hour increment, for example.

Don wrote:But even on the same hardware and time control it's incorrect to say, "program X gets more than program Y going from 1 to 4 cores" because the weaker program has a big advantage for the reason I stated in the previous paragraph. So all these comparisons are at least slightly unfair to Houdini.

To accurately measure who gets the most benefit you have to NORMALIZE the levels with handicaps or time bonuses so that you have all the participating programs playing the same ELO on 1 core. Then, using the same normalized levels you run a second series with 4 cores to see which program gets the most benefit.

But then you will have Program A searching deeper and thus gaining less compared to Program B, which is searching some plies less and therefore gains more. So what's the difference from simply playing normal time control matches?

Post by **Don** » Sat Jun 29, 2013 11:35 am

syzygy wrote:
(Maybe Kai should have taken the elo gain per core observed in other threads rounded to one decimal and based on that reconstruct the times to say 3 decimals? Would have made a fine scientific paper. However, this is just a forum thread.)

Laskos wrote:
The problem is the times are not universal, they depend on TC or depth. I tried to give time benefit to 1 core Houdini and Komodo to match in strength 4 cores Houdini and Komodo respectively at ultra-short control 5''+0.1''. The benefit is 2.3 for Houdini and 2.0 for Komodo at this TC, not quite close to ~3 at some 1 minute/move TC. So, these time-to-depth ratios and the gain at the same depth (of Komodo) depend on TC and depth.

Don wrote:
I have tried to explain something similar to non-technical people and it's difficult to make it clear.

So it's incorrect to say, "program 1 gains 120 ELO going from one to 4 cores" because the gain will be very much dependent on the time control and hardware.

Joerg Oster wrote:
I agree, though I doubt the differences will be that huge.
Maybe I will run a test with 2 or 3 different time controls ...

The difference is non-trivial and easily measurable. It's not as big as the ELO starting value I talk about next though.

Don wrote:If you are doing a 3 ply search you will get 200 ELO or more going to 4 ply but you will only get perhaps 10 or 20 ELO going from 30 to 31 ply.
Joerg Oster wrote:That's a bit exaggerated, isn't it?

For Komodo going from 3 to 4 ply is more than 200 ELO - so I understated this.

Going from 30 to 31 ply is something I have not measured but the main point isn't my numbers but the fact that they are far from being the same. The fact of the matter is that the ELO increase is less with each doubling of time.

There are 2 reasons for this. If you have a program set to such a high level that it is playing chess at well over the 3000 Level, you cannot just double the time and expect it to add a whole new level of playing strength.

The other reason is that the number of draws increases sharply with the strength of the players too. We did a study once here on how the number of draws increased with longer time controls, and some people used this to extrapolate an ELO rating for perfect play.
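
To make the draw effect concrete, here is a minimal sketch using the usual logistic Elo formula. It assumes, purely for illustration, that the stronger side always wins two thirds of the decisive games while the draw rate rises; the names `elo_diff`, `score_with_draws` and `win_share` are made up for this example and nothing here is measured data.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_with_draws(draw_rate, win_share=2.0 / 3.0):
    """Score when a fraction draw_rate of the games is drawn and the stronger
    side wins win_share of the decisive games (win_share is an assumption)."""
    return 0.5 * draw_rate + (1.0 - draw_rate) * win_share

# Same superiority in decisive games, but more and more draws.
for draw_rate in (0.0, 0.4, 0.8):
    s = score_with_draws(draw_rate)
    print(f"draws {draw_rate:.0%}: score {s:.3f}, Elo diff {elo_diff(s):+.0f}")
```

Under these assumptions the measured difference shrinks from about 120 Elo with no draws to about 23 Elo at 80% draws, even though the stronger side is just as dominant in the games that are decided.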

Joerg Oster wrote:And isn't it only for practical reasons? Because no one will play a match with a tc of 1 week + 1 hour increment, for example.
But in effect we are doing just that as our computers get faster and faster with each generation. If we look back on this 10 years from now what we are writing here now will be water under the bridge and not relevant.

But even at just the normal levels we see now the effect is non-trivial.

Don wrote:But even on the same hardware and time control it's incorrect to say, "program X gets more than program Y going from 1 to 4 cores" because the weaker program has a big advantage for the reason I stated in the previous paragraph. So all these comparisons are at least slightly unfair to Houdini.

To accurately measure who gets the most benefit you have to NORMALIZE the levels with handicaps or time bonuses so that you have all the participating programs playing the same ELO on 1 core. Then, using the same normalized levels you run a second series with 4 cores to see which program gets the most benefit.
Joerg Oster wrote:But then you will have Program A searching deeper and thus gaining less compared to Program B, which is searching some plies less and therefore gains more. So what's the difference from simply playing normal time control matches?

What is gained by adding time or cores depends on ELO, not "plies" of search.

There is another factor I did not even mention - but which is relatively minor for most programs but not all. Some programs are inherently more scalable, even if you ignore MP. If you run Stockfish 2.2 at fast time controls (20 seconds per game or less for example) it will get absolutely crushed by programs like Komodo and Houdini. But if you double this you will get a far greater increase in ELO than you will for other programs. The difference is extremely noticeable. So if you run it on 4 cores you will also get a bigger gain but not necessarily because the MP is good. So a good way to test MP scalability and ISOLATE it from this effect is to also test how much each program improves running 4x longer in single processor mode. Only then can you determine how much is MP scaling versus how much is natural scaling.
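
A rough sketch of how the two effects could be separated numerically. It assumes each engine is tested twice against the same 1-core, base-time version of itself: once with 4 cores and once with 4x the time on 1 core. The ratio computed by `mp_share` is just one possible way to express the comparison, and the engine names and scores are placeholders, not measurements.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def mp_share(score_4cores, score_4x_time):
    """Elo gained from 4 cores divided by Elo gained from 4x time on 1 core.
    Both scores are taken against the same 1-core, base-time baseline."""
    return elo_diff(score_4cores) / elo_diff(score_4x_time)

# Placeholder scores (4 cores, 4x time), NOT real test results.
engines = {"engine_a": (0.62, 0.66), "engine_b": (0.66, 0.72)}

results = {}
for name, (s_cores, s_time) in engines.items():
    results[name] = (elo_diff(s_cores), elo_diff(s_time), mp_share(s_cores, s_time))
    cores_elo, time_elo, share = results[name]
    print(f"{name}: 4 cores {cores_elo:+.0f}, 4x time {time_elo:+.0f}, share {share:.2f}")
```

With these made-up inputs, engine_b gains more raw Elo from 4 cores (about 115 against 85), yet it captures a smaller share of what 4x time would give it (about 0.70 against 0.74). That is the situation described above: a bigger gain on 4 cores, but not necessarily because the MP is good.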

Post by **Joerg Oster** » Sat Jun 29, 2013 12:35 pm

syzygy wrote:
(Maybe Kai should have taken the elo gain per core observed in other threads rounded to one decimal and based on that reconstruct the times to say 3 decimals? Would have made a fine scientific paper. However, this is just a forum thread.)

Laskos wrote:
The problem is the times are not universal, they depend on TC or depth. I tried to give time benefit to 1 core Houdini and Komodo to match in strength 4 cores Houdini and Komodo respectively at ultra-short control 5''+0.1''. The benefit is 2.3 for Houdini and 2.0 for Komodo at this TC, not quite close to ~3 at some 1 minute/move TC. So, these time-to-depth ratios and the gain at the same depth (of Komodo) depend on TC and depth.

Don wrote:
I have tried to explain something similar to non-technical people and it's difficult to make it clear.

So it's incorrect to say, "program 1 gains 120 ELO going from one to 4 cores" because the gain will be very much dependent on the time control and hardware.

Joerg Oster wrote:
I agree, though I doubt the differences will be that huge.
Maybe I will run a test with 2 or 3 different time controls ...

Don wrote:
The difference is non-trivial and easily measurable. It's not as big as the ELO starting value I talk about next though.

Don wrote:If you are doing a 3 ply search you will get 200 ELO or more going to 4 ply but you will only get perhaps 10 or 20 ELO going from 30 to 31 ply.
Joerg Oster wrote:That's a bit exaggerated, isn't it?
Don wrote:For Komodo going from 3 to 4 ply is more than 200 ELO - so I understated this.

Going from 30 to 31 ply is something I have not measured but the main point isn't my numbers but the fact that they are far from being the same. The fact of the matter is that the ELO increase is less with each doubling of time.

Arghh. I'm so sorry. I totally got it wrong.

Of course, you're right.
SMP gain can be drawn as a logarithmic or a root graph. (Is that the right expression?) The higher the plies, the less you gain additionally.
And each program will have its own, individual, slightly different graph.
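
To put rough numbers on the "logarithmic graph" idea, here is a minimal sketch. It assumes, purely for illustration, that rating grows with the logarithm of search depth, `elo(d) = a * ln(d)`, and calibrates `a` on Don's figure of about 200 ELO for going from 3 to 4 ply. The model and the names `calibrate` and `gain` are assumptions of this sketch, not something measured in this thread.

```python
import math

def calibrate(d_from, d_to, elo_gain):
    """Return the scale a of the assumed model elo(d) = a * ln(d)."""
    return elo_gain / math.log(d_to / d_from)

def gain(a, d_from, d_to):
    """Elo gained going from depth d_from to d_to under the assumed model."""
    return a * math.log(d_to / d_from)

# Calibrate on the ~200 Elo quoted for 3 -> 4 ply (an assumption, not a measurement).
a = calibrate(3, 4, 200.0)
deep_gain = gain(a, 30, 31)
print(f"30 -> 31 ply: about {deep_gain:.0f} Elo")
```

Under this toy model one extra ply at depth 30 is worth a little over 20 ELO, which is in the same ballpark as Don's guess of 10 or 20. A different curve (a root function, or different calibration per program) would give different numbers but the same shape.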

Don wrote:There are 2 reasons for this. If you have a program set to such a high level that it is playing chess at well over the 3000 Level, you cannot just double the time and expect it to add a whole new level of playing strength.

The other reason is that the number of draws increases sharply with the strength of the players too. We did a study once here on how the number of draws increased with longer time controls, and some people used this to extrapolate an ELO rating for perfect play.

Joerg Oster wrote:And isn't it only for practical reasons? Because no one will play a match with a tc of 1 week + 1 hour increment, for example.
Don wrote:But in effect we are doing just that as our computers get faster and faster with each generation. If we look back on this 10 years from now what we are writing here now will be water under the bridge and not relevant.

But even at just the normal levels we see now the effect is non-trivial.

Don wrote:But even on the same hardware and time control it's incorrect to say, "program X gets more than program Y going from 1 to 4 cores" because the weaker program has a big advantage for the reason I stated in the previous paragraph. So all these comparisons are at least slightly unfair to Houdini.

To accurately measure who gets the most benefit you have to NORMALIZE the levels with handicaps or time bonuses so that you have all the participating programs playing the same ELO on 1 core. Then, using the same normalized levels you run a second series with 4 cores to see which program gets the most benefit.
Joerg Oster wrote:But then you will have Program A searching deeper and thus gaining less compared to Program B, which is searching some plies less and therefore gains more. So what's the difference from simply playing normal time control matches?

Don wrote:What is gained by adding time or cores depends on ELO, not "plies" of search.

There is another factor I did not even mention - but which is relatively minor for most programs but not all. Some programs are inherently more scalable, even if you ignore MP. If you run Stockfish 2.2 at fast time controls (20 seconds per game or less for example) it will get absolutely crushed by programs like Komodo and Houdini. But if you double this you will get a far greater increase in ELO than you will for other programs. The difference is extremely noticeable. So if you run it on 4 cores you will also get a bigger gain but not necessarily because the MP is good. So a good way to test MP scalability and ISOLATE it from this effect is to also test how much each program improves running 4x longer in single processor mode. Only then can you determine how much is MP scaling versus how much is natural scaling.

You are right about SF.
Another solution might be to run with fixed time per move. Let's say 5 min per move. Though this would take a while...

But I think with this we would measure the gain at a very high point on the graph for each program.
