Don wrote:
We have discovered recently that Komodo on Windows is especially slow - we don't know why, but our results on Linux are outstanding in comparison.
Don
Don,
Just let me encourage you: your program is fantastic. It is a real pleasure to have such an engine on my system. Komodo 5 is the engine I use primarily in IDeA and the results so far have been excellent. Keep going!
The above quote from your post is really quite strange, but it sounds like you're inching closer to finding the root cause. Just bearing in mind the (slight) issues you had with AMD & Intel processors, this couldn't be a compiler issue, could it?
Mincho Georgiev wrote: Just a hint. I have no idea what you're using, but nothing can beat the Intel compiler's profiling under Windows. Nothing.
Also, make sure your polling is working as it should. Suppress buffering, etc. Other than that, unless you use some OS-specific (Windows) code, I don't see any reason for your executable to be slower under Windows.
We did find out what part of the problem is, perhaps all of the problem and I can now compile about 3-4% faster binaries. I found this with help from Richard Vida.
That may explain most or all of this for Komodo. Linux is ALWAYS a little faster than Windows given the same or equal compiles - several have reported this same issue so I cannot expect to make up the full difference, I just want to fall in line with what is normal. For most programs this seems to be about 3-6 percent. For example Critter is approximately 6% slower in Windows. With this fix the Windows version is now only about 5% slower. I will try to reduce that even more but at least I am not out of line with other programs.
Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Mincho Georgiev wrote: Just a hint. I have no idea what you're using, but nothing can beat the Intel compiler's profiling under Windows. Nothing.
Also, make sure your polling is working as it should. Suppress buffering, etc. Other than that, unless you use some OS-specific (Windows) code, I don't see any reason for your executable to be slower under Windows.
As I said Linux will always be slightly slower for some reason I don't understand. Someone posted some time back that it was due to the OS using registers that were not available to application (but are in Linux) and thus the compiler can get a bit more. I don't really know if that is the reason or not but if you check out several binaries where there is a windows and a linux compile you will see that the Linux version is faster. Critter shows 6% for example but Richard uses different compilers so it's not apples to apples. I use the same compiler for both and I cross compile. One of the things I checked out is whether the cross compile is the issue - and it isn't. The native mingw compile on windows produced a binary that was almost exactly the same performance.
Of course if you use a superior compiler on the Windows version you may be able to get a superior binary if you can make up for the difference and then some.
It's my understanding that recent GCC versions are no longer second to Intel with respect to code quality. I don't have actual experience with that so I don't know. Intel also makes a compiler for Linux so it could be tested. The problem is that each compiler performs differently depending on your application. I have heard people report that their chess program performs better using clang or other compilers but those are significantly weaker for Komodo.
P.S. I do I/O in a separate thread with blocking input. During the search I poll for input and time control periodically. I don't have any reason to believe that would be slower in Windows, but I will double check these issues. Thanks for the suggestion.
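For readers who have not seen this arrangement, here is a minimal sketch of what "blocking input on a separate thread, polled from the search" can look like. It is not Komodo's code: `InputQueue`, `reader_loop`, `should_stop` and the 4096-node polling interval are all hypothetical names and values chosen for illustration, and it assumes a C++17 compiler.

```cpp
#include <atomic>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical mailbox between the blocking reader thread and the search.
class InputQueue {
public:
    void push(std::string line) {
        std::lock_guard<std::mutex> lock(mutex_);
        lines_.push(std::move(line));
        pending_.store(true, std::memory_order_release);
    }

    // Non-blocking: returns the next queued line, or nullopt if there is none.
    std::optional<std::string> poll() {
        // Cheap early-out so an idle poll never touches the mutex.
        if (!pending_.load(std::memory_order_acquire)) return std::nullopt;
        std::lock_guard<std::mutex> lock(mutex_);
        if (lines_.empty()) return std::nullopt;
        std::string line = std::move(lines_.front());
        lines_.pop();
        pending_.store(!lines_.empty(), std::memory_order_release);
        return line;
    }

private:
    std::mutex mutex_;
    std::queue<std::string> lines_;
    std::atomic<bool> pending_{false};
};

// Runs on its own thread; std::getline blocks until a full line arrives.
void reader_loop(std::istream& in, InputQueue& queue) {
    std::string line;
    while (std::getline(in, line)) queue.push(line);
}

// Called from the search; only looks at the queue every kPollInterval nodes.
// Simplification: any command other than "stop" is read and discarded here.
bool should_stop(InputQueue& queue, std::uint64_t nodes) {
    constexpr std::uint64_t kPollInterval = 4096;  // assumed value
    if (nodes % kPollInterval != 0) return false;
    std::optional<std::string> cmd = queue.poll();
    return cmd.has_value() && *cmd == "stop";
}
```

The reader would typically be started once with something like `std::thread(reader_loop, std::ref(std::cin), std::ref(queue))`. With this shape the search never calls an OS input routine itself, so the per-node cost of polling should be the same counter test on Windows and Linux.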
Let's talk about the compiler, not the compiles. If you're using GNUCC, that would make sense. But if we are talking about the Intel compiler, which produces executables at least 5-10% faster than any other compiler on ANY platform, I really don't share your experience. Of course, I don't want to start a flame war of any kind, I've been through that already; I just wanted to share my experience. ICC (up to .11, with PGO) produces the fastest code I've seen, with absolutely equal speed on both Windows and Linux. I say again: MY experience with it. So, which one are you using? Otherwise I can't say much.
It's my understanding that recent GCC versions are no longer second to Intel
Mincho Georgiev wrote: Let's talk about the compiler, not the compiles.
How does that address the question of why Komodo runs so much faster on Linux? It may be true that if I run an inferior compiler on Linux I may be able to remove the disparity, but I hope you can see that this is not a serious answer to my question. I can beat Roger Federer at tennis if you cut off his right arm and make him wear two 50-pound concrete boots. That might eliminate the disparity, but it wouldn't explain why he is a better player.
I will open a separate thread on the technical forum to talk about which compiler should be used but that has no relevance to my question here.
If you're using GNUCC, that would make sense. But if we are talking about the Intel compiler, which produces executables at least 5-10% faster than any other compiler on ANY platform, I really don't share your experience. Of course, I don't want to start a flame war of any kind, I've been through that already; I just wanted to share my experience. ICC (up to .11, with PGO) produces the fastest code I've seen, with absolutely equal speed on both Windows and Linux. I say again: MY experience with it. So, which one are you using? Otherwise I can't say much.
It's my understanding that recent GCC versions are no longer second to Intel
I highly doubt it, especially regarding PGO.
Mincho Georgiev wrote: Let's talk about the compiler, not the compiles.
How does that address the question of why Komodo runs so much faster on Linux? It may be true that if I run an inferior compiler on Linux I may be able to remove the disparity, but I hope you can see that this is not a serious answer to my question. I can beat Roger Federer at tennis if you cut off his right arm and make him wear two 50-pound concrete boots. That might eliminate the disparity, but it wouldn't explain why he is a better player.
I will open a separate thread on the technical forum to talk about which compiler should be used but that has no relevance to my question here.
If you're using GNUCC, that would make sense. But if we are talking about the Intel compiler, which produces executables at least 5-10% faster than any other compiler on ANY platform, I really don't share your experience. Of course, I don't want to start a flame war of any kind, I've been through that already; I just wanted to share my experience. ICC (up to .11, with PGO) produces the fastest code I've seen, with absolutely equal speed on both Windows and Linux. I say again: MY experience with it. So, which one are you using? Otherwise I can't say much.
It's my understanding that recent GCC versions are no longer second to Intel
I highly doubt it, especially regarding PGO.
A GNUCC executable on Linux is faster than a GNUCC executable on Windows, at least for the tons of sources I have dealt with.
An Intel-compiled one is exactly equivalent in terms of speed on both, on the same hardware. And how is that irrelevant?
Mincho Georgiev wrote: Just a hint. I have no idea what you're using, but nothing can beat the Intel compiler's profiling under Windows. Nothing.
Also, make sure your polling is working as it should. Suppress buffering, etc. Other than that, unless you use some OS-specific (Windows) code, I don't see any reason for your executable to be slower under Windows.
As I said Linux will always be slightly slower for some reason I don't understand.
Guess you mean Windows is slower.
Don wrote:Someone posted some time back that it was due to the OS using registers that were not available to application (but are in Linux) and thus the compiler can get a bit more. I don't really know if that is the reason or not but if you check out several binaries where there is a windows and a linux compile you will see that the Linux version is faster. Critter shows 6% for example but Richard uses different compilers so it's not apples to apples. I use the same compiler for both and I cross compile. One of the things I checked out is whether the cross compile is the issue - and it isn't. The native mingw compile on windows produced a binary that was almost exactly the same performance.
Of course if you use a superior compiler on the Windows version you may be able to get a superior binary if you can make up for the difference and then some.
It's my understanding that recent GCC versions are no longer second to Intel with respect to code quality. I don't have actual experience with that so I don't know. Intel also makes a compiler for Linux so it could be tested. The problem is that each compiler performs differently depending on your application. I have heard people report that their chess program performs better using clang or other compilers but those are significantly weaker for Komodo.
P.S. I do I/O in a separate thread with blocking input. During the search I poll for input and time control periodically. I don't have any reason to believe that would be slower in Windows, but I will double check these issues. Thanks for the suggestion.
I think in chess, where leaf functions or effective leaf functions dominate, it is better to have more caller-save registers (scratch registers) and fewer callee-save registers. 64-bit Windows has 7 caller-save vs. 8 callee-save general-purpose registers. 64-bit Linux has 9 caller-save vs. 6 callee-save registers, thus Linux is better for us, not to mention the SIMD registers.
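To make those counts concrete, here is a small sketch that spells out the general-purpose registers behind each number, following the Microsoft x64 and System V AMD64 calling conventions with `rsp` left out on both sides. The array and function names are just illustrative, and it assumes a C++17 compiler.

```cpp
#include <array>
#include <string_view>

// Microsoft x64 convention (64-bit Windows): 7 scratch, 8 preserved.
constexpr std::array<std::string_view, 7> kWin64CallerSaved = {
    "rax", "rcx", "rdx", "r8", "r9", "r10", "r11"};
constexpr std::array<std::string_view, 8> kWin64CalleeSaved = {
    "rbx", "rbp", "rdi", "rsi", "r12", "r13", "r14", "r15"};

// System V AMD64 convention (64-bit Linux): 9 scratch, 6 preserved.
constexpr std::array<std::string_view, 9> kSysVCallerSaved = {
    "rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"};
constexpr std::array<std::string_view, 6> kSysVCalleeSaved = {
    "rbx", "rbp", "r12", "r13", "r14", "r15"};

// Both conventions split the same 15 registers (16 minus the stack pointer).
static_assert(kWin64CallerSaved.size() + kWin64CalleeSaved.size() == 15);
static_assert(kSysVCallerSaved.size() + kSysVCalleeSaved.size() == 15);

// How many more registers a leaf function may clobber for free on Linux.
constexpr int scratch_register_gap() {
    return static_cast<int>(kSysVCallerSaved.size()) -
           static_cast<int>(kWin64CallerSaved.size());
}
```

The two extra scratch registers on Linux are `rsi` and `rdi`, which Windows treats as preserved. On the SIMD side the gap is wider: Windows requires a callee to preserve `xmm6` through `xmm15`, while the System V convention preserves none of the `xmm` registers.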
Mincho Georgiev wrote: Let's talk about the compiler, not the compiles.
How does that address the question of why Komodo runs so much faster on Linux? It may be true that if I run an inferior compiler on Linux I may be able to remove the disparity, but I hope you can see that this is not a serious answer to my question. I can beat Roger Federer at tennis if you cut off his right arm and make him wear two 50-pound concrete boots. That might eliminate the disparity, but it wouldn't explain why he is a better player.
I will open a separate thread on the technical forum to talk about which compiler should be used but that has no relevance to my question here.
If you're using GNUCC, that would make sense. But if we are talking about the Intel compiler, which produces executables at least 5-10% faster than any other compiler on ANY platform, I really don't share your experience. Of course, I don't want to start a flame war of any kind, I've been through that already; I just wanted to share my experience. ICC (up to .11, with PGO) produces the fastest code I've seen, with absolutely equal speed on both Windows and Linux. I say again: MY experience with it. So, which one are you using? Otherwise I can't say much.
It's my understanding that recent GCC versions are no longer second to Intel
I highly doubt it, especially regarding PGO.
A GNUCC executable on Linux is faster than a GNUCC executable on Windows, at least for the tons of sources I have dealt with.
An Intel-compiled one is exactly equivalent in terms of speed on both, on the same hardware. And how is that irrelevant?
You are not being consistent so I don't believe you. You said that regardless of platform Intel is going to be 5-10% faster. That means there is still a major disparity unless the gain relative to gcc is much smaller on Linux - but that is not what you said.
I don't want to start a flame war either but I think you are making this up as you go.
Mincho Georgiev wrote: Just a hint. I have no idea what you're using, but nothing can beat the Intel compiler's profiling under Windows. Nothing.
Also, make sure your polling is working as it should. Suppress buffering, etc. Other than that, unless you use some OS-specific (Windows) code, I don't see any reason for your executable to be slower under Windows.
As I said Linux will always be slightly slower for some reason I don't understand.
Guess you mean Windows is slower.
Yes, I meant that Linux seems to run chess programs much better.