eval pieces

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: eval pieces

Post by mar »

sje wrote:About 5%:

Code:

gail:Symbolic sje$ ls -l Symbolic
-rwxr-xr-x  1 sje  staff  1010200 Jun 19 22:52 Symbolic
gail:Symbolic sje$ strip Symbolic
gail:Symbolic sje$ ls -l Symbolic
-rwxr-xr-x  1 sje  staff  955168 Jun 20 04:36 Symbolic
That's an executable, not an object file.
hgm
Posts: 27795
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: eval pieces

Post by hgm »

Isn't that the same, in Linux? I thought an (unstripped) executable was just an object file where no external symbols happen to be undefined.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: eval pieces

Post by mar »

I'm not sure about the format; you may be right. I guess the symbols are there (in the executable) for debugging purposes,
so if it exports the same symbols as the object file then yes, it would be the same.
But object files need all their exported symbols: say you export some functions and link statically.
So you need the symbols in the object file, but you don't need them in the executable.

In C++ there's also name mangling that inflates symbols (because you need to encode types as well since C++ supports overloading,
void myfunc(int) and void myfunc(const char *) have to map to different symbols)
and with templates it's even more complicated:
Let's say you have a template and it has some non-trivial functions in it (i.e. they cannot be inlined).
Let's say you use it in multiple .cpp files with the same template arguments and linking statically.
The linker is supposed to use only one instance in the final executable because otherwise it would mean lots of code duplication.
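
To make the mangling and template points concrete, here is a minimal single-file sketch. The mangled names in the comment assume the Itanium C++ ABI used by GCC and Clang; other compilers encode them differently. The `hits` template and the helper functions are hypothetical names introduced for illustration, and `myfunc` returns a string here only so the chosen overload can be observed (the return type of a non-template function is not part of its Itanium-mangled name).

```cpp
#include <cassert>
#include <string>

// Two overloads sharing one source-level name. The compiler has to emit two
// distinct linker symbols; under the Itanium C++ ABI (GCC/Clang) these are
// _Z6myfunci and _Z6myfuncPKc.
std::string myfunc(int) { return "int"; }
std::string myfunc(const char*) { return "const char*"; }

// A template with a function-local static. Every use of hits<int>() refers to
// the same instantiation (and the same counter); hits<long>() is a separate
// function with its own counter.
template <typename T>
int& hits() {
    static int count = 0;
    return count;
}

// Bumps hits<int> twice and hits<long> once, then packs both counters into
// one number so the result is easy to inspect.
int touch_int_twice_and_long_once() {
    ++hits<int>();
    ++hits<int>();
    ++hits<long>();
    return hits<int>() * 10 + hits<long>();
}

// Small self-check: overload resolution picks the function by argument type.
void check_overloads() {
    assert(myfunc(1) == "int");
    assert(myfunc("e4") == "const char*");
}
```

When the same instantiation appears in several .cpp files, each object file carries its own copy and the linker is expected to keep one of them; a single file cannot show that step, so this sketch only shows that one set of template arguments corresponds to one function.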

EDIT: I've also noticed that the Microsoft linker even checks for functions with identical implementations (identical COMDAT folding, /OPT:ICF), so this can lead to even more savings in the resulting executable,
especially with templates, when some functions do not depend on the template arguments.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: eval pieces

Post by sje »

lucasart wrote:
bob wrote: c++ is teaching sloppy programming practices, unfortunately. For example, in Crafty, the ENTIRE source compiles into an object file of about 300K. egtb.cpp (the Nalimov code with massive template usage) compiles into a 1mb object file. From a performance perspective, that certainly sucks with two straws... L3 cache might be able to cope with that, but L1/L2 are not going to do so well..
What matters is the size of the executable, not the size of the various object files used before linking.

The problem of C++ is more in the compilation speed, which is extremely slow (partly because the syntax rules are extremely complicated and context sensitive, and also because the standard libraries are so crufty). But the linker can then throw away what's not needed. So the result should not be as large as the sum of object file sizes. The speed of the compiled code is equivalent to C (assuming the programmer knows what he's doing, though most people who use C++ really don't know what they're doing).
People should not forget that a significant portion of the run time memory footprint is going to include code from dynamically linked libraries; this doesn't show up when looking only at the size of the run time binary file. The effect on code cache pressure will vary; hopefully the chess program's update and evaluation routines rarely call any library functions or make system calls.

As for C++ compilation speed:

1) Get a faster computer
2) Use a compilation scheme which uses all available cores
3) Use a compilation scheme which uses networked helper machines
4) Avoid referencing unneeded include files
5) Avoid referencing an include file more than once per compilation unit
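
For point 5, the usual first line of defence is an include guard. The sketch below is only an illustration: `square.h`, `Square`, `file_of` and `rank_of` are hypothetical names, and the header text is pasted twice into one block to simulate a second `#include` of the same file.

```cpp
#include <cassert>

// ---- begin: contents of a hypothetical header "square.h" ----
#ifndef SQUARE_H
#define SQUARE_H

#include <cstdint>

struct Square {
    std::uint8_t index;  // 0..63, with a1 = 0
};

inline int file_of(Square s) { return s.index & 7; }
inline int rank_of(Square s) { return s.index >> 3; }

#endif  // SQUARE_H
// ---- end of "square.h" ----

// ---- the same header seen a second time in this compilation unit ----
#ifndef SQUARE_H
#define SQUARE_H
// Without the guard this would be a redefinition error; with it, the
// preprocessor skips everything up to the matching #endif.
struct Square {
    std::uint8_t index;
};
#endif  // SQUARE_H

// Small self-check that the single surviving definition is usable.
void check_square_helpers() {
    assert(file_of(Square{12}) == 4);
    assert(rank_of(Square{12}) == 1);
}
```

A guard stops the header body from being parsed twice. It does not by itself remove the cost of headers that are included but never needed, which is point 4.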
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: eval pieces

Post by syzygy »

The major contributor to the size of crafty's executable is indeed egtb.cpp.

After "make quick" on crafty-24.0 (i.e. no profiling stuff):
with egtb: 1277091 bytes unstripped, 1187376 bytes stripped.
without egtb: 370309 bytes unstripped, 341384 bytes stripped.

But it is true that the size of the egtb.o file, 1559448 bytes, exceeds the difference (1277091 - 370309 = 906782 bytes unstripped). (And this is not a case of link-time optimisation.)

To really get rid of the egtb code, small changes in the makefile are needed beyond adding -DNOEGTB to the "opt =" line.

I don't think the size of the egtb code is a big deal, as most of the code will not be used even in case of relatively heavy probing. Duplicating evaluation functions is much more likely to impact cpu cache performance.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: eval pieces

Post by bob »

lucasart wrote:
bob wrote: c++ is teaching sloppy programming practices, unfortunately. For example, in Crafty, the ENTIRE source compiles into an object file of about 300K. egtb.cpp (the Nalimov code with massive template usage) compiles into a 1mb object file. From a performance perspective, that certainly sucks with two straws... L3 cache might be able to cope with that, but L1/L2 are not going to do so well..
What matters is the size of the executable, not the size of the various object files used before linking.

The problem of C++ is more in the compilation speed, which is extremely slow (partly because the syntax rules are extremely complicated and context sensitive, and also because the standard libraries are so crufty). But the linker can then throw away what's not needed. So the result should not be as large as the sum of object file sizes. The speed of the compiled code is equivalent to C (assuming the programmer knows what he's doing, though most people who use C++ really don't know what they're doing).
If you read what I wrote... if an object file has no initialized data in it, then the size of the object file is directly proportional to the cache footprint it adds to the executable. If you initialize data in something, the size of the object file will certainly explode, but most don't do that. I don't care about C++ compilation speed. I care a lot about execution speed. I agree C++ can be just as fast as C if the programmer is careful and knows what he is doing. But templates are primarily "code bloaters". Eugene's egtb.cpp is a good case in point. Rybka's evaluation was another.
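
The "code bloater" effect can be sketched in a few lines: each distinct set of template arguments produces its own function in the object file. This is an illustration only, not code from egtb.cpp; `probe`, `probe_table` and `distinct_probe_functions` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for a templated probing routine. The table size N is
// a compile-time parameter, so each distinct N yields its own function body.
template <int N>
int probe(int index) {
    return index % N;
}

using ProbeFn = int (*)(int);

// Taking the addresses forces all three instantiations to be emitted.
ProbeFn probe_table[] = { &probe<3>, &probe<4>, &probe<5> };

// Counts how many different function addresses the table holds.
int distinct_probe_functions() {
    int distinct = 0;
    for (int i = 0; i < 3; ++i) {
        bool seen = false;
        for (int j = 0; j < i; ++j) {
            if (probe_table[i] == probe_table[j]) seen = true;
        }
        if (!seen) ++distinct;
    }
    return distinct;
}

// Small self-check: three instantiations, three separate functions.
void check_probe_instances() {
    assert(distinct_probe_functions() == 3);
}
```

Three instantiations mean three functions in the object file. How much of that ends up in the instruction cache depends on which of them actually run.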
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: eval pieces

Post by bob »

mar wrote:
sje wrote:I'm rather doubtful about the merit of correlating object size with run time efficiency.
Yes and I bet most of that is symbols anyway :)
For typical code it is not really symbols, at least in my case. A procedure (file) is pretty self-inclusive. It might call a small number of functions, but there are not thousands of global variables. Most references are offset to the split block pointer which doesn't use any symbols.

But for C++, look out for the templates.

C++ started off as a pretty good idea. But it was a committee deal, and committees don't work. C was written by two people, it was designed by two people, it has withstood the test of time since the early 70's. No ISO, no POSIX, just a few bright guys deciding what they needed/wanted. C++ is quite the opposite. Too many people involved, the inability to decide on a reasonable standard and so they include everything anyone can imagine and then some, and it leads to lots of potential pitfalls that nobody is even aware of unless you are a compiler person...

And the bloat continues every few years when the next "standard" is developed and released...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: eval pieces

Post by sje »

bob wrote:C++ started off as a pretty good idea. But it was a committee deal, and committees don't work. C was written by two people, it was designed by two people, it has withstood the test of time since the early 70's. No ISO, no POSIX, just a few bright guys deciding what they needed/wanted. C++ is quite the opposite. Too many people involved, the inability to decide on a reasonable standard and so they include everything anyone can imagine and then some, and it leads to lots of potential pitfalls that nobody is even aware of unless you are a compiler person...

And the bloat continues every few years when the next "standard" is developed and released...
You must admit that the advent of ANSI C in the 1980s made for a BIG improvement over the original C. The addition of function prototypes and consistent conversion of actual parameters eliminated major bug magnets. Prior to ANSI C, a coder had to rely on a utility like lint to do what the compiler should have been doing. Remember also that original C didn't have the void or unsigned char types, nor did it have enumeration types. No "//" comments, either.

As for C++, the availability of classes to allow for abstraction, encapsulation, and inheritance -- none really available in C -- make major multi-coder projects feasible. C++ operator overloading for user defined classes like bitboards, hashes, and moves helps make code legible. C++ inline function declarations help in several ways, not the least of which is to support type checking which preprocessor macros never had.
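
A minimal sketch of both points, assuming a bare-bones bitboard wrapper. `Bitboard`, `Square`, `square_bb`, `popcount` and `SQUARE_BB_MACRO` are hypothetical names, not taken from any engine mentioned in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal wrappers, for illustration only.
struct Square {
    int index;  // 0..63
};

struct Bitboard {
    std::uint64_t bits;
};

// Overloaded operators let set operations on bitboards read like arithmetic.
inline Bitboard operator|(Bitboard a, Bitboard b) { return Bitboard{a.bits | b.bits}; }
inline Bitboard operator&(Bitboard a, Bitboard b) { return Bitboard{a.bits & b.bits}; }
inline bool operator==(Bitboard a, Bitboard b) { return a.bits == b.bits; }

// Type-checked inline function: it accepts a Square and nothing else, so
// passing a Bitboard by mistake is rejected at compile time.
inline Bitboard square_bb(Square s) {
    return Bitboard{std::uint64_t{1} << s.index};
}

// Macro counterpart for comparison: it accepts any integer expression,
// including one that is not a square index at all.
#define SQUARE_BB_MACRO(sq) (1ULL << (sq))

// Counts set bits by clearing the lowest set bit until none remain.
inline int popcount(Bitboard b) {
    int n = 0;
    for (std::uint64_t x = b.bits; x != 0; x &= x - 1) ++n;
    return n;
}

// Small self-check using d4 (27) and e4 (28).
void check_bitboard_ops() {
    assert(popcount(square_bb(Square{27}) | square_bb(Square{28})) == 2);
}
```

With these definitions `square_bb(Square{27}) | square_bb(Square{28})` reads as the union of two squares, and handing the wrong kind of value to `square_bb` fails to compile instead of silently shifting by a nonsense amount.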

As for the latest and greatest C++11, its main benefit is to standardize representation of low level concepts like threads, locks, timing/clocks, PRNGs, etc. in a platform independent manner. This is a great help with eliminating the forest of ifdef preprocessor macro definitions needed to select different code for Linux/Mac/Windows.
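
As a rough sketch of what that looks like in practice, the snippet below uses only standard C++11 headers for a PRNG, a clock and a lock, with no platform ifdefs. `nth_key`, `elapsed_ms`, `NodeCounter` and `sum_nodes` are hypothetical helper names chosen for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <random>

// Returns the n-th 64-bit output of the standard Mersenne Twister for a given
// seed. The sequence is the same on every conforming platform, which is handy
// for things like hash keys.
std::uint64_t nth_key(std::uint64_t seed, int n) {
    std::mt19937_64 prng(seed);
    std::uint64_t key = 0;
    for (int i = 0; i < n; ++i) key = prng();
    return key;
}

// Measures how long a callable takes, in milliseconds, with a monotonic clock.
template <typename F>
long long elapsed_ms(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// A node counter guarded by a standard mutex, safe to share between threads.
class NodeCounter {
public:
    void add(std::uint64_t n) {
        std::lock_guard<std::mutex> guard(mutex_);
        total_ += n;
    }
    std::uint64_t total() {
        std::lock_guard<std::mutex> guard(mutex_);
        return total_;
    }
private:
    std::mutex mutex_;
    std::uint64_t total_ = 0;
};

// Convenience wrapper so the counter can be exercised in one expression.
std::uint64_t sum_nodes(std::uint64_t a, std::uint64_t b) {
    NodeCounter counter;
    counter.add(a);
    counter.add(b);
    return counter.total();
}

// Small self-check: the same seed always yields the same key.
void check_prng_repeatable() {
    assert(nth_key(42, 5) == nth_key(42, 5));
}
```

Threads themselves come from `<thread>` in the same way; they are left out here to keep the sketch short.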
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: eval pieces

Post by Henk »

When I used C++ for the last time in 2007 they complained that I did not use routines that were Unicode compatible. I forgot what it was all about. Maybe portability?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: eval pieces

Post by bob »

sje wrote:
bob wrote:C++ started off as a pretty good idea. But it was a committee deal, and committees don't work. C was written by two people, it was designed by two people, it has withstood the test of time since the early 70's. No ISO, no POSIX, just a few bright guys deciding what they needed/wanted. C++ is quite the opposite. Too many people involved, the inability to decide on a reasonable standard and so they include everything anyone can imagine and then some, and it leads to lots of potential pitfalls that nobody is even aware of unless you are a compiler person...

And the bloat continues every few years when the next "standard" is developed and released...
You must admit that the advent of ANSI C in the 1980s made for a BIG improvement over the original C. The addition of function prototypes and consistent conversion of actual parameters eliminated major bug magnets. Prior to ANSI C, a coder had to rely on a utility like lint to do what the compiler should have been doing. Remember also that original C didn't have the void or unsigned char types, nor did it have enumeration types. No "//" comments, either.

As for C++, the availability of classes to allow for abstraction, encapsulation, and inheritance -- none really available in C -- make major multi-coder projects feasible. C++ operator overloading for user defined classes like bitboards, hashes, and moves helps make code legible. C++ inline function declarations help in several ways, not the least of which is to support type checking which preprocessor macros never had.

As for the latest and greatest C++11, its main benefit is to standardize representation of low level concepts like threads, locks, timing/clocks, PRNGs, etc. in a platform independent manner. This is a great help with eliminating the forest of ifdef preprocessor macro definitions needed to select different code for Linux/Mac/Windows.
There's a huge difference between taking a language and writing a specification for it that tightens up ambiguities, and writing a language spec that includes everything anyone suggests, and then doubling that. :)

As for the good things in C++, there are also the ugly things. Operator overloading? :) I have seen more bugs caused by using that than I would have imagined. Just pass an argument of the wrong type and you call the wrong function without noticing.
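
One way this can happen, sketched with a hypothetical `MoveLog` class (not code from any engine in this thread): an overloaded `operator<<` that accepts both `bool` and `std::string` silently picks the `bool` version for a string literal, because the pointer-to-bool conversion is a standard conversion and outranks the user-defined conversion to `std::string`.

```cpp
#include <cassert>
#include <string>

// Hypothetical logger with two overloads of operator<<.
struct MoveLog {
    std::string text;

    MoveLog& operator<<(bool flag) {
        text += flag ? "true" : "false";
        return *this;
    }
    MoveLog& operator<<(const std::string& s) {
        text += s;
        return *this;
    }
};

// The literal "e2e4" decays to const char*, which converts to bool by a
// standard conversion. That beats the user-defined conversion to std::string,
// so the bool overload is called and the move text is lost.
std::string log_literal() {
    MoveLog log;
    log << "e2e4";
    return log.text;
}

// Passing a std::string explicitly selects the intended overload.
std::string log_string() {
    MoveLog log;
    log << std::string("e2e4");
    return log.text;
}

// Small self-check of both behaviours.
void check_overload_surprise() {
    assert(log_literal() == "true");
    assert(log_string() == "e2e4");
}
```

Both calls compile, and some compilers accept the first one without a warning by default. That is the "without noticing" part.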