About off-topic threads [pruned branch]

Discussion of chess software programming and technical issues.

Moderator: Ras

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: strcpy() revisited

Post by wgarvin »

Daniel Shawul wrote:
mar wrote:
Daniel Shawul wrote:The rest is like I said, a bunch of desperate programmers, trying to look smart, beating their chests to an irrelevant crowd of people. Let's see your wizardry in 'C programming forums', 'C++ programming forums'. Anyone who has visited a fair number of programming forums knows that what is going on here is an ego-trip between self-proclaimed experts.
While I agree with you that programming should be primarily about algorithms and design and not about fancy language constructs
and that programming language is just a tool, what's wrong about discussing circumstances under which it can break?
How can you judge others while knowing absolutely nothing about other projects they worked on?
But you are certainly an expert in insulting others.
Please note the discussion is about strcpy(), not the branch about lockless hashtable undefined behaviour. What is your summary of the original strcpy() thread and the revisited strcpy() thread as is relevant to chess programmers? You would have a hard time justifying that 30 pages of discussion if you are being honest.

To be fair, this purely programming discussion is not the first one here, as I have already mentioned, but that doesn't make it right. AFAIK off-topic programming discussions are allowed within reason (the best the charter says), which does not imply discussing something as boring as visiting strcpy() and revisiting it with 30 pages of one-liners.
It doesn't make it wrong either. I don't see why we have to justify to you the chess-worthiness of a programming topic. We had fun discussing it, and undefined behavior does after all have at least some impact on chess programming and pretty much every other kind of programming. If the moderators feel things are getting out of hand, they can take whatever action they feel is necessary: issue warnings, delete some of the posts or threads, etc. That's completely reasonable. But I'm surprised at the intolerant attitude some people have here. If a thread doesn't interest you, there's no need for you to read it. Why should you care if it accumulates ten thousand views and 30 pages of posts?

Maybe people are a bit sensitive about it because of actual trolling, flame-wars etc. that have happened in General? I don't read General most of the time, so this is just a guess. But I know the moderators occasionally have to swing their clue-bat over there, and my impression at least is that the programming subforum is usually better-behaved than General. Everyone here is living proof that programmers can be passionate about topics that interest them. :P

Clearly there are some forum members who are annoyed or offended by the long, long threads about the strcpy() thing and the entire can-of-worms about UB that it opened up. But there are others who learned things during the discussion, so I don't think it was entirely a waste of time.

[Edit: however, I do agree that 30 pages was more than the topic really deserved. I guess we should try to keep our flame-wars shorter in the future! :lol:]
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: strcpy() revisited

Post by Daniel Shawul »

Just say it plainly that the strcpy() discussion was total garbage, and of course you don't have to justify anything to me. What is puzzling is the deafening silence of the mods about the issue of pure programming discussions. I just recalled a perfect example of how I learned about the effects of UB some 3 years ago with regard to the order of destruction of objects, and how and where I asked questions, how I took my lectures... First, my question was not asked here in a chess programming forum (so I do what I preach :)) but in the C++ codeguru forum, and it ended in about a page of discussion (so it ended very short, as expected). I was actually looking for more discussion, but those guys do not let you ramble on. In fact I believe it needed a bit more clarification, but I gave up looking for it when one of the guys started lecturing me about not using global variables even though I mentioned that the use of global variables is necessary for the application. I guess that is a problem everywhere, i.e. getting lectures on programming languages/style; it is sort of justified in a programming forum, but not in a chess programming forum. Anyway, you might find the actual UB discussion interesting, so here it is: http://forums.codeguru.com/showthread.p ... ost1965764
Daniel Shawul

Re: strcpy() revisited

Post by Daniel Shawul »

In fact I challenge the UB experts here to offer their solutions for my UB problem from three years ago. Let's see what you are made of :)
wgarvin

Re: strcpy() revisited

Post by wgarvin »

Daniel Shawul wrote:Just say it plainly that the strcpy() discussion was total garbage, and of course you don't have to justify anything to me. What is puzzling is the deafening silence of the mods about the issue of pure programming discussions. I just recalled a perfect example of how I learned about the effects of UB some 3 years ago with regard to the order of destruction of objects, and how and where I asked questions, how I took my lectures... First, my question was not asked here in a chess programming forum (so I do what I preach :)) but in the C++ codeguru forum, and it ended in about a page of discussion (so it ended very short, as expected). I was actually looking for more discussion, but those guys do not let you ramble on. In fact I believe it needed a bit more clarification, but I gave up looking for it when one of the guys started lecturing me about not using global variables even though I mentioned that the use of global variables is necessary for the application. I guess that is a problem everywhere, i.e. getting lectures on programming languages/style; it is sort of justified in a programming forum, but not in a chess programming forum. Anyway, you might find the actual UB discussion interesting, so here it is: http://forums.codeguru.com/showthread.p ... ost1965764
Okay, now I understand your objection to the UB threads better. 8-)

About your global variables case from back then: I think the problem you ran into is commonly called the static initialization order fiasco. The problem is that C++ guarantees to call the constructor and destructor for each global variable, but for globals defined in different source files (translation units) it doesn't promise to construct them in any particular order, and destruction simply runs in the reverse of whatever order construction happened to take. So any code you run "before" main (i.e. during one of those constructors of a global variable) or "after" main (i.e. during one of the destructors of a global variable) needs to be careful not to make use of any other global variable. All of the basic C types are safe, as are POD structs. But anything that does stuff in its ctor/dtor (such as STL containers, std::string, or any class that has an STL container or std::string as a member) can only be used after it has been properly constructed and before it has been destroyed. From the C++ point of view, that's the "lifetime" of the object; it doesn't really "exist" outside of those times, even if the raw storage space is still there. Constructors are invoked on some raw space and construct a usable object there. Destructors tear down that usable object, leaving just the raw space. So your "std::list dlist" got destroyed, and then the other class tried to access something in it and Bad Things(tm) ensued.

So how do you make sure you only use globals that have been initialized and not destroyed yet? There are some tricks for doing a "lazy" initialization of the global, but they often come with a performance penalty (e.g. you replace it with a pointer and new/delete it yourself, but now you have to pay the cost of the pointer indirection every time you use it), and/or making them thread-safe can be a big challenge.
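One common form of that lazy initialization is the "construct on first use" idiom, sketched below. The names debug_list and Registrar are made up for illustration and are not from the codeguru thread. The function-local static is constructed the first time the accessor is called, so any constructor that goes through the accessor is guaranteed to see a live object. The sketch assumes a C++11 compiler, where initialization of local statics is also thread-safe; the price is a hidden "already initialized?" check on every call.

```cpp
#include <list>
#include <string>

// Hypothetical accessor: the list is constructed the first time this
// function is called, not at some unspecified point before main().
std::list<std::string>& debug_list() {
    static std::list<std::string> instance;  // constructed on first call
    return instance;
}

// Hypothetical class whose constructor runs during static initialization
// and needs the list to exist already.
struct Registrar {
    explicit Registrar(const std::string& name) {
        debug_list().push_back(name);  // safe: forces construction first
    }
};

// These could live in different source files; the list is ready either way.
static Registrar first_registrar("first");
static Registrar second_registrar("second");
```

Because the list finishes construction before the first Registrar does, it is also destroyed after both Registrars, so their destructors could still use it. An object that first touches the list only in its destructor would not get that guarantee.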

At my day job, we use a lot of "singleton" objects that would have this problem: they are global variables, and they need to be constructed and destructed. They often have complex dependencies between them, so we really need to control the order in which we initialize and destroy them. So what we always do is write our own Init and Destroy functions for the program to initialize and destroy all of the singletons in the order that we want. We usually make some sort of "Singleton storer" template that reserves the proper size of space (and with proper alignment) for the singleton type, but doesn't have a constructor or destructor. (It's just raw space.) We call a method on this object, called CreateInstance or something, and that method uses placement new to construct the singleton in the space. Every place where we ever use this singleton, we call the GetInstance method, which returns a reference to the proper object type. And when we are shutting down, we call DestroyInstance on it, which invokes the singleton type's destructor.

So some variant of that might work for your case: make a template wrapper class that acts like the wrapped type except that it has no constructor or destructor, and it has methods to explicitly construct and destroy the wrapped type. Then initialize and destroy the wrappers yourself by calling those methods in a safe order when the program starts up or shuts down. Yes, this is kind of ugly... But the more globals of that sort that you have, the more important it is to be able to control the order yourself! And we use threads, but we make sure (during startup) not to start a worker thread before we have initialized all the singletons that it might try to access, and later (during shutdown) we wait for the thread to exit before we start tearing down those singletons.
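A minimal sketch of that explicit-order scheme, assuming a C++11 compiler (for alignas and nullptr). Logger, Cache, InitGlobals and DestroyGlobals are hypothetical names chosen for the example, and this is not the code from my day job. The wrapper has no constructor or destructor of its own, so the compiler does nothing at startup or shutdown; the program decides when each object lives. Keeping the pointer returned by placement new also avoids casting the raw storage later.

```cpp
#include <new>
#include <string>
#include <vector>

// Raw, suitably aligned storage with no constructor or destructor.
template <class T>
class GlobalWrapper {
public:
    void Create()  { m_ptr = new (m_storage) T(); }   // placement new
    void Destroy() { m_ptr->~T(); m_ptr = nullptr; }  // explicit destructor call
    T&   Get()     { return *m_ptr; }
    bool IsAlive() const { return m_ptr != nullptr; }
private:
    alignas(T) unsigned char m_storage[sizeof(T)];
    T* m_ptr;  // zero-initialized because the wrappers are globals
};

// Hypothetical singleton that others depend on.
struct Logger {
    std::vector<std::string> lines;
    void Log(const std::string& s) { lines.push_back(s); }
};
GlobalWrapper<Logger> g_logger;

// Hypothetical singleton that uses the logger in its ctor and dtor.
struct Cache {
    Cache()  { g_logger.Get().Log("cache up"); }
    ~Cache() { g_logger.Get().Log("cache down"); }
};
GlobalWrapper<Cache> g_cache;

// The program, not the compiler, picks the order.
void InitGlobals()    { g_logger.Create(); g_cache.Create(); }
void DestroyGlobals() { g_cache.Destroy(); g_logger.Destroy(); }
```

A program would call InitGlobals() at the top of main and DestroyGlobals() just before returning from it.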

A tricky part is that you want the wrapper to have the same alignment as the wrapped type; I'm not aware of a portable way to detect that alignment although the major compilers all support some extension for querying it. (Actually I think it can be deduced using some template metaprogramming, but that is real voodoo.) You could avoid that hassle by just aligning the wrapper to e.g. 16 since your wrapped type probably won't need more than 8 (for double or int64) or maybe 16 if it uses SIMD vector types.

Providing access to the wrapped "object" in a standard-compliant way is also tricky; you are probably casting a pointer to your raw storage into a pointer to the wrapped type, and the cast is probably illegal under the strict aliasing rule. At work we often have strict aliasing disabled anyway because we do plenty of type punning and nobody really understands how to do it safely AND efficiently across all our target compilers. In this case, I think even if the cast is technically undefined behavior, as long as you never try to access that storage using any _other_ type (such as whatever type you declared it with) then you should be OK, because the usual failure mode for strict-aliasing violations is the compiler believing that two memory accesses don't overlap because the types are different. So as long as you don't try to access the storage in some other way besides through that cast, it should be fine.
wgarvin

Re: strcpy() revisited

Post by wgarvin »

For the wrapper to be the proper size and alignment...

You could consider making a template that takes the size and alignment as parameters. It could allocate the size as a char array and use compiler extensions like __declspec(align(N)) or __attribute__((aligned(N))). Alternatively, you could specialize the template for the alignments you care about (1, 2, 4, 8, 16 and maybe 64 should be enough?) and build it out of a basic type that already has that alignment. Like, for an alignment of 1, use an array of N chars. For an alignment of 2, use an array of (N+1)/2 int16_t's. And so on. So this template might be "template< int Size, int Align > struct AlignedSpace { ... };" and then you could make something like "template< class T > struct SingletonWrapper : public AlignedSpace< sizeof(T), __alignof__(T) > { ... };",
though that __alignof__ is a compiler-specific extension. You can just throw 16 in there and it will work in most cases, maybe wasting a little bit of padding space.
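If a C++11 compiler is available (an assumption; the extensions above are what older compilers need), the standard alignof and alignas keywords can stand in for the compiler-specific spellings. A rough sketch of the two templates described above:

```cpp
#include <cstddef>

// Raw storage of Size bytes aligned to Align bytes (C++11 alignas).
template <std::size_t Size, std::size_t Align>
struct AlignedSpace {
    alignas(Align) unsigned char bytes[Size];
};

// Storage with the size and alignment of T, but no constructor or destructor.
template <class T>
struct SingletonWrapper : AlignedSpace<sizeof(T), alignof(T)> {};

// Compile-time checks that the storage matches the wrapped type.
static_assert(alignof(SingletonWrapper<int>) == alignof(int),
              "alignment should match the wrapped type");
static_assert(sizeof(SingletonWrapper<int>) == sizeof(int),
              "no extra padding expected");
```

The Create/Get/Destroy methods would still have to be added on top of this; the sketch only covers the size and alignment part.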


Edit: I think I wasn't clear enough about one thing:
About your global variables case from back then: I think the problem you ran into is commonly called the static initialization order fiasco. The problem is that C++ guarantees to call the constructor and destructor for each global variable, but for globals defined in different source files it doesn't promise to do them in any particular order.
Because of this problem, what we do at my day job is just not let C++ handle the construction/destruction for us. We only declare a global variable of a type that has a ctor/dtor if no other singletons/globals need to touch it, and if the ctor and dtor don't do very much. Anything nontrivial (such as STL containers) we would instead declare it using our wrapper template. In our case, we know when designing the classes that we want them to be singletons, so we can inherit from the template or just use macros that declare the proper CreateInstance/DestroyInstance/GetInstance methods in them. We use the same conventions for all of them. Maybe SingletonWrapper is a lousy name to use with types that aren't actually singletons! "GlobalWrapper<T>" might be better.
wgarvin

Re: strcpy() revisited

Post by wgarvin »

Here's some totally untested code that might be usable as a starting point. There might be something dumb in this, I couldn't sleep so I did this instead...

Code:

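#include <stdint.h>  // int16_t, int32_t, int64_t
#include <new>       // placement new

// Maps an alignment to a basic type that has that alignment.
// (struct rather than class, so that the typedef T is publicly visible.)
template< unsigned Align > struct AlignType {};
template<> struct AlignType<1> { typedef char T; };
template<> struct AlignType<2> { typedef int16_t T; };
template<> struct AlignType<4> { typedef int32_t T; };
template<> struct AlignType<8> { typedef int64_t T; };
//template<> struct AlignType<16> { typedef __m128 T; };

// Raw storage of at least Size bytes, built from elements with the right alignment.
template< unsigned Size, unsigned Align >
struct AlignedStorage
{
    typedef typename AlignType<Align>::T ElemT;
    ElemT   m_data[ (Size+sizeof(ElemT)-1)/sizeof(ElemT) ];
};

// Deduces the alignment of T from the padding inserted after a single char.
template< class T >
struct AlignOf
{
protected:
    struct AlignmentFinder { char a; T b; };
public:
    enum { N = sizeof(AlignmentFinder) - sizeof(T) };
};

template< class T >
class GlobalWrapper : protected AlignedStorage< sizeof(T), AlignOf<T>::N >
{
public:
    // m_data lives in a dependent base class, so it has to be named via this->
    T& Get() { return *reinterpret_cast<T*>(this->m_data); }
    const T& Get() const { return *reinterpret_cast<const T*>(this->m_data); }
    void Create()  { new((void*)this->m_data) T(); }
    void Destroy() { Get().~T(); }
};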
For this to work, T needs to have a public default constructor and a public destructor. The destructor call will be non-virtual even if T has a virtual destructor, but that's OK since we know it really was a T we constructed there. :P It will fail to compile if T has a strange alignment requirement (like if you told the compiler it needed 64-byte alignment) since the AlignType template is only specialized for the few alignments you can get using basic types, no __declspec(align) etc. You have to call Get() on the wrapper object to get access to the thing wrapped inside, which you must Create() yourself before using it, and Destroy() when you are done.
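The AlignOf trick in that code deserves a word of explanation. In a struct holding a char followed by a T, the T member lands at the first offset of at least 1 that satisfies T's alignment, which is the alignment itself, and since sizeof(T) is a multiple of that alignment there is no padding after it. The leftover sizeof difference is therefore the alignment. The sketch below checks this against the C++11 alignof keyword for a few types; Pair is a made-up example type, and the equality is what mainstream compilers produce rather than something the standard spells out.

```cpp
#include <cstddef>

template <class T>
struct AlignOf {
    struct AlignmentFinder { char a; T b; };
    // Padding after 'a' pushes 'b' to an offset equal to T's alignment.
    enum { N = sizeof(AlignmentFinder) - sizeof(T) };
};

struct Pair { char tag; int value; };  // hypothetical example type

static_assert(static_cast<std::size_t>(AlignOf<char>::N)  == alignof(char),  "char");
static_assert(static_cast<std::size_t>(AlignOf<short>::N) == alignof(short), "short");
static_assert(static_cast<std::size_t>(AlignOf<int>::N)   == alignof(int),   "int");
static_assert(static_cast<std::size_t>(AlignOf<Pair>::N)  == alignof(Pair),  "Pair");
```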


[Edit: okay, so now I'm all smug... what is needed now is for Rein or Ronald or anyone else, to drop by and point out some undefined behavior in this code that I completely overlooked! That would be karmically appropriate.]
Daniel Shawul

Re: strcpy() revisited

Post by Daniel Shawul »

See what that led you to? :) I am afraid there is even more to be discussed because the problem I had required use of automatic destructors. Trust me, I am not trying to be deliberately difficult. I am sure I have said why it was necessary for me to know the order of destruction somewhere in that forum. When I realized the problem is going to require me studying singleton objects and stuff, I realized this was more than what I bargained for!

The problem is about RAII (resource acquisition is initialization) for a 'field' (a big array of, say, 100000 doubles) of variable-sized vectors. Declaring a single array of that type costs a lot of memory, so the program's capability to run bigger problems is limited. Memory is malloc'ed only once, as much as the program needs, and is recycled afterwards when a declared Field variable goes out of scope.

Code:

Field A;
Field B;
//use A
//use B
vs

Code:

{
Field A;
//use A
}
{
Field B;
//use B
}
The latter can run twice as big a problem as the former (unless the first can be optimized somehow). For this reason, I also had to chase down the gargantuan number of temporary copies that C++ produces when passing to and returning from functions, writing expressions, etc., and fix them accordingly. This was even more pronounced because A, B, C are variables that the user of the library could declare to write expressions such as A*x^2+B*x+C at run time. That required intricate template metaprogramming to reduce the number of copies and the number of FLOPS needed to evaluate the expression, basically to write C++ code as efficient and fast as plain FORTRAN code that solves the problem with explicit for-loops. I am not sure I achieved all that, but that was the final goal, so that is the kind of torture you have to go through to do a seemingly straightforward job. There were some C++ books that cover such topics, but I had already gone far enough in my search for elegance with C++.
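The usual name for this kind of template metaprogramming is "expression templates". The sketch below is a generic, heavily simplified illustration of the idea and not the library described in this post: the Field and Sum types here are made up for the example, fields have a fixed length, and only addition is supported. The point is that a + b + c builds a tiny object holding references, and the actual arithmetic happens in one loop inside the assignment, with no temporary Field allocated.

```cpp
#include <cstddef>
#include <vector>

// Node representing "l + r"; holds references only, no element data.
template <class L, class R>
struct Sum {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

// Simplified stand-in for a big numeric field.
struct Field {
    std::vector<double> data;
    explicit Field(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }

    // Assigning from an expression evaluates it element by element in one loop.
    template <class E>
    Field& operator=(const E& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Building the expression allocates nothing.
inline Sum<Field, Field> operator+(const Field& a, const Field& b) {
    return {a, b};
}
template <class L, class R>
inline Sum<Sum<L, R>, Field> operator+(const Sum<L, R>& a, const Field& b) {
    return {a, b};
}

// Usage: one pass over the data, no temporary Field objects.
double demo_sum_at(std::size_t i) {
    Field a(4, 1.0), b(4, 2.0), c(4, 3.0), out(4);
    out = a + b + c;
    return out[i];
}
```

The expression object must be consumed within the same statement, because it refers to temporaries that disappear at the end of it.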

As to the solution I used, I simply register for program termination using 'atexit(clean_up_func)', and then check the boolean isProgramTerminated in the destructors of the Fields. That way I stop reclaiming memory when the program is about to be terminated. This is clearly a hack, but to this day I haven't found a better solution.
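A stripped-down sketch of that hack, under several assumptions that are not in the original code: buffers have one fixed size (kFieldSize), the pool is a plain std::vector called g_pool, malloc failure is not handled, and a C++11 compiler is used. Only clean_up_func and isProgramTerminated are names taken from the description above. A handler registered with atexit() from main runs before the destructors of globals that were constructed earlier, so the pool is still alive when the handler empties it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

static std::vector<double*> g_pool;       // hypothetical pool of recycled buffers
static bool isProgramTerminated = false;
const std::size_t kFieldSize = 1000;      // illustrative fixed size

// Registered via atexit(): release pooled buffers and stop all recycling.
void clean_up_func() {
    isProgramTerminated = true;
    for (double* p : g_pool) std::free(p);
    g_pool.clear();
}

// Call once at the top of main().
void install_clean_up() { std::atexit(clean_up_func); }

class Field {
public:
    Field() {
        if (!g_pool.empty()) {            // reuse a recycled buffer
            m_data = g_pool.back();
            g_pool.pop_back();
        } else {                          // sketch only: no failure check
            m_data = static_cast<double*>(std::malloc(kFieldSize * sizeof(double)));
        }
    }
    ~Field() {
        if (isProgramTerminated) {
            std::free(m_data);            // the pool may be gone: don't touch it
        } else {
            g_pool.push_back(m_data);     // normal case: recycle the buffer
        }
    }
    Field(const Field&) = delete;
    Field& operator=(const Field&) = delete;
    double* data() { return m_data; }
private:
    double* m_data;
};
```

Freeing the buffer in the terminating branch is one possible choice for the sketch; simply skipping the recycle step and letting the OS reclaim the memory would match the description just as well.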
wgarvin

Re: strcpy() revisited

Post by wgarvin »

Yes. I think it was probably this kind of thing that Bjarne Stroustrup was thinking of when he uttered his famous quote about how "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off."

C++ contains lots of elegant, helpful features. But sometimes when you inevitably mix those features together you can run into tricky difficulties that would probably never happen to a C programmer. C++ lets you "artificially" create hard problems for yourself, and then you end up spending effort trying to solve those problems instead of the original problem you were writing the program to solve. If you aren't careful you can wander "down a rabbit hole" and spend an entire day trying to solve a static initialization order fiasco instead of just tackling it the way a C programmer would: by manually initializing and destroying all of the things involved.

C++ is like a "kitchen sink" language. Actually it contains a bathtub, a swimming pool and three kitchen sinks. There are lots of features in it that don't play terribly well with other features.

Example: when you mix exceptions with constructors and destructors, suddenly you have to try and make your code exception-safe at every point. You end up wanting to throw them to signal error situations from a constructor; but from a destructor, you have to be really careful... you have to avoid throwing an exception while another exception is already unwinding the stack, or the program will abort. In short you take on a huge mental burden for (from my point of view) the relatively minor gain of being able to use exceptions. It's far simpler to just disable them and not worry about them at all. Especially because I've hardly ever met a C++ programmer who was skilled and knowledgeable enough to write exception-safe code all the time. I'm certainly not smart enough to do that; I'm sure I would accidentally introduce exception-safety bugs into my programs all the time. I've met a few programmers who probably think they are able to do it, but that's not quite the same thing! :lol:
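To make the destructor rule concrete, here is a small sketch; ResourceGuard, g_released and risky_task are invented names for the example. When risky_task throws, the stack unwinds and both guards still run their destructors, in reverse order of construction. Each destructor swallows anything that might be thrown inside it, because an exception escaping a destructor during unwinding ends in std::terminate (and since C++11 destructors are noexcept by default anyway).

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical record of released resource ids, so the effect is observable.
static std::vector<int> g_released;

class ResourceGuard {
public:
    explicit ResourceGuard(int id) : m_id(id) {}
    ~ResourceGuard() {
        try {
            g_released.push_back(m_id);  // could throw std::bad_alloc
        } catch (...) {
            // Swallow: nothing may escape a destructor during unwinding.
        }
    }
private:
    int m_id;
};

// Fails halfway through; both guards are still cleaned up.
void risky_task() {
    ResourceGuard first(1);
    ResourceGuard second(2);
    throw std::runtime_error("something failed");
}

// Returns true if the exception was caught as expected.
bool run_risky_task() {
    try {
        risky_task();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Every resource in the program needs this treatment for the code to count as exception-safe, which is where the mental burden comes from.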

In a garbage-collected language like Java or C#, the benefits of exceptions outweigh the hassle. A programmer can throw an exception in the middle of some complex task and abandon whatever memory was allocated, and that is fine because the collector cleans it up. He can avoid leaking other resources using try...finally, and since memory probably makes up 95% of the resources he is dealing with, this is not too onerous. But in C++ you can't usually afford to leak ANYTHING, so being exception-safe ends up being an onerous task, at least in my opinion. Others may disagree. In fact they probably will, because C++ has so many features that almost everyone mostly uses a subset of the language. One problem is that every programmer and team favors a slightly different subset.

Another example of a feature that I have never (ever) seen used on purpose in production code: virtual base classes. This is the solution to a multiple-inheritance problem that most sane developers manage to avoid causing in practice. If you drink the object-oriented kool-aid to the max, you might end up thinking you need virtual base classes to model something-or-other "properly". But by that point I would argue you are seriously into the territory of creating new, harder problems for yourself. I've never seen a problem that I thought virtual base classes were actually a good solution for. If I ever came across them in production code, I think I would have a strong urge to throw away that code and rewrite whatever it was! :lol: But somebody, somewhere, probably uses them for something and considers them a crucial and useful feature.
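For anyone who has never run into the feature, this is the multiple-inheritance "diamond" it exists for. All of the class names below are made up. With virtual inheritance the most-derived object contains a single shared base subobject; without it there are two independent copies.

```cpp
struct Device { int id; Device() : id(0) {} };

// Diamond with virtual inheritance: Console has ONE Device subobject.
struct Input   : virtual Device {};
struct Output  : virtual Device {};
struct Console : Input, Output {};

// Same shape without 'virtual': PlainConsole has TWO Device subobjects.
struct PlainInput   : Device {};
struct PlainOutput  : Device {};
struct PlainConsole : PlainInput, PlainOutput {};

// Writing through one path is visible through the other.
bool shares_one_base() {
    Console c;
    Input& in = c;
    Output& out = c;
    in.id = 7;
    return out.id == 7;
}

// Writing through one path leaves the other copy untouched.
bool has_two_bases() {
    PlainConsole c;
    PlainInput& in = c;
    PlainOutput& out = c;
    in.id = 7;
    return out.id == 0;
}
```

This only shows what the feature does, not that reaching for it is a good idea.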