ansi-C question


Carey
Posts: 313
Joined: Wed Mar 08, 2006 8:18 pm

Re: ansi-C question

Post by Carey »

bob wrote:
Carey wrote:
bob wrote:
Zach Wegner wrote:What you wrote will work about anywhere. I'm not sure about ANSI C, but multiplication is done implicitly with mod x for x-bit arithmetic. For instance magic multiplication relies on this.

However, if z is more than 32 bit, then you must do a cast after the multiplication: (unsigned int)(x * y);
The problem is the "unsigned int". It is not guaranteed to be 32 bits wide. It can be 64 bits wide just as easily. Unfortunately the ANSI committee didn't give us data types that allow us to specify 32 bits in a portable way. That "unsigned int" is 64 bits on a Cray, for example. And on the Dec Alphas depending on the O/S you are using.
Sure they did, Bob. That's what uint32_t does.
That is not a natural data type. short, int, float are the usual types. Yes those came long later but not until the standard was already well-mangled.
Yes, they are natural data types. They aren't some made-up data type, like doing 37-bit integers on a 32-bit CPU. (With exceptions for doing 32-bit stuff on a 36-bit system, and those kinds of things. But stdint.h even allows for that.)

They are just synonyms for existing, real, supported types.

True, they aren't your traditionally named, usual data types, but that doesn't make them unnatural. They're just typedefs.


They came about 10 years after the original C standard was done, in 1999. I mentioned that. I also mentioned one of the reasons why it wasn't in the original C89 standard.

In 1989, they had already gone about as far as they could in the time they had, and there were already enough changes and innovations that doing yet more inventive stuff would have caused problems.

Plus, it takes a few years of people actually using a standard to see where the weak points are and what else needs to be done. Hence the C99 standard.

Standard C may be 'mangled' as you put it, but it's better than working with K&R. So in spite of its flaws, you are better off with it than without it.

For example, exactly what is a "char"???
A minimum storage unit capable of holding one character of the underlying alphabet.

It can be more than 8 bits, both because of the underlying hardware and because of the language being used. (It might be a 16-bit encoding of the character set, for example.)

Because historically people have used it as a tiny 'int', it can also be signed or unsigned.

And because historically compiler writers have done whatever they want, both C89 & C99 have to leave the default signedness as 'implementation defined'. Changing that now (or in 1999, or even back in 1989) would have broken way too much code, something that was definitely not allowed for them or desired by them.

Of course, by using the data types that were given to us in 1999, or even our own portability header, we avoid all those issues. You just have to get over the inertia of using 'int' and 'unsigned int' and so on.

Once you break your habit, data type issues become much less of an issue or concern.
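For illustration, here is a minimal sketch of what that looks like in practice, assuming a C99 compiler with stdint.h (the function name is made up for this example):

#include <stdint.h>

/* Multiply and keep only the low 32 bits. The (uint32_t) cast reduces
   the product mod 2^32 on every platform, which is what the
   magic-multiplication trick discussed above relies on. A bare
   (unsigned int) cast only does that where unsigned int happens to be
   32 bits wide. */
static uint32_t mul_low32(uint64_t x, uint64_t y)
{
    return (uint32_t)(x * y);
}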


It's guaranteed to be exactly 32 bits. No more and no less.

True, that's C99. It wasn't practical to do it back in C89 (that would have been too 'radical' of an idea, and they were already reaching the limit of what people were ready to tolerate).

That and more is in stdint.h. It even provides things like uint_fast32_t for math that needs to be at least 32 bits but can be more if that would be faster (or more convenient) for the compiler or platform.
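A small sketch of the difference between the exact-width and fast types, again assuming stdint.h is available (the variable names are only for illustration):

#include <stdint.h>

uint32_t       hash_slice;  /* exactly 32 bits, for layouts that must not vary  */
uint_fast32_t  counter;     /* at least 32 bits; the implementation picks the
                               width that is fastest on this platform           */
uint_least32_t packed;      /* at least 32 bits; the smallest such type offered */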

If you are using a compiler that doesn't provide stdint.h (and inttypes.h), then you probably should find a better compiler.

Or if you are desperate, you can at least fake it tolerably well (with custom headers) and still use those standard types in your program.

There are some portable ones floating around on the web, and I think a few people have posted similar headers here in the forum.

There's no reason to use vague sizes or compiler specific types in your code. Hasn't been for years.


I do realize that Microsoft refuses to support even a tiny bit of C99. Not even something as simple as stdint.h. You can either switch to GNU C or Intel C (I think it has it), or you can do some custom headers to hide all that with #ifdefs etc. and let your own program be blissfully unaware of the low-level details.
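As a sketch of that custom-header idea (hedged, not tested against any particular MSVC release; the HAVE_STDINT_H macro is invented for the example), something along these lines lets the rest of the program use the standard names either way:

/* my_stdint.h -- hypothetical fallback for compilers without <stdint.h> */
#if defined(_MSC_VER) && !defined(HAVE_STDINT_H)
typedef signed   __int8   int8_t;
typedef unsigned __int8   uint8_t;
typedef signed   __int16  int16_t;
typedef unsigned __int16  uint16_t;
typedef signed   __int32  int32_t;
typedef unsigned __int32  uint32_t;
typedef signed   __int64  int64_t;
typedef unsigned __int64  uint64_t;
#else
#include <stdint.h>   /* C99 compilers already provide all of this */
#endif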
Again, that is a portability issue that I have to deal with, since 99% of the machines on the planet are running windows, and the majority of those use MSVC variants, rather than gcc/icc...
Agreed. Since a lot of people use MSVC, that's why I bothered to point out a few options that can be used. So you can at least fake it well enough that your own code never has to use 'unsigned int' or 'long long' etc. It can be blissfully unaware of what types the compiler or system prefers.

I would not mind writing purely for Linux in fact, as it would make the code _far_ cleaner, and I may well one day take this plunge as the current conditional compile stuff is a mess...
I can't vouch for your code. The thread / process stuff and whatever other portability issues that are involved. And to be honest, I rarely look at Crafty. (No offense, but I *do* prefer older programs. I'm still waiting for you to finally release Blitz & CrayBlitz, and for a couple more historic programs that I've been promised. One I already have, but I can't talk about it or release it yet... That really irks me! :lol: )

But using things like stdint, or some of the headers that have been posted in here can go quite a way to hiding the differences for data types, printing data types, 64 bit constants, etc. etc.
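For instance, the printing and 64-bit-constant issues can be hidden the same way; a quick sketch assuming C99's inttypes.h (the names are made up for the example):

#include <inttypes.h>
#include <stdio.h>

static uint64_t occupied = UINT64_C(0x00FF00000000FF00);  /* portable 64-bit constant */

void show_occupied(void)
{
    /* PRIx64 expands to the right conversion specifier for this
       platform ("llx", "I64x", ...), so the caller never has to. */
    printf("occupied = 0x%016" PRIx64 "\n", occupied);
}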


There are enough portability headers already in existence (plus what C already gives us) that there is no good reason to depend on compiler-specific types and such. Your own code shouldn't even know about 'long long' and so on.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ansi-C question

Post by bob »

Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Zach Wegner wrote:What you wrote will work about anywhere. I'm not sure about ANSI C, but multiplication is done implicitly with mod x for x-bit arithmetic. For instance magic multiplication relies on this.

However, if z is more than 32 bit, then you must do a cast after the multiplication: (unsigned int)(x * y);
The problem is the "unsigned int". It is not guaranteed to be 32 bits wide. It can be 64 bits wide just as easily. Unfortunately the ANSI committee didn't give us data types that allow us to specify 32 bits in a portable way. That "unsigned int" is 64 bits on a Cray, for example. And on the Dec Alphas depending on the O/S you are using.
Sure they did, Bob. That's what uint32_t does.
That is not a natural data type. short, int, float are the usual types. Yes those came long later but not until the standard was already well-mangled.
Yes, they are natural data types. They aren't some made-up data type, like doing 37-bit integers on a 32-bit CPU. (With exceptions for doing 32-bit stuff on a 36-bit system, and those kinds of things. But stdint.h even allows for that.)

They are just synonyms for existing, real, supported types.

True, they aren't your traditionally named, usual data types, but that doesn't make them unnatural. They're just typedefs.


They came about 10 years after the original C standard was done, in 1999. I mentioned that. I also mentioned one of the reasons why it wasn't in the original C89 standard.

In 1989, they had already gone about as far as they could in the time they had, and there were already enough changes and innovations that doing yet more inventive stuff would have caused problems.

Plus, it takes a few years of people actually using a standard to see where the weak points are and what else needs to be done. Hence the C99 standard.

Standard C may be 'mangled' as you put it, but it's better than working with K&R. So in spite of its flaws, you are better off with it than without it.

For example, exactly what is a "char"???
A minimum storage unit capable of holding one character of the underlying alphabet.

It can be more than 8 bits, both because of the underlying hardware and because of the language being used. (It might be a 16-bit encoding of the character set, for example.)

Because historically people have used it as a tiny 'int', it can also be signed or unsigned.

And because historically compiler writers have done whatever they want, both C89 & C99 have to leave the default signedness as 'implementation defined'. Changing that now (or in 1999, or even back in 1989) would have broken way too much code, something that was definitely not allowed for them or desired by them.

Of course, by using the data types that were given to us in 1999, or even our own portability header, we avoid all those issues. You just have to get over the inertia of using 'int' and 'unsigned int' and so on.

Once you break your habit, data type issues become much less of an issue or concern.


It's guaranteed to be exactly 32 bits. No more and no less.

True, that's C99. It wasn't practical to do it back in C89 (that would have been too 'radical' of an idea, and they were already reaching the limit of what people were ready to tolerate).

That and more is in stdint.h. It even provides things like uint_fast32_t for math that needs to be at least 32 bits but can be more if that would be faster (or more convenient) for the compiler or platform.

If you are using a compiler that doesn't provide stdint.h (and inttypes.h), then you probably should find a better compiler.

Or if you are desperate, you can at least fake it tolerably well (with custom headers) and still use those standard types in your program.

There are some portable ones floating around on the web, and I think a few people have posted similar headers here in the forum.

There's no reason to use vague sizes or compiler specific types in your code. Hasn't been for years.


I do realize that Microsoft refuses to support even a tiny bit of C99. Not even something as simple as stdint.h. You can either switch to GNU C or Intel C (I think it has it), or you can do some custom headers to hide all that with #ifdefs etc. and let your own program be blissfully unaware of the low-level details.
Again, that is a portability issue that I have to deal with, since 99% of the machines on the planet are running windows, and the majority of those use MSVC variants, rather than gcc/icc...
Agreed. Since a lot of people use MSVC, that's why I bothered to point out a few options that can be used. So you can at least fake it well enough that your own code never has to use 'unsigned int' or 'long long' etc. It can be blissfully unaware of what types the compiler or system prefers.

I would not mind writing purely for Linux in fact, as it would make the code _far_ cleaner, and I may well one day take this plunge as the current conditional compile stuff is a mess...
I can't vouch for your code. The thread / process stuff and whatever other portability issues that are involved. And to be honest, I rarely look at Crafty. (No offense, but I *do* prefer older programs. I'm still waiting for you to finally release Blitz & CrayBlitz, and for a couple more historic programs that I've been promised. One I already have, but I can't talk about it or release it yet... That really irks me! :lol: )

But using things like stdint, or some of the headers that have been posted in here can go quite a way to hiding the differences for data types, printing data types, 64 bit constants, etc. etc.


There are enough portability headers already in existence (plus what C already gives us) that there is no good reason to depend on compiler-specific types and such. Your own code shouldn't even know about 'long long' and so on.
My point above was that "char x" doesn't say much about x. It can be signed or unsigned, while "int x" is always signed unless explicitly declared as unsigned. That is not a good standard. It should be "regular" and that certainly is not... I've never seen char anything but 8 bits, but I have seen them default to signed and unsigned depending on the compiler.
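A tiny example of the surprise being described; whether this prints "negative" is implementation-defined, because plain char may be either signed or unsigned:

#include <stdio.h>

int main(void)
{
    char c = (char)0xC0;          /* a byte value with the high bit set         */
    if (c < 0)                    /* true where plain char is signed (typical   */
        printf("negative\n");     /* on x86), false where it is unsigned        */
    else                          /* (e.g. the AIX/RS6000 case that comes up    */
        printf("non-negative\n"); /* later in the thread)                       */
    return 0;
}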

And again, portability is an issue. Until _everybody_ accepts int64_t, it will be dangerous (and problematic) to use it if portability is important, for Crafty obviously it is...

There was absolutely nothing wrong with having "int8", "int16", "int32", "int64" and "int128" as data types. Fortran did that in the 60's (real*4, real*8 for example). There was nothing wrong with having "int" to be the fastest integer width available, but allowing specificity would have been quite natural as given above. As opposed to using obscure data types that completely change the way something is declared, rather than just extending it as adding the number of bits to the type would do...

JMHO. But regardless, the "standard" has some problems that ought not be there (char is one example...)
Carey
Posts: 313
Joined: Wed Mar 08, 2006 8:18 pm

Re: ansi-C question

Post by Carey »

bob wrote:
Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Zach Wegner wrote:What you wrote will work about anywhere. I'm not sure about ANSI C, but multiplication is done implicitly with mod x for x-bit arithmetic. For instance magic multiplication relies on this.

However, if z is more than 32 bit, then you must do a cast after the multiplication: (unsigned int)(x * y);
The problem is the "unsigned int". It is not guaranteed to be 32 bits wide. It can be 64 bits wide just as easily. Unfortunately the ANSI committee didn't give us data types that allow us to specify 32 bits in a portable way. That "unsigned int" is 64 bits on a Cray, for example. And on the Dec Alphas depending on the O/S you are using.
Sure they did, Bob. That's what uint32_t does.
That is not a natural data type. short, int, float are the usual types. Yes those came long later but not until the standard was already well-mangled.
Yes, they are natural data types. They aren't some made-up data type, like doing 37-bit integers on a 32-bit CPU. (With exceptions for doing 32-bit stuff on a 36-bit system, and those kinds of things. But stdint.h even allows for that.)

They are just synonyms for existing, real, supported types.

True, they aren't your traditionally named, usual data types, but that doesn't make them unnatural. They're just typedefs.


They came about 10 years after the original C standard was done, in 1999. I mentioned that. I also mentioned one of the reasons why it wasn't in the original C89 standard.

In 1989, they had already gone about as far as they could in the time they had, and there were already enough changes and innovations that doing yet more inventive stuff would have caused problems.

Plus, it takes a few years of people actually using a standard to see where the weak points are and what else needs to be done. Hence the C99 standard.

Standard C may be 'mangled' as you put it, but it's better than working with K&R. So in spite of its flaws, you are better off with it than without it.

For example, exactly what is a "char"???
A minimum storage unit capable of holding one character of the underlying alphabet.

It can be more than 8 bits, both because of the underlying hardware and because of the language being used. (It might be a 16-bit encoding of the character set, for example.)

Because historically people have used it as a tiny 'int', it can also be signed or unsigned.

And because historically compiler writers have done whatever they want, both C89 & C99 have to leave the default signedness as 'implementation defined'. Changing that now (or in 1999, or even back in 1989) would have broken way too much code, something that was definitely not allowed for them or desired by them.

Of course, by using the data types that were given to us in 1999, or even our own portability header, we avoid all those issues. You just have to get over the inertia of using 'int' and 'unsigned int' and so on.

Once you break your habit, data type issues become much less of an issue or concern.


It's guaranteed to be exactly 32 bits. No more and no less.

True, that's C99. It wasn't practical to do it back in C89 (that would have been too 'radical' of an idea, and they were already reaching the limit of what people were ready to tolerate).

That and more is in stdint.h. It even provides things like uint_fast32_t for math that needs to be at least 32 bits but can be more if that would be faster (or more convenient) for the compiler or platform.

If you are using a compiler that doesn't provide stdint.h (and inttypes.h), then you probably should find a better compiler.

Or if you are desperate, you can at least fake it tolerably well (with custom headers) and still use those standard types in your program.

There are some portable ones floating around on the web, and I think a few people have posted similar headers here in the forum.

There's no reason to use vague sizes or compiler specific types in your code. Hasn't been for years.


I do realize that Microsoft refuses to support even a tiny bit of C99. Not even something as simple as stdint.h. You can either switch to GNU C or Intel C (I think it has it), or you can do some custom headers to hide all that with #ifdefs etc. and let your own program be blissfully unaware of the low-level details.
Again, that is a portability issue that I have to deal with, since 99% of the machines on the planet are running windows, and the majority of those use MSVC variants, rather than gcc/icc...
Agreed. Since a lot of people use MSVC, that's why I bothered to point out a few options that can be used. So you can at least fake it well enough that your own code never has to use 'unsigned int' or 'long long' etc. It can be blissfully unaware of what types the compiler or system prefers.

I would not mind writing purely for Linux in fact, as it would make the code _far_ cleaner, and I may well one day take this plunge as the current conditional compile stuff is a mess...
I can't vouch for your code. The thread / process stuff and whatever other portability issues that are involved. And to be honest, I rarely look at Crafty. (No offense, but I *do* prefer older programs. I'm still waiting for you to finally release Blitz & CrayBlitz, and for a couple more historic programs that I've been promised. One I already have, but I can't talk about it or release it yet... That really irks me! :lol: )

But using things like stdint, or some of the headers that have been posted in here can go quite a way to hiding the differences for data types, printing data types, 64 bit constants, etc. etc.


There are enough portability headers already in existence (plus what C already gives us) that there is no good reason to depend on compiler-specific types and such. Your own code shouldn't even know about 'long long' and so on.
My point above was that "char x" doesn't say much about x. It can be signed or unsigned, while "int x" is always signed unless explicitly declared as unsigned. That is not a good standard. It should be "regular" and that certainly is not...
There wasn't much the C89 standards team could do about it.

As I've told you before, their charter required them to honor existing implementations as much as possible, and to avoid doing any more invention than they absolutely had to.

Making 'char' signed or unsigned by default would have broken 50% of the existing programs.

If people had used char as just a char rather than small integer, there wouldn't have been any problem. But people did use them as small integers.

It wasn't that they didn't think about it. There was too much existing code and too many compiler writers complained.

It goes back to some of the vagueness in the original K&R 'specification' and the implementations that came from it.

It's likely somebody even suggested making the standard support both (via compiler switch or pragma) and there were probably even objections to that.

They knew it was an issue. They just couldn't do anything to solve it that wouldn't break lots of code or that the writers would agree to. (shrug)

There might have even been issues with systems that couldn't handle signed char's at all. Only unsigned. They would have needed to convert to full sized ints before doing anything with it.

I've never seen char anything but 8 bits, but I have seen them default to signed and unsigned depending on the compiler.
8 bits are certainly traditional, but they don't have to be. For example, a 36 bit system would be a prime candidate for 6 or 9 bit chars. During the years it took to do C89, there were many people still doing odd hardware sizes and wanting C to be able to support it.

There were some systems that couldn't even really work with individual bytes in hardware. It was just word based and C had char=short=int=long and if you wanted 8 bits, you had to do it in software.

The C standard is very flexible for hardware. They had to be because there was such a wide variety of hardware that had C compilers.
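If it ever matters, the actual widths are easy to check from standard headers; a minimal sketch:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_BIT is the number of bits in a char: 8 on mainstream hardware,
       but the standard only guarantees that it is at least 8. */
    printf("char is %d bits, int is %u bits\n",
           CHAR_BIT, (unsigned)(CHAR_BIT * sizeof(int)));
    return 0;
}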


And again, portability is an issue. Until _everybody_ accepts int64_t, it will be dangerous (and problematic) to use it if portability is important, for Crafty obviously it is...
It's a lot *less* of an issue if you do use some portability header to hide it, rather than trying to deal with it in your own code.

That way if something needs to change, only the portability header needs to be updated and that instantly back-ports to all the previous versions of Crafty as well.

There was absolutely nothing wrong with having "int8", "int16", "int32", "int64" and "int128" as data types. Fortran did that in the 60's (real*4, real*8 for example). There was nothing wrong with having "int" to be the fastest integer width available, but allowing specificity would have been quite natural as given above. As opposed to using obscure data types that completely change the way something is declared, rather than just extending it as adding the number of bits to the type would do...
I wouldn't call "uint32_t" obscure. On the contrary, they chose that form very deliberately. If a type ends in "_t" then you know it's an official C standard type, rather than something added by the compiler vendor or defined in some header by some library vendor.


I'm not saying "uint32" wouldn't have worked. I'm just saying they chose "_t" suffix very deliberately.

Doesn't seem to be too rational to quibble over _t

If you really don't like _t, then typedef it to your own favorite naming convention.
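Something like this, for instance, keeps the _t suffix out of the rest of the program entirely; the new names on the right are just one possible convention:

#include <stdint.h>

/* Map the standard exact-width names onto whatever style you prefer. */
typedef int8_t    int8;
typedef int16_t   int16;
typedef int32_t   int32;
typedef int64_t   int64;
typedef uint64_t  bitboard;   /* or any project-specific name, purely as an example */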


JMHO. But regardless, the "standard" has some problems that ought not be there (char is one example...)
Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.

And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.

Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.

Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'

Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.

Or....
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: ansi-C question

Post by Zach Wegner »

I don't consider them "natural" data types because they aren't defined directly by the language, they are defined in a header file. So they are just like time_t, size_t, div_t, etc. It's mostly a moot point, but if I want to I can make a declaration like:

int int32_t;

Which I'm not going to do, but it takes away from the "weight" of the definition.
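A hedged illustration of that point: because int32_t is only a typedef, an ordinary identifier can hide it in an inner scope, which no compiler would accept for a real keyword like int:

#include <stdint.h>

int f(void)
{
    int int32_t = 5;   /* legal, if ill-advised: this local variable hides the
                          typedef from <stdint.h> for the rest of the block,
                          whereas "int int = 5;" would be a syntax error */
    return int32_t;
}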
Carey
Posts: 313
Joined: Wed Mar 08, 2006 8:18 pm

Re: ansi-C question

Post by Carey »

Zach Wegner wrote:I don't consider them "natural" data types because they aren't defined directly by the language, they are defined in a header file. So they are just like time_t, size_t, div_t, etc. It's mostly a moot point, but if I want to I can make a declaration like:

int int32_t;

Which I'm not going to do, but it takes away from the "weight" of the definition.
I think what you are saying is that you don't consider them 'natural' because the names are defined in a header instead of the actual C data type internal to the compiler, like 'int'.

I can understand that. I don't entirely agree with it, but I understand it.

The only reason it wasn't made a part of the internal C itself was to avoid cluttering up the name space. By it being in a header, if you don't need it or you want to do it yourself, the types don't get in your way.

C and the standardization teams have always been very concerned about the effects their decisions would have on existing code.

If they had made the stuff in stdint.h and stdbool.h and so on an internal part of C itself (like 'int') then you know somebody would have complained about it breaking existing code.

By putting the stuff in headers, even though it is an official part of the language proper and not a library item, they were able to make a number of additions with minimal risk to existing code.


Actually, there's nothing stopping a compiler from doing them internally. At least somewhat. Take the 64 bit data type, for example... The only difference is the compiler uses its own type name (long long, __int64, or whatever) rather than something standardized.


So the types themselves are certainly inherent in the compiler. Just the standardized names aren't.

Both parts are fully part of the C language proper, and not part of the library or such.


So to me, it seems pretty silly to try and take care of all those compiler differences in your own code when you don't have to.
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: ansi-C question

Post by Tord Romstad »

Carey wrote:Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.
I dislike C intensely, but currently there is no other language which offers the same amount of performance and portability across a wide range of platforms. Technically, BitC should be able to fill C's niche, and to programmers like me, it would be vastly more comfortable to work with. But of course, it is very unlikely to ever be widely implemented. :(
And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.
Is that a bad thing?
Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.
Perhaps the official standard is restrictive and limited, but the Delphi dialect seems to have become a de facto standard, and is far less restrictive and limited. FreePascal, which claims to support the Delphi dialect, is available on all major platforms.
Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'
Yeah. But because all Forth programmers seem to make their own implementation anyway, I don't think anyone cares.
:)
Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.
Ugh.

Tord
Carey
Posts: 313
Joined: Wed Mar 08, 2006 8:18 pm

Re: ansi-C question

Post by Carey »

Tord Romstad wrote:
Carey wrote:Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.
I dislike C intensely, but currently there is no other language which offers the same amount of performance and portability across a wide range of platforms. Technically, BitC should be able to fill C's niche, and to programmers like me, it would be vastly more comfortable to work with. But of course, it is very unlikely to ever be widely implemented. :(
I've never tried BitC so I don't know anything about it. Looking at the docs you link to, I think it would take me quite a bit of effort to adjust, though.

Personally, I kind of like FreePascal. (I'm an old Pascal programmer, so that explains it... :) ) But the tool set isn't really far enough along for my comfort, plus the performance is substantially below what I get with other compilers due to better optimizers. So I never use it for anything but toys.

I miss stronger type checking.... :( I've developed some very bad habits by using C. Slinging around ints, using the same variable for radically different types of data, etc.
And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.
Is that a bad thing?
Yes and no.

Definitely good from a standardization standpoint, but bad because they didn't have an implementation to look at and play with while deciding what features it should have.

I think they probably did a better job with Fortran than they did Ada.

Fortran has existed long enough that for each new standard version, they pretty much knew what was needed & wanted by the people using it. So it was evolution.

Ada was pure creativity.
Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.
Perhaps the official standard is restrictive and limited, but the Delphi dialect seems to have become a de facto standard, and is far less restrictive and limited. FreePascal, which claims to support the Delphi dialect, is available on all major platforms.
I pretty much agree with you.

The point was about standards, though. Standard Pascal is a standard that shouldn't even bother to exist. It's so limited and restrictive there's no point to it. You can't do anything productive without violating the standard.

FreePascal is pretty good. (It's what GNU Pascal should have been, except they never got up off their... well, you get the point.)

But its optimization abilities are still weak, and the tool sets are still a little rough around the edges.

I tried to do a simple Pascal program some time back, but the performance wasn't that great.

I even dug out my earliest chess programs from 20 years ago. It was originally written in Pascal and I had a direct line by line port to K&R C for it. Even with a little tweaking, the FreePascal one was about half the speed of the C version.

Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'
Yeah. But because all Forth programmers seem to make their own implementation anyway, I don't think anyone cares.
:)
I went through a brief Forth phase about the time it was standardized and it did seem there were a number of people who weren't happy that a standard existed... :lol:
Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.
Ugh.

Tord
Yeah... My opinion of C++ is heavily colored by all the stuff I read about their standardization 'process'. Somebody would show up at a meeting with a new idea that hadn't been implemented yet or even fully thought out, but they'd vote and decide right then and there to put it in, then they'd spend months trying to fix the bugs and work around all the side effects, etc.

(shudder)

I guess that's one of the reasons I have such a high opinion of the C standardization process. They truly did care about backward portability and minimal side-effects for the new stuff they were introducing.

There are areas where they messed up or could have gone a little further, but they truly did try their best to make a good language standard and still provide enough flexibility for all the odd architectures that existed or might come along.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ansi-C question

Post by bob »

Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Zach Wegner wrote:What you wrote will work about anywhere. I'm not sure about ANSI C, but multiplication is done implicitly with mod x for x-bit arithmetic. For instance magic multiplication relies on this.

However, if z is more than 32 bit, then you must do a cast after the multiplication: (unsigned int)(x * y);
The problem is the "unsigned int". It is not guaranteed to be 32 bits wide. It can be 64 bits wide just as easily. Unfortunately the ANSI committee didn't give us data types that allow us to specify 32 bits in a portable way. That "unsigned int" is 64 bits on a Cray, for example. And on the Dec Alphas depending on the O/S you are using.
Sure they did, Bob. That's what uint32_t does.
That is not a natural data type. short, int, float are the usual types. Yes those came long later but not until the standard was already well-mangled.
Yes, they are natural data types. They aren't some made-up data type, like doing 37-bit integers on a 32-bit CPU. (With exceptions for doing 32-bit stuff on a 36-bit system, and those kinds of things. But stdint.h even allows for that.)

They are just synonyms for existing, real, supported types.

True, they aren't your traditionally named, usual data types, but that doesn't make them unnatural. They're just typedefs.


They came about 10 years after the original C standard was done, in 1999. I mentioned that. I also mentioned one of the reasons why it wasn't in the original C89 standard.

In 1989, they had already gone about as far as they could in the time they had, and there were already enough changes and innovations that doing yet more inventive stuff would have caused problems.

Plus, it takes a few years of people actually using a standard to see where the weak points are and what else needs to be done. Hence the C99 standard.

Standard C may be 'mangled' as you put it, but it's better than working with K&R. So in spite of its flaws, you are better off with it than without it.

For example, exactly what is a "char"???
A minimum storage unit capable of holding one character of the underlying alphabet.

It can be more than 8 bits, both because of the underlying hardware and because of the language being used. (It might be a 16-bit encoding of the character set, for example.)

Because historically people have used it as a tiny 'int', it can also be signed or unsigned.

And because historically compiler writers have done whatever they want, both C89 & C99 have to leave the default signedness as 'implementation defined'. Changing that now (or in 1999, or even back in 1989) would have broken way too much code, something that was definitely not allowed for them or desired by them.

Of course, by using the data types that were given to us in 1999, or even our own portability header, we avoid all those issues. You just have to get over the inertia of using 'int' and 'unsigned int' and so on.

Once you break your habit, data type issues become much less of an issue or concern.


It's guaranteed to be exactly 32 bits. No more and no less.

True, that's C99. It wasn't practical to do it back in C89 (that would have been too 'radical' of an idea, and they were already reaching the limit of what people were ready to tolerate).

That and more is in stdint.h. It even provides things like uint_fast32_t for math that needs to be at least 32 bits but can be more if that would be faster (or more convenient) for the compiler or platform.

If you are using a compiler that doesn't provide stdint.h (and inttypes.h), then you probably should find a better compiler.

Or if you are desperate, you can at least fake it tolerably well (with custom headers) and still use those standard types in your program.

There are some portable ones floating around on the web, and I think a few people have posted similar headers here in the forum.

There's no reason to use vague sizes or compiler specific types in your code. Hasn't been for years.


I do realize that Microsoft refuses to support even a tiny bit of C99. Not even something as simple as stdint.h. You can either switch to GNU C or Intel C (I think it has it), or you can do some custom headers to hide all that with #ifdefs etc. and let your own program be blissfully unaware of the low-level details.
Again, that is a portability issue that I have to deal with, since 99% of the machines on the planet are running windows, and the majority of those use MSVC variants, rather than gcc/icc...
Agreed. Since a lot of people use MSVC, that's why I bothered to point out a few options that can be used. So you can at least fake it well enough that your own code never has to use 'unsigned int' or 'long long' etc. It can be blissfully unaware of what types the compiler or system prefers.

I would not mind writing purely for Linux in fact, as it would make the code _far_ cleaner, and I may well one day take this plunge as the current conditional compile stuff is a mess...
I can't vouch for your code. The thread / process stuff and whatever other portability issues that are involved. And to be honest, I rarely look at Crafty. (No offense, but I *do* prefer older programs. I'm still waiting for you to finally release Blitz & CrayBlitz, and for a couple more historic programs that I've been promised. One I already have, but I can't talk about it or release it yet... That really irks me! :lol: )

But using things like stdint, or some of the headers that have been posted in here can go quite a way to hiding the differences for data types, printing data types, 64 bit constants, etc. etc.


There are enough portability headers already in existence (plus what C already gives us) that there is no good reason to depend on compiler-specific types and such. Your own code shouldn't even know about 'long long' and so on.
My point above was that "char x" doesn't say much about x. It can be signed or unsigned, while "int x" is always signed unless explicitly declared as unsigned. That is not a good standard. It should be "regular" and that certainly is not...
There wasn't much the C89 standards team could do about it.

As I've told you before, their charter required them to honor existing implementations as much as possible, and to avoid doing any more invention than they absolutely had to.
I don't follow. We had the char type in K&R. And some vendors chose to default to signed, some chose to go unsigned. What was wrong with making this a "standard" definition? I mean, isn't that what a _STANDARD_ is supposed to do? I would not be 100% certain that no compiler chose "unsigned" as the default for an int. So why didn't the standards committee leave that loophole open as well? This was simply a poor decision, and I have seen it wreck programs (including mine, when I moved it to an IBM RS6000 running AIX that assumed unsigned for chars, breaking my chess board representation). From a user's perspective, we want preciseness, not vagueness... And yet vagueness is what we got in this case (among others, of course).


Making 'char' signed or unsigned by default would have broken 50% of the existing programs.

As a long-time C programmer I don't agree with this. 95% of C compilers used signed as the default for chars. Just a few went unsigned. This was still the case after the "standard" was released as well. The reason I am sure here is that I compiled Crafty on nearly every compiler on the planet back then, and the only problem I found was the IBM C compiler on the RS6000/AIX workstations. Crafty ran cleanly on every other machine I could find at the time...


If people had used char as just a char rather than small integer, there wouldn't have been any problem. But people did use them as small integers.

It wasn't that they didn't think about it. There was too much existing code and too many compiler writers complained.

It goes back to some of the vagueness in the original K&R 'specification' and the implementations that came from it.

It's likely somebody even suggested making the standard support both (via compiler switch or pragma) and there were probably even objections to that.

They knew it was an issue. They just couldn't do anything to solve it that wouldn't break lots of code or that the writers would agree to. (shrug)

There might have even been issues with systems that couldn't handle signed char's at all. Only unsigned. They would have needed to convert to full sized ints before doing anything with it.

I've never seen char anything but 8 bits, but I have seen them default to signed and unsigned depending on the compiler.
8 bits are certainly traditional, but they don't have to be. For example, a 36 bit system would be a prime candidate for 6 or 9 bit chars. During the years it took to do C89, there were many people still doing odd hardware sizes and wanting C to be able to support it.

There were some systems that couldn't even really work with individual bytes in hardware. It was just word based and C had char=short=int=long and if you wanted 8 bits, you had to do it in software.

The C standard is very flexible for hardware. They had to be because there was such a wide variety of hardware that had C compilers.


And again, portability is an issue. Until _everybody_ accepts int64_t, it will be dangerous (and problematic) to use it if portability is important, for Crafty obviously it is...
It's a lot *less* of an issue if you do use some portability header to hide it, rather than trying to deal with it in your own code.

That way if something needs to change, only the portability header needs to be updated and that instantly back-ports to all the previous versions of Crafty as well.

There was absolutely nothing wrong with having "int8", "int16", "int32", "int64" and "int128" as data types. Fortran did that in the 60's (real*4, real*8 for example). There was nothing wrong with having "int" to be the fastest integer width available, but allowing specificity would have been quite natural as given above. As opposed to using obscure data types that completely change the way something is declared, rather than just extending it as adding the number of bits to the type would do...
I wouldn't call "uint32_t" obscure. On the contrary, they chose that form very deliberately. If a type ends in "_t" then you know it's an official C standard type, rather than something added by the compiler vendor or defined in some header by some library vendor.


I'm not saying "uint32" wouldn't have worked. I'm just saying they chose "_t" suffix very deliberately.

Doesn't seem to be too rational to quibble over _t

If you really don't like _t, then typedef it to your own favorite naming convention.


JMHO. But regardless, the "standard" has some problems that ought not be there (char is one example...)
Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.

And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.

Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.

Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'

Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.

Or....

BTW, this was perhaps the dumbest approach to defining a standard I have ever seen. Rather than describe a standard that would really define the language to make portability a given rather than a wish, they chose to try to write a standard that all existing compilers would already meet. That's hardly a reasonable definition of a standard... Because it didn't fix a thing that was broken, portability-wise...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: ansi-C question

Post by bob »

Tord Romstad wrote:
Carey wrote:Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.
I dislike C intensely, but currently there is no other language which offers the same amount of performance and portability across a wide range of platforms. Technically, BitC should be able to fill C's niche, and to programmers like me, it would be vastly more comfortable to work with. But of course, it is very unlikely to ever be widely implemented. :(
And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.
Is that a bad thing?
Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.
Somehow we don't agree on the definition of "portability" then. Pascal is so tightly defined I have _never_ seen a Pascal program that compiles and runs on machine X that will not compile and run on machine Y. In fact, the p-code was 100% compatible across platforms, the predecessor of the Java byte-code approach...

Perhaps the official standard is restrictive and limited, but the Delphi dialect seems to have become a de facto standard, and is far less restrictive and limited. FreePascal, which claims to support the Delphi dialect, is available on all major platforms.
Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'
Yeah. But because all Forth programmers seem to make their own implementation anyway, I don't think anyone cares.
:)
Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.
Ugh.

Tord
BTW, the standard for Fortran was _not_ done before the compilers were written. Later standards were done to address changes that were deemed necessary (pointers, etc.) and then new compilers were written to meet those standards, but Fortran evolved over many years, and until Fortran 77, there was no uniform agreement on many things and vendors were adding things right and left (again, pointers come to mind, but there were other things). The standards were written so that new features would be compatible across all compliant compilers. Why could the ANSI C committee not do that same thing? Obviously they could have, but they didn't, for reasons I'll never grasp...
Carey
Posts: 313
Joined: Wed Mar 08, 2006 8:18 pm

Re: ansi-C question

Post by Carey »

bob wrote:
Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Carey wrote:
bob wrote:
Zach Wegner wrote:What you wrote will work about anywhere. I'm not sure about ANSI C, but multiplication is done implicitly with mod x for x-bit arithmetic. For instance magic multiplication relies on this.

However, if z is more than 32 bit, then you must do a cast after the multiplication: (unsigned int)(x * y);
The problem is the "unsigned int". It is not guaranteed to be 32 bits wide. It can be 64 bits wide just as easily. Unfortunately the ANSI committee didn't give us data types that allow us to specify 32 bits in a portable way. That "unsigned int" is 64 bits on a Cray, for example. And on the Dec Alphas depending on the O/S you are using.
Sure they did, Bob. That's what uint32_t does.
That is not a natural data type. short, int, float are the usual types. Yes those came long later but not until the standard was already well-mangled.
Yes, they are natural data types. They aren't some made-up data type, like doing 37-bit integers on a 32-bit CPU. (With exceptions for doing 32-bit stuff on a 36-bit system, and those kinds of things. But stdint.h even allows for that.)

They are just synonyms for existing, real, supported types.

True, they aren't your traditionally named, usual data types, but that doesn't make them unnatural. They're just typedefs.


They came about 10 years after the original C standard was done, in 1999. I mentioned that. I also mentioned one of the reasons why it wasn't in the original C89 standard.

In 1989, they had already gone about as far as they could in the time they had, and there were already enough changes and innovations that doing yet more inventive stuff would have caused problems.

Plus, it takes a few years of people actually using a standard to see where the weak points are and what else needs to be done. Hence the C99 standard.

Standard C may be 'mangled' as you put it, but it's better than working with K&R. So in spite of its flaws, you are better off with it than without it.

For example, exactly what is a "char"???
A minimum storage unit capable of holding one character of the underlying alphabet.

It can be more than 8 bits, both because of the underlying hardware and because of the language being used. (It might be a 16-bit encoding of the character set, for example.)

Because historically people have used it as a tiny 'int', it can also be signed or unsigned.

And because historically compiler writers have done whatever they want, both C89 & C99 have to leave the default signedness as 'implementation defined'. Changing that now (or in 1999, or even back in 1989) would have broken way too much code, something that was definitely not allowed for them or desired by them.

Of course, by using the data types that were given to us in 1999, or even our own portability header, we avoid all those issues. You just have to get over the inertia of using 'int' and 'unsigned int' and so on.

Once you break your habit, data type issues become much less of an issue or concern.


It's guaranteed to be exactly 32 bits. No more and no less.

True, that's C99. It wasn't practical to do it back in C89 (that would have been too 'radical' of an idea, and they were already reaching the limit of what people were ready to tolerate).

That and more is in stdint.h. It even provides things like uint_fast32_t for math that needs to be at least 32 bits but can be more if that would be faster (or more convenient) for the compiler or platform.

If you are using a compiler that doesn't provide stdint.h (and inttypes.h), then you probably should find a better compiler.

Or if you are desperate, you can at least fake it tolerably well (with custom headers) and still use those standard types in your program.

There are some portable ones floating around on the web, and I think a few people have posted similar headers here in the forum.

There's no reason to use vague sizes or compiler specific types in your code. Hasn't been for years.


I do realize that Microsoft refuses to support even a tiny bit of C99. Not even something as simple as stdint.h. You can either switch to GNU C or Intel C (I think it has it), or you can do some custom headers to hide all that with #ifdefs etc. and let your own program be blissfully unaware of the low-level details.
Again, that is a portability issue that I have to deal with, since 99% of the machines on the planet are running windows, and the majority of those use MSVC variants, rather than gcc/icc...
Agreed. Since a lot of people use MSVC, that's why I bothered to point out a few options that can be used. So you can at least fake it well enough that your own code never has to use 'unsigned int' or 'long long' etc. It can be blissfully unaware of what types the compiler or system prefers.

I would not mind writing purely for Linux in fact, as it would make the code _far_ cleaner, and I may well one day take this plunge as the current conditional compile stuff is a mess...
I can't vouch for your code. The thread / process stuff and whatever other portability issues that are involved. And to be honest, I rarely look at Crafty. (No offense, but I *do* prefer older programs. I'm still waiting for you to finally release Blitz & CrayBlitz, and for a couple more historic programs that I've been promised. One I already have, but I can't talk about it or release it yet... That really irks me! :lol: )

But using things like stdint, or some of the headers that have been posted in here can go quite a way to hiding the differences for data types, printing data types, 64 bit constants, etc. etc.


There are enough portability headers already in existence (plus what C already gives us) that there is no good reason to depend on compiler-specific types and such. Your own code shouldn't even know about 'long long' and so on.
My point above was that "char x" doesn't say much about x. It can be signed or unsigned, while "int x" is always signed unless explicitly declared as unsigned. That is not a good standard. It should be "regular" and that certainly is not...
There wasn't much the C89 standards team could do about it.

As I've told you before, their charter required them to honor existing implementations as much as possible, and to avoid doing any more invention than they absolutely had to.
I don't follow. We had the char type in K&R. And some vendors chose to default to signed, some chose to go unsigned. What was wrong with making this a "standard" definition? I mean, isn't that what a _STANDARD_ is supposed to do?
A standard isn't supposed to break 50% of the existing programs.

And their charter required them to make a standard based on the existing implementations.

That was the problem.

Furthermore, they had to support systems that didn't use two's complement, but still provide for expected behavior.

If people had used char only for characters, there wouldn't have been a problem. But by using it as a tiny int, that created a lot of expectations. And the C standards team was caught in a no-win situation.

If they made it signed, it wouldn't have worked right on some systems, and it would have broken 50% of the existing programs.

If they made it unsigned, it wouldn't have worked right on some systems and it would have broken 50% of the existing programs.

There were good arguments that could be made for either form. And consequences for both.

In a situation like that, you make it 'implementation defined'.

I would not be 100% certain that no compiler chose "unsigned" as the default for an int. So why didn't the standards committee leave that
I'm not aware of any cases, but that certainly doesn't make it so.

Right off the top of my head, I don't remember reading about any complaints or arguments over that score, like it was for char.

My guess is that the original K&R C wasn't vague about that, so everybody (or nearly everybody) did it the same way.
loophole open as well? This was simply a poor decision, and I have seen it wreck programs (including mine when I moved it to an IBM RS6000 running AIX that assumed unsigned for chars breaking my chess board representation). From a user's perspective, we want preciseness, not vagueness... And yet vagueness is what we got in this case (among
others of course).
Well, guess what... you don't have to use plain ambiguous 'char'. You can qualify it. And many compilers do come with a compiler switch to force it to the other format for those programs that barf.

That's kind of the whole point of why C99 added the stdint.h file. You don't have to depend on the signedness or size of any native type. You can always know exactly what you get.

If you need a certain size or type, then you can get it, regardless of whether you are working on an 8-bit micro, a 32-bit 386, or a 64-bit Cray. And who knows, the next standard might add support for 128-bit numbers. (I don't know if they will.)
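A small sketch of what qualifying it looks like; only the plain-char line is implementation-defined:

signed char   sc = (signed char)0xC0;    /* -64 on two's-complement machines       */
unsigned char uc = (unsigned char)0xC0;  /* always 192                             */
char          pc = (char)0xC0;           /* signed or unsigned: the compiler (or a
                                            switch such as gcc's -funsigned-char)
                                            decides                                */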


Making 'char' signed or unsigned by default would have broken 50% of the existing programs.

As a long-time C programmer I don't agree with this. 95% of C compilers used signed as the default for chars. Just a few went unsigned. This was still the case after the "standard" was released as well. The reason I am sure here is that I compiled Crafty on nearly every compiler on the planet back then, and the only problem I found was the IBM C compiler on the RS6000/AIX workstations. Crafty ran cleanly on every other machine I could find at the time...


If people had used char as just a char rather than small integer, there wouldn't have been any problem. But people did use them as small integers.

It wasn't that they didn't think about it. There was too much existing code and too many compiler writers complained.

It goes back to some of the vagueness in the original K&R 'specification' and the implementations that came from it.

It's likely somebody even suggested making the standard support both (via compiler switch or pragma) and there were probably even objections to that.

They knew it was an issue. They just couldn't do anything to solve it that wouldn't break lots of code or that the writers would agree to. (shrug)

There might have even been issues with systems that couldn't handle signed char's at all. Only unsigned. They would have needed to convert to full sized ints before doing anything with it.

I've never seen char anything but 8 bits, but I have seen them default to signed and unsigned depending on the compiler.
8 bits are certainly traditional, but they don't have to be. For example, a 36 bit system would be a prime candidate for 6 or 9 bit chars. During the years it took to do C89, there were many people still doing odd hardware sizes and wanting C to be able to support it.

There were some systems that couldn't even really work with individual bytes in hardware. It was just word based and C had char=short=int=long and if you wanted 8 bits, you had to do it in software.

The C standard is very flexible for hardware. They had to be because there was such a wide variety of hardware that had C compilers.


And again, portability is an issue. Until _everybody_ accepts int64_t, it will be dangerous (and problematic) to use it if portability is important, for Crafty obviously it is...
It's a lot *less* of an issue if you do use some portability header to hide it, rather than trying to deal with it in your own code.

That way if something needs to change, only the portability header needs to be updated and that instantly back-ports to all the previous versions of Crafty as well.

There was absolutely nothing wrong with having "int8", "int16", "int32", "int64" and "int128" as data types. Fortran did that in the 60's (real*4, real*8 for example). There was nothing wrong with having "int" to be the fastest integer width available, but allowing specificity would have been quite natural as given above. As opposed to using obscure data types that completely change the way something is declared, rather than just extending it as adding the number of bits to the type would do...
I wouldn't call "uint32_t" obscure. On the contrary, they chose that form very deliberately. If a type ends in "_t" then you know it's an official C standard type, rather than something added by the compiler vendor or defined in some header by some library vendor.


I'm not saying "uint32" wouldn't have worked. I'm just saying they chose "_t" suffix very deliberately.

Doesn't seem to be too rational to quibble over _t

If you really don't like _t, then typedef it to your own favorite naming convention.


JMHO. But regardless, the "standard" has some problems that ought not be there (char is one example...)
Show me one perfect computer programming language....

They all have problems. C just happens to have fewer than most, but since it's more widely used, the problems are more visible.

And you have to consider C's origins. It was not fully specified in the K&R years. If you wanted to know what something did, you read the book and looked at a few existing compilers.

It was years later when the first standard was done.

Compare that to Fortran or Ada, where the standard was done first and then the compilers were written.

Or Pascal. The standard is so restrictive and limited you can't write real world programs with it. There is no such thing as portability with it.

Or Forth. Which in spite of the standard is still pretty much a language defined as 'implementation defined.'

Or C++ team who took sadistic pleasure in massive invention and radical changes at every meeting.

Or....
BTW, this was perhaps the dumbest approach to defining a standard I have ever seen.
(shrug) The C standardization process was an open process. You could have taken part in it and voiced your opinion.

If you don't like the result, then don't use the language.

Rather than describe a standard that would really define the language to make portability a given rather than a wish, they chose to try to write a standard that all existing compilers would already meet. That's hardly a reasonable definition of a standard... Because it didn't fix a thing that was broken, portability-wise...
Because portability, in the way you mean, wasn't their goal. I suspect that the kind of portability you are meaning & wishing for is impossible.

Their goal was to properly define the standard for reasonable backward portability and reasonable cross platform portability.

They did pretty good for a portable assembler language. The fact that it has succeeded so well for so many years is proof of that.