CCS Compiler does not clear bss data (uninitialised data)

David Brown (Westcontrol)

I've just read in the Code Composer Studio v4.2 Users Guide for MSP430:

B.5.1 Initializing Static and Global Variables
The ANSI/ISO C standard specifies that static and global (extern) variables without explicit initializations
must be pre-initialized to 0 (before the program begins running). This task is typically performed when the
program is loaded and is implemented in the IAR compiler:

/* IAR, global variable, initialized to 0 upon program start */
int Counter;

However, the TI CCS compiler does not pre-initialize these variables; therefore, it is up to the application
to fulfill this requirement:

/* CCS, global variable, manually zero-initialized */
int Counter = 0;

Please, can someone tell me that this is just a bad joke? I used CCS for a TMS320F241 DSP chip something like 12 years ago, and wasted a lot of time finding out about this problem. After all, you don't expect your problems to be caused by the compiler failing to follow one of the most basic functions of a C compiler. But it is hard to comprehend that this is the case in the current tools.

over 14 years ago

Joerg Quinten over 14 years ago

Guru 17650 points

Hi David,

well, I need to quote one of my favorite bands for answering your question: 'SAD BUT TRUE'!

Find it e.g. on Metallica's black album 'Metallica' (2nd track) or, for a better taste, track no. 9 on 'LIVE ***: binge & purge'!

Rgds
aBUGSworstnightmare

Jens-Michael Gross over 14 years ago

Guru 227245 points

David Brown said:
The ANSI/ISO C standard specifies that static and global (extern) variables without explicit initializations must be pre-initialized to 0

Where? I think it's just a joke to assume that something I didn't assign a value to will have a defined value anyway.
Nobody has written that GCC complies with any C standard, and especially not with any of its many derivatives. It may or may not; the documentation tells.

If you want a variable to have a specific value at startup, then initialize it. Either in the definition or in the code before it is used.

I must admit that mspgcc does indeed initialize these variables (and it requires specific tweaking to have them NOT initialized), but then, on some MSPs the watchdog was kicking in before the memory for all these 'uninitialized' variables had been cleared, which gives an impression of how long this might take. So leave variables uninitialized when you don't need them initialized (and then they will have a random value or the value from before the last reset) or initialize them if necessary. And don't let the compiler generate code that does probably unnecessary work and forces you to dig into the compiler depths to make it stop.

Also, initializing uninitialized variables with 0 was never part of the original C or C++ language description. It has been added many, MANY years later, probably because a compiler manufacturer was too tired of the support requests from people who were too lazy to say what a variable has to be on startup. And then others adopted it as a (time consuming!) convenience option. Well, a few milliseconds longer startup time is not worth mentioning on a PC program that requires minutes to start anyway, but on a microcontroller, every millisecond counts.

David Brown (Westcontrol) over 14 years ago in reply to Jens-Michael Gross

Prodigy 120 points

Hi,

I don't have a copy of the standards here to quote chapter and verse, though it should be easy to obtain through a quick post to the comp.lang.c Usenet newsgroup. But be assured that all statically allocated objects with no explicit initialisation are given the arithmetic value 0 or the NULL pointer, and this has been a requirement of C since the early days of K&R. It is not something that has been added later. In practice, it is handled by startup library code rather than the compiler itself, but it is still part of the compiler toolchain.

If you want a variable to have a specific non-zero value, then you must initialise it explicitly. But if you want it to be initialised to zero, then you are free to leave it without explicit initialisation - the toolchain will clear it to zero. You can assume that to be true, just as you can assume that an "int" is at least 16-bit and a "long" is at least 32-bit, because these are standard parts of the language. It is not open to discussion or interpretation by the compiler implementers - they may give you an option to change the behaviour if they think it would be useful, but they may not make such fundamental changes to the language and still call it "C". I don't know offhand which standards CCS says it follows, but without clearing the uninitialised data to zero, it doesn't follow any of them.

I do understand that there may be times when an embedded programmer might want to do something different - there are always special cases. It is useful to be able to write your own startup code - and CCS has hooks in place to let you do that. In fact, it should not be a big problem to add in code to fix this flaw. But the default behaviour should be the one that follows the standards and provides least surprise to the user - not something that is a very rare request with few uses. And if the watchdog is an issue when clearing the bss, then the library startup code should kick the watchdog as necessary, or the startup code should be re-written to run faster (for some unknown reason, such startup code is often written as a slow character-by-character loop in hand-written assembly, rather than a much faster loop which can just as well be written in C). It is significantly faster and more compact to clear the bss in a startup loop than to do so as part of the initialised data copying, and even more significantly faster and more compact than initialising the data as part of the code.
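
A minimal sketch of the "kick the watchdog while clearing" idea above, written as portable C so it can be tried on a host. The function name clear_with_watchdog, the chunk size, and the kick callback are assumptions made for illustration, not part of any TI or IAR runtime; on a real MSP430 the callback would service the watchdog through WDTCTL, and the start address and length would come from linker symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define CLEAR_CHUNK_WORDS 64u   /* assumed chunk size; tune to the watchdog period */

/* Host-side stand-in for a real watchdog service routine (hypothetical). */
unsigned kick_count;
void host_kick(void) { kick_count++; }

/*
 * Zero 'words' 16-bit words starting at 'start', calling 'kick' after
 * each chunk so a running watchdog never expires during the clear.
 */
void clear_with_watchdog(uint16_t *start, size_t words, void (*kick)(void))
{
    while (words > 0) {
        size_t n = (words < CLEAR_CHUNK_WORDS) ? words : CLEAR_CHUNK_WORDS;
        size_t i;

        for (i = 0; i < n; i++) {
            start[i] = 0;
        }
        start += n;
        words -= n;

        if (kick != NULL) {
            kick();
        }
    }
}
```

With a 150-word region and a 64-word chunk, the callback runs three times (64 + 64 + 22 words), so the watchdog only has to outlast one chunk rather than the whole clear.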

mvh.,

David

old_cow_yellow over 14 years ago in reply to David Brown (Westcontrol)

Guru 58965 points

IAR c-startup code will zero them.

If you have a lot of them, that c-startup will only be able to zero part of them. The watch-dog will generate a reset before it finishes and re-start the c-startup again. So the first part will be zeroed again, and again, and again, ...

Hardy Griech over 14 years ago in reply to Jens-Michael Gross

Expert 2870 points

Jens-Michael Gross said:

Where? I think it's just a joke to assume that something I didn't assign a value to will have a defined value anyway.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf, chapter 6.7.8/10 states it more or less clearly:

10 If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. If an object that has static storage duration is not initialized explicitly,
then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these
rules
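
As a small illustration of the rule quoted above (all names here are hypothetical), the following sketch checks at run time that objects with static storage duration and no explicit initializer start out as zero or null. On a conforming toolchain the function returns 1; on a runtime that skips the zeroing, it may not.

```c
#include <stddef.h>

/* Hypothetical aggregate type, for illustration only. */
struct sample {
    int   count;
    char *name;
    long  values[4];
};

static int           counter;   /* arithmetic type: must start at 0      */
static char         *message;   /* pointer type: must start as null      */
static struct sample record;    /* aggregate: every member zero or null  */

/* Returns 1 if every object above still holds its default initial value. */
int statics_are_zeroed(void)
{
    size_t i;

    if (counter != 0 || message != NULL) {
        return 0;
    }
    if (record.count != 0 || record.name != NULL) {
        return 0;
    }
    for (i = 0; i < sizeof record.values / sizeof record.values[0]; i++) {
        if (record.values[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling statics_are_zeroed() first thing in main() is one cheap way to find out whether a given startup runtime honours this requirement.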

It's relatively easy to make CCS a little bit more C compliant:

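#include <stdint.h>   // uint8_t
#include <string.h>   // memset

extern void watchdog_init( void );   // assumed: application function that stops the watchdog

/**
* Set RAM to zero.
* This is a lowlevel initialization routine of the runtime!
*
* \note
*    the runtime of the CCE20 does not do that on its own!
*/
int _system_pre_init( void )
{
   extern uint8_t __bss__;   // linker symbol: start of .bss
   extern uint8_t _stack;    // linker symbol: start of the stack section

   // turn watchdog off
   watchdog_init();

   // clear everything from the start of .bss up to the stack
   memset( &__bss__, 0, (size_t)(&_stack - &__bss__) );
   return 1;
}   // _system_pre_init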

It also took me some time to find this out - so please, TI, use the above _system_pre_init() as the standard case and let the software developers still define their own _system_pre_init() for the special case where no initialization is wanted. Feel free (TI) to use the above snippet ;-)

Hardy

David Brown (Westcontrol) over 14 years ago in reply to Hardy Griech

Prodigy 120 points

Hi Hardy,

Thanks for that post. I had already read as far as the "_system_pre_init" hook, which is the most convenient place to put such code. But I hadn't yet found the exact names to use for the start and end of the bss section (different toolchains use different formats for these symbols), so you've saved me a bit of effort there.

Your code here is good enough for me at the moment - startup time is not a critical issue, and the bss in this application is small. But if you (or anyone else reading this, like the TI people who are supposed to have implemented this) are interested, I can give you a few more points to consider to improve the efficiency.

It seems that CCS uses weird and non-standard section naming, with initialised data also being copied from flash into the ".bss" section - so some of bss is initialised data, and some is uninitialised data and should be cleared. Every other compiler that I have checked (which is quite a few) uses ".data" for the initialised data section, and keeps it distinct from the bss. It makes it easy to write the startup routines - you clear the bss, and copy the initial contents from flash into data.

This means that unless TI splits these sections in future versions of the compiler, an optimal startup routine would need to clear only the non-initialised part.

By declaring the __bss__ and _stack as uint8_t and using memset, this code is clearing the startup data using 8-bit writes. On the msp430, it will run much faster if 16-bit writes are used. I assume that these two symbols are 16-bit aligned, though I can't see any indication in the linker command file. A loop doing 16-bit writes, or maybe a partially unrolled loop doing perhaps 4 such writes per iteration, would give faster code (but it would need larger alignment in the linker command file).
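
A rough sketch of the 16-bit variant described above, in portable C. The function name clear_words and its pointer parameters are assumptions for illustration; on a target they would be fed from linker symbols that are known to be 2-byte aligned. The tail loop handles any remainder, so in this form the unrolled loop does not need a larger alignment than 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Zero the half-open range [start, end) using 16-bit stores.
 * Both pointers are assumed to be 16-bit aligned, with start <= end.
 */
void clear_words(uint16_t *start, uint16_t *end)
{
    uint16_t *p = start;
    size_t words = (size_t)(end - start);

    /* Main loop, unrolled: four 16-bit stores per iteration. */
    while (words >= 4) {
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
        p += 4;
        words -= 4;
    }

    /* Tail: the remaining 0 to 3 words. */
    while (words > 0) {
        *p++ = 0;
        words--;
    }
}
```

Whether the unrolling pays off depends on the compiler and optimisation level, so it is worth reading the generated assembly before committing to it.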

For devices which have dma, then my guess is that using the dma channels will give the best performance for startup, assuming bss is more than a certain size. The code could set one dma channel to clear the bss, and another to copy over the initialised data, and then use the processor to handle other startup tasks such as disabling the watchdog, and preparing to run the C++ global constructors (though it couldn't start these until the dma copies were finished).

Best regards,

David

Joerg Quinten over 14 years ago in reply to David Brown (Westcontrol)

Guru 17650 points

Hi,

Why do you want to use a function for turning the WDT off (or initializing it to something different) when you can do it with a simple instruction?

Kind regards
aBUGSworstnightmare

/*
* WDTCTL, Watchdog Timer+ Register
*
* WDTHOLD -- Watchdog timer+ is stopped
* ~WDTNMIES -- NMI on rising edge
* ~WDTNMI -- Reset function
* ~WDTTMSEL -- Watchdog mode
* ~WDTCNTCL -- No action
* ~WDTSSEL -- SMCLK
* ~WDTIS0 -- Watchdog clock source bit0 disabled
* ~WDTIS1 -- Watchdog clock source bit1 disabled
*
* Note: ~<BIT> indicates that <BIT> has value zero
*/

WDTCTL = WDTPW + WDTHOLD;

darkwzrd over 14 years ago in reply to Hardy Griech

Expert 1385 points

Hardy Griech said:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf, chapter 6.7.8/10 states it more or less clearly:

10 If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. If an object that has static storage duration is not initialized explicitly,
then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these
rules

Hmmm....The document you present is a draft for the second Technical Corrigenda. Assuming that this entry was present before the corrections (and it probably was), this could be a violation.

The latest manual that I could find (I'm not a user of their compiler, so please correct me if there is a better reference), http://focus.ti.com/general/docs/lit/getliterature.tsp?literatureNumber=slau132e&fileType=pdf, says on page 11 that it adheres to C89.

It appears on the surface that the compiler is in direct violation of the standard. Can anyone present any evidence otherwise?

On second glance, the manual does mention this fact on page 93:

5.12 Initializing Static and Global Variables

The ANSI/ISO C standard specifies that global (extern) and static variables without explicit initializations must be initialized to 0 before the program begins running. This task is typically done when the program is loaded. Because the loading process is heavily dependent on the specific environment of the target application system, the compiler itself makes no provision for initializing to 0 otherwise uninitialized static storage class variables at run time. It is up to your application to fulfill this requirement.

Initialize Global Objects

NOTE: You should explicitly initialize all global objects which you expected the compiler would set to zero by default.

Does that count as an extension then? What is considered a violation and what isn't? Any C language experts in the house?

Jens-Michael Gross over 14 years ago in reply to darkwzrd

Guru 227245 points

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf, chapter 6.7.8/10 states it more or less clearly:

This document draft is from 2005, which is likely a bit 'newer' than the C89 standard.

Also, it talks about objects and object members, which shall be initialized. I don't consider global variables as objects in an object-oriented sense, so the question is whether it applies at all to this topic, even IF the compiler would claim to follow this standard (if it were a standard at all, as it is just a draft).

Nevertheless, the compiler does not claim to be compliant to any other standard than C89. Even the reference to K&R doesn't imply that the compiler conforms with 'their' version of C/C++, but rather that any references in the compiler description are references to parts of this book.

However, I admit that most people are used to having their lazy variable definitions filled with a zero-initialization by the compiler.
It has, however, the drawback of not being able to leave uninitialized data exactly that: uninitialized. Initialisation takes time, and it is better for an MCU system to have to do what's necessary when it's necessary, than to have things done which are not necessary - without a possibility to suppress them.

Sometimes, I wish TI had not produced a C compiler but rather a D compiler. With D being similar but not identical to C. Then programmers would be happy to have something they almost know but would know that they have to do some things differently, rather than complaining about things which are not as they expected - and staying in their worn-out paths of programming an MCU like any other PC program. A great fraction of the threads in this forum are started just because things don't work on an MCU as they do on a PC.

David Brown (Westcontrol) over 14 years ago in reply to Jens-Michael Gross

Prodigy 120 points

The references in the C standards to "objects" and "object members" refer to any item of data storage, with "members" referring to struct or array members. The terms predate C++, and have nothing to do with object oriented programming.

Producing a C compiler that only vaguely follows a standard that is well over 20 years old is absurd, although as far as I have tested CCS also supports C99 except for the failure to initialise statically allocated data - a requirement of C89 and K&R C, and every C standard since. To be fair on TI, I can't see any indication on their website or in their documentation that they claim to support any standards - in fact, they barely claim to be a C/C++ compiler at all, and mostly refer to it as just a "compiler".

Having statically allocated data initialised to zero is not "lazy" - it is part of the language, and it was included because it is efficient in time and space. If you want to initialise data to something other than 0, you can specify an initialiser when the variables are defined - that is also part of the language, and it is expected to be implemented efficiently (and CCS does so). Neither TI nor anyone else gets to omit support for features on the grounds that people who expect a C compiler to compile C code are "lazy".

All data requires a value at some point. Many programs are written with the expectation that the initial value will be 0, and the most efficient way to achieve that is with a tight loop at startup that clears the bss section. This has been well-established practice for decades, because it requires minimal code space and minimal time to achieve the initialisation. I can certainly see that there are occasions when statically allocated data need not be initialised, but it is a rarity - and it is even rarer that there is significant time to be saved by not zeroing out part of the bss. I think it is a good thing for a compiler to make it easy to have data sections that are not cleared - but most compilers have some way to achieve that. But that is the exception, not the rule.

As for D, I think you are arguing without knowing much about the language. D is a higher level language than C or C++, it assumes a 32-bit (or more) processor, and it supports language features such as garbage collection, run-time typing, true strings, hash tables, etc. These are features that are totally out of place on a small microcontroller like the msp430, and can easily lead to significant unexpected overhead on a larger processor like a Cortex-M3. In other words, D is a good language for PC programming but not for small embedded systems.

I can't speak for other threads in this forum, having only read a few (I find web forums a very inefficient way to communicate, but I know that TI people read these forums), but I can assure you that I know the difference between microcontroller programming and PC programming.

Jens-Michael Gross over 14 years ago in reply to David Brown (Westcontrol)

Guru 227245 points

David Brown said:
Producing a C compiler that only vaguely follows a standard that is well over 20 years old is absurd

Why? It does the job. And all the addendums are counterproductive in a microcontroller environment.
If you want to use all the features of the latest convenience add-ons of today's ultra-high-level languages, then the MSP isn't your target of choice anyway. A PC or something like that is definitely better suited.
Guess why some (successful) people are still writing some parts of their MCU firmware in assembly language? Not because it is so convenient, but because it does what you want - no less but also no more.

David Brown said:
in fact, they barely claim to be a C/C++ compiler at all, and mostly refer to it as just a "compiler".

Indeed. It is a C-style compiler that follows in many but not all points the K&R description of the C language. Especially, it is a compiler designed for compiling MSP430 programs. Not intended to be used for anything else.

David Brown said:
Having statically allocated data initialised to zero is not "lazy"

No. It's lazy not to write how you want a variable to be initialized and to assume that it will be automatically initialized.
If you initialize it to 0, it's still up to the compiler to place it into the zeroed data section and not generate an entry into the initialized data area which is copied from flash to ram on startup.

David Brown said:
If you want to initialise data to something other than 0, you can specify an initialiser when the variables are defined

Delete the 'other than 0' and I wholeheartedly agree. If you want it to have a value on startup, then give it a value. If it is not necessary, then the compiler should not waste any time in assigning an unneeded value anyway.

David Brown said:
All data requires a value at some point.

Of course. But not at initialization. There is no point in filling any buffers which are written at runtime anyway with init zeroes.

Did you ever wonder why many PC programs contain large (often megabytes) areas of zeroes in their binary? Usually the initialisation of buffers which are never intended to be zeroed but filled with runtime-calculated data.

I agree that the majority of definitions is for data that needs to be initialized. But the majority of data bytes does not. And on microcontrollers you usually statically allocate buffers (using malloc only makes sense if you have virtually unlimited resources).

Initializing data that does not need initialization is a waste of time. Something that is unimportant on PCs, but may be crucial on MCUs.

David Brown said:
I think it is a good thing for a compiler to make it easy to have data sections that are not cleared - but most compilers have some way to achieve that. But that is the exception, not the rule.

I agree. And it's a bad thing to have to learn these 'some ways' just to get things not done that you don't need and don't want to have done.
It's that simple: say what you want (e.g. have this variable initialized to 0) and don't have to say what you don't want.

David Brown said:
As for D, I think you are arguing without knowing much about the language.

After posting this I realized that there is indeed something called 'D'. I could have said 'C--' or 'X' or whatever instead. I was not referring to the actual existing language 'D'.

David Brown said:
but I can assure you that I know the difference between microcontroller programming and PC programming.

If so, then you're a rare breed. Many people who are coming here (usually those who leave again shortly after) think the MSP is a full-featured PC with at least Pentium-like power and resources, but shrunk to a thumbnail and sold for a fraction of the price. I've had some 'experts' doing complex double math operations where simple integer shifting would have been enough, and then complaining about the MSP being too slow.

When I think back to the 'good old' C64 times, I can remember many good action-packed games which were done on a system with not much more resources (ram/rom) than the MSP and much less processing power.
I even started with C in these times.

David Brown (Westcontrol) over 14 years ago in reply to Jens-Michael Gross

Prodigy 120 points

I really hate web forums like this - it makes it so much more time-consuming to log in, read, and post, and so much more inconvenient to quote and reply properly. Give me a proper mailing list or Usenet newsgroup any day. But I'll try to get the quoting right, so that my reply makes sense.

Jens-Michael Gross said:
If you want to use all the features of the latest convenience add-ons of today's ultra-high-level languages, then the MSP isn't your target of choice anyway. A PC or something like that is definitely better suited.
Guess why some (successful) people are still writing some parts of their MCU firmware in assembly language? Not because it is so convenient, but because it does what you want - no less but also no more.

I don't want the latest high-level languages on the MSP. When I program on a PC, I use high-level languages (typically Python). When I program on microcontrollers, I use low-level languages - the great majority in C these days, with occasional bits in assembly (though it's many years since I've written entire projects in assembly), and very occasionally C++.

The reason I want support for a more modern C standard is for better development features. I want the stricter checking available in modern compilers - CCS, especially in its default setup for errors, warnings and remarks, is terrible - it will accept code that I view as total rubbish. A prime example is that using undeclared functions is considered a minor remark - it should be an error. (And no, enabling strict ansi checking is never an option for embedded compilers, since it turns all embedded extensions into errors.)

I want to be able to mix declarations and executable code, and use zero-length arrays, and use non-int bitfields, and inline functions, and restrict pointers, and the dozens of other improvements that have been made since C89. These let me write better code - the source code is shorter and clearer, and the compiled target code is smaller and faster. These are good things for any development process, and especially good for microcontroller programming. It turns out that CCS does in fact accept most C99 constructs that I have tried - but I would be a lot happier if it actually said it did, rather than leaving the user guessing.

I work with something like a dozen processor architectures (with maybe 4 or 5 being "current" at any given time). I have worked with a large number of compilers. I know what code the compiler generates for different source code, because I regularly read the generated assembly to be sure it is correct and efficient. And I know that better language support leads to better code.

Jens-Michael Gross said:
All data requires a value at some point.
Of course. But not at initialization. There is no point in filling any buffers which are written at runtime anyway with init zeroes.

Correctness trumps speed every time. A buffer that is unnecessarily zeroed leads to a few tens of microseconds longer startup time. A variable that is not correctly zeroed leads to a broken program. So the only conceivable correct implementation for a compiler is to zero the uninitialised data as required by every C standard (including C89), and let those programmers concerned about those extra tens of microseconds work around that issue (it's not hard).

Jens-Michael Gross said:
Did you ever wonder why many PC programs contain large (often megabytes) areas of zeroes in their binary? Usually the initialisation of buffers which are never intended to be zeroed but filled with runtime-calculated data.

That would be a broken compiler or toolchain. A real toolchain will either require the OS to zero its bss on load, or it will clear it with a simple loop.

Jens-Michael Gross said:
As for D, I think you are arguing without knowing much about the language.
After posting this I realized that there is indeed something called 'D'. I could have said 'C--' or 'X' or whatever instead. I was not referring to the actual existing language 'D'.

OK. The real "D" language is perhaps not a bad choice for a larger device, such as an ARM.

I'll agree with you that C is actually a poor language for use in microcontrollers - it's a poor language, in many respects. But it's the best we've got - if it is implemented correctly.

Jens-Michael Gross said:
but I can assure you that I know the difference between microcontroller programming and PC programming.
If so, then you're a rare breed.

I know I'm rare, and I am also unknown in this forum, so it is not unreasonable for you to initially guess that I am like most others here. But I'm speaking here from a position of knowledge and experience, both of my own work and of helping others. And I have no doubts whatsoever that CCS's failure to clear bss is an inexcusable flaw, and based on that single flaw, and their attitude (we know it doesn't follow the standards and will cause people trouble, but we don't care) I would not recommend the tools to anyone.

Jens-Michael Gross over 14 years ago in reply to David Brown (Westcontrol)

Guru 227245 points

David Brown said:
I really hate web forums like this - it makes it so much more time-consuming to log in, read, and post, and so much more inconvenient to quote and reply properly.

I'm not happy with it either. It is slow, and the fact that you won't see all previous posts when replying does not help either. But the editor is quite capable, once you get used to its HTML-style handling of line feeds.

David Brown said:
I know I'm rare, and I am also unknown in this forum, so it is not unreasonable for you to initially guess that I am like most others here.

I didn't guess anything :) I only wrote in general about common misunderstandings when 'entering' the microcontroller world.
I think the most often asked question (there are many, but this one is really frequently asked) is "why do I get a 'code size exceeds available memory' error when using fprintf in my project?" (usually, the target in question is a LaunchPad)

David Brown said:
I want the stricter checking available in modern compilers - CCS, especially in its default setup for errors, warnings and remarks, is terrible - it will accept code that I view as total rubbish.

Indeed, the default errors and warnings are a total mess. I program with mspgcc, which is based on gcc, and the list of additional warnings I enable is almost the major part of the makefile (in bytes) :)

David Brown said:
I want to be able to mix declarations and executable code, and use zero-length arrays, and use non-int bitfields, and inline functions, and restrict pointers, and the dozens of other improvements that have been made since C89.

I too use much of this, where it is appropriate. However, the automatic initialisation done by mspgcc caused me quite some headache until I discovered what happens and how I can get around it. Because the initialization of data may take longer than the watchdog timeout, mspgcc disables the WDT before initializing the data. This was not stated anywhere in the docs, was not expected, and definitely was not what I wanted (what use is a WDT if it is disabled right on startup?). I ended up writing my own startup code, since putting everything manually into an uninitialized data segment was way too much work.
I guess that's why CCS does not initialize everything - it costs too much time and as I said, it's not necessary for the majority of bytes (not objects).

David Brown said:
It turns out that CCS does in fact accept most C99 constructs that I have tried - but I would be a lot happier if it actually said it did, rather than leaving the user guessing.

I guess it IS stated somewhere. There's just a guide missing that tells you where to look. (This piece of source code, that release note comment...). Getting in-depth information about open-source projects can be a real pain (but for closed-source projects it can even be impossible).
The difference between a typical proprietary Windows program and a typical open-source, Linux-originated one is that the Windows program is usually big, looks nice, runs slow, more or less does the job, and has documentation that tells you how to press the proper button to make it do its job. And if you're not satisfied, then this is your problem, not the developers'. The Linux program is really small, fast, can do almost everything, but only has a command-line interface (or something not much better); the documentation lists all the nice features, but besides some examples of how to do certain things (just like the Windows buttons, just without the button itself) you're lost. It is as if the programmer tells you: "I've done this very clever piece of code that can do everything I wanted it to do, but now I move on to something else and have no more time for describing how to properly use it in every detail - analyze the source code if you want to know."
Neither is the best way to go. With respect to GCC, I want to use it for programming, not to (re-)program it.
I still use mspgcc, as the flexibility it offers (once you've got the info on how to use it) is superior to the competitors (IAR/GCC). If I relied on a debugger, things would be different, but I have never launched a debugger for any of my various MSP projects.

David Brown said:
Correctness trumps speed every time.

No. A few microseconds and the realtime event you wanted to catch is gone and you didn't notice, rendering the whole project void. Correctness for correctness' sake is a luxury you cannot afford on microcontroller projects. If, but only if, you have the time, you can go for 'nice' code or canonically correct structures or whatever. But the first and most important rule is to match the project requirements, which for MCUs usually are 'get it done in the available time and space' and not 'win a design contest with your code'.

Our laser power supply could have easily destroyed the laser within a few milliseconds while doing unnecessary startup code work in the controller.

David Brown said:
A real toolchain will either require the OS to zero its bss on load, or it will clear it with a simple loop.

If the programmer has 'initialized' a buffer by assigning a single value, even if 0, the compiler has no choice (by the standard) but to put it into pre-initialized data space and generate a full set of initialization data for it, which is stored in the code to be copied. Of course, if it is really uninitialized, it should be put into a RAM segment that is not copied from the binary.

David Brown said:
I'll agree with you that C is actually a poor language for use in microcontrollers

Oh, it does its job, as long as you don't forget what your target is. Assembly coding (which I have also done extensively in my C64 times, and later with Espire for GEOS) isn't much fun. And I don't know which other language would be as efficient for MCUs. C is widely known and the compilers have matured over the years; I don't think there's another language that would do better. Still, it's not perfect (especially since the C language has no constructs to deal with e.g. the status register or interrupt functions, so every compiler adds its own extension for this).

Anyway, my current work is coding in ActionScript for Flex. It's like adding the worst parts of C++ and JavaScript to something that is as opaque as 'the Fog'. Luckily, I don't have to care for microseconds in this project. :)

David Brown (Westcontrol) over 14 years ago in reply to Jens-Michael Gross

Prodigy 120 points

Jens-Michael Gross said:
David Brown said: I want the stricter checking available in modern compilers - CCS, especially in its default setup for errors, warnings and remarks, is terrible - it will accept code that I view as total rubbish.
Indeed, the default errors and warnings are a total mess. I program with mspgcc, which is based on gcc, and the list of additional warnings I enable is almost the major part of the makefile (in bytes) :)

I too normally program with mspgcc (and use gcc for most of the other targets I work with). But when this project was started by a colleague, it used an msp430f2xxx device, and mspgcc had poor support for them at that time. Thus the obvious choice of compilers was CCS, and I'm now continuing it with the same tools. And I have absolutely the same attitude to gcc warnings - I use just about all of them.

Jens-Michael Gross said:
However, the automatic initialisation done by mspgcc caused me quite some headache until I discovered what happens and how I can get around it. Because the initialization of data may take longer than the watchdog timeout, mspgcc disables the WDT before initializing the data. This was not stated anywhere in the docs, was not expected and definitely was not what I wanted (what use is a WDT if it is disabled right on startup?). I ended up writing my own startup code, since putting everything manually into an uninitialized data segment was way too much work.

One of the disadvantages of tools like mspgcc is that without a serious commercial backing (as there is for gcc for the ARM, the AVR, and many other targets), documentation tends to be low-priority. But I'd rather use a tool that is correct but poorly documented, than a tool that carefully documents its flaws!

Personally, I normally disable watchdogs unless I have reason to believe that the hardware is dodgy (perhaps a poor power supply with no decent power-on reset device). A watchdog doesn't help for software - at best it hides problems and makes them harder to spot. I read an article once which described watchdog resets as "hitting a dead man repeatedly on the head with a hammer in the hope that he'll wake up". If a software issue causes the program to hang, then the software is broken, and the watchdog will not help it. It is possible, however, to use a watchdog reset or interrupt as a debugging aid, perhaps to print out information in the event of a hang.

Jens-Michael Gross said:
I guess that's why CCS does not initialize everything - it costs too much time and as I said, it's not necessary for the majority of bytes (not objects).

The time required is not much in real world terms unless you have huge buffers, or very strict requirements, or very poor initialisation code (I've seen some terrible examples that take many times longer than necessary). And while it is, as you say, not necessary for the majority of bytes (such as those in buffers), it is necessary for the majority of objects - and that need is more important than timing needs. If a buffer is large enough that you specifically want to disable initialisation for it, then it is easy to put it into a particular section - that's a lot less effort than specifically adding "= 0" to all your statically allocated data. And note that declaring data as "int a = 0;" takes more flash space, and longer startup time, than "int a" with proper initialisation code. And putting the "a = 0" inside a function is even worse in time and space.

Jens-Michael Gross said:
David Brown said: Correctness trumps speed every time.
No. A few microseconds and the real-time event you wanted to catch is gone and you didn't notice, rendering the whole project void. Correctness for correctness' sake is a luxury you cannot afford on microcontroller projects. If, but only if, you have the time, you can go for 'nice' code or canonically correct structures or whatever. But the first and most important rule is to match the project requirements, which for MCUs are usually 'get it done in the available time and space' and not 'win a design contest with your code'.

No, you are very wrong here - but I think that's a misunderstanding of the terms. Correctness is all-important. It's easy to write a program that is fast but doesn't do what it needs to do. If you have specific time requirements, then those are part of the specification and the program is only correct if it fulfils those timing requirements. Getting the timing right is then part of getting the program correct, just like meeting any other requirements. Correctness is not a "luxury" - it is fundamental. Sometimes things like style or elegance are a luxury, and must bow to speed requirements (i.e., fast and ugly code is better than slow and neat code if the code must be fast). But if you can stay within your timing budget, style and readability of the code are vital to being sure that the code is correct.

Jens-Michael Gross said:
Our laser power supply could have easily destroyed the laser within a few milliseconds while doing unnecessary startup code work in the controller.

If that is the situation, I would question the wisdom of using a microcontroller in this way. Microcontroller startups are seldom accurate to that sort of level - power-supply startup and power-on reset devices are not normally suitable for such specific tight timing budgets. The usual technique is to have something like an enable signal with a pull-up or pull-down to ensure a fixed state at power-on, and only activate the important electronics once the main program is in action. Of course, this is highly dependent on the project.

Jens-Michael Gross said:
If the programmer has 'initialized' a buffer by assigning a single value, even if 0, the compiler has no choice (by the standard) but to put it into pre-initialized data space and generate a full set of initialization data for it, which is stored in the code to be copied. Of course, if it is really uninitialized, it should be put into a RAM segment that is not copied from the binary.

That is not true - as with most things in C compilation, it all falls under the "as-if" rule. It is perfectly allowable for the compiler to put data that is explicitly initialised to 0 into the bss, because it has the same effect. It is also perfectly allowable to use something like run-length encoding for explicitly initialised data, rather than storing everything in full in the text section (though that is by far the most common method).

Jens-Michael Gross said:
Anyway, my current work is coding in ActionScript for Flex. It's like adding the worst parts of C++ and JavaScript to something that is as opaque as 'the Fog'. Luckily, I don't have to care for microseconds in this project. :)

ActionScript is something like javascript, isn't it? I recently completed a project which involved an active web page (served up from a TI Cortex-M3 device, programmed with gcc). But I avoided writing the javascript directly. I used pyjamas to convert a nice, clear Python program into the monstrosity of javascript that was needed to give correct results on all common webbrowsers.

Jens-Michael Gross over 14 years ago in reply to David Brown (Westcontrol)

Guru 227245 points

David Brown said:
A watchdog doesn't help for software - at best it hides problems and makes them harder to spot.

That's true. But in our case, the devices run in an industrial environment. Power surges, ESD and EMI are common problems, and if something happens, the device shall come up again without user intervention (power cycle). And as you know yourself, the programmer will always find one bug fewer than there are. So in case something unexpected hits the software and it hangs, the WDT will keep the thing running until the bug has been tracked down and fixed. A device that fails completely can cost the customer a multiple of its original cost. So the WDT is needed and has to be always on, without any exception.

Sure, many other projects may not require this kind of security fallback (and it can be abused to hide obvious design and programming flaws), but having to disable the WDT so the device will start up at all is the last thing one should consider (well, one should not consider this at all). Unfortunately, it was done in mspgcc by default and this was not even documented. It took me quite some time to figure out why the WDT was not triggering when I intentionally expected it to trigger: I was about to blame TI for the WDT not being enabled on startup despite the documentation, when I finally discovered that the mspgcc startup code was the reason. I had NEVER expected this. But on the MSPs with larger RAM (especially the 1611) it took too much time to initialize the data.

David Brown said:
The time required is not much in real world terms unless you have huge buffers, or very strict requirements, or very poor initialisation code (I've seen some terrible examples that take many times longer than necessary)

In some projects, especially in the very-low-power area where the MSP is switched off (including RAM retention) and just listens for a port interrupt, startup time can be vital. Unlike on a PC, where users patiently wait many seconds before the (compared to the MSP) ultrafast machine has started a program, on MCUs even a microsecond can be an eternity.
And indeed, I've seen poor init code myself more than once (and not only init code). I've seen people doing things in time-critical sections which were neither necessary nor even efficiently coded.

David Brown said:
it is necessary for the majority of objects - and that need is more important than timing needs.

Necessity? No. If it is necessary, initialize it. If not, don't, and do not expect it to be done.
It's like implicit type conversion. If you want it converted, do an explicit typecast. If not, don't complain about the compiler giving you a warning.
Say what you want and do not assume that things are done without you saying you want them done.

David Brown said:
If a buffer is large enough that you specifically want to disable initialisation for it, then it is easy to put it into a particular section - that's a lot less effort than specifically adding "= 0" to all your statically allocated data.

Is it? Depends on the compiler. Usually, the programmer does not know of this option or just doesn't care - leading to code that starts slower than necessary or is bigger than necessary.

David Brown said:
And note that declaring data as "int a = 0;" takes more flash space, and longer startup time, than "int a" with proper initialisation code.

Why? If the initialization is with 0, the compiler can just as well put it into a section where it just zeroes the RAM instead of copying the code. That's something the compiler can decide and should - but often doesn't.

David Brown said:
And putting the "a = 0" inside a function is even worse in time and space

If this initialization is only required once (and not each time this part of the code is called), then of course you're right. Anyway, doing everything at the same time, but fast, and doing it slowly, but only when needed, each have advantages and disadvantages. Sometimes, spreading an action is less efficient due to overhead, but still desirable because the grouped, more efficient action would violate the timing requirements. It depends on the individual case, but forcing one and forbidding the other is surely not the way to go.

And in addition to all this, explicitly writing what you want to have done increases readability, which is important when more than one person works on a project. Seeing a variable that is initialized to 0 tells me that there was a reason to put the initialization there. Seeing no init tells me that there is no initialization needed.

Anyway, there is no point in taking this discussion any further.
Fact is that CCS never claimed to implement any C standard; only parts of the implementation description refer to this specific part of K&R. And fact is also that the documentation explicitly states that you have to init your variables. What you expect and what you get are two different things, as long as you get what you have been promised to get.

David Brown said:
Correctness is not a "luxury" - it is fundamental. Sometimes things like style or elegance are a luxury

Often enough 'correctness' is broken down to a certain style or whatever. Besides things that are really important, many unimportant things have made their way into 'standards'. Many things that are considered necessary to be 'correct' are rather of a bureaucratic type. If you ask why this 'has' to be done, the answer is 'because it is written in the standard', with no other explanation. If this is the way to go, I wonder why standards undergo revisions or are replaced. There was a standard, and if one sticks strictly to the idea of standards, no more change was allowed, thus no other standard may ever replace it, since any 'new' standard would be a violation of the old.

If you allow the idea of changing or replacing standards, you have to allow that people do not follow a certain standard, as long as they do not claim following it.

So I agree: correctness is about doing exactly what's requested (or at least should be), but it is not about doing things for their own sake, like providing a declaration of compliance about EMI protection for an adhesive paper label. (European bureaucracy demands those stupid things, just because the label is attached to an electric device, is therefore a component of an electric device, and therefore needs such a certificate like all other 'components'.)

David Brown said:
If that is the situation, I would question the wisdom of using a microcontroller in this way.

Actually we didn't. :) We included a hardware 'watchdog' which needs to be triggered by the controller. So before the first trigger, the output was disabled. Anyway, it was a matter of cost - every piece of additional hardware increases the manufacturing costs.

David Brown said:
It is perfectly allowable for the compiler to put data that is explicitly initialised to 0 into the bss, because it has the same effect.

Allowable, yes, I didn't question that. I said 'should', not 'must', as it would require fewer resources. But at least on the PC, waste of resources is not a topic of interest for most. And I understand why: time-to-market is an important factor, often more important than efficient code. If the hardware does not have enough resources, the next hardware generation will.
I remember a software for lawyers. The license contract contained a part requiring the hardware always being updated to the current state of technology by the customer. I wondered why, until I discovered that they are still running their old DOS based software, hiding it in a virtual machine with a frontend that fakes a real windows program. With every update of their database and every add-on, the whole thing was running slower and slower, so the customer had to compensate with ever faster machines. Nobody cared, despite the fact that the software was very expensive (and still is). But it works, so it seems to be 'perfectly allowable'.

David Brown said:
ActionScript is something like javascript, isn't it?

Yes and no. It has some similarities, including the concept of dynamic objects, yet it is a compiled language and has many aspects which look more like C++ than JavaScript. It is, however, implementing the weaknesses of both as well as their strengths.
I've worked with JavaScript too (using Dojo) and PHP. (and a lot more - I'm not focused on any language, nor do I care for)
But I'm an engineer, not an information scientist, even if I often work as a programmer as well as a hardware designer. So I look at many software-related things from a different angle.
Maybe that's what makes me so successful with my projects. :)
I've seen enough information scientists producing crap that perfectly followed all design rules (being 'correct' from an information scientist's view) but still was unusable in the field (so not being 'correct' from an engineer's view). (two different types of correctness - see above)

David Brown (Westcontrol) over 14 years ago in reply to Jens-Michael Gross

Prodigy 120 points

Jens-Michael Gross said:
David Brown said: The time required is not much in real world terms unless you have huge buffers, or very strict requirements, or very poor initialisation code (I've seen some terrible examples that take many times longer than necessary)
In some projects, especially in the very-low-power area where the MSP is switched off (including RAM retention) and just listens for a port interrupt, startup time can be vital. Unlike on a PC, where users patiently wait many seconds before the (compared to the MSP) ultrafast machine has started a program, on MCUs even a microsecond can be an eternity.

There are times when you need startup times measured in microseconds, and there are times when you need large ram buffers. But it is very rare to need both at once. In embedded development, there are always exceptions, and always special cases - there can be particular reasons for needing that combination in a particular project. But as a general point, if you are needing something that has microsecond response from reset and also large data blocks, then you should be splitting up the functionality.

Jens-Michael Gross said:
David Brown said: And note that declaring data as "int a = 0;" takes more flash space, and longer startup time, than "int a" with proper initialisation code.
Why? If the initialization is with 0, the compiler can just as well put it into a section where it just zeroes the RAM instead of copying the code. That's something the compiler can decide and should - but often doesn't.

Compilers almost always put "int a = 0;" initialisers in the same section as "int b = 1;" initialisers - i.e., they have a direct copy of the initial values in a flash section, and copy it over to ram with a "memcpy" style operation at startup. They certainly could put "int a = 0;" in the same section as "int c;", i.e., the bss which is typically cleared at startup with a "memset" type of operation. Compilers (or to be more precise, the toolchain, as the library and linker are involved too) are allowed to implement startup in many different ways, as long as "int c" variables are zeroed and "int b = 1;" variables get their specified initial values. A smart enough tool could use run-length encoding to save space, for example. But in practice they all (except broken toolchains) use a simple and clear system which puts "int a = 0" into the initialised data section rather than the bss, and which clears the bss with a memset loop and copies the initialised data with a memcpy loop.

Jens-Michael Gross said:
And in addition to all this, explicitly writing what you want to have done increases readability, which is important when more than one person works on a project. Seeing a variable that is initialized to 0 tells me that there was a reason to put the initialization there. Seeing no init tells me that there is no initialization needed.

Oh, I agree that explicit is generally better than implicit - writing clear code is vital. But writing concise code is also important. When you write code in C you can assume that the rules of C are followed, and that means you do not have to waste space writing out the details of something that is guaranteed by the language. So you can write something like "if (p && p->flag) {...}" - you know it is safe because the language guarantees that "p->flag" will not be evaluated unless p is non-zero. You know it, I know it, and everyone else who is experienced with C knows it. This makes the code clearer, and therefore better, than something involving extra nesting "if (p != NULL) { if (p->flag) { ..." even though the latter form is more explicit. In the same way, a global or file scope definition "int a;" is perfectly clearly stating that "a" is initialised to zero. The language C has no way to express uninitialised statically allocated data.

Jens-Michael Gross said:
Anyway, there is no point in taking this discussion any further.
Fact is that CCS never claimed to implement any C standard; only parts of the implementation description refer to this specific part of K&R. And fact is also that the documentation explicitly states that you have to init your variables. What you expect and what you get are two different things, as long as you get what you have been promised to get.

It is true that CCS doesn't claim to implement any specific C standard. But it is advertised and sold as a C/C++ compiler, and thus it is reasonable to expect it supports features that have been part of the C language since before there were any standards, which have remained intact through all later standards and revisions, and which are implemented by all other serious C compilers (AFAIK). You can argue until you are blue in the face that it is sometimes useful not to waste time initialising buffers, but the fact remains that this is part of the language and has to be implemented and has to be the default for a C toolchain. It is good to have a way to change this behaviour, but the standard behaviour must be the default.

Jens-Michael Gross said:
If you allow the idea of changing or replacing standards, you have to allow that people do not follow a certain standard, as long as they do not claim following it.

"Standard" can have many meanings. In this case, there are things that are fundamental parts of the language, and things that have varied over time in the different standards. When new standards are made, they certainly do change things - but they don't make anything incompatible without very good reason. Instead, they are mostly additions to the language. And it is therefore fair enough for a compiler to support the C89 standard but not C99 - it would make the tool old-fashioned and a poor choice compared to its more capable competitors, but it's still reasonable enough (especially as CCS actually supports a fair amount of C99). It is also reasonable to support most of a standard, but not everything - very few, if any, embedded compilers fully support C99 (how many have full library support for locales and wide characters?). But you don't change the fundamentals. You don't decide that your compiler will have 8-bit ints - even if that would make the code smaller and faster. You don't decide to change the integer promotion rules - even if it would make more sense. You don't decide to skip the zeroing of uninitialised data - even if it would make startup faster.

Jens-Michael Gross said:
David Brown said: ActionScript is something like javascript, isn't it?
Yes and no. It has some similarities, including the concept of dynamic objects, yet it is a compiled language and has many aspects which look more like C++ than JavaScript. It is, however, implementing the weaknesses of both as well as their strengths.
I've worked with JavaScript too (using Dojo) and PHP. (and a lot more - I'm not focused on any language, nor do I care for)
But I'm an engineer, not an information scientist, even if I often work as a programmer as well as a hardware designer. So I look at many software-related things from a different angle.
Maybe that's what makes me so successful with my projects. :)
I've seen enough information scientists producing crap that perfectly followed all design rules (being 'correct' from an information scientist's view) but still was unusable in the field (so not being 'correct' from an engineer's view). (two different types of correctness - see above)

I too am an engineer (and like you, I work with lots of programming languages and also with hardware design). "Correctness" is "does it work according to specifications". Things like style or elegance of the solution are about the quality of the system, not its correctness (except of course to the extent that "quality" is part of the specifications, which is normally the case). The lawyer software you mentioned is perhaps poor quality and inelegant, but it is correct because it does the job.
