Two debug windows, same variable, two different values...

Bruno Saraiva

Fellows,

This is a screenshot of CCS 6.1 debugging a TM4C123x. This project suddenly began to give me all sorts of headaches...

The displayed variable is called as5047sensor. It is stored at location 0x20004835, as can be seen both in the yellow pop-up window and in the white expressions window.

The element accuunc was recently set to 0, and that assignment did in fact work. The value on the right (white) part of the image is correctly 0, at address 0x20004849. However, at the same moment, the yellow popup shows a huge number (all formats properly visible there):

There is something quite weird with that variable, as the next line would try to access as5047sensor.accuunc, but that causes a FaultISR.

Patiently tried to debug the FaultISR as per available documents, but found no evident cause. I know not enough assembly to figure this out, but there was nothing popping out as too illegal. Checking the NVIC fault registers, one hint is a bit set on Unaligned Access Usage Fault.

Further info: global interrupts are disabled at this stage.

Any suggestions are most welcome!!!

over 8 years ago

0 Robert Adsett72 over 8 years ago

Guru 10570 points

Even before getting to that line my first thought was alignment issues. You are accessing a word on a byte address, that's probably incorrect.

Note this also meshes nicely with your other issue. Maybe you have mixed alignment flags in your build process?

Robert

0 Bruno Saraiva over 8 years ago in reply to Robert Adsett72

Guru 13040 points

Robert,

That sounds like the problem. Thanks for the input!

There is one included library which has several structures declared. They are used for serial communication, and the transported bytes must be positioned exactly as declared on the structure, we can't risk automatic fills or alignments inside one of those, for the comm would fail.

One of the structures there has been recently modified, and my guess is that's what triggered the other issue and later this one.

Here's an example of one of the structures, with alignment header/footer:

#pragma pack(push, 1)
typedef struct
{
	int32_t   	pos_x;			// Current position in mm
	float		speed;			// Current speed in mm per second
	float   	pos_y;			// Gauge absolute cylinder position mm
	float     	acc_z;			// Current vertical acceleration in g? m/s^2?
	float		acc_x;			// Acceleration along distance, m/s^2
	uint8_t		gaugeSide;		// LSB (bit0) is 0 for left, 1 for right side
}
int_ntga_t;					// Gauge board package structure
#pragma pack(pop)

I honestly don't know exactly what these pragma parameters do. I know the intention is to keep the members from being optimized and having their relative positions and sizes changed, but I would like to have access to the documentation where the parameters are explained. Can you point me to the right doc?

Meanwhile, I believe the problem will be fixed if I look at each of the structures, and pad any of them which is not a multiple of 4 bytes, adding some useless chars in the end... Will do such right now and see what happens...

Nope, problem remains... The variable still shows a huge weird value, and this line causes a fault. Where else are there alignment directives and parameters to look for on the project?
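One possible reason padding the end did not help: padding changes the size of a structure, not the address where a variable of that type is placed. The sketch below checks this at compile time for the int_ntga_t structure quoted above. It assumes a C11 compiler, a 4-byte float, and GCC/Clang-style handling of #pragma pack, where pack(1) also lowers the alignment requirement of the whole structure to 1; whether the TI ARM compiler behaves identically was not confirmed in this thread. The name int_ntga_padded_t is hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof (C11) */
#include <stdint.h>

#pragma pack(push, 1)
/* Structure as posted in the thread: 4 + 4*4 + 1 = 21 bytes when packed. */
typedef struct {
    int32_t pos_x;
    float   speed;
    float   pos_y;
    float   acc_z;
    float   acc_x;
    uint8_t gaugeSide;
} int_ntga_t;

/* Hypothetical padded variant: three filler bytes round the size up to 24. */
typedef struct {
    int32_t pos_x;
    float   speed;
    float   pos_y;
    float   acc_z;
    float   acc_x;
    uint8_t gaugeSide;
    uint8_t pad[3];
} int_ntga_padded_t;
#pragma pack(pop)

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(sizeof(int_ntga_t) == 21, "packed size is not a multiple of 4");
static_assert(sizeof(int_ntga_padded_t) == 24, "padding fixes the size");

/* The padded structure may still start at any byte address. */
static_assert(alignof(int_ntga_padded_t) == 1, "pack(1) gives alignment 1");
```

With an alignment requirement of 1, the linker is free to place such a variable at an odd address, so the size being a multiple of 4 does not by itself make the members word aligned.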

0 Bruno Saraiva over 8 years ago in reply to Bruno Saraiva

Guru 13040 points

Ok, a bit further into it.

I tried to force the variable to a 4-alignment using:

#pragma DATA_ALIGN (as5047sensor, 4);
as5047_data_t		as5047sensor;

Still it did not work, the "ugly number" changed but was still a different type of ugly.

Then I changed the order in which the elements were declared. The uint64_t originally came after five 32bit variables, so it would be "cut in half" if thinking multiples of 8.

I moved the 64bit element to right after 2 32bit elements, and the program now runs properly.

The verdict is that the problem is in fact some bad memory access of a 64bit variable which is not aligned to a multiple of 8. A link pointing to the proper document to learn about these #pragma alignment directives might be useful for further research, and if anyone has more comments and hints on the subject, they shall be welcome!
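The effect of the reordering on member offsets can be checked with offsetof. This is only a sketch: the real as5047_data_t was never posted, so both layouts below and all member names except accuunc are hypothetical, and the structures are assumed to be packed because the debugger showed accuunc at 0x20004849, which is 20 bytes past the structure start at 0x20004835. It assumes a C11 compiler.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

#pragma pack(push, 1)
/* Hypothetical original order: 64-bit member after five 32-bit members. */
typedef struct {
    uint32_t a, b, c, d, e;
    uint64_t accuunc;
} layout_before_t;

/* Hypothetical new order: 64-bit member after two 32-bit members. */
typedef struct {
    uint32_t a, b;
    uint64_t accuunc;
    uint32_t c, d, e;
} layout_after_t;
#pragma pack(pop)

static_assert(offsetof(layout_before_t, accuunc) == 20, "multiple of 4, not of 8");
static_assert(offsetof(layout_after_t, accuunc) == 8, "multiple of 8");
static_assert(sizeof(layout_before_t) == sizeof(layout_after_t), "same size");

/* Returns nonzero when addr is a multiple of n. */
static int is_aligned(uintptr_t addr, size_t n)
{
    return (addr % n) == 0;
}
```

A member address is the structure start plus the offset, so an offset that is a multiple of 8 only yields an 8-byte aligned member when the structure itself starts on a multiple of 8. For example, is_aligned(0x20004849u, 4) is false for the address seen in the debugger.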

0 Robert Adsett72 over 8 years ago in reply to Bruno Saraiva

Guru 10570 points

The best would be either to get rid of alignment directives entirely or apply them to the whole program (including libraries).

Robert

0 Robert Adsett72 over 8 years ago in reply to Bruno Saraiva

Guru 10570 points

Bruno Saraiva said:
They are used for serial communication, and the transported bytes must be positioned exactly as declared on the structure, we can't risk automatic fills or alignments inside one of those, for the comm would fail.

And that's not a particularly good method of solving the problem IMO. As you've found out sprinkling explicit alignment around makes the code brittle.

What you want to do is make the serial code independent of alignment so it doesn't matter what is used in the program the serial interface still works. The way this is done is via cracking functions. To send data you break the values up into the constituent bytes and send them over the wire in the correct order. Receiving is done in the reverse. This is independent of structure member alignment, order, size (mostly) and type. It even lets you use bitfields. The structure can contain information (such as timestamps) that are not sent across the wire.

The cracking functionality would usually be enclosed in a set of macros or in C++ perhaps inline functions.

The usual objection is that this will result in large or slow code but for a decent compiler the code ends up matching what the compiler has to do in any case. In the event that the packing and alignment provably match what is needed for the serial buffering the cracking effectively becomes a nop.
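A minimal sketch of such cracking functions in plain C follows. Everything in it is illustrative: gauge_msg_t is a cut-down hypothetical message, the wire format is assumed to be little-endian with the float sent as its 32-bit IEEE-754 bit pattern, and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory message. Member order, padding and alignment are
 * whatever the compiler chooses; none of it affects the wire format. */
typedef struct {
    uint32_t timestamp;  /* local bookkeeping only, never sent */
    int32_t  pos_x;
    float    speed;
    uint8_t  gaugeSide;
} gauge_msg_t;

enum { GAUGE_WIRE_SIZE = 9 };  /* 4 + 4 + 1 bytes on the wire */

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

/* Break a 32-bit value into bytes, least significant byte first. */
static void put_u32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Rebuild a 32-bit value from four bytes, least significant byte first. */
static uint32_t get_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static size_t gauge_pack(const gauge_msg_t *m, uint8_t out[GAUGE_WIRE_SIZE])
{
    uint32_t speed_bits;
    assert(m != NULL && out != NULL);
    memcpy(&speed_bits, &m->speed, sizeof speed_bits);  /* float bit pattern */
    put_u32_le(&out[0], (uint32_t)m->pos_x);
    put_u32_le(&out[4], speed_bits);
    out[8] = m->gaugeSide;
    return GAUGE_WIRE_SIZE;
}

static void gauge_unpack(const uint8_t in[GAUGE_WIRE_SIZE], gauge_msg_t *m)
{
    uint32_t speed_bits;
    assert(m != NULL && in != NULL);
    speed_bits = get_u32_le(&in[4]);
    m->pos_x = (int32_t)get_u32_le(&in[0]);  /* assumes two's complement */
    memcpy(&m->speed, &speed_bits, sizeof m->speed);
    m->gaugeSide = in[8];
}
```

Only byte accesses touch the buffer, so the buffer can sit at any address, and the timestamp member shows that the structure can carry fields that never go over the wire.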

Robert

Even if you get it to work by judicious placements of alignment pragmas you will still have brittle code. And of course the pragmas are specific to a particular compiler (or worse specific versions of a compiler).

0 f. m. over 8 years ago in reply to Bruno Saraiva

Guru 11940 points

A link pointing to the proper document to learn about these #pragma alignment directives might be useful for further research, and if anyone has more comments and hints on the subject, they shall be welcome!

The toolchain documentation, especially the compiler.

The "#pragma" is the "standard way" to implement non-standard, compiler-specific stuff. Compilers are compelled to ignore unknown pragmas.

I currently deal with such issues to maintain portability between different platforms (including non-ARM, less than 32-bit), and I can tell you it is no fun ...

Then I changed the order in which the elements were declared. The uint64_t originally came after five 32bit variables, so it would be "cut in half" if thinking multiples of 8.
I moved the 64bit element to right after 2 32bit elements, and the program now runs properly.

The Cortex M4 cannot access data widths greater than 32 bit at once. The uint64_t element will always require two consecutive accesses, and any alignment greater than 4 is useless.

Alignment/"packing" becomes an issue with data sizes less than the bus width.

0 f. m. over 8 years ago in reply to Robert Adsett72

Guru 11940 points

Bruno Saraiva
They are used for serial communication, and the transported bytes must be positioned exactly as declared on the structure, we can't risk automatic fills or alignments inside one of those, for the comm would fail.

And that's not a particularly good method of solving the problem IMO. As you've found out sprinkling explicit alignment around makes the code brittle.

Missed this part at first.

If bandwidth is not a particular concern, I would consider an ASCII based protocol (strings, and "stringized" numbers). Assuming the relevant stdlib functions are present, portability is a bliss - compared to binary modes.
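As a rough illustration of the ASCII approach, here is a sketch using only snprintf and sscanf from the standard library. The line format and the field names POS_X and SIDE are invented for this example; they are not part of any protocol mentioned in the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Write one text line into out. Returns the number of characters written
 * (excluding the terminator), or -1 if the buffer is too small. */
static int text_encode(char *out, size_t cap, long pos_x, unsigned side)
{
    int n;
    assert(out != NULL);
    n = snprintf(out, cap, "POS_X=%ld;SIDE=%u\n", pos_x, side);
    return (n > 0 && (size_t)n < cap) ? n : -1;
}

/* Parse one text line. Returns 0 on success, -1 if a field is missing. */
static int text_decode(const char *in, long *pos_x, unsigned *side)
{
    assert(in != NULL && pos_x != NULL && side != NULL);
    return (sscanf(in, "POS_X=%ld;SIDE=%u", pos_x, side) == 2) ? 0 : -1;
}
```

No structure layout, alignment or byte order is involved on either side. The cost is that the text line is longer than the binary values it carries.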

0 Bruno Saraiva over 8 years ago in reply to f. m.

Guru 13040 points

fm, thanks for the comments and suggestions.
Concerning the idea of converting values to asc, however, this I will NEVER do.
I believe there's nothing more counterefficient than textized values used for communication. On an extreme example, I could send you one char as information, or TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,TRUE. It can make a lot of difference...
These days there's a lot of data being sent up and down with a gazillion bytes of overhead... it probably even contributes to the problem of temperature increase of the planet! Things like JSON for example. Or "modern" web pages that require 6 KB to display HELLO WORLD...
We will keep closer attention to our structures, and even pad a byte or two in the end to make them multiples of four... organize the variables in an optimized way to have the 4's as adjacent as possible... But really, we will NOT send a text such as "MELTTEMP=421.2339081". Not, at least, in our applications which transport 40,000 values in one second. :)
Cracking will be studied further on - but for now, a more intelligent alignment of variables inside the protocol definition consumes 3 minutes as opposed to coding and testing that... So it will remain on the idea pile for some time...
Of course, there are no two identical applications and optimal solutions - and that's what keeps food on the table of a great number of developers.
Cheers!

0 f. m. over 8 years ago in reply to Bruno Saraiva

Guru 11940 points

Concerning the idea of converting values to asc, however, this I will NEVER do.
...

Your answer suggests that you haven't thought it through. Never mind, it is up to you.

Whole operating systems (Unix/Linux) follow this idea - they use ASCII file formats and transfer protocol formats whenever possible.

Only if readability and portability are of concern, of course ...

0 Bruno Saraiva over 8 years ago in reply to f. m.

Guru 13040 points

Yes, f.m., we actually considered ASCII when we began this adventure. We did not go that way, though.

I am likely an off-the-curve developer, having graduated as a mechanical engineer and having received quite little formal education on programming... So the solutions implemented under my shift are probably somewhat awkward... Do you know these few guitar players who are lefty and play with standard instruments upside down? Sort of "it works, but it is definitely not the way it should be"??? By the time someone decided to teach them, it was probably too late! :)

still, I'll throw a thought/curiosity about ASCII messages moving around - they are SO human based, that they were probably the first reason for development of in-chip cryptography! At least with our binary messages (not that they contain anything secret...), they look "cool, high tech and undecipherable to naked eye!".

:)

and this became a relaxing thread... readers, if you are looking for technical content, they ceased a few posts above!

0 Robert Adsett over 8 years ago in reply to Bruno Saraiva

Guru 27665 points

Bruno Saraiva said:
a more intelligent alignment of variables inside the protocol definition consumes 3 minutes as opposed to coding and testing that

Don't forget that has to be tested too and you have to search every possible use of associated structures and test them. I would expect 3 minutes to be a considerable underestimate of the effort involved.

Robert

0 f. m. over 8 years ago in reply to Robert Adsett

Guru 11940 points

Robert Adsett said:

Don't forget that has to be tested too and you have to search every possible use of associated structures and test them. I would expect 3 minutes to be a considerable underestimate of the effort involved.

Robert

And who would ever consider porting such protocol parsing code to several platforms, with differing architectures, toolchains, register sizes, endianness, and alignment requirements ... ;-)

0 Bruno Saraiva over 8 years ago in reply to Robert Adsett

Guru 13040 points

Robert,

That's so true! Testing platforms have grown from non-existent to more than 50% of our engineering time over the past 5 years... When we consider not only the embedded platforms but the cloud storage, web interfaces, automatic email communications, etc, today there is certainly more engineering on the tests themselves than on the products. Something that we also learned with time and pain, as tech entrepreneurs of a small company in South America...

...but... there's ideal life, and real life...

Unfortunately, at this stage, we still test several engine parts of our single-prop plane, well... during flight.

0 f. m. over 8 years ago in reply to Bruno Saraiva

Guru 11940 points

Bruno Saraiva said:
Unfortunately, at this stage, we still test several engine parts of our single-prop plane, well... during flight.

And how much do you pay those brave test pilots ? ;-)

0 Robert Adsett over 8 years ago in reply to Bruno Saraiva

Guru 27665 points

I'm just suggesting that you might actually find it faster (or at least not significantly slower) to test with adding cracking routines than with sprinkled alignment directives since a whole class of problems disappears. For instance TDD routines can actually test cracking functionality but cannot test alignment directives leaving that to slower, cumbersome manual testing.

Robert

0 cb1 over 8 years ago in reply to Robert Adsett

Guru 47900 points

Poster Robert, this reporter, (possibly) poster f.m., and (just maybe) Luis find this book of value:

0 Robert Adsett over 8 years ago in reply to cb1

Guru 27665 points

A worthwhile read especially if you think TDD does not/cannot apply to embedded programming. Not heavy reading.

We use a different set of tools than James does but the principles directly apply. In my experience so far TDD is the second most effective quality tool you can have in your toolbox.

Robert

0 cb1 over 8 years ago in reply to Robert Adsett

Guru 47900 points

And the "MOST effective" one... so that we're not forced to, (pardon) "infer?" (staff shouts, "Lint!")

0 Chester Gillon over 8 years ago

Guru 92251 points

Bruno Saraiva said:
Patiently tried to debug the FaultISR as per available documents, but found no evident cause. I know not enough assembly to figure this out, but there was nothing popping out as too illegal. Checking the NVIC fault registers, one hint is a bit set on Unaligned Access Usage Fault.

Some questions:

1) If you single-step the assembly in the CCS debugger, can you find which instruction causes the Unaligned Access Usage Fault on the access to the 64-bit as5047sensor.accuunc?

2) The CCS debugger expression view shows some of the other 32-bit integer variables are not aligned, does the code access those unaligned 32-bit integers correctly?

3) Are you using the TI ARM compiler, and if so have you changed the --unaligned_access option?

This option defaults to on for Cortex devices, which tells the compiler the target supports unaligned accesses for 16-bit or 32-bit values.

4) What is the value of the UNALIGN_TRP bit 3 in the Configuration and Control Register (CCR)?

The reason for these questions is that the UNALIGN_TRP bit, when clear, allows instructions to perform unaligned 16-bit and 32-bit accesses (when set, those accesses trap), but unaligned LDM, STM, LDRD, and STRD instructions always generate an Unaligned Access Usage Fault regardless of the bit.

See the ARM Cortex-M4 documentation 3.3.5. Address alignment and 4.3.7. Configuration and Control Register
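For reference, a small sketch of how these two bits can be decoded from raw register values. The addresses and bit positions are the architectural ones for the Cortex-M4 System Control Block (CCR at 0xE000ED14, CFSR at 0xE000ED28, with the UsageFault status in the upper half of CFSR); the macro and function names are made up for this example and are not taken from TivaWare or CMSIS.

```c
#include <assert.h>
#include <stdint.h>

#define SCB_CCR_ADDR     0xE000ED14u  /* Configuration and Control Register */
#define SCB_CFSR_ADDR    0xE000ED28u  /* Configurable Fault Status Register */

#define CCR_UNALIGN_TRP  (1u << 3)    /* 1 = trap unaligned 16/32-bit accesses */
#define CFSR_UNALIGNED   (1u << 24)   /* UFSR bit 8; UFSR is CFSR[31:16] */

/* Nonzero when unaligned halfword/word accesses are configured to trap. */
static int unalign_trap_enabled(uint32_t ccr)
{
    return (ccr & CCR_UNALIGN_TRP) != 0;
}

/* Nonzero when an unaligned access UsageFault has been recorded. */
static int unaligned_fault_latched(uint32_t cfsr)
{
    return (cfsr & CFSR_UNALIGNED) != 0;
}

/* On the target only (not on a host PC), the raw values would be read as:
 *   uint32_t ccr  = *(volatile uint32_t *)SCB_CCR_ADDR;
 *   uint32_t cfsr = *(volatile uint32_t *)SCB_CFSR_ADDR;
 */
```

The same values can be read in the debugger's memory or register view and checked by hand against bit 3 and bit 24.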

Bruno Saraiva said:
The element accuunc was recently set to 0, and that assignment did in fact work. The value on the right (white) part of the image is correctly 0, at address 0x20004849. However, at the same moment, the yellow popup shows a huge number (all formats properly visible there):

The yellow pop-up shows the least significant 32-bits of accuunc are zero but some higher order bits are set. That might be a bug in the CCS debugger.

0 Robert Adsett72 over 8 years ago in reply to cb1

Guru 10570 points

Your staff doesn't get those large salaries for nothing.

Although C/C++ may need a static analyzer more than say ADA or Eiffel it does have a high quality relatively inexpensive one in Gimpel's product. I don't think the same can be said for other languages.

Robert

I understand ADA to have some good static analyzers available but not inexpensively

0 Bruno Saraiva over 8 years ago in reply to Chester Gillon

Guru 13040 points

Chester,

Thanks for the comments. They are indeed focused right on the issue!

Some of the code has already been changed (BUT NO PROJECT SETTINGS), and I'll drop some answers here for the record and to help understanding.

1) Yes, assembly debug was possible. As the code changed a bit for debugging, and since the problem is gone after I rearranged the structs to avoid odd positions, I won't have the exact location. But I remember it was one of the wide-value access instructions of the image below:

2) There was no apparent sign of problems on the 32-bit integers to that point. Indeed, lots of them appeared to be stored in odd locations...

3) Yes, using TI v5.2.8 compiler. A couple of days ago I was still using 5.2.7, any chance that the simple presence of 5.2.8 modified some alignment directive?

I have surely not changed specifically this alignment option - I probably never needed to access the Runtime Model Options page where I just found the related entry. And in fact, it is ON. Should OFF be a better option for TM4's? I checked a different, older project, which is compiling with 5.2.7, and it also shows ON - but probably all of these related projects on my workbench are pasted from some similar base project anyway...

4) UNALIGN_TRP bit was tricky to find... here they are for today's status:

And yes, the 0 value on the yellow popup versus the ugly value on the green variable monitoring seems to be a bug - do these things get forwarded to whoever needs to look into that?

Cheers!

0 Chester Gillon over 8 years ago in reply to Bruno Saraiva

Guru 92251 points

Bruno Saraiva said:
I won't have the exact location. But I remember it was one of the wide-value access instructions of the image below:

The Cortex-M4 requires the memory address for LDM and STM instructions to be 4 byte aligned, and if the memory address isn't aligned an "Unaligned Access Usage Fault" will be generated. I created the following example program using the TI ARM v5.2.8 compiler:

#pragma pack(push, 1)
typedef struct
{
	unsigned char byte;
	unsigned char padding;
	unsigned short half_word;
	unsigned int word;
	unsigned long long long_word;
} packed_struct;
#pragma pack(pop)

typedef struct
{
	unsigned char byte;
	unsigned char padding;
	unsigned short half_word;
	unsigned int word;
	unsigned long long long_word;
} unpacked_struct;

packed_struct packed;
unpacked_struct unpacked;

char buffer_a[sizeof (packed_struct) * 2];
packed_struct *const misaligned_packed = (packed_struct *) &buffer_a[1];

char buffer_b[sizeof (unpacked_struct) * 2];
unpacked_struct *const misaligned_unpacked = (unpacked_struct *) &buffer_b[1];

int main(void) {
	unpacked.byte = 0xfc;
	unpacked.half_word = 0xfb;
	unpacked.word = 0xeeeeeeea;
	unpacked.long_word = 0xdddddddd00000000LLU;
	unpacked.long_word++;

	packed.byte = 0xff;
	packed.half_word = 0xfffe;
	packed.word = 0xfffffffd;
	packed.long_word = 0xdeaddead00000000LLU;
	packed.long_word++;

	misaligned_packed->byte = 0xff;
	misaligned_packed->half_word = 0xfffe;
	misaligned_packed->word = 0xfffffffd;
	misaligned_packed->long_word = 0xdeaddead00000000LLU;
	misaligned_packed->long_word++;

	misaligned_unpacked->byte = 0xfc;
	misaligned_unpacked->half_word = 0xfb;
	misaligned_unpacked->word = 0xeeeeeeea;
	misaligned_unpacked->long_word = 0xdddddddd00000000LLU;
	misaligned_unpacked->long_word++;
	return 0;
}

It was compiled with the default --unaligned_access=on. The program contains two structures which have the same layout, but one is marked as packed with one byte alignment (packed_struct) and the other doesn't have any packing specified (unpacked_struct). Both structures are of size 16 bytes, with an 8 byte field long_word at offset 8.

Looking at the generated code:

a) For the unpacked_struct structure the compiler uses ldm.w and stm.w instructions to access the long_word, and thus the address of the long_word field must be 4 byte aligned.

b) For the packed_struct structure the compiler uses pairs of ldr / str instructions to access the long_word field, and since the Cortex-M4 supports unaligned word accesses the address of the long_word field doesn't have to be aligned.

When the example is run:

- The unpacked and packed variables which have four byte alignment can be modified successfully.

- The misaligned_packed variable, which has a deliberate non-word aligned address, can be modified successfully as the compiler only uses ldr / str instructions.

- The misaligned_unpacked variable, which has a deliberate non-word aligned address, causes an "Unaligned Access Usage Fault" when the following stm.w instruction attempts to write to the long_word field at the unaligned address of 0x20000229:

In conclusion as of TI ARM compiler v5.2.8 the behavior appears to be:

1) If a structure is packed the compiler doesn't appear to use ldm / stm instructions to access 64-bit fields, and so the 64-bit fields can be accessed even if not 4 byte aligned.

2) If a structure is not packed the compiler uses ldm / stm instructions to access 64-bit fields, and so the address of the 64-bit field (and the start address of the structure) must be 4 byte aligned.

Therefore, ensuring all your structures are marked as packed *may* allow 64-bit fields to be used without requiring any alignment. However, since the compiler documentation doesn't specify the conditions under which ldm / stm instructions are generated, you would be relying upon undocumented behavior, which may break the code under some conditions.
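One way to avoid depending on which instructions the compiler picks for a direct member access is to go through memcpy, which the C standard defines for source and destination pointers of any alignment. The helpers below are a sketch with made-up names; how the TI ARM compiler actually lowers these memcpy calls was not examined in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from an address with no alignment guarantee. */
static uint64_t load_u64_any(const void *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 64-bit value to an address with no alignment guarantee. */
static void store_u64_any(void *p, uint64_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

For example, the increment of a possibly misaligned 64-bit counter located at some pointer p becomes store_u64_any(p, load_u64_any(p) + 1).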

The test project is attached TM4C129_alignment.zip

[As an aside I found there is a hidden --no_stm TI ARM compiler option which prevents the compiler from generating stm instructions. However, --no_stm is only intended to be used to avoid a silicon erratum on some Hercules Cortex-R4 based devices, and the compiler reports an error if you attempt to use --no_stm without -mv7R4. See RE: C compilator and TMS570LS3137 bug DEVICE#B064 ]

Bruno Saraiva said:
Should OFF be a better option for TM4's?

From a quick look at the generated code, turning --unaligned_access to Off can generate more code if the memory address of a load/store might not be aligned. Given that the TM4C devices can handle unaligned half-word or word accesses, I don't see the benefit of changing --unaligned_access to Off.

Also, changing --unaligned_access to Off doesn't prevent the compiler from generating ldm / stm instructions for accesses for 64-bit fields in unpacked structures, and so wouldn't have fixed your problem.

Bruno Saraiva said:
And yes, the 0 value on the yellow popup versus the ugly value on the green variable monitoring seems to be a bug - do these things get forwarded to whoever needs to look into that?

I tried to repeat the problem with my example program under CCS 6.1.3, but couldn't see the problem - in that the CCS debugger pop-up showed the correct value for 64-bit variables. Suggest you report this as a separate problem on the Code Composer Studio forum.
