Compiler: C66xx DSP aligned and unaligned memory accesses

Guy Mardiks

Tool/software: TI C/C++ Compiler

Hello,

I would appreciate it if you could clarify how aligned and unaligned accesses work on the C66xx DSP.

From what I understand, the DSP supports both aligned and unaligned accesses, but each has its own syntax.

When writing in C, what does the compiler assume by default when accessing memory through a pointer (e.g. uint32_t *ptr or uint64_t *ptr)?

When do I need to use the mem4/mem8 or amem4/amem8 intrinsics?

What about 16-bit accesses (i.e. uint16_t)? What happens if the pointer is not aligned to 16 bits (there is no mem16)?

What happens with void pointers?

Thanks

over 7 years ago

0 George Mock over 7 years ago

TI Guru 244080 points

I think most of your questions are answered in the section titled Methods to Align Data in the C6000 compiler manual.

However, there is an error in that section. When compiling for C6600, arrays are aligned on an 8-byte boundary, and not a 16-byte boundary. Therefore, the example macro ALIGNED_ARRAY is incorrect. I filed CODEGEN-5247 in the SDOWP system to have this error corrected.

Thanks and regards,

-George

0 Guy Mardiks over 7 years ago in reply to George Mock

Genius 4135 points

Hello,
I looked at the section you mentioned, and it made things even less clear.
It sounds as if any array (pointer) must be aligned to 64 bits regardless of its type?
That cannot be guaranteed. The manual suggests using an assert, but we actually need to work with unaligned addresses (or at least we cannot guarantee that addresses will be aligned to 64 bits).

This left me more confused about how to use pointers, when to use mem4/mem8, and especially what happens with uint16 (16-bit) pointers.

The document is not descriptive enough. I would appreciate it if you could elaborate and give an example per type to make it clear.

Thanks
Guy

0 Alberto Chessa over 7 years ago in reply to Guy Mardiks

Mastermind 6650 points

Hi,
The C compiler should assume default alignment as per compiler object representation (§8.2 of latest compiler manual), that is:
char: 8 bit boundary
short: 16 bit boundary
int32: 32 bit boundary
int64: 64 bit boundary

Sometimes the compiler uses a more restrictive alignment when allocating data, such as for static scope arrays (always aligned to a 64-bit boundary) or for function call argument passing, but the code generated for loads and stores always assumes the minimum alignment given by the object representation. This is not a problem, since a more restrictive alignment always satisfies the representation alignment (that is, a uint16_t datum aligned to 32 bits is also aligned to 16 bits).

If you need a less restrictive alignment, such as a packed structure with uint16_t data not aligned to 16bits or more, you have to inform the compiler with the proper directive (such as pragma pack) to generate unaligned memory access.

See the type attributes in the compiler manual. For instance:
struct __attribute__((packed)) packed_short_t { short s; };
struct packed_short_t* p;
p[0].s = 1; // generates code for unaligned access (byte access)
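
A minimal sketch of the packed-wrapper idea above, for 16-bit values at arbitrary byte addresses. The names packed_u16, store_u16_any and load_u16_any are hypothetical, and it assumes a compiler that accepts the GCC-style packed attribute (as the TI example above does):

```c
#include <stdint.h>

/* Hypothetical wrapper type: because it is packed, the compiler may
   assume only 1-byte alignment for the member. */
struct __attribute__((packed)) packed_u16 {
    uint16_t v;
};

/* Store a 16-bit value at any byte address. */
static void store_u16_any(void *addr, uint16_t value)
{
    ((struct packed_u16 *)addr)->v = value;
}

/* Load a 16-bit value from any byte address. */
static uint16_t load_u16_any(const void *addr)
{
    return ((const struct packed_u16 *)addr)->v;
}
```

Code that goes through these helpers never dereferences a plain uint16_t pointer, so the 2-byte alignment assumption does not apply to it.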

On the other hand, if you can assume a more restrictive alignment, you can use assertions and intrinsics to let the compiler optimize your code by grouping multiple reads/writes into a single wider read/write and by using SIMD instructions.

Note that, as far as I know, there is no way to force the compiler to always guarantee the proper alignment. If, for instance, you allocate a static scope array of uint16, you can assume it is 64-bit aligned and therefore use an assertion to process 2 or 4 data items at a time. But if your processing loop does not start and end at an index that is a multiple of 2 or 4, the assumption will be false.

uint16_t x[32]; // static file scope - aligned to 64 bits
uint8_t u8[32];

void f2(uint16_t* p) { .... } // no assertion, compiler assumes 16-bit alignment
void f64(uint16_t* p) { _nassert((int)p % 8 == 0); ... } // with assertion, compiler assumes 64-bit alignment
void fpacked(struct packed_short_t* p) { .... } // byte-aligned array of 16-bit values (wrapped inside a struct)

f64(x); // passing x: inside f64(), (int)p % 8 == 0 is true - OK
f64(&x[1]); // passing x+1: inside f64(), (int)p % 8 == 0 is false - FAIL
f2(&x[1]); // OK
f2((uint16_t*)&u8[1]); // not 16-bit aligned - FAIL
fpacked((struct packed_short_t*)&u8[1]); // not 16-bit aligned - OK

When the assertion is false, the only symptom is a wrong result at run time (no exception is raised).
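
One way to keep the 8-byte assertion true even when the caller passes something like &x[1] is to peel scalar iterations until the pointer reaches an 8-byte boundary. The following is only a sketch under that assumption: sum_u16 is a hypothetical function, and the _nassert call is guarded so that it is compiled only by the TI C6000 compiler (which predefines _TMS320C6X).

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n 16-bit samples starting at any 2-byte-aligned address. */
uint32_t sum_u16(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;

    /* Prologue: scalar steps until p reaches an 8-byte boundary. */
    while (n > 0 && ((uintptr_t)p & 7u) != 0) {
        sum += *p++;
        n--;
    }

#if defined(_TMS320C6X)
    /* TI C6000 only: at this point the 8-byte claim is actually true. */
    _nassert(((int)p & 7) == 0);
#endif

    /* Body: four samples (8 bytes) per iteration. */
    for (; n >= 4; n -= 4, p += 4)
        sum += (uint32_t)p[0] + p[1] + p[2] + p[3];

    /* Epilogue: fewer than four samples remain. */
    for (; n > 0; n--)
        sum += *p++;

    return sum;
}
```

Whether the compiler actually turns the body into wide loads depends on the optimization level and the loop; the sketch only makes sure the assertion it is given is never false.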

0 George Mock over 7 years ago in reply to Guy Mardiks

TI Guru 244080 points

Guy Mardiks said:
I looked at the section you mentioned and this made things even more unclear.

I agree that the section is not written on point to your questions. But it does make clear some details I think are relevant.

The compiler (not the user) aligns arrays on an 8-byte boundary
Memory addresses returned from malloc (and related functions) are aligned on an 8-byte boundary
Arrays that are defined inside structures may not be aligned on an 8-byte boundary

Thanks and regards,

-George

0 Guy Mardiks over 7 years ago in reply to George Mock

Genius 4135 points

Hello, Thanks.

1. Does that mean an array that is shared between multiple cores must always be aligned to 8 bytes, otherwise the DSP might see a different address than the other cores?

2. When an array is inside a struct, does the compiler always generate unaligned accesses (since the array may not be aligned), or must the user use the unaligned access intrinsics?

3. When I cannot be certain of the alignment of a given address and still need to read more than 1 byte (16 bits / 32 bits / 64 bits),
does this mean I must use the mem4/mem8 intrinsics? What about 16 bits, given there is no mem2?
What about:
char *ptr;
int x=*((int *)ptr);
Does the compiler generate a 32-bit aligned access because of the cast, or will it generate an unaligned access?

4. How are void pointers treated (as 32-bit aligned?)?

Thanks

0 George Mock over 7 years ago in reply to Guy Mardiks

TI Guru 244080 points

Generally speaking ... The compiler assumes a scalar is aligned to the size of the type, i.e. char is 1-byte aligned, short is 2-byte aligned, etc. Otherwise, nothing is assumed about alignment. The compiler, as an optimization, attempts to determine when a greater alignment must be in effect. When a greater alignment can be proven, it is used.

To make this a bit more concrete ... For a function with this prototype ...

int fxn(int *ptr, int length);

The compiler presumes ptr contains an address that is aligned to 4 bytes. But suppose you always pass the base address of an array as the first argument. This means ptr is actually aligned to 8 bytes. Then you can inform the compiler about this fact by adding ...

_nassert((int) ptr % 8 == 0);

This extra information makes it possible for the compiler to use SIMD instructions that require 8-byte alignment of the memory operands.
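
Since a false _nassert fails silently, it can help to pair it with a run-time check in debug builds. A minimal sketch, assuming the macro name ASSUME_ALIGNED8 (hypothetical) and a loop body that simply sums the array; _nassert is used only when the TI C6000 compiler's predefined _TMS320C6X macro is present:

```c
#include <assert.h>
#include <stdint.h>

#if defined(_TMS320C6X)
/* TI C6000: check in debug builds, then pass the fact to the optimizer. */
#define ASSUME_ALIGNED8(p) \
    do { assert(((uintptr_t)(p) & 7u) == 0); _nassert(((int)(p) & 7) == 0); } while (0)
#else
/* Other compilers: keep only the run-time check. */
#define ASSUME_ALIGNED8(p) assert(((uintptr_t)(p) & 7u) == 0)
#endif

/* Same prototype as above; the caller promises ptr is 8-byte aligned. */
int fxn(int *ptr, int length)
{
    int i, sum = 0;

    ASSUME_ALIGNED8(ptr);
    for (i = 0; i < length; i++)
        sum += ptr[i];
    return sum;
}
```

Building with NDEBUG defined removes the run-time check and leaves only the hint to the compiler.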

Guy Mardiks said:
1. Does that mean an array that is shared between multiple cores must always be aligned to 8 bytes, otherwise the DSP might see a different address than the other cores?

No. But if you do make it 8-byte aligned, inform the compiler with an _nassert.

Guy Mardiks said:
3. When I cannot be certain of the alignment of a given address and still need to read more than 1 byte (16 bits / 32 bits / 64 bits),
does this mean I must use the mem4/mem8 intrinsics?

Yes
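
As a rough illustration of reading from an address of unknown alignment, here is a sketch with hypothetical helper names. It assumes the TI C6000 compiler predefines _TMS320C6X and provides the _mem4_const intrinsic; on any other compiler it falls back to memcpy, which is the portable way to express an unaligned load. The 16-bit helper uses memcpy on all compilers.

```c
#include <stdint.h>
#include <string.h>

/* Read 32 bits from an address with no alignment guarantee. */
static uint32_t load_u32_unaligned(const void *p)
{
#if defined(_TMS320C6X)
    return _mem4_const(p);      /* TI intrinsic: unaligned 4-byte load */
#else
    uint32_t v;
    memcpy(&v, p, sizeof v);    /* portable fallback */
    return v;
#endif
}

/* Read 16 bits from an address with no alignment guarantee. */
static uint16_t load_u16_unaligned(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both helpers return the value in the byte order of the machine they run on, exactly as a normal pointer dereference would.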

Guy Mardiks said:
2. When an array is inside a struct, does the compiler always generate unaligned accesses

For a reference like ptr->array[i], the compiler can determine the alignment automatically, and act accordingly.

Guy Mardiks said:
4. How are void pointers treated (as 32-bit aligned?)?

The C language does not allow a void pointer to be dereferenced. You first have to copy it to a pointer of a non-void type. It is the user's responsibility to ensure the alignment requirements of that non-void type are always met.
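
One way to take on that responsibility is to check the address before converting the void pointer. A small sketch, with the hypothetical helper name as_u32_ptr:

```c
#include <stddef.h>
#include <stdint.h>

/* Return p as a uint32_t pointer only if it is 4-byte aligned,
   otherwise return NULL so the caller can take an unaligned path. */
static uint32_t *as_u32_ptr(void *p)
{
    if (((uintptr_t)p & (sizeof(uint32_t) - 1u)) != 0)
        return NULL;
    return (uint32_t *)p;
}
```

A caller that gets NULL back can fall back to byte-wise or unaligned access instead of dereferencing a misaligned uint32_t pointer.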

Thanks and regards,

-George

0 Guy Mardiks over 7 years ago in reply to George Mock

Genius 4135 points

Hello, Thanks.

You answered "no" about needing an array to be aligned to 8 bytes when it is shared between different cores (not all of them DSP cores).
Can you explain why that is OK? From the previous answers, the DSP compiler ALWAYS aligns arrays to 8 bytes regardless of type. If the same array type is defined on several cores and mapped to the same memory section, won't the DSP linker place the array on an 8-byte boundary while other cores may align it to its element type, for example? That would mean the array start address seen from the DSP differs from the one seen on the other cores.

Does the compiler treat a cast variable as if it were of the cast-to type, from an alignment point of view?
i.e.
char *ptr;
x = *((int *)ptr) -- will the compiler generate a 4-byte access, or will it behave according to the pointer's actual type and/or create an unaligned access in such a situation?

Thanks

0 George Mock over 7 years ago in reply to Guy Mardiks

TI Guru 244080 points

Guy Mardiks said:
You answered "no" about needing an array to be aligned to 8 bytes when it is shared between different cores (not all of them DSP cores).
Can you explain why it will be OK?

I need to refine my answer. It is not always OK.

The C6000 compiler aligns all arrays to an 8-byte boundary. If it sees something like ...

extern int int_array[];

... the compiler presumes int_array is aligned on an 8-byte boundary. But if it sees ...

int fxn(int *ptr, int length)

Because ptr points to an int, the compiler presumes the address is aligned to a 4-byte boundary. Even though it is likely ptr is the base address of an array, it may not be, and so the compiler conservatively presumes ptr is not aligned to an 8-byte boundary.

So, depending on how you present a shared memory object to the C6000 compiler, you may or may not have to worry about 8-byte alignment.

Guy Mardiks said:
char *ptr;
x = *((int *)ptr) -- will the compiler generate a 4-byte access

The compiler presumes the type being cast to is in effect, not the type being cast from. In this specific case the compiler presumes ptr is aligned to a 4-byte boundary.

Thanks and regards,

-George

0 Guy Mardiks over 7 years ago in reply to George Mock

Genius 4135 points

Thank you.
