TMS320F280049C: Device header files do not resolve char16_t to unsigned short, breaking UTF-16 string constants

Part Number: TMS320F280049C
Other Parts Discussed in Thread: C2000-CGT

According to the C11 standard the following initialization should be valid:

#include <uchar.h>
#include <stdint.h>
const char16_t abc[] = u"This is supposed to work without throwing warnings..\n";

But when I try to compile it, I get the following error:

$ ../C2000-CGT/bin/cl2000 --c11 -I../C2000-CGT/include test.c
"test.c", line 3: error: a value of type "unsigned short [54]" cannot be used to initialize an entity of type "const char16_t []"
1 error detected in the compilation of "test.c".

>> Compilation failure

The root cause is that if we follow the typedef chain back through uint_least16_t to __uint16_t, we find that for the CLA it is defined correctly, as unsigned short; but for the C2000, it is incorrectly defined as unsigned int. Even though these are the same size on the C2000, they are distinct, incompatible types, so the compiler rejects the initialization.

I can make this code work by modifying it as follows:

#include <uchar.h>
#include <stdint.h>
const char16_t * abc = (char16_t *)u"This is supposed to work without throwing warnings..\n";

And while this compiles without errors, I can no longer use sizeof(abc) since it is now just a pointer.  So it would appear there is no way to initialize a char16_t array.
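One partial workaround I can sketch (the names abc_raw and ABC here are hypothetical, and I have only reasoned about this against the error message above, not verified it on the TI tools): declare the array with the element type the compiler actually gives u"" literals (unsigned short on the C2000), then recover the char16_t view with a cast. Because the underlying object is a real array, sizeof still works:

```c
#include <uchar.h>
#include <stddef.h>

/* Declare the backing array with the literal's actual element type
 * (unsigned short on C2000), so the initialization is accepted. */
static const unsigned short abc_raw[] =
    u"This is supposed to work without throwing warnings..\n";

/* Hypothetical macro: restores the char16_t view for APIs that want it. */
#define ABC ((const char16_t *)abc_raw)

/* sizeof works because abc_raw is a true array, not a pointer
 * (53 characters plus the terminating NUL, per the error message). */
enum { ABC_LEN = sizeof(abc_raw) / sizeof(abc_raw[0]) - 1 };
```

This keeps the array length available at compile time, at the cost of an ugly cast at every use site.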

If I do the unspeakable and force the definition of uint_least16_t to unsigned short (in include/machine/_types.h) instead of unsigned int, then lo and behold, the original example compiles without errors. I have no idea what else this breaks, but it demonstrates that this is the root cause.

So the compiler apparently resolves u"This is UTF-16 text" to unsigned short, but char16_t resolves to unsigned int. Ironically, these are the same size on this target, but they are still distinct types: one is intentionally a 16-bit integer, the other just happens to be that size.

The only time the exact type of char16_t matters is when defining UTF-16 strings, so I propose that either u"asdf" should resolve to unsigned int, or uint_least16_t should resolve to unsigned short. Why on earth would it not?

Is there another syntax I could use for defining UTF-16 strings?  A compiler option which magically makes this work?  Is it safe to #define _CHAR16_T_DECLARED before including any system headers and to typedef unsigned short char16_t?

Any guidance would be appreciated.