Using .struct within .include files, with the optimizing assembler/compiler

Walter Snafu

I'm interested in using data structures in assembly code, similar to how data structures are used in C. That is, you define a data structure type and then declare a name as a specific instance of that data structure type -- and then put it all within an include file (in C this is a .h header file). Then include that file within the various code files that reference the named data structure. Also, one of the code files allocates memory for the named data structure, and the linker puts it all together.
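For concreteness, the C arrangement I have in mind can be sketched roughly as follows. This is only an illustration collapsed into a single file; the names my_struct_type, my_struct, read_fum, and check_read_fum are hypothetical, and the comments mark which part would sit in the header and which in exactly one source file.

```c
#include <assert.h>

/* ---- would live in the shared header, e.g. my_struct.h (hypothetical) ---- */
typedef struct {
    int fee;
    int fi;
    int foe;
    int fum;
} my_struct_type;

extern my_struct_type my_struct;  /* declaration only: allocates no storage */

/* ---- would live in exactly one .c file ---- */
my_struct_type my_struct = { 1, 2, 3, 4 };  /* the single definition */

/* ---- could live in any other .c file that includes the header ---- */
int read_fum(void)
{
    return my_struct.fum;
}

/* Small self-check standing in for a caller. */
void check_read_fum(void)
{
    assert(read_fum() == 4);
}
```

The point of the split is that the header only declares the instance, so including it from many files never allocates the structure more than once.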

I wanted to do that with TI DSP assembly code, using the .struct directive, plus the .include directive. I presumed it would all work just fine, and I couldn't find any warnings to the contrary in the user guides.

I ran into a problem.

I am using linear assembly code (files with the .sa file extension). [Note: I LOVE the TI linear assembly optimizing compiler!] But when used in the above described manner, it doesn't work. And here is my best explanation of why. The optimizer eliminates unused variables (which makes sense). However, the data structure declared in the .include file ends up getting eliminated, because that file 'sees' no allocation for that data structure. The data structure has been optimized out of existence. Then the compiler complains about unresolved references to the now-gone data structure.

I tried various solutions. One pseudo-solution was to make the memory allocation *known* to the including file, either by doing the memory allocation within the included file, or within the including file -- either way, the allocation is now visible within the including file, so the compiler makes no complaint. But both those are pseudo-solutions, because when this is done multiple times (for multiple including files) it ends up allocating memory for the same data structure many times over.

I haven't yet found a solution. Any solutions?

over 13 years ago

0 George Mock over 13 years ago

TI__Guru**** 249860 points

I don't understand the problem you are trying to describe. Please show an example.

Thanks and regards,

-George

0 Walter Snafu over 13 years ago in reply to George Mock

Genius 3450 points

Okay, I created a simple example. There are two files: my_include_file.asm and my_test_file.sa

my_include_file.asm

MY_STRUCT_type .struct
fee            .word
fi             .word
foe            .word
fum            .word
MY_STRUCT_LEN .endstruct

MY_STRUCT      .tag MY_STRUCT_type
;              .bss MY_STRUCT, MY_STRUCT_LEN
               .bss MY_STRUCT, 16

(NOTE: Contrary to the manual, I couldn't get the compiler to cope with the structure length MY_STRUCT_LEN (see the commented-out .bss line above). So I had to hard-code the structure length, 16, which compiled, though it's less convenient. That issue is unrelated to what I point out below.)

******************************************************************

my_test_file.sa

            .include "my_include_file.asm"
My_proc:    .proc
            .reg fum
            LDW *+B14(MY_STRUCT.fum), fum
            .endproc fum

******************************************************************

The problem is the ".bss MY_STRUCT, 16" line in the include file above. It's the line that allocates memory for MY_STRUCT. The problem is that the compiler complains (and doesn't compile) if that line is absent. The compiler demands that the line be present, either within the included file (my_include_file.asm), or within the including file (my_test_file.sa).

Here is the problem. If that line is within the included file, then EVERY file that includes it will allocate memory for MY_STRUCT, which would be an unnecessary mistake. On the other hand, if that line isn't within the included file, then the compiler demands that it be within EVERY including file that accesses that data structure. Again, multiple files end up allocating memory for MY_STRUCT, which is the same unnecessary mistake.

0 George Mock over 13 years ago in reply to Walter Snafu

TI__Guru**** 249860 points

Write it this way ...

LDW *+B15(MY_STRUCT_type.fum), fum

That is, directly refer to the name for the structure type. As you have seen, you can also refer to the name of an actual instance of the structure. But that doesn't always make sense.

Thanks and regards,

-George

0 Walter Snafu over 13 years ago in reply to George Mock

Genius 3450 points

Georgem said:
Write it this way ...
LDW *+B15(MY_STRUCT_type.fum), fum

That is, directly refer to the name for the structure type. As you have seen, you can also refer to the name of an actual instance of the structure. But that doesn't always make sense.

George, I tried your suggestion and the compiler did not complain. However, I am baffled how it can be a general solution. MY_STRUCT_type is merely a TYPE, not an actual allocated place in memory. In fact, MY_STRUCT_type can be applied multiple times when allocating memory for DIFFERENT data. For example, MY_STRUCT, YOUR_STRUCT, and MARYs_STRUCT can all be declared as the type MY_STRUCT_type, and your implementation has no way of telling them apart.

I'll make a crude analogy. We can declare x, y, and z as "integer". But the command: LDW *+B15(integer), fum has no way of distinguishing which integer is to be loaded.

Can you clarify?

(Also, can you clarify why the compiler does not accept the structure length, that is, the parameter MY_STRUCT_LEN in my previous example?)

0 George Mock over 13 years ago in reply to Walter Snafu

TI__Guru**** 249860 points

In these instructions

LDW *+B14(MY_STRUCT.fum), fum
LDW *+B14(MY_STRUCT_type.fum), fum

the value in the () is the same. That value is the offset of fum within the structure. It is presumed that B14 holds the base address of such a structure.

It seems you presume B14 must hold the address of an explicitly named structure. But that is not the only case. What if there were an array of these structures, and this instruction appeared in a loop which iterates through that array? B14 is initialized to the base of the array, then incremented by the size of the structure for each iteration of the loop. In such a case, only (MY_STRUCT_type.fum) could possibly work.
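The same idea can be sketched in C, as an analogy only and not as TI assembler output. The example assumes the four .word members map to 32-bit integers; load_word and sum_fum are hypothetical helper names. The member offset comes from the type alone, and the base pointer decides which structure is actually read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* C stand-in for the .struct definition: four 32-bit words (assumption). */
typedef struct {
    int32_t fee, fi, foe, fum;
} MY_STRUCT_type;

/* Load the 32-bit word at base + byte_offset, roughly what
   LDW *+base(offset) does with a base register. */
static int32_t load_word(const void *base, size_t byte_offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Sum fum over an array by advancing the base by the structure size
   on every iteration; the offset of fum never changes. */
int32_t sum_fum(const MY_STRUCT_type *array, size_t count)
{
    const unsigned char *base = (const unsigned char *)array;
    int32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += load_word(base, offsetof(MY_STRUCT_type, fum));
        base += sizeof(MY_STRUCT_type);
    }
    return total;
}
```

Here offsetof(MY_STRUCT_type, fum) plays the role of (MY_STRUCT_type.fum): it is a constant that needs no allocated instance.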

Thanks and regards,

-George

0 Walter Snafu over 13 years ago in reply to George Mock

Genius 3450 points

Georgem said:

In these instructions

LDW *+B14(MY_STRUCT.fum), fum
LDW *+B14(MY_STRUCT_type.fum), fum

the value in the () is the same. That value is the offset of fum within the structure. It is presumed that B14 holds the base address of such a structure.

It seems you presume B14 must hold the address of an explicitly named structure. But that is not the only case. What if there were an array of these structures, and this instruction appeared in a loop which iterates through that array? B14 is initialized to the base of the array, then incremented by the size of the structure for each iteration of the loop. In such a case, only (MY_STRUCT_type.fum) could possibly work.

Georgem,

Thanks for your help. To be helpful to TI, I ought to explain the source of my confusion. The TMS320C6000 Assembly Language Tools v 7.3, section 4.8, page 76, gives the following as its only example:

COORDT  .struct                 ; structure tag definition
X       .byte                   ;
Y       .byte
T_LEN   .endstruct

COORD   .tag COORDT             ; declare COORD (coordinate)
        .bss COORD, T_LEN       ; actual memory allocation

        LDB *+B14(COORD.Y), A2  ; move member Y of structure
                                ; COORD into register A2

In this example, in the load instruction (LDB), I'm getting two cues. (1) It uses the name of the allocated space (COORD), rather than the name of the structure type (COORDT). (2) Just as importantly here, it does NOT show the B14 register pre-loaded with any special value. That is, it does not show B14 pre-loaded with the address of COORD.

From that example in the manual, I surmised that B14 must be pre-loaded somewhere else. Aha, probably by the compiler/assembler/linker(!), because they 'know' where these data structures are allocated in memory. That is, I do not have to pre-load B14, because the manual's example does not pre-load B14 -- it is taken care of automagically by the compiler/assembler/linker.

I further assumed (apparently incorrectly) the compiler/assembler/linker takes care of pre-loading the B14 register (a special register set aside for such things) with the base address of the BSS segment (into which we allocate our data structures), and that way ALL of our data structures can be addressed the same simple way -- without having to re-load B14 with the base address of each data structure. In other words, I could use the following loads in rapid succession:

LDB *+B14(MY_STRUCT.Y), A1
LDB *+B14(YOUR_STRUCT.Y), A2
LDB *+B14(MARYs_STRUCT.Y), A3

I thought that a reasonable interpretation, given the manual's description of it.

Georgem, thanks for your help correcting the situation. I see now that I must re-load B14 with the base address of each data structure I use. (Note to TI: Perhaps the next version of the manual can show that in the example.)
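A rough C analogy of what I now understand, offered only as a sketch: the structure layout mirrors COORDT from the manual's example, while the initial values and the helper name load_Y are hypothetical. One member offset serves every instance, and the caller has to supply the base address of the instance it wants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C stand-in for the manual's COORDT structure: two byte-sized members. */
typedef struct {
    int8_t X;
    int8_t Y;
} COORDT;

/* Three separately allocated instances of the same type (values made up). */
COORDT MY_STRUCT    = { 1, 10 };
COORDT YOUR_STRUCT  = { 2, 20 };
COORDT MARYs_STRUCT = { 3, 30 };

/* Rough analogue of LDB *+base(COORDT.Y): the offset is fixed by the type,
   and the base passed in selects which instance is read. */
int8_t load_Y(const void *base)
{
    const int8_t *bytes = (const int8_t *)base;
    return bytes[offsetof(COORDT, Y)];
}
```

Calling load_Y(&MY_STRUCT), load_Y(&YOUR_STRUCT), and load_Y(&MARYs_STRUCT) reads three different bytes with the same offset, which corresponds to reloading the base register before each load.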

Code Composer Studio™︎

Code Composer Studio forum
