AM2632: I encountered a problem while copying the example and putting the code into TCMA

Part Number: AM2632
Other Parts Discussed in Thread: SYSCONFIG

I want to place the calculation code in the TCMA, and I made the following modifications based on the example routine:

Added code:

.TI.local : {} >> R5F_TCMA | R5F_TCMB | OCRAM 

.TI.onchip : {} >> OCRAM | FLASH

.TI.offchip : {} > FLASH

Placed before the following code:

/* Sections needed for C++ projects */

GROUP {

.ARM.exidx: {} palign(8) /* Needed for C++ exception handling */

.init_array: {} palign(8) /* Contains function pointers called before main */

.fini_array: {} palign(8) /* Contains function pointers called after main */

} > OCRAM

When writing the function, declare it as: 

(Because my program is currently using fixed-point calculations, converting to floating-point would be a massive undertaking, so I am temporarily using the DSP's IQmath library on AM2632)

_iq24 __attribute__((local(1))) _IQ24mpy1(_iq24 A, _iq24 B)
{
    _iq24 result;
    __asm__ volatile(
        "smull   r2, r3, %1, %2\n"
        "lsr     r2, r2, #24\n"
        "lsr     r3, r3, #8\n"
        "add     %0, r2, r3\n"
        : "=r" (result)
        : "r" (A), "r" (B)
        : "r2", "r3", "cc"
    );
    return result;
}
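
For checking results off-target, a plain C reference for the Q24 multiply can be handy. The sketch below is an illustration under stated assumptions, not TI's IQmath implementation: `iq24_t`, `iq24_mpy_ref`, and `iq24_mpy_from_halves` are hypothetical names, and `iq24_t` stands in for `_iq24` as a 32-bit signed integer. It keeps bits [55:24] of the 64-bit product, first directly and then rebuilt from the two 32-bit halves that `smull` produces.

```c
#include <stdint.h>

typedef int32_t iq24_t; /* hypothetical stand-in for _iq24 (assumed 32-bit signed) */

/* Q24 multiply: form the 64-bit product and keep bits [55:24].
 * Relies on arithmetic right shift of negative values, which GCC and Clang provide. */
static iq24_t iq24_mpy_ref(iq24_t a, iq24_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (iq24_t)(product >> 24);
}

/* Same result rebuilt from the low and high 32-bit words of the product,
 * i.e. the two registers that SMULL writes. */
static iq24_t iq24_mpy_from_halves(iq24_t a, iq24_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    uint32_t lo = (uint32_t)product;                    /* bits [31:0]  */
    uint32_t hi = (uint32_t)((uint64_t)product >> 32);  /* bits [63:32] */
    return (iq24_t)((lo >> 24) | (hi << 8));            /* bits [55:24] */
}
```

In the second form the high word is shifted left by 8, whereas the inline assembly above shifts it right. As far as I can tell, the two agree only when the high word is zero (for example 5 * 5), so it may be worth comparing against the reference with full-scale Q24 operands.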

When I run it, the code still executes in RAM rather than in TCMA, which can be verified by checking the addresses through disassembly.

_IQ24mpy1():
7004dd74: FB802307 smull r2, r3, r0, r7
7004dd78: EA4F6212 lsr.w r2, r2, #0x18
7004dd7c: EA4F2313 lsr.w r3, r3, #8
7004dd80: EB020003 add.w r0, r2, r3
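
Besides reading the disassembly, the same check can be done in code. The helper below is only a sketch: `addr_in_region`, `TCMA_BASE`, and `TCMA_SIZE` are hypothetical names, and the base and size are assumptions (0x00000000 and 32 KiB) that should be taken from the `R5F_TCMA` entry in the project's linker command file.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCMA_BASE 0x00000000u /* assumption: take from the MEMORY entry for R5F_TCMA */
#define TCMA_SIZE 0x00008000u /* assumption: 32 KiB, adjust to the actual region length */

/* Returns true when addr lies inside [base, base + size). */
static bool addr_in_region(uintptr_t addr, uintptr_t base, uintptr_t size)
{
    return addr >= base && (addr - base) < size;
}
```

With these assumed bounds, the address 0x7004dd74 from the disassembly above falls outside the region, while 0x00000040 (where `.TI.local` starts in the example's map file) falls inside it. On the target, something like `addr_in_region((uintptr_t)&_IQ24mpy1, TCMA_BASE, TCMA_SIZE)` could be used; note that a Thumb function address has bit 0 set.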

I compared the linker.cmd of the example and my project, and they are almost identical. The MAP file generated by my debug build does not contain the function _IQ24mpy1, whereas the map file generated by the example routine contains the following code segment.

.irqstack 
*          0    70057a08    00000100     UNINITIALIZED
                  70057a08    00000100     --HOLE--

.fiqstack 
*          0    70057b08    00000100     UNINITIALIZED
                  70057b08    00000100     --HOLE--

.svcstack 
*          0    70057c08    00001000     UNINITIALIZED
                  70057c08    00001000     --HOLE--

.abortstack 
*          0    70058c08    00000100     UNINITIALIZED
                  70058c08    00000100     --HOLE--

.undefinedstack 
*          0    70058d08    00000100     UNINITIALIZED
                  70058d08    00000100     --HOLE--

.init_array 
*          0    70040000    00000000     UNINITIALIZED

.TI.local 
*          0    00000040    00001c06     
                  00000040    00000802     basic_smart_placement.o (.text.annotated_function_f2)
                  00000842    00001002     basic_smart_placement.o (.text.annotated_function_f3)
                  00001844    00000402     basic_smart_placement.o (.text.annotated_function_f4)

My map file contains the following:

.irqstack 
*          0    70059ca0    00000100     UNINITIALIZED
                  70059ca0    00000100     --HOLE--

.fiqstack 
*          0    70059da0    00000100     UNINITIALIZED
                  70059da0    00000100     --HOLE--

.svcstack 
*          0    70059ea0    00001000     UNINITIALIZED
                  70059ea0    00001000     --HOLE--

.abortstack 
*          0    7005aea0    00000100     UNINITIALIZED
                  7005aea0    00000100     --HOLE--

.undefinedstack 
*          0    7005afa0    00000100     UNINITIALIZED
                  7005afa0    00000100     --HOLE--

.init_array 
*          0    70040000    00000000     UNINITIALIZED

.trigText 
*          0    00000040    00000594     
                  00000040    00000594     mathlib.am263x.r5f.ti-arm-clang.release.lib : ti_arm_trig.obj (.trigText)

.trigData 
*          0    000005d8    000000b0     
                  000005d8    000000b0     mathlib.am263x.r5f.ti-arm-clang.release.lib : ti_arm_trig.obj (.trigData)

Why? How can I put my code in TCMA?

  • Hi Zhou, 
    Could you please tell me which SDK version and SysConfig version you are using?

  • Hi Nilabh,

    9.2.0.56 and 1.20.0

  • Hi Nilabh,

    I'll try it today.

  • Hi Nilabh,

         I referred to the link you provided and made changes almost identical to those in the example routine, but it still doesn't work.

         I made some modifications in the CMD file, defining the regions trigData and trigText used in the ti_arm library, and found that I could arbitrarily configure them to be in TCMA or TCMB. However, my own defined .armiqmath would not be generated in TCM, even though I tried all code optimization levels.

         I suspect that it might be related to the ti_arm library. If I allocate trigText and trigData in TCMB, the code will throw an exception when it reaches the call to the ti_arm_sin function. Upon inspection, the assembly address jumps to a location in TCMA, where there are no trigText and trigData. The code only runs normally if trigText and trigData are allocated in TCMA or not allocated at all.

        .vectors  : {    } > R5F_VECS   , palign(8)
    
    	GROUP : {
    
    
    	}  > R5F_TCMA
    
    	GROUP : {
    	.trigData : {} palign(8)
    	.trigText :{}palign (8)
    	.armiqmath : {    } palign(8)
    	}  > R5F_TCMB
    
        GROUP  :   {
        .text.hwi : {    } palign(8)
        .text.cache : {    } palign(8)
        .text.mpu : {    } palign(8)
        .text.boot : {    } palign(8)
        .text:abort : {    } palign(8)
        } > OCRAM
        
        
        
    static inline __attribute__((__section__(".armiqmath"),always_inline))_iq24 _IQ24mpy1(_iq24 A, _iq24 B)
    {
        _iq24 result;
        __asm__ volatile(
            "smull   r2, r3, %1, %2\n"
            "lsr     r2, r2, #24\n"
            "lsr     r3, r3, #8\n"
            "add     %0, r2, r3\n"
            : "=r" (result)
            : "r" (A), "r" (B)
            : "r2", "r3", "cc"
        );
        return result;
    }

          I have provided my project, using CCS12.7.0. Additionally, if I want to place all used iqmath (Q=24) functions (multiplication, division, sin, cos, square root) in TCMA, how should I proceed?

  • Hi Zhou,

    A couple of things to confirm:

    1. Does the armiqmath code you wrote work properly when executed from MSRAM (or anywhere other than TCM)? This is just to confirm that the code itself has no issues.

    2. Which example from the SDK are you using?

    3. Can you send me the whole map file for both the working and non-working cases?

  • Hi Nilabh, 

           Apologies for the delayed response due to the holiday. 

           1. Yes, it works properly in OCRAM, and my function returns the correct result.

           2. I did not use an example. I used my own project. The example I compared against is basic_smart_placement in \examples\kernel\nortos\basic_smart_placement.

           3. I will send the map file to the TI engineers to see if they can forward it to you. This map file is from the non-working case. No matter how I modify the project, I cannot allocate the custom 'armiqmath' section into TCMA, so I do not have a working map file either.

  • Hi Zhou,

    I will try to get back to you on this by Wednesday next week.

  • static __attribute__((__section__(".text.armiqmath")))_iq24 _IQ24mpy1(_iq24 A, _iq24 B)
    {
    _iq24 result;
    __asm__ volatile(
    "smull r2, r3, %1, %2\n"
    "lsr r2, r2, #24\n"
    "lsr r3, r3, #8\n"
    "add %0, r2, r3\n"
    : "=r" (result)
    : "r" (A), "r" (B)
    : "r2", "r3", "cc"
    );
    return result;
    }

    Hi Zhou,

    I was able to get this function into TCM with the syntax above.

    I would also recommend reading this for more details on the compiler's inline feature and section placement.

    Also, when the function is inlined, it does not appear in the map file, as described at the link below:

    software-dl.ti.com/.../function_attributes.html

  • However, my own defined .armiqmath would not be generated in TCM, even though I tried all code optimization levels.

    If you remove the inline, you will be able to see it in the map file. When a function is inlined, each call is replaced with the function's assembly code, so there is no standalone function left to list.
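
A minimal sketch of keeping a small function out of line so that it exists as a separate, placeable function. This is an assumption on my part rather than something confirmed in this thread: `noinline` is a GCC/Clang function attribute, and `TCM_CODE`, `iq24_t`, `iq24_mpy_outlined`, and `scale_by_half` are hypothetical names. The section name `.text.armiqmath` is taken from the working snippet above; the linker command file still has to map that section to TCMA.

```c
#include <stdint.h>

typedef int32_t iq24_t; /* hypothetical stand-in for _iq24 */

/* On ELF targets, request a named section and forbid inlining so the
 * function stays a distinct symbol. Elsewhere, only forbid inlining. */
#if defined(__ELF__)
#define TCM_CODE __attribute__((noinline, section(".text.armiqmath")))
#else
#define TCM_CODE __attribute__((noinline))
#endif

/* Q24 multiply in plain C; the compiler chooses the instructions. */
TCM_CODE iq24_t iq24_mpy_outlined(iq24_t a, iq24_t b)
{
    return (iq24_t)(((int64_t)a * (int64_t)b) >> 24);
}

/* Caller: multiplies x by 0.5 in Q24 through a real call, not an inlined copy. */
iq24_t scale_by_half(iq24_t x)
{
    return iq24_mpy_outlined(x, (iq24_t)1 << 23);
}
```

Whether this changes the placement in a given project is best confirmed by looking for the function's `.text.armiqmath` entry in the map file after a build at the optimization level in question.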

  •      I suspect that it might be related to the ti_arm library. If I allocate trigText and trigData in TCMB, the code will throw an exception when it reaches the call to the ti_arm_sin function. Upon inspection, the assembly address jumps to a location in TCMA, where there are no trigText and trigData. The code only runs normally if trigText and trigData are allocated in TCMA or not allocated at all.

    This is not very clear to me, but what I can say is that trigText and trigData are just linker groups, so you would not see them in the disassembly.

  • Hi Nilabh,

        As described in the email, even after removing inline and always_inline, I still did not get the desired result. After compiling, I checked the Memory Allocation, but there was still no content in TCMA.

        I used the following software environment: CCS 12.8.0, SysConfig 1.21.999, SDK 10.0.0.37, optimization level: s.

        Additionally, I made the following modifications to the example program basic_smart_placement at SDKPATH\examples\kernel\nortos\basic_smart_placement.

        I changed annotated_function_f2 so that it accepts parameters and returns a value:

    int __attribute__((local(2))) annotated_function_f2(int A, int B)
    {
        int result;
        __asm__ volatile(
            "smull   r2, r3, %1, %2\n"
            "lsr     r2, r2, #24\n"
            "lsr     r3, r3, #8\n"
            "add     %0, r2, r3\n"
            : "=r" (result)
            : "r" (A), "r" (B)
            : "r2", "r3", "cc"
        );
        return result;
    }

    I made almost no changes to the other parts, only constructed variables to ensure that the compilation could successfully pass.

    void __attribute__((local(1))) annotated_function_f1(void)
    {
        calRes = annotated_function_f2(5, 5);
        annotated_function_f3();
        annotated_function_f4();
    }
    

    After compiling, I checked the Memory Allocation and found that the modified annotated_function_f2 was not present in TCMA.

    The functions I did not modify, annotated_function_f3 and annotated_function_f4, are present in the TCMA.

    Similarly, I was unable to find the functions annotated_function_f1 and annotated_function_f2 in the map file; only annotated_function_f3 and annotated_function_f4 were present, which indeed are included in TCMA.

    .init_array 
    *          0    70040000    00000000     UNINITIALIZED
    
    .TI.local 
    *          0    00000040    00001404     
                      00000040    00001002     basic_smart_placement.o (.text.annotated_function_f3)
                      00001042    00000402     basic_smart_placement.o (.text.annotated_function_f4)
    
    MODULE SUMMARY
    
           Module                                code    ro data   rw data
           ------                                ----    -------   -------
        .\
           basic_smart_placement.o               12454   132       40     
           main.o                                28      0         0      
        +--+-------------------------------------+-------+---------+---------+
           Total:                                12482   132       40     

    If I switch the optimization level from s to off, without changing the code, annotated_function_f2 reappears. Why is that?

    .init_array 
    *          0    70040000    00000000     UNINITIALIZED
    
    .TI.local.1 
    *          0    00000040    00000042     
                      00000040    0000001e     basic_smart_placement.o (.text.annotated_function_f1)
                      0000005e    00000002     --HOLE-- [fill = 0]
                      00000060    00000022     basic_smart_placement.o (.text.annotated_function_f2)
    
    .TI.local.2 
    *          0    00000090    000063fe     
                      00000090    00004ffe     basic_smart_placement.o (.text.annotated_function_f3)
                      0000508e    00000002     --HOLE-- [fill = 0]
                      00005090    000013fe     basic_smart_placement.o (.text.annotated_function_f4)
    
    MODULE SUMMARY
    
           Module                                code    ro data   rw data
           ------                                ----    -------   -------
        .\
           basic_smart_placement.o               61622   132       40     
           main.o                                38      0         0      
        +--+-------------------------------------+-------+---------+---------+
           Total:                                61660   132       40     
                                                                          

     If the issue cannot be resolved quickly, can you provide an example program that includes functions allocated in TCMA? I should be able to freely modify the function content without affecting memory allocation. Alternatively, here is my assembly code; please provide an example program that places it in TCMA for execution, where I can call it and pass values from outside.

     static __attribute__((__section__(".armiqmath")))_iq24 _IQ24mpy_test(_iq24 A, _iq24 B)
    {
        _iq24 result;
        __asm__ volatile(
            "smull   r2, r3, %1, %2\n"
            "lsr     r2, r2, #24\n"
            "lsr     r3, r3, #8\n"
            "add     %0, r2, r3\n"
            : "=r" (result)
            : "r" (A), "r" (B)
            : "r2", "r3", "cc"
        );
        return result;
    }
    const float f2iqScale = 16777216.0f;
    const float iq2fScale = 0.000000059604645f;
          static __attribute__((__section__(".armiqmath")))_iq24  _IQdiv_test(_iq24 iqDividend, _iq24 iqDivisor)
    {
        _iq24 result;
        __asm__ volatile (
            "vmov    s0, %1                \n"
            "vmov    s1, %2                \n"
            "vcvt.f32.s32   s0, s0         \n"
            "vcvt.f32.s32   s1, s1         \n"
            "vdiv.f32       s2, s0, s1     \n"
            "vldr.f32       s3, [%3]       \n"
            "vmul.f32       s2, s2, s3     \n"
            "vcvt.s32.f32   s2, s2         \n"
            "vmov           %0, s2         \n"
            : "=r" (result)
            : "r" (iqDividend), "r" (iqDivisor), "r" (&f2iqScale)
            : "s0", "s1", "s2", "s3", "cc", "memory"
        );
        return result;
    }
    static __attribute__((__section__(".armiqmath")))_iq24 _IQ24sin_test1(const _iq24 inputIq24Num)
    {
        _iq24 iqSinResult;
        float floatSinResult;
        float inputFloatNum;
        __asm__ volatile (
            "vmov    s0, %1           \n"
            "vcvt.f32.s32 s0, s0      \n"
            "vmov.f32 s1, %2          \n"
            "vmul.f32 s0, s0, s1      \n"
            "vmov    %0, s0           \n"
            : "=r" (inputFloatNum)
            : "r" (inputIq24Num), "r" (iq2fScale)
            : "s0", "s1", "cc", "memory"
        );
        floatSinResult = ti_arm_sin(inputFloatNum);
        __asm__ volatile (
            "vmov    s0, %1               \n"
            "vldr    s1, [%2]             \n"
            "vmul.f32   s0, s0, s1        \n"
            "vcvt.s32.f32 s0, s0          \n"
            "vmov    %0, s0               \n"
            : "=r" (iqSinResult)
            : "r" (floatSinResult), "r" (&f2iqScale)
            : "s0", "s1", "cc", "memory"
        );
        return iqSinResult;
    }
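
For comparison, the conversion and division steps above can be written in plain C, which gives a host-side reference to test the assembly against. This is a sketch under assumptions: `iq24_t`, `iq24_to_float`, `float_to_iq24`, and `iq24_div_ref` are hypothetical names, `iq24_t` stands in for `_iq24`, and the float-to-integer casts truncate toward zero, which should match the `vcvt.s32.f32` instruction used above.

```c
#include <stdint.h>

typedef int32_t iq24_t; /* hypothetical stand-in for _iq24 */

#define IQ24_ONE_F 16777216.0f /* 2^24, same value as f2iqScale above */

/* Q24 fixed point to float. */
static float iq24_to_float(iq24_t x)
{
    return (float)x * (1.0f / IQ24_ONE_F);
}

/* Float to Q24 fixed point; the cast truncates toward zero. */
static iq24_t float_to_iq24(float f)
{
    return (iq24_t)(f * IQ24_ONE_F);
}

/* Q24 divide via single-precision float, mirroring the steps of _IQdiv_test:
 * convert both operands, divide, rescale by 2^24, convert back. */
static iq24_t iq24_div_ref(iq24_t dividend, iq24_t divisor)
{
    float quotient = (float)dividend / (float)divisor;
    return (iq24_t)(quotient * IQ24_ONE_F);
}
```

Because a 32-bit float carries only 24 bits of mantissa, results for operands that use more than 24 significant bits are approximate in both this reference and the assembly version.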
    

  • Hi Zhou,

    I am attaching a project with code placed in TCM A. When I build it in debug mode, I can see the code being placed in TCM A and also inlined.

    I do not see the same in release mode. I believe the reason could be optimization, which I am checking with the compiler team.

    I will update you on the second issue by early next week. Meanwhile, please check the example project.

    /cfs-file/__key/communityserver-discussions-components-files/908/code_5F00_placement_5F00_TCMA.zip