Compiler/TMS570LS1227: Compiler v16.9.8.LTS Optimisation Level 3 Unrolling 'for' loop failure

Part Number: TMS570LS1227

Tool/software: TI C/C++ Compiler

Hi,

I've recently updated my CCS project to run compiler v16.9.8.LTS from v5.2.7.

Immediately evident is that a group of my functions now fail functionally with optimisation set "-O3 --opt_for_speed=3". Everything works correctly with optimisation set "-O3 --opt_for_speed=2".

Comparing the optimisation report (.NFO) files for the two builds shows that the only differences between the two speed/size settings concern the unrolling of 'for' loops within these functions. These functions perform a lot of one-dimensional array manipulation, on both constant and variable arrays.

Inspection of the v16.9.8.LTS open defects report does not indicate this as a known issue.

I can obviously work around this by using "-O3 --opt_for_speed=2" for these functions, but this does not allay my fear that this issue exists elsewhere within my codebase.
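For readers wanting to scope that workaround to individual functions rather than a whole source file, the following is a minimal sketch. It assumes the FUNCTION_OPTIONS pragma described in the TI compiler documentation is available in this release and accepts --opt_for_speed; the function xor16 is a hypothetical stand-in for the affected array routines, not code from the project. The pragma is guarded so that other compilers simply skip it.

```c
/* Hypothetical helper, not from the original project: a 16-byte XOR loop
   of the kind the affected functions are described as containing. */

#ifdef __TI_COMPILER_VERSION__
/* Assumption: FUNCTION_OPTIONS is honoured by this compiler release.
   In C the pragma has to appear before any declaration of the function. */
#pragma FUNCTION_OPTIONS(xor16, "--opt_for_speed=2")
#endif
void xor16(unsigned char dst[], const unsigned char src[]);

void xor16(unsigned char dst[], const unsigned char src[])
{
    unsigned char i;

    /* XOR each source byte into the destination, one element at a time. */
    for (i = 0U; i < 16U; i++)
    {
        dst[i] ^= src[i];
    }
}
```

This only narrows where the lower setting applies; it does not show whether other loops in the codebase are affected.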

Can you please advise?

Regards, Tony.

  • Can you identify one function, or better yet one loop, which computes the wrong result when built with -O3 --opt_for_speed=3?  If so, for the source file which contains that function or loop, please send in a test case as described in the article How to Submit a Compiler Test Case.  While we cannot run the code, we can often identify the problem by inspecting the compiler generated assembly code.

    Thanks and regards,

    -George

  • Hi George,

    Attached is my source file as requested.

    I'm sorry that the scenario functions are not simpler, but I have been unsuccessful in reducing the remaining complexity without the issue disappearing.

    The compiler is CGT v16.9.8.LTS.

    Compile options are:

    'Building file: ../Tony.c'
    'Invoking: ARM Compiler'
    "C:/ti/ccsv7/tools/compiler/ti-cgt-arm_16.9.8.LTS/bin/armcl" --cmd_file="C:\work\PD_MV\Firmware\BRANCH\RamBlDev_AJM\Compiler.opt" -mv7R4 --code_state=32 --float_support=VFPv3D16 -O3 --opt_for_speed=2 --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/Release" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/Ancillary" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/PdBootLoader" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/DriverSoftware/CommunicationDrivers" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/DriverSoftware/ComplexDrivers" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/DriverSoftware/InputOutputDrivers" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/DriverSoftware/MemoryDrivers" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/DriverSoftware/MicrocontrollerDrivers" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ServiceSoftware/CommunicationServices" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ServiceSoftware/InputOutputServices" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ServiceSoftware/MemoryServices" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ServiceSoftware/SystemServices" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ToolChain/include" --include_path="C:/work/PD_MV/Firmware/BRANCH/RamBlDev_AJM/ToolChain/Rtl" --define=DET_ENABLED --define=__TI_EABI__ -g --plain_char=unsigned --preproc_with_comment --preproc_with_compile --diag_remark=97 --diag_suppress=552 --diag_warning=225 --display_error_number --gen_func_subsections=on --enum_type=packed --abi=eabi --asm_listing --c_src_interlist --gen_opt_info=2 "../Tony.c"
    'Finished building: ../Tony.c'

    The program Tony_Main() is called as follows:

    static unsigned char arrayTony16[16] = {0x00U, 0x11U, 0x22U, 0x33U, 0x44U, 0x55U, 0x66U, 0x77U, 0x88U, 0x99U, 0xAAU, 0xBBU, 0xCCU, 0xDDU, 0xEEU, 0xFFU};
    static unsigned char arrayTony48[48] = {0xB8U, 0xEAU, 0x30U, 0xF3U, 0x60U, 0x01U, 0x95U, 0xE5U, 0x8EU, 0x9BU, 0x97U, 0x1AU, 0xC0U, 0x4BU, 0x32U, 0x5FU,
    0xBFU, 0x40U, 0xC8U, 0x16U, 0x43U, 0x5BU, 0xF7U, 0xC0U, 0xD9U, 0x08U, 0xE8U, 0xB6U, 0x91U, 0x8AU, 0x0FU, 0x29U,
    0x55U, 0xB6U, 0x3AU, 0xF0U, 0xA2U, 0x69U, 0x7EU, 0xA2U, 0x21U, 0x06U, 0xE9U, 0xCEU, 0x7AU, 0x98U, 0x08U, 0xACU};

    void main(void)
    {
    Tony_Main(&arrayTony16[0], &arrayTony48[0]);
    }

    With -O3 --opt_for_speed=2, resultant arrayTony16 =

    {0x45U, 0x03U, 0xDBU, 0xD1U, 0x61U, 0x2FU, 0x86U, 0xC0U, 0x73U, 0x72U, 0x7CU, 0x38U, 0xC1U, 0x65U, 0x21U, 0x7AU}

    With -O3 --opt_for_speed=3, resultant arrayTony16 =

    {0x9EU, 0xF1U, 0xDBU, 0xD1U, 0x61U, 0x2FU, 0x86U, 0xC0U, 0xA8U, 0x80U, 0x7CU, 0x38U, 0xC1U, 0x65U, 0x21U, 0x7AU}

    In view of this issue, I am minded to revert to CGT v5.2.9 which I believe is also long term support.

    Regards, Tony

    void Tony_Main(unsigned char nDataPtr[], const unsigned char nExtPtr[]);
    
    
    
    static void Tony_Munge1(unsigned char nMunge1Ptr[], const unsigned char nPtr[]);
    static void Tony_Munge2(unsigned char nMunge2Ptr[]);
    static void Tony_Munge3(unsigned char nMunge3Ptr[]);
    
    void Tony_Main(
        unsigned char nDataPtr[],
        const unsigned char nExtPtr[])
    {
        signed char nIndex;
    
        Tony_Munge1(nDataPtr, &nExtPtr[2U * (16U)]);
        Tony_Munge2(nDataPtr);
    
        for (nIndex = 1; nIndex >= 0; nIndex--)
        {
            Tony_Munge1(nDataPtr, &nExtPtr[((unsigned char)nIndex * (16U))]);
            if (0 != nIndex)
            {
                Tony_Munge3(nDataPtr);
            }
        }
    }
    
    static void Tony_Munge1(
        unsigned char nMunge1Ptr[],
        const unsigned char nPtr[])
    {
        unsigned char nIndex;
    
        for (nIndex = 0U; nIndex < (16U); nIndex++)
        {
            (nMunge1Ptr[nIndex]) ^= nPtr[nIndex];
        }
    }
    static void Tony_Munge2(
        unsigned char nMunge2Ptr[])
    {
        unsigned char nTemp;
    
        nTemp = nMunge2Ptr[0];
        nMunge2Ptr[0] = nMunge2Ptr[4];
        nMunge2Ptr[4] = nMunge2Ptr[8];
        nMunge2Ptr[8] = nMunge2Ptr[12];
        nMunge2Ptr[12] = nTemp;
        nTemp = nMunge2Ptr[1];
        nMunge2Ptr[1] = nMunge2Ptr[5];
        nMunge2Ptr[5] = nMunge2Ptr[9];
        nMunge2Ptr[9] = nMunge2Ptr[13];
        nMunge2Ptr[13] = nTemp;
        nTemp = nMunge2Ptr[2];
        nMunge2Ptr[2] = nMunge2Ptr[6];
        nMunge2Ptr[6] = nMunge2Ptr[10];
        nMunge2Ptr[10] = nMunge2Ptr[14];
        nMunge2Ptr[14] = nTemp;
        nTemp = nMunge2Ptr[3];
        nMunge2Ptr[3] = nMunge2Ptr[7];
        nMunge2Ptr[7] = nMunge2Ptr[11];
        nMunge2Ptr[11] = nMunge2Ptr[15];
        nMunge2Ptr[15] = nTemp;
    }
    
    static void Tony_Munge3(
        unsigned char nMunge3Ptr[])
    {
        unsigned char nTemp[(16U)];
        unsigned char nIndex;
    
        nTemp[15] = nMunge3Ptr[0] ^ nMunge3Ptr[1];
        nTemp[14] = nMunge3Ptr[2] ^ nMunge3Ptr[3];
        nTemp[13] = nMunge3Ptr[4] ^ nMunge3Ptr[5];
        nTemp[12] = nMunge3Ptr[6] ^ nMunge3Ptr[7];
    
        nTemp[11] = nMunge3Ptr[8] ^ nMunge3Ptr[9];
        nTemp[10] = nMunge3Ptr[10] ^ nMunge3Ptr[11];
        nTemp[9]  = nMunge3Ptr[12] ^ nMunge3Ptr[13];
        nTemp[8]  = nMunge3Ptr[14] ^ nMunge3Ptr[15];
    
        nTemp[7]  = nMunge3Ptr[0] ^ nMunge3Ptr[1];
        nTemp[6]  = nMunge3Ptr[2] ^ nMunge3Ptr[3];
        nTemp[5]  = nMunge3Ptr[4] ^ nMunge3Ptr[5];
        nTemp[4]  = nMunge3Ptr[6] ^ nMunge3Ptr[7];
    
        nTemp[3]  = nMunge3Ptr[8] ^ nMunge3Ptr[9];
        nTemp[2]  = nMunge3Ptr[10] ^ nMunge3Ptr[11];
        nTemp[1]  = nMunge3Ptr[12] ^ nMunge3Ptr[13];
        nTemp[0]  = nMunge3Ptr[14] ^ nMunge3Ptr[15];
    
        for (nIndex = 0U; nIndex < (16U); nIndex++)
        {
            nMunge3Ptr[nIndex] = nTemp[nIndex];
        }
    }
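As a cross-check on which of the two results is the intended one, here is a compact reference model of the same arithmetic. It is a sketch only: the names ref_xor, ref_rotate, ref_fold, ref_main and ref_selfcheck are hypothetical and not part of Tony.c, and the loops are a restatement of Tony_Munge1 to Tony_Munge3 rather than the original code. Stepping through the source by hand gives the same bytes as the -O3 --opt_for_speed=2 output, which is what the model is compared against.

```c
#include <string.h>

#define BLOCK 16U

/* XOR one 16-byte block into the data (mirrors Tony_Munge1). */
static void ref_xor(unsigned char d[], const unsigned char k[])
{
    unsigned int i;
    for (i = 0U; i < BLOCK; i++) {
        d[i] ^= k[i];
    }
}

/* Rotate elements c, c+4, c+8, c+12 by one place (mirrors Tony_Munge2). */
static void ref_rotate(unsigned char d[])
{
    unsigned int c;
    for (c = 0U; c < 4U; c++) {
        unsigned char t = d[c];
        d[c]       = d[c + 4U];
        d[c + 4U]  = d[c + 8U];
        d[c + 8U]  = d[c + 12U];
        d[c + 12U] = t;
    }
}

/* XOR adjacent pairs and store each result twice, in reverse order
   (mirrors Tony_Munge3). */
static void ref_fold(unsigned char d[])
{
    unsigned char t[BLOCK];
    unsigned int k;
    for (k = 0U; k < 8U; k++) {
        unsigned char x = (unsigned char)(d[2U * k] ^ d[2U * k + 1U]);
        t[15U - k] = x;
        t[7U - k]  = x;
    }
    memcpy(d, t, sizeof t);
}

/* Same call sequence as Tony_Main: blocks 2, 1, 0 of the 48-byte array. */
static void ref_main(unsigned char d[], const unsigned char ext[])
{
    ref_xor(d, &ext[32]);
    ref_rotate(d);
    ref_xor(d, &ext[16]);
    ref_fold(d);
    ref_xor(d, &ext[0]);
}

/* Returns 0 when the model reproduces the --opt_for_speed=2 result. */
int ref_selfcheck(void)
{
    unsigned char data[16] = {
        0x00U, 0x11U, 0x22U, 0x33U, 0x44U, 0x55U, 0x66U, 0x77U,
        0x88U, 0x99U, 0xAAU, 0xBBU, 0xCCU, 0xDDU, 0xEEU, 0xFFU};
    static const unsigned char ext[48] = {
        0xB8U, 0xEAU, 0x30U, 0xF3U, 0x60U, 0x01U, 0x95U, 0xE5U,
        0x8EU, 0x9BU, 0x97U, 0x1AU, 0xC0U, 0x4BU, 0x32U, 0x5FU,
        0xBFU, 0x40U, 0xC8U, 0x16U, 0x43U, 0x5BU, 0xF7U, 0xC0U,
        0xD9U, 0x08U, 0xE8U, 0xB6U, 0x91U, 0x8AU, 0x0FU, 0x29U,
        0x55U, 0xB6U, 0x3AU, 0xF0U, 0xA2U, 0x69U, 0x7EU, 0xA2U,
        0x21U, 0x06U, 0xE9U, 0xCEU, 0x7AU, 0x98U, 0x08U, 0xACU};
    static const unsigned char expected[16] = {
        0x45U, 0x03U, 0xDBU, 0xD1U, 0x61U, 0x2FU, 0x86U, 0xC0U,
        0x73U, 0x72U, 0x7CU, 0x38U, 0xC1U, 0x65U, 0x21U, 0x7AU};

    ref_main(data, ext);
    return memcmp(data, expected, sizeof expected);
}
```

Calling ref_selfcheck() from a host build, or from a target build at a lower optimisation setting, and checking for a zero return is one way to use it. The model says nothing about why the -O3 --opt_for_speed=3 build differs in elements 0, 1, 8 and 9.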
    

  • Please show me the contents of this command file ...

    Tony Morrell said:
    --cmd_file="C:\work\PD_MV\Firmware\BRANCH\RamBlDev_AJM\Compiler.opt"

    Or, if it is easier, attach it to your next post.  It is possible, though not certain, these options could affect things.

    Thanks and regards,

    -George

  • Thank you for sending in such a detailed test case.  That must have taken a while to put together.  

    I was unable to determine whether this is a problem in the compiler or not.  Even so, I filed CODEGEN-5032 in the SDOWP system to have this investigated.  You are welcome to follow it with the SDOWP link below in my signature.

    Regarding the options in the command file I asked about in my previous post ... For now, I presume they have no effect on the problem.  But they might.  So I would appreciate if you would send those options in.

    Thanks and regards,

    -George

  • Hi George,

    I'm afraid the Compiler.opt file only consists of the following two lines:

    --check_misra=2.2
    --display_error_number

    Regards, Tony.

  • Hi again George,

    I've just looked at CODEGEN-5032 and would like to clarify where the boundary between working and failing code lies:

    CODE WORKING when --opt_for_speed=0, 1 or 2
    CODE FAILING when --opt_for_speed=3, 4 or 5

    It is the additional optimisations enabled when going from level 2 to level 3 that cause the issue.

    Regards, Tony.

  • I added the additional options and clarification to the SDOWP record.  Many of the fields in these records are not public, so the entry may appear the same to you.

    Thanks and regards,

    -George