Compiler/MSP432P401R: Incorrect _sxtb16() implementation TI ARM Compiler v18.1.3.LTS

Part Number: MSP432P401R

Tool/software: TI C/C++ Compiler

Hi, 

I was working with the CMSIS v5 DSPLib and noticed a difference in behavior between the TI ARM Compiler v16.9.9.LTS and the TI ARM Compiler v18.1.3.LTS.

I narrowed the issue down to the _sxtb16() intrinsic function: 

Table 5-4 of ARM Optimizing C/C++ Compiler User's Guide

Section 4.6.186 of ARM Architecture Reference Manual Thumb-2 Supplement

Assembly code for correct implementation on TI ARM Compiler v16.9.9.LTS:

Assembly code for incorrect implementation on TI ARM Compiler v18.1.3.LTS

You can see the difference in the way the data is loaded with ldr or ldrsb.w. I've attached a main.c file with minimal code to reproduce the issue. 

43325.main.c
#include "msp.h"
#include <stdint.h>

#define __SIMD32_TYPE int32_t
#define __SIMD32(addr)        (*(__SIMD32_TYPE **) & (addr))

#define __SXTB16(VAL)                       ((unsigned int)_sxtb16(VAL, 0))

/**
 * \brief   Rotate Right in unsigned value (32 bit)
 * \details Rotate Right (immediate) provides the value of the contents of a register rotated by a variable number of bits.
 * \param [in]  VAL     Value to rotate
 * \param [in]  SHIFT   Number of Bits to rotate
 * \return              Rotated value
 */
#define __ROR(VAL, SHIFT)                   ((unsigned int)__ror(VAL, SHIFT))

/**
* @brief 32-bit fractional data type in 1.31 format.
*/
typedef int32_t q31_t;

/**
 * @brief 16-bit fractional data type in 1.15 format.
 */
typedef int16_t q15_t;

/**
 * @brief 8-bit fractional data type in 1.7 format.
 */
typedef int8_t q7_t;

/**
 * main.c
 *
 * Converts an array of q7_t values to an array of q15_t values
 */
void main(void)
{
	WDT_A->CTL = WDT_A_CTL_PW | WDT_A_CTL_HOLD;		// stop watchdog timer

	q31_t in = 0x00000000;
	q31_t in1 = 0x00000000;
	q31_t in2 = 0x00000000;

	q15_t dst[4] = {0x0000};
	q7_t  src[4] = {0xD7, 0x0A, 0x00, 0x08};

	q15_t *pDst = dst;
	const q7_t *pIn = src;     /* Src pointer */

	/* C = (q15_t) A << 8 */
    /* convert from q7 to q15 and then store the results in the destination buffer */
    in = *__SIMD32(pIn)++;

    /* rotate in by 8 and extend two q7_t values to q15_t values */
    int ror8 = __ROR(in, 8);
    in1 = __SXTB16(ror8);  // Expected output: 0x0008000A

    /* extend remaining two q7_t values to q15_t values */
    in2 = __SXTB16(in);   // Expected output: 0x0000FFD7

    *__SIMD32(pDst)++ = in2;
    *__SIMD32(pDst)++ = in1;

    while(1);
}
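
For comparison, here is a minimal portable-C sketch of what I understand SXTB16 (with a rotation of 0) and a 32-bit rotate right to do, based on the ARM documentation referenced above. The names sxtb16_ref, ror32_ref, sext8to16 and selfcheck_thread_values are hypothetical helpers made up for this illustration; they are not part of the TI compiler or CMSIS. The sketch assumes the four source bytes are packed little-endian, as they are on the MSP432.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 8 bits of b to a 16-bit pattern. */
static uint32_t sext8to16(uint32_t b)
{
    b &= 0xFFu;
    return (b & 0x80u) ? (b | 0xFF00u) : b;
}

/* Hypothetical reference model of SXTB16 with rotation 0:
 * bits [7:0]   are sign-extended into result bits [15:0],
 * bits [23:16] are sign-extended into result bits [31:16]. */
static uint32_t sxtb16_ref(uint32_t x)
{
    uint32_t lo = sext8to16(x);
    uint32_t hi = sext8to16(x >> 16);
    return (hi << 16) | lo;
}

/* Hypothetical reference model of a 32-bit rotate right by n bits. */
static uint32_t ror32_ref(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? ((x >> n) | (x << (32u - n))) : x;
}

/* Re-runs the values from main.c on the reference model.
 * src = {0xD7, 0x0A, 0x00, 0x08} packed little-endian is 0x08000AD7. */
static void selfcheck_thread_values(void)
{
    const uint32_t in = 0x08000AD7u;
    assert(sxtb16_ref(ror32_ref(in, 8)) == 0x0008000Au); /* in1 */
    assert(sxtb16_ref(in) == 0x0000FFD7u);               /* in2 */
}
```

Calling selfcheck_thread_values() on a host build should pass silently if the expected outputs in the comments of main.c are right, which gives a compiler-independent baseline to compare the intrinsic against.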

Can you verify if this is a real issue?

Best regards, 
Caleb Overbay

  • I am unable to reproduce the exact same assembly output.  But I did reproduce it closely enough to conclude there is probably a problem.  So, I filed CODEGEN-5473 in the SDOWP system to have this investigated.  You are welcome to follow it with the SDOWP link below in my signature.

    The reason I cannot precisely reproduce your results is probably that I am not using the same compiler build options.  To be certain I am reproducing exactly the same problem for exactly the same reason, please submit a test case as described in the article How to Submit a Compiler Test Case.  I'll add it to the existing report.

    Thanks and regards,

    -George

  • Hi George,

    Thanks for looking into this. I've attached my pre-processed main.c file and the console build log containing the compiler and linker options. Let me know if you need anything else from me.

    compiler_bug_build_console.txt
    **** Clean-only build of configuration Debug for project DSP_Test ****
    
    "C:\\ti\\ccsv8\\utils\\bin\\gmake" -k -j 4 clean -O 
     
    DEL /F  "DSP_Test.hex"  "DSP_Test.out" 
    DEL /F "main.lst" "startup_msp432p401r_ccs.lst" "system_msp432p401r.lst" 
    DEL /F "main.obj" "startup_msp432p401r_ccs.obj" "system_msp432p401r.obj" 
    DEL /F "startup_msp432p401r_ccs.d" "system_msp432p401r.d" 
    DEL /F "main.asm" "startup_msp432p401r_ccs.asm" "system_msp432p401r.asm" 
    Finished clean
     
    
    **** Build Finished ****
    
    **** Build of configuration Debug for project DSP_Test ****
    
    "C:\\ti\\ccsv8\\utils\\bin\\gmake" -k -j 4 all -O 
     
    Building file: "../startup_msp432p401r_ccs.c"
    Invoking: ARM Compiler
    "C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/bin/armcl" -mv7M4 --code_state=16 --float_support=FPv4SPD16 -me --include_path="C:/ti/ccsv8/ccs_base/arm/include" --include_path="C:/ti/ccsv8/ccs_base/arm/include/CMSIS" --include_path="C:/Work/Neural_Nets/DSP_Test" --include_path="C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/include" --advice:power=all --define=__MSP432P401R__ --define=ccs -g --gcc --diag_warning=225 --diag_wrap=off --display_error_number --abi=eabi -k --asm_listing --c_src_interlist --preproc_with_compile --preproc_dependency="startup_msp432p401r_ccs.d_raw"  "../startup_msp432p401r_ccs.c"
    Finished building: "../startup_msp432p401r_ccs.c"
     
    Building file: "../main.c"
    Invoking: ARM Compiler
    "C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/bin/armcl" -mv7M4 --code_state=16 --float_support=FPv4SPD16 -me --include_path="C:/ti/ccsv8/ccs_base/arm/include" --include_path="C:/ti/ccsv8/ccs_base/arm/include/CMSIS" --include_path="C:/Work/Neural_Nets/DSP_Test" --include_path="C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/include" --advice:power=all --define=__MSP432P401R__ --define=ccs -g --gcc --diag_warning=225 --diag_wrap=off --display_error_number --abi=eabi -k --asm_listing --c_src_interlist --preproc_with_compile --preproc_dependency="main.d_raw"  "../main.c"
    "../main.c", line 66: remark #1527-D: (ULP 2.1) Detected SW delay loop using empty loop. Recommend using a timer module instead
    Finished building: "../main.c"
     
    Building file: "../system_msp432p401r.c"
    Invoking: ARM Compiler
    "C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/bin/armcl" -mv7M4 --code_state=16 --float_support=FPv4SPD16 -me --include_path="C:/ti/ccsv8/ccs_base/arm/include" --include_path="C:/ti/ccsv8/ccs_base/arm/include/CMSIS" --include_path="C:/Work/Neural_Nets/DSP_Test" --include_path="C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/include" --advice:power=all --define=__MSP432P401R__ --define=ccs -g --gcc --diag_warning=225 --diag_wrap=off --display_error_number --abi=eabi -k --asm_listing --c_src_interlist --preproc_with_compile --preproc_dependency="system_msp432p401r.d_raw"  "../system_msp432p401r.c"
    "../system_msp432p401r.c", line 156: remark #2623-D: (ULP 5.4) Detected an assignment to a type with size less than int. To avoid unnecessary sign extension, use int-sized types for local varaibles and convert to smaller types for static storage.
    "../system_msp432p401r.c", line 189: remark #2623-D: (ULP 5.4) Detected an assignment to a type with size less than int. To avoid unnecessary sign extension, use int-sized types for local varaibles and convert to smaller types for static storage.
    Finished building: "../system_msp432p401r.c"
     
    Building target: "DSP_Test.out"
    Invoking: ARM Linker
    "C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/bin/armcl" -mv7M4 --code_state=16 --float_support=FPv4SPD16 -me --advice:power=all --define=__MSP432P401R__ --define=ccs -g --gcc --diag_warning=225 --diag_wrap=off --display_error_number --abi=eabi -k --asm_listing --c_src_interlist -z -m"DSP_Test.map" --heap_size=1024 --stack_size=512 -i"C:/ti/ccsv8/ccs_base/arm/include" -i"C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/lib" -i"C:/ti/ccsv8/tools/compiler/ti-cgt-arm_16.9.9.LTS/include" --reread_libs --diag_wrap=off --display_error_number --warn_sections --xml_link_info="DSP_Test_linkInfo.xml" --rom_model -o "DSP_Test.out" "./main.obj" "./startup_msp432p401r_ccs.obj" "./system_msp432p401r.obj" "../msp432p401r.cmd"  -llibc.a 
    <Linking>
    remark #10371-D: (ULP 1.1) Detected no uses of low power mode state changing instructions
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 1 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 2 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 3 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 4 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 5 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 6 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 7 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 8 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 9 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    remark #10372-D: (ULP 4.1) Detected uninitialized Port 10 in this project. Recommend initializing all unused ports to eliminate wasted current consumption on unused pins.
    Finished building target: "DSP_Test.out"
     
    
    **** Build Finished ****

    main.pp

    Best regards,

    Caleb Overbay

  • Thanks for the test case.  I added it to the existing report.

    Thanks and regards,

    -George
