Compiler/C2000-CGT: assignment of structs

Veena Kamath

Part Number: C2000-CGT
Other Parts Discussed in Thread: C2000WARE

Tool/software: TI C/C++ Compiler

Hi,

I want to understand how the compiler handles an assignment between two struct variables. I have two struct variables, a and b, of the same type, and I am doing a = b.

If I look at the asm file generated by the compiler (I used C28x compiler v18.9), I see the assignment gets translated to a couple of MOV and RPT instructions. I tried with optimization both enabled and disabled.

However, when I compile the same code using a different makefile (same compiler version), I see that it calls the memcpy function instead. The makefile is quite large and spans multiple files, so I am unable to send it.

In what cases does a struct assignment use the memcpy function? Is it controlled by some compiler/linker flag, or are there other dependencies?

Thanks,

Veena

over 6 years ago

0 Archaeologist over 6 years ago

TI__Guru* 84225 points

What you're seeing is the compiler inlining the call to memcpy. You should see a RPT over a PREAD or PWRITE instruction.

Check to see if you are using the --rpt_threshold option, which can prevent inlining of memcpy loops for large structures.

If both your source and destination structs are volatile, this will also prohibit inlining with PREAD or PWRITE.
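
As a minimal sketch of the case being discussed (the struct type `Record` and both function names are hypothetical, not taken from the original code), the two functions below have the same effect in standard C for non-volatile objects. Whether the compiler emits an inline RPT loop or a call to memcpy for either of them depends on the options in effect.

```c
#include <string.h>

/* Hypothetical struct type, for illustration only. */
typedef struct {
    int  id;
    long count;
    char name[16];
} Record;

/* Plain struct assignment: the compiler chooses how to copy the object. */
void copy_by_assignment(Record *dst, const Record *src)
{
    *dst = *src;
}

/* Explicit memcpy: same observable effect for non-volatile objects. */
void copy_by_memcpy(Record *dst, const Record *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Compiling a file like this with --keep_asm under each set of options and comparing the generated assembly is one way to see which form the compiler picked.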

0 Veena Kamath over 6 years ago in reply to Archaeologist

TI__Mastermind 32425 points

Hi,

I searched in all the make files for the usage of rpt_threshold and couldn't find any such reference. I tried adding --rpt_threshold=256. It did not help.

I also searched for the --no_rpt and -mi options. I haven't explicitly declared the variables as volatile.

Below is the console output for the build that uses memcpy:

/opt/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS//bin/cl2000 -I=/home/veena/Argo/benchmarks/ti/vxlib/src/vx/dhrystone -I=/home/veena/Argo/benchmarks/ti/vxlib/src/vx/dhrystone/c28 -I=/home/veena/Argo/benchmarks/ti/vxlib/src/common/c2x -I=/home/veena/Argo/benchmarks/ti/vxlib/src/common/c2x/driverlib -I=/opt/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS//include -D=REG=register -D=DEVICE_CLOCK -D=NO_OF_RUNS=50000 --abi=eabi --display_error_number --emit_warnings_as_errors --diag_remark=10068 --opt_level=off -g --silicon_version=28 --float_support=fpu32 --tmu_support=tmu0 --vcu_support=vcu2 --fp_mode=relaxed --gen_func_subsections=on --keep_asm --rpt_threshold=256 --preproc_with_compile --preproc_dependency=/home/veena/Argo/benchmarks/out/C28/debug/module/dhrystone/dhry_2.dep -fr=/home/veena/Argo/benchmarks/out/C28/debug/module/dhrystone/ -fs=/home/veena/Argo/benchmarks/out/C28/debug/module/dhrystone/ -ft=/home/veena/Argo/benchmarks/out/C28/debug/module/dhrystone/ -eo=.obj -fc=/home/veena/Argo/benchmarks/ti/vxlib/src/vx/dhrystone/dhry_2.c

Below is the console output from my own Makefile, which produces the RPT version:

/opt/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS//bin/cl2000 -v28 -ml -mt --keep_asm --abi=eabi --float_support=fpu32 --tmu_support=tmu0 --vcu_support=vcu2 -DCPU1 -D_LAUNCHXL_F28379D --gen_func_subsections --fp_mode=relaxed --display_error_number --diag_remark=10068 --diag_suppress=10063 --diag_warning=225 --obj_directory=objs -DDEVICE_CLOCK -DNO_OF_RUNS=50000 -I/opt/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS//include -I../../common/c2x/driverlib -I../../common/c2x -DREG=register dhry_1.c dhry_2.c ../../common/c2x/device.c ../../common/c2x/driverlib/sysctl.c c28/strcmp.asm c28/strcpy.asm -z --heap_size=0x800 --stack_size=0x400 -I/opt/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS//lib -m"dhrystone.map" ../../../concerto/c28/lnk.cmd -llibc.a -o dhry.out

I couldn't find any major difference in the compiler options.

Thanks,

Veena

0 Veena Kamath over 6 years ago in reply to Veena Kamath

TI__Mastermind 32425 points

On a similar note, is there a way to inline the strcpy function with the RPT || PREAD instruction?

Regards,
Veena

0 Archaeologist over 6 years ago in reply to Veena Kamath

TI__Guru* 84225 points

Veena Kamath said:
On a similar note, is there a way to inline the strcpy function with the RPT || PREAD instruction?

Use optimization --opt_level=2 or greater, use option --opt_for_speed=3 or greater, and don't forget to include string.h.

You won't get a PREAD instruction, but it will be an optimized inlined loop.
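
For illustration, an inlined strcpy amounts to a copy loop along the lines of the sketch below. This is only the idea in C; `copy_string` is a hypothetical name, and the code the compiler actually generates will differ.

```c
/* Hypothetical hand-written equivalent of strcpy, for illustration. */
char *copy_string(char *dst, const char *src)
{
    char *p = dst;

    /* Copy characters up to and including the terminating '\0'. */
    while ((*p++ = *src++) != '\0') {
        /* empty body: all the work happens in the loop condition */
    }
    return dst;
}
```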

0 Veena Kamath over 6 years ago in reply to Archaeologist

TI__Mastermind 32425 points

Hi,

I used -O3 and --opt_for_speed=4 and I still see "LCR strcpy" in the generated assembly code.

Thanks,
Veena

0 Archaeologist over 6 years ago in reply to Veena Kamath

TI__Guru* 84225 points

Veena Kamath said:
I couldn't find any major difference in the compiler options.

The difference is the -mt (--unified_memory) option. With it, you get the RPT||PREAD; without it, you do not.

You also need -ml (--large_memory_model), but that's the default now.

0 Archaeologist over 6 years ago in reply to Veena Kamath

TI__Guru* 84225 points

Veena Kamath said:
I used -O3 and --opt_for_speed=4 and I still see "LCR strcpy" in the generated assembly code.

Please show me all of the options for that test case.

0 Veena Kamath over 6 years ago in reply to Archaeologist

TI__Mastermind 32425 points

Hi,

This is the build command I found in the console:

"D:/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS/bin/cl2000" -v28 -ml -mt --cla_support=cla1 --float_support=fpu32 --tmu_support=tmu0 --vcu_support=vcu2 -O3 --opt_for_speed=4 --fp_mode=relaxed --include_path="C:/Users/a0132123/workspace_v827/test4/dhrystone" --include_path="D:/Gitorious/benchmarks/ti/vxlib/src/vx/dhrystone" --include_path="C:/Users/a0132123/workspace_v827/test4/dhrystone/device" --include_path="D:/ti/c2000/C2000Ware_1_00_03_00/driverlib/f2837xs/driverlib" --include_path="D:/Gitorious/benchmarks/ti/vxlib/src/common" --include_path="D:/ti/ccsv8/tools/compiler/ti-cgt-c2000_18.9.0.STS/include" --advice:performance=all --define=DEVICE_CLOCK --define=NO_OF_RUNS=1000000 --define=REG=register --define=_LAUNCHXL_F28377S --define=CPU1 --diag_suppress=10063 --diag_warning=225 --diag_wrap=off --display_error_number --gen_func_subsections=on --abi=coffabi --disable_inlining -k --asm_listing --preproc_with_compile --preproc_dependency="dhry_1.d_raw" "../dhry_1.c"

Thanks,
Veena

0 George Mock over 6 years ago in reply to Veena Kamath

TI__Guru**** 244930 points

I am unable to find any combination of compiler options that causes a call to strcpy to be inlined. So, I filed the entry CODEGEN-5705 in the SDOWP system to have this investigated. The entry does not report a bug, because the generated code is correct. It reports a performance issue, since the generated code could be faster. You are welcome to follow it with the SDOWP link below in my signature.

Thanks and regards,

-George

0 Veena Kamath over 6 years ago in reply to George Mock

TI__Mastermind 32425 points

Thanks George.
I added the macro shown below. It gave me better results, because the copy is now inlined:
#define strcpy(d,s) memcpy(d,s,sizeof(s))

Thanks and Regards,
Veena
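
One caveat with a macro of this kind, sketched below: `sizeof(s)` gives the full string size, including the terminator, only when `s` is an array or a string literal. If `s` is a `char *`, it gives the size of the pointer instead, so the wrong number of characters is copied. The macro name `STRCPY_FIXED` and the helper functions are hypothetical, chosen here so the standard strcpy is left untouched.

```c
#include <stddef.h>
#include <string.h>

/* Same idea as the macro above, under a different (hypothetical) name. */
#define STRCPY_FIXED(d, s) memcpy((d), (s), sizeof(s))

/* Source is a string literal: sizeof covers all characters plus '\0'.
   dst must have room for at least 6 chars. */
void copy_literal(char *dst)
{
    STRCPY_FIXED(dst, "HELLO");
}

/* Size the macro sees when the source is a string literal. */
size_t size_seen_for_literal(void)
{
    return sizeof("HELLO");   /* five characters plus the terminator */
}

/* Size the macro would see if handed a pointer instead of an array. */
size_t size_seen_for_pointer(void)
{
    const char *p = "HELLO";
    return sizeof(p);         /* size of the pointer, not of the string */
}
```

So the macro is safe where every source argument is a literal or a fixed-size array, and needs a real strcpy (or an explicit length) anywhere the source is a pointer.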
