Compiler/TDA2PXEVM: [EVE_SW] VCOP calculation result is not reflected

user5307126

Intellectual 820 points

Part Number: TDA2PXEVM

Tool/software: TI C/C++ Compiler

Hello TI-san,

I am trying to run the following VCOP kernel code:

---

void vcop_kernel
(
    __vptr_uint8 in_ptr,
    __vptr_uint32 optr0_Y_ptr,
    __vptr_uint32 optr1_Y_ptr,
    __vptr_uint16 optr2_Y_ptr,
    __vptr_uint16 wptr_C,
    unsigned short in1_num,
    unsigned short in0_num_vld,
    unsigned short in1_num_vld
)
{
    __agen Addrc;
    __vector Vidxi10, Vc16, Vc8;

    Addrc = 0;

    Vc16 = 16;
    Vc8 = 8;
    Vidxi10 = wptr_C[Addrc];

    for (int I0 = 0; I0 < 8; I0++)
    {
        __agen Addr2;
        __vector Vidxoff, Vnum0, Vnum1, Vivld, Vidx0;

        Addr2 = I0 * sizeof(*optr2_Y_ptr);

        Vidxoff = -8;
        Vnum0 = in0_num_vld;
        Vnum1 = in1_num_vld;
        Vivld = 0xFFFFFFFF;

        Vidx0 = optr2_Y_ptr[Addr2].onept();

        for (int I1 = 0; I1 < in1_num/VCOP_SIMD_WIDTH; I1++)
        {
            __agen Addri, Addr0, Addr1;
            __vector Vin0, Vcur0, Vmin1st0, Vmin2nd0, Vidx10, Vflag0, Vflag10;

            Addri = (I0 * sizeof(*in_ptr) * 512) + (I1 * sizeof(*in_ptr) * VCOP_SIMD_WIDTH);

            Addr0 = (I1 * sizeof(*optr0_Y_ptr) * VCOP_SIMD_WIDTH);
            Addr1 = (I1 * sizeof(*optr1_Y_ptr) * VCOP_SIMD_WIDTH);

            Vin0 = in_ptr[Addri];

            Vmin1st0 = optr0_Y_ptr[Addr0];
            Vmin2nd0 = optr1_Y_ptr[Addr1];

            Vidxoff += Vc8;

            Vcur0 = Vidx0;
            Vcur0 |= Vin0 << Vc16; // (Vin0 << 16) | Vidx0

            Vidx10 = Vidxoff + Vidxi10;

            Vflag0 = (Vidx0 >= Vnum0);
            Vflag10 = (Vidx10 >= Vnum1);
            Vflag10 = Vflag10 | Vflag0;

            Vcur0 = select(Vflag10, Vivld, Vcur0);

            (Vmin1st0, Vcur0).minmax();
            (Vmin2nd0, Vcur0).minmax();

            optr0_Y_ptr[Addr0] = Vmin1st0;
            optr1_Y_ptr[Addr1] = Vmin2nd0;
        }
    }

    for (int I0 = 0; I0 < 1; I0++)
    {
        __agen Addr2;
        __vector Vidx0, Vc;

        Addr2 = I0;

        Vidx0 = optr2_Y_ptr[Addr2];
        Vc = 8;

        Vidx0 += Vc;

        optr2_Y_ptr[Addr2] = Vidx0;
    }
}

---

However, this kernel often produces incorrect results in the outputs "optr0_Y_ptr" and "optr1_Y_ptr" when "in1_num" is small (e.g., 16 or 32).
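To make "incorrect" concrete, the kernel output can be checked against a scalar model on the host. The following is only a minimal sketch in plain C of what the kernel appears intended to compute. It assumes that minmax() leaves the minimum in its first operand and the maximum in its second, that comparisons are unsigned 32-bit, and that wptr_C holds the lane offsets 0 to 7 so that Vidxoff + Vidxi10 is the column index. The names ref_update_two_min, ref_kernel and REF_INVALID are hypothetical and are not part of EVE_SW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REF_INVALID 0xFFFFFFFFu /* same sentinel value the kernel loads into Vivld */

/* Keep the two smallest values seen so far, mirroring the two minmax() steps.
 * Assumption: minmax() puts the minimum in the first operand, the maximum in
 * the second, and compares as unsigned 32-bit values. */
static void ref_update_two_min(uint32_t *min1st, uint32_t *min2nd, uint32_t cur)
{
    if (cur < *min1st) { uint32_t t = *min1st; *min1st = cur; cur = t; }
    if (cur < *min2nd) { *min2nd = cur; }
}

/* Scalar model of one kernel call over `rows` input rows of `cols` bytes.
 * row_idx[r] plays the role of optr2_Y_ptr[r]; row_stride is 512 in the post,
 * rows is 8, and cols is in1_num rounded down to a multiple of the SIMD width. */
static void ref_kernel(const uint8_t *in, size_t row_stride,
                       uint32_t *min1st, uint32_t *min2nd,
                       const uint16_t *row_idx, size_t rows, size_t cols,
                       uint16_t in0_num_vld, uint16_t in1_num_vld)
{
    assert(in && min1st && min2nd && row_idx);
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            /* (Vin0 << 16) | Vidx0 */
            uint32_t cur = ((uint32_t)in[r * row_stride + c] << 16) | row_idx[r];
            /* Rows or columns outside the valid range are replaced by the sentinel. */
            if (row_idx[r] >= in0_num_vld || c >= in1_num_vld)
                cur = REF_INVALID;
            ref_update_two_min(&min1st[c], &min2nd[c], cur);
        }
    }
}
```

Running this model on the same input and initial output buffers as the kernel, then comparing the two results element by element, shows which columns and which rows go wrong.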

Needless to say, _vcop_vloop_done() is called after _vloops() for this kernel.

Do you know why this kind of failure happens?

Also, would you please let me know if there is anything wrong with this kernel code?

The version of the EVE compiler in use is arp32_1.0.7.

Best regards,

Yudai ISHIBASHI

over 5 years ago

0 Anshu Jain over 5 years ago

TI__Guru 56820 points

Hi Yudai San,

When you say this code often produces incorrect results, do you mean the output is not always the same?

Regards,

Anshu

0 user5307126 over 5 years ago in reply to Anshu Jain

Intellectual 820 points

Hello Anshu-san,

When I run this program with the same "in1_num", the outputs are always the same.

However, the outputs change when I change the build option of this program from release to debug.

Regards,

Yudai

0 Anshu Jain over 5 years ago in reply to user5307126

TI__Guru 56820 points

Hi Yudai San,

Debug and release builds can order instructions differently, which may lead to different output. Can you look at the corresponding assembly files and see where the difference comes from?

Regards,

Anshu

0 user5307126 over 5 years ago in reply to Anshu Jain

Intellectual 820 points

Hello Anshu-san,

Could you tell me the command lines to generate assembler files for debug mode and release mode?

Regards,

Yudai

0 Anshu Jain over 5 years ago in reply to user5307126

TI__Guru 56820 points

Hi Yudai-San,

You can use the --keep_asm flag to generate the assembly code.

Regards,

Anshu

0 user5307126 over 5 years ago in reply to Anshu Jain

Intellectual 820 points

Hello Anshu-san,

I have attached the kernel source code and the assembly files generated in release mode and in debug mode.

For example, the result differs between release mode and debug mode when the parameters are as follows:

in1_num = 32, in0_num_vld = 323, in1_num_vld = 29

Do you notice anything about this result?

Regards,

Yudai

vcop_kernel.zip

0 Anshu Jain over 5 years ago in reply to user5307126

TI__Guru 56820 points

Hi Yudai-San,

I don't see anything wrong in the kernel. If the output mismatch between release and debug mode is consistent, can you try removing one instruction at a time and see when the outputs start matching? This will help identify the instruction that causes the issue, and then we can see what is going wrong because of it.
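While removing instructions, a small host-side check makes it easy to see whether the two builds agree after each change. The sketch below is plain C and assumes the outputs of both builds have already been copied into ordinary uint32_t arrays; the function names first_mismatch_u32 and report_mismatches are hypothetical and are not part of any TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return the index of the first differing word, or n if the buffers match. */
static size_t first_mismatch_u32(const uint32_t *a, const uint32_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i;
    }
    return n;
}

/* Print every differing word and return how many words differ. */
static size_t report_mismatches(const char *name,
                                const uint32_t *release_out,
                                const uint32_t *debug_out, size_t n)
{
    size_t count = 0;
    assert(name != NULL && release_out != NULL && debug_out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (release_out[i] != debug_out[i]) {
            printf("%s[%lu]: release=0x%08lX debug=0x%08lX\n", name,
                   (unsigned long)i,
                   (unsigned long)release_out[i],
                   (unsigned long)debug_out[i]);
            count++;
        }
    }
    return count;
}
```

Calling report_mismatches once for optr0_Y_ptr and once for optr1_Y_ptr after each removed instruction shows whether the mismatch is confined to particular columns or disappears entirely.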

Regards,

Anshu

0 user5307126 over 5 years ago in reply to Anshu Jain

Intellectual 820 points

Hello Anshu-san,

I understand that there is no problem with the kernel code.

I use EDMA around the kernel code, so I'll check that out.

Regards,

Yudai
