AWR2944EVM: How to change the data in the Vector Multiplication Coefficient RAM?

Cherry Zhou

Part Number: AWR2944EVM

Hi team,

Here's an issue from the customer that may need your help:

1) The customer made some changes to the original DPC based on demo:

They changed the data layout and the number of bits after the 2DFFT. After this change, the 3DFFT results computed in MATLAB from the exported 2DFFT data are inconsistent with the 3DFFT results computed directly by the hardware accelerator (HWA).

The original data was laid out contiguously in the order TX1, TX3, TX4, TX2. It is now laid out in the order TX1, TX2, TX3, TX4 and is no longer contiguous. A total of 43 bits of data after zeroing between 16 virtual channels are 3DFFT, all 16 virtual channels are horizontal antenna virtual channels.

The customer also changed the HWA configuration that performs the 3DFFT. The size of the dopplersubmat is 43 * dopplerbins (number of dopplerBins per subband) * bytespersample (8 bytes) * 2 (ping-pong). In the HWA configuration, srcAidX is changed to 42 and the angle FFT size to 64: Obj->cfarAzimFFTCfg.numAzimFFTBins = 4 * mathUtils_getValidFFTSize(16); SrcBidx = 43*bytespersample. The EDMA is changed correspondingly: the count is changed to 43*bytespersample, with SrcBidx = 43*bytespersample, dstBidx = 43*bytespersample, and so on.
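The sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch: `next_pow2` is a hypothetical stand-in that assumes mathUtils_getValidFFTSize() rounds up to the next power of two (the thread only shows that it gives 16 for an input of 16), and the 32 Doppler bins used in the example are an illustrative value, not a number from the customer's configuration.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two >= n, for n >= 1.
 * Hypothetical stand-in for the assumed behavior of mathUtils_getValidFFTSize(). */
static uint32_t next_pow2(uint32_t n)
{
    uint32_t p = 1u;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* Bytes needed for the Doppler sub-matrix:
 * samples per Doppler bin * Doppler bins * bytes per sample * 2 (ping-pong). */
static size_t doppler_submat_bytes(size_t samples_per_bin,
                                   size_t doppler_bins,
                                   size_t bytes_per_sample)
{
    return samples_per_bin * doppler_bins * bytes_per_sample * 2u;
}

/* Byte stride between consecutive Doppler bins (the B index in the post). */
static size_t b_index_bytes(size_t samples_per_bin, size_t bytes_per_sample)
{
    return samples_per_bin * bytes_per_sample;
}
```

With 43 samples of 8 bytes each, the B index comes out as 344 bytes, and 4 * next_pow2(16) gives the 64-point angle FFT mentioned in the post.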

The above changes are in the functions DPU_DopplerProcHWA_configHwaCFARAzimFFT and DPU_DopplerProcHWA_configEdmaAzimFFTIn.

The customer has modified the memory size of the associated data used by the hardware accelerator, the data read from the hardware accelerator, the size of the FFT, and the associated EDMA.

Are the above changes possible? What other changes are needed? Is there a configuration tutorial available for using the hardware accelerator?

2) They also tried a data injection test. Only the first Doppler bin of the 10th range gate was injected and all other cells were set to 0. The injected test data was a sinusoidal signal as follows:

The figure on the left shows the data exported from dopplersubmat (processed with MATLAB), and the figure on the right shows the data exported from AzimuthScratchBuf after the 3DFFT, which is also 0 on the other range bins. There are no peaks, which is consistent with the results of the MATLAB simulation.

3) The customer sets cmultScaleEn to disable because each iteration reads 43 bits of data into the Hwa memory. According to the TRM, if this bit is disabled and cmultMode = 0110b, each sample read is multiplied by the corresponding value in the Vector Multiplication Coefficient RAM. Setting vecMultiMOde1RamAddrOffset=172 (an offset of 43 bits) does yield a different result than before.

How can the data in the Vector Multiplication Coefficient RAM be changed? What value is required to change if you want to keep the sample read unchanged?

The customer used the function HWA_configRam(obj->hwahandle,HWA_RAM_type_VECTORMULTIPLY_RAM, &Buffer[0],86*sizeof(Int32_t),0) to write modified values to this section of memory, but the board then gets stuck in the DPC. According to the TRM, the RAM supports up to 1024 data entries, so the startIdx = 0 set by the customer should not be a problem.
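For reference, the 86 words passed in the call above correspond to 43 complex coefficients stored as interleaved 32-bit pairs (2 * 43 = 86). A minimal sketch of filling such a buffer with the [0,1,0,1,...] pattern reported later in this thread is shown below. The helper name and the `one_fixed` parameter are hypothetical: the fixed-point value that represents 1.0 is whatever mathUtils_asymQuantInt() and the TRM format require, and it is deliberately left as an input rather than guessed.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_COEFF 43u  /* one complex coefficient per input sample */

/* Fill buf with n complex "one" coefficients as interleaved pairs
 * [0, one_fixed, 0, one_fixed, ...]. buf must hold 2 * n words.
 * one_fixed is the caller-supplied fixed-point representation of 1.0
 * (an assumption here; take the real value from the TRM or SDK). */
static void fill_unity_coeffs(int32_t *buf, size_t n, int32_t one_fixed)
{
    for (size_t i = 0; i < n; ++i) {
        buf[2u * i]      = 0;          /* first word of the pair  */
        buf[2u * i + 1u] = one_fixed;  /* second word of the pair */
    }
}
```

A buffer declared as `int32_t Buffer[2 * NUM_COEFF]` then has exactly the 86 * sizeof(int32_t) bytes used in the call.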

Could you help check this case? Thanks.

Best Regards,

Cherry

over 2 years ago

0 Cherry Zhou over 2 years ago

TI__Mastermind 22235 points

Hi,

Just a quick update:

The customer has been able to assign a value to the memory space pointed to by the HWA_RAM_type_VECTORMULTIPLY_RAM and has also converted the number to be written to RAM through the function mathUtils_asymQuantInt().

Now the data written is [0,10,1,...], but the results are still not correct. The resulting VA plot is as follows:

Thanks and regards,

Cherry

0 Cherry Zhou over 2 years ago in reply to Cherry Zhou

TI__Mastermind 22235 points

Hi,

The result after modifying the HWA_RAM_type_VECTORMULTIPLY_RAM is shown in the following figure:

But the side lobe of the 2944 is a little high. Are the above results correct?

The customer has not configured a window function, and the output is taken after the FFT, once ABS and log2 have been applied.

According to the TRM, the output should be 16 bits of data, with the upper 8 bits being 0 and the lower 8 bits being the output data.

But the actual data read out is typically not 0 in the upper 8 bits. Why?

The processing block diagram is shown in the second figure (from the TRM).

Thanks and regards,

Cherry

0 Kaushik Gowda over 2 years ago in reply to Cherry Zhou

TI__Mastermind 20605 points

Hi Cherry,

Cherry Zhou said:
A total of 43 bits of data after zeroing between 16 virtual channels are 3DFFT, all 16 virtual channels are horizontal antenna virtual channels.

1) I'm unable to understand this line. Can you please rephrase it or explain it a little differently?

2) Is this just an observation or a question?

3) cmultScaleEn is applicable only in CMULT mode 0101b. vecMultiMOde1RamAddrOffset indicates the start address of the vector that your input samples are multiplied with. "The customer sets cmultScaleEn to disable because each iteration reads 43 bits of data into the Hwa memory" -> What does this mean?

You can load data directly into the vector coeff RAM using memcpy or EDMA before triggering the HWA, provided CMULT_MODE is set to 0b0000. Otherwise, access to the RAM will be locked.

What value is required to change if you want to keep the sample read unchanged? -> CMULT mode 0110b resets the start address of the vector coeff RAM by default for each iteration if that is what you meant by this question.

Mainly, there is some ambiguity concerning the "43 bits". What are they trying to achieve and how are they trying to do so?

In your latest reply, the sidelobes seem to be a result of not applying windowing for your input samples. Can you confirm if the effect is minimized with the application of windows?

Also, have they confirmed the output of FFT ABS?

Regards,
Kaushik

0 Annie Liu over 2 years ago in reply to Kaushik Gowda

TI__Genius 10295 points

Hi Kaushik,

Thanks for your response.

1. The reason for reading 43 points into the HWA is that the newly designed antenna front end is not a uniform linear array, so lambda/2 interpolation has to be performed between the 16 virtual channels. The final result is 43 bits.

2. The problem with cmultScaleEn is now fixed. If cmultScaleEn is set to Enable, the input data will be complex multiplied with the values of 12 I,Q channels. But now the input number exceeds 12 in each iteration, so set cmultScaleEn to Disable, and let the input number be multiplied by the number in coeff RAM.

3. Coeff RAM can be written in dopplerHWA_config. After the number [0,1,0,1....] saved in the coeff RAM is converted by the function mathUtils_asymQuantInt, each input number can be multiplied with the number in the coeff RAM and remains unchanged.

4. After inverting the output 3DFFT result according to the output formatter in 2944TRM, it can be the same as the 3DFFT done by matlab with the 2DFFT result.

5. After verifying the 3DFFT, a new problem appeared concerning local maximum detection and OS-CFAR. In anechoic chamber tests with a 64-point 3DFFT, the customer printed the output results over the serial port. When the detected peak falls in the last angle channel (63), the signal amplitude on angle channel 0 (the channel index plus 1, wrapping around) is higher than on channel 63. This part uses the SDK's own configuration; the customer only changed the angle FFT size from 48 points to 64 points. What could be the reason for this? The customer also checked all points detected by CFAR through the serial port and found that the value on channel 0 was detected by OS-CFAR.
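The zero insertion described in point 1 (16 measured virtual channels placed on a 43-point lambda/2 grid, with zeros elsewhere) could be sketched as below. This is an illustration only: the type name, the function name, and the `slot` table are hypothetical, and the actual slot positions depend on the customer's antenna geometry, which is not given in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 8-byte complex sample. The imag/real field order is an assumption. */
typedef struct {
    int32_t imag;
    int32_t real;
} cmplx32_t;

/* Copy n_src measured channels into an n_dst-point grid.
 * slot[i] is the grid position of source channel i; every other
 * grid position stays zero. Returns 0 on success, -1 if a slot
 * index is outside the grid. */
static int scatter_channels(const cmplx32_t *src, const uint8_t *slot,
                            size_t n_src, cmplx32_t *dst, size_t n_dst)
{
    memset(dst, 0, n_dst * sizeof(*dst));
    for (size_t i = 0; i < n_src; ++i) {
        if (slot[i] >= n_dst) {
            return -1;
        }
        dst[slot[i]] = src[i];
    }
    return 0;
}
```

For the case in this thread, n_src would be 16 and n_dst would be 43, giving 43 * 8 = 344 bytes per Doppler bin.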

Thanks,

Annie

0 Kaushik Gowda over 2 years ago in reply to Annie Liu

TI__Mastermind 20605 points

Hi Annie,

Glad to hear the customer has been able to progress!

In your first point, I understand the customer's use case, but can you confirm if the result is 43 bits or 43 bytes? In the former case, I would like to know how the 43 bits are fed into the HWA using the input formatter, if possible.

For the concern in 5) It seems like there is a bin mismatch for angle FFT. Is my understanding correct? With higher FFT size, your energy would be more appropriately distributed across the bins which is why you can see a small shift. This delta can change the bin that the CFAR picks. If possible, can you share the plots that you are referring to?

Regards,

Kaushik

0 Cherry Zhou over 2 years ago in reply to Kaushik Gowda

TI__Mastermind 22235 points

Hi Kaushik,

Kaushik Gowda said:
can you confirm if the result is 43 bits or 43 bytes?

43 refers to 43 8-byte complex numbers, i.e. the result of 2DFFT.

Kaushik Gowda said:
For the concern in 5) It seems like there is a bin mismatch for angle FFT. Is my understanding correct? With higher FFT size, your energy would be more appropriately distributed across the bins which is why you can see a small shift. This delta can change the bin that the CFAR picks. If possible, can you share the plots that you are referring to?

a) Measurements in the anechoic chamber show that the local maximum test did not detect the channel with the highest amplitude among the 64 angle channels of the current dopplerBin in the beam (angle)-Doppler plot. The maximum value is currently observed on the 0th angle channel, but local max detects the peak on the 63rd angle channel. The result is not plausible whether circular or non-circular local max detection is used. What could cause this?

b) The customer outputs all of the OS-CFAR detection results on the serial port, and OS-CFAR does detect the peak on the 0th angle channel. However, after the CFAR results are combined with the bitmask detected by local max in the function DPU_DopplerProcHWA_extractObjectList, the peak on channel 0 can no longer be detected, while the peak on the 63rd angle channel still exists.

Is it possible to assume that the highest peak is not detected by local max? Of course, there is a correct case where the 63rd angle channel is higher than the 0th angle channel and the 62nd angle channel.

3. The customer tried to print the result of the local maximum test on the serial port. The result is a bitmask: 1 where a peak is detected, otherwise 0. See page 5844 and page 5845 of the 2944TRM for the format of the output.
The customer's 3DFFT is 64 points, so each result should occupy two 32-bit words with no zero-padded high bits, but the serial output shows 0 in the upper two bytes of every 32-bit word.

In the function DPU_DopplerProcHWA_extractObjectList, the customer uses a pointer to store the local maximum bitmask per rangeBin, each covering 32 dopplerBins. One rangeBin therefore requires a total of 64 UINT32_t words, and local_bitmax is of type UINT32_t*.

memcpy(local_bitmax+rangeBin*64,localMaxMat,256);

Finally, a bitmask on rangeBin is output using the serial interface, as shown in the following figure, the upper two bytes of each UINT32_t are all zeros:

for(local_max_pri=0;local_max_pri<64;local_max_pri+=2)
{
    CLI_write("%x %x\r\n",
              *(result->local_bitmax+result->objOut[energy_max_index].rangeBin*64+local_max_pri),
              *(result->local_bitmax+result->objOut[energy_max_index].rangeBin*64+local_max_pri+1));
}

Does the bitmask reside in memory in the following order?

dopplerBin0[ant0, ant1, ..., ant63], dopplerBin1[ant0, ant1, ..., ant63], ..., dopplerBin31
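If the layout is as the question supposes, looking up a single cell in the copied bitmask could look like the sketch below. It assumes 64 words per rangeBin (as in the memcpy above), two consecutive words per dopplerBin, and bit k of the first word corresponding to angle channel k. None of this is confirmed in the thread, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_DOPPLER_BIN  2u   /* 64 angle channels / 32 bits per word */
#define DOPPLER_BINS_PER_RANGE 32u
#define WORDS_PER_RANGE_BIN    (WORDS_PER_DOPPLER_BIN * DOPPLER_BINS_PER_RANGE)

/* Returns true if the bitmask flags a local maximum at the given cell.
 * Assumed layout: [rangeBin][dopplerBin][word], LSB = lowest channel. */
static bool bitmask_is_peak(const uint32_t *mask, size_t range_bin,
                            size_t doppler_bin, unsigned angle_ch)
{
    size_t word = range_bin * WORDS_PER_RANGE_BIN
                + doppler_bin * WORDS_PER_DOPPLER_BIN
                + angle_ch / 32u;
    return ((mask[word] >> (angle_ch % 32u)) & 1u) != 0u;
}
```

Comparing this lookup against known injected targets is one way to confirm or rule out the assumed ordering.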

Thanks and regards,

Cherry

+1 Cherry Zhou over 2 years ago in reply to Cherry Zhou

TI__Mastermind 22235 points

Hi Kaushik,

I'm splitting the customer's long response into separate posts to make it easier to check:

4. The bitmask of the local maximum peak detection is now successfully sent out on the serial port, and the output of the angle channel on rangeBin and dopplerBin where the highest energy detected target is located is as follows:

0,s virtual ant's energy = 42547
1,s virtual ant's energy = 41710
2,s virtual ant's energy = 39698
3,s virtual ant's energy = 35494
4,s virtual ant's energy = 36560
5,s virtual ant's energy = 37542
6,s virtual ant's energy = 37150
7,s virtual ant's energy = 36932
8,s virtual ant's energy = 37391
9,s virtual ant's energy = 36321
10,s virtual ant's energy = 34487
11,s virtual ant's energy = 38531
12,s virtual ant's energy = 39768
13,s virtual ant's energy = 39870
14,s virtual ant's energy = 39113
15,s virtual ant's energy = 37467
16,s virtual ant's energy = 35740
17,s virtual ant's energy = 35225
18,s virtual ant's energy = 34042
19,s virtual ant's energy = 35967
20,s virtual ant's energy = 35672
21,s virtual ant's energy = 31711
22,s virtual ant's energy = 37391
23,s virtual ant's energy = 37888
24,s virtual ant's energy = 33256
25,s virtual ant's energy = 38197
26,s virtual ant's energy = 39936
27,s virtual ant's energy = 39515
28,s virtual ant's energy = 35936
29,s virtual ant's energy = 37685
30,s virtual ant's energy = 39321
31,s virtual ant's energy = 40000
32,s virtual ant's energy = 40983
33,s virtual ant's energy = 41328
34,s virtual ant's energy = 40444
35,s virtual ant's energy = 37273
36,s virtual ant's energy = 38077
37,s virtual ant's energy = 39626
38,s virtual ant's energy = 39515
39,s virtual ant's energy = 38256
40,s virtual ant's energy = 36456
41,s virtual ant's energy = 36029
42,s virtual ant's energy = 35102
43,s virtual ant's energy = 34487
44,s virtual ant's energy = 37920
45,s virtual ant's energy = 39113
46,s virtual ant's energy = 39239
47,s virtual ant's energy = 38659
48,s virtual ant's energy = 37888
49,s virtual ant's energy = 37313
50,s virtual ant's energy = 37150
51,s virtual ant's energy = 37505
52,s virtual ant's energy = 38313
53,s virtual ant's energy = 38313
54,s virtual ant's energy = 35060
55,s virtual ant's energy = 37920
56,s virtual ant's energy = 40361
57,s virtual ant's energy = 40902
58,s virtual ant's energy = 40526
59,s virtual ant's energy = 40304
60,s virtual ant's energy = 39870
61,s virtual ant's energy = 39553
62,s virtual ant's energy = 41287
63,s virtual ant's energy = 42409

5. The following figure shows the angle channel of the maximum-energy target detected and the energy values of the adjacent angle channels, where ANT sub represents the angle channel where the peak is detected minus one and ANT add represents the angle channel detected plus one.

As shown in the figure, the peak was detected on the 63rd angle channel, +1 is the 0th channel, and -1 is the 62nd channel. The corresponding output can also be found in the list above. The value on channel 0 is greater than the value on channel 63, where the peak is currently detected.

6. The following figure shows all targets detected on the rangeBin and dopplerBin where the highest-energy target is located (in OS-CFAR, all angle channels on this rangeBin and dopplerBin are detected):

7. The following figure shows the bitmask on this rangeBin detected by local maximum, where each row represents the peak detection results on one dopplerBin, with the two values corresponding to angle channels [31,0] and [63,32], respectively.

The first line corresponds to dopplerBin 0 and reads 4882120 82304022, where 4882120 is the peak detection result for channels [31,0], printed in hexadecimal. Converting it to binary gives the individual channels.

It can be seen that the targets in the figure above are all detected in the bitmask, but angle channel 0, which holds the maximum value on the current dopplerBin, is not detected as a peak. (The other detected peaks, while also detected in OS-CFAR, are beyond the configured angle range, so they are filtered out when added to the target list.)
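The hexadecimal-to-channel conversion described above can be sketched as follows, assuming the first printed word covers channels 0 to 31 with bit 0 as channel 0 and the second word covers channels 32 to 63. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Collect the channel indices whose bits are set in a 64-bit mask that
 * is split across two 32-bit words (word_lo: channels 0..31, word_hi:
 * channels 32..63). out must have room for 64 entries.
 * Returns the number of flagged channels. */
static size_t decode_peak_channels(uint32_t word_lo, uint32_t word_hi,
                                   uint8_t out[64])
{
    size_t count = 0;
    for (unsigned ch = 0; ch < 64u; ++ch) {
        uint32_t word = (ch < 32u) ? word_lo : word_hi;
        if ((word >> (ch % 32u)) & 1u) {
            out[count++] = (uint8_t)ch;
        }
    }
    return count;
}
```

Under that bit-order assumption, the row 4882120 82304022 decodes to channels 5, 8, 13, 19, 23, 26, 33, 37, 46, 52, 53, 57, and 63. Channel 63 is flagged and channel 0 is not, which matches the behavior the customer describes.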

Why does the local maximum detection feature of the hardware accelerator not detect the highest peak? Currently, both lateral and longitudinal threshold detection for local maximum is turned off, setting the LM_thresh_BITMASK to 11b.

So in theory, only the magnitude of the cell under test and the reference cells around it should be compared. Why can't the maximum value be detected?
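As a software cross-check of that expectation, a one-dimensional neighbor comparison along the angle axis could be written as below. This is a reference sketch, not the HWA's implementation: it ignores the Doppler dimension, it treats ties as peaks, and it ignores a missing neighbor at the array edge in non-circular mode. All three choices are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if e[i] is not smaller than its immediate neighbors.
 * circular = true wraps index n-1 to 0 and 0 to n-1.
 * circular = false skips the neighbor that falls off the array edge.
 * n must be at least 1 and i must be less than n. */
static bool is_local_max_1d(const uint32_t *e, size_t n, size_t i,
                            bool circular)
{
    bool has_left  = circular || (i > 0u);
    bool has_right = circular || (i + 1u < n);
    size_t left  = (i + n - 1u) % n;
    size_t right = (i + 1u) % n;

    if (has_left && e[i] < e[left]) {
        return false;
    }
    if (has_right && e[i] < e[right]) {
        return false;
    }
    return true;
}
```

With the values listed earlier in this thread (channel 62 = 41287, channel 63 = 42409, channel 0 = 42547, channel 1 = 41710), this check flags channel 0 in both modes and flags channel 63 only in non-circular mode.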

Thanks and regards,

Cherry
