AWR1843: Debugging HWA/EDMA hang

Part Number: AWR1843

Hello,

I'm seeing a strange issue with the AOA HWA DPU, where trying to fetch certain large doppler index values causes a hang in the EDMA wait.  I'm not certain whether the problem is in the HWA or in the EDMA.  I have 2 questions:

1. Is there any known errata or bugs in the EDMA or HWA where a hang could be caused by high doppler index values in the AOA HWA DPU?  When I hard code the sourceAddress doppler index to 170, I see the failure consistently.  There are no HWA/EDMA changes compared with the AWR1843 demo application.

2. Is there a good way to debug HWA/EDMA related issues?  Any kind of app note or reference would be helpful, besides the Keystone EDMA user's guide/HWA User's guide, both of which I have used up to this point.

Thanks

  • Hi,

    To my knowledge, we don't know of any such bug. It would be very helpful if you could share how you are testing this so I could recreate this issue on my end. I will go over the AoA code and see if I can find anything that could possibly be causing this.

    Also, can you clarify what you mean by a hang occurring in an EDMA wait? Do you mean the next paramset never gets an EDMA event, or something else?

    One way to debug this is to use the HWA and EDMA callbacks to set flags, or to place breakpoints that stop execution after every paramset. There may also be an option to single-step the hardware accelerator; I will look into it and get back to you. In the meantime, if you could give me some more details on the two points I mentioned above, that would be very helpful.

    Regards,

    Aayush

  • Hi Aayush,

    I pass in a config with 3 TX chirps per subframe and specify the # of loops, for example:

    subFrameCfg 0 0 0 3 130 40 0 1 1 40

    When I say "hangs", I mean that EDMA_isTransferComplete never returns done, meaning the EDMA interrupt status never gets set.  I'm not sure if this is due to the HWA or the EDMA.

    Inside the cfgAndTrigger_EDMA_2DFFT function, I force dopplerIdx = 170, and this causes the failure every time.

    If you could help provide me information on how to single step the HWA, that would be very helpful.  Then, I could determine whether the EDMA or HWA is the one getting stuck.

    Thanks

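A hang of this kind is easier to investigate when the completion wait is bounded instead of spinning forever. The following is a minimal sketch of that idea, not code from the demo: `PollFn`, `waitWithTimeout`, `FakeTransfer` and `fakePoll` are hypothetical names, and the fake transfer only stands in for the real EDMA completion check so the example can run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical poll callback: returns true once the transfer is done.
 * On the target, this role would be played by the EDMA completion check. */
typedef bool (*PollFn)(void *ctx);

/* Poll at most maxPolls times. Returns the number of polls used on success,
 * or -1 if the transfer never completed (the "hang" case). */
int32_t waitWithTimeout(PollFn poll, void *ctx, int32_t maxPolls)
{
    assert(poll != NULL && maxPolls > 0);
    for (int32_t i = 1; i <= maxPolls; i++) {
        if (poll(ctx)) {
            return i;
        }
    }
    return -1;
}

/* Host-side stand-in for the hardware (assumption, for illustration only):
 * completes after pollsUntilDone polls, or never if the value is negative. */
typedef struct {
    int32_t pollsUntilDone;
} FakeTransfer;

bool fakePoll(void *ctx)
{
    FakeTransfer *t = (FakeTransfer *)ctx;
    if (t->pollsUntilDone < 0) {
        return false;               /* transfer that never completes */
    }
    if (t->pollsUntilDone > 0) {
        t->pollsUntilDone--;
    }
    return t->pollsUntilDone == 0;
}
```

When the wait returns -1, the caller can log the object count and doppler index at that point and stop cleanly, which keeps the failure state available for inspection without halting the core mid-transfer.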
  • Hi,

    Unfortunately, I don't think it's straightforward to single step the HWA. You can have all paramsets use software triggered mode and trigger them one by one with some fixed delay to ensure their processing completes. You can put breakpoints between the triggers to effectively single step the HWA.

    Could you send me the configuration file you're using? As a first step, I'll hardcode the dopplerIdx to 170 with that configuration and recreate the error on my end.

    Also, I'm assuming that even without forcing the doppler index to 170, the config you use would hang for larger doppler indexes in AoA as you mentioned earlier, meaning that the entire data processing chain and the demo effectively hang anyway (if there is an object in the detection list with a high enough doppler bin index). Is this view correct?

    Regards,

    Aayush

  • Hi Aayush,

    I have messaged you the config separately.

    Your view is correct.  It seems like for some large doppler index, the AoA will hang and the entire data processing chain/demo will hang.

    Thanks

  • Hi Jovial,

    Thanks for confirming. I also have access to your configuration. Recreating and debugging this issue will take some time, and I need a little while to get started on it. Let me get back to you on Monday with my findings.

    Here are my initial thoughts: Only the 2D out EDMA transfer is concerned with the doppler index. Even the EDMA in and processing of the 2D FFT processes the entire range gate, so the doppler index value shouldn't matter.

    The 2D FFT out is a very simple A-synced transfer. There could be some issue in the transfer, or perhaps an issue in the way the EDMA wait is implemented.

    Here is one thing you could do: when hardcoding the doppler index to 170, set a breakpoint right before the first AoAProcHWA_cfgAndTrigger_EDMA_2DFFT call. Note the contents of aoaHwaObj->edmaDstOut2DFFTBuffAddr at the breakpoint, then check whether they have changed once the code is stuck in the EDMA wait.

    You can do this with CCS' memory browser. If the content changes, it means that the transfer is going through and there is some issue with the EDMA wait. If there is no change, it means that the transfer is not happening in the first place.

    You can do the same for the memory banks in HWA. See if the contents of M2, M3 change. If they do, it means the HWA paramset is getting triggered correctly. If they don't, it means there is some issue with the HWA paramset.

    I will try a similar debug on my end; I'm mentioning this in case it helps you.

    Regards,

    Aayush
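The before/after comparison described above can also be done in code rather than in the memory browser. This is a rough sketch under stated assumptions: `Snapshot`, `takeSnapshot`, `hasChanged`, `Diagnosis` and `classify` are hypothetical helpers, the 256-byte snapshot size is arbitrary, and the three-way classification is one reading of the reasoning in this reply, not an official procedure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SNAPSHOT_MAX_BYTES 256u   /* arbitrary size chosen for this sketch */

typedef struct {
    uint8_t bytes[SNAPSHOT_MAX_BYTES];
    size_t  len;
} Snapshot;

/* Copy len bytes from a (possibly hardware-written) buffer. */
void takeSnapshot(Snapshot *snap, const volatile uint8_t *src, size_t len)
{
    assert(snap != NULL && src != NULL && len <= SNAPSHOT_MAX_BYTES);
    for (size_t i = 0; i < len; i++) {
        snap->bytes[i] = src[i];
    }
    snap->len = len;
}

/* True if the buffer no longer matches the snapshot. */
bool hasChanged(const Snapshot *snap, const volatile uint8_t *src)
{
    assert(snap != NULL && src != NULL);
    for (size_t i = 0; i < snap->len; i++) {
        if (snap->bytes[i] != src[i]) {
            return true;
        }
    }
    return false;
}

typedef enum {
    DIAG_SUSPECT_EDMA_WAIT,      /* destination changed: data arrived       */
    DIAG_SUSPECT_EDMA_TRANSFER,  /* HWA memory changed, destination did not */
    DIAG_SUSPECT_HWA_PARAMSET    /* nothing changed: HWA likely never ran   */
} Diagnosis;

/* Map the two observations onto the component to look at first. */
Diagnosis classify(bool dstChanged, bool hwaMemChanged)
{
    if (dstChanged) {
        return DIAG_SUSPECT_EDMA_WAIT;
    }
    return hwaMemChanged ? DIAG_SUSPECT_EDMA_TRANSFER
                         : DIAG_SUSPECT_HWA_PARAMSET;
}
```

A comparison like this cannot detect a transfer that writes the same values that were already there, so filling the destination with a known pattern before triggering makes the result more trustworthy.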

  • Hi Aayush,

    Thank you for the suggestion.  I will try that experiment and let you know what I see.

    Best

  • Hi Jovial,

    Another thing that might be worth doing is using the AoA testcases. Currently, they generate radarcube data for randomly created CFAR peaks. You can change this to peaks with a fixed doppler index and try to recreate the issue there.

    Test_aoaDpu_cfarListGen generates the CFAR detection list randomly (this can be changed).

    Test_aoaDpu_cubaDataGen generates the radarcube data corresponding to the detection list.

    The DPU config and process functions are called in the testcases to configure and execute the DPU.

    You can use this to reliably generate appropriate radarcube data and have the AoA process just a single detection.

    Regards,

    Aayush

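As an illustration of replacing random peaks with a fixed doppler index, here is a small sketch. It does not reproduce the SDK test code: `DetPoint` is a simplified, hypothetical stand-in for the detection list entry, and `fillFixedDopplerList` is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical detection list entry (not the SDK type). */
typedef struct {
    uint16_t rangeIdx;
    uint16_t dopplerIdx;
} DetPoint;

/* Fill numObj entries with the same doppler index and distinct range
 * indices that wrap around at numRangeBins. Returns the entry count. */
uint32_t fillFixedDopplerList(DetPoint *list, uint32_t numObj,
                              uint16_t numRangeBins, uint16_t dopplerIdx)
{
    assert(list != NULL && numRangeBins > 0);
    for (uint32_t n = 0; n < numObj; n++) {
        list[n].rangeIdx   = (uint16_t)(n % numRangeBins);
        list[n].dopplerIdx = dopplerIdx;
    }
    return numObj;
}
```

With every detection pinned to one doppler index, any hang that appears only for that index should show up on each run of the test instead of depending on what the random generator produced.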
  • Hi Aayush.

    I'll take a look at the tests, thank you!  One other thing I notice is that the hang doesn't happen on the very first object.  Instead, the AoA processes a few objects (around 12) before the hang occurs.  Another note is that when I step through and inspect the memory as you suggested, the problem doesn't occur.  This makes me think that maybe there is some kind of timing issue, but still debugging.

    Best

  • Hi Aayush,

    I tried running the AoA HWA unit test that comes with SDK 03.04.  Is this a valid test case?

    Test feature : :Tx3:Rx4:Chp171:D256:R128:extVel0

    For me, this test passes at Chirp=170, and fails at Chirp=171 with an EDMA read bus error.

    Thanks,

    jovial

    Edit: I see this doesn't fit into the radarCube memory limitation, so it is an invalid test case.
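The size check behind that edit can be sketched as simple arithmetic. The numbers here are assumptions for illustration only: 4 bytes per radar cube sample (16-bit I and 16-bit Q) and a 1 MiB radar cube buffer. The actual limit is whatever the test source defines. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch, not values taken from the SDK. */
#define ASSUMED_BYTES_PER_SAMPLE      4u
#define ASSUMED_RADAR_CUBE_LIMIT      (1024u * 1024u)

/* Radar cube size = Tx antennas x Rx antennas x chirps x range bins
 * x bytes per sample. */
uint32_t radarCubeBytes(uint32_t numTx, uint32_t numRx, uint32_t numChirps,
                        uint32_t numRangeBins, uint32_t bytesPerSample)
{
    assert(numTx > 0 && numRx > 0 && numChirps > 0 && numRangeBins > 0);
    return numTx * numRx * numChirps * numRangeBins * bytesPerSample;
}

/* True if the cube fits in a buffer of limitBytes. */
bool fitsInRadarCube(uint32_t cubeBytes, uint32_t limitBytes)
{
    return cubeBytes <= limitBytes;
}
```

Under these assumptions, Tx3:Rx4:Chp170:R128 needs 1,044,480 bytes and fits, while Chp171 needs 1,050,624 bytes and does not, which would be consistent with the pass at 170 and the failure at 171.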

  • Hi Jovial,

    I see. A question: was the original configuration also trying to access a doppler index that didn't fit into the radarcube memory? Could you try and recreate the error in the test case? The testcase is a nicer way to try and recreate the error than hardcoding the doppler index to 170.

    Regards,

    Aayush

  • Hi Aayush,

    I did confirm that the original configuration did fit into radarcube memory, so I don't think that is the issue.

    I am able to produce the hang with the original test case, by looping over the below test set repeatedly, AND changing the dopplerIdx to be 170 inside the 2D FFT trigger.

    Test feature : :Tx3:Rx4:Chp170:D256:R128

    With this test, I see the hang happen after ~10 loops.

    Thanks,

    jovial

  • Hi Jovial,

    For 170 chirps, wouldn't a doppler index of 170 be out of bounds? Could you try increasing the chirps by a bit and seeing if you still encounter the issue?

    Regards,

    Aayush

  • My understanding is that the doppler index is relative to the doppler bins, not the number of chirps.  In this case, since there are 256 doppler bins, 170 should be a valid doppler index.  Is that an incorrect understanding?
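A short sketch of that bounds check follows. It assumes the Doppler FFT size is the next power of two at or above the number of chirps, which matches the Chp170:D256 test label above; `nextPow2` and `isValidDopplerIdx` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Smallest power of two that is >= n. */
uint32_t nextPow2(uint32_t n)
{
    assert(n > 0 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* A doppler index is valid if it is below the number of doppler bins,
 * which is assumed here to be the power-of-two FFT size, not the
 * number of chirps. */
bool isValidDopplerIdx(uint32_t dopplerIdx, uint32_t numChirps)
{
    return dopplerIdx < nextPow2(numChirps);
}
```

For 170 chirps this gives 256 bins, so indices 0 through 255 are in range and 170 is valid.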

  • Hi Jovial,

    Sorry, I got turned around for a moment. You are right, the doppler index indexes into the doppler bins. I know this thread has been open for a while now; I will make time tomorrow to recreate the issue with the testcases and discuss my findings with you and the wider team.

    Regards,

    Aayush

  • Hi Aayush,

    Have you had a chance to test and discuss with the wider team?  I'm still curious to get any feedback on this issue.

    Thanks

  • Hi Jovial,

    I'm really sorry about the delay. I wasn't able to get to this yet; please give me a little more time to investigate this issue.

    Regards,

    Aayush

  • Thanks Aayush for the update, I look forward to hearing back from you.

  • Hi, 

    Just want to indicate that I am still interested in getting feedback on this question.  

    Thanks

  • Hi,

    I'm really sorry I haven't been able to get back to you. Running and verifying this experiment is taking a little time to schedule. It is still on my to-do list; I will check this and get back to you. Again, really sorry for the delay.

    Regards,

    Aayush

  • Hi Aayush,

    Have you been able to run the experiment or get any feedback?

    Thanks