AM3357: Data abort exception on NDK main task

Part Number: AM3357

Hi,

I am using NDK version 2.26.0.08 and PDK 1.0.10 to run a UDP server application on an AM3357 device. Everything works fine most of the time, but occasionally, at random, I get a data abort exception.

Exception call stack:
0 PBM_free(void *) at pbm.c:230,PC = 0x80040D78 FP = 0x00020002
1 xdc_runtime_System_asprintf_va__E() at :0,PC = 0x00030002 FP = 0x00020002
2 do_AngelSWI(int, void *) at _kill.c:78,PC = 0x8005F308 FP = 0x00020002
3 _kill(int, int) at _kill.c:19,PC = 0x8005F308 FP = 0x00020002

According to the RTOS Object View (ROV), the preempted task is the main NDK task. This task is configured as follows:

CI_IPNET NA;
CI_ROUTE RT;

/* Add IP address for interface 1 */
NA.IPAddr = htonl(tInst.u32IpAddr);
NA.IPMask = htonl(tInst.u32NetMask);
CfgAddEntry(hCfg, CFGTAG_IPNET, 1, 0, sizeof(CI_IPNET),
            (UINT8 *)&NA, 0);

uint cfg;
/* Set NDK Kernel Task Priority */
cfg = 11;
CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKPRIKERN,
            CFG_ADDMODE_UNIQUE, sizeof(uint), (uint8_t*)&cfg, 0);
/** Set NDK Kernel Task stack size */
cfg = 0x4000;
CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKSTKLOW,
            CFG_ADDMODE_UNIQUE, sizeof(uint), (uint8_t*)&cfg, 0);

I also checked the stack peak, and it is 1164 (no stack overflow).
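
For context on what a stack peak figure measures, here is a minimal sketch of a watermark check in plain C. It is not the SYS/BIOS implementation: the function names `stack_prime` and `stack_peak` are hypothetical, the 0xBE fill value is an assumption, and the stack is assumed to grow toward lower addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xBEu /* assumed fill pattern, not taken from the thread */

/* Fill the whole stack buffer with the pattern before the task runs. */
static void stack_prime(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL, size);
}

/*
 * Assuming the stack grows toward lower addresses, the deepest usage
 * ends at the lowest overwritten byte. Count the untouched bytes from
 * the low end and subtract them from the total size.
 */
static size_t stack_peak(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;

    assert(stack != NULL);
    while (untouched < size && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return size - untouched;
}
```

A check of this kind only detects a stack that grew past its limit contiguously. A stray write through a bad pointer elsewhere in memory would not show up in the peak value, so a low peak does not rule out memory corruption in general.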

When the exception occurs and I try to read the NDK status in the ROV, I get:

Target memory read failed at address: 0xffffffffffffffff, length: 4

This read is at an INVALID address according to the application's section map. The application is likely either uninitialized or corrupt.

How could I get more information, and what could cause the data abort on the main NDK task?

Any recommendation is appreciated.

Josep

  • Hello Josep,

    This problem is a bit difficult to debug. I have a few suggestions:

    1. The NDK you're using is quite an old version. Have you considered updating it to a newer version? The latest Processor SDK for AM335x is 6.3 (here), which has PDK 1.0.17 and NDK 3.61.1.1.

    2. You can add some debug messages to the NDK and rebuild it. Please refer to the documentation here for how to rebuild the NDK. You can use UART_printf() in the NDK to dump debugging messages to the UART console.

    Regards,

    Jianzhong

  • Hi Jianzhong,

    Following your suggestions, I updated the NDK to the latest version, 3.61.1.1, but the problem persists. Reviewing the changes I had made to the code, I found that the GPIO ISR priority had been changed from the default value of 2 to 35. The NDK uses a Hwi with priority 32, so the GPIO ISR can now be preempted (nested) by the NDK interrupt. As a workaround, I returned the GPIO ISR priority to 2, and the problem seems to be resolved.

    Could this situation cause some kind of data corruption that makes the NDK process hang with a data abort?

    I assume the Hwi module is managing the nesting process and storing all the context on the stack. Am I correct or should I add some kind of control in the Isr user callback?

    On the other hand, the depth limit for nested ISRs is determined by the stack size. I defined a Hwi/Swi system stack size of 65536 and an FIQ stack size of 1024. How do I know whether an interrupt is using nISR or FIQ?

    Regards, Josep
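
On the question of whether interrupt nesting can corrupt data in general: a dispatcher that saves register context on the stack does not by itself protect data that two interrupt levels share. The sketch below simulates that class of bug in plain C with a free list whose push is interrupted halfway. All names here (`FreeList`, `push_unprotected`, `push_protected`, `isr_push`) are hypothetical, and nothing in this thread establishes that the NDK or `PBM_free` behaves this way.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node { struct Node *next; } Node;
typedef struct { Node *head; } FreeList;
typedef void (*PreemptHook)(FreeList *list, void *arg);

/*
 * Unprotected push. The hook stands in for a higher-priority ISR that
 * fires after the head has been read but before it is written back.
 */
static void push_unprotected(FreeList *list, Node *n,
                             PreemptHook hook, void *arg)
{
    Node *old_head;

    assert(list != NULL && n != NULL);
    old_head = list->head;       /* step 1: read the current head   */
    if (hook != NULL) {
        hook(list, arg);         /* simulated nested interrupt      */
    }
    n->next = old_head;          /* step 2: link to the stale head  */
    list->head = n;              /* step 3: publish, losing the ISR's node */
}

/*
 * Protected push. On a target the update would run with interrupts
 * masked, so the pending interrupt can only run after it completes.
 */
static void push_protected(FreeList *list, Node *n,
                           PreemptHook hook, void *arg)
{
    assert(list != NULL && n != NULL);
    n->next = list->head;
    list->head = n;
    if (hook != NULL) {
        hook(list, arg);         /* interrupt serviced after the update */
    }
}

/* The simulated ISR pushes its own node onto the same list. */
static void isr_push(FreeList *list, void *arg)
{
    push_unprotected(list, (Node *)arg, NULL, NULL);
}

static size_t list_length(const FreeList *list)
{
    size_t count = 0;
    const Node *n;

    for (n = list->head; n != NULL; n = n->next) {
        count++;
    }
    return count;
}
```

Pushing one node with `push_unprotected` while `isr_push` adds a second leaves a single node on the list, whereas `push_protected` leaves both. In SYS/BIOS code, the masking that `push_protected` stands in for would typically be done with `Hwi_disable()` and `Hwi_restore()` around the shared update.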

  • Hi Josep,

    It seems to me you're using SYS/BIOS for thread managing. Is that right? If that's the case, how did you set the priority of the GPIO ISR and the NDK Hwi? SYS/BIOS doesn't maintain Hwi priorities. Please refer to SYS/BIOS User's Guide here, section 3.2.4 Thread Priorities.

    I assume the Hwi module is managing the nesting process and storing all the context on the stack. Am I correct or should I add some kind of control in the Isr user callback?

    Yes, this is correct if you use BIOS to create Hwi objects.

    How do I know if an interrupt is using either nISR or FIQ?

    For AM335x devices, FIQ is not used. Please refer to this thread.

    Regards,

    Jianzhong

  • Hi Jianzhongxu,

    Thanks for the reply.

    It seems to me you're using SYS/BIOS for thread managing. Is that right?

    Yes, we are using SYS/BIOS for thread managing.

    If that's the case, how did you set the priority of the GPIO ISR and the NDK Hwi?

    We are not changing the NDK Hwi priority; the value 20 is defined in the NIMU_ICSS_interruptInit function (nimu_icssEthDriver.c).

    To change the GPIO ISR priority we use the GPIO configuration structure:

    GPIO_v1_config

    setting the intPriority field to our desired value.

    For AM335x devices, FIQ is not used.

    OK, thanks for the clarification.

    Since FIQ is not used, my concern about the number of nested interrupts no longer applies, as it is practically impossible to overflow a Hwi stack of 65536.

    Regards,

    Josep Castro

  • Thanks for the clarification. Both the NIMU_ICSS and GPIO drivers are built on top of SYS/BIOS, which allows Hwi nesting, as documented in section 3.4.2, Hardware Interrupt Nesting and System Stack Size. Your system stack size of 65536 should be enough.

    I'm not sure what could cause the exception. I'll give it some more thought.
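
The stack-size reasoning in this thread can be written down as a simple budget: in the worst case, one frame is active for each interrupt priority level that can preempt the level below it, and the sum must fit in the system stack. The sketch below is only that arithmetic in C. The function names are hypothetical, the per-level frame sizes must come from measurement, and the sizes are assumed to be in the same units as the configured stack size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Worst-case stack usage when every nestable priority level is active
 * at once: the sum of the per-level frame sizes.
 */
static size_t worst_case_nested(const size_t *frame_sizes, size_t levels)
{
    size_t total = 0;
    size_t i;

    assert(frame_sizes != NULL || levels == 0);
    for (i = 0; i < levels; i++) {
        total += frame_sizes[i];
    }
    return total;
}

/* Returns nonzero when the worst case fits in the configured stack. */
static int fits_in_stack(const size_t *frame_sizes, size_t levels,
                         size_t stack_size)
{
    return worst_case_nested(frame_sizes, levels) <= stack_size;
}
```

With illustrative frame sizes of 512, 1024, and 2048 for three nesting levels, the worst case is 3584, far below a stack of 65536. Because the depth is bounded by the number of distinct priority levels in use, a stack of that size is hard to exhaust through nesting alone.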