PROCESSOR-SDK-DRA8X-TDA4X: VPAC LDC Crash Issue on Custom Board

Part Number: PROCESSOR-SDK-DRA8X-TDA4X

Hello,

We have developed an application that generates a top-view image from the input image using the VPAC LDC HWA. The application has been tested and works fine on the EVM board.

Our custom board has two TDA4x SoCs. When we run this application on one TDA4x it crashes; however, the same application works on the other TDA4x.

Note that we use the QNX-based PSDK on the custom board and Linux on the EVM. We have tested other modules (Vision Algo on C66, DL algo on C7x, DOF, Pyramid on HWA) on both TDA4x devices with QNX, and they execute without any issues. We have used a similar application for fisheye correction, and it works well on the custom board. The only differences are the LUT and the image resolution; all other config parameters are the same.

Specifically, the issue is in the LDC application. To investigate further, we used a debugger to step through the LDC kernel operation.

We observed that LDC had finished its operation and that the output was generated after the ProcessRequest call; we verified this using the debugger and a memory dump.

Stepping further, we found that it crashed in the tivxEventWait call.

Debugging further, we traced the failure to the SemaphoreP_pend() call inside tivxEventWait() in the file tivx_event.c. We also received the crash log below.
[MAIN_Cortex_R5_0_1] [UDMA] [Error] Sciclient event config failed!!!
[UDMA] [Error] Event config failed!!
[UDMA] [Error] Global master event register failed!!!
[Error] UDMA init failed!!
Exception occurred in ThreadType_Hwi.
Hwi handle: 0x0.
Hwi stack base: 0xa3c805e8.
Hwi stack size: 0x2000.
R0 = 0xe59ff018  R8  = 0xa3c805a2
R1 = 0x00000004  R9  = 0xa3c802d4
R2 = 0xffffffff  R10 = 0xa3c802d8
R3 = 0xa3c82580  R11 = 0x0ff80000
R4 = 0xa36bb098  R12 = 0xffffffff
R5 = 0x00000023  SP(R13) = 0xa3c82560
R6 = 0x00000001  LR(R14) = 0xa3841000
R7 = 0x00000000  PC(R15) = 0xfffffffe
PSR = 0xa000019f
DFSR = 0x00000000  IFSR = 0x00000000
DFAR = 0x00000000  IFAR = 0x00000000
ti.sysbios.family.arm.exc.Exception: line 209: E_undefinedInstruction: pc = 0xfffffffe, lr = 0xa3841000.
xdc.runtime.Error.raise: terminating execution.
We also ran an experiment with the input and output buffers placed in the internal MSMC section, and the same behaviour was observed, which suggests the issue might not be related to DDR access.
Please let us know what could cause this behaviour.
Regards,
Swapnil Nagare
  • Hi Swapnil,

    As we discussed on the call, let's first analyze the lookup table and see if there is any issue in the LUT, because all the debugging so far points to something related to the LUT.

    Rgds,

    Brijesh

  • Hi Swapnil,

    Can we try disabling all error interrupts in the LDC driver? I think that with this LDC LUT some error event is getting generated, and it seems to be crashing in the error ISR.

    To disable the error interrupts, can you disable the code from line 145 to line 160 in the file pdk\packages\ti\drv\vhwa\src\drv\vhwa_m2mLdcIntr.c, rebuild the PDK and SDK, and try again?

    Rgds,

    Brijesh

  • Hi Brijesh,

    As per your suggestion, we made the changes and tested the application.

    The application still crashes at the same point, but the crash log now indicates a dataAbort exception; previously an undefined instruction was reported.

     

    [MAIN_Cortex_R5_0_1] [UDMA] [Error] Sciclient event config failed!!!
    [UDMA] [Error] Event config failed!!
    [UDMA] [Error] Global master event register failed!!!
    [Error] UDMA init failed!!
    Exception occurred in ThreadType_Hwi.
    Hwi handle: 0x0.
    Hwi stack base: 0xa3cac5e8.
    Hwi stack size: 0x2000.
    R0 = 0xe59ff018  R8  = 0xa3cac5a2
    R1 = 0x00000004  R9  = 0xa3cac2d4
    R2 = 0x09900c00  R10 = 0xa3cac2d8
    R3 = 0xa3cae580  R11 = 0x0ff80000
    R4 = 0xa36e6c98  R12 = 0xa7842958
    R5 = 0x00000023  SP(R13) = 0xa3cae560
    R6 = 0x00000001  LR(R14) = 0xa386cdc0
    R7 = 0x00000000  PC(R15) = 0xa3861f98
    PSR = 0xa000019f
    DFSR = 0x00000801  IFSR = 0x00000000
    DFAR = 0x404080cd  IFAR = 0x00000000
    ti.sysbios.family.arm.exc.Exception: line 205: E_dataAbort: pc = 0xa3861f98, lr = 0xa386cdc0.
    xdc.runtime.Error.raise: terminating execution.

    One more observation: after making these changes, the application seems to work sometimes.

    We are not sure about this behavior, though.

    Regards,

    Swapnil 
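
The DFSR and DFAR values in the log above can be decoded to narrow down what kind of abort occurred. The following is a minimal sketch, assuming the ARMv7-R (PMSA) fault status layout as commonly documented: status bits in [3:0] plus bit 10, write-not-read in bit 11, and external-abort type in bit 12. The type and function names are hypothetical, and the encoding table is partial; it should be checked against the Cortex-R5 technical reference manual before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a DFSR/IFSR value (field layout is an assumption,
 * see the note above). */
typedef struct {
    uint32_t status;   /* 5-bit fault status: bit 10 and bits 3:0 */
    bool     is_write; /* bit 11 (WnR); only meaningful for DFSR */
    bool     external; /* bit 12 (ExT) */
} FaultStatus;

static FaultStatus decode_fsr(uint32_t fsr)
{
    FaultStatus fs;
    fs.status   = (((fsr >> 10) & 0x1u) << 4) | (fsr & 0xFu);
    fs.is_write = ((fsr >> 11) & 0x1u) != 0u;
    fs.external = ((fsr >> 12) & 0x1u) != 0u;
    return fs;
}

/* Partial name table; unlisted encodings fall through to the default. */
static const char *fault_name(uint32_t status)
{
    assert(status <= 0x1Fu); /* status is a 5-bit field */
    switch (status) {
    case 0x00u: return "background fault";
    case 0x01u: return "alignment fault";
    case 0x02u: return "debug event";
    case 0x08u: return "synchronous external abort";
    case 0x0Du: return "permission fault";
    case 0x16u: return "asynchronous external abort";
    default:    return "other/unlisted";
    }
}
```

Under these assumptions, DFSR = 0x00000801 decodes to an alignment fault on a write, which would be consistent with the odd address in DFAR (0x404080cd). The same decoder can be applied to IFSR values from prefetch aborts. This is an interpretation of the log, not a confirmed diagnosis.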

  • Hi Swapnil,

    OK, at least it progresses.

    Can you check what is located at addresses 0xa3861f98 and 0xa386cdc0? There is clearly some corruption at these locations.

    Also, can you again try putting a breakpoint in the LDC ISR and see if it is getting hit?

    Rgds,

    Brijesh
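
Checking what lives at a given PC or LR value usually means looking the address up in the linker map file. As a rough sketch of that lookup, the code below binary-searches a table of symbol start addresses sorted in ascending order. The `MapSym` type, the `map_lookup` function, and the `region_end` parameter are hypothetical; in practice the table would be filled from the map file of the R5F firmware image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry from a linker map: symbol start address and name. */
typedef struct {
    uint32_t    start;
    const char *name;
} MapSym;

/* Returns the index of the last symbol whose start is <= addr, or -1 if
 * addr lies below the first symbol or at/after region_end (the end of
 * the code region, since the last symbol's size is not known here).
 * syms must be sorted by ascending start address. */
static long map_lookup(const MapSym *syms, size_t n,
                       uint32_t region_end, uint32_t addr)
{
    assert(syms != NULL || n == 0u);
    if (addr >= region_end) {
        return -1;
    }
    size_t lo = 0u;
    size_t hi = n; /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2u;
        if (syms[mid].start <= addr) {
            lo = mid + 1u;
        } else {
            hi = mid;
        }
    }
    /* lo is now the count of symbols with start <= addr. */
    return (lo == 0u) ? -1 : (long)(lo - 1u);
}
```

An address that returns -1 does not belong to any known function in the region, which by itself is a strong hint that the program counter was loaded from corrupted data.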

  • Hi Brijesh,

    I checked the addresses in memory as you suggested, and I have attached a snapshot for your reference.

    I tried to debug further inside the Semaphore_pend function and located the exact function where it crashes.
    It crashes in the Task_restore(tskKey) call inside the Semaphore_pend function in sysbios/knl/Semaphore.c.


    I also observed that the crash log is sometimes different: it crashes in the SemaphoreP_pend function in
    osal/src/tirtos/SemaphoreP_tirtos.c at SEMOSAL_Assert((handle == NULL_PTR)), and the exception is reported as Prefetch_DataAbort.

    So the crash changes from run to run, and sometimes the application runs successfully.

    It is difficult to narrow down the root cause, but it looks like some memory corruption.


    I will look further to find more information.

    Regards,
    Swapnil 

  • Hi Swapnil,

    Does this mean it at least reaches the LDC frame completion ISR? From the explanation above, it seems that the semaphore is getting corrupted but the ISR is still arriving. Could you please confirm?

    From the completion ISR, can you single-step up to the LDC node callback and check whether any memory pointed to by the semaphore is changing?

    Regards,

    Brijesh
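
One way to check whether the semaphore memory changes between two points, when a hardware watchpoint is not convenient, is to snapshot the bytes and compare them later. The following is a minimal sketch of that idea. `MemWatch`, `memwatch_arm`, `memwatch_check`, and the 64-byte limit are hypothetical names and values for illustration, not part of TIOVX or the PDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WATCH_MAX_BYTES 64u /* hypothetical upper bound on watched size */

typedef struct {
    const volatile uint8_t *addr;       /* start of the watched region */
    size_t                  len;        /* number of bytes watched */
    uint8_t                 snapshot[WATCH_MAX_BYTES];
} MemWatch;

/* Record the current contents of [addr, addr + len). */
static void memwatch_arm(MemWatch *w, const void *addr, size_t len)
{
    assert(w != NULL && addr != NULL && len <= WATCH_MAX_BYTES);
    w->addr = (const volatile uint8_t *)addr;
    w->len  = len;
    for (size_t i = 0u; i < len; i++) {
        w->snapshot[i] = w->addr[i];
    }
}

/* Returns the offset of the first byte that differs from the snapshot,
 * or -1 if the region is unchanged. */
static long memwatch_check(const MemWatch *w)
{
    assert(w != NULL);
    for (size_t i = 0u; i < w->len; i++) {
        if (w->addr[i] != w->snapshot[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

Arming the watch right after the semaphore is created and calling the check at each step on the way to the node callback would show roughly where the contents first change. Fields that the RTOS legitimately updates (such as the count) will also show up as changes, so the offset needs to be interpreted against the structure layout.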

  • Hi Brijesh,

    I checked the ISR, but execution does not reach it.

    Yes, I am going to check the semaphore memory address and when it is getting changed.

    Regards,
    Swapnil

  • Hi Swapnil,

    Maybe the semaphore is already corrupted, and that is why you never see the ISR.

    Can you put a while (1) after submitting the request to the driver (Fvid2_submitRequest), before the semaphore wait is called? If it is just semaphore corruption, the ISR should still arrive and the node callback should then get called. Can you please check this?

    Regards,

    Brijesh 

  • Hi Brijesh,

    Sorry for the late response.

    We tried adding a while loop after Fvid2_submitRequest and tested.

    It crashed again after the while loop had executed for some iterations. We added prints inside the while loop, and it seems the loop executed 7 times before crashing.

    Below is the crash log.

    [MAIN_Cortex_R5_0_1] [UDMA] [Error] Sciclient event config failed!!!

    [UDMA] [Error] Event config failed!!
    [UDMA] [Error] Global master event register failed!!!
    [Error] UDMA init failed!!
    Exception occurred in ThreadType_Hwi.
    Hwi handle: 0x0.
    Hwi stack base: 0xa3c825e8.
    Hwi stack size: 0x2000.
    R0 = 0xe59ff018 R8 = 0xa3c825a2
    R1 = 0x00000004 R9 = 0xa3c822d4
    R2 = 0x09100c00 R10 = 0xa3c822d8
    R3 = 0xa3c84580 R11 = 0x0ff80000
    R4 = 0xa36bc098 R12 = 0x03800040
    R5 = 0x00000023 SP(R13) = 0xa3c84560
    R6 = 0x00000001 LR(R14) = 0xa3841a40
    R7 = 0x00000000 PC(R15) = 0x03800040
    PSR = 0x2000019f
    DFSR = 0x00000000 IFSR = 0x0000000d
    DFAR = 0x00000000 IFAR = 0x03800040
    ti.sysbios.family.arm.exc.Exception: line 201: E_prefetchAbort: pc = 0x03800040, lr = 0xa3841a40.
    xdc.runtime.Error.raise: terminating execution

    I have added the kernel c file for your reference.

    Regards,

    Swapnil N

    vx_vpac_ldc_target_while_loop.c
    /*
    *
    * Copyright (c) 2017-2019 Texas Instruments Incorporated
    *
    * All rights reserved not granted herein.
    *
    * Limited License.
    *
    * Texas Instruments Incorporated grants a world-wide, royalty-free, non-exclusive
    * license under copyrights and patents it now or hereafter owns or controls to make,
    * have made, use, import, offer to sell and sell ("Utilize") this software subject to the
    * terms herein. With respect to the foregoing patent license, such license is granted
    * solely to the extent that any such patent is necessary to Utilize the software alone.
    * The patent license shall not apply to any combinations which include this software,
    * other than combinations with devices manufactured by or for TI ("TI Devices").
    * No hardware patent is licensed hereunder.
    *
    * Redistributions must preserve existing copyright notices and reproduce this license
    * (including the above copyright notice and the disclaimer and (if applicable) source
    * code license limitations below) in the documentation and/or other materials provided
    * with the distribution
    *
    * Redistribution and use in binary form, without modification, are permitted provided
    * that the following conditions are met:
    *
    * * No reverse engineering, decompilation, or disassembly of this software is
    * permitted with respect to any software provided in binary form.
    *
    * * any redistribution and use are licensed by TI for use only with TI Devices.
    *
    * * Nothing shall obligate TI to provide you with source code for the software
    * licensed and provided to you in object code.
    *
    * If software source code is provided to you, modification and redistribution of the
    * source code are permitted provided that the following conditions are met:
    *
    * * any redistribution and use of the source code, including any resulting derivative
    * works, are licensed by TI for use only with TI Devices.
    *
    * * any redistribution and use of any object code compiled from the source code
    * and any resulting derivative works, are licensed by TI for use only with TI Devices.
    *
    * Neither the name of Texas Instruments Incorporated nor the names of its suppliers
    *
    * may be used to endorse or promote products derived from this software without
    * specific prior written permission.
    *
    * DISCLAIMER.
    *
    * THIS SOFTWARE IS PROVIDED BY TI AND TI'S LICENSORS "AS IS" AND ANY EXPRESS
    * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
    * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
    * IN NO EVENT SHALL TI AND TI'S LICENSORS BE LIABLE FOR ANY DIRECT, INDIRECT,
    * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
    * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
    * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
    * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
    * OF THE POSSIBILITY OF SUCH DAMAGE.
    *
    */
    /* ========================================================================== */
    /* Include Files */
    /* ========================================================================== */
    #include "TI/tivx.h"
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

  • Hi Swapnil,

    Does this mean it works fine for 6 iterations? Do you get matching output from LDC for all 6 iterations?

    It still looks like memory corruption: the PC is pointing to location 0x03800040, which does not look correct. Can you check which function is located at memory address 0xa3841a40?

    Also, can you check why you get the errors below? It looks like UDMA itself is not initialized. Can you put a breakpoint on init and see why it fails?

    [MAIN_Cortex_R5_0_1] [UDMA] [Error] Sciclient event config failed!!!

    [UDMA] [Error] Event config failed!!
    [UDMA] [Error] Global master event register failed!!!
    [Error] UDMA init failed!!

     

    Rgds,

    Brijesh

  • Hi Brijesh,

    No, it did not run 7 times.

    We had added prints inside the while loop; the prints appeared 7 times and then it crashed. It never got out of the while loop before the crash.

    I will look into the UDMA init function, but the crash does not seem to be due to UDMA, as we have seen previously that the output gets generated correctly.

    Regards,

    Swapnil N

  • Swapnil,

    The DMA engine that LDC uses is also handled by the UDMA driver, so could you please check why udma_init is failing?

    Also, why is there a for loop only around the submit request? Does the frame-complete interrupt for a single frame arrive correctly?

    Rgds,

    Brijesh 

  • Hi Brijesh,

    Correct, LDC uses the UDMA engine. However, we have seen previously that even though the kernel crashes, it is still able to generate output; we confirmed this last time by checking memory in CCS.

    I will verify anyway why UDMA init is failing.

    I also checked, and execution never reaches the completion ISR.

    Regards,

    Swapnil N

  • Brijesh,

    Can we update the thread with the final resolution?

    Regards

    Karthik

  • Sure,

    We found a bug in the scaler driver: it was registering the wrong interrupt, namely the interrupt for the LDC module. As a result, when a request for the LDC module was submitted, the ISR was triggered in the scaler driver. Since the scaler driver was not opened, its pointers were not valid, which caused the crash. The issue is fixed by changing the interrupt registration in the scaler driver.

    Rgds,

    Brijesh
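
The failure mode described in the resolution (one driver registering an interrupt that belongs to another module, and its ISR then running against state that was never opened) can be guarded against in two places: at registration and at dispatch. The following is a minimal host-side sketch of both guards. It is not the actual PDK fix, which per the reply above was a change to the interrupt registration itself. All names (`DrvInstance`, `irq_register`, `irq_dispatch`) and the interrupt numbers used with them are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQ_SLOTS 8u

/* Hypothetical per-driver state touched by the ISR. */
typedef struct {
    const char *name;
    bool        is_open;     /* set when the driver instance is opened */
    uint32_t    frames_done; /* updated by the completion ISR */
} DrvInstance;

typedef struct {
    uint32_t     irq_num;
    DrvInstance *drv;
} IrqSlot;

static IrqSlot g_slots[MAX_IRQ_SLOTS];
static size_t  g_num_slots = 0u;

/* Guard 1: refuse to register an interrupt number that another driver
 * already owns. Returns false on a conflict or when the table is full. */
static bool irq_register(uint32_t irq_num, DrvInstance *drv)
{
    assert(drv != NULL);
    for (size_t i = 0u; i < g_num_slots; i++) {
        if (g_slots[i].irq_num == irq_num) {
            return false;
        }
    }
    if (g_num_slots >= MAX_IRQ_SLOTS) {
        return false;
    }
    g_slots[g_num_slots].irq_num = irq_num;
    g_slots[g_num_slots].drv     = drv;
    g_num_slots++;
    return true;
}

/* Guard 2: ignore an interrupt whose owning driver is not open instead
 * of touching its state. Returns true only if the event was handled. */
static bool irq_dispatch(uint32_t irq_num)
{
    for (size_t i = 0u; i < g_num_slots; i++) {
        if (g_slots[i].irq_num == irq_num) {
            DrvInstance *drv = g_slots[i].drv;
            if (!drv->is_open) {
                return false;
            }
            drv->frames_done++;
            return true;
        }
    }
    return false;
}
```

With a registration check like this, a second driver claiming an already-owned interrupt number would fail at init time rather than surfacing later as a crash at a varying location.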