DRA829V: U-Boot hangs on a DRA829 custom board

Part Number: DRA829V
Other Parts Discussed in Thread: DRA829

Hi,

On our custom board based on the DRA829 processor, U-Boot sometimes hangs with the messages below printed on the console. Could you please suggest what might cause this?

We are using PSDK Linux 7.3.

ti_sci_get_response: Message receive failed. ret = -110
RM_RA:Mbox config send fail -110
TISCI ring free fail (-110) ring_idx 96
ti_sci_get_response: Message receive failed. ret = -110
RM_PSIL:Mbox send fail -110
ti_sci_get_response: Message receive failed. ret = -110
RM_RA:Mbox config send fail -110
TISCI ring free fail (-110) ring_idx 50
ti_sci_get_response: Message receive failed. ret = -110
RM_RA:Mbox config send fail -110
TISCI ring free fail (-110) ring_idx 97

  • Divyanshu,


    Please provide the full logs starting from the U-Boot SPL stage. Also, which boot mode are you using?

    - Keerthy

  • Hi Keerthy,

    We observed this log in both UART boot mode and eMMC boot mode.

    The log below does not appear on every boot; it happens randomly.

    U-Boot 2020.01-00001-g2dbac40304-dirty (Aug 11 2021 - 13:17:26 +0530)

    SoC:   J721E SR1.0
    Model: Texas Instruments K3 J721E SoC
    DRAM:  1 GiB
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_power_domain_on: get_device(146) failed (-110)
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_power_domain_on: get_device(146) failed (-110)
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_power_domain_on: get_device(287) failed (-110)
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_power_domain_on: get_device(287) failed (-110)
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_power_domain_on: get_device(287) failed (-110)
    No serial driver found
    resetting ...
    ti_sci_get_response: Message receive failed. ret = -110
    Mbox send fail -110
    ti_sci_sysreset_request: reboot_device failed (-110)

    Sometimes we get the log below instead:

    U-Boot 2020.01-00002-g4805fecf28-dirty (Aug 20 2021 - 11:22:18 +0530)

    SoC:   J721E SR1.0
    Model: Texas Instruments K3 J721E SoC
    DRAM:  1 GiB
    "Synchronous Abort" handler, esr 0x8600000d
    elr: ffffffffc08ff000 lr : 0000000080855518 (reloc)
    elr: 0000000000000000 lr : 00000000bff56518
    x0 : 00000000bdee6150 x1 : 0000000000000092
    x2 : 0000000000000000 x3 : 0000000000000038
    x4 : 0000000000006794 x5 : 0000000000670000
    x6 : 0000000000950000 x7 : 000000000000117c
    x8 : 00000000bded67f8 x9 : 0000000000000008
    x10: 0000000000000002 x11: 0000000000001178
    x12: 00000000bded641c x13: 00000000000030e0
    x14: 0000000000000000 x15: 00000000bded67f8
    x16: 00000000bff564ec x17: 0000000000000000
    x18: 00000000bdee0df8 x19: 0000000000000000
    x20: 00000000bded66c8 x21: 0000000000000001
    x22: 0000000000000000 x23: 00000000bdee3c40
    x24: 00000000800aaaf3 x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bded6660

    Code: "Synchronous Abort" handler, esr 0x96000004
    elr: 0000000080802708 lr : 00000000808026e8 (reloc)
    elr: 00000000bff03708 lr : 00000000bff036e8
    x0 : 00000000bffa29e0 x1 : 0000000000000000
    x2 : 0000000000000020 x3 : 0000000002800000
    x4 : 00000000bded5fc0 x5 : 00000000bded5e58
    x6 : 0000000000000030 x7 : 000000000000000f
    x8 : 00000000bded64c8 x9 : 0000000000000008
    x10: 00000000ffffffe8 x11: 0000000000000010
    x12: 0000000000000006 x13: 000000000001869f
    x14: 0000000000000000 x15: 0000000000000021
    x16: 00000000bff564ec x17: 0000000000000000
    x18: 00000000bdee0df8 x19: fffffffffffffff0
    x20: 00000000bffab0b3 x21: 00000000fffffffc
    x22: 00000000bffac65b x23: 00000000bffa29e0
    x24: 00000000800aaaf3 x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bded6500

    Code: d1004273 91196ed6 aa0003f7 12800075 (b9400261)
    Resetting CPU ...
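
For reference, the two esr values in the dump above can be split into their top-level fields. This is a minimal sketch, assuming the standard Armv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]). The function name `decode_esr` and the lookup tables are illustrative and cover only the codes that appear in this thread.

```python
# Decode the top-level fields of an AArch64 ESR_ELx value.
# Assumed field layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# The tables below are deliberately partial: only codes seen in this thread.
EC_NAMES = {
    0x21: "Instruction Abort, same exception level",
    0x25: "Data Abort, same exception level",
}
FSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x0D: "Permission fault, level 1",
}


def decode_esr(esr):
    """Return a dict with the exception class and fault status code of esr."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    fsc = iss & 0x3F  # fault status code, valid for instruction/data aborts
    info = {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in table"),
        "il": il,
        "fsc": fsc,
        "fsc_name": FSC_NAMES.get(fsc, "not in table"),
    }
    if ec == 0x25:
        # WnR (bit 6 of the ISS) is only meaningful for data aborts.
        info["access"] = "write" if (iss >> 6) & 1 else "read"
    return info


if __name__ == "__main__":
    for esr in (0x8600000D, 0x96000004):
        print(hex(esr), decode_esr(esr))
```

With these tables, 0x8600000d decodes as an instruction abort with a level 1 permission fault, and 0x96000004 as a data abort with a level 0 translation fault on a read. An instruction abort with elr at 0 would be consistent with a branch through a null or corrupted pointer, although the dump alone does not prove that.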

  • Hi Keerthy,

    Do you have any suggestions on the above errors? We are still seeing these issues from time to time.

    Regards

    Divyanshu 

  • Hi Divyanshu,

    Requests to the DMSC are timing out (-110 is -ETIMEDOUT). As suggested,
    please connect JTAG and check the state of the DMSC (M3) and the A72. Also paste the log starting from before the first print:


    U-Boot 2020.01-00001-g2dbac40304-dirty (Aug 11 2021 - 13:17:26 +0530)

    SoC:   J721E SR1.0

    The above is the U-Boot print. I am sure there will be SPL prints before it. Please copy and paste the entire log.

    - Keerthy
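
The -110 values in the TISCI messages can be mapped to symbolic names when scanning longer logs. This is a rough sketch, assuming the Linux errno numbering (where 110 is ETIMEDOUT). The helper name `error_names` and the small table are illustrative, not part of U-Boot or any TI tool.

```python
import re

# Partial table of Linux errno values; extend as needed.
LINUX_ERRNO = {
    5: "EIO",
    16: "EBUSY",
    19: "ENODEV",
    22: "EINVAL",
    110: "ETIMEDOUT",
}

# A negative decimal number not glued to a preceding word character or '-',
# so version strings such as "2020.01-00001-g2dbac40304" are skipped.
NEG_CODE = re.compile(r"(?<![\w-])-(\d+)\b")


def error_names(log_text):
    """Return the symbolic name of every negative return code in log_text."""
    names = []
    for match in NEG_CODE.finditer(log_text):
        code = int(match.group(1))
        names.append(LINUX_ERRNO.get(code, "errno %d" % code))
    return names


if __name__ == "__main__":
    sample = (
        "ti_sci_get_response: Message receive failed. ret = -110\n"
        "RM_RA:Mbox config send fail -110\n"
        "TISCI ring free fail (-110) ring_idx 96\n"
    )
    print(error_names(sample))
```

Every code in the logs posted so far is -110, so each failing TISCI request is a timeout waiting for the DMSC response rather than an error returned by the firmware.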

  • Hi Keerthy,

    Please find the full log from UART boot mode below.

    U-Boot SPL 2020.01-00002-g4805fecf28-dirty (Sep 02 2021 - 10:08:44 +0530)
    SYSFW ABI: 3.1 (firmware rev 0x0015 '21.1.1--v2021.01a (Terrific Lla')
    Trying to boot from UART
    CCCCCCCCSOH)/0(STX)/0(CAN) packets, 10 retries
    Loaded 840444 bytes
    Loading Environment from MMC... *** Warning - No MMC card found, using default environment
    
    init_env from device 7 not supported!
    Starting ATF on ARM64 core...
    
    NOTICE: BL31: v2.4(release):07.03.00.005-dirty
    NOTICE: BL31: Built : 11:11:12, Aug 19 2021
    
    U-Boot SPL 2020.01-00002-g4805fecf28-dirty (Sep 02 2021 - 10:10:47 +0530)
    SYSFW ABI: 3.1 (firmware rev 0x0015 '21.1.1--v2021.01a (Terrific Lla')
    Trying to boot from UART
    CC)/0(CAN) packets, 4 retries
    Loaded 888876 bytes
    
    
    U-Boot 2020.01-00002-g4805fecf28-dirty (Sep 02 2021 - 10:10:47 +0530)
    
    SoC: J721E SR1.0
    Model: Texas Instruments K3 J721E SoC
    DRAM: 1 GiB
    "Synchronous Abort" handler, esr 0x8600000d
    elr: ffffffffc08ff000 lr : 000000008082749c (reloc)
    elr: 0000000000000000 lr : 00000000bff2849c
    x0 : 00000000bdee62a0 x1 : 0000000000000000
    x2 : 00000000bff28480 x3 : 0000000000000055
    x4 : 00000000bdee15d0 x5 : 0000000000000055
    x6 : 0000000000000031 x7 : 000000000000000f
    x8 : 00000000bde3b068 x9 : 0000000000000008
    x10: 00000000ffffffd0 x11: 0000000000000010
    x12: 0000000000000006 x13: 000000000001869f
    x14: 0000000000000000 x15: 0000000000000021
    x16: 00000000bff28480 x17: 0000000000000000
    x18: 00000000bdee0df8 x19: 00000000ffffffda
    x20: 0000000000000001 x21: 00000000fffffffc
    x22: 00000000bffac675 x23: 00000000bffa2a3e
    x24: 00000000800aa8fb x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bde3b040
    
    Code: "Synchronous Abort" handler, esr 0x96000004
    elr: 0000000080802708 lr : 00000000808026e8 (reloc)
    elr: 00000000bff03708 lr : 00000000bff036e8
    x0 : 00000000bffa2a3e x1 : 0000000000000000
    x2 : 0000000000000020 x3 : 0000000002800000
    x4 : 00000000bde3a9a0 x5 : 00000000bde3a838
    x6 : 0000000000000030 x7 : 000000000000000f
    x8 : 00000000bde3aea8 x9 : 0000000000000008
    x10: 00000000ffffffe8 x11: 0000000000000010
    x12: 0000000000000006 x13: 000000000001869f
    x14: 0000000000000000 x15: 0000000000000021
    x16: 00000000bff28480 x17: 0000000000000000
    x18: 00000000bdee0df8 x19: fffffffffffffff0
    x20: 00000000bffab111 x21: 00000000fffffffc
    x22: 00000000bffac675 x23: 00000000bffa2a3e
    x24: 00000000800aa8fb x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bde3aee0
    
    Code: d1004273 9119d6d6 aa0003f7 12800075 (b9400261)
    Resetting CPU ...
    
    resetting ...
    "Synchronous Abort" handler, esr 0x8600000d
    elr: ffffffffc08ff000 lr : 000000008082749c (reloc)
    elr: 0000000000000000 lr : 00000000bff2849c
    x0 : 00000000bdee62a0 x1 : 0000000000000000
    x2 : 00000000bff28480 x3 : 0000000000000055
    x4 : 00000000bdee15d0 x5 : 0000000000000055
    x6 : 0000000000000031 x7 : 000000000000000f
    x8 : 00000000bde3ac58 x9 : 0000000000000008
    x10: 00000000ffffffd0 x11: 0000000000000010
    x12: 0000000000000006 x13: 000000000001869f
    x14: 0000000000000000 x15: 0000000000000021
    x16: 00000000bff28480 x17: 0000000000000000
    x18: 00000000bdee0df8 x19: 00000000ffffffda
    x20: 0000000000000001 x21: 00000000fffffffc
    x22: 00000000bffac675 x23: 00000000bffa2a3e
    x24: 00000000800aa8fb x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bde3ac30
    
    Code: "Synchronous Abort" handler, esr 0x96000004
    elr: 0000000080802708 lr : 00000000808026e8 (reloc)
    elr: 00000000bff03708 lr : 00000000bff036e8
    x0 : 00000000bffa2a3e x1 : 0000000000000000
    x2 : 0000000000000020 x3 : 0000000002800000
    x4 : 00000000bde3a590 x5 : 00000000bde3a428
    x6 : 0000000000000030 x7 : 000000000000000f
    x8 : 00000000bde3aa98 x9 : 0000000000000008
    x10: 00000000ffffffe8 x11: 0000000000000010
    x12: 0000000000000006 x13: 000000000001869f
    x14: 0000000000000000 x15: 0000000000000021
    x16: 00000000bff28480 x17: 0000000000000000
    x18: 00000000bdee0df8 x19: fffffffffffffff0
    x20: 00000000bffab111 x21: 00000000fffffffc
    x22: 00000000bffac675 x23: 00000000bffa2a3e
    x24: 00000000800aa8fb x25: 00000000deadbeef
    x26: 0000000000000005 x27: 0000000000000000
    x28: 0000000000000000 x29: 00000000bde3aad0
    
    

    Below is the program counter after the hang.

    Regards

    Divyanshu
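
Each abort dump prints elr and lr twice: the line marked (reloc) has the relocation offset subtracted, so it can be looked up against the symbols of the U-Boot ELF, while the following line is the raw runtime address. A small sketch of that arithmetic, using the values from the log above. The helper names are illustrative.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide, so wrap like the CPU does


def reloc_offset(runtime_addr, link_addr):
    """Offset between where U-Boot runs after relocation and where it was linked."""
    return (runtime_addr - link_addr) & MASK64


def to_link_addr(runtime_addr, offset):
    """Convert a runtime address back to the link-time address used by the ELF."""
    return (runtime_addr - offset) & MASK64


if __name__ == "__main__":
    # Second abort in the log: elr 0x80802708 (reloc) / 0xbff03708 (runtime).
    off = reloc_offset(0xBFF03708, 0x80802708)
    print(hex(off))
    # First abort: runtime elr is 0, which wraps to the printed (reloc) value.
    print(hex(to_link_addr(0x0, off)))
```

The offset works out to 0x3f701000 for both elr/lr pairs, and a runtime elr of 0 minus that offset wraps to 0xffffffffc08ff000, which matches the first dump. The (reloc) lr value, for example 0x8082749c, is the one to look up in the U-Boot symbol table, assuming the ELF matches the binary that was booted.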

  • Divyanshu,

    I believe the offline DDR analysis is helping to reduce how often the issue occurs. Please continue with that.
    Since it is helping, this could be tied to DDR stability on your board. Closing this thread. If need be, please
    feel free to respond here.

    - Keerthy