AM62A7: Custom board: SDK 09.01: Poor object detection performance

Part Number: AM62A7
Other Parts Discussed in Thread: SK-AM62A-LP

After getting the AI demos to run, I happened to notice they aren't running at anywhere near full speed.  How can I debug why the 2nd bar graph (what core?) is at 100%?

Image from our devkit:

On the EVM running 09.01 :
Image Classification: fps 2-3, temp 42, nearly all cores idle. Demo looks smooth, slideshow
Object Detection: fps 30 , temp 43, cores 30-50% used. Demo looks smooth. DDR RD 2700MB/s, DDR WR 401MB/s, DDR Total 3170MB/s
Semantic Segmentation: fps 18 , temp 44, cores 20-50% used. Demo looks smooth though HDMI output appears to be glitching
Multichannel: fps 25, temp 45. Demo looks smooth

On the AM62A devkit 09.01 :
Image Classification: fps 2-3, temp 46, nearly all cores idle. Demo looks smooth, slideshow
Object Detection: fps 3 , temp 46, c7x_1 94% (others 2%). Demo looks laggy. DDR RD 800MB/s, DDR WR 33MB/s, DDR Total 830MB/s
Semantic Segmentation: fps 2 , temp 47, c7x_1 97% (others 2%). Demo looks laggy
Multichannel: fps 1, temp 37. Demo looks laggy

Note: Not sure why the demo is getting cut off; this happens on both the EVM and our board.  It seems to depend on the monitor, though both report 1920x1080

  • Hi Jonathan,

    As I understand, this is a follow-on from this thread: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1341665/sk-am62a-lp-manually-compile-u-boot-for-evm (mostly for anyone else following the topic).

    The behavior you're seeing is odd. As you note, the C7x is clearly being utilized, so the networks are being offloaded to the remote core (i.e., the CNNs are not running on the Arm A53). It seems the TIDL/C7x component of the pipeline is running very slowly, such that the entire pipeline runs at a few FPS while the other cores sit at <5%.

    I'd like to collect more diagnostic information on your SDK+board, but first want to clarify which boards are being used with which SDK:

    On the EVM running 09.01 :
    • As I understand, this is the SDK you built, running on the TI SK-AM62A-LP board? or is it the SDK from TI on this board?
    On the AM62A devkit 09.01 :
    • And this is your board, running the SDK you built?

    Diagnostics to collect:

    • Run `k3conf dump clock`
      • or `k3conf dump clock 211` for only C7x clock
      • I want to see what frequency the C7x clock is running at. It could be a very low frequency (w/o PLL, 25 MHz, as I recall) whereas it should be either 850 MHz or 1000 MHz, dependent on 0.75 V or 0.85 V for VDDCORE, respectively
    • As you run your application, have a terminal open and running the /opt/vx_remote_app_arm.out binary to print more OpenVX messages. I am assuming this binary made it into your filesystem. I typically run this in the background of the same terminal I launch commands from
      • if you launch from CLI, also `export TIDL_RT_DEBUG=1` for more TIDL logging
    • Instead of launching from the GUI, let's try from /opt/edgeai-gst-apps so we can see more logging info.
      • You can run the following command; change the YAML file target based on which type of network to run. You may need to modify the inputs to use a stored video file.
      • /opt/edgeai-gst-apps/apps_python/app_edgeai.py /opt/edgeai-gst-apps/configs/object_detection.yaml -n
        
        • the -n tag disables ncurses output, which can hide some logging messages
        • Note that image_classification.yaml is set to run off image files at 1 fps.
    • The GUI-based performance bars use the same basis as a CLI tool, /opt/edgeai-gst-apps/scripts/perf_stats, which has instructions to build and run. I don't expect this to yield new information, but wanted to mention it as an option in case it aids your workflow :)
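
To make the clock check above repeatable across boots, the output of `k3conf dump clock` can be parsed by a small script. The following is only a sketch: it assumes the pipe-delimited row layout (device ID, clock ID, clock name, state, frequency in Hz) that k3conf prints, and the function names `parse_clock_dump` and `check_c7x` are made up for illustration. The expected rates are the 850 MHz and 1000 MHz values mentioned above.

```python
import re

# Assumed row layout of `k3conf dump clock`:
# | <device id> | <clock id> | <clock name> | <state> | <frequency in Hz> |
ROW = re.compile(
    r"^\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)\s*\|\s*(\S+)\s*\|\s*(\d+)\s*\|\s*$"
)


def parse_clock_dump(text):
    """Return {(device_id, clock_id): (name, state, freq_hz)} for every table row."""
    clocks = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            dev, clk, name, state, freq = match.groups()
            clocks[(int(dev), int(clk))] = (name, state, int(freq))
    return clocks


def check_c7x(clocks, expected_hz=(850_000_000, 1_000_000_000)):
    """List C7x clocks whose reported rate is not one of the expected values."""
    problems = []
    for (dev, clk), (name, _state, freq) in sorted(clocks.items()):
        if "C7X" in name and freq not in expected_hz:
            problems.append(f"{name} (dev {dev}, clk {clk}): {freq} Hz")
    return problems
```

On the target, the text passed to `parse_clock_dump` would be the captured output of `k3conf dump clock`; anything returned by `check_c7x` is worth including in the reply.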

    I've seen the same thing with the GUI cutting off part of the feed on the edges. It's running inside a Qt application which may be handling sizes strangely. Running from CLI should circumvent this, but we should also remedy this on our side since it's hiding important perf information.

    Please respond with some of the logging produced by the diagnostics recommended above.

    BR,
    Reese

  • I tested a 2nd 62A SOM with the same baseboard and SD card and it's getting the correct performance.

    The slower module is reporting only 1.25 GHz on the ARM and a 0 Hz DSP clock? How could they have different DSP clock settings with the same software? We are applying the same DSP 1 GHz settings on both.  At least with the ARM, the opp table has "opp-supported-hw" bits which could disable a higher clock speed on me...

    $ diff -u0 /tmp/works.txt /tmp/slow.txt 
    --- /tmp/works.txt	2024-05-06 16:56:55.643191313 -0400
    +++ /tmp/slow.txt	2024-05-06 16:58:30.095559105 -0400
    @@ -18,4 +18,4 @@
    -|   135     |     0    | DEV_A53SS0_CORE_0_A53_CORE0_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1400000000      |
    -|   136     |     0    | DEV_A53SS0_CORE_1_A53_CORE1_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1400000000      |
    -|   137     |     0    | DEV_A53SS0_CORE_2_A53_CORE2_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1400000000      |
    -|   138     |     0    | DEV_A53SS0_CORE_3_A53_CORE3_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1400000000      |
    +|   135     |     0    | DEV_A53SS0_CORE_0_A53_CORE0_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1250000000      |
    +|   136     |     0    | DEV_A53SS0_CORE_1_A53_CORE1_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1250000000      |
    +|   137     |     0    | DEV_A53SS0_CORE_2_A53_CORE2_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1250000000      |
    +|   138     |     0    | DEV_A53SS0_CORE_3_A53_CORE3_ARM_CLK_CLK                                                              | CLK_STATE_READY     | 1250000000      |
    @@ -148,2 +148,2 @@
    -|   208     |     0    | DEV_C7X256V0_C7XV_CORE_0_C7XV_CLK                                                                    | CLK_STATE_READY     | 1000000000      |
    -|   211     |     0    | DEV_C7X256V0_CLK_C7XV_CLK                                                                            | CLK_STATE_READY     | 1000000000      |
    +|   208     |     0    | DEV_C7X256V0_C7XV_CORE_0_C7XV_CLK                                                                    | CLK_STATE_READY     | 0               |
    +|   211     |     0    | DEV_C7X256V0_CLK_C7XV_CLK                                                                            | CLK_STATE_READY     | 0               |
    @@ -169,4 +169,4 @@
    -|   204     |     0    | DEV_CODEC0_VPU_ACLK_CLK                                                                              | CLK_STATE_READY     | 100000000       |
    -|   204     |     1    | DEV_CODEC0_VPU_BCLK_CLK                                                                              | CLK_STATE_READY     | 100000000       |
    -|   204     |     2    | DEV_CODEC0_VPU_CCLK_CLK                                                                              | CLK_STATE_READY     | 100000000       |
    -|   204     |     3    | DEV_CODEC0_VPU_PCLK_CLK                                                                              | CLK_STATE_READY     | 100000000       |
    +|   204     |     0    | DEV_CODEC0_VPU_ACLK_CLK                                                                              | CLK_STATE_READY     | 400000000       |
    +|   204     |     1    | DEV_CODEC0_VPU_BCLK_CLK                                                                              | CLK_STATE_READY     | 400000000       |
    +|   204     |     2    | DEV_CODEC0_VPU_CCLK_CLK                                                                              | CLK_STATE_READY     | 400000000       |
    +|   204     |     3    | DEV_CODEC0_VPU_PCLK_CLK                                                                              | CLK_STATE_READY     | 400000000       |
    @@ -278 +278 @@
    -|    21     |     4    | DEV_DCC5_DCC_CLKSRC4_CLK                                                                             | CLK_STATE_READY     | 100000000       |
    +|    21     |     4    | DEV_DCC5_DCC_CLKSRC4_CLK                                                                             | CLK_STATE_READY     | 400000000       |
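
A `diff -u0` of two clock dumps like the one above can be condensed into a list of clocks whose rate changed. This is a rough sketch, not part of any TI tool: the helper name `changed_clocks` is hypothetical, and it assumes the removed (`-`) and added (`+`) rows keep the pipe-delimited layout shown in the diff.

```python
import re

# A changed row in `diff -u0` output: "-" or "+" followed by a k3conf table row.
ROW = re.compile(
    r"^([+-])\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)\s*\|\s*\S+\s*\|\s*(\d+)\s*\|"
)


def changed_clocks(diff_text):
    """Return [(clock_name, old_hz, new_hz)] for clocks whose rate differs."""
    old, new, names = {}, {}, {}
    for line in diff_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skips the "---"/"+++" file headers and "@@" hunk markers
        sign, dev, clk, name, freq = match.groups()
        key = (int(dev), int(clk))
        names[key] = name
        (old if sign == "-" else new)[key] = int(freq)
    return [
        (names[key], old[key], new[key])
        for key in sorted(old)
        if key in new and old[key] != new[key]
    ]
```

Each tuple gives the clock name with the rate from the first file and the rate from the second, which makes a 1000000000 to 0 change on the C7x clocks easy to spot.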

    MMR0_JTAG_USER_ID for slow module

    root@mitysom-am62ax:/opt/edgeai-gst-apps# devmem2 0x43000018
    Read at address  0x43000018 (0xffff80a4b018): 0x4A7DB52E

    and for fast module

    root@mitysom-am62ax:/opt/edgeai-gst-apps# devmem2 0x43000018
    Read at address  0x43000018 (0xffff83ec1018): 0x4A7DB566
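
To see exactly where the two register readings differ, the values can be XORed. This sketch only identifies the differing bit positions; which fields those bits belong to has to be looked up in the register description, and no field layout is assumed here.

```python
# Register readings copied from the devmem2 output above.
slow = 0x4A7DB52E
fast = 0x4A7DB566

diff = slow ^ fast
# Collect the positions of all bits that differ between the two readings.
differing_bits = [bit for bit in range(32) if (diff >> bit) & 1]

print(f"xor = 0x{diff:08X}, differing bits = {differing_bits}")
# prints: xor = 0x00000048, differing bits = [3, 6]
```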


    Note the AM62A processor code for the newer modules is a speed grade U which the datasheet indicates the DSP can be clocked at 850/1000Mhz instead of at 500Mhz for T speed grade. However, the evm also identifies as a speed grade T and it seems to run the demo just fine...

    EVM:         AM62A74A*T*MGGIAMB  (DSP clocked at 850Mhz)
    Slow Module: AM62A74A*T*MGHIAMB (DSP clocked at 0Mhz?)
    Fast Module: AM62A74A*U*MHAAMB (DSP clocked at 1000Mhz)

  • Slow Module: AM62A74A*T*MGHIAMB (DSP clocked at 0Mhz?)

    This doesn't seem to be it.  I have at least 3 other SOMs with this part that don't have an issue running the demo at speed.

    Also, I missed earlier that both the EVM and the slow module are XAM62A parts, so are preproduction.  Currently only this one specific module seems to have issues with the DSP demo, rather perplexing.

    Side note: I did have an AM62A74A*U*MHAAMB part boot and dump a corrupted clock tree.  I had to power cycle it to get the clock tree to dump like normal.  Not sure if that was a bug in the k3conf tool, as the SOM appeared to be working as normal.

    weird_dump_clock_output.log

  • Hi Jonathan,

    Thanks for providing all this info here. That clock dump certainly has many corrupted values. It's hard to immediately tell whether this is an issue with the k3conf tool, the underlying TISCI commands, or something deeper within that particular instance of the SoC (your slow 'T' grade part).

    And this 'U' grade part that dumped the corrupted clock tree does this randomly? It still booted, so at least some clocks are probably normal.

    This is unexpected behavior that I haven't seen before. Let me involve someone with more knowledge on these TISCI commands and clock tree.

    The 'U' grade part allows 1 GHz DSP when 0.85VDDCore is supplied. Are you providing power at that voltage?

    I'd like to ask you to try clocking the DSP at 500 MHz and see if some of this clock tree is still corrupted and/or the DSP is still shown as 0 Hz (on the slow module).

    BR,
    Reese

  • And this 'U' grade part that dumped the corrupted clock tree does this randomly? It still booted, so at least some clocks are probably normal.

    Hmm, I only booted it twice so far.  The first time the tree was messed up, so I power cycled and it cleared up. I can do a bunch of power cycles to see if it's an intermittent issue.

    The 'U' grade part allows 1 GHz DSP when 0.85VDDCore is supplied. Are you providing power at that voltage?

    Yes, VDD_CORE set to 0.85V. Verified with volt meter to be double sure.

    I'd like to ask you to try clocking the DSP at 500 MHz and see if some of this clock tree is still corrupted and/or the DSP is still shown as 0 Hz (on the slow module).

    Okay will do

  • c7x_0 {
    	/*
    	 * Override C7x frequency (lowered from 1 GHz to 500 MHz for this test)
    	 * The 1 GHz setting requires VDD_CORE to be at 0.85V
    	 */
    	clocks = <&k3_clks 208 0>;
    	assigned-clocks = <&k3_clks 208 0>;
    	assigned-clock-rates = <500000000>; /* 500Mhz */
    };
    

    Running on good module at 500Mhz DSP, resulted in 74% c7x_1 usage and ~27 fps on object detection demo.

    Running this code on the slow module, the C7x clock dump still reports 0Hz, curiously the ARM is now running at 1.4Ghz, not sure if that's just a coincidence.

    The object detection demo however ends up crashing the edgeai-launcher. So the plot thickens.

    edgeai_launcher_crash.log
    May 07 15:35:34 mitysom-am62ax login[1169]: ROOT LOGIN  on '/dev/ttyS2'
    May 07 15:35:37 mitysom-am62ax crond[324]: (*system*) RELOAD (/etc/crontab)
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: _rpmsg_char_find_ctrldev: could not find the matching rpmsg_ctrl device for virtio2.rpmsg_chrdev.-1.13
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: _rpmsg_char_find_ctrldev: could not find the matching rpmsg_ctrl device for virtio0.rpmsg_chrdev.-1.21
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: _rpmsg_char_find_ctrldev: could not find the matching rpmsg_ctrl device for virtio2.rpmsg_chrdev.-1.21
    May 07 15:35:42 mitysom-am62ax audit[1100]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 pid=1100 comm="edgeai-gui-app" exe="/usr/bin/edgeai-gui-app" sig=11 res=1
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: APP: Init ... !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: MEM: Init ... !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: MEM: Initialized DMA HEAP (fd=23) !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: MEM: Init ... Done !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: IPC: Init ... !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: IPC: ERROR: Unable to create TX channels for CPU [c7x_1] !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: IPC: Init ... Done !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: APP: ERROR: IPC init failed !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: REMOTE_SERVICE: Init ... !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: REMOTE_SERVICE: Init ... Done !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.180168 s: GTC Frequency = 200 MHz
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: APP: Init ... Done !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.182505 s:  VX_ZONE_INIT:Enabled
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.182563 s:  VX_ZONE_ERROR:Enabled
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.182566 s:  VX_ZONE_WARNING:Enabled
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.185981 s:  VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.187406 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188118 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:162] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188125 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:165] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188131 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:166] Exceeded max object descriptors available. Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_I>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188138 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:167] Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_INST value in source/platform/psdk_j7/common>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188149 s:  VX_ZONE_ERROR:[vxCreateContext:1017] context objection creation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.188157 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190452 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:162] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190463 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:165] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190469 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:166] Exceeded max object descriptors available. Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_I>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190475 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:167] Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_INST value in source/platform/psdk_j7/common>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190483 s:  VX_ZONE_ERROR:[vxCreateContext:1017] context objection creation failed
    May 07 15:35:42 mitysom-am62ax kernel: kauditd_printk_skb: 1 callbacks suppressed
    May 07 15:35:42 mitysom-am62ax kernel: audit: type=1701 audit(1715096142.734:15): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=1100 comm="edgeai-gui-app" exe="/usr/bin/edgeai-gui-app" sig=11 res=1
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.190490 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255210 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:162] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255222 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:165] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255227 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:166] Exceeded max object descriptors available. Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_I>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255231 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:167] Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_INST value in source/platform/psdk_j7/common>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255241 s:  VX_ZONE_ERROR:[vxCreateContext:1017] context objection creation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.255247 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:  Number of subgraphs:1 , 129 nodes delegated out of 129 nodes
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:  
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548866 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:162] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548882 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:165] context object descriptor [0] allocation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548890 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:166] Exceeded max object descriptors available. Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_I>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548897 s:  VX_ZONE_ERROR:[ownContextCreateCmdObj:167] Increase TIVX_PLATFORM_MAX_OBJ_DESC_SHM_INST value in source/platform/psdk_j7/common>
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548910 s:  VX_ZONE_ERROR:[vxCreateContext:1017] context objection creation failed
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548980 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.548993 s:  VX_ZONE_ERROR:[tivxAddKernelTIDL:243] Unable to allocate user kernel ID
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.549001 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.549007 s:  VX_ZONE_ERROR:[vxGetStatus:1015] Reference is NULL
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.549039 s:  VX_ZONE_ERROR:[vxSetReferenceName:960] Invalid reference
    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]:     65.549049 s:  VX_ZONE_ERROR:[vxQueryKernel:140] Invalid kernel reference
    May 07 15:35:42 mitysom-am62ax systemd[1]: Created slice Slice /system/systemd-coredump.
    May 07 15:35:42 mitysom-am62ax audit: BPF prog-id=13 op=LOAD
    May 07 15:35:42 mitysom-am62ax audit: BPF prog-id=14 op=LOAD
    May 07 15:35:42 mitysom-am62ax kernel: audit: type=1334 audit(1715096142.794:16): prog-id=13 op=LOAD
    May 07 15:35:42 mitysom-am62ax kernel: audit: type=1334 audit(1715096142.802:17): prog-id=14 op=LOAD
    May 07 15:35:42 mitysom-am62ax systemd[1]: Started Process Core Dump (PID 1253/UID 0).
    May 07 15:35:57 mitysom-am62ax systemd-coredump[1254]: elfutils disabled, parsing ELF objects not supported
    May 07 15:35:57 mitysom-am62ax systemd-coredump[1254]: [LNK] Process 1100 (edgeai-gui-app) of user 0 dumped core.
    May 07 15:35:57 mitysom-am62ax systemd[1]: systemd-coredump@0-1253-0.service: Deactivated successfully.
    May 07 15:35:57 mitysom-am62ax audit: BPF prog-id=14 op=UNLOAD
    May 07 15:35:57 mitysom-am62ax audit: BPF prog-id=13 op=UNLOAD
    May 07 15:35:57 mitysom-am62ax edgeai-launcher.sh[1100]:     65.549059 s:  VX_ZONE_ERROR:[vxMap
    May 07 15:35:57 mitysom-am62ax kernel: audit: type=1334 audit(1715096157.322:18): prog-id=14 op=UNLOAD
    May 07 15:35:57 mitysom-am62ax kernel: audit: type=1334 audit(1715096157.322:19): prog-id=13 op=UNLOAD
    May 07 15:35:57 mitysom-am62ax edgeai-launcher.sh[1259]: /etc/init.d/edgeai-launcher.sh: line 16: kill: (1100) - No such process
    May 07 15:35:57 mitysom-am62ax edgeai-launcher.sh[1259]: Service not running
    May 07 15:35:57 mitysom-am62ax edgeai-launcher.sh[1259]: Applying wallpaper to linux frame buffer
    May 07 15:36:01 mitysom-am62ax systemd[1]: edgeai-init.service: Deactivated successfully.
    
    
    

  • Hi Jonathan,

    Running on good module at 500Mhz DSP, resulted in 74% c7x_1 usage and ~27 fps on object detection demo.

    Seems reasonable; this is about what I'd expect for 500 MHz.

    Running this code on the slow module, the C7x clock dump still reports 0Hz, curiously the ARM is now running at 1.4Ghz, not sure if that's just a coincidence.

    Okay, so certainly still something wrong here on this slow device. Is 1.4 GHz listed as an option for the A53 clock rate? Typically it's lower than this in default DTS. There's a /boot/dtb/ti/k3-am62a7-sk-e3-max-opp.dtbo that sets to 1.4 GHz, which I assume isn't applied. I'm not sure why it would otherwise be at this 1.4 GHz. It may be loosely linked

    In the logs where the demo-init crashed, it looks like this is more-or-less predicted by the IPC (RPMsg) channels failing to initialize, e.g.

    May 07 15:35:42 mitysom-am62ax edgeai-launcher.sh[1100]: _rpmsg_char_find_ctrldev: could not find the matching rpmsg_ctrl device for virtio2.rpmsg_chrdev.-1.13
    

    The edgeai demo tried to spin up the C7x side through OpenVX and failed to run anything; OpenVX then detected the failure, which probably caused the process to die outside of systemd's management.
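
To pull the earliest sign of each failure stage out of a long journal, a filter along these lines may help. It is a sketch only: the marker strings are copied from the crash log in this thread, the list is illustrative rather than exhaustive, and `first_failures` is a hypothetical helper name.

```python
import re

# Marker substrings taken from the crash log in this thread.
MARKERS = [
    ("rpmsg", re.compile(r"_rpmsg_char_find_ctrldev: could not find")),
    ("ipc", re.compile(r"IPC: ERROR:")),
    ("openvx", re.compile(r"VX_ZONE_ERROR:")),
    ("segfault", re.compile(r"ANOM_ABEND .*sig=11")),
]


def first_failures(log_text):
    """Return {marker_name: first matching line}, ordered by first appearance."""
    found = {}
    for line in log_text.splitlines():
        for name, pattern in MARKERS:
            if name not in found and pattern.search(line):
                found[name] = line.strip()
    return found
```

Run on text captured from `journalctl -b`, this shows whether the RPMsg and IPC errors appear ahead of the OpenVX errors, which is the ordering described above.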

    There have been some PLL issues for the C7x core, and it's possible you are encountering those here. If you could include full boot log on the device, that may help identify this.

    Otherwise my suggestion would be to rebuild using the 9.2 SDK's SYSFW. I believe we have patched the PLL issue for AM62A C7x in this release. You can try replacing the sysfw images under linux-SDK-install/board-support/prebuilt-images/ti-sysfw with those from the 9.2 SDK

    BR,
    Reese

  • Okay, so certainly still something wrong here on this slow device. Is 1.4 GHz listed as an option for the A53 clock rate? Typically it's lower than this in default DTS. There's a /boot/dtb/ti/k3-am62a7-sk-e3-max-opp.dtbo that sets to 1.4 GHz, which I assume isn't applied. I'm not sure why it would otherwise be at this 1.4 GHz. It may be loosely linked

    Our device tree always applies the 0.85V, 1.4Ghz ARM, and 1Ghz DSP changes as at least currently all our modules were supposed to be U speed grade.

    Working module log for comparison:

    working_500Mhz_screenlog.txt

    First boot after launch on slow module, and the demo didn't crash so here is that log:

    slow_500Mhz_demo_works_screenlog.txt

    2nd boot did crash:

    slow_500Mhz_demo_crash_screenlog.txt

    Otherwise my suggestion would be to rebuild using the 9.2 SDK's SYSFW. I believe we have patched the PLL issue for AM62A C7x in this release. You can try replacing the sysfw images under linux-SDK-install/board-support/prebuilt-images/ti-sysfw with those from the 9.2 SDK

    Sure will do. 

  • Otherwise my suggestion would be to rebuild using the 9.2 SDK's SYSFW. I believe we have patched the PLL issue for AM62A C7x in this release. You can try replacing the sysfw images under linux-SDK-install/board-support/prebuilt-images/ti-sysfw with those from the 9.2 SDK

    Rebuilt u-boot with 09.02 firmwares, flashed the 09.02 SDK sdcard image and replaced u-boot and kernel.  The demo ran successfully and the clock info looks correct.

    Note that unsurprisingly the ti-sci-clk driver wasn't super happy, as I suppose the sysfw broke compatibility again. Lots of "recalc-rate failed" and "is_prepared failed" messages on boot.  Having kernel commits that rely so heavily on sysfw updates makes for a very fragile feeling system...

    Are there release notes for changes to the sysfw between 09.01 and 09.02?

    root@am62axx-evm:/opt/edgeai-gst-apps# k3conf dump processors
    |------------------------------------------------------------------------------|
    | VERSION INFO                                                                 |
    |------------------------------------------------------------------------------|
    | K3CONF | (version 0.3-nogit built Wed Mar 06 14:29:58 UTC 2024)              |
    | SoC    | AM62Ax SR1.0                                                        |
    | SYSFW  | ABI: 3.1 (firmware version 0x0009 '9.2.7--v09.02.07 (Kool Koala))') |
    |------------------------------------------------------------------------------|
    
    |-----------------------------------------------------------------------------------------|
    | Device ID | Processor ID | Processor Name       | Processor State | Processor Frequency |
    |-----------------------------------------------------------------------------------------|
    |   121     |       1      | WKUP_R5FSS0_CORE0    | DEVICE_STATE_ON | 800000000           |
    |     9     |       3      | MCU_R5FSS0_CORE0     | DEVICE_STATE_ON | 800000000           |
    |   208     |       4      | C7X256V0_C7XV_CORE_0 | DEVICE_STATE_ON | 1000000000          |
    |   135     |      32      | A53SS0_CORE_0        | DEVICE_STATE_ON | 1400000000          |
    |   136     |      33      | A53SS0_CORE_1        | DEVICE_STATE_ON | 1400000000          |
    |   137     |      34      | A53SS0_CORE_2        | DEVICE_STATE_ON | 1400000000          |
    |   138     |      35      | A53SS0_CORE_3        | DEVICE_STATE_ON | 1400000000          |
    |   225     |     128      | HSM0                 | DEVICE_STATE_ON | 500000000           |
    |-----------------------------------------------------------------------------------------|
    

  • Hi Jonathan,

    Thanks for trying this. I would call this a band-aid fix at the present stage. I'm glad to hear the base issue resolved, but this isn't a clean fix I would recommend long-term.

    The fact that this step resolved the DSP clock / demo issue does strongly suggest that the PLL lock is the root of the behavior you were seeing. Just to confirm, this isn't a lucky boot where it worked, correct? You can reboot multiple times and see consistent behavior on the 'slow' module?

    Moving on from here, my first suggestion would be to upgrade to the 9.2 SDK. This may not be ideal given your recent work with the 9.1 SDK build. 9.2 is a more stable release overall, and should be substantially easier to upgrade to than the following 10.x SDKs, since those will include Yocto version and kernel LTS updates, for what it's worth.

    The alternative would be to supply a backport patch for this PLL for 9.1 SYSFW. This may take us some time to supply; I would need to follow up on this matter.

    I was also under the impression SYSFW is backwards compatible such that these errors you're seeing are not expected. If you'd like to stick with the 9.1 SDK, could you supply more information about the errors you are seeing during boot?

    BR,
    Reese

  • Thanks for trying this. I would call this a band-aid fix at the present stage. I'm glad to hear the base issue resolved, but this isn't a clean fix I would recommend long-term.

    Understood, it's just a diagnostic step. But it does hopefully push us towards working on the 9.2 SDK.

    The fact that this step resolved the DSP clock / demo issue does strongly suggest that the PLL lock is the root of the behavior you were seeing. Just to confirm, this isn't a lucky boot where it worked, correct? You can reboot multiple times and see consistent behavior on the 'slow' module?

    Yeah, it worked for several reboots in a row.

    Moving on from here, my first suggestion would be to upgrade to the 9.2 SDK. This may not be ideal given your recent work with the 9.1 SDK build. 9.2 is a more stable release overall, and should be substantially easier to upgrade to than the following 10.x SDKs, since those will include Yocto version and kernel LTS updates, for what it's worth.

    Considering this seems to only affect one preproduction unit at the moment, I'm considering releasing our 9.1 SDK and then starting work towards the 9.2 SDK release.

    I was also under the impression SYSFW is backwards compatible such that these errors you're seeing are not expected. If you'd like to stick with the 9.1 SDK, could you supply more information about the errors you are seeing during boot?

    In case this helps.

    U-Boot SPL 2023.04-00135-gabe18eed0306-dirty (May 07 2024 - 17:14:35 -0400)
    SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.7--v09.02.07 (Kool Koala)')
    NOTICE:  BL31: v2.9(release):v2.9.0-614-gd7a7135d32a8
    NOTICE:  BL31: Built : 10:35:05, Feb 15 2024
    I/TC: 
    I/TC: OP-TEE version: 4.0.0 (gcc version 9.2.1 20191025 (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10))) #1 Thu Feb 15 15:35:03 UTC 2024 aarch64
    
    ...
    [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
    [    0.000000] Linux version 6.1.46-00086-gd28abd4af0db-dirty (jcormier@jcormier-MS-7A93) (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 9.2.1 20191025, GNU ld (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 2.33.1.20191209) #12 SMP PREEMPT Tue May  7 11:23:28 EDT 2024
    ...
    [    1.747428] input: tps65219-pwrbutton as /devices/platform/bus@f0000/20000000.i2c/i2c-0/0-0030/tps65219-pwrbutton.2.auto/input/input0
    [    1.760382] pca953x 1-0020: supply vcc not found, using dummy regulator
    [    1.767176] pca953x 1-0020: using no AI
    [    1.774509] mmc0: SDHCI controller on fa10000.mmc [fa10000.mmc] using ADMA 64-bit
    [    1.797985] am65-cpsw-nuss 8000000.ethernet: Failed to create device link (0x180) with 1-0020
    [    1.806812] am65-cpsw-nuss 8000000.ethernet: Failed to create device link (0x180) with 1-0020
    [    1.822310] debugfs: Directory 'pd:182' with parent 'pm_genpd' already present!
    [    1.822513] mmc1: CQHCI version 5.10
    [    1.830792] debugfs: Directory 'pd:182' with parent 'pm_genpd' already present!
    [    1.840574] debugfs: Directory 'pd:182' with parent 'pm_genpd' already present!
    [    1.851861] ti-sci-clk 44043000.system-controller:clock-controller: is_prepared failed for dev=42, clk=18, ret=-19
    [    1.862275] ti-sci-clk 44043000.system-controller:clock-controller: is_prepared failed for dev=42, clk=17, ret=-19
    [    1.872666] ti-sci-clk 44043000.system-controller:clock-controller: is_prepared failed for dev=42, clk=16, ret=-19
    ...
    

  • Hi Jonathan,

    All good to hear - I'd consider this issue resolved (with caveats related to SDK/SYSFW versions, of course).

    The issue you're seeing has also been encountered with production units. We're continuing to track this, but it should be resolved as of 9.2 SDK. The PLL lock issue tends to show up when changing the DSP clock, and is isolated to specific units.

    SYSFW is intended to be backwards compatible, and we've had success applying 9.2 SYSFW to builds from 8.6. It looks like this SYSFW is working correctly in general, but is noting that some clocks that existed previously are not available at the same IDs. I believe this device 42 is TIMER6.
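
To see which device and clock IDs the driver is complaining about, the ti-sci-clk messages can be grouped per device. The sketch below assumes the message format from the boot log in this thread (`<operation> failed for dev=N, clk=N, ret=N`); the helper name is hypothetical. Return code -19 decodes to the standard errno ENODEV, which fits the reading that those clock IDs are not present in this SYSFW.

```python
import errno
import re
from collections import defaultdict

# Matches e.g. "ti-sci-clk 44043000.system-controller:clock-controller:
#               is_prepared failed for dev=42, clk=18, ret=-19"
FAIL = re.compile(
    r"ti-sci-clk .*?: ([\w-]+) failed for dev=(\d+), clk=(\d+), ret=(-?\d+)"
)


def summarize_sci_clk_failures(dmesg_text):
    """Group (operation, clock_id, errno_name) tuples by device ID."""
    by_dev = defaultdict(list)
    for match in FAIL.finditer(dmesg_text):
        op = match.group(1)
        dev, clk, ret = (int(match.group(i)) for i in (2, 3, 4))
        # Kernel return codes are negative errno values; fall back to the raw number.
        by_dev[dev].append((op, clk, errno.errorcode.get(-ret, str(ret))))
    return dict(by_dev)
```

Fed with the output of `dmesg`, the result lists each affected device ID once with its failing clock IDs, which is easier to compare against the SYSFW clock documentation than the raw log.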

    The following upstream commit is intended to resolve some of those prints you were seeing. This commit can be applied on ti-linux-6.1.y
    https://github.com/torvalds/linux/commit/ad3ac13c6ec318b43e769cc9ffde67528e58e555. My interpretation is that it's adding checks to ensure each clock does exist before configuring it

    BR,
    Reese

  • I was also under the impression SYSFW is backwards compatible such that these errors you're seeing are not expected. If you'd like to stick with the 9.1 SDK, could you supply more information about the errors you are seeing during boot?

    Just FYI, I just did a test merge of the kernel tag 09.02.00.05 and the ti-sci errors went away. 

  • Ah, very good!

    Thanks for the patience and fast efforts in getting to a resolution.

    I'm closing this ticket for now, but you can respond back here in case you find this resolution hasn't fully fixed the issue.