TDA4AL-Q1: [TDA4AL] IPC userspace performance question

Part Number: TDA4AL-Q1

Tool/software:

Hi:

     I am studying IPC between the A72 and R5F cores. My SDK version is 10.01.00.04.

     I ran rpmsg_char_simple to test loopback. The output log shows the round-trip delay, which is about 90 ms. We need a shorter response time.

     How can we achieve that? IPC is based on the mailbox and shared memory, so why does a round trip take so long?

root@j721s2-evm:/# rpmsg_char_simple -r 0 -n 100
Created endpt device rpmsg-char-0-6457, fd = 4 port = 1025
Exchanging 100 messages with rpmsg device rpmsg-char-0-6457 on rproc id 0 ...

Sending message #0: hello there 0!
Received message #0: round trip delay(usecs) = 300220
hello there 0!
Sending message #1: hello there 1!
Received message #1: round trip delay(usecs) = 120220
hello there 1!
Sending message #2: hello there 2!
Received message #2: round trip delay(usecs) = 120665
hello there 2!
Sending message #3: hello there 3!
Received message #3: round trip delay(usecs) = 96650
hello there 3!
Sending message #4: hello there 4!
Received message #4: round trip delay(usecs) = 94130
hello there 4!
Sending message #5: hello there 5!
Received message #5: round trip delay(usecs) = 91095
hello there 5!
Sending message #6: hello there 6!
Received message #6: round trip delay(usecs) = 94505
hello there 6!
Sending message #7: hello there 7!
Received message #7: round trip delay(usecs) = 97245
hello there 7!
Sending message #8: hello there 8!
Received message #8: round trip delay(usecs) = 98030
hello there 8!
Sending message #9: hello there 9!
Received message #9: round trip delay(usecs) = 91040
hello there 9!
Sending message #10: hello there 10!
Received message #10: round trip delay(usecs) = 90415
hello there 10!
Sending message #11: hello there 11!
Received message #11: round trip delay(usecs) = 91065
hello there 11!
Sending message #12: hello there 12!
Received message #12: round trip delay(usecs) = 90400
hello there 12!
Sending message #13: hello there 13!
Received message #13: round trip delay(usecs) = 92485
hello there 13!
Sending message #14: hello there 14!
Received message #14: round trip delay(usecs) = 90995
hello there 14!
Sending message #15: hello there 15!
Received message #15: round trip delay(usecs) = 91660
hello there 15!
Sending message #16: hello there 16!
Received message #16: round trip delay(usecs) = 100330
hello there 16!
Sending message #17: hello there 17!
Received message #17: round trip delay(usecs) = 92455
hello there 17!
Sending message #18: hello there 18!
Received message #18: round trip delay(usecs) = 96590
hello there 18!
Sending message #19: hello there 19!
Received message #19: round trip delay(usecs) = 102675
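
To summarize a log like the one above instead of reading it line by line, the delays can be parsed and reduced to min, mean and max. The following is a minimal sketch; the helper names (`parse_rtt_line`, `rtt_stats_add`, `rtt_stats_mean_ms`) are made up for illustration and are not part of rpmsg_char_simple. It only assumes the log line format shown above.

```c
#include <stdio.h>
#include <string.h>

/* Running summary of round-trip delays, in microseconds. */
struct rtt_stats {
    unsigned long count;
    unsigned long min_us;
    unsigned long max_us;
    unsigned long long sum_us;
};

/* Extract the delay from one log line; returns 1 if the line carried one. */
static int parse_rtt_line(const char *line, unsigned long *delay_us)
{
    const char *key = "round trip delay(usecs) = ";
    const char *p = strstr(line, key);

    if (p == NULL)
        return 0;
    return sscanf(p + strlen(key), "%lu", delay_us) == 1;
}

/* Fold one sample into the summary. */
static void rtt_stats_add(struct rtt_stats *s, unsigned long delay_us)
{
    if (s->count == 0 || delay_us < s->min_us)
        s->min_us = delay_us;
    if (s->count == 0 || delay_us > s->max_us)
        s->max_us = delay_us;
    s->sum_us += delay_us;
    s->count++;
}

/* Mean delay converted from microseconds to milliseconds. */
static double rtt_stats_mean_ms(const struct rtt_stats *s)
{
    return s->count ? (double)s->sum_us / (double)s->count / 1000.0 : 0.0;
}
```

A small driver would read lines from stdin with `fgets`, call `parse_rtt_line` on each one, and print the summary at the end. Skipping the first few samples may be reasonable, since the first messages in the log are noticeably slower than the rest.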

  • Hi,

    Assigned engineer is out of office, kindly expect delay in response.

    Regards,

    Manojna

  • Hello,

    Can you please check https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/10_01_00_04/exports/docs/pdk_jacinto_10_01_00_25/docs/datasheet/jacinto/datasheet_j721e.html ?

    IPC remains the same for all J7 devices. You can refer to the IPC performance numbers from the above link.

    What you measured is the round-trip time of the "Hello world" message from the rpmsg_char_simple example, sent to the R5F core and returned to the A72.

    You can optimize it by writing a better example on both ends.

    Regards

    Tarun Mukesh

  • Hi Tarun Mukesh:

          I have some questions about the table picture below:

         1. What is the unit of measurement? Is it ms (milliseconds) or us (microseconds)?

          2. What does "Host core A72" mean? Does it mean the A72 sends an echo to each other core (MCU R5F0, Main R5F0, C66x1, etc.) and the time is measured until the reply comes back to the A72? So the test measures the round-trip time from the A72 to the other core, right?

          3. What do the different data sizes mean? Why do different sizes give different performance? Is it caused by memory copy?

          4. My test log shows "round trip delay(usecs) = 102675". That is 102 ms, right? I think that is poor performance!

  • Hi Tarun Mukesh:

          One more question: what is BIOS? Does it mean the A72 runs the Linux SDK?

  • Hello,

       1. What is the unit of measurement? Is it ms (milliseconds) or us (microseconds)?

    As already mentioned in the link, it is us.

    2. What does "Host core A72" mean? Does it mean the A72 sends an echo to each other core (MCU R5F0, Main R5F0, C66x1, etc.) and the time is measured until the reply comes back to the A72? So the test measures the round-trip time from the A72 to the other core, right?

    "Host core A72" means the A72 starts the transfer by sending the data, and the data is echoed back to it.

          3. What do the different data sizes mean? Why do different sizes give different performance? Is it caused by memory copy?

    Yes.

    4. My test log shows "round trip delay(usecs) = 102675". That is 102 ms, right? I think that is poor performance!

    The performance numbers in the link are based on BIOS, not Linux, so comparing them with your Linux measurements is not valid.

    BIOS is a basic OS, not Linux or any kind of HLOS.

    Regards

    Tarun Mukesh

  • Hi Tarun Mukesh:

          Thanks for your comment!

          If I want to improve the round-trip delay on Linux, what can I change in rpmsg_char_simple? Reduce the data size? Any other ideas? The round-trip time is too long. Or is there another approach you would suggest?

  • Hello,

    Let me check and get back to you.

    Regards

    Tarun Mukesh

  • Hello,

    I checked the Linux code. We take the timestamps right before the send call and right after the message is received.

    clock_gettime(CLOCK_MONOTONIC, &ts_current);
    ret = send_msg(rcdev->fd, (char *)packet_buf, packet_len);
    if (ret < 0) {
        printf("send_msg failed for iteration %d, ret = %d\n", i, ret);
        goto out;
    }
    if (ret != packet_len) {
        printf("bytes written does not match send request, ret = %d, packet_len = %d\n",
               i, ret);
        goto out;
    }
    ret = recv_msg(rcdev->fd, 256, (char *)packet_buf, &packet_len);
    clock_gettime(CLOCK_MONOTONIC, &ts_end);
    if (ret < 0) {
        printf("recv_msg failed for iteration %d, ret = %d\n", i, ret);
        goto out;
    }

    send_msg() in user space calls kernel-level APIs, and that causes the delay on the Linux side. We cannot avoid this on Linux.

    On the R5F side, since you are using MCU1_0, it runs sciserver (a mandatory task) at a higher priority than ipc_echo_test. This may cause some delay if a context switch happens between the ipc_echo_test and sciserver tasks.

    You could try another R5F core that runs only the ipc_echo_test task and see whether you get better numbers.

    Regards

    Tarun Mukesh

  • Hi Tarun Mukesh:

          Following your comment, I tested rpmsg_char_simple on the MCU R5F and the Main R5F. The Main R5F performs better than the MCU R5F, but the round-trip time still seems large, about 70 ms.

          How can I improve it? I have this application scenario:

             1. The MCU R5F gets CAN data from the CAN bus.

             2. If an emergency occurs, a notification is sent from the A72 to the MCU R5F, or from the MCU R5F to the A72, through IPC.

          So we need the IPC notification round trip to complete quickly. Do you have any suggestions for getting a fast IPC response time?
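
For the emergency-notification scenario above, the Linux side can at least detect when a reply misses its deadline instead of blocking indefinitely. The following is a minimal sketch, assuming the endpoint file descriptor can be used with `poll()`; `recv_with_deadline` is a hypothetical helper name, not part of rpmsg_char_simple. It does not shorten the round trip, it only bounds the wait.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd, then read it.
 * Returns the number of bytes read, 0 on timeout (or end of stream),
 * or -1 on error with errno set.
 */
static ssize_t recv_with_deadline(int fd, void *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    do {
        /* Note: a retry after EINTR restarts the full timeout. */
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;   /* deadline missed, caller decides what to do */
    return read(fd, buf, cap);
}
```

A caller could treat a return value of 0 as "remote core did not answer in time" and fall back to whatever safe action the application defines.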

    Test on MCU domain R5F: rpmsg_char_simple -r 0 -n 10000

    Sending message #3194: hello there 3194!
    Received message #3194: round trip delay(usecs) = 89515
    hello there 3194!
    Sending message #3195: hello there 3195!
    Received message #3195: round trip delay(usecs) = 106435
    hello there 3195!
    Sending message #3196: hello there 3196!
    Received message #3196: round trip delay(usecs) = 88340
    hello there 3196!
    Sending message #3197: hello there 3197!
    Received message #3197: round trip delay(usecs) = 87755
    hello there 3197!
    Sending message #3198: hello there 3198!
    Received message #3198: round trip delay(usecs) = 88600
    hello there 3198!
    Sending message #3199: hello there 3199!
    Received message #3199: round trip delay(usecs) = 88045
    hello there 3199!
    Sending message #3200: hello there 3200!
    Received message #3200: round trip delay(usecs) = 88680
    hello there 3200!
    Sending message #3201: hello there 3201!
    Received message #3201: round trip delay(usecs) = 87420
    hello there 3201!
    Sending message #3202: hello there 3202!
    Received message #3202: round trip delay(usecs) = 87950
    hello there 3202!
    Sending message #3203: hello there 3203!
    Received message #3203: round trip delay(usecs) = 88345
    hello there 3203!
    Sending message #3204: hello there 3204!
    Received message #3204: round trip delay(usecs) = 89440
    hello there 3204!
    Sending message #3205: hello there 3205!
    Received message #3205: round trip delay(usecs) = 89575
    hello there 3205!
    Sending message #3206: hello there 3206!
    Received message #3206: round trip delay(usecs) = 88855
    hello there 3206!
    Sending message #3207: hello there 3207!
    Received message #3207: round trip delay(usecs) = 87800
    hello there 3207!
    Sending message ^C
    Clean up and exit while handling signal 2
    Application did not close some rpmsg_char devices
    Received message #3227: round trip delay(usecs) = 87670

    Test on Main domain R5F core 0: rpmsg_char_simple -r 2 -n 10000

    Sending message #2839: hello there 2839!
    Received message #2839: round trip delay(usecs) = 64900
    hello there 2839!
    Sending message #2840: hello there 2840!
    Received message #2840: round trip delay(usecs) = 63945
    hello there 2840!
    Sending message #2841: hello there 2841!
    Received message #2841: round trip delay(usecs) = 65010
    hello there 2841!
    Sending message #2842: hello there 2842!
    Received message #2842: round trip delay(usecs) = 64365
    hello there 2842!
    Sending message #2843: hello there 2843!
    Received message #2843: round trip delay(usecs) = 80690
    hello there 2843!
    Sending message #2844: hello there 2844!
    Received message #2844: round trip delay(usecs) = 61355
    hello there 2844!
    Sending message #2845: hello there 2845!
    Received message #2845: round trip delay(usecs) = 60140
    hello there 2845!
    Sending message #2846: hello there 2846!
    Received message #2846: round trip delay(usecs) = 60560
    hello there 2846!
    Sending message #2847: hello there 2847!
    Received message #2847: round trip delay(usecs) = 60920
    hello there 2847!
    Sending message #^C
    Clean up and exit while handling signal 2
    Application did not close some rpmsg_char devices
    Received message #2875: round trip delay(usecs) = 64185

    Test on Main domain R5F core 1: rpmsg_char_simple -r 2 -n 10000

    Received message #2806: round trip delay(usecs) = 86785
    hello there 2806!
    Sending message #2807: hello there 2807!
    Received message #2807: round trip delay(usecs) = 71070
    hello there 2807!
    Sending message #2808: hello there 2808!
    Received message #2808: round trip delay(usecs) = 68415
    hello there 2808!
    Sending message #2809: hello there 2809!
    Received message #2809: round trip delay(usecs) = 68440
    hello there 2809!
    Sending message #2810: hello there 2810!
    Received message #2810: round trip delay(usecs) = 67645
    hello there 2810!
    Sending message #2811: hello there 2811!
    Received message #2811: round trip delay(usecs) = 69080
    hello there 2811!
    Sending message #2812: hello there 2812!
    Received message #2812: round trip delay(usecs) = 69955
    hello there 2812!
    Sending message #2813: hello there 2813!
    Received message #2813: round trip delay(usecs) = 68945
    hello there 2813!
    Sending message #2814: hello there 2814!
    Received message #2814: round trip delay(usecs) = 68880
    hello there 2814!
    Sending message #2815: hello there 2815!
    Received message #2815: round trip delay(usecs) = 68595
    hello there 2815!
    Sending message #2816: hello there 2816!
    Received message #2816: round trip delay(usecs) = 68195
    hello there 2816!
    Sending message #2817: hello there 2817!
    Received message #2817: round trip delay(usecs) = 68075
    hello there 2817!
    Sending message #2818: hello there 2818!
    Received message #2818: round trip delay(usecs) = 68145
    hello there 2818!
    Sending message #2819: hello there 2819!
    Received message #2819: round trip delay(usecs) = 68630
    hello there 2819!
    Sending message #2820: hello there 2820!
    Received message #2820: round trip delay(usecs) = 68445
    hello there 2820!
    Sending message #2821: hello there 2821!
    Received message #2821: round trip delay(usecs) = 68000
    hello there 2821!
    Sending message #2822: hello there 2822!
    Received message #2822: round trip delay(usecs) = 67890

  • Hello,

    On the Linux side there is not much to modify; this is the time it takes internally.

    Sending message #2845: hello there 2845!
    Received message #2845: round trip delay(usecs) = 60140

    Here you reach almost 60 ms. What is your expectation?

    On the R5F side, you can make the IPC task the highest-priority task to get better results. On the MCU R5F, however, sciserver should remain the highest-priority task, above all other tasks.

    Regards

    Tarun Mukesh

  • Hi Tarun Mukesh:

         Thanks for your comment and suggestion!

         I expect the average IPC round-trip delay to be within 40 ms. Is that achievable?

         How can I measure the round-trip time from the SoC to the MCU, and how much time is spent in each segment? I would like to find out where the time goes, for example whether the user space to kernel space transition takes a long time. How can I measure that?

  • Hi Tarun Mukesh:

          Let me correct my performance expectation.

          1. SoC to MCU: data must be sent periodically every 40 ms, 3000 bytes per packet.

          2. MCU to SoC: data must be sent periodically every 10 ms, 200 bytes per packet.

          Our SoC-to-MCU packet is 3000 bytes, but the IPC maximum is 496 (512-16) bytes. As far as I know, there is another way: send the address of data in shared memory through IPC instead of sending the data itself. Do you have any example code I can refer to?
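
To make the size mismatch concrete, the arithmetic for splitting a packet across several IPC messages can be sketched as follows. This assumes the 496-byte payload limit quoted above and ignores any per-fragment header a real protocol would need for reassembly, which would reduce the usable payload further. `fragment_count` and `fragment_len` are hypothetical helper names.

```c
#include <stddef.h>

/* 512-byte buffer minus 16-byte header, as quoted in the post above. */
#define RPMSG_MAX_PAYLOAD 496u

/* Number of messages needed to carry total_len bytes (ceiling division). */
static size_t fragment_count(size_t total_len)
{
    return (total_len + RPMSG_MAX_PAYLOAD - 1) / RPMSG_MAX_PAYLOAD;
}

/* Length of fragment idx (0-based); 0 if idx is past the last fragment. */
static size_t fragment_len(size_t total_len, size_t idx)
{
    size_t offset = idx * RPMSG_MAX_PAYLOAD;
    size_t rest;

    if (offset >= total_len)
        return 0;
    rest = total_len - offset;
    return rest < RPMSG_MAX_PAYLOAD ? rest : RPMSG_MAX_PAYLOAD;
}
```

With these numbers, a 3000-byte packet needs seven messages (six of 496 bytes and one of 24 bytes) in every 40 ms period, while a 200-byte packet fits in a single message.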

     

  • Hi 

         I referred to the website software-dl.ti.com/.../developer_notes_ipc.html

    • To reduce latencies and / or to send larger buffer it is recommended to pass a pointer/handle/offset to a larger shared memory from ION heap

          Is there an example?

  • Hello,

    • To reduce latencies and / or to send larger buffer it is recommended to pass a pointer/handle/offset to a larger shared memory from ION heap

          Is there an example?

    No, we don't have any example in the SDK that does this.
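
Since the SDK has no example, the idea of passing an offset instead of the data can be illustrated with a hedged sketch. Everything here is hypothetical: the `shm_desc` layout, the helper names, and `SHM_SIZE` are invented for illustration, and the shared region is simulated by an ordinary buffer. On real hardware the region would have to be memory visible to both cores, and cache maintenance would be needed on each side; both are outside this sketch.

```c
#include <stdint.h>
#include <string.h>

#define SHM_SIZE 4096u   /* hypothetical size of the shared region */

/* Small fixed-size message sent over IPC in place of the payload itself. */
struct shm_desc {
    uint32_t offset;  /* byte offset of the payload inside the shared region */
    uint32_t length;  /* payload length in bytes */
    uint32_t seq;     /* sequence number so the receiver can spot drops */
};

/* Producer: copy the payload into the region and fill in the descriptor. */
static int shm_publish(uint8_t *shm, uint32_t offset, const void *data,
                       uint32_t len, uint32_t seq, struct shm_desc *out)
{
    if (offset > SHM_SIZE || len > SHM_SIZE - offset)
        return -1;                      /* payload would not fit */
    memcpy(shm + offset, data, len);
    out->offset = offset;
    out->length = len;
    out->seq = seq;
    return 0;
}

/* Consumer: validate the descriptor before touching shared memory. */
static const uint8_t *shm_resolve(const uint8_t *shm, const struct shm_desc *d)
{
    if (d->offset > SHM_SIZE || d->length > SHM_SIZE - d->offset)
        return NULL;                    /* never trust a raw offset */
    return shm + d->offset;
}
```

The descriptor is 12 bytes, so it fits easily within one IPC message regardless of how large the payload in the shared region is. Sending an offset rather than a raw pointer avoids depending on both cores mapping the region at the same address.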

          Our SoC-to-MCU packet is 3000 bytes, but the IPC maximum is 496 (512-16) bytes. As far as I know, there is another way: send the address of data in shared memory through IPC instead of sending the data itself. Do you have any example code I can refer to?

    No reference code.

     How can I measure the round-trip time from the SoC to the MCU, and how much time is spent in each segment? I would like to find out where the time goes, for example whether the user space to kernel space transition takes a long time. How can I measure that?

    How much time is spent in the kernel depends on Linux internals. Measuring it is usually not very helpful, since the OS driver APIs are not going to change.
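
From user space, a coarse split is still possible by timestamping the send and the blocking receive separately. The following is a minimal sketch, assuming the endpoint is driven through plain `write()` and `read()` calls; `timed_echo`, `elapsed_us` and `struct echo_timing` are hypothetical names. It cannot separate kernel time from remote-core time: the wait segment covers the transfer, the R5F processing, and the wake-up of the reader together.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#include <unistd.h>

struct echo_timing {
    int64_t write_us;  /* time spent inside write() */
    int64_t wait_us;   /* time from write() returning until read() returns */
};

/* Elapsed time from a to b in microseconds. */
static int64_t elapsed_us(const struct timespec *a, const struct timespec *b)
{
    int64_t ns = ((int64_t)b->tv_sec - (int64_t)a->tv_sec) * 1000000000LL
               + ((int64_t)b->tv_nsec - (int64_t)a->tv_nsec);
    return ns / 1000;
}

/*
 * Send msg on tx_fd, wait for the reply on rx_fd, and record both segments.
 * For an rpmsg endpoint, tx_fd and rx_fd would be the same descriptor.
 * Returns the number of reply bytes, or a negative value on error.
 */
static ssize_t timed_echo(int tx_fd, int rx_fd, const void *msg, size_t len,
                          void *reply, size_t cap, struct echo_timing *t)
{
    struct timespec t0, t1, t2;
    ssize_t n;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    n = write(tx_fd, msg, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (n < 0)
        return n;

    n = read(rx_fd, reply, cap);
    clock_gettime(CLOCK_MONOTONIC, &t2);

    t->write_us = elapsed_us(&t0, &t1);
    t->wait_us = elapsed_us(&t1, &t2);
    return n;
}
```

If `write_us` turns out to be small and `wait_us` dominates, the remaining time is spent after the message has left user space, which is consistent with looking at the R5F task setup next.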

    Currently you are using ipc_echo_test on the R5F core. It is not optimized, it has prints that take a long time, and sciserver runs as the highest-priority task.

    In your custom R5F application, you need to make the IPC task the highest priority, remove console prints from the code, and then check the response time.

    Do you have a custom application written for the R5F?

    Regards

    Tarun Mukesh

  • Hi Tarun Mukesh:

         I will look at ipc_echo_test to see whether I can do anything!

         I saw another thread talking about rpmsg_char_zerocopy. It seems it can send larger data between a remote core and the SoC. Does this also apply to TDA4? Can I use it?

    https://git.ti.com/cgit/rpmsg/rpmsg_char_zerocopy/ 

  • Hello,

    We have not used this on TDA4, so there are no examples. I cannot comment further without validating an example here.

    Regards

    Tarun Mukesh 

  • Hi Tarun Mukesh:

         Thanks for your comments and feedback! I will close this case for now and create a new case for further questions. Many thanks for your help!