AM3356: How to receive CAN data without any drop or overrun even if CPU load is high

Part Number: AM3356
Other Parts Discussed in Thread: WL1837

Hi expert

When we receive CAN data under high bus load (e.g. 3 or 4 frames per ms)
and with a high CPU load average (e.g. 1.8 or higher), we see dropped frames, overruns, and frames arriving out of order.

These errors occur even when no CAN-receiving process is running.
We monitor them with the command `ip --details link show`, except for frame reordering, which is detected by a separate process.

The procedure is as follows:

$ ip link set can0 type can restart-ms 100
$ ip link set can0 type can bitrate 500000
$ ip link set can0 up
A CAN-receiving process may or may not be running at this point.
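
Drop and overrun counters for the interface can also be read from sysfs, which is convenient for logging them while the CPU load is varied. The following is only a minimal sketch, assuming the standard `/sys/class/net/<iface>/statistics` files exist on the target. The function name `can_rx_counters` and its optional second argument (an alternative sysfs root for dry runs) are made up for illustration, and which counter the CAN driver actually increments on a lost frame is an assumption to verify.

```shell
#!/bin/sh
# Print the RX drop and overrun counters of a network interface.
# $1: interface name, e.g. can0
# $2: sysfs root (hypothetical override for dry runs), default /sys/class/net
can_rx_counters() {
    iface=$1
    root=${2:-/sys/class/net}
    stats="$root/$iface/statistics"
    # rx_dropped: frames dropped by the network stack
    # rx_over_errors: receiver overruns reported by the driver
    dropped=$(cat "$stats/rx_dropped")
    overrun=$(cat "$stats/rx_over_errors")
    echo "dropped=$dropped overrun=$overrun"
}
```

For example, `while :; do can_rx_counters can0; sleep 1; done` prints one line per second, so increments can be matched against load spikes.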

Is there any way to prevent or reduce these errors even when the CPU load is high,
for example by raising the priority of the CAN device driver?

Thanks

  • Hi,

    Please tell me more about the application. For example:

    • Assuming Linux, is it RT Linux? (This would answer the question about raising the priority.)
    • What is the processor speed?
    • Is the AM335x initiating the traffic?

    Best Regards,

    Schuyler

  • Hi, Schuyler

    Thanks for your response.

    Assuming Linux, is it RT Linux? (This would answer the question about raising the priority.)

    No, it is standard Linux.
    I want to keep RT Linux as a last resort,
    because the product is already in mass production and switching to RT Linux would have a big impact.

    What is the processor speed?

    The maximum speed is 800 MHz and the governor is 'On demand'.

    Is the AM335x initiating the traffic?

    No, other devices initiate it.
    Several devices begin sending CAN data after power-on, and then
    our product, which uses the AM3356, starts receiving it.

    Best Regards,
    Katsuyama

  • Hi Katsuyama,

    What you are describing sounds like a flood of frames, as in a power-on situation. Does this mean that once all the devices have powered up to a stable state, the frame rate drops to a manageable level where no packets are dropped?

    Would it be possible to at least try performance setting instead of on-demand to see if the time it takes to transition to max speed is causing the missed packet problem? Then perhaps scale back the processor to on-demand after reaching a steady state?
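
Switching between the two governors for such a test can be done through the cpufreq sysfs interface. This is a rough sketch rather than a tested procedure for this board: the function name `set_governor` and its optional directory argument are hypothetical, and it assumes the usual `scaling_governor` and `scaling_available_governors` files under `cpu0/cpufreq`.

```shell
#!/bin/sh
# Select a cpufreq governor for CPU0 (the AM335x has a single core).
# $1: governor name, e.g. performance or ondemand
# $2: cpufreq directory (hypothetical override for dry runs),
#     default /sys/devices/system/cpu/cpu0/cpufreq
set_governor() {
    gov=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    # Refuse names the kernel does not list as available.
    case " $(cat "$dir/scaling_available_governors") " in
        *" $gov "*) ;;
        *) echo "governor '$gov' not available" >&2; return 1 ;;
    esac
    echo "$gov" > "$dir/scaling_governor"
}
```

Running `set_governor performance` as root before the CAN traffic starts, and `set_governor ondemand` once the system has reached a steady state, would match the experiment suggested above.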

    The non-RT kernel can still set scheduling policy and task priority but it is not as effective as the RT kernel.  Switching to an RT kernel will put the processor at max performance all the time.

    CANBUS, to my knowledge, is subject to dropped frames because there is no method for flow control.

    Best Regards,

    Schuyler

  • Hi, Schuyler

    Thanks for your reply.

    What you are describing sounds like a flood of frames, as in a power-on situation. Does this mean that once all the devices have powered up to a stable state, the frame rate drops to a manageable level where no packets are dropped?

    We see dropped packets not only just after power-on but also in the stable state.

    Would it be possible to at least try performance setting instead of on-demand to see if the time it takes to transition to max speed is causing the missed packet problem? Then perhaps scale back the processor to on-demand after reaching a steady state?

    We tried 800 MHz (max speed) with the 'performance' governor.
    As a result, frame drops were reduced.
    But my customer wants to use 'on demand' mode because of current consumption and temperature.
    So I will try changing the 'up_threshold' value from 95 to a smaller number so that the clock ramps up more quickly.
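
The ondemand governor raises the clock once the estimated CPU load exceeds `up_threshold` percent, so a lower value makes it ramp up sooner. A minimal sketch of changing it, under the assumption that the tunable lives at one of the two sysfs locations tried below (the location differs between kernel versions and configurations); the function name `set_up_threshold` and its optional root argument are illustrative only.

```shell
#!/bin/sh
# Write a new ondemand up_threshold (CPU load in percent above which
# the governor raises the clock). Lower values ramp up sooner.
# $1: new threshold
# $2: CPU sysfs root (hypothetical override for dry runs),
#     default /sys/devices/system/cpu
set_up_threshold() {
    value=$1
    root=${2:-/sys/devices/system/cpu}
    # Try the global tunable first, then the per-policy one.
    for f in "$root/cpufreq/ondemand/up_threshold" \
             "$root/cpu0/cpufreq/ondemand/up_threshold"; do
        if [ -w "$f" ]; then
            echo "$value" > "$f" && echo "$f" && return 0
        fi
    done
    echo "ondemand up_threshold not found under $root" >&2
    return 1
}
```

For instance, `set_up_threshold 50` (as root, with the ondemand governor active) prints the file it wrote to. The value 50 is only a placeholder; a suitable number has to be found by measurement.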

    The non-RT kernel can still set scheduling policy and task priority but it is not as effective as the RT kernel.  Switching to an RT kernel will put the processor at max performance all the time.

    I tried the RT kernel briefly.
    With the 'Fully Preemptible Kernel (RT)' preemption model, the OS hangs when Wi-Fi with the WL1837 is enabled.
    With 'Preemptible Kernel (Basic RT)', a kernel panic occurs about 2 seconds after boot.
    With 'Preemptible Kernel (Low-Latency Desktop)', the kernel seems to work well.
    I will do the actual CAN receiving test next week.

    CANBUS, to my knowledge, is subject to dropped frames because there is no method for flow control.

    Yes, I agree with you. But my customer wants minimum frame drops.

    Thanks,
    Katsuyama

  • Hi, Schuyler

    I tried RT Linux with kernel 5.10 today.
    The result was not good.
    Drops occurred when an Ethernet cable was inserted or removed, and also during Wi-Fi communication.


    Conditions:
    - CPU clock 800 MHz (fixed)
    - Low CPU load
    - No CAN receiving process

    Should I raise CAN IRQ priority?
    If so, how can I do it?

    By the way, I have two products that use the AM3356.
    One runs kernel 4.19 and the other runs kernel 5.10.
    This time I tried the 5.10 one.
    I found that the 5.10 menuconfig offers neither 'Fully Preemptible Kernel (RT)' nor 'Preemptible Kernel (Basic RT)',
    only 'Preemptible Kernel (Low-Latency Desktop)'.

    Best Regards,
    Katsuyama

  • Hi,

    Did you download the TI RT SDK for the AM335x, or are you trying to patch the kernel tree?
    Does the output of the command uname -a include PREEMPT_RT?

    Best Regards,

    Schuyler

  • Hi, Schuyler

    The result of `uname -a` is as follows:
    Linux am335x-evm 5.10.100-rt62+ #1 PREEMPT_RT Fri Mar 29 08:27:08 UTC 2024 armv7l GNU/Linux

    The Preemption Model is 'Fully Preemptible Kernel (RT)'.
    In my last post, I said that kernel 5.10 doesn't have 'Fully Preemptible Kernel (RT)',
    but that was incorrect. The Fully Preemptible model is available.

    My method of building RT Linux is as follows:
    1. $ export ARAGO_RT_ENABLE=1
    2. $ bitbake linux-ti-staging-rt
    3. Copy the kernel source from
    '.../tisdk/build/arago-tmp.../work-shared/am335x-evm/kernel-source' to a work directory.
    4. Apply my custom patches.
    5. Copy my .config to the work directory.
    6. $ make menuconfig in the work directory.
    7. $ make
    8. Deploy the kernel and modules to the target board.

    It seems that some interrupts cause the CAN frame drops.

    I want to raise the CAN priority.

    Best Regards,
    Katsuyama

  • Hi Katsuyama,

    A couple of questions, are you using a TI staging branch for building the rt kernel? Are you using the https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel tree for 5.10?

    As for raising the CAN priority, you can use the chrt command to set the priority and scheduling policy. Please attach the output of the ps command and of cat /proc/interrupts.

    Best Regards,

    Schuyler

  • Hi, Schuyler

    > are you using a TI staging branch for building the rt kernel?

    Yes

    > Are you using the git.ti.com/.../ti-linux-kernel tree for 5.10?

    I'm using git.ti.com/.../ti-linux-kernel (without cgit because of URL error)

    I changed the priority of ksoftirqd to 99 on the 4.19 kernel, and I got very good
    results not only with RT Linux but also with standard Linux.
    The effect with standard Linux is slightly smaller than with RT Linux, but it seems acceptable.
    Thank you very much for your advice.

    However, frame reordering still occurs with both the 4.19 and 5.10 kernels,
    so that is my next task.
    I found the following URL about it.
    e2e.ti.com/.../4630486

    Is this a problem of H/W or device driver?
    Is there any solution to this problem?


    How to change the priority:
    I also referred to your following post.
    e2e.ti.com/.../linux-am4379-how-to-change-interrupt-priority-for-ethernet

    Then I did as follows.

    $ cat /proc/interrupts
    CPU0
    16: 20637 INTC 68 Level gp_timer
    18: 0 INTC 3 Level arm-pmu
    19: 1 INTC 78 Level wkup_m3_txev
    ...
    41: 0 INTC 52 Level can0
    42: 0 INTC 55 Level can1
    ...
    73: 4 4804c000.gpio 29 Edge wl18xx
    74: 0 icm20648-dev1 Edge icm20648_consumer1
    Err: 0

    $ ps -A | grep irq
    9 ? 00:00:00 ksoftirqd/0
    49 ? 00:00:00 irq/55-TI-am335
    50 ? 00:00:00 irq/60-4803c000
    51 ? 00:00:00 irq/59-4803c000
    52 ? 00:00:00 irq/35-4802a000
    53 ? 00:00:00 irq/36-4819c000
    54 ? 00:00:00 irq/62-48060000
    818 ? 00:00:00 irq/74-icm20648
    1467 ? 00:00:00 irq/73-wl18xx

    $ chrt -p 99 9
    $ chrt -p 9
    pid 9's current scheduling policy: SCHED_RR
    pid 9's current scheduling priority: 99

    Thanks
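
The manual steps above (look up the PID in the `ps` output, then call `chrt`) could be wrapped in a small helper so they can be repeated after each boot, since a priority set this way does not survive a reboot. This is a sketch under assumptions, not a verified recipe: the function names `pids_of_thread` and `set_thread_fifo` are invented here, the `ps -A` column layout is taken from the listing above, and `-f` requests SCHED_FIFO explicitly, whereas the session above ended up with SCHED_RR.

```shell
#!/bin/sh
# Read `ps -A` style lines (PID TTY TIME CMD) on stdin and print the
# PID of every thread whose command name equals $1.
# Note: ps shows at most 15 characters of a thread name.
pids_of_thread() {
    awk -v name="$1" '$4 == name { print $1 }'
}

# Give every thread named $1 the SCHED_FIFO policy at priority $2.
# Must be run as root.
set_thread_fifo() {
    for pid in $(ps -A | pids_of_thread "$1"); do
        chrt -f -p "$2" "$pid"
    done
}
```

With these definitions, `set_thread_fifo ksoftirqd/0 99` would repeat the change shown above, and `chrt -p <pid>` can be used afterwards to confirm the policy and priority.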