AM5728: Linux PSDK 6.03 - Kernel Panic with large ICMP packet over IPSec

Part Number: AM5728

Hi,

I am using TI's AM5728 EVM with Linux PSDK 6.03. My kernel version is 4.19.94.

I have configured a pre-shared-key-based IPsec tunnel between the EVM and a local server using strongSwan, with the standard cryptographic algorithms AES_CBC_256 and HMAC_SHA2_256_128. This is the output of ipsec statusall:


root@am57xx-evm:~# ipsec statusall
Status of IKE charon daemon (strongSwan 5.6.2-nistpqc, Linux 4.19.94-gbe5389fd85, armv7l):
  uptime: 4 seconds, since Sep 07 18:27:09 2020
  malloc: sbrk 802816, mmap 0, used 414008, free 388808
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 11
  loaded plugins: charon aes des rc2 sha2 sha3 sha1 md5 mgf1 random nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs8 pkcs12 pgp dnskey sshkey pem openssl fips-prf curve25519 chapoly xcbc cmac s
Listening IP addresses:
  192.168.139.229
Connections:
        1229:  192.168.139.229...192.168.4.241  IKEv2, dpddelay=3s
        1229:   local:  [192.168.139.229] uses pre-shared key authentication
        1229:   remote: [192.168.4.241] uses pre-shared key authentication
        1229:   child:  dynamic === 40.40.40.0/24 TUNNEL, dpdaction=restart
Security Associations (2 up, 0 connecting):
        1229[2]: ESTABLISHED 2 seconds ago, 192.168.139.229[192.168.139.229]...192.168.4.241[192.168.4.241]
        1229[2]: IKEv2 SPIs: 882989ef14977868_i d4aab913b1010cc0_r*, pre-shared key reauthentication in 8 minutes
        1229[2]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/CURVE_25519
        1229{2}:  INSTALLED, TUNNEL, reqid 1, ESP SPIs: c562de59_i c64dbf79_o
        1229{2}:  AES_CBC_256/HMAC_SHA2_256_128, 0 bytes_i, 0 bytes_o, rekeying in 78 seconds
        1229{2}:   192.168.139.229/32 === 40.40.40.0/24
        1229[1]: ESTABLISHED 4 seconds ago, 192.168.139.229[192.168.139.229]...192.168.4.241[192.168.4.241]
        1229[1]: IKEv2 SPIs: 492dc6c699ed6e0d_i* 570dcf4001b22b06_r, pre-shared key reauthentication in 7 minutes
        1229[1]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/CURVE_25519
        1229{1}:  INSTALLED, TUNNEL, reqid 1, ESP SPIs: ca75a05c_i cbd2eba6_o
        1229{1}:  AES_CBC_256/HMAC_SHA2_256_128, 0 bytes_i, 0 bytes_o, rekeying in 91 seconds
        1229{1}:   192.168.139.229/32 === 40.40.40.0/24



I am able to ping my server's IP from EVM with ping packet of different sizes (from 64 to 1600 bytes). I am also able to ping EVM's IP from my server using ping packet of standard size (64 bytes). 

However, when I try to ping the EVM's IP from my server with a ping packet of size greater than 150 bytes, I get a kernel panic on the EVM.

ON SERVER - 


ping -s 150 192.168.139.229
PING 192.168.139.229 (192.168.139.229): 150 data bytes

The ping gets no reply and triggers a kernel panic on the EVM. Below is the console output I get on minicom attached to the EVM when the kernel panic occurs.

Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  141.081215] pgd = 2e42394b
[  141.083930] [00000000] *pgd=80000080004003, *pmd=00000000
[  141.089378] Internal error: Oops: 207 [#1] PREEMPT SMP ARM
[  141.094888] Modules linked in: cbc aes_arm_bs crypto_simd cryptd hmac drbg authenc echainiv xfrm4_mode_tunnel xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo usbhid arc4 wl18xx wlo
[  141.165714]  usbserial usbcore usb_common jailhouse(O) gdbserverproxy(O) cryptodev(O) cmemk(O)
[  141.174371] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      4.19.94-gbe5389fd85 #1
[  141.182842] Hardware name: Generic DRA74X (Flattened Device Tree)
[  141.188966] PC is at page_address+0xc/0xf8
[  141.193080] LR is at omap_crypto_cleanup+0x48/0xbc [omap_crypto]
[  141.199109] pc : [<c0320af4>]    lr : [<bf125048>]    psr: 000b0113
[  141.205399] sp : c1201c98  ip : c1201cc0  fp : c1201cbc
[  141.210642] r10: 00000100  r9 : 00000000  r8 : ed6f34c0
[  141.215886] r7 : eed482e8  r6 : ed6f8ae0  r5 : c02162e8  r4 : 00000002
[  141.222436] r3 : 00000100  r2 : 00000000  r1 : ed6f34c0  r0 : 00000000
[  141.228989] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  141.236150] Control: 30c5387d  Table: 943c4100  DAC: fffffffd
[  141.241918] Process swapper/0 (pid: 0, stack limit = 0x225b5dfa)
[  141.247947] Stack: (0xc1201c98 to 0xc1202000)
[  141.252318] 1c80:                                                       00000002 c02162e8
[  141.260530] 1ca0: ed6f8ae0 eed482e8 ed6f34c0 00000000 c1201cf4 c1201cc0 bf125048 c0320af4
[  141.268741] 1cc0: 00000000 c053b134 00000100 ed6f8a40 c02162e8 00000000 eed482e8 c1068d40
[  141.276951] 1ce0: 00000040 00000006 c1201d14 c1201cf8 bf1f1f84 bf12500c 0000000a 00000923
[  141.285161] 1d00: ed6f8a60 ed6f8a64 c1201d3c c1201d18 c0232b84 bf1f1e30 00000000 c1203098
[  141.293371] 1d20: c1203080 ffffe000 40000006 00000102 c1201d4c c1201d40 c0232c2c c0232b24
[  141.301582] 1d40: c1201dac c1201d50 c020215c c0232c18 edcd2b40 eea65a00 eea65a68 c0c02ef0
[  141.309791] 1d60: 00200102 c0dd0be0 c1203d00 ffffc1ea 00000009 c1068d40 c1062358 c1203080
[  141.318001] 1d80: c1201dac c1068cfc 00000000 00000000 00000001 ee80c000 c1200000 c12081c0
[  141.326211] 1da0: c1201dbc c1201db0 c0232f88 c0202044 c1201de4 c1201dc0 c0288e50 c0232ec0
[  141.334421] 1dc0: c1205104 fa21200c fa212000 c1201e10 fa213000 c1200000 c1201e0c c1201de8
[  141.342632] 1de0: c0565458 c0288df4 c0a2f678 600b0013 ffffffff c1201e44 c0a2a3b4 c1200000
[  141.350841] 1e00: c1201e6c c1201e10 c02019f8 c0565420 eed4f180 00000002 00000000 000028b6
[  141.359051] 1e20: eed4f180 00000000 d4110000 00000001 c0a2a3b4 00000000 c12081c0 c1201e6c
[  141.367262] 1e40: c1201e70 c1201e60 c02526e0 c0a2f678 600b0013 ffffffff c024e178 00000000
[  141.375472] 1e60: c1201eac c1201e70 c02526e0 c0a2f65c 00000000 00000002 c1201eac c1201e88
[  141.383683] 1e80: c024e478 eed4f180 c12081c0 d40c8a00 00000000 ed71f2c0 00000000 c0c03350
[  141.391892] 1ea0: c1201f04 c1201eb0 c0a2a3b4 c0252678 edf5c104 eea28190 c1201ee4 c1201ec8
[  141.400103] 1ec0: c0a2ae00 c1204c48 c128f6d0 000001fe 00000000 d7ef8d58 c1201f0c c1200000
[  141.408314] 1ee0: c1204c7c c1204cc4 00000001 00000000 00000000 c1204c48 c1201f1c c1201f08
[  141.416525] 1f00: c0a2ae00 c0a2a10c ffffe000 c1204c7c c1201f6c c1201f20 c025a8f0 c0a2add0
[  141.424735] 1f20: eed4f180 00000000 c0dd2920 c12522d6 c1204cdc c1204c48 00000000 d7ef8d58
[  141.432945] 1f40: ffffffff 000000c7 00000002 00000000 c1253e00 ffffffff c1253e00 c104ba38
[  141.441156] 1f60: c1201f7c c1201f70 c025ad3c c025a7b0 c1201f94 c1201f80 c0a29424 c025ad28
[  141.449367] 1f80: c1253e58 00000000 c1201ff4 c1201f98 c1000dfc c0a29360 ffffffff ffffffff
[  141.457576] 1fa0: 00000000 c10005dc ffffffff 00000000 c1204c48 c1204c40 00000000 c104ba38
[  141.465786] 1fc0: d7eb9e06 00000000 00000000 c1000330 00000000 30c0387d 00000000 8ffdc000
[  141.473996] 1fe0: 412fc0f2 30c5387d 00000000 c1201ff8 00000000 c10009c0 00000000 00000000
[  141.482203] Backtrace: 
[  141.484662] [<c0320ae8>] (page_address) from [<bf125048>] (omap_crypto_cleanup+0x48/0xbc [omap_crypto])
[  141.494095]  r9:00000000 r8:ed6f34c0 r7:eed482e8 r6:ed6f8ae0 r5:c02162e8 r4:00000002
[  141.501877] [<bf125000>] (omap_crypto_cleanup [omap_crypto]) from [<bf1f1f84>] (omap_aes_done_task+0x160/0x1ec [omap_aes_driver])
[  141.513577]  r10:00000006 r9:00000040 r8:c1068d40 r7:eed482e8 r6:00000000 r5:c02162e8
[  141.521435]  r4:ed6f8a40
[  141.523984] [<bf1f1e24>] (omap_aes_done_task [omap_aes_driver]) from [<c0232b84>] (tasklet_action_common.constprop.2+0x6c/0xf4)
[  141.535507]  r5:ed6f8a64 r4:ed6f8a60
[  141.539096] [<c0232b18>] (tasklet_action_common.constprop.2) from [<c0232c2c>] (tasklet_action+0x20/0x28)
[  141.548702]  r9:00000102 r8:40000006 r7:ffffe000 r6:c1203080 r5:c1203098 r4:00000000
[  141.556479] [<c0232c0c>] (tasklet_action) from [<c020215c>] (__do_softirq+0x124/0x28c)
[  141.564430] [<c0202038>] (__do_softirq) from [<c0232f88>] (irq_exit+0xd4/0x110)
[  141.571767]  r10:c12081c0 r9:c1200000 r8:ee80c000 r7:00000001 r6:00000000 r5:00000000
[  141.579627]  r4:c1068cfc
[  141.582172] [<c0232eb4>] (irq_exit) from [<c0288e50>] (__handle_domain_irq+0x68/0xbc)
[  141.590037] [<c0288de8>] (__handle_domain_irq) from [<c0565458>] (gic_handle_irq+0x44/0x80)
[  141.598421]  r9:c1200000 r8:fa213000 r7:c1201e10 r6:fa212000 r5:fa21200c r4:c1205104
[  141.606195] [<c0565414>] (gic_handle_irq) from [<c02019f8>] (__irq_svc+0x58/0x8c)
[  141.613706] Exception stack(0xc1201e10 to 0xc1201e58)
[  141.618776] 1e00:                                     eed4f180 00000002 00000000 000028b6
[  141.626986] 1e20: eed4f180 00000000 d4110000 00000001 c0a2a3b4 00000000 c12081c0 c1201e6c
[  141.635195] 1e40: c1201e70 c1201e60 c02526e0 c0a2f678 600b0013 ffffffff
[  141.641836]  r9:c1200000 r8:c0a2a3b4 r7:c1201e44 r6:ffffffff r5:600b0013 r4:c0a2f678
[  141.649615] [<c0a2f650>] (_raw_spin_unlock_irq) from [<c02526e0>] (finish_task_switch+0x74/0x210)
[  141.658525] [<c025266c>] (finish_task_switch) from [<c0a2a3b4>] (__schedule+0x2b4/0x7f4)
[  141.666648]  r10:c0c03350 r9:00000000 r8:ed71f2c0 r7:00000000 r6:d40c8a00 r5:c12081c0
[  141.674507]  r4:eed4f180
[  141.677051] [<c0a2a100>] (__schedule) from [<c0a2ae00>] (schedule_idle+0x3c/0x7c)
[  141.684565]  r10:c1204c48 r9:00000000 r8:00000000 r7:00000001 r6:c1204cc4 r5:c1204c7c
[  141.692423]  r4:c1200000
[  141.694968] [<c0a2adc4>] (schedule_idle) from [<c025a8f0>] (do_idle+0x14c/0x2ac)
[  141.702392]  r5:c1204c7c r4:ffffe000
[  141.705982] [<c025a7a4>] (do_idle) from [<c025ad3c>] (cpu_startup_entry+0x20/0x24)
[  141.713582]  r10:c104ba38 r9:c1253e00 r8:ffffffff r7:c1253e00 r6:00000000 r5:00000002
[  141.721440]  r4:000000c7
[  141.723983] [<c025ad1c>] (cpu_startup_entry) from [<c0a29424>] (rest_init+0xd0/0xd4)
[  141.731761] [<c0a29354>] (rest_init) from [<c1000dfc>] (start_kernel+0x448/0x470)
[  141.739271]  r5:00000000 r4:c1253e58
[  141.742860] [<c10009b4>] (start_kernel) from [<00000000>] (  (null))
[  141.749239] Code: e89da800 e1a0c00d e92ddbf0 e24cb004 (e5903000) 
[  141.755402] ---[ end trace ab846def44892ab0 ]---
[  141.760047] Kernel panic - not syncing: Fatal exception in interrupt
[  141.766431] CPU1: stopping
[  141.769153] CPU: 1 PID: 95 Comm: systemd-journal Tainted: G      D    O      4.19.94-gbe5389fd85 #1
[  141.778234] Hardware name: Generic DRA74X (Flattened Device Tree)
[  141.784349] Backtrace: 
[  141.786809] [<c020ca34>] (dump_backtrace) from [<c020cd6c>] (show_stack+0x18/0x1c)
[  141.794410]  r7:ed53beb0 r6:60000193 r5:00000000 r4:c12506dc
[  141.800095] [<c020cd54>] (show_stack) from [<c0a15124>] (dump_stack+0x9c/0xb0)
[  141.807350] [<c0a15088>] (dump_stack) from [<c020f6b8>] (handle_IPI+0x1b0/0x1dc)
[  141.814776]  r7:ed53beb0 r6:00000001 r5:00000000 r4:c1068cfc
[  141.820459] [<c020f508>] (handle_IPI) from [<c0565490>] (gic_handle_irq+0x7c/0x80)
[  141.828058]  r6:fa212000 r5:fa21200c r4:c1205104
[  141.832694] [<c0565414>] (gic_handle_irq) from [<c02019f8>] (__irq_svc+0x58/0x8c)
[  141.840205] Exception stack(0xed53beb0 to 0xed53bef8)
[  141.845275] bea0:                                     ee803c00 edcc9000 edcc9010 c1205024
[  141.853487] bec0: edcc9000 c1204c48 ee803c00 edcc9000 fffff000 ed53a000 00000142 ed53bf2c
[  141.861696] bee0: ed53bf30 ed53bf00 c0367628 c0343064 60000013 ffffffff
[  141.868337]  r9:ed53a000 r8:fffff000 r7:ed53bee4 r6:ffffffff r5:60000013 r4:c0343064
[  141.876116] [<c0342f54>] (kmem_cache_free) from [<c0367628>] (putname+0x60/0x68)
[  141.883540]  r8:fffff000 r7:edcc9000 r6:fffffffe r5:c1204c48 r4:edcc9000
[  141.890270] [<c03675c8>] (putname) from [<c0354c58>] (do_sys_open+0x12c/0x1f4)
[  141.897520]  r5:c1204c48 r4:fffffffe
[  141.901109] [<c0354b2c>] (do_sys_open) from [<c0354d58>] (sys_openat+0x14/0x18)
[  141.908447]  r10:00000142 r9:ed53a000 r8:c0201204 r7:00000142 r6:bebd8318 r5:ffffffff
[  141.916306]  r4:00470180
[  141.918849] [<c0354d44>] (sys_openat) from [<c0201000>] (ret_fast_syscall+0x0/0x4c)
[  141.926535] Exception stack(0xed53bfa8 to 0xed53bff0)
[  141.931606] bfa0:                   00470180 ffffffff ffffff9c 00470180 000a0842 000001a0
[  141.939817] bfc0: 00470180 ffffffff bebd8318 00000142 b6ed31e0 0044f8b0 00000001 bebd82e0
[  141.948027] bfe0: 00000000 bebd8140 b6f052b0 b6d2c6dc
[  141.953103] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

I have no idea what is actually causing this kernel panic, and I have little experience debugging kernel panics. Can you help me out?

Kindly acknowledge and revert.

Thanks & Regards,

Devashish
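
A first step in reading an oops like the one above is to see which backtrace frames sit in loadable modules rather than in the core kernel. The sketch below is one possible way to pull those frames out of a saved console log. The function name oops_module_frames is made up for illustration, and the pattern assumes the ARM backtrace line format shown in this post.

```shell
# Print "module function" for every caller frame in an ARM oops backtrace
# that belongs to a loadable module (frames ending in "[module_name])").
# Reads the console log on stdin. oops_module_frames is a hypothetical name.
oops_module_frames() {
    sed -n 's|.*from \[<[0-9a-f]*>] (\([A-Za-z0-9_.]*\)+0x[0-9a-f]*/0x[0-9a-f]* \[\([A-Za-z0-9_]*\)]).*|\2 \1|p'
}
```

It could be run as "oops_module_frames < panic.log", where panic.log is a hypothetical file holding the minicom capture. Frames in the core kernel, such as tasklet_action, are skipped, so whatever remains points at the driver code nearest the fault.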

  • Hi Devashish,

    There were multiple fixes to the omap aes driver recently. The crash might be related
    to one of them:

    https://lore.kernel.org/linux-crypto/20191105140111.20285-11-t-kristo@ti.com/

    That said, there are more issues with the 4.19 kernel. You can try porting the changes from the latest kernel:

    github.com/.../omap-aes.c

    https://github.com/torvalds/linux/blob/master/drivers/crypto/omap-aes.h

    If there are too many changes, I would recommend not using hardware crypto for IPSec.

    Best Regards,
    Keerthy

  • Hi,

    Thanks for your response.

    I can't help but ask: what do you mean by "not using hardware crypto for IPSec"?

    We have used NETKEY kernel stack for IPSec which, in my opinion, is purely software and does not use hardware based crypto acceleration.

    I will apply the omap aes driver changes you mentioned and rerun my test case.

    Thanks,

    Devashish

  • Hi Devashish,

    The function that is crashing for you is in the omap AES hardware crypto driver.
    You could remove the hardware crypto modules using rmmod and then
    create a software-crypto-based IPSec tunnel.

    Just rmmod omap-aes-driver.ko, omap-crypto.ko and omap-sham.ko.

    Then set up the IPSec tunnel; it should avoid the hardware crypto accelerators.


    Best Regards,
    Keerthy
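
One way to check which implementations are registered for an algorithm, before and after the rmmod step above, is to read /proc/crypto, where each entry lists its name, driver, module and priority. The sketch below prints the implementations of one algorithm with the highest priority first, since the kernel crypto API normally prefers the highest-priority one. The function name crypto_impls is hypothetical, and the algorithm name passed in the usage line is only an example: the exact entry an IPSec SA ends up using may be a composite one.

```shell
# List the registered implementations of one algorithm from /proc/crypto-style
# input on stdin, as "priority driver module", highest priority first.
# $1: algorithm name exactly as it appears in the "name" field.
# crypto_impls is a hypothetical helper name.
crypto_impls() {
    awk -v want="$1" -F' *: *' '
        $1 == "name"     { name = $2 }
        $1 == "driver"   { driver = $2 }
        $1 == "module"   { module = $2 }
        # priority follows name, driver and module within each entry
        $1 == "priority" { if (name == want) print $2, driver, module }
    ' | sort -rn
}
```

For example: crypto_impls 'cbc(aes)' < /proc/crypto. If the hardware modules have been removed, no entry whose module is one of the omap drivers should remain in the list.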

  • Hi,

    I applied the changes to the omap aes driver from the following patch:

    https://lore.kernel.org/linux-crypto/20191105140111.20285-11-t-kristo@ti.com/

    Now I am able to ping the server with large ping packet sizes (up to 1400 bytes). However, when I run "ping -s 1500 server_IP" on the EVM, I don't receive any ping reply.

    I assume there is some issue with ESP fragmentation. If you have anything to add on this issue, please comment.

    Also, I haven't removed the hardware crypto modules with rmmod yet, and I am able to establish an IPsec tunnel. Does that mean the hardware crypto accelerator is being used for crypto operations? If yes, how can I verify that?

    Kindly acknowledge and revert.

    Regards,

    Devashish
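
To see why payloads up to about 1400 bytes and a 1500-byte payload could behave differently, it may help to work out the on-wire size of one tunnelled ping. The sketch below does that arithmetic for the proposal shown in the statusall output (AES_CBC_256 with a 16-byte IV and 16-byte blocks, HMAC_SHA2_256_128 with a 16-byte ICV). It assumes IPv4 without options, tunnel mode without UDP encapsulation (NAT-T), and that the packet is encapsulated in one piece. The function name esp_tunnel_size is hypothetical, and the 1500-byte link MTU mentioned afterwards is an assumption not stated in the thread.

```shell
# Estimate the outer IPv4 packet size of one ICMP echo sent through an
# ESP tunnel using AES-CBC and HMAC-SHA2-256-128.
# $1: ICMP payload size as given to "ping -s".
# esp_tunnel_size is a hypothetical helper name.
esp_tunnel_size() {
    payload=$1
    inner=$((payload + 8 + 20))      # ICMP header + inner IPv4 header
    block=16                         # AES-CBC block size in bytes
    plain=$((inner + 2))             # + pad-length and next-header bytes
    padded=$(( (plain + block - 1) / block * block ))   # round up to a full block
    esp=$((8 + 16 + padded + 16))    # ESP header (SPI, sequence) + IV + ciphertext + ICV
    echo $((esp + 20))               # + outer IPv4 header
}
```

Under these assumptions, esp_tunnel_size 1400 gives 1500, which would just fit a 1500-byte MTU, while esp_tunnel_size 1500 gives 1596, which would not fit without fragmentation somewhere on the path. That is consistent with the fragmentation suspicion but does not prove it is the cause.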

  • Hi Devashish,

    The fact that the patch fixed the earlier issue means that hardware crypto is used.

    1) You can add prints in omap_aes_done_task. Do this only as an experiment: it will flood your
    console, as the prints fire at very high frequency.

    2) The other way is to monitor the DMA interrupt counts in /proc/interrupts:
    cat /proc/interrupts | grep sham

    The count should increase.

    Method 1 is the sure-shot way: try it without rmmod and then with rmmod, and you should see those
    crypto prints disappear after you rmmod.


    If there are no further questions please resolve.

    Best Regards,
    Keerthy
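
Method 2 above is easier to judge when the counts are compared before and after sending traffic rather than read by eye. The sketch below sums the per-CPU counts of every /proc/interrupts line containing a given string. The function name irq_total is hypothetical, and which interrupt names actually appear for the crypto blocks depends on the board and kernel.

```shell
# Sum the per-CPU interrupt counts of all /proc/interrupts lines that
# contain a given string. Reads /proc/interrupts-style text on stdin.
# $1: fixed string to look for, e.g. "sham".
# irq_total is a hypothetical helper name.
irq_total() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        index($0, pat) { for (i = 2; i <= ncpu + 1; i++) total += $i }
        END { print total + 0 }
    '
}

# Possible usage on the target:
#   before=$(irq_total sham < /proc/interrupts)
#   ... send traffic through the tunnel ...
#   after=$(irq_total sham < /proc/interrupts)
#   echo $((after - before))
```

A difference that grows with tunnel traffic suggests the hardware block is in use. A difference of zero would be expected once the modules are removed.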


  • Hi,

    Thanks for your reply. I was able to verify both of your methods (1) & (2).

    I have one more question related to this. I am not able to run the command from the IPSec Testing section of this guide on my EVM: https://software-dl.ti.com/processor-sdk-linux/esd/docs/06_03_00_106/linux/Foundational_Components/Kernel/Kernel_Drivers/Crypto.html . I get the following output:

    root@am57xx-evm:~# iperf3 -c 40.40.40.100 -u -b 400.0M -t 10 
    iperf3: error - unable to connect to server: Connection refused
    

    However, if I run it with iperf instead, I get:

    iperf -c 40.40.40.100 -u -b 400.0M -t 10 
    ------------------------------------------------------------
    Client connecting to 40.40.40.100, UDP port 5001
    Sending 1470 byte datagrams
    UDP buffer size:  176 KByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.139.229 port 48544 connected with 40.40.40.100 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec   114 MBytes  95.8 Mbits/sec
    [  3] Sent 81430 datagrams
    [  3] WARNING: did not receive ack of last datagram after 10 tries.
    

    What is this command useful for? I get different outputs from it with and without the hardware crypto drivers removed via rmmod.

    Please explain.

    Regards,

    Devashish

  • Hi Devashish,

    That was added primarily for AM65 processors, as you can see in that section, though it should work with AM5728 as well.
    It is meant to test throughput.

    You can go through this: https://www.tecmint.com/test-network-throughput-in-linux/

    On the server side (PC) you need to run something like:

    iperf -s -u --len 1500

    This is needed only if you want to measure network throughput.
    Without the IPSec tunnel the throughput will be high; with the IPSec tunnel it will
    typically drop, because the crypto framework comes into the path.

    Let me know if you have any other questions. If not please resolve.

    Best Regards,
    Keerthy