PROCESSOR-SDK-AM335X: AM335x Sitara HS HW Cryptography Accelerator Performance

Part Number: PROCESSOR-SDK-AM335X
Other Parts Discussed in Thread: SHA-256

Hi all,

I've developed my own bare-metal SHA + MD5 crypto driver (in polling mode for the moment).

At the same time, I have a working mbedTLS solution (software implementation).

When I process 1 kbyte of data in my benchmark, I get the following with my bare-metal AM335x hardware acceleration:

MD5 : 94996960 ns
SHA-1 : 99947880 ns
SHA-256 : 114130480 ns

With mbedTLS on the same data, so without hardware acceleration, I get:

MD5 : 51832760 ns
SHA-1 : 81381120 ns
SHA-256 : 196619960 ns

I am wondering whether there is an error in my bare-metal code, or whether this is normal because of polling mode. How can I compare it with Linux, for example: do I need to test Linux in polling mode as well?

Thanks !

Alexis
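
The two sets of timings in the question above can be compared as a simple ratio. The following is only a sketch of that arithmetic; the function name `hw_to_sw_ratio` is made up for illustration and does not come from the thread.

```c
#include <stdint.h>

/* Ratio of the hardware-accelerated time to the software time.
 * A result above 1.0 means the hardware path was slower in this run,
 * a result below 1.0 means it was faster.
 * (Hypothetical helper, named here for illustration only.) */
double hw_to_sw_ratio(uint64_t hw_ns, uint64_t sw_ns)
{
    return (double)hw_ns / (double)sw_ns;
}
```

With the posted numbers, the ratio comes out at roughly 1.83 for MD5, 1.23 for SHA-1 and 0.58 for SHA-256, so in this benchmark only SHA-256 finished sooner on the accelerator.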

  • Hi Alexis,
    It is very good to know that you worked out a bare-metal crypto driver for SHA+MD5 on AM335x.
    Some notes on HW crypto performance with respect to a SW implementation:
    a). The AM335x CPU can run at up to 1GHz, depending on the device speed grade.
    b). The AES and SHA modules on the L3 interconnect run off L3F_CLK, which is 200MHz for OPP100 or 100MHz for OPP50.
    Please refer to Table 8-22, Core PLL Typical Frequencies (MHz), and Table 8-23, Bus Interface Clocks, in the AM335x TRM.
    c). One of the benefits of using the HW crypto block is off-loading the CPU. This requires interrupt-driven DMA to be enabled.
    d). For reference on CPU offload with HW crypto, you may refer to the Linux kernel crypto driver performance:
    software-dl.ti.com/.../Crypto.html
    Best,
    -Hong
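
Point (c) above contrasts polling with interrupt-driven DMA. As a minimal sketch of what polling costs the CPU, assuming a hypothetical status register and ready mask (neither name is taken from the TRM), a bounded busy-wait could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Spin until every bit in `mask` is set in *status, or give up after
 * `max_spins` reads. Returns true on success and false on timeout.
 * `status` stands in for a memory-mapped status register; the real
 * register and its bit layout are assumptions, not TRM facts. */
bool poll_until_set(volatile const uint32_t *status, uint32_t mask,
                    uint32_t max_spins)
{
    while (max_spins-- > 0u) {
        if ((*status & mask) == mask) {
            return true;
        }
    }
    return false;
}
```

While this loop runs the CPU does nothing else, which is the cost that an interrupt-driven DMA transfer is meant to avoid.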

  • Thanks for the reply, Hong :)

    I was not comparing like with like, hence the misleading results.

    For SHA-1 over 1 x 1 MB of data:

    • Linux, DMA, 1 GHz: 0m0.013s = 13 ms
    • Linux, DMA, 300 MHz: 0m0.034s = 34 ms
    • Linux, polling, 300 MHz: 0m0.731s = 731 ms
    • Bare-metal code, polling, 300 MHz: 99542520 ns ≈ 99.5 ms

    DMA clearly speeds things up considerably, so I am trying to implement it.
    I have a question about the TRM, which describes the HIB1 and HIB2 modules.
    From the documentation, I understand that HIB1 = SHA_S (registers)
    and that HIB2 = SHA_P (registers).

    What are the differences between the two? (I heard "S" stands for secure and "P" for public, right?) Should I use both modules, or only the SHA_P registers?
    My understanding is that only HIB2 supports DMA. Is that correct?
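
The four SHA-1 timings in the post above are easier to compare as throughput and speedup. The sketch below assumes that 1 MB means 1,000,000 bytes (the benchmark may actually have used 2^20 bytes); both function names are illustrative.

```c
#include <stdint.h>

/* Throughput in MB/s, taking 1 MB = 1,000,000 bytes (an assumption).
 * bytes / ns * 1e9 ns/s / 1e6 bytes/MB simplifies to bytes * 1e3 / ns. */
double throughput_mb_per_s(uint64_t bytes, uint64_t elapsed_ns)
{
    return (double)bytes * 1000.0 / (double)elapsed_ns;
}

/* How many times faster the run taking `fast_ns` is than the run
 * taking `slow_ns`. */
double speedup(uint64_t slow_ns, uint64_t fast_ns)
{
    return (double)slow_ns / (double)fast_ns;
}
```

Under that assumption, 13 ms corresponds to roughly 77 MB/s, and at 300 MHz under Linux the DMA run (34 ms) is 21.5 times faster than the polling run (731 ms).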

  • Hi Alexis,
    Thanks for your update.
    Yes, you're right that in general "S" stands for secure and "P" stands for non-secure (public).
    You'd use the "P" registers. One example of the "P" registers is in the Linux kernel device tree file "Linux/arch/arm/boot/dts/am33xx.dtsi":

    		sham: sham@53100000 {
    			compatible = "ti,omap4-sham";
    			ti,hwmods = "sham";
    			reg = <0x53100000 0x200>;
    			interrupts = <109>;
    			dmas = <&edma 36 0>;
    			dma-names = "rx";
    		};
    
    		aes: aes@53500000 {
    			compatible = "ti,omap4-aes";
    			ti,hwmods = "aes";
    			reg = <0x53500000 0xa0>;
    			interrupts = <103>;
    			dmas = <&edma 6 0>,
    			       <&edma 5 0>;
    			dma-names = "tx", "rx";
    		};

    Best,

    -Hong
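
For a bare-metal driver, the values in the two device tree nodes above can be carried over as constants. This is only a sketch: the numbers are copied from the quoted `am33xx.dtsi` nodes, while the macro names and the `offset_in_window` helper are invented for illustration.

```c
#include <stdint.h>

/* Values copied from the sham and aes nodes quoted above.
 * All macro names are illustrative, not from any TI header. */
#define AM335X_SHAM_BASE     0x53100000u  /* reg, first cell            */
#define AM335X_SHAM_SIZE     0x200u       /* reg, second cell           */
#define AM335X_SHAM_IRQ      109u         /* interrupts                 */
#define AM335X_SHAM_EDMA_RX  36u          /* dmas, dma-names = "rx"     */

#define AM335X_AES_BASE      0x53500000u
#define AM335X_AES_SIZE      0xa0u
#define AM335X_AES_IRQ       103u
#define AM335X_AES_EDMA_TX   6u           /* dmas, dma-names = "tx"     */
#define AM335X_AES_EDMA_RX   5u           /* dmas, dma-names = "rx"     */

/* Returns 1 if `off` addresses a whole, aligned 32-bit word inside a
 * register window of `size` bytes, otherwise 0. */
int offset_in_window(uint32_t off, uint32_t size)
{
    return (off % 4u == 0u) && (size >= 4u) && (off <= size - 4u);
}
```

A check like `offset_in_window(off, AM335X_SHAM_SIZE)` is one cheap way to catch a wrong register offset before it turns into a stray bus access.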

  • Hi Hong ! ;)

    As you may have guessed, I'm now trying to implement DMA for my crypto implementation. I found the edma_echo_uart example in the StarterWare samples, but I cannot get it to run correctly: the callback is never called.

    Do you have working sample code for bare metal with DMA?

    I'm also trying to work out the difference between the crypto DMA event numbers (https://www.ti.com/lit/ug/spruh73q/spruh73q.pdf, page 547) and the EDMA events (page 1635).

    Does the crypto engine use only DMA and not EDMA3? It's not clear to me.

    Thanks for your help !

    Have a good day !

    Alexis.

  • Hi Alexis,
    The non-secure (public) crypto engine uses EDMA. You may refer to the AM335x Crypto TRM:
    - AES:
    Figure 2-2. AES Integration
    Table 2-2. Hardware Requests
    E_DMA_4
    E_DMA_5
    E_DMA_6
    E_DMA_7
    - SHA:
    Figure 3-1. SHA/MD5 Bus System Overview
    Table 3-2. Hardware Requests
    E_DMA_35
    E_DMA_36
    E_DMA_37

    The EDMA event numbers for AES and SHA match those defined in the Linux kernel DT file "Linux/arch/arm/boot/dts/am33xx.dtsi" quoted in my last reply:
    - sham:
    dmas = <&edma 36 0>;
    dma-names = "rx";
    - aes:
    dmas = <&edma 6 0>,
    <&edma 5 0>;
    dma-names = "tx", "rx";

    Best,
    -Hong
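
When enabling one of these events in bare-metal code, the event number has to be turned into a register half and a bit position. The EDMA3 channel controller splits its 64 events across low/high register pairs such as EER and EERH, with events 0-31 in the low register and 32-63 in the high one. The sketch below shows only that mapping; the type and function names are made up for illustration, and the register offsets themselves should be taken from the TRM.

```c
#include <stdint.h>

#define EDMA_EVENTS_PER_REG 32u

/* Where an EDMA event lives inside a low/high register pair.
 * (Illustrative type, not from StarterWare or any TI header.) */
typedef struct {
    int      high;  /* 0: low register (events 0-31), 1: high (32-63) */
    uint32_t mask;  /* single-bit mask for the event in that register */
} edma_event_bit;

/* Map an event number in the range 0-63 to its register half and bit. */
edma_event_bit edma_event_to_bit(uint32_t event)
{
    edma_event_bit b;
    b.high = (event >= EDMA_EVENTS_PER_REG) ? 1 : 0;
    b.mask = (uint32_t)1u << (event % EDMA_EVENTS_PER_REG);
    return b;
}
```

For the numbers in this thread, SHA event 36 lands in the high register at bit 4, while AES events 5 and 6 land in the low register at bits 5 and 6.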