I have successfully installed the Crypto API drivers and Open Crypto Framework (OCF) for the DM814x by following the "Installing AM387x C6A814x DM814x Crypto Support" wiki page. When I compare the cryptotest AES results between the hardware-accelerated and software-only Crypto API drivers, I see only about a 2x speed improvement at best, and only with larger buffers.
Software only:
# dmesg | grep nss
nss_rng nss_rng: NSS Random Number Generator ver. 2.0
#
# time -v cryptotest -a aes 4096 64
0.182 sec, 8192 aes crypts, 64 bytes, 2879295 byte/sec, 22.0 Mb/sec
Command being timed: "cryptotest -a aes 4096 64"
User time (seconds): 0.00
System time (seconds): 0.09
Percent of CPU this job got: 51%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 0.18s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2080
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 143
Voluntary context switches: 11
Involuntary context switches: 5998
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 1024
0.691 sec, 8192 aes crypts, 1024 bytes, 12142515 byte/sec, 92.6 Mb/sec
Command being timed: "cryptotest -a aes 4096 1024"
User time (seconds): 0.00
System time (seconds): 0.57
Percent of CPU this job got: 81%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 0.69s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2080
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 143
Voluntary context switches: 7
Involuntary context switches: 7503
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 4096
2.454 sec, 8192 aes crypts, 4096 bytes, 13674800 byte/sec, 104.3 Mb/sec
Command being timed: "cryptotest -a aes 4096 4096"
User time (seconds): 0.00
System time (seconds): 1.97
Percent of CPU this job got: 80%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 2.46s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2128
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 146
Voluntary context switches: 6
Involuntary context switches: 7922
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 65280
36.467 sec, 8192 aes crypts, 65280 bytes, 14664563 byte/sec, 111.9 Mb/sec
Command being timed: "cryptotest -a aes 4096 65280"
User time (seconds): 0.00
System time (seconds): 11.06
Percent of CPU this job got: 30%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 36.48s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2384
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 191
Voluntary context switches: 6
Involuntary context switches: 8227
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
Hardware accelerated:
# dmesg | grep nss
nss_rng nss_rng: NSS Random Number Generator ver. 2.0
nss_aes_mod_init: loading NSS AES driver
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 0 @0x41140000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 1 @0x41141000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 2 @0x411a0000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 3 @0x411a1000)
nss_aes_probe: probe() done
nss_des_mod_init: loading NSS DES driver
nss-des nss-des: NSS DES hw accel rev: 2.2 (context 0 @0x41160000)
nss-des nss-des: NSS DES hw accel rev: 2.2 (context 1 @0x41161000)
nss_des_probe: probe() done
nss_sham_mod_init: loading NSS SHA/MD5 driver
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 0 @0x41100000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 1 @0x41101000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 2 @0x411c0000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 3 @0x411c1000)
nss_sham_probe: probe() done
#
# time -v cryptotest -a aes 4096 64
0.213 sec, 8192 aes crypts, 64 bytes, 2459483 byte/sec, 18.8 Mb/sec
Command being timed: "cryptotest -a aes 4096 64"
User time (seconds): 0.00
System time (seconds): 0.11
Percent of CPU this job got: 51%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 0.22s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2080
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 143
Voluntary context switches: 1769
Involuntary context switches: 4821
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 1024
0.477 sec, 8192 aes crypts, 1024 bytes, 17583895 byte/sec, 134.2 Mb/sec
Command being timed: "cryptotest -a aes 4096 1024"
User time (seconds): 0.00
System time (seconds): 0.24
Percent of CPU this job got: 51%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 0.48s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2080
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 143
Voluntary context switches: 8198
Involuntary context switches: 32
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 4096
1.281 sec, 8192 aes crypts, 4096 bytes, 26193547 byte/sec, 199.8 Mb/sec
Command being timed: "cryptotest -a aes 4096 4096"
User time (seconds): 0.00
System time (seconds): 0.28
Percent of CPU this job got: 22%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 1.28s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2128
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 146
Voluntary context switches: 8199
Involuntary context switches: 33
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
# time -v cryptotest -a aes 4096 65280
16.561 sec, 8192 aes crypts, 65280 bytes, 32290958 byte/sec, 246.4 Mb/sec
Command being timed: "cryptotest -a aes 4096 65280"
User time (seconds): 0.01
System time (seconds): 0.90
Percent of CPU this job got: 5%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 16.57s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2384
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 191
Voluntary context switches: 8198
Involuntary context switches: 194
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
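For reference, here is the relative throughput worked out from the cryptotest runs above. This is only a quick sketch in Python: the byte/sec and CPU figures are copied by hand from the output, and the names throughput, cpu_percent and speedup are ones I picked for illustration, not part of cryptotest or the drivers.

```python
# Figures copied by hand from the cryptotest / time -v output above.
# Keys are buffer sizes in bytes; values are (software-only, hardware-accelerated).
throughput = {  # byte/sec as reported by cryptotest
    64: (2879295, 2459483),
    1024: (12142515, 17583895),
    4096: (13674800, 26193547),
    65280: (14664563, 32290958),
}
cpu_percent = {  # "Percent of CPU this job got" from time -v
    64: (51, 51),
    1024: (81, 51),
    4096: (80, 22),
    65280: (30, 5),
}


def speedup(size):
    """Hardware throughput divided by software throughput for one buffer size."""
    sw, hw = throughput[size]
    return hw / sw


for size in sorted(throughput):
    sw_cpu, hw_cpu = cpu_percent[size]
    print(f"{size:>6} bytes: {speedup(size):.2f}x throughput, "
          f"CPU {sw_cpu}% -> {hw_cpu}%")
```

By this arithmetic the hardware path is slightly slower than software at 64 bytes (about 0.85x) and reaches roughly 2.2x at 65280 bytes, while CPU usage at that size drops from 30% to 5%.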
I also didn't see much improvement with scp when using an OpenSSL build with cryptodev support enabled (-DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS).
Software only:
$ scp -c aes128-cbc -P 2222 root@10.100.251.249:/dm816x_1080p_demo.264 .
dm816x_1080p_demo.264 100% 155MB 7.1MB/s 00:22
Hardware accelerated:
$ scp -c aes128-cbc -P 2222 root@10.100.251.249:/dm816x_1080p_demo.264 .
dm816x_1080p_demo.264 100% 155MB 9.7MB/s 00:16
I expected to see better performance than this with the hardware acceleration. Do these numbers look correct?