Computer Science

Assembly optimization of an AES-128-CTR algorithm based on a Cortex-M4 core

  • Dongxuan YANG ,
  • Ganggang ZHANG ,
  • Xinliang LIU
  • 1. School of E-commerce and Logistics, Beijing Technology and Business University, Beijing 100048, China
    2. Digital Campus, Capital Normal University, Beijing 100048, China

Received date: 2021-03-23

  Online published: 2022-07-19

Funding

Sub-project of the National Key Research and Development Program of China (2016YFD0401205); Beijing Natural Science Foundation (4202014); Beijing Municipal Science and Technology Commission Program (Z191100008619007)

Cite this article

Dongxuan YANG, Ganggang ZHANG, Xinliang LIU. Assembly optimization of an AES-128-CTR algorithm based on a Cortex-M4 core[J]. Journal of East China Normal University (Natural Science), 2022, 2022(4): 67-78. DOI: 10.3969/j.issn.1000-5641.2022.04.007

Abstract

With the rapid development of the Internet of Things, embedded hardware products face great challenges in keeping data secure. In data encryption and decryption, the AES (Advanced Encryption Standard) algorithm offers strong attack resistance, fast operation, and flexible block length. Because embedded microcontrollers lack the extended instruction set for AES encryption found on general-purpose CPUs (Central Processing Units), the algorithm runs far slower on microcontroller platforms than on such CPUs. To address this problem, the AES algorithm in CTR (Counter) mode is implemented in assembly language on a microcontroller platform based on the Cortex-M4 core instruction set, with the aim of raising its speed. The round transformation is optimized around characteristic features of the core, such as its barrel shifter and three-stage pipeline, which reduces the number of instruction cycles the algorithm needs. Tests on an FRDM-K82F development board show that the optimized implementation runs more efficiently than a C-language implementation, and that it has cost and power-consumption advantages over hardware AES encryption based on a coprocessor.

Keywords: assembly optimization; AES; Cortex-M4
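As a rough illustration of why a barrel shifter can help the round transformation mentioned in the abstract, the C sketch below shows the common single-T-table formulation of SubBytes plus MixColumns. It is not the authors' assembly code. The names `aes_tables_init`, `column_reference`, `column_one_table`, `T0`, and `sbox` are hypothetical, and the little-endian packing of a state column into a 32-bit word is an assumption. One 1 KiB table plus three rotations stands in for four separate tables; on a Cortex-M4 each rotation can be folded into the shifted second operand of an `EOR` instruction, so it need not cost a separate instruction.

```c
#include <stdint.h>

/* Rotate right by n bits (0 < n < 32). In ARM assembly this can be folded
   into the second operand, e.g. "eor r0, r0, r1, ror #24". */
static uint32_t ror32(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32u - n));
}

/* Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11b. */
static uint8_t xtime(uint8_t a) {
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

static uint8_t rotl8(uint8_t x, unsigned n) {
    return (uint8_t)((x << n) | (x >> (8u - n)));
}

uint8_t  sbox[256];   /* AES S-box, generated at start-up          */
uint32_t T0[256];     /* T0[x] packs (2s, s, s, 3s), s = sbox[x]   */

void aes_tables_init(void) {
    uint8_t p = 1, q = 1;
    do {
        /* p runs through the powers of 3, q through the inverse powers,
           so q is always the multiplicative inverse of p. */
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0x00));
        q ^= (uint8_t)(q << 1);
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (q & 0x80) ? 0x09 : 0x00;
        /* Affine transformation of the inverse gives the S-box entry. */
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3)
                              ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;   /* 0 has no inverse and is handled separately */

    for (int x = 0; x < 256; x++) {
        uint8_t s  = sbox[x];
        uint8_t s2 = xtime(s);
        uint8_t s3 = (uint8_t)(s2 ^ s);
        T0[x] = (uint32_t)s2 | ((uint32_t)s << 8)
              | ((uint32_t)s << 16) | ((uint32_t)s3 << 24);
    }
}

/* Reference: SubBytes then MixColumns on one column, byte by byte.
   a[0..3] are the column's bytes after ShiftRows, row 0 first. */
uint32_t column_reference(const uint8_t a[4]) {
    uint8_t s[4], o[4];
    for (int i = 0; i < 4; i++) s[i] = sbox[a[i]];
    for (int i = 0; i < 4; i++) {
        /* o[i] = 2*s[i] ^ 3*s[i+1] ^ s[i+2] ^ s[i+3], indices mod 4 */
        uint8_t v = s[(i + 1) & 3];
        o[i] = (uint8_t)(xtime(s[i]) ^ xtime(v) ^ v
                         ^ s[(i + 2) & 3] ^ s[(i + 3) & 3]);
    }
    return (uint32_t)o[0] | ((uint32_t)o[1] << 8)
         | ((uint32_t)o[2] << 16) | ((uint32_t)o[3] << 24);
}

/* Same column computed from the single table plus three rotations. */
uint32_t column_one_table(const uint8_t a[4]) {
    return T0[a[0]] ^ ror32(T0[a[1]], 24)
         ^ ror32(T0[a[2]], 16) ^ ror32(T0[a[3]], 8);
}
```

After one call to `aes_tables_init()`, `column_one_table(a)` and `column_reference(a)` should return the same word for any four input bytes. A full round would XOR the round-key word into the result; that step, and any protection against timing leakage from data-dependent table lookups, are left out of this sketch.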
