2021 was another exciting year for Arm NN and Arm Compute Library (ACL), jam-packed with usability improvements and enhanced machine learning (ML) performance for Arm IP. As we do every year, the Arm NN team added more operator support to the Arm NN Android NNAPI Driver and the Arm NN TF Lite Delegate. We also hosted Arm AI Tech Talks and published new guides that make getting set up with Arm NN easier than ever before. Let’s look at some of the highlights…
With every Arm NN release, we now publish pre-built binaries that support a range of platforms and Arm architectures. In our most recent release, 21.11, we also published something totally new for Android ML app developers: the Arm NN AAR (Android Archive) file. It packages up the Arm NN TF Lite Delegate, Arm NN itself and ACL, ready to be integrated into your Android ML application. We held an Arm AI Tech Talk on how to run an ML Image Segmentation app in 5 minutes using this AAR file, and published a supporting guide to go with it. To download the pre-built binaries for the 21.11 Arm NN release, including the Arm NN AAR file, please see the Assets at the bottom of the 21.11 Release Notes page.
Arm AI Tech Talk
Linux developers, don’t fret, we have not forgotten about you! The Arm NN team published an awesome guide on how to integrate the pre-built binaries for the Arm NN TF Lite Delegate into an Image Classification app, using the popular MobileNetV2 model. In addition, we published a guide that outlines how to perform Automatic Speech Recognition with Wav2Letter using PyArmNN (the Python extension to our SDK) and our Debian packages. ML developers on Arm have never had as much flexibility as they do now. We are constantly adding new ways to integrate Arm NN into your app and providing the supporting material to help you along the way. If you have any suggestions on what you’d like to see next, please let us know in a GitHub Issue.
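To give a feel for what the delegate integration looks like, here is a minimal Python sketch, not the code from the guide itself. It assumes the `tflite_runtime` package is installed and that the pre-built `libarmnnDelegate.so` is on the library path. The helper names `armnn_delegate_options` and `make_interpreter`, the default backend order and the model file name are hypothetical choices for illustration; `backends` and `logging-severity` are delegate option keys.

```python
from typing import Dict, Sequence


def armnn_delegate_options(backends: Sequence[str] = ("CpuAcc", "CpuRef"),
                           log_level: str = "info") -> Dict[str, str]:
    """Build the option dictionary handed to the Arm NN TF Lite Delegate.

    Backends are listed in order of preference, e.g. GpuAcc before CpuAcc.
    """
    if not backends:
        raise ValueError("at least one backend is required")
    return {"backends": ",".join(backends), "logging-severity": log_level}


def make_interpreter(model_path: str,
                     delegate_path: str = "libarmnnDelegate.so",
                     backends: Sequence[str] = ("CpuAcc", "CpuRef")):
    """Create a TF Lite interpreter that offloads supported operators to Arm NN.

    The import is local so the helper above can be used without
    tflite_runtime being installed.
    """
    import tflite_runtime.interpreter as tflite

    delegate = tflite.load_delegate(
        library=delegate_path,  # assumed location of the pre-built delegate
        options=armnn_delegate_options(backends),
    )
    interpreter = tflite.Interpreter(
        model_path=model_path,
        experimental_delegates=[delegate],
    )
    interpreter.allocate_tensors()
    return interpreter
```

On a device with the binaries in place, something like `make_interpreter("mobilenet_v2.tflite", backends=("GpuAcc", "CpuAcc"))` would then return an interpreter ready for `set_tensor` and `invoke`. The model file name here is a placeholder.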
Arm NN and ACL continue to outperform generic ML libraries on Arm CPUs and GPUs. Our engineers in the ACL team are experts on the Arm architecture, and they continued to squeeze more performance out of ML models with every release of our software in 2021. We also introduced some exciting performance-related features. Kernel compression was introduced for the Android platform, halving the binary size of ACL! We removed the need to recompile CL kernels on every run by adding the ability to cache them, resulting in massive performance uplifts on the first execution of an ML model. Zero copy is now possible on the GPU backend, reducing memory usage and eliminating the time spent copying data in and out of the GPU memory space. This list of performance features is not exhaustive, and our engineers have many more planned for 2022 and beyond.
New Arm technologies
2021 was a massive year for Arm, with the announcement of our ground-breaking Armv9 CPU Architecture. Naturally, Arm NN supports Armv9 features ahead of any other general-purpose ML library. The Scalable Matrix Extension (SME) is an Armv9 feature that builds on SVE2, promising a significant increase in CPU matrix processing throughput and efficiency. SME empowers next-generation technologies such as 5G systems, virtual reality (VR), augmented reality (AR) and smart home applications. In addition, the ACL team recently enabled amazing Armv8 architectural features, such as the Matrix Multiply (MatMul) and Dot Product instructions, further improving ML and floating-point performance. We also fine-tuned our software’s floating-point support, continuing to provide class-leading performance on Arm’s latest GPU offerings, such as the Arm Mali-G710.
An exceptional year for Arm NN
As you can see, 2021 was an exceptionally busy year. The usability improvements have made it possible to integrate Arm NN into an Android app in minutes, and our software continues to be the best solution for ML performance on Arm IP. We can’t wait to unlock the power of our partners’ products with Arm NN and ACL in 2022. Stay tuned for our quarterly releases!