Working with its architecture licensees and ecosystem partners, Arm continues to evolve its architecture, developing new functionality to meet the needs of both new and existing markets.
This blog discusses some of the key additions to the A-Profile architecture in 2022.
Full Instruction Set and System Register information will be available from early October on our developer webpages. The complete Arm Architecture Reference Manual (Arm ARM), covering the 2022 extensions and earlier functionality, is due for release in early 2023. Updates to the Learn the Architecture pages will appear during 2022 and 2023.
Details of previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019, 2020 and 2021.
The 2022 extensions include several updates to the VMSA.
The 2022 extensions introduce a new way to control memory permissions. Instead of directly encoding the permission in the Translation Table Entry (TTE), fields in the TTEs are used to index into an array of permissions specified in a register. This indirection provides greater flexibility, greater encoding density and enables the representation of new permissions.
Each TTE can select two values: a base permission and an overlay. The base permission represents the maximum set of permissions that the block or page has. The overlay can be used to further restrict the permission. This is illustrated in the following diagram:
The base permission is permitted to be cached in a TLB, but the overlay permission interpretation is not. This means that the effective permission of a block or page can be efficiently changed dynamically.
For operating systems, the architecture provides separate EL1 and EL0 overlay registers. This allows an operating system to set a maximum permission for a page allocated to an application, then lets the application manage permissions further within those constraints. For example, a JIT might be allocated a page that the operating system permits to be writeable or executable. The JIT could then use the overlays to control whether the page is currently writeable or executable. This reduces the number of system calls and TLB invalidations.
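The base-plus-overlay lookup and the JIT example above can be modelled in a few lines of Python. This is a minimal sketch, not the architectural definition: the `Perm` flags, the contents and sizes of `base_perms` and `overlay_perms`, the chosen indices, and the function names are all illustrative assumptions rather than real register layouts or encodings.

```python
from enum import Flag


class Perm(Flag):
    """Illustrative permission flags (not the architectural encoding)."""
    R = 1
    W = 2
    X = 4


NONE = Perm(0)
RW = Perm.R | Perm.W
RX = Perm.R | Perm.X
RWX = Perm.R | Perm.W | Perm.X

# Hypothetical stand-in for the base permission register, owned by the OS.
base_perms = [NONE, Perm.R, RW, RX, RWX]

# Hypothetical stand-in for the EL0 overlay register, owned by the application.
overlay_perms = [RWX, RW, RX, Perm.R]


def effective_permission(base_index, overlay_index):
    """Resolve the two indices selected by a TTE.

    The overlay is ANDed with the base, so it can remove permissions
    but never grant more than the base allows.
    """
    return base_perms[base_index] & overlay_perms[overlay_index]


# JIT page: the OS grants RWX as the maximum (base index 4) and the TTE
# selects overlay index 1. Both index choices are assumptions.
JIT_BASE, JIT_OVERLAY = 4, 1


def jit_set_mode(mode):
    """Change the page's current mode by rewriting one overlay entry.

    The TTE itself is untouched, which is why no system call or TLB
    invalidation appears in this model.
    """
    overlay_perms[JIT_OVERLAY] = mode


jit_set_mode(RW)
while_writing = effective_permission(JIT_BASE, JIT_OVERLAY)  # emit code
jit_set_mode(RX)
while_running = effective_permission(JIT_BASE, JIT_OVERLAY)  # run code
```

The design point the sketch tries to capture is that the application only ever edits its own overlay array, while the ceiling set through the base array stays under operating system control.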
Permission indirection also has benefits where the same tables are shared by multiple entities. For example, a set of tables might be used by both an Arm processor and an Arm System Memory Management Unit (SMMU). The permissions that we want to apply to software accesses might be different to those we want to apply to an accelerator behind the SMMU. With permission indirection, the processor and SMMU can use the same tables but interpret the permissions differently.
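As a rough illustration of the shared-table case, the sketch below resolves the same TTE index through two different permission arrays, one per agent. The bit values, array contents, and names (`CPU_PERMS`, `SMMU_PERMS`, `resolve`) are hypothetical and chosen only to show the idea.

```python
# Illustrative permission bits (not the architectural encoding).
R, W, X = 1, 2, 4

# Hypothetical per-agent permission arrays, both indexed by the same TTE field.
CPU_PERMS = [0, R, R | W, R | X]
SMMU_PERMS = [0, R, R, 0]  # assumed policy: accelerator may only ever read


def resolve(perm_index, perm_array):
    """Interpret a TTE's permission index through one agent's array."""
    return perm_array[perm_index]


# One shared translation table entry, two interpretations.
shared_tte_index = 2
cpu_view = resolve(shared_tte_index, CPU_PERMS)    # read and write
smmu_view = resolve(shared_tte_index, SMMU_PERMS)  # read only
```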
The translation tables used by the isolation model are a high value target for attackers. The 2022 extensions introduce a series of features to harden the MMU table walk process by reducing the available attack surface. These features include:
The Protected attribute controls which fields within a TTE are permitted to change. When the new RCW instruction is used to modify a TTE, the CPU checks the Protected attribute and, if it is set, atomically updates only the permitted fields.
The new stage 2 “Mostly Read-only” (MRO) permission enables software to restrict what can write into a page. A page marked as MRO permits hardware updates of the Access Flag and dirty state of a page, as well as updates due to an RCW instruction. However, other forms of store, such as STR (store register) instructions, will fail with a permission fault.
Together, the Protected attribute at stage 1 and the MRO permission at stage 2 give robust protection against many types of attack. The MRO permission prevents stores, other than those from RCW instructions, from changing mappings. The Protected attribute and the RCW instruction limit which fields in TTEs can be updated.
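The way the two mechanisms combine can be sketched with a small behavioural model. Everything concrete here is an assumption made for illustration: the position of the `PROTECTED` bit, the `SW_FIELDS` mask of fields that remain updatable, the example descriptor values, and the class and method names. The sketch also ignores atomicity and hardware Access Flag and dirty state updates.

```python
PROTECTED = 1 << 0   # hypothetical bit position of the Protected attribute
SW_FIELDS = 0xF << 4  # hypothetical mask of fields that may still change


def rcw_update(current, requested):
    """Return the TTE value after an RCW-style update.

    If the current entry is Protected, only the permitted fields take the
    requested value; every other bit keeps its current value.
    """
    if current & PROTECTED:
        return (current & ~SW_FIELDS) | (requested & SW_FIELDS)
    return requested


class MroTable:
    """Translation table held in a page mapped with the stage 2 MRO permission."""

    def __init__(self, entries):
        self.entries = list(entries)

    def store(self, index, value):
        # Ordinary stores (for example STR) fail with a permission fault.
        raise PermissionError("permission fault: ordinary store to MRO page")

    def rcw(self, index, requested):
        # RCW updates are allowed, subject to the Protected check.
        self.entries[index] = rcw_update(self.entries[index], requested)
        return self.entries[index]


# A Protected entry with a made-up output address (0xABC) and field value (0x5).
table = MroTable([PROTECTED | (0xABC << 12) | (0x5 << 4)])

# An attempt to redirect the mapping to 0xDEF through RCW only changes
# the permitted field; the address and the Protected bit survive.
result = table.rcw(0, (0xDEF << 12) | (0xA << 4))
```

In this model an attacker has two routes to a mapping and both are narrowed: a plain store is rejected outright, and an RCW update cannot touch the fields that define where the entry points.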
The feature also introduces a stage 2 attribute, AssuredOnly, that can be used to ensure that only Protected tables can point to a certain page. This is to help protect against aliasing attacks.
As part of the 2022 extensions, Arm is adding a new translation table format to Armv9-A. The translation format follows the same principle as the existing format but increases the size of each descriptor to 128 bits. The new format enables larger output addresses, scope for new attribute fields and room for additional software metadata bits in the translation table entries.
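One consequence of doubling the descriptor size is simple arithmetic: fewer descriptors fit in a table of a given size, so each lookup level resolves fewer address bits. The sketch below works this through under the assumption that a table occupies exactly one translation granule; the function names are illustrative, and the granule sizes used are the existing 4KB, 16KB and 64KB options.

```python
def entries_per_table(granule_bytes, descriptor_bits):
    """Number of descriptors that fit in one granule-sized table."""
    return granule_bytes // (descriptor_bits // 8)


def index_bits_per_level(granule_bytes, descriptor_bits):
    """Address bits resolved by one lookup level (log2 of the entry count)."""
    return entries_per_table(granule_bytes, descriptor_bits).bit_length() - 1


for granule in (4 * 1024, 16 * 1024, 64 * 1024):
    for bits in (64, 128):
        print(
            f"{granule // 1024}KB granule, {bits}-bit descriptors: "
            f"{entries_per_table(granule, bits)} entries, "
            f"{index_bits_per_level(granule, bits)} index bits per level"
        )
```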
In 2021 Arm announced the Scalable Matrix Extension (SME) to Armv9-A. SME added new capabilities to efficiently process matrices, including matrix tile storage and outer-product operations. In 2022, Arm builds on the capabilities of SME by introducing SME2.
SME provides outer-product instructions to accelerate matrix operations. SME2 significantly extends these capabilities with instructions for multi-vector operations, multi-vector predicates, range prefetches, and 2-bit and 4-bit weight compression.
The new instructions enable SME2 to accelerate more workloads than the original SME, including GEMV, non-linear solvers, small and sparse matrices, and feature extraction or tracking.
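For readers less familiar with the outer-product formulation that SME is built around, the following sketch shows a matrix multiply expressed as a sum of outer products accumulated into a tile. It is plain scalar Python that illustrates the mathematics only; it does not model SME or SME2 instructions, and the function names and the list-of-lists "tile" are assumptions for illustration.

```python
def outer_product_accumulate(tile, column, row):
    """Add the outer product of `column` and `row` into `tile` in place."""
    for i, a in enumerate(column):
        for j, b in enumerate(row):
            tile[i][j] += a * b


def matmul_by_outer_products(a, b):
    """Compute a @ b as the sum over k of outer(a[:, k], b[k, :])."""
    rows, inner, cols = len(a), len(b), len(b[0])
    tile = [[0] * cols for _ in range(rows)]  # accumulator, zero-initialised
    for k in range(inner):
        column_k = [a[i][k] for i in range(rows)]
        outer_product_accumulate(tile, column_k, b[k])
    return tile


product = matmul_by_outer_products([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# product == [[19, 22], [43, 50]]
```

Each step of the loop updates every element of the accumulator from just one column and one row, which is the property that makes the formulation a good fit for tile storage.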
With the 2022 extensions, Arm also adds support for a Guarded Control Stack (GCS) in Armv9-A. GCS provides mitigations against some forms of return-oriented programming (ROP) attack. GCS also provides an efficient mechanism for profiling tools to get a copy of the current call stack, without needing to unwind the main stack.
A GCS is a protected region of virtual address space allocated by software. When the processor executes a Branch with Link instruction, such as BL, the return address is pushed onto the GCS as well as being written into the Link Register (LR). On a procedure return, the latest stored return address is popped from the GCS. The processor either compares the popped value with the LR, or uses the popped value directly. This process is illustrated here:
There are times when software needs to make manual adjustments to the control stack, for example to handle some long jumps. To enable this, the architecture provides specialist instructions for maintaining the GCS: GCSPUSHx and GCSPOPx.
To prevent accidental or malicious changes to the GCS, a new stage 1 permission is introduced. This permission allows reads by software, but restricts writes to those made by GCSPUSH instructions or as a side effect of executing a BL.
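The push, pop and compare behaviour described above can be sketched as a small behavioural model. It covers only the variant in which the popped value is compared with the LR, and it makes several assumptions: the class and method names, the `GcsFault` exception, and the example addresses are hypothetical, and the real fault reporting and instruction semantics are not represented.

```python
class GcsFault(Exception):
    """Raised when a return address does not match the GCS (illustrative)."""


class GuardedControlStack:
    def __init__(self):
        self._entries = []

    def branch_with_link(self, return_address):
        """Model BL: push the return address and return the value written to LR."""
        self._entries.append(return_address)
        return return_address

    def procedure_return(self, lr):
        """Model a return: pop the latest entry and compare it with LR."""
        expected = self._entries.pop()
        if expected != lr:
            raise GcsFault(f"LR {lr:#x} does not match GCS entry {expected:#x}")
        return expected

    def gcs_push(self, value):
        """Manual adjustment, standing in for the GCSPUSHx instructions."""
        self._entries.append(value)

    def gcs_pop(self):
        """Manual adjustment, standing in for the GCSPOPx instructions."""
        return self._entries.pop()

    def store(self, index, value):
        # Ordinary stores are rejected by the GCS permission.
        raise PermissionError("permission fault: ordinary store to GCS")

    def read_call_stack(self):
        """Reads are permitted, so a profiler can copy the stack, newest first."""
        return tuple(reversed(self._entries))


gcs = GuardedControlStack()
gcs.branch_with_link(0x1000)           # outer call
lr = gcs.branch_with_link(0x2000)      # nested call
call_stack = gcs.read_call_stack()     # (0x2000, 0x1000)
gcs.procedure_return(lr)               # matches, nested call returns

try:
    gcs.procedure_return(0xBAD)        # LR overwritten by an attacker
    detected = False
except GcsFault:
    detected = True
```

Because the model keeps return addresses in a structure that ordinary stores cannot reach, overwriting the LR or the main stack alone is not enough to redirect a return.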
In 2021 Arm announced the Realm Management Extension (RME), part of the Arm Confidential Compute Architecture. The 2022 extensions enhance RME in two areas:
Other enhancements introduced as part of the 2022 extensions include:
This blog provides a brief introduction to the latest features included in the Arm architecture as Armv8.9-A and Armv9.4-A. More detailed information can be found on our Developer website.
The next step will be to work with our ecosystem partners to ensure that open-source software is enabled to make use of this functionality as soon as the hardware becomes available.