This article is about AMD’s IOMMU, which is coming in future server chipsets: what it does, how it works, and why it is important.
What it does
IOMMU stands for I/O Memory Management Unit. It works much like a processor’s memory management unit (MMU); the main difference is that it translates memory accesses performed by devices rather than by the processor. This address translation is implemented with a paging-based scheme. As with the MMU, it is designed to support not only translation but also protection. Another key feature is interrupt remapping.
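To make the paging-based translation and protection idea concrete, here is a minimal sketch. It is a toy model, not AMD’s actual table format: the single-level table, the 4 KiB page size, and all names (ToyIOMMU, map_page, translate, the device ID "nic0") are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class IOPageFault(Exception):
    """Raised when a device access has no mapping or lacks permission."""


class ToyIOMMU:
    def __init__(self):
        # device_id -> {device page number: (physical page number, writable)}
        self.tables = {}

    def map_page(self, device_id, dev_page, phys_page, writable=False):
        """Install one page mapping for one device."""
        self.tables.setdefault(device_id, {})[dev_page] = (phys_page, writable)

    def translate(self, device_id, dev_addr, write=False):
        """Translate a device address, enforcing the page's write permission."""
        page, offset = divmod(dev_addr, PAGE_SIZE)
        entry = self.tables.get(device_id, {}).get(page)
        if entry is None:
            raise IOPageFault(f"{device_id}: no mapping for {dev_addr:#x}")
        phys_page, writable = entry
        if write and not writable:
            raise IOPageFault(f"{device_id}: write to read-only page {page:#x}")
        return phys_page * PAGE_SIZE + offset


iommu = ToyIOMMU()
iommu.map_page("nic0", dev_page=0x10, phys_page=0x1234, writable=False)
print(hex(iommu.translate("nic0", 0x10020)))  # read through the mapping
```

A read by "nic0" at device address 0x10020 resolves to physical address 0x1234020, while a write to the same page, or any access by a device with no table, raises IOPageFault in this model.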
How it works
Device pass-through is the ability to directly assign a physical device to a particular guest OS. The required address space translation is handled transparently.
Ideally, a device’s address space is the same as a guest’s physical address space; however, in the virtualized case this is hard to achieve without an IOMMU.
If done without an IOMMU, our experience has been that it is very fragile and slow, and works for paravirtualized OSs only. An IOMMU is designed to allow device pass-through functionality to work even with an unmodified OS. Device isolation is a key feature for increased virtualization performance, with network adapters and GPUs being the devices that benefit most, as they usually have high bandwidth requirements. As a side effect, devices that support only 32-bit addressing can be passed to guests that are physically mapped above 4 GB, so that DMA transfers work for them as well.
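The 32-bit side effect can be illustrated with a small worked example. The sketch below assumes a guest whose memory is one contiguous 16 MiB block starting at 6 GiB of system physical memory; real guests need not be contiguous, and the function names and sizes are hypothetical.

```python
GiB = 1 << 30
PAGE_SIZE = 4096  # assumed 4 KiB pages


def build_guest_mapping(guest_base, guest_size):
    """Map guest-physical page n to the system-physical page at guest_base + n."""
    first_page = guest_base // PAGE_SIZE
    return {n: first_page + n for n in range(guest_size // PAGE_SIZE)}


def dma_translate(mapping, guest_phys_addr):
    """Translate a device-issued (guest-physical) address to system-physical.

    A KeyError here stands in for a blocked access outside the guest.
    """
    page, offset = divmod(guest_phys_addr, PAGE_SIZE)
    return mapping[page] * PAGE_SIZE + offset


# Hypothetical guest: 16 MiB of memory placed at 6 GiB in system memory.
mapping = build_guest_mapping(guest_base=6 * GiB, guest_size=16 * 1024 * 1024)

device_addr = 0x0010_0000            # what the 32-bit device puts on the bus
system_addr = dma_translate(mapping, device_addr)

print(hex(device_addr), "->", hex(system_addr))
```

The device only ever issues an address that fits in 32 bits (0x100000), yet the transfer lands at 0x180100000, which lies above the 4 GB boundary.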
An IOMMU is designed to safely map a device to a particular guest without risking the integrity of other guests. A guest should not be able to break out of its address space with rogue DMA traffic. Additionally, it is designed to provide increased security in scenarios without virtualization. In particular, the OS should be able to protect itself from buggy device drivers by limiting a device’s memory accesses.
Remapping of interrupts
Sharing device interrupts among several guests is usually complicated to handle. An IOMMU provides a basis for separating interrupts that are shared by different devices. It remaps a shared interrupt to an exclusive vector to ease its delivery to a particular guest OS.
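The remapping idea can be sketched as a lookup table keyed by the device that raised the interrupt. This is a simplified illustration only: the table layout, the starting vector 0x40, and the names (ToyInterruptRemapper, assign, remap, "nic0", "hba0") are assumptions, not the actual hardware interface.

```python
class ToyInterruptRemapper:
    def __init__(self, first_vector=0x40):  # arbitrary starting vector
        self.table = {}                      # device_id -> (guest, vector)
        self.next_vector = first_vector

    def assign(self, device_id, guest):
        """Give a device its own vector, routed to one guest."""
        vector = self.next_vector
        self.next_vector += 1
        self.table[device_id] = (guest, vector)
        return vector

    def remap(self, device_id):
        """Look up where an interrupt from this device should be delivered."""
        return self.table[device_id]


# Two devices that would otherwise share one interrupt line.
remapper = ToyInterruptRemapper()
remapper.assign("nic0", guest="guest-a")
remapper.assign("hba0", guest="guest-b")

for dev in ("nic0", "hba0"):
    guest, vector = remapper.remap(dev)
    print(f"{dev}: deliver vector {vector:#x} to {guest}")
```

Because each device ends up with an exclusive vector, the hypervisor in this model never has to work out which guest a shared interrupt was meant for.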
Why is it important?
In virtualization, many tricks are used to abstract the underlying hardware while also minimizing virtualization overhead. Using Rapid Virtualization Indexing™ instead of shadow page tables for memory management is only one example. The biggest remaining performance gap in today’s virtualization scenarios is I/O. An IOMMU helps to bridge this gap and also improves the situation from a security point of view. Last but not least, it allows hypervisors to be simpler and more robust.
–Jörg Rödel & Peter Oruba
This post is the opinion of the author and may not represent AMD’s positions, strategies or opinions. Links to third party sites and references to third party trademarks are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.