Tuesday, May 13, 2008

How kvm does security

Like most software, kvm does security in layers.

The inner privilege layer is the kvm module. This code interacts directly with the guest and also has full access to the machine. If it is breached, a guest could potentially take over the host and any virtual machines running on it.

The outer privilege layer is qemu. While it is much larger than the kvm kernel module, it is relatively easy to contain a qemu breach so that it doesn't affect the rest of the host:
  • The kernel already protects itself from non-root user processes; if you run kvm as an unprivileged user, the kernel will not let you harm it.
  • Processes that run as different users are also restricted; so if you run each guest under a distinct user ID, more isolation is gained.
  • Mandatory access control systems such as SELinux can be used to further restrict the damage that a breached qemu can inflict.
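
The "one user ID per guest" idea from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the UID range starting at 50000, the helper names `assign_guest_uids` and `launch_guest`, and whatever emulator command line you pass in are all hypothetical and not part of kvm or qemu. The `user`, `group`, and `extra_groups` arguments of `subprocess.Popen` need Python 3.9 or later and a parent process that is allowed to change identity (typically root).

```python
import subprocess


def assign_guest_uids(guest_names, first_uid=50000):
    """Give every guest its own numeric UID (hypothetical allocation scheme)."""
    if first_uid <= 0:
        raise ValueError("refusing to hand out UID 0 (root) or negative UIDs")
    if len(set(guest_names)) != len(guest_names):
        raise ValueError("guest names must be unique")
    return {name: first_uid + i for i, name in enumerate(guest_names)}


def launch_guest(argv, uid, gid):
    """Start one emulator process under an unprivileged identity.

    argv is the full command line chosen by the caller; nothing here
    assumes a particular qemu binary name or set of flags.
    """
    if uid == 0 or gid == 0:
        raise ValueError("guests must not run as root")
    # extra_groups=[] drops supplementary groups inherited from the parent.
    return subprocess.Popen(argv, user=uid, group=gid, extra_groups=[])


if __name__ == "__main__":
    for name, uid in assign_guest_uids(["web", "db"]).items():
        print(name, uid)
```

With distinct IDs, the ordinary Unix permission checks keep a breached emulator for one guest away from the disk images and processes that belong to another.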
What are the most vulnerable submodules in kvm?
  • Probably the most critical piece is the x86 instruction emulator, which is invoked whenever the guest accesses I/O registers or its page tables. This code weighs in at about 2000 lines.
  • If the kvm mmu can be tricked into mapping an arbitrary host page into guest memory, then the guest can potentially insert its own code into the kernel. The mmu is about 3000 lines in length, but it has been the subject of endless inspection, so it is likely a very difficult target.
So the "reuse Linux" theme repeats once more: kvm leverages the existing Linux kernel both to reduce the attack surface presented to malicious guests and to contain the damage should a security breach occur.
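
The mmu concern above comes down to one invariant: a guest frame number must only ever resolve to host memory that was explicitly handed to that guest. The sketch below shows that check in isolation. It is not kvm's code; `MemorySlot`, `gfn_to_host_addr`, and the 4096-byte page size are assumptions made for the example, loosely modelled on the general idea of a table of guest memory regions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PAGE_SIZE = 4096  # assumed page size for the example


@dataclass(frozen=True)
class MemorySlot:
    base_gfn: int   # first guest frame number covered by this slot
    npages: int     # number of pages in the slot
    host_base: int  # host address backing the first page of the slot


def gfn_to_host_addr(slots: Sequence[MemorySlot], gfn: int) -> Optional[int]:
    """Translate a guest frame number, or return None if no slot covers it."""
    for slot in slots:
        if slot.base_gfn <= gfn < slot.base_gfn + slot.npages:
            return slot.host_base + (gfn - slot.base_gfn) * PAGE_SIZE
    # Outside every slot: never fall back to an arbitrary host page.
    return None


if __name__ == "__main__":
    slots = [MemorySlot(base_gfn=0, npages=16, host_base=0x7F0000000000)]
    print(gfn_to_host_addr(slots, 3))
    print(gfn_to_host_addr(slots, 16))
```

The important design choice is the explicit `None` for anything outside the registered slots: a translation bug that returned some address anyway is the kind of flaw the paragraph above warns about.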

Friday, May 2, 2008

Comparing code size

Starting with Linux 2.6.26, kvm supports four different machine architectures: x86, s390 (System z, or mainframes), ia64 (Intel's Itanium), and embedded PowerPC processors. It is interesting to compare the size of the code supporting each architecture:

arch    lines
x86     17442
ia64     8154
s390     2509
ppc      2229


x86 is old and crufty; it supports three instruction sets and four paging modes; its long and successful history means that it needs the most kvm support code. There are two different virtualization extensions that kvm supports on x86 (Intel's VT and AMD's SVM). It is also the architecture that has been supported by kvm for the longest time. It is no surprise that it leads the pack by a significant amount.

ia64 is a newer architecture, but quite a complex one. The mechanism by which it supports virtualization, with a module loaded into the host kernel and a second module loaded into the guest address space, also adds complexity. So it comes in second, though far behind x86.

s390 is older (and probably far cruftier) than x86. But on the other hand, its hardware virtualization support is so mature and complete that a complete hypervisor fits in a fraction of the lines required for x86. Indeed, it will take a while until x86 can support 64-way guests.

ppc 44x, the embedded PowerPC variant targeted by kvm, has a simple software-managed TLB model and the regular instruction set encoding favored by RISC processors, so it gets by with roughly an eighth of the amount of code required by x86.
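
To put the line counts in proportion, the ratios can be worked out directly from the table. The snippet below is just that arithmetic; the dictionary copies the table's figures, and the helper name `relative_to` is made up for the example.

```python
# Line counts copied from the table above.
KVM_LINES = {"x86": 17442, "ia64": 8154, "s390": 2509, "ppc": 2229}


def relative_to(reference, counts):
    """Return how many times larger the reference arch is than each arch."""
    ref = counts[reference]
    return {arch: ref / lines for arch, lines in counts.items()}


if __name__ == "__main__":
    for arch, factor in relative_to("x86", KVM_LINES).items():
        print(f"{arch:5s} {KVM_LINES[arch]:6d} lines   x86/{arch} = {factor:.1f}")
```

With these figures the x86 code is about 2.1 times the size of ia64, about 7.0 times the size of s390, and about 7.8 times the size of ppc.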

As we add more features, kvm code size will continue to grow slowly, but the relative comparison will no doubt remain valid. And kvm will likely remain the smallest full virtualization solution available.