In the near future – in all likelihood, later this month – at least Windows and Linux will get security updates that change the way those operating systems manage memory on Intel processors.
There’s a lot of interest, excitement even, about these changes: they work at a very low level and are likely to affect performance.
The slowdown will depend on many factors, but one report suggests that database servers running on affected hardware might suffer a performance hit of around 20%.
“Affected hardware” seems to include most Intel CPUs released in recent years; AMD processors, apparently, have different internals and are immune to this problem.
So, what’s going on here?
On Linux, the forthcoming patches are known colloquially as KPTI, short for Kernel Page Table Isolation, though they have jokingly been referred to along the way as both KAISER and F**CKWIT.
The latter is short for Forcefully Unmap Complete Kernel With Interrupt Trampolines; the former for Kernel Address Isolation to have Side-channels Efficiently Removed.
Inside most modern operating systems, you’ll find a privileged core, known as the kernel, that manages everything else: it starts and stops user programs; it enforces security settings; it manages memory so that one program can’t clobber another; it controls access to the underlying hardware such as USB drives and network cards; it rules and regulates the roost.
Everything else – what we glibly called “user programs” above – runs in what’s called userland, where programs can interact with each other, but only by agreement.
If one program could casually read (or, worse still, modify) any other program’s data, or interfere with its operation, that would be a serious security problem; it would be even worse if a userland program could get access to the kernel’s data, because that would interfere with the security and integrity of the entire computer.
One job of the kernel, therefore, is to keep userland and the kernel carefully apart, so that userland programs can’t take over from the kernel itself and subvert security, for example by launching malware, stealing data, snooping on network traffic and messing with the hardware.
The CPU itself provides hardware support for this sort of separation: the x86 and x64 processors provide what are known as privilege levels, implemented and enforced by the chip itself, that can be used to segregate the kernel from the user programs it launches.
Intel calls these privilege levels rings, of which there are four; most operating systems use two of them: Ring 0 (most privileged) for the kernel, and Ring 3 (least privileged) for userland.
Loosely speaking, processes in Ring 0 can take control over processes and resources in higher-numbered rings, but not the other way around.
In theory, then, the processor itself blocks Ring 3 programs from reading Ring 0 memory, thus proactively preventing userland programs from peeking into the kernel’s address space, which could leak critical details about the system itself, about other programs, or about other people’s data.
In technical terms, a sequence of machine code instructions like this, running in userland, should be blocked at step 1:
mov rax, [kernelmemory] ; this will get blocked - the memory is protected
mov rbx, [usermemory]   ; this is allowed - the memory is "yours"
Likewise, swapping the instructions, this sequence would be blocked at step 2:
mov rbx, [usermemory]   ; this is allowed - the memory is "yours"
mov rax, [kernelmemory] ; this will get blocked - the memory is protected
Now, modern Intel and AMD CPUs support what is called speculative execution, whereby the processor figures out what the next few instructions are supposed to do, breaks them into smaller sub-instructions, and processes them in a possibly different order to how they appear in the program.
This is done to increase throughput, so a slow operation that doesn’t affect any intermediate results can be started earlier in the pipeline, with other work being done in what would otherwise be “dead time” waiting for the slow instruction to finish if it ran at the end of the list.
Above, for example, the two instructions are computationally independent, so it doesn’t really matter what order they run in, even though swapping them round changes the moment at which the processor intervenes to block the offending instruction (the one that tries to load memory from the kernel).
Order does matter!
Back in July 2017, a German security researcher did some digging to see if order does, in fact, matter.
He wondered what would happen if the processor calculated some internal results as part of an illegal instruction X, used those internal results in handling legal instruction Y, and only then flagged X as disallowed.
Even if both X and Y were cancelled as a result, would there be a trace of the internal results from the illegal X left where it could be found?
The example he started with looked like this:
1. mov rax, [K]     ; K is a kernel address that is banned
2. and rax, 1
3. mov rbx, [U+rax] ; U is a user address that is allowed
Don’t worry if you don’t speak assembler: what this does is:
- Load the A register from kernel memory.
- Change A to 0 if it was even or 1 if it was odd (this keeps the thought experiment simple).
- Load register B from memory location U+0 or U+1, depending on A.
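The three steps above can be sketched in Python (a conceptual illustration only; the real probe runs as native machine code, and the function and variable names here are our own, not from the research):

```python
def address_touched(secret_value, user_base):
    """Return the user-space address that instruction 3 would access.

    Mirrors the assembly sequence: load the secret (step 1), mask it
    down to its lowest bit (step 2: and rax, 1), then compute the
    user-memory address U+0 or U+1 from that bit (step 3).
    """
    bit = secret_value & 1          # step 2: and rax, 1
    return user_base + bit          # step 3: mov rbx, [U+rax]

# An even kernel value steers the load to U+0; an odd one to U+1.
print(hex(address_touched(0x42, 0x1000)))  # even secret -> '0x1000'
print(hex(address_touched(0x43, 0x1000)))  # odd secret  -> '0x1001'
```

In other words, which user-space address gets touched depends directly on one bit of the privileged value, and that is the whole point of the construction.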
In theory, speculative execution means that the CPU could finish working internally on instruction 3 before finishing instruction 1, even though the whole sequence of instructions would ultimately be invalidated and blocked because of the privilege violation in 1.
Perhaps, however, the side-effects of instruction 3 could be figured out from elsewhere in the CPU?
After all, the processor’s behaviour would have been slightly different depending on whether the speculatively executed instruction 3 referenced memory location U or U+1.
For example, this difference might, just might, show up in the CPU’s memory cache – a list of recently-referenced memory addresses plus their values that is maintained inside the CPU itself for performance reasons.
In other words, the cache might act as a “telltale”, known as a side channel, that could leak secret information from inside the CPU – in this case, whether the privileged value of memory location K was odd or even.
(Looking up memory in CPU cache is some 40 times faster than fetching it from the actual memory chips, so enabling this sort of “short-circuit” for commonly-used values can make a huge difference to performance.)
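To see why a cache "telltale" would matter, here is a toy simulation (our own sketch, not the researcher’s code) of what an attacker could do if the cached-versus-uncached difference were measurable: the speculative load warms one of two cache lines, and timing probes against U+0 and U+1 then reveal the secret’s low bit:

```python
# Arbitrary time units: a cache hit versus a fetch from the memory chips,
# roughly reflecting the ~40x speed difference mentioned above.
FAST, SLOW = 1, 40

def speculative_touch(cache, secret, user_base):
    """The illegal load is cancelled, but (hypothetically) the cache
    line it warmed remains behind as a side effect."""
    cache.add(user_base + (secret & 1))

def timed_read(cache, addr):
    """Model a timed memory access: fast if the address is cached."""
    return FAST if addr in cache else SLOW

def leak_low_bit(secret, user_base=0x1000):
    cache = set()                          # U+0 and U+1 both start uncached
    speculative_touch(cache, secret, user_base)
    t0 = timed_read(cache, user_base)      # probe U+0
    t1 = timed_read(cache, user_base + 1)  # probe U+1
    return 0 if t0 < t1 else 1             # the faster probe reveals the bit

print(leak_low_bit(0x42))  # even secret -> 0
print(leak_low_bit(0x43))  # odd secret  -> 1
```

Of course, this simulation assumes the timing difference can actually be observed; as described next, that was exactly the sticking point in the researcher’s first experiment.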
The long and the short of it is that the researcher couldn’t measure the difference between “A is even” and “A is odd” (or, alternatively, “did the CPU peek at U?” and “did the CPU peek at U+1?”) in this case…
…but the thought experiment worked out in the end.
The researcher found other similar code constructions that allow you to leak information about kernel memory using address calculation tricks of this sort – a hardware-level side channel that could leak privileged memory to unprivileged programs.
The rest is history
Patches are coming soon, at least for Linux and Windows, to deliver KAISER: Kernel Address Isolation to have Side-channels Efficiently Removed, or KPTI, to give it its politically correct name.
Now you have an idea where that name KAISER came from: the patch keeps kernel and userland memory more carefully apart so that side-effects from speculative execution tricks can no longer be measured.
This security fix is especially relevant for multi-user computers, such as servers running several virtual machines, where individual users or guest operating systems could use this trick to “reach out” to other parts of the system, such as the host operating system, or other guests on the same physical server.
However, because CPU caching is there to boost performance, anything that reduces the effectiveness of caching is likely to reduce performance, and that is the way of the world.
Sometimes, the price of security progress is a modicum of inconvenience, in much the same way that 2FA is more hassle than a plain login, and HTTPS is computationally more expensive than vanilla HTTP.
In eight words, get ready to take one for the team.
A lot of the detail behind these patches is currently hidden behind a veil [2018-01-03T16:30Z] – that seems to be down to non-disclosure clauses imposed by various vendors involved in preparing the fixes, an understandable precaution given the level of general interest in new ways to pull off data leakage and privilege escalation exploits.
We expect this secrecy to be lifted as patches are officially published.
However, you can get and try the Linux patches for yourself right now, if you wish. (They aren’t finalised yet, so we can’t recommend using them except for testing.)
So far as we know at the moment, the risk of this flaw seems comparatively modest on dedicated servers such as appliances, and on personal devices such as laptops: to exploit it would require an attacker to run code on your computer in the first place, so you’d already be compromised.
On shared computers such as multiuser build servers or hosting services that run several different customers’ virtual machines on the same physical hardware, the risks are much greater: the host kernel is there to keep different users apart, not merely to keep different programs run by one user apart.
So, a flaw such as this might help an untrustworthy user to snoop on others who are logged in at the same time, or to influence other virtual machines hosted on the same server.
This flaw has existed for years and has been publicly documented for months at least, so there is no need to panic; nevertheless, we recommend that you keep your eyes out for patches for the operating systems you use, probably in the course of January 2018, and that you apply them as soon as you can.
Source : Naked Security