What is the meltdown attack?

Meltdown logo

Overview

Meltdown is a software attack that exploits the out-of-order execution of modern processors. Let’s first understand what is out-of-order execution.

Out-Of-Order(OOO) execution

Normally, processors are supposed to execute instructions in a sequence one at a time. But, in 1967 a computer scientist named Tomasulo invented an algorithm that provided processors the capability to execute instruction out-of-order. So, the processor would execute the instruction as soon as its operands are available provided there are enough free resources like registers and execution units.

The core of the meltdown attack:

In the light of an OOO execution consider the following sequence of instructions:

try:
    raise Exception('dummy exception')
    secret = memory[secret_data_location]
catch Exception:
    pass

Because of the OOO execution, by the time the exception is raised, the processor would already have issued a read from memory location secret_data_location. Note that the code would not even have permission to read from secret_data_location. But, it does not matter because that instruction is not supposed to have been executed by the processor because of the exception that just precedes the access. So, the processor can’t just kill the process stating that it tried to access an illegal memory. So, when the exception is raised the side effects of the remaining instructions that follow the raise Exception('dummy exception') instruction needs to be discarded. But, many of today’s processors only discard the architectural side effects and not the micro-architectural side effects. The micro-architectural state includes the processor’s components like cache which are not exposed to the operating system and the operating system can not control such components. The meltdown attack exploits exactly the fact that micro-architectural side effects of operations that are not executed are not discarded. It does this by changing the micro-architectural state to encode the information from the secret_data_location and extracting this information from the micro-architectural state afterward.

Leaking a secret byte

Now, let’s see how can we retrieve a byte of information from a secret location which is not directly accessible to us. Consider the following code:

try:
    raise Exception(`dummy exception`)
    secret = memory[secret_data_location]
    # probe is an array of 256 elements. Make sure that
    # the `probe` is not in cache at this point.
    x = probe[secret]
catch Exception:
    pass
# Measure the memory access time for all the 256 elements
# in the probe array. The index of the element which has
# the minimum access time is the value of secret byte.

As discussed previously the instructions that follow the raise instruction will get executed because of OOO execution. So, for instance, if the secret byte value is 11, then only the probe[11] will be in the cache(remember we flushed the cache beforehand?) and because of that, its access time will be significantly smaller compared to other items in the probe array. So, we will know that the value of the secret byte was 11.

So, just like this, the meltdown can melt the operating systems boundary between kernel and user space. The attack when reported was so novel and catastrophic that researchers believed the reports to be fake.