Jun 6, 2017 • Internship, Rust • Edits • Permalink

Exploring MIR Semantics through miri

It’s now been two weeks since my internship started (two weeks already, can you believe it?). In other words, if I want to post “weekly or bi-weekly” updates, I better write one today ;) .

As already mentioned, the goal for this internship is to experiment with unsafe code guidelines by implementing them in miri. Before I tackle that, however, it seemed sensible for me to grab some low-hanging fruit in miri just to make myself familiar with the codebase. It turns out I entered “unsafe code guidelines” territory much quicker than expected. This post is about what I found, and it also serves as a nice demonstration of how we envision my unsafe code guidelines workflow to look like.

Live and Let Die

The first miri task that I took on was for miri to do something sensible with MIR’s StorageLive and StorageDead statements. If you haven’t seen these before, that’s because they do not come up in Rust code; instead, StorageLive and StorageDead are inserted by the compiler. For example, the following Rust code

fn main() {
    let x = 13;
    let y = 3*x + 3;
}

compiles to the following MIR (in release mode)¹:

    bb0: {
        StorageLive(_1);
        _1 = const 13i32;               // let x = 13;
        StorageLive(_2);
        StorageLive(_3);
        StorageLive(_4);
        _4 = _1;                        // let tmp_4 = x;
        _3 = Mul(const 3i32, _4);       // let tmp_3 = 3 * tmp_4;
        StorageDead(_4);
        _2 = Add(_3, const 3i32);       // let y = tmp_3 + x;
        StorageDead(_3);
        _0 = ();
        StorageDead(_2);
        StorageDead(_1);
        return;
    }

That’s quite verbose, but you can still see our little test program in there: There’s a const 13i32, there’s a Mul by 3i32, and there’s an Add with 3i32. The surrounding bb0 hints at the fact that MIR is a language built around basic blocks. That’s not really relevant right now, so we’ll gloss over this.

What is relevant is that the MIR code contains all these extra statements we did not write, all these StorageLive(...) and StorageDead(...). They indicate when a variable is live, i.e., when the value of a variable is relevant for the program execution. For example, the compiler marks _4 as dead right after the multiplication. _4 is not used again by the program, so there is no point in keeping its value around.

Liveness information is important because it is forwarded to LLVM, which then uses this information to allocate variables into stack slots. If two variables are never live at the same time, the compiler can assign both of them the same slot, which is great because it makes the stack frame smaller. To this end, rustc will translate StorageLive to llvm.lifetime.start and StorageDead to llvm.lifetime.end. These are intrinsics that LLVM understands and takes into account when computing the layout of the stack frame.

Giving It Semantics

Now, the trouble is: How do we know if rustc made a mistake when it added the liveness information? That would be bad, because it could lead to miscompilation. However, to figure out whether the information is “wrong”, we first have to find a definition of what it means for the information to be “right”. You may think that’s obvious: Just make sure that every access to a variable is after the StorageLive, and before the StorageDead, done – right? Unfortunately, it’s not always so easy. What if we say StorageLive twice on the same variable, is that allowed? What if there is a StorageLive in a loop?

One way to give a definite answer to all such questions is to give these annotations an actual behavior, i.e., to give them semantics. When the program above runs, we actually imagine that every local variable comes with an additional bit storing whether the variable is currently live, and StorageLive and StorageDead modify said bit. When a variable is accessed, we can check that bit to make sure the variable is currently live. And we can actually implement this behavior, so that questions like the ones I raised above can be answered by just executing programs that do these things, and see what happens. So that’s exactly what I did: I extended miri to no longer just ignore StorageLive and StorageDead, but instead keep track of liveness of all local variables.

It is in the implementation of StorageLive that the aforementioned questions come up: When we see StorageLive on a variable that already is live, what do we do? We have three options (well, I am sure we have more, but these are the three I considered):

Do nothing. The variable is already live, it is marked live again, so we are good.
Bail out. Marking something as live twice is bogus, MIR programs should not do that.
Forget the value that is currently in the local variable, but otherwise keep it live.

LLVM Chiming In

I started out with the first option. However, it turns out that’s a bad choice. The trouble is, we are not entirely unconstrained in what we allow and forbid. After all, StorageLive and StorageDead are compiled into LLVM intrinsics, and LLVM has some rules for what these intrinsics mean and when they are allowed to be used – and when not. Here’s what the LLVM specification of llvm.lifetime.start has to say:

This intrinsic indicates that before this point in the code, the value of the memory pointed to by ptr is dead. This means that it is known to never be used and has an undefined value. A load from the pointer that precedes this intrinsic can be replaced with undef.

Unfortunately, this is rather short and does not cover lots of interesting questions. For example: Is it allowed to mark a variable as live, and then dead, and then live again? The text somewhat reads like it is not, but that would make it illegal to use these intrinsics in a loop, which is probably not the intended meaning. Such imprecision in natural language specifications is the entire reason why we also want to have an executable specification of Rust. (This is not to say that natural language specifications are useless. Ideally, you have them both.)

Anyway, what does seem clear from this specification is that adding a redundant StorageLive at the wrong place can have bad side-effects, like make preceding loads yield undef. We’d rather not have that happen in code produced by rustc. To guarantee this doesn’t happen, we should define the semantics of these MIR statements such that they forbid redundant StorageLive. As a consequence of this, we can be sure that if we start with a “valid” MIR program – a program that does not execute forbidden instructions, like a redundant StorageLive – then the resulting LLVM will also be “valid”.

Executing such bad instructions is often referred to as the program having “undefined behavior”. The reason for this is that we just don’t want to talk about the possible behaviors of such programs, we just give the compiler the license to do whatever. For this reason, it is crucially important that the programs we care about do not have undefined behavior (or, for short: UB).

How is making StorageLive UB helpful, you may wonder? Didn’t we just shift the problem from rustc producing invalid LLVM to rustc producing invalid MIR? In a sense, that’s true, but I would still argue this is helpful. First of all, it is now clear where the bug (if there is one) has to be fixed: The pass that lowers HIR to MIR should make sure that it emits liveness annotations that match MIR’s rules. All the passes that transform MIR (e.g. to optimize it) have a clear idea of what they are working against. They can all rely on the MIR program they get to not to have UB, and they have to guarantee that they do not change the program in a way that it now does have UB. Code generation from MIR to LLVM is right when it emits the aforementioned intrinsics, because all it relies on is the MIR program not having UB.

And, secondly, we actually now have a way to test if the compiler emits incorrect liveness annotations for a give piece of code: We implement the MIR rules in miri, and then we run miri on a body of representative MIR code produced by the latest compiler to see whether UB arises or not. miri’s test suite is tiny compared to the body of Rust code out there, but it exercises pretty much all syntactic features of core Rust and some standard library code (mostly around Vec as well as all the code needed to even get a program running and to tear it down again – there’s quite a few things going on before and after your fn main(), but that would be a topic for another blog post). It’s a good enough start.

So, I implemented the second of the three behaviors of StorageLive that I outlined above: Marking something as live twice is bogus, MIR programs should not do that. The reasoning here is to play safe; this is as conservative as it gets. If a program is fine under these rules, we can be fairly certain that LLVM also considers this program to be fine.

Of course, the inevitable happened and…

rustc Disagrees, A Compromise Arises

That’s right, there was a test failure. It turns out rustc does produce code that violates the rules we picked.² So, did we find a bug?

Well, maybe. Or maybe the rules we picked were just too conservative. At this point, I ended up in a lengthy discussion with @eddyb and @arielb1 and some folks in #rustc, who know approximately infinitely more about LLVM and rustc than I do, and this is how the third option in the list arose: When performing StorageLive on a variable that already is live, forget the value that is currently in the local variable, but otherwise keep it live. This is consistent with what we have caught LLVM doing. It is hard to get any more definite than this. The “real” semantics of the relevant LLVM intrinsics is implicitly hidden between the lines of the LLVM code that uses these intrinsics to perform analyses. Often, not even the LLVM developers will be sure what is allowed and what is not. (In fact, this is exactly what happened here.) This is not very satisfying, but lacking a more precise description of the LLVM semantics, this is all we can do.

The good news is that with this choice of MIR semantics, miri’s test suite passes. We can thus be sure (well, insofar as the test suite is representative – this will hopefully get better over time) that rustc produces code that follows our new rules.

And the Moral of This Story

So, what did we learn?

First of all, we learned that it is frustrating not to know whether a certain concrete program, with given concrete input, is actually triggering undefined behavior or not. This kind of uncertainty is even more worrying if we consider that tons and tons of code out there is compiled by LLVM – most of it is probably fine, but still, I think we should be able to be somewhat more certain about whether the programs we write and run are actually doing what we think they do. (Because, remember – if a program has UB, the compiler has license to do literally anything to your code, and most modern compilers are making use of this possibility without a second thought. To be fair, they kind of have to, because all the other compilers do, and this is how they make programs go fast and win benchmarks, and this is what everybody cares about when it comes to compilers. Sad world.) In Rust, we want to do better than this, and that’s in fact the entire point of the unsafe code guidelines project. If we can realize the Rust specification in miri’s code (besides having a natural language description), that should get us a long way towards reducing ambiguity and gaps in our specification. For the near future, LLVM and its issues (and speed!) will still remain the pipeline, but who knows what happens long-term?

Secondly, you have seen an example of how an “instrumented machine” looks like. I didn’t actually call it that way, but notice how miri now keeps track of the liveness of a local variable. That’s not something that the final program does. And that’s fine, because entirely ignoring all liveness information just means that programs that used to be UB, are now fully defined – which is always a legal thing to do. (You may however lose some optimization potential on the way.) However, miri has the goal of detecting all liveness-related UB, and to this end, it “instruments” the program (or the execution, or the machine – however you look at it) with additional information, additional state. This is a very common approach. It is used, for example, by valgrind to tell you when your program’s memory accesses are invalid, but it is also used in formal verification of programs (in this domain, we often refer to the additional state of the instrumented machine as “logical state” or “ghost state”).

And finally, the workflow above is actually somewhat representative of how my contribution to the unsafe code guidelines within the next few months will hopefully look like: The plan is to add additional instructions to MIR that express certain facts the compiler believes to be true (just like the liveness annotations do), and to give these instructions a semantics in miri (this will involve adding state that the actual program does not have, i.e., adding more instrumentation). There will be a tension between the semantics being “conservative enough” such that LLVM’s assumptions are satisfied (for example, some of this will relate to the noalias flag that rustc emits), and such that rustc can perform some clever optimizations we really want it to perform, while at the same time the semantics should be “permissive enough” such that the code all the Rust developers out there write, and the MIR that rustc generates from that code, is actually permitted. There will be back and forth between various alternatives, and in the end, hopefully there will be a compromise that everybody can accept.

Happy safe hacking!

Footnotes

You can run rustc -O --emit=mir to reproduce this. -O turns on release mode, which turns off overflow checking, which makes the MIR code significantly shorter. Alternatively, you can compile Rust code to MIR on the playground. ↩
In case you are curious, it was this test that failed. Somewhat unsurprisingly, loops are involved. ↩

Posted on Ralf's Ramblings on Jun 6, 2017.
Comments? Drop me a mail or leave a note on reddit!