Some weeks ago, I described Types as Contracts as an approach for how to go about defining Rust’s aliasing-related undefined behavior. One key property of this approach is that it is executable, which means we can actually have a program (in this case, miri) tell us whether some particular Rust test case exhibits undefined behavior or not. I have since then spent most of my time completing and refining this implementation, and running it on miri’s test suite to exercise various bits of the standard library and see whether they are actually following the rules I have been suggesting.
This post is my report on what I found, and how the approach itself changed in response to these findings. It also explains how you can run miri yourself and have it check your code for undefined behavior according to this model. Finally, I discuss some of the things that I would like to look at going forward.
1 Rule Violations in libstd and miri’s Test Suite
Unsurprisingly, some of miri’s test cases started to fail once I properly implemented the rules I wanted to have checked. Some of these turned out to be compiler bugs, but others are actually examples of code violating the proposed rules. In the following, I describe the patterns that I found. I assume that you are familiar with Types as Contracts as introduced in my previous post.
1.1 Ownership Passing Via Raw Pointers
I already mentioned this in the original post:
mem::swap triggers a lock violation in the strictest interpretation of the model.
The problem is that
ptr::swap_nonoverlapping takes a raw pointer, so
swap sees no reason to release its write locks on the memory covered by
That makes it UB for
ptr::swap_nonoverlapping to access that memory.
Raw pointers sneakily transferring ownership cannot just lead to functions accessing memory that the model does not allow them to access, it can also lead to overlapping mutable references.
We can see this in
The problem is that when
from_raw_parts_mut(ptr, mid) returns a
&mut [T], that overlaps with
self, so we have two aliasing mutable references in scope.
This makes acquire-validation of the return value of
from_raw_parts_mut(ptr, mid) fail.
Another example of this pattern is
self.v.get() returns a
*mut u8, which then gets turned into a
However, this again overlaps with
This list is not exhaustive by any means; there are problems of this sort pretty much whenever raw pointers are used in non-trivial ways.
1.2 Uninitialized memory
Any place that calls
mem::uninitialized instantly has validation blow up because the returned
T is, obviously, not actually a valid
T – it is, after all, completely uninitialized.
However, it turns out that
mem::uninitialized is causing other problems as well (e.g. around uninhabited types), and better APIs for handling uninitialized data have already been proposed.
This is the most interesting case I found. The
Arc destructor ends up calling
Arc::drop_slow, which is implemented as follows:
drop_slow is executed, it first uses
drop_in_place to drop the user data in the
Next, it calls
inner, and this is where things go wrong.
The reason validation fails is that when
self.ptr.as_ref(), that returns an
This return value is validated, like all return values are.
However, remember that we already dropped the
data field of the
We don’t actually have a fully valid
&ArcInner<T>; only the
strong fields may still be used.
Nevertheless, the type of
inner says that there is a valid shared reference to
ArcInner being passed around, making this code UB.
Notice that even if
inner did not care about the types, validation would still fail when
drop_slow goes on calling the safe function
Layout::for_value and passes it the invalid
To fix this, we have to make sure that the type
&ArcInner is not used after
data is dropped.
The following would work, for example:
2 Changes to the Model
There were a whole bunch of other test failures, but they all turned out to be bugs in the implementations or even the model, rather than legit violations of the “Types as Contracts” idea. In particular, it turned out that suspensions were actually not modeling borrowing of mutable data properly.
There were also some changes affecting the rules of where validation calls are emitted in some corner cases, but there weren’t really any interesting things to learn there. At some point I will write down the exact rules in full detail; until then it is really the general approach that matters much more than such corner cases.
2.1 Suspension of Write Locks
In the original proposal, on every borrow
y = &[mut] x;, the write locks on
x would be suspended.
This means that the locks get released, but will be re-acquired (or recovered) automatically when the lifetime of the borrow ends.
First of all, this actually fails to detect some cases that should be considered contract violations, like passing an
&mut bool to a function and then having that function write something invalid (i.e., something that is not a boolean) into the reference.
To fix that, I added a new table to the global state of the program that would record suspended lvalues and their type (rather than suspended memory regions).
When the lifetime of the suspension ends, we perform a full re-acquire of the given lvalue at the given type, acquiring all the write locks again and also checking other invariants, like whether
bool values are valid.
However, it turned out that this model was still too naive. I failed to account for the fact that the borrow checker actually accepts programs like the following:
Let us step through this code while considering what happens with the locks.
Initially, we acquire
s, so the memory occupied by the pair will be write-locked by
When the borrow of
s.1 happens, we release the write lock and suspend
s.1 at type
i32, making sure we re-acquire the write lock when the lifetime
'y of the borrow.
Nothing interesting happens when
*y gets acquired.
Next, the entire
s gets borrowed, so we will try to release all the write locks on it during suspension – but at this point, we don’t even have a write lock of
Furthermore, due to this second borrow (that has a longer lifetime than the first borrow), we actually do not want to re-acquire the write lock on
'y ends, we will want to wait until
'x ends and keep the read lock until then.
What we really want to do is count how many shared borrows happened of
s.2, respectively, and only acquire the write lock again once that count reaches 0.
This is complicated a lot once
RefCell enters the game: The fact that we take a shared borrow does not mean that there won’t also be mutable borrows again for the same region of memory.
This is demonstrated by the following example (with comments explaining what happens to the locks; adding all the
Validate calls just makes this code unreadable):
At the end of this function, we actually have four aliasing shared borrows covering the content of the
There are two mutable references, both of which currently suspended (
x_inner_shr2 have expired (i.e, their lifetime has ended), we will want to recover the write lock on
x_inner expires, releasing the write lock.
Finally, once both
r gets recovered and we again acquire the write lock.
To make that happen, we need to be able to track multiple suspended write locks for a location.
In implementing that, an interesting question came up: What identifies a write lock?
I ended up using the lifetime of the lock as the identifier, but that doesn’t actually work.
Notice that in the above example, both
r have the same lifetime, even though they are clearly corresponding to distinct write locks.
So, I will have to go back to the drawing board and figure out a different way to identify write locks.
I have some ideas, but I am not very happy with them.
2.2 Unsafe Code
As already indicated in the original post, the way I suggest to handle the problems with raw pointers described in §1.1 is by relaxing the validation performed around unsafe code. Specifically, the code that emits MIR validation statements now detects if a function is unsafe, contains an unsafe block, or (in the case of closures) is defined inside an unsafe block/function. In that case, we still perform validation of the arguments passed to us, but we immediately release all locks again. The same happens for data returned to us from other functions. No further validation is happening inside the unsafe function. In other words, we enforce the type invariants when data passes between functions, but we do not enforce any invariant within such functions.
I am aware that this part of the proposal is particularly controversial. I do not know of an alternative that actually works with the raw pointers, and that performs validation purely based on the types of variables as they are passed around (rather than, say, also taking some form of provenance or “derived-from” relation into account). In discussions, two possible solutions to this problems became apparent:
- We could try to find some kind of API that’s slightly safer than raw pointers, and somehow does not lose all lifetime information. Unsafe code using that API, and not using raw pointers, would still be subject to full validation and thus also enjoy the full set of optimizations.
- We could try to incorporate some form of provenance tracking into this model.
Both of these options are certainly worth exploring.
3 Testing Your Own Code
The implementation of the model as described above (well, with the open problems mentioned in §2.1) actually landed upstream in rustc and miri, so you can play around with this yourself!
To do so, you need to install Rust nightly and miri.
Lucky enough, rustup makes this really easy.
(Unfortunately, I have very little Windows experience. If you manage to get this all running on Windows, I’m happy to link to your instructions for how to do that.)
On Linux and macOS, you can use the following commands to install everything you need, and run
test.rs in miri:
rustup install nightly
git clone https://github.com/solson/miri.git && cd miri
cargo +nightly install xargo # xargo is needed to build a libstd that miri can work with
xargo/build.sh # build that libstd
cargo +nightly build --release
MIRI_SYSROOT=~/.xargo/HOST cargo +nightly run --release --bin miri -- -Zmir-emit-validate=1 test.rs
-Zmir-emit-validate=1 turns on generation of the MIR validation statements.
If you set it to
2, full validation will performed even in unsafe functions.
Notice however that the flag only affects the code generated when miri actually runs; the script in
xargo/build.sh controls validation for code in libstd.
If you find anything odd and things don’t work as expected, please report a bug. Notice however that miri currently supports virtually no interaction with the surrounding operating system, like file access or spawning threads.
4 Future Work
As usual, there are so many things that could be done next.
miri’s test suite is rather small, but still large enough to keep me busy, so I have not yet looked at the much more extensive test suite that comes with rustc – but that certainly something I want to do eventually. I am sure this will uncover bugs in corner-cases of my implementation, and it may also uncover new interesting cases of rules being violated. Speaking of the implementation, the problem around write locks described in §2.1 is not the only issue; another problem I still have to solve is to properly acquire and release locks on padding bytes and enum discriminants.
While §1.1 is “fixed” by exempting unsafe code from the full range of checks, and the trouble with uninitialized memory (§1.2) will hopefully be solved as a side-effect of changing the way uninitialized memory is handled, this still leaves
Arc::drop (§1.3) as an open problem.
Notice that relaxing validation around unsafe code does not help here; the invalid data actually flows between functions, and so far the plan for unsafe code is to still enforce type validity when function boundaries are crossed.
This comes down to a judgment call: We could relax validation sufficiently (e.g., relaxing validation in entire modules when they do certain unsafe things, and not validating private fields of structs that we cannot access) to make this code legal, or we could declare that code to actually be UB.
The concrete problem in
Arc can be fixed fairly easily as I have shown; the question is whether this kind of code is something that programmers actually expect to be legal – and if they do, whether we can do anything about that (by providing documentation, or tools, or whatever else is necessary).
Of course, it may also turn out that this shows that the entire approach is not actually feasible.
Interestingly, the discussion of my previous post introducing Types as Contracts was to a large extend not even about this model, but about the underlying model of what memory and pointers are, and how to write a checker that actually faithfully models what LLVM does. After all, we want miri to also detect UB related to e.g. illegal pointer arithmetic. These issues are far from settled, and I do have some interesting ideas here that I want so share with you.
Finally, there is of course the discussion around how to handle raw pointers; I already talked in §2.2 about some of the options that could be explored here.
Unfortunately, today is the last day of my internship, so I will no longer be able to work on all of this full-time like I did the last three months. Still, I certainly intend to stay involved. This problem is way too interesting to just let it go :)
As always, please comment with your thoughts on the topic. I am particularly curious about what kind of test cases you are throwing at miri, and how it is doing!
Update: I added a proposal for how to fix the
Arc problem. /Update