Nov 16, 2018 • Internship, RustEditsPermalink

Stacked Borrows Implemented

Three months ago, I proposed Stacked Borrows as a model for defining what kinds of aliasing are allowed in Rust, and the idea of a validity invariant that has to be maintained by all code at all times. Since then I have been busy implementing both of these, and developed Stacked Borrows further in doing so. This post describes the latest version of Stacked Borrows, and reports my findings from the implementation phase: What worked, what did not, and what remains to be done. There will also be an opportunity for you to help the effort!

This post is a self-contained introduction to Stacked Borrows. Other than historical curiosity and some comparison with my earlier work on Types as Contracts there is no reason to read the original post at this point.

What Stacked Borrows does is that it defines a semantics for Rust programs such that some things about references always hold true for every valid execution (meaning executions where no undefined behavior occurred): &mut references are unique (we can rely on no accesses by other functions happening to the memory they point to), and & references are immutable (we can rely on no writes happening to the memory they point to, unless there is an UnsafeCell). Usually we have the borrow checker guarding us against such nefarious violations of reference type guarantees, but alas, when we are writing unsafe code, the borrow checker cannot help us. We have to define a set of rules that makes sense even for unsafe code.

I will explain these rules again in this post. The explanation is not going to be the same as last time, not only because it changed a bit, but also because I think I understand the model better myself now so I can do a better job explaining it.

Ready? Let’s get started. I hope you brought some time, because this is a rather lengthy post. If you are not interested in a detailed description of Stacked Borrows, you can skip most of this post and go right to section 4. If you only want to know how to help, jump to section 6.

1 Enforcing Uniqueness

Let us first ignore the part about & references being immutable and focus on uniqueness of mutable references. Namely, we want to define our model in a way that calling the following function will trigger undefined behavior:

fn demo0() {
  let x = &mut 1u8;
  let y = &mut *x;
  *y = 5;
  // Write through a pointer aliasing `y`
  *x = 3;
  // Use `y` again, asserting it is still exclusive
  let _val = *y;
}

We want this function to be disallowed because between two uses of y, there is a use of another pointer for the same location, violating the fact that y should be unique.

Notice that this function does not compile, the borrow checker won’t allow it. That’s great! It is undefined behavior, after all. But the entire point of this exercise is to explain why we have undefined behavior here without referring to the borrow checker, because we want to have rules that also work for unsafe code. In fact, you could say that retroactively, these rules explain why the borrow checker works the way it does: We can pretend that the model came first, and the borrow checker is merely doing compile-time checks to make sure we follow the rules of the model.

To be able to do this, we have to pretend our machine has two things which real CPUs do not have. This is an example of adding “shadow state” or “instrumented state” to the “virtual machine” that we use to specify Rust. This is not an uncommon approach, often times source languages make distinctions that do not appear in the actual hardware. A related example is valgrind’s memcheck which keeps track of which memory is initialized to be able to detect memory errors: During a normal execution, uninitialized memory looks just like all other memory, but to figure out whether the program is violating C’s memory rules, we have to keep track of some extra state.

For stacked borrows, the extra state looks as follows:

  1. For every pointer, we keep track of an extra “tag” that records when and how this pointer was created.
  2. For every location in memory, we keep track of a stack of “items”, indicating which tag a pointer must have to be allowed to access this location.

These exist separately, i.e., when a pointer is stored in memory, then we both have a tag stored as part of this pointer value (remember, bytes are more than u8), and every byte occupied by the pointer has a stack regulating access to this location. Also these two do not interact, i.e., when loading a pointer from memory, we just load the tag that was stored as part of this pointer. The stack of a location, and the tag of a pointer stored at some location, do not have any effect on each other.

In our example, there are two pointers (x and y) and one location of interest (the one both of these pointers point to, initialized with 1u8). When we initially create x, it gets tagged Uniq(0) to indicate that it is a unique reference, and the location’s stack has Uniq(0) at its top to indicate that this is the latest reference allowed to access said location. When we create y, it gets a new tag, Uniq(1), so that we can distinguish it from x. We also push Uniq(1) onto the stack, indicating not only that Uniq(1) is the latest reference allow to access, but also that it is “derived from” Uniq(0): The tags higher up in the stack are descendants of the ones further down.

So after both references are created, we have: x tagged Uniq(0), y tagged Uniq(1), and the stack contains [Uniq(0), Uniq(1)]. (Top of the stack is on the right.)

When we use y to access the location, we make sure its tag is at the top of the stack: check, no problem here. When we use x, we do the same thing: Since it is not at the top yet, we pop the stack until it is, which is easy. Now the stack is just [Uniq(0)]. Now we use y again and… blast! Its tag is not on the stack. We have undefined behavior.

In case you got lost, here is the source code with comments indicating the tags and the stack of the one location that interests us:

fn demo0() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]

  let y = &mut *x; // tag: `Uniq(1)`
  // stack: [Uniq(0), Uniq(1)]

  // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
  // Nothing changes.
  *y = 5;
  // stack: [Uniq(0), Uniq(1)]

  // Pop until `Uniq(0)`, the tag of `x`, is on top of the stack:
  // We pop `Uniq(1)`.
  *x = 3;
  // stack: [Uniq(0)]

  // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
  // That is not possible, hence we have undefined behavior.
  let _val = *y;
}

Well, actually having undefined behavior here is good news, since that’s what we wanted from the start! And since there is an implementation of the model in Miri, you can try this yourself: The amazing @shepmaster has integrated Miri into the playground, so you can put the example there (adjusting it slightly to circumvent the borrow checker), then select “Tools - Miri” and it will complain (together with a rather unreadable backtrace, we sure have to improve that one):

error[E0080]: constant evaluation error: Borrow being dereferenced (Uniq(1037)) does not exist on the stack
 --> src/main.rs:6:14
  |
6 |   let _val = *y;
  |              ^^ Borrow being dereferenced (Uniq(1037)) does not exist on the stack
  |

2 Enabling Sharing

If we just had unique pointers, Rust would be a rather dull language. Luckily enough, there are also two ways to have shared access to a location: through shared references (safely), and through raw pointers (unsafely). Moreover, shared references sometimes (but not when they point to an UnsafeCell) assert an additional guarantee: Their destination is immutable.

For example, we want the following code to be allowed – not least because this is actually safe code accepted by the borrow checker, so we better make sure this is not undefined behavior:

fn demo1() {
  let x = &mut 1u8;
  // Create several shared references, and we can also still read from `x`
  let y1 = &*x;
  let _val = *x;
  let y2 = &*x;
  let _val = *y1;
  let _val = *y2;
}

However, the following code is not okay:

fn demo2() {
  let x = &mut 1u8;
  let y = &*x;
  // Create raw reference aliasing `y` and write through it
  let z = x as *const u8 as *mut u8;
  unsafe { *z = 3; }
  // Use `y` again, asserting it still points to the same value
  let _val = *y;
}

If you try this in Miri, you will see it complain:

 --> src/main.rs:6:14
  |
6 |   let _val = *y;
  |              ^^ Location is not frozen long enough
  |

How is it doing that, and what is a “frozen” location?

To explain this, we have to extend the “shadow state” of our “virtual machine” a bit. First of all, we introduce a new kind of tag that a pointer can carry: A shared tag. The following Rust type describes the possible tags of a pointer:

pub type Timestamp = u64;
pub enum Borrow {
    Uniq(Timestamp),
    Shr(Option<Timestamp>),
}

You can think of the timestamp as a unique ID, but as we will see, for shared references, it is also important to be able to determine which of these IDs was created first. The timestamp is optional in the shared tag because that tag is also used by raw pointers, and for raw pointers, we are often not able to track when and how they are created (for example, when raw pointers are converted to integers and back).

We use a separate type for the items on our stack, because there we do not need a timestamp for shared pointers:

pub enum BorStackItem {
    Uniq(Timestamp),
    Shr,
}

And finally, a “borrow stack” consists of a stack of BorStackItem, together with an indication of whether the stack (and the location it governs) is currently frozen, meaning it may only be read, not written:

pub struct Stack {
    borrows: Vec<BorStackItem>, // used as a stack; never empty
    frozen_since: Option<Timestamp>, // virtual frozen "item" on top of the stack
}

2.1 Executing the Examples

Let us now look at what happens when we execute our two example programs. To this end, I will embed comments in the source code. There is only one location of interest here, so whenever I talk about a “stack”, I am referring to the stack of that location.

fn demo1() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  let y1 = &*x; // tag: `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // Access through `x`.  We first check whether its tag `Uniq(0)` is in the
  // stack (it is).  Next, we make sure that either our item *or* `Shr` is on
  // top *or* the location is frozen.  The latter is the case, so we go on.
  let _val = *x;
  // stack: [Uniq(0), Shr]; frozen since 1

  // This is not an access, but we still dereference `x`, so we do the same
  // actions as on a read.  Just like in the previous line, nothing happens.
  let y2 = &*x; // tag: `Shr(Some(2))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // Access through `y1`.  Since the shared tag has a timestamp (1) and the type
  // (`u8`) does not allow interior mutability (no `UnsafeCell`), we check that
  // the location is frozen since (at least) that timestamp.  It is.
  let _val = *y1;
  // stack: [Uniq(0), Shr]; frozen since 1

  // Same as with `y2`: The location is frozen at least since 2 (actually, it
  // is frozen since 1), so we are good.
  let _val = *y2;
  // stack: [Uniq(0), Shr]; frozen since 1
}

This example demonstrates a few new aspects. First of all, there are actually two operations that perform tag-related checks in this model (so far): Dereferencing a pointer (whenever you have a *, also implicitly), and actual memory accesses. Operations like &*x are an example of operations that dereference a pointer without accessing memory. Secondly, reading through a mutable reference is actually okay even when that reference is not exclusive. It is only writing through a mutable reference that “re-asserts” its exclusivity. I will come back to these points later, but let us first go through another example.

fn demo2() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]; not frozen
  
  let y = &*x; // tag: `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // The `x` here really is a `&*x`, but we have already seen above what
  // happens: `Uniq(0)` must be in the stack, but we leave it unchanged.
  let z = x as *const u8 as *mut u8; // tag erased: `Shr(None)`
  // stack: [Uniq(0), Shr]; frozen since 1

  // A write access through a raw pointer: Unfreeze the location and make sure
  // that `Shr` is at the top of the stack.
  unsafe { *z = 3; }
  // stack: [Uniq(0), Shr]; not frozen

  // Access through `y`.  There is a timestamp in the `Shr` tag, and the type
  // `u8` does not allow interior mutability, but the location is not frozen.
  // This is undefined behavior.
  let _val = *y;
}

2.2 Dereferencing a Pointer

As we have seen, we consider the tag of a pointer already when dereferencing it, before any memory access happens. The operation on a dereference never mutates the stack, but it performs some basic checks that might declare the program UB. The reason for this is twofold: First of all, I think we should require some basic validity for pointers that are dereferenced even when they do not access memory. Secondly, there is the practical concern for the implementation in Miri: When we dereference a pointer, we are guaranteed to have type information available (crucial for things that depend on the presence of an UnsafeCell), whereas having type information on every memory access would be quite hard to achieve in Miri.

Notice that on a dereference, we have both a tag at the pointer and the type of a pointer, and the two might not agree, which we do not always want to rule out (after a transmute, we might have raw or shared pointers with a unique tag, for example).

The following checks are done on every pointer dereference, for every location covered by the pointer (size_of_val tells us how many bytes the pointer covers):

  1. If this is a raw pointer, do nothing. Raw accesses are checked as little as possible.
  2. If this is a unique reference and the tag is Shr(Some(_)), that’s an error.
  3. If the tag is Uniq, make sure there is a matching Uniq item with the same ID on the stack.
  4. If the tag is Shr(None), make sure that either the location is frozen or else there is a Shr item on the stack.
  5. If the tag is Shr(Some(t)), then the check depends on whether the location is inside an UnsafeCell or not, according to the type of the reference.
    • Locations outside UnsafeCell must have frozen_since set to t or an older timestamp.
    • UnsafeCell locations must either be frozen or else have a Shr item in their stack (same check as if the tag had no timestamp).

2.3 Accessing Memory

On an actual memory access, we know the tag of the pointer that was used to access (we always use the actual tag and disregard the type of the pointer), and we know whether we are reading from or writing to the current location. We perform the following operations on all locations affected by the access:

  1. If the location is frozen and this is a read access, nothing happens (even if the tag is Uniq).
  2. Otherwise, if this is a write access, unfreeze the location (set frozen_since to None). (If this is a read access and we come here, the location is already unfrozen.)
  3. Pop the stack until the top item matches the tag of the pointer.
    • A Uniq item matches a Uniq tag with the same ID.
    • A Shr item matches any Shr tag (with or without timestamp).
    • When we are reading, a Shr item matches a Uniq tag.

    If we pop the entire stack without finding a match, then we have undefined behavior.

To understand these rules better, try going back through the three examples we have seen so far and applying these rules for dereferencing pointers and accessing memory to understand how they interact.

The most subtle point here is that we make a Uniq tag match a Shr item and also accept Uniq reads on frozen locations. This is required to make demo1 work: Rust permits read accesses through mutable references even when they are not currently actually unique. Our model hence has to do the same.

3 Retagging and Creating Raw Pointers

We have talked quite a bit about what happens when we use a pointer. It is time we take a close look at how pointers are created. However, before we go there, I would like us to consider one more example:

fn demo3(x: &mut u8) -> u8 {
    some_function();
    *x
}

The question is: Can we move the load of x to before the function call? Remember that the entire point of Stacked Borrows is to enforce a certain discipline when using references, in particular, to enforce uniqueness of mutable references. So we should hope that the answer to that question is “yes” (and that, in turn, is good because we might use it for optimizations). Unfortunately, things are not so easy.

The uniqueness of mutable references entirely rests on the fact that the pointer has a unique tag: If our tag is at the top of the stack (and the location is not frozen), then any access with another tag will pop our item from the stack (or cause undefined behavior). This is ensured by the memory access checks. Hence, if our tag is still on the stack after some other accesses happened (and we know it is still on the stack every time we dereference the pointer, as per the dereference checks described above), we know that no access through a pointer with a different tag can have happened.

3.1 Guaranteed Freshness

However, what if some_function has an exact copy of x? We got x from our caller (whom we do not trust), maybe they used that same tag for another reference (copied it with transmute_copy or so) and gave that to some_function? There is a simple way we can circumvent this concern: Generate a new tag for x. If we generate the tag (and we know generation never emits the same tag twice, which is easy), we can be sure this tag is not used for any other reference. So let us make this explicit by putting a Retag instruction into the code where we generate new tags:

fn demo3(x: &mut u8) -> u8 {
    Retag(x);
    some_function();
    *x
}

These Retag instructions are inserted by the compiler pretty much any time references are copied: At the beginning of every function, all inputs of reference type get retagged. On every assignment, if the assigned value is of reference type, it gets retagged. Moreover, we do this even when the reference value is inside the field of a struct or enum, to make sure we really cover all references. (This recursive descent is already implemented, but the implementation has not landed yet.) Finally, Box is treated like a mutable reference, to encode that it asserts unique access. However, we do not descend recursively through references: Retagging a &mut &mut u8 will only retag the outer reference.

Retagging is the only operation that generates fresh tags. Taking a reference simply forwards the tag of the pointer we are basing this reference on.

Here is our very first example with explicit retagging:

fn demo0() {
  let x = &mut 1u8; // nothing interesting happens here
  Retag(x); // tag of `x` gets changed to `Uniq(0)`
  // stack: [Uniq(0)]; not frozen
  
  let y = &mut *x; // nothing interesting happens here
  Retag(y); // tag of `y` gets changed to `Uniq(1)`
  // stack: [Uniq(0), Uniq(1)]; not frozen

  // Check that `Uniq(1)` is on the stack, then pop to bring it to the top.
  *y = 5;
  // stack: [Uniq(0), Uniq(1)]; not frozen

  // Check that `Uniq(0)` is on the stack, then pop to bring it to the top.
  *x = 3;
  // stack: [Uniq(0)]; not frozen

  // Check that `Uniq(1)` is on the stack -- it is not, hence UB.
  let _val = *y;
}

For each reference and Box, Retag does the following (we will slightly refine these instructions later) on all locations covered by the reference (again, according to size_of_val):

  1. Compute a fresh tag: Uniq(_) for mutable references and Box, and Shr(Some(_)) for shared references.
  2. Perform the checks that would also happen when we dereference this reference.
  3. Perform the actions that would also happen when an actual access happens through this reference (for shared references a read access, for mutable references a write access).
  4. Check if the new tag is Shr(Some(t)) and the location is inside an UnsafeCell.
    • If both conditions apply, freeze the location with timestamp t. If it is already frozen, do nothing.
    • Otherwise, push a new item onto the stack: Shr if the tag is a Shr(_), Uniq(id) if the tag is Uniq(id).

One high-level way to think about retagging is that it computes a fresh tag, and then performs a reborrow of the old reference with the new tag.

3.2 When Pointers Escape

Creating a shared reference is not the only way to share a location: We can also create raw pointers, and if we are careful enough, use them to access a location from different aliasing pointers. (Of course, “careful enough” is not very precise, but the precise answer is the very model I am describing here.)

To account for this, we consider the act of casting to a raw pointer as a special way of creating a reference (“creating a raw reference”, so to speak). As usual for creating a new reference, that operation is followed by retagging. This retagging is special though because unlike normal retagging it acts on raw pointers. Consider the following example:

fn demo4() {
  let x = &mut 1u8;
  Retag(x); // tag of `x` gets changed to `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  let y1 = x as *mut u8;
  // Make sure what `x` points to is accessible through raw pointers.
  Retag([raw] y1) // tag of `y` gets erased to `Shr(None)`
  // stack: [Uniq(0), Shr]; not frozen

  let y2 = y1;
  unsafe {
    // All of these first dereference a raw pointer (no checks, tag gets
    // ignored) and then perform a read or write access with `Shr(None)` as
    // the tag, which is already the top of the stack so nothing changes.
    *y1 = 3;
    *y2 = 5;
    *y2 = *y1;
  }

  // Writing to `x` again pops `Shr` off the stack, as per the rules for
  // write accesses.
  *x = 7;
  // stack: [Uniq(0)]; not frozen

  // Any further access through the raw pointers is undefined behavior, even
  // reads: The write to `x` re-asserted that `x` is the unique reference for
  // this memory.
  let _val = unsafe { *y1 };
}

Retag([raw]) acts almost like a normal retag, except that it does not ignore raw pointers and instead tags them Shr(None), and pushes Shr to the stack.

This way, even if the program casts the pointer to an integer and back (where we cannot always keep track of the tag, so it might get reset to Shr(None)), there is a matching Shr on the stack, making sure the raw pointer can actually be used. One way to think about this is to consider the reference to “escape” when it is cast to a raw pointer, which is reflected by the Shr item on the stack.

Knowing about how Retag interacts with raw pointers, you can now go back to demo2 and should be able to fully explain why the stack changes the way it does in that example.

3.3 The Case of the Aliasing References

Everything I described so far was pretty much in working condition as of about a week ago. However, there was one thorny problem that I only discovered fairly late, and as usual it is best demonstrated by an example – entirely in safe code:

fn demo_refcell() {
  let rc: &mut RefCell<u8> = &mut RefCell::new(23u8);
  Retag(rc); // tag gets changed to `Uniq(0)`
  // We will consider the stack of the location where `23` is stored; the
  // `RefCell` bookkeeping counters are not of interest.
  // stack: [Uniq(0)]

  // Taking a shared reference shares the location but does not freeze, due
  // to the `UnsafeCell`.
  let rc_shr: &RefCell<u8> = &*rc;
  Retag(rc_shr); // tag gets changed to `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; not frozen

  // Lots of stuff happens here but it does not matter for this example.
  let mut bmut: RefMut<u8> = rc_shr.borrow_mut();
  
  // Obtain a mutable reference into the `RefCell`.
  let mut_ref: &mut u8 = &mut *bmut;
  Retag(mut_ref); // tag gets changed to `Uniq(2)`
  // stack: [Uniq(0), Shr, Uniq(2)]; not frozen
  
  // And at the same time, a fresh shared reference to its outside!
  // This counts as a read access through `rc`, so we have to pop until
  // at least a `Shr` is at the top of the stack.
  let shr_ref: &RefCell<u8> = &*rc; // tag gets changed to `Shr(Some(3))`
  Retag(shr_ref);
  // stack: [Uniq(0), Shr]; not frozen

  // Now using `mut_ref` is UB because its tag is no longer on the stack.  But
  // that is bad, because it is usable in safe code.
  *mut_ref += 19;
}

Notice how mut_ref and shr_ref alias! And yet, creating a shared reference to the memory already covered by our unique mut_ref must not invalidate mut_ref. If we follow the instructions above, when we retag shr_ref after it got created, we have no choice but pop the item matching mut_ref off the stack. Ouch.

This made me realize that creating a shared reference has to be very weak inside UnsafeCell. In fact, it is entirely equivalent to Retag([raw]): We just have to make sure some kind of shared access is possible, but we have to accept that there might be active mutable references assuming exclusive access to the same locations. That on its own is not enough, though.

I also added a new check to the retagging procedure: Before taking any action (i.e., before step 3, which could pop items off the stack), we check if the reborrow is redundant: If the new reference we want to create is already dereferencable (because its item is already on the stack and, if applicable, the stack is already frozen), and if the item that justifies this is moreover “derived from” the item that corresponds to the old reference, then we just do nothing. Here, “derived from” means “further up the stack”. Basically, the reborrow has already happened and the new reference is ready for use; and because of that “derived from” check, we know that using the new reference will not pop the item corresponding to the old reference off the stack. In that case, we avoid popping anything, to keep other references valid.

It may seem like this rule can never apply, because how can our fresh tag match something that’s already on the stack? This is indeed impossible for Uniq tags, but for Shr tags, matching is more liberal. For example, this rule applies in our example above when we create shr_ref from mut_ref. We do not require freezing (because there is an UnsafeCell), there is already a Shr on the stack (so the new reference is dereferencable) and the item matching the old reference (Uniq(0)) is below that Shr (so after using the new reference, the old one remains dereferencable). Hence we do nothing, keeping the Uniq(2) on the stack, such that the access through mut_ref at the end remains valid.

This may sound like a weird rule, and it is. I would surely not have thought of this if RefCell would not force our hands here. However, as we shall see in section 5, it also does not to break any of the important properties of the model (mutable references being unique and shared references being immutable except for UnsafeCell). Moreover, when pushing an item to the stack (at the end of the retag action), we can now be sure that the stack is not yet frozen: if it were frozen, the reborrow would be redundant.

With this extension, the instructions for Retag now look as follows (again executed on all locations covered by the reference, according to size_of_val):

  1. Compute a fresh tag: Uniq(_) for mutable references, Box, Shr(Some(_)) for shared references, and Shr(None) for raw pointers.
  2. Perform the checks that would also happen when we dereference this reference. Remember the position of the item matching the tag in the stack.
  3. Redundancy check: If the new tag passes the checks performed on a dereference, and if the item that makes this check succeed is above the one we remembered in step 2 (where the “frozen” state is considered above every item in the stack), then stop. We are done for this location.
  4. Perform the actions that would also happen when an actual access happens through this reference (for shared references a read access, for mutable references a write access).
    Now the location cannot be frozen any more: If the fresh tag is Uniq, we just unfroze; if the fresh tag is Shr and the location was already frozen, then the redundancy check (step 3) would have kicked in.
  5. Check if the new tag is Shr(Some(t)) and the location is inside an UnsafeCell.
    • If both conditions apply, freeze the location with timestamp t. If it is already frozen, do nothing.
    • Otherwise, push a new item onto the stack: Shr if the tag is a Shr(_), Uniq(id) if the tag is Uniq(id).

The one thing I find slightly unsatisfying about the redundancy check is that it seems to overlap a bit with the rule that on a read access, a Shr item matches a Uniq tag. Both of these together enable the read-only use of mutable references that have already been shared; I would prefer to have a single condition enabling that instead of two working together. Still, overall I think this is a pleasingly clean model; certainly much cleaner than what I proposed last year and at the same time much more compatible with existing code.

4 Differences to the Original Proposal

The key differences to the original proposal is that the check performed on a dereference, and the check performed on an access, are not the same check. This means there are more “moving parts” in the model, but it also means we do not need a weird special exception (about reads from frozen locations) for demo1 any more like the original proposal did. The main reason for this change, however, is that on an access, we just do not know if we are inside an UnsafeCell or not, so we cannot do all the checks we would like to do. Accordingly, I also rearranged terminology a bit. There is no longer one “reactivation” action, instead there is a “deref” check and an “access” action, as described above in sections 2.2 and 2.3.

Beyond that, I made the behavior of shared references and raw pointers more uniform. This helped to fix test failures around iter_mut on slices, which first creates a raw reference and then a shared reference: In the original model, creating the shared reference invalidates previously created raw pointers. As a result of the more uniform treatment, this no longer happens. (Coincidentally, I did not make this change with the intention of fixing iter_mut. I did this change because I wanted to reduce the number of case distinctions in the model. Then I realized the relevant test suddenly passed even with the full model enabled, investigated what happened, and realized I accidentally had had a great idea. :D )

The tag is now “typed” (Uniq vs Shr) to be able to support transmute between references and shared pointers. Such transmute were an open question in the original model and some people raised concerns about it in the ensuing discussion. I invite all of you to come up with strange things you think you should be able to transmute and throw them at Miri so that we can see if your use-cases are covered. :)

The redundancy check during retagging can be seen as refining a similar check that the original model did whenever a new reference was created (where we wouldn’t change the state if the new borrow is already active).

Finally, the notion of “function barriers” from the original Stacked Borrows has not been implemented yet. This is the next item on my todo list.

5 Key Properties

Let us look at the two key properties that I set out as design goals, and see how the model guarantees that they hold true in all valid (UB-free) executions.

5.1 Mutable References are Unique

The property I would like to establish here is that: After creating (retagging, really) a &mut, if we then run some unknown code that does not get passed the reference, and then we use the reference again (reading or writing), we can be sure that this unknown code did not access the memory behind our mutable reference at all (or we have UB). For example:

fn demo_mut_unique(our: &mut i32) -> i32 {
  Retag(our); // So we can be sure the tag is unique

  *our = 5;

  unknown_code();

  // We know this will return 5, and moreover if `unknown_code` does not panic
  // we know we could do the write after calling `unknown_code` (because it
  // cannot even read from `our`).
  *our
}

The proof sketch goes as follows: After retagging the reference, we know it is at the top of the stack and the location is not frozen. (The “redundant reborrow” rule does not apply because a fresh Uniq tag can never be redundant.) For any access performed by the unknown code, we know that access cannot use the tag of our reference because the tags are unique and not forgeable. Hence if the unknown code accesses our locations, that would pop our tag from the stack. When we use our reference again, we know it is on the stack, and hence has not been popped off. Thus there cannot have been an access from the unknown code.

Actually this theorem applies any time we have a reference whose tag we can be sure has not been leaked to anyone else, and which points to locations which have this tag at the top of the (unfrozen) stack. This is not just the case immediately after retagging. We know our reference is at the top of the stack after writing to it, so in the following example we know that unknown_code_2 cannot access our:

fn demo_mut_advanced_unique(our: &mut u8) -> u8 {
  Retag(our); // So we can be sure the tag is unique

  unknown_code_1(&*our);

  // This "re-asserts" uniqueness of the reference: After writing, we know
  // our tag is at the top of the stack.
  *our = 5;

  unknown_code_2();

  // We know this will return 5
  *our
}

5.2 Shared References (without UnsafeCell) are Immutable

The key property of shared references is that: After creating (retagging, really) a shared reference, if we then run some unknown code (it can even have our reference if it wants), and then we use the reference again, we know that the value pointed to by the reference has not been changed. For example:

fn demo_shr_frozen(our: &u8) -> u8 {
  Retag(our); // So we can be sure the tag actually carries a timestamp

  // See what's in there.
  let val = *our;
  
  unknown_code(our);

  // We know this will return `val`
  *our
}

The proof sketch goes as follows: After retagging the reference, we know the location is frozen (this is the case even if the “redundant reborrow” rule applies). If the unknown code does any write, we know this will unfreeze the location. The location might get re-frozen, but only at the then-current timestamp. When we do our read after coming back from the unknown code, this checks that the location is frozen at least since the timestamp given in its tag, so if the location is unfrozen or got re-frozen by the unknown code, the check would fail. Thus the unknown code cannot have written to the location.

One interesting observation here for both of these proofs is that all we rely on when the unknown code is executed are the actions performed on every memory access. The additional checks that happen when a pointer is dereferenced only matter in our code, not in the foreign code. Hence we have no problem reasoning about the case where we call some code via FFI that is written in a language without a notion of “dereferencing”, all we care about is the actual memory accesses performed by that foreign code. This also indicates that we could see the checks on pointer dereference as another “shadow state operation” next to Retag, and then these two operations plus the actions on memory accesses are all that there is to Stacked Borrows. This is difficult to implement in Miri because dereferences can happen any time a path is evaluated, but it is nevertheless interesting and might be useful in a “lower-level MIR” that does not permit dereferences in paths.

6 Evaluation, and How You Can Help

I have implemented both the validity invariant and the model as described above in Miri. This uncovered two issues in the standard library, but both were related to validity invariants, not Stacked Borrows. With these exceptions, the model passes the entire test suite. There were some more test failures in earlier versions (as mentioned in section 4), but the final model accepts all the code covered by Miri’s test suite. (If you look close enough, you can see that three libstd methods are currently whitelisted and what they do is not checked. However, even before I ran into these cases, efforts were already underway that would fix all of them, so I am not concerned about them.) Moreover I wrote a bunch of compile-fail tests to make sure the model catches various violations of the key properties it should ensure.

The most interesting change I had to make to libstd is in NonNull::from. That function turned a &mut T into a *const T going through a &T. This means that the final raw pointer was created from a shared reference, and hence must not be used for mutation. An earlier version of this post described a model that would permit such behavior, but I think we should actually at least experiment with ruling it out: “no mutation through (pointers derived from) shared references” is an old rule in Rust, after all.

Overall, I am quite happy with this! I was expecting much more trouble, expecting to run into cases where libstd does strange things that are common or otherwise hard to declare illegal and that my model could not reasonably allow. I see the test suite passing as an indication that this model may be well-suited for Rust.

However, Miri’s test suite is tiny, and I have but one brain to come up with counterexamples! In fact I am quite a bit worried because I literally came up with demo_refcell less than two weeks ago, so what else might I have missed? This where you come in. Please test this model! Come up with something funny you think should work (I am thinking about funny transmute in particular, using type punning through unions or raw pointers if you prefer that), or maybe you have some crate that has some unsafe code and a test suite (you do have a test suite, right?) that might run under Miri.

The easiest way to try the model is the playground: Type the code, select “Tools - Miri”, and you’ll see what it does.

For things that are too long for the playground, you have to install Miri on your own computer. Miri depends on rustc nightly and has to be updated regularly to keep working, so it is not well-suited for crates.io. Instead, installation instructions for Miri are provided in the README. We are still working on making installing Miri easier. Please let me know if you are having trouble with anything. You can report issues, comment on this post or find me in chat (as of recently, I am partial to Zulip where we have an unsafe code guidelines stream).

With Miri installed, you can cargo miri a project with a binary to run it in Miri. Dependencies should be fully supported, so you can use any crate you like. It is not unlikely, however, that you will run into issues because Miri does not support some operation. In that case please search the issue tracker and report the issue if it is new. We cannot support everything, but we might be able to do something for your case.

Unfortunately, cargo miri test is currently broken; if you want to help with that here are some details. Moreover, wouldn’t it be nice if we could run the entire libcore, liballoc and libstd test suite in miri? There are tons of interesting cases of Rust’s core data structures being exercise there, and the comparatively tiny Miri test suite has already helped to find two soundness bugs, so there are probably more. Once cargo miri test works again, it would be great to find a way to run it on the standard library test suites, and set up something so that this happens automatically on a regular basis (so that we notice regressions). Update: cargo miri test has been fixed in the mean time, so you can use it on your libraries now! /Update

As you can see, there is more than enough work for everyone. Don’t be shy! I have a mere two weeks left on this internship, after which I will have to significantly reduce my Rust activities in favor of finishing my PhD. I won’t disappear entirely though, don’t worry – I will still be able to mentor you if you want to help with any of the above tasks. :)

Thanks to @nikomatsakis for feedback on a draft of this post, to @shepmaster for making Miri available on the playground, and to @oli-obk for reviewing all my PRs at unparalleled speed. <3

If you want to help or report results of your experiments, if you have any questions or comments, please join the discussion in the forums.

Changelog

2018-11-21: Dereferencing a pointer now always preserves the tag, but casting to a raw pointer resets the tag to Shr(None). Box is treated like a mutable reference.

2018-12-22: Creating a shared reference does not push a Shr item to the stack (unless there is an UnsafeCell). Moreover, creating a raw pointer is a special kind of retagging.

Posted on Ralf's Ramblings on Nov 16, 2018.
Comments? Drop me a mail or leave a note in the forum!