My internship (“research assistantship”) with Mozilla has ended several weeks ago, and this post is a report of the most recent tweaks I made to Miri and Stacked Borrows. Neither project is by any means “done”, of course. However, both have reached a fairly reasonable state, so I felt some kind of closing report made sense. Also, if I ever want to finish my PhD, I’ll have to seriously scale down the amount of time I work on Rust – so at least from my side, things will move more slowly from now on.
In particular, installing Miri and running your test suite in it is now just a single command away! Scroll all the way down if you are not interested in the rest.
Stacked Borrows Tweaks
I made some small tweaks to Stacked Borrows which mostly serve to make things overall more consistent. All of these have been incorporated into my previous post on Stacked Borrows.
The only noticeable functional change affects creating a shared reference: previously, that would push a
Shr item to the stack before freezing the location.
That would, however, permit mutation like in the following example:
Since “no mutation through (pointers derived from) shared references” has been a rule in Rust pretty much from the beginning, I changed creating a shared reference to no longer unconditionally push a
That now only happens when the pointee is mutable because of an
As a consequence, the code above is now forbidden.
Beyond these small tweaks, I implemented the barriers that I already described in my original proposal. The purpose of barriers is to handle code like the following:
aliasing here gets two aliasing pointers as argument.
This means writing to
raw changes the value in
ptr, so the desired optimization would change the behavior of
demo_barrier (currently it returns 8, after moving the load of
ptr down it returns 4).
Stacked Borrows without barriers would allow this code because
ptr does not get used again after
To disallow this code and enable the desired optimization, we introduce a new kind of item that can be pushed to the per-location stack: a function barrier referring to a particular function call. While the function is running, the barrier may not be popped. The type of items on the borrow stack therefore now looks as follows:
These barriers get pushed when doing the initial retagging of a function (remember that we insert
Retag statements for all arguments at the beginning of each function).
We do this only for references (not for
Box), and we do not push a barrier to locations inside a shared
Concretely, between steps 4 and 5 of retagging (after we performed the action of an access and before we push the new item), we push a
FnBarrier referring to the current function call.
Moreover, when retagging with barriers, we do not perform the usual redundancy check – even if the reborrow is redundant, pushing the barrier might be necessary.
Barriers have no effect when a pointer gets dereferenced: looking for the item in the stack just skips over barriers. However, when performing a memory access and popping items from the stack, we stop when we encounter an active barrier, i.e., a barrier for a function call that is still ongoing. If that ever happens, we have undefined behavior.
Together, these rules make the above example invalid code, and hence enable our optimization: when doing the memory access in
*raw = 4, we pop until we find a
That will hit the barrier that was added when
aliasing did the initial retagging of
val, triggering UB.
An interesting (and positive!) consequence of barriers is that they make the following code UB:
Without barriers, there is no problem with this code because
aliasing does not actually use the two aliasing pointers in conflicting ways.
With barriers, the initial retag of
ptr2 goes wrong because it tries to pop off the barrier that just got pushed for
That’s great, because we don’t actually want to allow passing two aliasing pointers to
Barriers have one more effect: when memory gets deallocated, it must not have any active barriers left in its stacks.
Barriers so far ensure that a reference passed to a function does not get invalidated during that function call; this extra check ensures that it also does not get deallocated.
This is important, because we tell LLVM via the
dereferencable attribute that the reference is, well, dereferencable for the duration of the function call.
To justify this attribute, we have to argue that the program really cannot deallocate this memory
Intuition and Design Decisions
One high-level intuition for barriers is that they encode the rule that a reference passed to a function outlives the function call.
The item matching that reference can never get popped while the function is still being executed: if it would get popped, we would hit the barrier and trigger UB.
This is also why we don’t have barriers for
Box: they don’t have any such relationship to function calls.
The other exception, about not having barriers for shared
UnsafeCell, is more subtle.
In my previous post, I had an example of a function where a shared and a mutable reference alias (the shared reference being
&RefCell and the mutable reference pointing into it).
This example crucially relies on the redundancy check during retagging, where the shared reference does not get reborrowed (which would invalidate the mutable reference).
If we wanted to add a barrier for that shared reference, we could not push it in top of the stack; we would have to add it somewhere in the middle next to the
Shr item matching this reference.
For now, I am trying to avoid adding items to the middle of the stack.
Another problem with shared
UnsafeCell and barriers is summarized in this issue: in
Arc, it can actually be the case that the memory referenced by a shared reference (but inside an
UnsafeCell) gets deallocated while the function (that got this shared reference as argument) is still running.
This violated LLVM’s expectations for
dereferencable pointers, and it would also violate the rule that memory with active barriers cannot be deallocated.
We will have to either declare the current implementation of
Arc incorrect or change the attributes that we use with LLVM.
The model currently takes the position that
Arc is correct, deallocating data behind a shared
UnsafeCell is allowed, and the attribute is emitted incorrectly.
During the RustFest Rome, I was reminded that Stacked Borrows does not currently support two-phase borrows. This means that the following code (which is safe code in the 2018 edition) would get rejected by Miri:
To fix that, I just had to make a very tiny change.
Retagging now has a special mode for two-phase borrows (it would be used in the above example when retagging the temporary containing the first argument of
push), and in that mode, after completing all the usual steps, we create a shared reborrow of the newly created mutable borrow, with all the usual effects (freezing or pushing a
Shr onto the stack.
Thanks to the rule that reading through a mutable reference does not unfreeze or pop a
Shr off the stack, this means that the original
v can be used for reading until the time that the newly created (two-phase) borrow is written to for the first time.
Writing will unfreeze and pop the
Shr off the stack, disallowing any further reads through other references.
This corresponds to the “activation” of a two-phase borrow.
Unfortunately, this model is not quite enough to handle all the code that rustc currently accepts. See this issue for further discussion on this subject.
Stacked Borrows Specification
Between the various changes described here, there is now a noticeable discrepancy between the only self-contained description of Stacked Borrows in my previous blog post(s) and what is actually implemented. I will try to find a good spot to put a “living description” of whatever version of Stacked Borrows is currently implemented in Miri, so that there is more than just the code to look at. I’ll let you know when I found one!
Making Miri Easier to Use
Finally, I decided to spend some time making Miri easier to install and use and fixing the problems with running test suites. Now all you have to do is run
cargo +nightly install --force --git https://github.com/solson/miri miri
and then go to your project directory and run
cargo +nightly miri test (you may have to
cargo clean first to make sure all your dependencies are built the right way for Miri).
On first launch, Miri will (with your consent) install xargo and prepare a libstd suited for executing Rust programs in the interpreter, and then it will run your test suite.
@oli-obk is currently working on making this even easier by making Miri a component that you can install via
Give it a try and let us know what you think!