Feb 12, 2019 • Rust

All-Hands 2019 Recap

Last week, I was in Berlin at the Rust All-Hands 2019. It was great! I will miss nerding out in discussions about type theory and having every question answered by just going to the person who’s the expert in that area, and asking them. In this post, I am summarizing the progress we made in my main areas of interest and the discussions I was involved in—this is obviously just a small slice of all the things that happened.

Validity Invariants and MaybeUninit

We had a session to talk about validity invariants (meeting notes here). No firm conclusions were reached, but we got input from people who have not been part of the discussions inside the UCG (unsafe code guidelines) WG yet, and that was very interesting. It seems that deciding about the validity invariant for references and (to a lesser extent) the validity invariant for unions will be a slow process: there are lots of different options, and some of the trade-offs come down to prioritizing one value over another (like prioritizing checkable UB over optimizations, or vice versa). The closest we came to a conclusion was that uninitialized integers should probably not be UB until you perform an operation on them.

That said, it also became clear that the sooner we can stabilize (parts of) MaybeUninit, the better. And as @japaric pointed out, we do not have to wait until all of these questions are answered! We can stabilize a subset of the API, and leave (in particular) get_ref and get_mut out of that. I think the biggest blocker for that is to get RFC 2582 accepted (which is very close to beginning its FCP). Once that is done, I think I’ll just beef up the documentation a bit and submit a stabilization PR.

What worries me a bit is that we did not get much feedback from people using the API. It seems like everyone is waiting for it to become stable, but of course then it will be too late to fix API mistakes that we made! Given the importance of the API for certain classes of unsafe code, I think it would be great to get some more feedback before we set things in stone.

To finish up the API, I went ahead and renamed the method that you call when a MaybeUninit is fully initialized to into_initialized, which seems, uh, good enough? I also like assume_initialized but that does not sound like it would return anything. One thing I wonder about is whether some of these methods should be “downgraded” to functions, forcing callers to write MaybeUninit::into_initialized or so. If you have an opinion on this, please join us in the tracking issue!
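
For readers who have not used the API yet, here is a minimal sketch of the pattern under discussion. It is written against the method names that today's stable standard library exposes (uninit, as_mut_ptr, assume_init), which differ from the into_initialized name mentioned above. The helper squares is made up for illustration.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: builds `[0, 1, 4, 9]` without zero-initializing
/// the array first.
fn squares() -> [u32; 4] {
    // Reserve space; the contents are uninitialized at this point.
    let mut slot = MaybeUninit::<[u32; 4]>::uninit();
    // View the slot as a pointer to its first element.
    let first = slot.as_mut_ptr() as *mut u32;
    for i in 0..4 {
        // SAFETY: `first.add(i)` stays inside the 4-element array, and
        // `write` neither reads nor drops the old (uninitialized) contents.
        unsafe { first.add(i).write((i * i) as u32) };
    }
    // SAFETY: every element was written in the loop above.
    unsafe { slot.assume_init() }
}
```

The final call is the one whose name was being debated: it asserts that the value is fully initialized and returns it by value. Calling it before all four elements are written would be undefined behavior.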

Uninitialized Memory

@stjepang, @Amanieu and I talked a bit about uninitialized data in atomic operations, and how to make sense of the C++ spec, in particular in the presence of mixed atomic and non-atomic accesses to the same object using atomic_ref. We did not make any notable progress, other than noting that an AtomicCell without get_mut might be sound.

Another discussion was around using Read::read to initialize memory. The problem here is that with an unknown Read implementation, we cannot just pass in an uninitialized buffer. Uninitialized data is strictly different from any particular bit pattern, and using it in any non-trivial way is UB. Since there is no way to know what the unknown Read::read function does, we cannot pass such bad data to it. In my terminology, even though integers with uninitialized data are (likely) valid, uninitialized data violates the safety invariant of our integer types. To solve this, the proposal is to add a freeze method that turns all uninitialized memory into arbitrary but fixed bits. Such frozen memory admits fewer optimizations (because it must observably have a consistent value when accessed multiple times), but it is also safe to use at any integer type (for the same reason).
The main reason I like this proposal is that it officially acknowledges the subtle but important distinction between “arbitrary bit pattern” and “uninitialized”, and that should help a lot with teaching these concepts to unsafe code authors.
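
To make the Read::read problem concrete, here is a small sketch. Peeking and read_zeroed are hypothetical names invented for this example. Peeking is written entirely in safe code, yet it looks at the caller's buffer before overwriting it. The only thing a caller can do today without risking UB is what read_zeroed does: pay for initializing the buffer up front.

```rust
use std::io::{self, Read};

/// Hypothetical reader, written entirely in safe code, that inspects the
/// caller's buffer before writing to it.
struct Peeking {
    remaining: usize,
}

impl Read for Peeking {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.remaining.min(buf.len());
        for byte in &mut buf[..n] {
            // This reads the old contents. That is fine for initialized
            // bytes, but would be UB if the caller had passed
            // uninitialized memory.
            *byte = byte.wrapping_add(1);
        }
        self.remaining -= n;
        Ok(n)
    }
}

/// Hypothetical helper: reads up to `len` bytes, zero-filling the buffer
/// first so that any `Read` implementation may safely look at it.
fn read_zeroed<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    let n = reader.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```

With the proposed freeze, the zero-fill could be replaced by freezing an uninitialized buffer, which would make Peeking's reads well-defined without the cost of writing zeros. No such method exists yet, so it is not shown here.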

Stacked Borrows

Finally, there was some discussion about Stacked Borrows (meeting notes here). The feedback was very positive, and (to my surprise) everybody agreed that the things I had to change in the standard library to make it conform with the model were appropriate bugfixes. @arielb1 and @matthewjasper helped a lot with detailed discussions about some of the remaining issues in the model, as well as figuring out a plan to fix a problem found by @Amanieu. However, two-phase borrows remain a problem.

Miri now tests libcore and liballoc

I am particularly happy about the progress I made on Miri during this week. With help from @eddyb and @alexcrichton, I got Miri to run the libcore and liballoc unit test suites. This has already helped to uncover a bug and some subtle undocumented invariants in the standard library, as well as several bugs in Miri. I have set things up such that Miri will run these tests every night, giving us higher confidence that the standard library is free of undefined behavior—or rather, that the parts of the standard library that are covered by unit tests are free of undefined behavior that Miri can detect. I’ve been partially working towards this goal for several months now, so it is really satisfying to see it all come together. :-)

Moreover, @Amanieu ran Miri on hashbrown. This did not uncover any bugs in his crate, but it did run into several bugs in Miri, which I then fixed.

And last but not least, Miri can now pass arguments to the interpreted program, which is particularly useful when running test suites to run only the test one is currently debugging.

I think that’s it! Lots of exciting progress as well as lots of grounds for further discussion. This won’t get boring any time soon. :D

Posted on Ralf's Ramblings on Feb 12, 2019.
Comments? Drop me a mail or leave a note on reddit!