The pirate-based logic of Rust shared references

Tags: rust programming memory mathematics | Written by Alex Smith, 2025-09-14

Introduction

In my previous blog post, about scoped generics, I identified a problem with the Rust programming language that couldn't properly be solved by existing programming techniques, implying that a new language feature was needed. As such, I've been intending to officially propose that scoped generics should be added to Rust. This blog post isn't primarily about the particular new feature itself: rather, it's partially about the process of working out the details of a new feature in general, but primarily about some interesting things I discovered along the way.

I care a lot about getting the details right for my new feature, both because it increases the chance that the feature will be accepted, and because it increases the chance that the feature will be useful if it is accepted. The Rust language design team are frequently in a similar situation, and in order to try to get the details of new features right, they use an "experimental feature gate" process that lets them implement something experimentally to find out whether or not it's a good idea and what the details should be. I'm not an experienced Rust contributor, so I don't get to use that process myself: but that doesn't mean that I can't try to do something similar on my own. It should be possible to learn lessons from trying to implement something whether or not the implementation is in the official Rust compiler repository!

As such, I've already started on an experiment into implementing scoped generics. It isn't completed yet, but it's already taught me things, not just about scoped generics, but about Rust in general. And one of those insights about Rust in general seemed both fundamental enough, and surprising enough, to make me stop what I was doing and write this blog post instead.

The insight is about a phenomenon that will be very familiar to most Rust programmers – that a large proportion of the functions, methods, and even sometimes types in the Rust standard library have to be written twice: one version that is or operates on shared references, and one version that is or operates on mutable references – and it comes in two parts:

  1. For functions/types/methods that do have to be written twice, there is a mechanical translation that can be used to produce the "shared reference" version based on the "mutable reference" version; and

  2. The translation in question has a well-established mathematical structure: it follows the type theory of linear logic (a mathematical tool invented in 1987 and that has frequently been used since to explain programming language type systems). But this isn't the normal programming-language use of linear logic: most programming language type theories are limited to a "fragment" of linear logic that contains some of the operations and not others, and the translation in question falls outside the commonly used fragments.

My primary aim with writing this blog post is to try to explain: a) the relevant parts of Rust, b) all the relevant parts of linear logic (which is almost all of it) as seen from a programming-language point of view, and c) how they relate to each other. Hopefully the insights will make Rust easier to reason about, or maybe even reduce some of the shared/mutable duplication that Rust seems to struggle with.

And along the way, we might do more than a little copying of things that you aren't supposed to Copy.

Starting the experiment

First, a quick summary of what I'm trying to do, for those who haven't read the previous blog post. In Rust, a "const generic" is a way to parameterise a type using a number that's a compile-time constant: for example, a Rust "array" (as opposed to a "slice" or "vector") has a length known at compile time, with types like [i32; 4] (an array of 4 32-bit integers) and [i32; 8] (an array of 8 32-bit integers) being different because the const generic is different. Scoped generics let you do something similar with values that aren't compile-time constants (and might not be numbers), under the conditions that a) the type parameter is stored in a variable and b) all values of types parameterised by the variable must be discarded before the variable goes out of scope or has its value change. The conditions mean that the parameter is "effectively constant" from the point of view of values of the type (because the value will be gone by the time the parameter changes); unlike const generics (which use a compile-time check for equal values to see whether two parameters are the same), two scoped-generic parameters are considered the same (for the purpose of checking whether types containing them are the same) by checking to see whether they are stored in the same variable.

One advantage of experimenting with how a new language feature might work is that you don't need to come up with a finished implementation immediately. For the purpose of an experiment, a) it's OK if the syntax isn't very good, and b) it's OK if the feature doesn't work in all cases, as long as it works in enough practically useful cases to get an idea of how it would work in practice. Additionally, I've never built the Rust compiler myself, and my computer probably doesn't have enough memory or free disk space to do so. That pointed me towards trying to do my experiment by writing a new library, rather than trying to change the compiler: I wouldn't be able to implement every aspect of the feature, and the syntax would be worse, but it'd be easier and produce results faster.

It also allowed me to not bother with trying to work out a case that would otherwise be very hard to implement: scoped generics are a sort of type-system-level reference to local variables, which means that a full/final implementation of them would support the same scoping behaviour as local variables do (meaning that if a function containing a scoped variable runs recursively, each level of recursive calls creates its own scoped generic). For the purpose of my experiment, I decided to instead use a less general implementation (especially because the full version may be unimplementable in present Rust): storing, at every location in the program which uses a local variable as a scoped generic, a reference to the local variable into a global variable specific to that location (and representing the scoped generic as a Rust type that hard-codes a reference to the global variable). This is not re-entrant, i.e. it doesn't work properly in the presence of multi-threaded or recursive code, which would make it an inappropriate final implementation for the language feature; but it's fine for experimental purposes, because there are plenty of single-threaded non-recursive programs which would find the feature useful and could be used to evaluate it.

Representing the scoped generic as a Rust type lets Rust's existing type checker do the type-checking (as it already knows how to check whether type parameters are the same), so that's a huge amount of code that doesn't need writing for the experiment. Some code is still needed, though: it's necessary to implement the rule that values of types that use the scoped generic can't outlive the value of the scoped generic, and to implement a way to get at the value of the generic parameter (but only within its lifetime). Both of those are tied to the same lifetime, meaning that the type that needs implementing is "something that has a lifetime and that can provide access to the value in a given global variable during that lifetime". I'll call a value of such a type a "Thing A".

A Thing A is almost sufficient to implement scoped generics simply by a) embedding it in the type that has the generic and b) parameterising the type on what specific type of Thing A we have (because each one is tied to a different global variable). The only things it can't do are a) be initialised correctly in code that's used re-entrantly, and b) implement traits like Default with methods that aren't based on an existing value. But it's close enough to evaluate the feature and see if any lessons are learned about it.

So what is a Thing A? It has a lifetime; during that lifetime, it lets you get at the value stored in something; and it probably has to be Copy (otherwise you couldn't embed it in other Copy things). It can't be Send unless the thing it references is Sync, otherwise you'd have a trivial soundness bug. So far, this sounds an awful lot like a Rust shared reference. But, it also has to have zero size (part of the purpose of scoped generics is to allow a large number of objects to all reference the same thing without using extra memory per-object), which Rust shared references don't (they're represented as pointers). So a Thing A is a… manually implemented?… shared reference that hard-codes the address (rather than containing a pointer).

So far, so good. Once my reasoning reached this point, I set out to manually implement shared references – basically aiming to replicate Rust's existing shared references, except with a hard-coded address – and also to implement the code that stored values in global variables in a sound way (i.e. checking that the code isn't being used re-entrantly). And that's when I started learning things.

Lots and lots of references

This blog post is really about the "manually implementing shared references", but it'll be easier to understand the problems that arose by first starting with the code for storing into the global variable. When we enter the scope of the scoped generic, we need to assign a value to it. But we're writing in Rust, where you can't "just" assign to global variables whenever you like, because that would frequently be unsound. Instead, you need to give the global variable a cell type (i.e. a type that permits writes even if you don't have a mutable reference), using that cell type's rules for writes in order to prove safety. (Early Rust did let you write to global variables directly in unsafe blocks, and this is still supported for backwards compatibility, but nowadays it is preferable to use UnsafeCell: it supports the same operations, but with an API that is more consistent with the rest of the language.)

There are only three potentially viable options for this type of cell in the Rust standard library:

UnsafeCell looks like the best option here: it just needs to be wrapped into a safe abstraction that ensures it isn't being concurrently accessed, panicking if an attempt to use it for a second purpose is made while it's still in use for the first. The resulting abstraction has almost identical semantics to RefCell, except that it's Sync, which gives the fairly obvious name SyncRefCell. (I suspect that the Rust standard library doesn't provide the abstraction in question, even though it's sound, because attempting to use it in multi-threaded code would panic whenever there was contention, and if something is almost unusable in multi-threaded code there's no real purpose to making it Sync. But global variables have to be Sync, so I needed the new abstraction.)

Now, let's see what would have to be done to manually implement a (zero-sized) shared reference to the inside of a SyncRefCell (and I'm going to start counting the number of different reference-like types that are needed). First, we need a way to get from a zero-sized object to the address of the outside of the cell (1): that's just a zero-sized type whose Deref implementation returns a reference to the global variable. Next, we need a way to get from the outside to the inside of the cell (2): that needs a zero-sized structure that both lets us access the inside of the cell, and remembers that "we own the cell" (the reference that owns the cell should be able to read and write it, but other references to the cell should cause a panic, so we need a type to remember the difference). This is basically the same sort of thing as MutexGuard or cell::RefMut from the standard library, and is fairly straightforward to implement: we just need to embed one of the zero-sized references-to-the-outside in it, in addition to some fairly standard PhantomData-for-lifetime. However, this type needs a destructor (to unlock the cell once it's no longer referenced), meaning that it's unsound to copy it (or the destructor would run twice) and thus is only usable as a mutable reference to the inside. So yet another type is needed to represent a share of the MutexGuard equivalent (3): it has the same semantics as a shared reference to a type 2, except that a) it's zero size and b) there isn't an extra layer of indirection in the Deref implementation.

OK, so the next problem is: what type do we put inside the UnsafeCell? We want it to be able to hold references, and when doing that, the type of the reference is mostly known but we don't know the lifetimes (because if code that creates a scoped generic runs in a loop, it could be referencing a different object each time, with disjoint lifetimes, and so the lifetimes will be different). Back when I wrote my previous blog post, I thought that this particular sticking-point would render the global-variable technique non-viable, but there is a solution: references which are the same apart from lifetimes (of the reference itself and of the type they reference) have the same size and alignment, so you can create a block of uninitialised memory (of the right size and alignment) and interpret it as the various different possible types by transmuting it. (In fact, if you know that nobody else is using the block of memory in question, you can even transmute it in safe code, because an uninitialised instance of one type is clearly safe to interpret as an uninitialised instance of another type with the same size/alignment in that case.) So, what's needed is a MaybeUninit containing some pointer-sized memory. We need to be able to create a "transmuted reference" to it that interprets it as a MaybeUninit of some particular type (4). Then we need to track the fact that it's been initialised (to be able to dereference it soundly), so we need a MutexGuard-like reference to it that records the fact that it's initialised (5). To create the original zero-sized references, each of them will need to be able to embed the guard-like reference, which means that we need a copiable version that can't mutably dereference (6); and when trying to implement those, we discover that those need to embed a type 4, except type 4 clearly can't be copied because it relies on exclusive access to make the transmute safe, so we need a copiable version of that without the ability to write through it (7). 
(A type 7 can do the transmute safely because it knows that all other coexisting uses of the memory access it via the same transmute.)

And once we have all that, all we need is a zero-sized reference that embeds a type 6 and dereferences it twice to get at the object we're actually trying to reference (8), and that's the Thing A that makes it possible to implement scoped generics. It would be possible to write a mutable version of that too, but I didn't need it for anything and had become tired of writing implementations of references.

And that's the point at which a good programmer should start thinking. By now I'd written very similar code 8 times, with only minor variations, and that sort of repetition should prompt some thoughts (from least useful to most useful):

(What you shouldn't do: you shouldn't try to get an AI coding assistant to automate the repetition. Even in the general case, that makes the code less maintainable in the future, because you don't have any reliable instructions you can write down in case you need to do the same task again, and because it leads to code which is less readable and less editable: the repetitive parts end up written out in full every time, rather than being stored in a common location. In this specific case, it would be even worse, because this contains some unsafe code which is somewhat subtle with respect to lifetimes: getting it wrong would make the resulting code unsound, and that would be difficult to discover through testing, because you'd need to write a separate compile_fail test for every possible case where a lifetime might be longer than it should be, and I don't even know how to enumerate those.)

In general, the options towards the end of the list are better than the options towards the start of the list, as they a) increase the ability that the compiler has to verify that the code is correct, b) reduce the amount of effort needed for future maintenance on the code, and c) might even help out unrelated code in the future (perhaps written by other people).

But to do any of those actions, you need to know the pattern, and I'd now manually written enough implementations of references to get a pretty clear view of it. The same pattern appeared to apply to a) writing a Deref implementation given the DerefMut implementation, and b) implementing a shared reference given a mutable reference. The basic idea seemed to be to replace all mutable references with the equivalent shared references, remove destructors and side-effecting constructors (and other side-effecting methods), convert method calls from their mutable to shared versions (e.g. assume_init_mut became assume_init_ref), and add a way to obtain the shared reference from a shared borrow of the corresponding mutable reference (via converting every field recursively from mutable to shared, and copying the result to produce a field of the shared reference).

This looks a lot like a missing language feature: it looks almost like we should be able to call many methods that take a mutable reference but give them shared references, and get shared references rather than mutable references in the result. But, that would almost certainly be unsound if implemented as-is, maybe in subtle ways. (For example, an UnsafeCell can safely be read through a &mut UnsafeCell reference, but needs extra safety checks to soundly read it through the corresponding &UnsafeCell reference – the former is an exclusive reference and thus guarantees that nobody else is writing the UnsafeCell at the time, whereas an &UnsafeCell doesn't make any guarantee about who might be writing the contents because the guarantees Rust usually provides for & references don't apply inside cells.)

I wouldn't blame anyone for giving up at this point (or indeed earlier), but I'm the sort of person who likes to keep going along this line of investigation. I wanted to know: just when is it sound to do this sort of mutable-to-shared replacement? And in order to work that out, I would need to discover the type theory that explains what's going on.

"You can't have two mutable references to the same object"

It's time for another experiment, but this time it's a thought experiment. If Rust supported a general way to convert mutable reference implementations to shared reference implementations, it would presumably work on Rust's actual mutable reference implementation, &'a mut T, too: instead of needing mutable and shared references to exist, we'd just need mutable references and an automatic reference converter. That means that we'd, somehow, have a version of Rust that managed to make things work without "natively" having shared references at all. And one of the best ways to find out how to do something is to try to prove it impossible, and see where the proof breaks down: the gap in the proof is the solution you were looking for.

The most obvious impossibility is: it's possible to have two shared references to the same object (and use both concurrently), and if you couldn't do that, most Rust programs would be impossible to write. But one of the most fundamental rules of Rust is that you can't have two mutable references to the same object. Rust's aliasing rules aren't fully decided yet, but surely that one is the most important of all and couldn't possibly have exceptions.

Still, there isn't much to do here but try. I was once watching someone play a card game, and they had two copies of a card that would have been really useful in their situation, but unfortunately the rules of the game said that a player could only play the card in question once per turn. Another spectator said, sarcastically, that the reason that they couldn't play both is because they weren't trying hard enough. That line has really stuck with me, and somehow it seems relevant here.

In Rust, you "can't" copy a mutable reference (except with a reborrow, which prevents the two copies being used concurrently). But maybe that just means that I wasn't trying hard enough, and I should do it myself:

fn main() {
    let mut a = 5i32;
    let b = &mut a;
    
    // SAFETY:
    // It is better to remain silent and be thought a fool,
    // than to speak and remove all doubt.
    let c = unsafe {
        core::ptr::read(&raw const b)
    };
    
    println!("{}", *b);
    println!("{}", *c);
}

This seemed promising: if Rust won't copy the reference, why don't I just look at the bits of memory that represent it and form them into a new reference? Unfortunately, despite this code outright stating that the reference should be copied, and compiling, it doesn't actually copy the reference.

The reason behind this should be familiar to people who have used any memory-unsafe language, such as C, C++, or unsafe Rust: this program has undefined behaviour. Memory-unsafe languages have various rules that you have to follow in order for the program to have any meaning at all; and a program that contains undefined behaviour is meaningless and might do anything, including skipping the offending code entirely (an outcome that actually happens quite frequently in practice). As such, the code can't meaningfully be said to copy the reference because there is no requirement that the Rust compiler compiles the code into anything resembling the original program, and the code it actually produces might therefore not contain the copy. (The standard analogy, dating back many decades now, is that it is perfectly valid to implement undefined behaviour by making demons come out of your nose. That's why I didn't test this program on my own computer, but rather on the Rust playground, in the hope that if demons were summoned, it would be in some distant data centre that could contain them more easily.)

Still, if you give up after just one failure, you probably weren't trying hard enough. I'm just going to have to try even harder:

fn main() {
    let mut a = 5i32;
    let b = &mut a;
    
    // SAFETY: ???
    let mut c = &mut 0;  // placeholder, immediately overwritten
    unsafe {
        core::ptr::copy(&raw const b, &raw mut c, 1);
    };
    
    println!("{}", *b);
    println!("{}", *c);
}

Yes, I know the program looks basically identical to the last one: instead of copying the bits out of a mutable reference and forming them into a new mutable reference, I instead copied the bits out of a mutable reference and used them to overwrite an existing mutable reference. But if you run it in Miri, a Rust interpreter that detects undefined behaviour, something magical (and somewhat surprising) happens: it runs the program successfully and reports no undefined behaviour along the way.

This is one of the most subtle areas of Rust, to the extent that I recently bug-reported a similar situation before learning that it was intended behaviour rather than a bug (and I wasn't the only person making that sort of mistake). I think I understand what's happening now, though, and will start with an analogy from C.

Let's write some C code that prints the value of the PATH environment variable:

puts(getenv("PATH"));

The question I'd like to consider is: is this code thread-safe?

This is a bit more subtle than might be expected, but there's a good explanation in the glibc manual (see the section about "env"). The basic issue is this: C's getenv function is usually implemented as returning a pointer into the inside of a global variable, environ. That pointer is only guaranteed to remain valid as long as nobody changes environ; if it is changed by another thread, the returned value might get overwritten while we're using it, and that could potentially lead to undefined behaviour, e.g. due to a buffer over-read (you might read the value after elements have been added to it but before the sentinel to say where the elements end has been written, and thus read too much) or due to accessing deallocated memory. Despite all this, getenv is typically considered safe: the reason is that changing environ (e.g. via setenv) is considered to be unsound unless you can prove that no other threads are using environ at the time. In other words, we have a combination of two things (getenv and setenv) that are unsound to use in combination, and one of them is considered safe, with the other being considered unsafe. (This subtlety led to one of the only cases in Rust's history where a breaking change was made to mark a previously safe standard library function as unsafe: std::env::set_var, Rust's equivalent of setenv, became unsafe because it might break C code using getenv that ran in parallel (even though it was correctly guarded against running in parallel with the Rust equivalent).)

This is a demonstration of the fact that there are two ways in which code can lead to undefined behaviour: either it directly does something that the compiler/runtime/OS/processor assumes is impossible (potentially causing arbitrary breakage when the assumption is used to guide optimisations and they go off the rails), or else it does something that other code is allowed to assume is impossible. If I call setenv in a multi-threaded program, that might lead to undefined behaviour because some other code, perhaps in a library, might assume that it can safely call getenv at the same time. If I call setenv and can ensure that nobody is calling getenv at the same time, that isn't undefined behaviour and the program will work correctly. So there are two shades of "code you aren't allowed to write" here: either 1) the code directly does something that's undefined and can't possibly be allowed in a valid program, causing it to instantly become meaningless; or 2) the code breaks some invariants that other code is allowed to assume will hold, and is sound only if you can ensure that no code which makes that assumption will run while the assumption is false.

So, what about the example above? In general, Rust code is allowed to assume that two concurrently usable mutable references won't alias. For example, ptr::read assumes that its return value doesn't alias any mutable reference that the caller is using (indeed, Rust functions in general do that).

Rust code is allowed to assume that two concurrently usable mutable references won't alias… but the compiler doesn't.

Here's a great demonstration. Suppose we compile these two Rust functions, in release mode:

pub fn f1(a: &mut i32, b: &mut i32) {
    *a = 1;
    *b = 2;
    *a += 2;
}

pub fn f2(a: &mut &mut i32, b: &mut &mut i32) {
    **a = 1;
    **b = 2;
    **a += 2;
}

In the present version of Rust, the generated machine code for f1 works like this: store 2 in *b, then store 3 in *a. This optimisation is valid because mutable references in function arguments can't alias each other, which is an assumption that the compiler is allowed to make.

In the present version of Rust, the generated machine code for f2 works like this: calculate what address **a points to and store 1 in it; store 2 in **b; then add 2 to the calculated **a address.

With f1, the compiler has made use of knowledge that a and b can't alias with each other. The compiler makes an assumption that mutable references that are function arguments cannot alias each other, and violating this assumption is immediate undefined behaviour.

With f2, the compiler hasn't made use of knowledge that *a and *b can't alias each other, because at least under the current draft of the aliasing rules, that isn't an actual rule of Rust. There are lots of things that you can't do with aliasing mutable references, but apparently this isn't one of them (and returning a reference via writing into memory provided by the caller isn't either, which is what makes the ptr::copy program meaningfully different from the ptr::read program). You can even (if you mark b as mut) add a call to f2(&mut b, &mut c) into the ptr::copy program above, and get an output of two 4s, even in release mode and even under Miri.

Now, I wouldn't recommend ever doing anything like this for any purpose other than thought experiments. The rules behind this sort of thing a) are very subtle and hard to understand, b) can cause arbitrarily bad malfunctions at runtime, in a way that the compiler can't catch and testing may not be able to catch either, and c) are subject to change, potentially in ways that break programs that were previously working. If the Rust developers decided that it was safe to break the code in this section (i.e. that they wouldn't break too much existing unsafe code in the process), I would be upset that my blog post was out of date, but happy at the new optimisation opportunities and at the rules becoming easier to understand.

But in the hypothetical world of thought experiments, in which Rust ended up being implemented without shared references for some reason, every Rust programmer would have been writing code like this, and the developers would ensure that they didn't break all the existing code.

Instead, we'd get one of two scenarios. One is that nobody would use the alternate-world Rust because it would be even more dangerous than C, and thus lose most of its reason for existence. And the other, more intriguing, scenario is that someone would have come up with a framework for making this sort of thing safe.

A Rust with piracy instead of sharing

In this hypothetical version of Rust, programmers would have: a) mutable references (which exist in the real version of Rust too), b) a way to copy things that aren't supposed to be copied (probably with better syntax, as this would be very widely used to do things that would otherwise need the nonexistent shared references), and c) some notation that could be used at least in types, in order to distinguish illegal copies of objects from objects that hadn't been illegally copied (some operations only work on the latter, so they need to be distinguished from the former in order to prove those operations safe).

c) puts me in the situation of creating a new language feature, again, even though this time it's just for a parallel-world hypothetical language for a thought experiment. I don't have any really good ideas for syntax, so I'll deploy another lesson that's been learned about experimenting with new language features: if you don't have a good syntax available, choose an obviously terrible one in order to prevent a bad placeholder name sticking. (Experimental Rust APIs have ended up with names like Yeet or BikeshedIntrinsicFrom based on this principle, although Yeet may not have been bad enough because apparently some people want to keep it.)

As such, my new notation, for experimental purposes, is going to be a prefix ? that marks "illegal copy" types: for example, an illegally copied &'a mut T would be ?&'a mut T. This also suggests that prefix ? would be the operator for copying uncopiable things in the first place:

fn main() {
    let mut a = 5i32;
    let b = &mut a;
    let c = ?b;

    println!("{}", *b);
    println!("{}", *c);
}

I think of the ? operation as "pirating", because it copies something that, according to the rules, you aren't allowed to copy. Note that after the let c = ?b; line, both b and c are effectively a ?&mut i32: the original and copy are copies of each other, so they each have to take the existence of the other into account.

Thinking about the pirating operation in general (not just as it applies to mutable references), it seems to be about "halfway" to creating a shared reference. For example, pirated values are Copy (they've already been illegally copied once, so copying them again isn't going to make the situation worse), can be used to read the original value in most cases, and can be used as a function argument without consuming the original value. On the other hand, they're the same size as the original value (rather than being pointer-sized like references are), can't be used to determine the value's address, and are unable to write inside a Cell (whereas a shared reference is able to do that); all those changes stem from the pirated values being implemented as a copy of the original value rather than a reference to it.

This may give you a suspicion (at least, it gave me one): if we pirate a mutable reference, doesn't the result end up being a shared reference? Pirating a value directly means that the address is lost and the size is preserved, but if you pirate a mutable reference, the address is copied and you end up with something the size of a reference. The only remaining requirement would be to check that the things that can soundly be done with ?&'a mut T are the same things that can soundly be done with normal Rust's &'a T, but they certainly seem pretty similar (e.g. you can't write through it because it might be aliased in another thread, except if it's a single-threaded program and then you can write through it but can't take a mutable reference through it, etc.).

This is interesting because it suggests a model for the mutable-to-shared transformation that I was looking for originally. In particular, it specifies what to do about the type definitions (you just pirate the type, which is equivalent to pirating all the fields, which is equivalent (except for PhantomData weirdness) to pirating all the non-Copy fields because copying a Copy field isn't illegal). That means that I can check it against the reference implementations I actually wrote and discover that… it's almost right. There's nearly an exact match, except that some of these shared references have lifetimes when the equivalent mutable references don't.

And thinking about it, there's a reason for that. The mismatch happens in cases where the mutable reference has a destructor. Say we pirate one of those: then the destructor won't be able to run safely, because it might unlock something that the pirated version is assuming will stay locked. The only way to make the code safe is to get rid of all the pirated copies before the destructor runs: and in Rust, the way to ensure that an object is gone before you do an action is to give the object a lifetime. It turns out to be fairly easy to build this behaviour into the pirating operator ?: you just give it a lifetime when applying it to types, along the lines of ?'a &'a mut T, and when the lifetime is over, the borrowed object is considered to be unpirated again because it now has no copies aliasing it. (At the value level, rather than the type level, the lifetime behaviour is identical to how a shared reference would behave.) After making that change to the plans for ?, the references now match, either exactly or in a "different implementation that gives the same result" sort of way.

So it looks like implementing all those references has been a fruitful experiment in two different ways: a) the implementation should help to give insights into scoped generics, but b) the implementation also shed light on how to generate shared-reference code from the mutable-reference version. This is almost certainly the correct way to transform types. But what does the same transformation on functions and methods look like? I have some examples and know what the transformation "feels like", but it would be nice to have something that's provably correct or at least has some established theory behind it.

At this point, I'm not sure what someone without a PhD in substructural type system theory would do. Fortunately, I happened to have the correct background for solving this particular problem, and spent several hours trying to figure out whether there was some existing mathematical theory that covered this: first from memory, then searching through Haskell documentation for types with the right shape or related-sounding names, then asking friends online. (Haskell is a great place to check for this sort of thing: no matter what weird type-system-related thing you're looking for, there's a very good chance that someone implemented it for Haskell first.) All this effectively had the same result as talking to a rubber duck would (i.e. the replies were useful only to rule things out, but it helped streamline my own thoughts to the extent that I could figure it out on my own), and I eventually started thinking along the correct lines to discover something very interesting indeed.

Linear logic

<ais523> actually, linear logic probably has an operation for this somewhere, it has lots of operations like that

— me, suddenly realising that I had been trying to use the wrong branch of mathematics

Let's take a step back for a moment, to think about programming language type systems. It was observed, several decades ago now, that programming language type systems seemed to follow the same rules as systems of logic did (this is normally known as the "Curry–Howard correspondence"). As a simple example: in logic, if A implies B and B implies C, then A implies C; and in programming, if you have a function that takes an A and returns a B, and a function that takes a B and returns a C, you can combine them into a function that takes an A and returns a C. In general, types in programming languages tend to follow the same sort of rules as predicates in logic, and programs tend to follow the same sort of rules as proofs. At least to me, it isn't clear whether or not this is a coincidence, but because the rules are the same, it lets you take results from one field and reinterpret them to learn things about the other (and doing this is valid regardless of whether or not there's a "reason" for the rules being the same).

There are plenty of different models of logic in philosophy, each of which has slightly different rules: and most of them make some sort of sense when interpreted as frameworks for designing programming language type systems, but the difference in the rules gives you languages of a different general nature.

One thing that many models of logic have in common is that if you have a list of premises that you're trying to prove something with, you're allowed to use the premises multiple times, or not at all, or in an arbitrary order. For example, given an assumption that statements P and Q and R are all true, it's possible to prove that (P and Q) and (Q and R) in logical systems that try to reflect the real-world notion of truth, even though that means you have to use your assumption that Q is true twice. But it's possible to imagine a logic where some of those rules are removed (this is known as a "substructural" logic). For example, if premises aren't allowed to be used more than once ("affine" logic), you end up with a logic in which the truth of the premises gets "used up" as you try to prove things with them, making them useless for proving other things.

Affine logic doesn't seem like a good model for truth and falsity (because it isn't: in real life, true statements don't stop being true just because you prove things with them). But its rules are pretty good at modelling other things, like manufacturing things from resources: the construct which normal logic would call "P implies Q" can be interpreted in affine logic as "you can use a P to make a Q". And when viewed as a type system, the resulting effect on the language is quite interesting: the same construct becomes "the type of a function from P to Q", but implies that P isn't usable after the call (because you don't get to use it as an argument to a different function afterwards). One possible explanation is that "P was moved into the function", leaving it unavailable to the caller. At present, there's only one widely used programming language where functions work like that by default: the language that this blog post happens to be about. Rust is generally considered to have an affine type system, under the view that arguments are moved into the function and become unavailable to the caller.
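This "moved into the function" behaviour is ordinary Rust; a minimal illustration (names are mine, not from any library):

```rust
// A consuming function: its argument is moved in, mirroring the
// affine-logic reading of "you can use a String to make a usize".
fn consume(s: String) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");
    let n = consume(s);
    // Calling `consume(s)` again would not compile: s was "used up".
    println!("{}", n);
}
```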

There are, however, other possible views of affine type systems. I was doing my PhD at around the time that Rust was being invented, and I was using an affine type system too (which I like to think gave me a head start on understanding Rust): but in my case, I was working with an implementation of a language (SCI, "Syntactic Control of Interference") that compiled programs directly into hardware and could therefore do things in parallel very cheaply. However, this carried the usual risks of thread-safety issues that come with running code in parallel, and SCI wanted to avoid those. SCI's solution was to use an affine type system, considering things that create new threads to "consume" the resources used in those threads (which includes both hardware resources like multiplier circuits, and data resources like particular memory cells), preventing them from being used by other threads. On the other hand, if two blocks of code ran sequentially, they were allowed to share resources.

The difference between these two cases was implemented using something called "connectives". In classical logic, the connectives are things like "and", "or" and "not". In substructural logics, you have more connectives: each of them is generally similar to a classical connective, but more specific about the exact way that resources are shared. SCI has different types for the "run in parallel" and "run in sequence" operators. Both operators take "a command and a command" as their argument, but when running in parallel, resources can't be shared between the threads, whereas when running in series they can be. This was represented by combining the types of the arguments using different connectives: "run in parallel" used a version of "and" where the commands were not allowed to share resources, whereas "run in series" used a version of "and" without that safety requirement. The language was thus able to emulate two different connectives that would both translate to "and" in classical logic.

"Linear logic" is a substructural logic that, just like affine logic, disallows using the same premise twice, but which is also notable for having a fairly large set of connectives. Most of the time that it's used in type theory, it's used as a tool for creating type systems by picking only the connectives you need for your language and disregarding the rest: some of them are fairly obscure, to the extent that the week before I started writing this blog post, I'd never used them and didn't even really understand what they meant. As such, when viewed as a type theory, it's more like a framework for developing type theories than it is a useful type theory in its own right.

But, as it turns out, Rust was already using a fairly large portion of it, and when modelling the pirating operator, I needed even more. So I'm going to go over the six "main" connectives in this blog post (known as the "additives, multiplicatives and exponentials") and try to explain how they relate to Rust.

Understanding the more commonly-used linear logic connectives

I'm going to go through the linear logic connectives in order from easiest to hardest to understand. As such, I'm going to start with ⊗. Combining two things with ⊗ basically means that you get both, and can use both: in Rust, this is a struct or tuple, which has multiple fields and you can move out of all of them. For example, in linear logic, you would model the Rust type (i32, u32) as i32 ⊗ u32, and a struct with an i32 field and a u32 field would be modelled the same way (because it isn't really significantly different from a type theory point of view: you can do the same things with it). A good way to think about T ⊗ U is that to produce it, you must produce a T and a U, and when consuming it, the consumer gets a T and a U: simple enough.
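In ordinary Rust, consuming a ⊗ looks like destructuring a tuple (a small illustration of my own, not from any library):

```rust
// A tuple models ⊗: producing it requires producing both parts, and the
// consumer can move out of both parts independently.
fn split(pair: (String, Vec<u8>)) -> (usize, usize) {
    let (s, v) = pair; // consuming the ⊗ yields both a String and a Vec
    (s.len(), v.len())
}

fn main() {
    let pair = (String::from("hi"), vec![1, 2, 3]);
    println!("{:?}", split(pair));
}
```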

Rust's two main tools for creating arbitrary data structures are structs and enums, and it isn't surprising that linear logic has a connective for enums, too. ⊕ is the construct called enum in Rust, "tagged union" in language-agnostic programming discussions, and "disjoint union" in mathematics: it represents a value that could have either of two types, and tracks which. For example, linear logic would model Result<T, U> as T ⊕ U. To produce a T ⊕ U, you only need to be able to produce a T or a U; and when consuming one, the consumer must consume the T or the U, whichever one the producer produced. However, the consumer gets to share resources between the two possible cases: if (say) it needs to access the same piece of memory to process a T as it does to process a U, it can check to see whether it has a T or a U before deciding what to use the memory for, and the program is still sound.
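The resource-sharing point can be made concrete with Result: both match arms below reuse the same buffer, which is fine because only one arm ever runs (a sketch of my own):

```rust
// Result<String, i32> models String ⊕ i32: the producer picks a variant,
// and the consumer handles whichever one it got. Both arms may use the
// same resource (here, the same buffer) because only one arm runs.
fn describe(r: Result<String, i32>) -> String {
    let mut buf = String::from("got ");
    match r {
        Ok(body) => buf.push_str(&body),
        Err(code) => buf.push_str(&code.to_string()),
    }
    buf
}

fn main() {
    println!("{}", describe(Err(404)));
}
```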

The next-simplest connective is &, which represents a single object that can be viewed as having either of two types. In Rust, this situation occurs with trait implementations: a Vec is a Vec, but it's also an impl Debug, and it's also an impl IntoIterator, and it implements plenty of other traits too. Linear logic represents this situation using &: a T&U is a single value that can be viewed as either a T or a U, so to produce a T&U, the producer has to be able to produce something that's a T and a U, but when consuming one, the consumer only gets to consume it as a T or a U. (& is unlike ⊕, though, in that the consumer gets to choose which; with ⊕ it's the producer who chooses.)
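The "consumer chooses one view" behaviour of & shows up when a value is consumed through one of its trait implementations; in this sketch (my own example), using both views of the same Vec requires a second Vec:

```rust
use std::fmt::Debug;

// A Vec<i32> can be viewed as an impl Debug or as an impl IntoIterator,
// but a consumer that takes it by value commits to one view, like
// consuming one side of a T & U.
fn view_as_debug(v: impl Debug) -> String {
    format!("{:?}", v)
}

fn view_as_iter(v: impl IntoIterator<Item = i32>) -> i32 {
    v.into_iter().sum()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = v1.clone();
    println!("{}", view_as_iter(v1));  // choose the IntoIterator view
    println!("{}", view_as_debug(v2)); // the other view needs a second copy
}
```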

It's worth noting at this point that linear logic is somewhat flexible with how it models "time passing" in a program. One common way to use it is to think of it as representing the types that are instantaneously available in a program: in SCI, the sequencing operator takes a command & command as its argument. Linear logic indicates that it should only be able to use one; but the correct interpretation (at least for SCI) is that it should only be able to use one at a time, with nothing preventing it running the other afterwards, because there's no point in time at which the state of the program is wrong. If you use the "at a time" view of linear logic to model Rust, it ends up modelling mutable borrows (because once you get the value back from a mutable borrow, you can use it for something else, so a T & U can be borrowed first as a T, and later as a U); if you use the "for all time" view instead, it models moves. There's probably some mathematical trick to get it to model both at once, but I don't know what it is.

The last connective I want to consider in this section is !, an "exponential" that applies to just a single type. In linear logic, !T is equivalent to 1 & T & (T ⊗ T) & (T ⊗ T ⊗ T) & … with any number of Ts (where 1 is an empty tuple, Rust's ()); it can be viewed as an inexhaustible supply of Ts. From the Rust point of view, this doesn't make much sense as an operation to apply to arbitrary types T. However, the aim is to use linear logic to model Rust, not Rust to model linear logic, and there's a clear purpose for ! when modelling Rust: it's used to model Copy types. For example, i32 could be represented as !NonCopyI32: if you have an i32 you can easily generate more of it (by copying in Rust, or by converting !NonCopyI32 to !NonCopyI32 ⊗ !NonCopyI32 in linear logic).
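The Copy/non-Copy contrast in ordinary Rust (a minimal illustration of my own):

```rust
// Copy types behave like !T: extra copies appear on demand, with no
// tracking of how many exist.
fn main() {
    let x: i32 = 42;    // i32 is Copy, so it acts like !NonCopyI32
    let y = x;          // like converting !T into !T ⊗ !T
    let z = x;          // and again, as often as needed
    println!("{}", x + y + z);

    let s = String::from("hi"); // String is not Copy...
    let t = s;                  // ...so this is a move, not a copy
    // println!("{}", s);       // would not compile: s was used up
    println!("{}", t);
}
```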

Translating Copy types as ! of a non-Copy type is interesting because it teaches us something about Rust. Affine logic doesn't allow you to use values more than once, without some way to copy them; but linear logic additionally doesn't allow you to use values less than once, without some way to get rid of them. Rust is normally described as affine because you can get rid of any value using mem::forget, but I think that's misleading: in affine type systems you can just ignore a value and it vanishes without side effects, whereas in Rust you can only get rid of a value by dropping it, leaking it, or using an operation like mem::forget or ManuallyDrop to forget it. Forgetting has to be explicit; leaking leaves the object in existence from the type-system point of view, you just never access it again; and dropping is done implicitly, but can run custom code (and thus has to be treated from the type system point of view as though the call to drop/drop_in_place were written explicitly). So I think it's more accurate to view Rust's type system as "linear, but you have a range of options for dropping/forgetting things intentionally".
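The three ways a value can go away can be demonstrated directly (my own sketch; the counter is just there to observe destructor runs):

```rust
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};

// Count destructor runs, to show the ways a value can leave existence:
// explicit drop, explicit forget, and implicit drop at end of scope.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked;

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    let a = Tracked;
    let b = Tracked;
    let _c = Tracked;
    drop(a);        // explicit drop: destructor runs now
    mem::forget(b); // explicit forget: destructor never runs
    println!("{}", DROPS.load(Ordering::Relaxed)); // only a's drop has run
    // _c is dropped implicitly here, at the end of its scope
}
```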

The interesting aspect here is that in linear type systems, a ! type is defined to be discardable even without explicitly using a function to get rid of it. For most types, the type system will ensure that values of the type are dropped when they leave memory, unless explicitly forgotten. For ! types, it doesn't: it allows the type to just vanish even without being dropped. And that implies a rule that Rust should have (and actually does have): a Copy type can't have a destructor. Part of the reason I wanted to look at the type theory behind pirating is that I was hoping for insights like this: if the type theory is modelling your types correctly, then following the logic will teach you things that should be possible or impossible in your language, and you can see whether they actually are. If they aren't, either you aren't modelling the types correctly, or you have a soundness hole or missing feature.

In any case, that's four connectives covered out of the six I want to discuss, and (apart from the connective that represents function types) those are the only ones I've ever seen anyone use in practice. The remaining connectives are ⅋ and ?, and ? is fairly easy to define in terms of ⅋.

But understanding ⅋ is something of a nightmare, and easily the most complicated part of linear type theory (in the rare cases where you actually end up needing it). Linear logic is actually very internally symmetrical, with ⅋ filling a hole in a pattern: but it's a hole that might at first seem impossible to fill. Types made with some connectives are easier to produce than types made with others; and connectives that are easier to produce with are less useful for the consumer, who can do less with them. Ordering the four non-exponential connectives from easiest to consume to easiest to produce:

- ⊗: the producer must produce both halves, and the consumer gets to use both;
- &: the producer must be prepared to produce either half, and the consumer chooses which one to use;
- ⊕: the producer chooses which half to produce, and the consumer must be prepared to handle either;
- ⅋: the producer can somehow produce both halves, and the consumer must be prepared to handle both, however they arrive.

Unlike the first three connectives, ⅋ seems impossible and nonsensical. When producing it, the producer is doing something apparently impossible (and yet this is easier for the producer, somehow, than ⊕ which appears to give the producer maximum flexibility in how to produce). When consuming it, you get two objects, but somehow that's harder for the consumer to handle (and less useful for producing things with) than getting only one.

I think I have a good intuition for what ⅋ is actually doing, now. But something this strange is going to take a lot of explaining, so it's going to take me a whole section to see if I can convey my intuition in a way that readers will understand.

Understanding linear logic's ⅋ and ?

“I should like to buy an egg, please,” she said timidly. “How do you sell them?”

“Fivepence farthing for one—Twopence for two,” the Sheep replied.

“Then two are cheaper than one?” Alice said in a surprised tone, taking out her purse.

“Only you must eat them both, if you buy two,” said the Sheep.

— Lewis Carroll, Through the Looking-Glass

Let's start with a situation where ⅋ might actually come up. It's too contrived to be likely to happen naturally, but hopefully it'll nonetheless be possible to understand what's going on.

Suppose we find an experienced JavaScript programmer who doesn't know any Rust, and ask them to design the interface for a Rust library for making web requests (this is a high-level library: you specify the URL you want to fetch, and get the contents of the web page back, or an error code). There are two problems in designing this API: a) how to deal with the delay between making the request and getting the response, and b) how to deal with the fact that the return value could be either a string or an integer.

For a), Rust programmers might use async, or just block on the request, but the usual low-level primitive behind this sort of thing in JavaScript would be a callback: although it would normally be wrapped up into a promise in modern JavaScript, someone who didn't know the language they were working in would be likely to try the simple things first, so it's quite plausible that they'd come up with an API that uses a callback. For b), obviously the correct interface is a tagged union (Rust enum – a Rust programmer would probably use Result), but someone who didn't know Rust might not know about that feature and might attempt to implement it manually.

So how do you implement a callback that conceptually takes an enum as argument, without using an actual enum? (looks up linear logic notes and observes that to consume a T ⊕ U you can use a T-consuming function & a U-consuming function) The correct approach for this is to define a trait with a method for each variant, so that you can determine which variant you had by which method gets called:

trait HttpCallback {
    fn success(self, response_body: &str);
    fn failure(self, http_status: i32);
}

// version 1
fn http_request(url: &str, callback: impl HttpCallback);

However, someone who's unfamiliar with Rust would probably not use a trait, and after looking up the correct syntax for taking a function as an argument, might end up using a struct or tuple of two callbacks instead, or just ask for two separate callbacks as method arguments:

// version 2
fn http_request(url: &str,
                success_callback: impl FnOnce(&str),
                failure_callback: impl FnOnce(i32));

In version 1 of this API, http_request is effectively providing an &str ⊕ i32 to the callback, which is what we'd expect. But in version 2, it's providing an &str ⅋ i32.

Let's see what goes wrong from the point of view of creating the callbacks, first. A ⅋ type is less useful for the consumer than a ⊕ type, and here, that shows up when we try to borrow-check the callbacks. In version 1, there's no issue with having the success and failure callback methods both try to use the same fields of self: the whole self object gets moved into whichever callback method gets called, and so the method has full access to all of self. But in version 2, the success and failure callbacks are actual objects that exist at the same time as each other, so they can't contain conflicting borrows; for example, if both the success and failure callbacks want to write to some sort of controller object, they won't both be able to hold a mutable reference to it.

Is the borrow checker being unnecessarily restrictive here? A beginning Rust programmer might think so, but a more experienced Rust programmer will recognise that there's probably some valid hypothetical that the compiler is trying to guard against. And in this case, it's not too hard to spot the potential issue: nothing in version 2 of http_request actually specifies that only one of the callbacks gets called. A malicious http_request could call the success callback, or the failure callback, or the success callback and the failure callback, or maybe it calls them in the opposite order. If the callbacks used more complex argument types than &str and i32, it might find a way to pass the success callback as an argument to the failure callback, or vice versa, and trick the calling program into running one inside the other – and if either of them were Send, it could run them simultaneously on different threads.
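To make the contrast concrete, here's version 1 in runnable form (with a stub http_request of my own that always reports success): because self is moved into whichever method runs, one set of captured resources serves both variants, and the borrow checker is satisfied.

```rust
// Version 1 of the API: self is moved into whichever callback method is
// called, so both methods can freely use the same captured resources.
trait HttpCallback {
    fn success(self, response_body: &str);
    fn failure(self, http_status: i32);
}

struct Logger<'a> {
    log: &'a mut Vec<String>, // one mutable borrow serves both variants
}

impl<'a> HttpCallback for Logger<'a> {
    fn success(self, response_body: &str) {
        self.log.push(format!("ok: {}", response_body));
    }
    fn failure(self, http_status: i32) {
        self.log.push(format!("error: {}", http_status));
    }
}

// Stub implementation: pretend the request always succeeds.
fn http_request(_url: &str, callback: impl HttpCallback) {
    callback.success("<html></html>");
}

fn main() {
    let mut log = Vec::new();
    http_request("https://example.com", Logger { log: &mut log });
    println!("{:?}", log);
}
```

Rewriting Logger as two separate FnOnce closures for version 2 would fail to borrow-check, because both closures would need to capture `&mut log` at the same time.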

So this is what a T ⅋ U is: it's an object that could be a T, or a U, or a T and a U that have to be safe to process together in some arbitrary way that might involve processing them in any sequence, or in parallel, or interleaving them, etc. The consumer can't tell which (unlike with ⊕ where it can tell which, and & where it can choose which), and has to dedicate separate resources to the T and to the U just in case it ever gets given both (even if the producer only actually happens to generate one). In a way (that linear logic actually precisely defines!), ⅋ is the opposite of ⊗: a T ⊗ U can also be processed as a T, or a U, or a T then a U, or a U then a T, or both at the same time – but with ⊗, the consumer gets to choose which, whereas with ⅋ it's imposed by the producer.

⅋ might not seem very useful, and indeed I haven't found anything simple or useful in Rust that it corresponds to. But it can be used as a building block for something more interesting. Think about what happens when the linear logic connectives are used with both types the same:

- T ⊗ T is simply two Ts, like the Rust tuple (T, T);
- T & T is a value that can be consumed as a T in either of two ways – still essentially a T;
- T ⊕ T is a T together with a tag saying which "variant" it is, like Result<T, T>;
- T ⅋ T is… something stranger.

OK, so it's hard to see how a T ⅋ T a) could get created, or b) would be useful even if it did. So let's go further: what about a T ⅋ T ⅋ T, or a T ⅋ T ⅋ T ⅋ T? What about the limit: imagine a type that could be any number of Ts, ⅋ed together. This would be T ⊕ (T ⅋ T) ⊕ (T ⅋ T ⅋ T) ⊕ …, which linear logic has a name for: it's ?T, where ? is the sixth and final connective that I want to discuss in this blog post.

Let's think about what a ?T actually is. It's sort-of like a T, but consumers are very restricted in what they can do when processing it: anything they do has to be valid even if arbitrarily many copies of the same consumer were trying to do the same thing before, after, or during the processing of the ?T ("same consumer" because you have to be able to process arbitrarily many Ts but only have finitely many different consumers, so some of them must be the same). They can't borrow anything mutably, because anything they do might be done any number of times, and even twice wouldn't pass the borrow checker. That also means that they can't use a mutable reference to prove that they have unique access to memory: anything they try to write to might potentially be aliased.
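These restrictions are exactly the ones Rust already enforces for shared references (a small illustration of my own):

```rust
// Shared references behave like a ? type: freely duplicable and
// readable, but with no way to assert unique access for writing.
fn main() {
    let v = vec![1, 2, 3];
    let r1 = &v;
    let r2 = r1; // shared references are Copy: duplicate at will
    println!("{} {}", r1.len(), r2.len());
    // *r1 = vec![]; // would not compile: writing needs uniqueness
}
```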

Once you've finished consuming, any result you produce is also a ? type. The reason is that the original producer of the ? type gets to choose how many times it might need to be processed and which of the results end up mattering, so if you process a ? type and return a value, you don't actually know the structure of that value: just that it's some number of copies of the return value and you don't know which ones are valid or what sequence they ran in, so the best you can do is produce a ? of the result type.

Writing while holding a ?T isn't outright impossible, but you can't rely on the borrow-checker to tell you that it's safe: you'd need some other sort of proof, like "this thing isn't Send so I know it's only being accessed from a single thread" or "there's a lock around this object in order to protect it from concurrent modifications". The same rules for writing apply to shared references (the types that let shared references get around the restrictions are Cell and Mutex respectively), which should be enough to raise more than a little suspicion.
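The Cell escape hatch in action: the write below is sound not because of the borrow checker, but because Cell supplies the single-threaded uniqueness argument (my own minimal example):

```rust
use std::cell::Cell;

// A shared reference normally forbids writes, but Cell provides the
// extra single-threaded uniqueness argument that makes them sound.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1); // write through a shared reference
}

fn main() {
    let counter = Cell::new(0);
    let a = &counter;
    let b = &counter; // freely aliased, like a ? type
    bump(a);
    bump(b);
    println!("{}", counter.get());
}
```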

What linear logic teaches us about pirating

All of this leads up to my big surprising discovery: even though Rust doesn't (as far as I know) have anything that's modelled by ⅋, it does seem to have something that's modelled by ?. In particular, the linear-logic-defined type ?T seems to be acting an awful lot like a pirated type ?T. Thinking about it, this makes a lot of sense: pirating a type effectively says "there may be extra copies of this that aren't supposed to exist", and ? effectively says "this value can only be handled in ways that are safe even if you have to allow for the potential that extra copies of it exist and are being operated on".

To verify the theory, we just need to look at what ? is supposed to be able to do and verify that Rust can do the same. The main interesting rule of ? in linear logic is that if something can operate on a T using only resources which have ! type, then it can also operate on a ?T (producing a ? of the result). In Rust, that translates to "a closure that is Copy can operate on a pirated version of its argument to produce a pirated version of its result".

Unfortunately, this seems obviously wrong. Rust has some very simple closures like |x: &mut i32| *x = 10 which are Copy and return () (which is safely piratable due to also being Copy), and yet would be unsound if you pirated the argument in order to make it into a shared rather than mutable reference.

The mismatch comes down to what the assignment operator = is doing: in order to know that it can do the write safely, it is making an "argument from uniqueness" using the uniqueness of the mutable reference that it's writing to. When given a pirated mutable reference (i.e. a shared reference) instead, the uniqueness proof needs to come from somewhere else: but such a proof would (obviously) not be Copy, because then it would be able to prove to two different closures that they both had unique access. So the correct conclusion is "a closure that is Copy can operate on a pirated version of its argument to produce a pirated version of its result, as long as it does not make an argument from uniqueness" (or, at least, if it wants to make such an argument, it has to somehow obtain a non-pirated uniqueness proof despite being Copy). This reasoning suggests that arguments from uniqueness should be a type-system-level concept, given that they show up in the logic and influence it.

If this is a type-system-level concept, it should be possible to find it in the type system! And indeed, there are a couple of places that I've seen it before. One of them is in my previous blog post, which ended up implementing Cell in safe Rust by using scoped generics. It did that by using a scoped generic that represented the permission to access Cells: while you were operating inside a Cell, you didn't have the permission and thus couldn't form a new reference to the inside of a Cell while you already had one. This is a sort of argument from uniqueness, but one that's visible as a concrete object in the type system (which I'll call a "cell permission").

In a way, Rust's existing Cell works a bit like that too: you can think of it as storing a cell permission in a thread-local variable, taking the permission for the body of each method on Cell and returning it before the method exits. (This is the reason why you can't borrow a reference from the inside of a Cell: the cell wouldn't be able to return the permission to the thread-local variable while the permission was still needed to keep the reference alive.) Because the permissions are thread-local, you can't share a Cell with another thread: both threads might then be able to use their own permissions to access it at once. On this reasoning, Cell should be Send if it contains a Send type, but never Sync (and it does indeed have those specific trait implementations).
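The Send-but-not-Sync behaviour can be checked directly (my own example): moving the whole Cell to another thread is fine, because the thread-local "permission" moves with it.

```rust
use std::cell::Cell;
use std::thread;

fn main() {
    // Cell<i32> is Send: the whole cell can move to another thread...
    let cell = Cell::new(41);
    let handle = thread::spawn(move || {
        cell.set(cell.get() + 1);
        cell.get()
    });
    println!("{}", handle.join().unwrap());
    // ...but it is not Sync: sharing a &Cell<i32> across threads would
    // let two threads use their own "permissions" at once, so the
    // compiler rejects it.
}
```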

Still, Rust's current Cell implementation is clearly not as powerful as the theory suggests it should be: even in current Rust, without scoped generics, it should be possible to represent a cell permission in the type system and use it to access cells which can't be proven accessible in other ways. This suggests that Rust is missing a feature; and although I can't find the feature in question in the Rust standard library, I was able to find it in a couple of places on crates.io: ghost-cell has an implementation backed by some formal logic proving that the idea is correct, and qcell has four different implementations, which differ in the arguments from uniqueness that they use to generate the cell permission in the first place. So it seems like such a feature reasonably could/should exist in the language.
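To show the shape of the idea, here's a heavily simplified sketch in the spirit of qcell (the API below is hypothetical and mine, not the real crate's; the real implementations are more careful): the cell permission is an Owner value, and the uniqueness of &mut Owner is the argument from uniqueness that justifies the write.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU64, Ordering};

// Each owner gets a unique id; a cell remembers which owner made it.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

pub struct Owner {
    id: u64,
}

pub struct TCell<T> {
    owner_id: u64,
    value: UnsafeCell<T>, // UnsafeCell also makes TCell !Sync, as required
}

impl Owner {
    pub fn new() -> Owner {
        Owner { id: NEXT_ID.fetch_add(1, Ordering::Relaxed) }
    }
    pub fn cell<T>(&self, value: T) -> TCell<T> {
        TCell { owner_id: self.id, value: UnsafeCell::new(value) }
    }
    // Shared access to the owner gives shared access to its cells.
    pub fn ro<'a, T>(&'a self, cell: &'a TCell<T>) -> &'a T {
        assert_eq!(self.id, cell.owner_id, "wrong owner");
        unsafe { &*cell.value.get() }
    }
    // Unique access to the owner gives unique access to its cells:
    // the borrow checker prevents two simultaneous `rw` calls.
    pub fn rw<'a, T>(&'a mut self, cell: &'a TCell<T>) -> &'a mut T {
        assert_eq!(self.id, cell.owner_id, "wrong owner");
        unsafe { &mut *cell.value.get() }
    }
}

fn main() {
    let mut owner = Owner::new();
    let a = owner.cell(1);
    let b = owner.cell(2);
    *owner.rw(&a) += 10;
    println!("{} {}", owner.ro(&a), owner.ro(&b));
}
```

One Owner can guard many cells: borrowing the owner mutably stands in for borrowing all of its cells at once, which is exactly the "permission object" pattern described above.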

Another interesting observation about arguments from uniqueness: normally, such arguments can be used to justify writes; but if you pirate one of them, it can no longer make writes safe because it might conflict with a pirated copy of itself. However, the pirated proof still proves that there are no conflicting writes (because pirating it prevents it being used for writing but doesn't give anything else the ability to write), and thus it makes reads safe. So this gives, in a sense, a mathematical justification for why shared references work to make reading safe: extra justification for that probably wasn't needed, but it was still a little surprising (and somewhat mathematically elegant) to see such a justification turn up.

Unfortunately, I wasn't able to conclude much more from the "something that maps T to U can map ?T to ?U if all its resources are Copy" rule. The rule feels as though, in Rust, it wants to operate by updating every pirated copy of the object simultaneously, but even if that would be sound, it would be impossible to implement due to not tracking where the copies actually are.

There is another ?-related rule in linear logic: ??T is the same as ?T. Although that one's almost degenerate in linear logic, it's interesting in the context of Rust because it effectively makes pirating more transparent than referencing: in Rust, T, &T and &&T can all have different trait implementations, whereas the fact that ?T and ??T are the same means that it's hard to distinguish between U and ?U in traits that are generic over U (because U might be ?T).

This means that some trait implementations get weirder. Logically, you would want pirated copies of objects to implement pirated copies of their traits (the most obvious example in this direction is that pirating DerefMut gives you Deref). But going down this line tears Clone in two different directions: one wants to be consistent with Copy, cloning a ?T as a different ?T by copying it, and the other wants to clone a ?T as a T (but also wants to clone a ??T as a T, not a ?T). And weirdly, it turns out that the Rust library team has already accepted an API change proposal to add the second of these traits (that clones a ?T as a T), calling it CloneFromCopy, which is particularly strange given that current Rust doesn't even have a pirating operator – I was unaware of the trait in question before I started writing this blog post, and only coincidentally learned about it while doing something unrelated. (The details are in this GitHub issue: the idea of CloneFromCopy was discovered in the process of trying to implement a way to safely clone out of a Cell.)

Despite this weirdness, it definitely feels like a version of Rust that's based on pirating and mutable references rather than on shared references and mutable references (with &T being sugar for ?&mut T) should be viable: I can think in pirated Rust and it seems to logically hang together (both in terms of the linear logic behind it and in terms of what the language can and can't do). Having three sorts of reference-like things (mutable references, shared references, pirated copies) rather than two might make finding a good syntax hard, but hopefully that would be a resolvable problem. In most ways, the language seems somewhat superior to current Rust, in that it doesn't force you to create a reference merely to be able to share something (for many types, the reference just adds an extra layer of indirection for no benefit), and doesn't force you to write both shared and mutable versions of almost every method: and it would be a superset of current Rust in that you can implement shared references in it, so you get all the functionality of current Rust in addition to pirating.

Another interesting data point: I've been considering writing my own practical programming language for a while, and part of the process of planning involved thinking about what sort of memory model I wanted it to have (in many cases, by thinking about programs I wanted to write and what sort of primitives I would need to be able to write them). Although I started mostly from scratch, the three "main" ways to pass arguments ended up corresponding fairly closely to Rust's (T, &T, &mut T) – except that the &T was a little different, and thinking about it in retrospect, what I came up with was ?T rather than &T. (The planning was quite prescient in that it also ended up with an equivalent of scoped generics, with very different syntax but the same semantics.)

Unfortunately, it's almost certainly too late to switch Rust to be based on pirating rather than shared references at this point; there's likely a lot of unsafe code that makes assumptions that are true in current Rust but become false with pirating, you'd want to redesign a lot of standard APIs to use pirated copies rather than shared references, some very core traits would start working slightly differently, and all the existing tutorials would go out of date: it'd be a very wide-reaching change.

Still, even if it's too late to make this large a change to Rust as a whole, maybe it isn't too late to learn some lessons from the thought-experiment version of Rust, and find some features that are missing from the real version.

What pirating teaches us about Rust

Shares of reference-counted references

The most obvious place to look for missing features in current Rust is by pirating its existing reference types. After all, this blog post managed to recreate Rust shared references by starting with Rust mutable references and pirating them: so maybe something interesting happens if we pirate some other sort of reference?

And indeed, a missing feature turns up almost immediately: just consider ?'a Rc<T> or ?'a Arc<T>. The semantics of these are fairly easy to explain: they're created from an existing Rc/Arc (i.e. one of the potentially many references to a reference-counted object), assert that that particular reference they were created from won't be dropped during the lifetime 'a, and can be freely cloned during that lifetime without actually needing to change the reference count (making the clones more efficient). The advantage over &'a Rc<T>/&'a Arc<T> is that it saves a level of indirection, leading to simpler code (especially in the generated executable); the advantage over &'a T is that you can use it to clone the underlying Rc/Arc, even with a lifetime that extends beyond 'a; and the advantage over Rc/Arc is that it cuts down on reference count updates, which can be really slow (especially with Arc – it uses atomic read-modify-write operations, which are some of the slowest instructions typical processors have – but even Rc needs to guard against the reference count overflowing and the overflow checks usually make reference increments and decrements impossible to optimise out).
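As a sketch of what such a type could look like in today's Rust (the names here are illustrative, and rc-borrow's real API differs; the raw-pointer round trip relies on the documented Rc::as_ptr/from_raw/increment_strong_count contracts):

```rust
use std::marker::PhantomData;
use std::rc::Rc;

// A borrow of one particular Rc handle: freely copyable during 'a,
// with no reference-count traffic until you upgrade back to an Rc.
struct RcShare<'a, T> {
    ptr: *const T,                 // points into the shared allocation
    _from: PhantomData<&'a Rc<T>>, // the handle we came from outlives 'a
}

// Copy without requiring T: Copy (a derive would add that bound).
impl<'a, T> Clone for RcShare<'a, T> {
    fn clone(&self) -> Self { *self }
}
impl<'a, T> Copy for RcShare<'a, T> {}

impl<'a, T> RcShare<'a, T> {
    fn new(rc: &'a Rc<T>) -> Self {
        RcShare { ptr: Rc::as_ptr(rc), _from: PhantomData }
    }
    fn get(self) -> &'a T {
        // SAFETY: the borrowed handle keeps the allocation alive for 'a.
        unsafe { &*self.ptr }
    }
    // Upgrade back to an owned Rc; only now does the count get bumped,
    // and the result may outlive 'a.
    fn to_rc(self) -> Rc<T> {
        // SAFETY: ptr came from Rc::as_ptr on a handle that is still alive.
        unsafe {
            Rc::increment_strong_count(self.ptr);
            Rc::from_raw(self.ptr)
        }
    }
}

fn demo() -> (u64, usize, usize) {
    let rc = Rc::new(41u64);
    let share = RcShare::new(&rc);
    let copy = share; // free: just a pointer copy, no refcount update
    let val = *copy.get() + 1;
    let owned = share.to_rc(); // refcount becomes 2 only on upgrade
    let during = Rc::strong_count(&rc);
    drop(owned);
    (val, during, Rc::strong_count(&rc))
}

fn main() {
    assert_eq!(demo(), (42, 2, 1));
}
```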

The ?Rc<T> and ?Arc<T> types seem to be missing from the Rust standard library, but I found them in a couple of crates on crates.io: rc-borrow is a direct implementation of them, but is hardly used; and triomphe implements the equivalent for its own Arc implementation. My guess is that although the feature in question is a clear improvement over the alternatives in the case where you need it, it isn't enough of an improvement to be worth adding a dependency (dependencies have a substantial cost of their own, so you don't want to add one unless you see a comparable gain).

Why shallow snapshots don't work

A different way in which pirating might seem useful is to create a "snapshot" of a value: it at first seems like it might be useful to take a copy of something so that you can safely use it after the original changes. But one lesson I learned from pirating all those reference types was that this sort of copy has similar safety rules to a shared reference: it has a lifetime, and you can't change the original until the lifetime is over. In this situation, you really need a clone (cloning deeply if that's how the type's Clone implementation works), rather than a shallow copy; otherwise, if the original is modified or dropped, that might cause something owned by the original to be dropped, leaving a dangling reference in the copy. (While writing this paragraph, I realised that CloneFromCopy had exactly this issue and thus was unsound, and the design was changed as a consequence. Although I was hoping to learn lessons about Rust from the experiment into pirating, I wasn't expecting them to be so immediately practically relevant!)

Trait implementations on references

The whole Clone versus CloneFromCopy thing got me thinking about how trait implementations interact with references more generally. It feels like Rust "wants" to, in general, make references implement the same traits as the things that they reference (and the "??T is ?T" equivalence provides an argument that something like that should be expected) – although with shared references, you get a "read only" version of the trait (because a shared reference is a pirated mutable reference, and pirating an object gives you a read only version of its traits). In particular, it's rare for both the reference and the referenced object to implement the same trait differently.

There are a couple of notable exceptions: Clone and Debug both make sense at different abstraction levels (you can clone an Rc<T> or you can clone the T inside it; and you can get debugging details either for the referenced object or the reference itself). But the interesting thing here is that the two cases seem to differ. When cloning, usually you want to clone the reference because it's usually either cheaper than, or entirely equivalent to, cloning the object (although if you specifically need a deep copy of an interior-mutable object, you need to clone the object itself rather than a reference wrapping it). When you're debugging code, normally you care about debugging the referenced object rather than the reference itself.

Rust defaults to taking the trait implementation from the reference, which makes <Rc as Clone>::clone() work correctly – but it would "naturally" make Debug give the wrong behaviour. As such, reference types are normally implemented to just delegate their Debug to the inner value: but that means that you can't debug the reference itself (which is annoying when you're writing your own custom reference types). It also becomes a little incoherent sometimes: one of the references I wrote was basically a "proof that a MaybeUninit is initialised, plus a reference to the value inside it", and it wraps a reference to the MaybeUninit. If the inner value is Debug, by Rust's normal convention the wrapper's Debug implementation should delegate to it, but that means manually overriding the wrapped reference's Debug (which can't get at the referenced object, because it doesn't know whether the memory it points to is initialised). If the inner value isn't Debug, it would be nice to be able to Debug the wrapped reference instead: but that would be incoherent. This sort of thing makes me think that at least Debug (and possibly also Clone) is probably defined incorrectly (in particular, I am wondering whether it would make more sense if Debug on a reference debugged the reference itself, but the {:?} format specifier applied Deref as many times as possible before using the Debug implementation, with a separate format specifier available to use Debug directly).
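Here is a minimal illustration of that delegation convention, and of how delegating forecloses debugging the reference itself (MyRef is a made-up wrapper, standing in for any custom reference type):

```rust
use std::fmt;

// A custom reference type following the usual convention: its Debug
// implementation forwards to the referenced value, like std's &T does.
struct MyRef<'a, T>(&'a T);

impl<'a, T: fmt::Debug> fmt::Debug for MyRef<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Because we delegate here, {:?} can never show MyRef itself:
        // the reference's own state is invisible to debug output.
        self.0.fmt(f)
    }
}

fn demo() -> String {
    format!("{:?}", MyRef(&vec![1, 2]))
}

fn main() {
    assert_eq!(demo(), "[1, 2]");
}
```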

Packed types

Another topic which can usefully be thought about in terms of pirating is that of packed or bit-packed types. Most types in Rust have an alignment of more than 1, which means that pointers to them don't use all the bits of their storage (the lowest bit is always 0). Suppose you have an enum like this (this example is similar to something that came up in a program I was writing recently):

enum Tree<'a> {
    Leaf(&'a u64),
    Branch(&'a [Tree<'a>; 2]),
}

Presently, Rust will store this using an amount of memory that's twice the size of a pointer; for example, on a 64-bit system, it will use 64 bits to store the reference (with 3 of those bits always being 0), and another 64 bits to store the leaf/branch status (there is no gain from using fewer, because the Tree structure needs to have a size that's a multiple of 64 bits). It would obviously be much more memory-efficient to store this in half the size, by (e.g.) flipping the bottom bit of a Leaf reference so that it could be distinguished from a Branch reference; this is a "bit-packed" type because it's packing extra data into one of the known bits of a field.

The issue with packed and bit-packed types is that it's very difficult to use them soundly. A reference is supposed to point to memory that contains a value of the correct type: but (e.g.) if you tried to form a reference to the field of a Leaf, the memory it was pointing to would not contain a &'a u64 like it was supposed to, but rather a "&'a u64 with the bottom bit flipped" (which is not the same type because its memory representation is different). Rust's current solution to this sort of problem is "you have to move or copy the value out of the packed type before you can do anything with it". That works in the case of Tree above, because &'a u64 is Copy and so you can copy it out of the packed type freely.
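A hedged sketch of that bit-packed representation in unsafe code (the names are illustrative, and this is the idea rather than a vetted design): the pointer's low bit is the Leaf/Branch tag, and reading a field always copies the bits out and untags them, never forming a reference to the stored field – which is why the Copy-ness of &'a u64 matters.

```rust
use std::marker::PhantomData;

// One word instead of two: the low bit of the pointer is the tag.
struct PackedTree<'a> {
    tagged: usize,
    _life: PhantomData<&'a ()>,
}

// What you get back out: a freshly decoded, untagged reference.
enum TreeView<'a> {
    Leaf(&'a u64),
    Branch(&'a [PackedTree<'a>; 2]),
}

impl<'a> PackedTree<'a> {
    fn leaf(v: &'a u64) -> Self {
        // &u64 is aligned, so the bottom bit of the address is free.
        PackedTree { tagged: v as *const u64 as usize | 1, _life: PhantomData }
    }
    fn branch(b: &'a [PackedTree<'a>; 2]) -> Self {
        PackedTree { tagged: b as *const _ as usize, _life: PhantomData }
    }
    // No reference to the stored field is ever formed: the bits are
    // copied out of `self.tagged` and untagged first.
    fn view(&self) -> TreeView<'a> {
        if self.tagged & 1 == 1 {
            TreeView::Leaf(unsafe { &*((self.tagged & !1) as *const u64) })
        } else {
            TreeView::Branch(unsafe { &*(self.tagged as *const [PackedTree<'a>; 2]) })
        }
    }
}

fn sum(t: &PackedTree<'_>) -> u64 {
    match t.view() {
        TreeView::Leaf(v) => *v,
        TreeView::Branch(b) => sum(&b[0]) + sum(&b[1]),
    }
}

fn main() {
    let (a, b) = (1u64, 2u64);
    let kids = [PackedTree::leaf(&a), PackedTree::leaf(&b)];
    let root = PackedTree::branch(&kids);
    assert_eq!(sum(&root), 3);
    // Half the size of the two-word enum above:
    assert_eq!(std::mem::size_of::<PackedTree>(), std::mem::size_of::<usize>());
}
```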

What about packed types whose fields aren't Copy? In present Rust, they're almost completely unusable (you have to entirely destructure the type in order to do anything with its contents), because you can't copy the fields. However, rather than copying, what about pirating the fields? That gives you a little less power than copying or forming a reference would (e.g. you don't get to mutate the resulting value with interior mutability), but there are still a lot of things you can soundly do with it (and having a theory of pirating helps to clarify what those are). Without pirating existing as a language feature in Rust, the compiler won't be able to help you use packed fields directly; but it's still possible to use the theory to analyse unsafe code that does the same thing, and get a quick idea of whether it is likely to be sound or not. This line of reasoning also implies that adding a pirating operation to Rust might be beneficial even if it isn't used pervasively throughout the API; it would still be useful for forming "references" to packed fields.
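As a sketch of what pirating a non-Copy packed field might look like as unsafe code today (the struct and function names are hypothetical): make a raw bitwise copy of the field, but wrap it in ManuallyDrop so the duplicate is never dropped – and, matching the theory above, never let it outlive the original or be used to mutate.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

#[repr(packed)]
struct Record {
    name: String, // non-Copy field, possibly at an unaligned offset
    tag: u8,
}

// "Pirate" the field: a bitwise copy, subject to shared-reference-like
// rules (no mutation, must not outlive the original).
fn pirate_name(r: &Record) -> ManuallyDrop<String> {
    // SAFETY: addr_of! takes the field's address without forming a
    // (forbidden) reference to a packed field, read_unaligned tolerates
    // the packing, and ManuallyDrop stops the duplicate from freeing the
    // String's buffer, so there is no double drop.
    ManuallyDrop::new(unsafe { ptr::read_unaligned(ptr::addr_of!(r.name)) })
}

fn main() {
    let r = Record { name: String::from("hello"), tag: 1 };
    let pirated = pirate_name(&r);
    assert_eq!(pirated.len(), 5);
    assert_eq!(pirated.as_str(), "hello");
    let tag = r.tag; // Copy fields can simply be copied out, as before
    assert_eq!(tag, 1);
    // `pirated` must not outlive `r`, and must not be written through.
}
```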

The case for pirating in Rust

Normally I like to end blog posts with a conclusion. But in this case, the situation is more like "I had an idea, experimented with it, learned a lot of new things as a result, and it just led to more ideas – but I don't know whether they're good ideas and haven't formed an opinion on them". Sometimes (as with scoped generics) doing research leads to a conclusion, a thesis you can argue and try to convince other people of; and then you get a nice, self-contained blog post. But sometimes, rather than a thesis, you just end up with more questions: and that sort of result can also be useful in its own way.

So instead of a conclusion, I'm going to finish by arguing one side of a debate: presenting a list of arguments that maybe adding pirating to Rust would actually be a good idea after all. I haven't yet decided whether or not it would make a good addition to the existing language; I am pretty sure I would add it if I were redoing Rust from scratch, but the trade-offs for changing an existing language are much larger. I think the downsides are obvious; but the upsides (and arguments against the downsides) are much more interesting, and there are more of them than I expected, so I'm going to write them out and let readers make up their own minds.

A key observation is that the only observable difference between a pirated value and a shared reference to the original value is that the shared reference preserves the address. (You can convert a pirated value to a "shared reference but with the wrong address" by storing it in memory, then reinterpreting a pointer to it as a reference – Rust's code generation back-end already does that sort of thing automatically when passing a sufficiently large value to a function, so passing a large pirated value and passing a shared reference to that value are actually very similar from the point of view of the generated code.) That means that a pirating operation can alternatively be viewed as a "shared reference, but you don't do anything that depends on its address" operation (the main things this rules out are converting it to a pointer, and writing through it using interior mutability).

Rust's type system already tracks whether or not a reference is to a type with interior mutability, so the difficult part is working out whether or not a shared reference is ever converted to a pointer. This is a great time to deploy another lesson about adding features to a language: if the feature is something fundamental and important, you should be able to find individual uses or special cases of it in the compiler already (e.g. one of the things that helped convince me that scoped generics were the right design is that lifetimes are a special case of them, so they just generalise something that Rust already needed). In this case, the goal is to look for situations where the compiler is checking to see if a shared reference ever gets converted to a pointer, and acting differently based on whether it is or isn't.

And unsurprisingly, such situations do exist. The most important is in the compiler back-end: Rust's back-end, LLVM, looks for functions that take shared references to small values and never use the address, and converts them to take the value directly: in other words, it optimises referencing into pirating. Almost every nontrivial Rust program requires this optimisation to avoid losing performance, because many standard library methods take &T purely because they need to share the value, not because they need the address. So Rust is implementing a sort of "pirating inference": the lack of pirating in the language itself means that a literal compilation of the code would be slower than in other systems programming languages, and inferring pirating back into the program is used to regain the performance.

This leads to a lot of complexity that doesn't inherently need to exist; and forcing programmers to write simple code (pass by value) as something complex (pass by shared reference), then optimising it back into the simple version, is normally considered an anti-pattern. In this case, it's happening because current Rust conflates the parameter passing method (by-value or by-address) with the sharing rules (shared or exclusive). The complexity has practical downsides, too: if the inference goes wrong (e.g. due to optimiser bugs), you can end up with a program that's much slower than it should be (and such bugs are probably fairly common: it didn't take me long to find a recent such bug, and the comments on it even discuss the issues with capturing a small value by reference when it could have been captured by value). The complexity also likely slows down the compiler somewhat, because there are so many references that need to be optimised out and removing them is likely to take substantial time.

It turns out that knowing whether a reference is ever used as a pointer is useful for more than optimisations, too. Programs that mix Rust code with code in other languages often benefit from security mitigations, due to scenarios like "the Rust code creates a buffer and gives a pointer to it to the C code, and then the C code overflows it": if a pointer to a Rust buffer can escape to C code, it improves security to have the compiled Rust code check whether the memory immediately after the buffer has been overwritten before it tries to do anything critical with the memory further beyond. But too many checks can hurt performance. As such, it has been proposed that the Rust compiler insert such checks only for buffers for which a reference is ever converted to a pointer: in other words, this is another case where the compiler wants to act differently based on whether a reference is truly being used as a reference, or whether it's just being used to simulate pirating.

Trying to work out this sort of information by using a static analysis on every compile is inherently somewhat slow: the analysis could be skipped if information about the "reference used as pointer" status instead existed in the program, so it seems like something that, ideally, should be handled in the type system. As it happens, most Rust crates never actually need to convert shared references to pointers (even crates that use low-level unsafe code usually do it by storing pointers and converting them to references, rather than the other way round). So it's possible to imagine, e.g., a crate attribute that says "this crate doesn't do any shared-reference-to-pointer conversions" (perhaps accompanied by a separate type that represents a shared reference that can be converted to a pointer, although in a pinch you could use a shared reference to a mutable reference, & &mut). If a whole crate graph sets the attribute, then the Rust compiler would be able to take the relevant information directly from the type system, and not need to do a slow and error-prone static analysis (and if a crate didn't set the attribute, you could fall back to the old method of doing things). Perhaps it could even be made the default in a future edition of Rust.

And that seems like it would be a fairly small change. But once it had been made, you would have pirating in Rust: for T without interior mutability, &T would now have the semantics of pirating (with the compiler being able to freely choose between pass-by-value, pass-by-address, and pass-by-address-of-copy depending on which was most efficient). It wouldn't be a "perfect" implementation (e.g. &T and &&T would still be different types, and you would occasionally need a * in the source code even in cases where it doesn't correspond to an actual pointer dereference): but it would have the huge advantage of being consistent with how current Rust works, meaning that the existing tutorials would still work and programmers in general wouldn't need retraining.

There might even be a way to get benefits like "you can take references to packed fields now" (implemented as a copy, then either passing around the copy or a pointer to it), although that seems like it would probably be out of reach because it would stop working if someone added a dependency on an old crate that used the old, address-carrying definition of shared references.

So it seems like a world where pirating is added to Rust itself might actually be viable: it would have lower upsides than the "everything is written in terms of pirating now" world, but also lower downsides (and many programmers might not even notice the change). It might be interesting to implement it experimentally and see what the impact on compile times is like!

And even if it does turn out that adding pirating to Rust isn't worthwhile, we've still learnt a lot from the experiment: hints on how to write custom references, thoughts about the traits system, a suggestion to add shares of Rc and Arc, a bug caught, and a mathematical model of shared references that uses a part of linear logic that previously didn't seem useful. It's been fun. Perhaps I can go back to my experiment in implementing scoped generics now!