Rust Lifetimes or: How I Learned to Stop Free-ing and Love the Borrow

Lifetimes

<'a> ?

To me, one of the initial shocks of learning Rust was figuring out lifetimes. As a frontend-by-day developer I don't come face-to-face with the 'Double free' and 'Use after free' problems all that often. Actually, it could easily be argued that my backend brethren don't really either, nor, for that matter, does anyone who typically deals with a garbage-collected language. I'm looking at you, JS, Java, and Ruby devs. I'd bet most neckbea.. *cough*, excuse me, C developers are comfortable with these issues but, alas, I am not. As such, lifetimes were kinda difficult to wrap my head around, but I think I get them a little better now, so let me try to explain.

First, let's have a little warmup with two big problems that languages without garbage collection have (Rust doesn't have a GC, after all).

The 'Double free' and 'Use after free' problems

The double-free problem in C looks like this (I'll translate as much as possible for the non-C readers):

// allocate 10 bytes in memory for ptr{1,2,3}
ptr1 = malloc(10);
ptr2 = malloc(10);
ptr3 = malloc(10);

// `free(...)` deallocates the memory at these addresses
free(ptr1);
free(ptr2);
free(ptr1); // ptr1 freed twice!

// Now suppose we malloc again...

new_ptr1 = malloc(10);
new_ptr2 = malloc(10);
new_ptr3 = malloc(10);

Printing the addresses of these values...

// ptr1, ptr2, and ptr3, respectively
0x55e318171670
0x55e318171690
0x55e3181716b0

// new_ptr1, new_ptr2, new_ptr3, respectively
0x55e318171670 // new_ptr1 address, same memory address as new_ptr3
0x55e318171690
0x55e318171670 // new_ptr3 address, same memory address as new_ptr1

This is a massive problem! Two live pointers now refer to the same chunk of memory, which can lead to all sorts of unpredictable behavior in our program on top of being a huge security flaw. And it's pretty easy to do by accident. You just need two (or more) variables that reference the same value.

The use-after-free, which is as problematic (and in the same way) as double-free, looks like this:

p1 = (char *) malloc(SIZE);

free(p1);

// ... some code later...

strncpy(p1, argv[1], SIZE - 1);
// ^ What are we copying here??
// We already freed p1!

Obviously, these are artificial cases. In "real" programs this confusion would probably be distributed over hundreds of lines and dozens of files. But we can see exactly how SNAFUs can happen.
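As a quick preview of where this is heading, here's a rough Rust sketch of the double-free shape from above. The Buffer type and the count_frees function are made up for illustration; the counter just records how many times the value gets freed. The point is only that once ownership moves from ptr1 to ptr2, there's a single owner left to do the freeing.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type for illustration: bumps a shared counter when it is freed.
struct Buffer {
    frees: Rc<Cell<u32>>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        self.frees.set(self.frees.get() + 1);
    }
}

// Returns how many times the one buffer we created was freed.
fn count_frees() -> u32 {
    let frees = Rc::new(Cell::new(0));
    {
        let ptr1 = Buffer { frees: Rc::clone(&frees) };
        let ptr2 = ptr1; // ownership moves; ptr1 is no longer usable
        // drop(ptr1);   // would not compile: use of moved value
        drop(ptr2); // freed exactly once, right here
    }
    frees.get()
}

fn main() {
    println!("freed {} time(s)", count_frees());
}
```

Uncommenting the drop(ptr1) line turns the double free into a compile error instead of a runtime surprise.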

How Rust prevents chaos

Suppose I have the following struct and I attempt to get a value from it:

struct Person {
    name: String,
}

impl Person {
    pub fn new(name: String) -> Person {
        Person { name }
    }

    pub fn get_name(&self) -> String {
        self.name // tries to move `name` out from behind `&self`
    }
}

fn main() {
    let me = Person::new(String::from("Matt"));
    println!("{:?}", me.get_name()); // Problem?
}

The above won't work because we're trying to move name out of our Person instance (me), but get_name only borrows me through &self, and me still owns name, so we're out of luck. Compiler error. Denied. Alternatively, we could return a clone of the name value, since we're playing with a String and String implements the Clone trait:

// Previous code left out for brevity

pub fn get_name(&self) -> String {
  self.name.clone()
}

This works, but under the hood Rust is making a new copy of this buffer on our heap. That would probably be fine for small strings that won't be copied all the time, but what if all we want to do is read the value? This copying could spiral out of control under other scenarios (high-traffic servers, embedded systems with small amounts of memory, etc). Ideally, what we'd do is return a reference to the name variable from the get_name function:

pub fn get_name(&self) -> &String {
  &self.name
}
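To make the clone-versus-borrow difference a bit more concrete, here's a minimal sketch that compares buffer addresses. The names get_name_cloned and same_buffer are my own helpers for this illustration, not anything from the standard library.

```rust
struct Person {
    name: String,
}

impl Person {
    fn new(name: String) -> Person {
        Person { name }
    }

    // Allocates a fresh buffer on the heap and copies the bytes into it.
    fn get_name_cloned(&self) -> String {
        self.name.clone()
    }

    // Hands back a reference to the existing buffer; nothing is copied.
    fn get_name(&self) -> &String {
        &self.name
    }
}

// Hypothetical helper: true when both strings point at the same heap buffer.
fn same_buffer(a: &String, b: &String) -> bool {
    a.as_ptr() == b.as_ptr()
}

fn main() {
    let me = Person::new(String::from("Matt"));
    let cloned = me.get_name_cloned();
    let borrowed = me.get_name();

    println!("clone shares the buffer: {}", same_buffer(&cloned, &me.name));
    println!("borrow shares the buffer: {}", same_buffer(borrowed, &me.name));
}
```

The clone lives in its own allocation, while the borrowed version points straight at the String that me already owns.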

This also works, but it raises one teensy question: what if our Person goes out of scope while we still hold a reference to name through another variable? In C that would be the use-after-free problem all over again. Let's try it:

let name;
{
  let person = Person::new(String::from("Matt"));
  name = person.get_name();
  // Compiler error: value does not live long enough
} // person dropped here, while `name` still borrows from it
println!("{:?}", name);

You can get a hint of how lifetimes work if you think about that error for a second: "Value does not live long enough." Rust's compiler knows that the owner of that value won't be around long enough for the reference to be used by anything else. Put explicitly, person has an insufficient lifetime for the way name is used.
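One way to make the compiler happy, sketched under the assumption that we're free to move the declaration around, is to let the owner live at least as long as the reference does:

```rust
struct Person {
    name: String,
}

impl Person {
    fn new(name: String) -> Person {
        Person { name }
    }

    fn get_name(&self) -> &String {
        &self.name
    }
}

fn main() {
    // The owner now lives until the end of main...
    let person = Person::new(String::from("Matt"));
    let name;
    {
        // ...so a borrow taken in here may safely escape the inner block.
        name = person.get_name();
    }
    println!("{:?}", name);
}
```

Nothing about get_name changed. The only difference is that person now outlives every use of name.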

Rust is giving us a freebie here because it's such simple code: the compiler works out the lifetime of the returned reference on its own. It doesn't always work out that way though, and sometimes we're going to need to hold the compiler's hand, rub its back, affirm its safety, and let it know exactly what lifetime we're basing our references off of. Let's see an example of this hand-holding using similar code (with no brevity), this time with the lifetime spelled out:

struct Person {
    name: String,
}

impl Person {
    pub fn new(name: String) -> Person {
        Person { name }
    }

    pub fn get_name<'a>(&'a self) -> &'a String {
        &self.name
    }
}

fn main() {
    let person = Person::new(String::from("Matt"));
    let name = person.get_name();
    // Here the compiler ties `name` to the borrow of `person`,
    // i.e. 'a can last no longer than `person` does

    // Hundreds and thousands of lines of crazy code below...
}

In English, we're saying to the compiler,

"Hey, I'll be returning a reference from this function get_name that will be dependent on the lifetime 'a. You can keep an eye on 'a to be sure I'm using the reference properly."

In other words, 'a is a compile-time label which is associated with whatever the function is invoked with, or from. In this case that's the Person instance person. Once you've declared your functions, impls, or whatever with a lifetime parameter (<'a>) and annotated the references that use it, you don't need to do anything more. You don't even have to say at the call site which value is associated with 'a! The compiler handles the rest.
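A case where the compiler really can't guess for us is a function that takes two references and returns one of them. Here's a small sketch; the function name longest is just my pick for the example. Without the annotation, Rust has no way to know which input the result borrows from and refuses to compile the signature.

```rust
// Both inputs and the output share 'a, so the returned reference
// is only usable while both `x` and `y` are still alive.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let first = String::from("borrow");
    let second = String::from("free");

    let winner = longest(&first, &second);
    println!("{}", winner);
}
```

Notice that the call site never mentions 'a. The compiler picks a lifetime that fits both arguments and checks every use of winner against it.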

By convention we use 'a, 'b, 'c, etc. to name our lifetimes. Keep in mind that you can have multiple lifetimes that you'll need the compiler to keep tabs on. In this case your function signature may look like:

fn some_function<'a, 'b>(x: &'a f64, y: &'b f64) {/* ... */}
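To see why you'd bother with a second lifetime, here's a minimal sketch with a made-up helper called before. The return value is tied to 'a only, so the argument marked 'b is free to disappear as soon as the call is done.

```rust
// The result borrows only from `text` ('a).
// `sep` ('b) is needed just for the duration of the call.
fn before<'a, 'b>(text: &'a str, sep: &'b str) -> &'a str {
    match text.find(sep) {
        Some(index) => &text[..index],
        None => text,
    }
}

fn main() {
    let text = String::from("Matt Borrow");
    let first;
    {
        let sep = String::from(" "); // dropped at the end of this block
        first = before(&text, &sep);
    }
    // Still valid: `first` is tied to `text`, not to the dropped `sep`.
    println!("{}", first);
}
```

If both parameters shared a single lifetime 'a, the compiler would have to assume first might borrow from sep too, and main would be rejected.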

This is good stuff. Rust avoids the runtime overhead of garbage collection while also avoiding the nastiness of double-free and use-after-free. There is some cognitive overhead to learning the philosophy, but once you get the hang of it...not too bad.
