Rust Borrow Checker: “New Variable is All You Need.”

As is known to all, doubly-linked list is notoriously hard to implement in Rust. Recently, I was trying to implement a doubly-linked list in Rust without unsafe blocks. Here is part of the implementation.

use std::cell::RefCell;
use std::rc::Rc;

type Link<T> = Rc<RefCell<Node<T>>>;

#[derive(Debug)]
pub struct LinkedList<T> {
    head: Link<T>,
    tail: Link<T>,
    len: usize,
}

#[derive(Debug)]
struct Node<T> {
    data: Option<T>,
    prev: Option<Link<T>>,
    next: Option<Link<T>>,
}

impl<T> Node<T> {
    fn new(data: Option<T>, prev: Option<Link<T>>, next: Option<Link<T>>) -> Node<T> {
        Node::<T> { data, prev, next }
    }
}

impl<T> LinkedList<T> {
    /// Constructor
    pub fn new() -> LinkedList<T> {
        // initialize dummy head and tail
        let dummy_head = Rc::new(
            RefCell::new(
                Node::new(None, None, None)
            )
        );
        let dummy_tail = Rc::new(
            RefCell::new(
                Node::new(None, None, None)
            )
        );

        // connect dummy head and tail
        dummy_head.borrow_mut().next = Some(dummy_tail.clone());
        dummy_tail.borrow_mut().prev = Some(dummy_head.clone());

        LinkedList::<T> {
            head: dummy_head,
            tail: dummy_tail,
            len: 0,
        }
    }

    /// Get the number of elements in the list.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Insert an element at position `index`, shifting all elements after it
    /// to the right
    /// 
    /// # Panics
    /// 
    /// Panics if `index > len` or `index < 0`
    pub fn insert(&mut self, index: usize, data: Option<T>) {
        // sanity check
        if index > self.len {
            panic!("index is {index}, which is greater than len.");
        }

        // get the node before and after the index
        let mut cur_pos = self.head.clone();
        for _i in 0 .. index {
            cur_pos = cur_pos.borrow().next.as_ref().unwrap().clone();
        }
        let next = cur_pos.borrow().next.as_ref().unwrap().clone();

        // construct new node
        let new_node = Rc::new(
            RefCell::new(
                Node::new(
                    data, 
                    Some(cur_pos.clone()), 
                    Some(next.clone())
                )
            )
        );

        // change pointer
        cur_pos.borrow_mut().next = Some(new_node.clone());
        next.borrow_mut().prev = Some(new_node.clone());
    }
}

Unfortunately, Rust compiler will complain at line 72 about the usage of borrow. On that line, we are trying to update the cur_pos to make it point to the next node.

error[E0506]: cannot assign to `cur_pos` because it is borrowed
    --> src/main.rs:72:13
     |
72   |             cur_pos = cur_pos.borrow().next.as_ref().unwrap().clone();
     |             ^^^^^^^   ----------------                               - ... and the borrow might be used here, when that temporary is dropped and runs the destructor for type `Ref<'_, Node<T>>`
     |             |         |
     |             |         `cur_pos` is borrowed here
     |             |         a temporary with access to the borrow is created here ...
     |             `cur_pos` is assigned to here but it was already borrowed
     |
     = note: borrow occurs due to deref coercion to `RefCell<Node<T>>`

This kind of error is common for programmers who migrate from other programming language. Though the compiler generates detailed error messages for us, the message is hard to comprehend. Let's analyze the error step by step. In the first step, we apply a dot operation on the cur_pos, which is a Rc that implements Deref. So, deref coercion happens here. The Rust Compiler will rewrite our code as follows:

(*cur_pos).borrow(); // step 1: dot operator dereferences smart pointers automatically
(*(cur_pos.deref())).borrow(); // step 2: dereferencing is converted to a call to deref() and a plain dereference, because Rc implements Deref.

If you have no idea about how conversion works at the second step, I would recommend you to go over The Book. Especially, pay attention to the following section:

When we entered *y in Listing 15-9, behind the scenes Rust actually ran this code:
*(y.deref())
Rust substitutes the * operator with a call to the deref method and then a plain dereference so we don’t have to think about whether or not we need to call the deref method. This Rust feature lets us write code that functions identically whether we have a regular reference or a type that implements Deref.

If you feel comfortable about the conversion, let's look into the definition of Deref.

pub trait Deref {
    type Target: ?Sized;

    // Required method
    fn deref(&self) -> &Self::Target;
}

Notice that the parameter of deref() is &self. This means deref() borrows cur_pos. So, a temporary, right? Unfortunately, it's not the case. In order to understand that there is no temporaries generated, we need to look into the definition of temporaries in Rust:

When using a value expression in most place expression contexts, a temporary unnamed memory location is created and initialized to that value. The expression evaluates to that location instead, except if promoted to a static. The drop scope of the temporary is usually the end of the enclosing statement.

Here, place expression is also known as lvalue and value expression is also known as rvalue. The definition tells us that temporaries will only be generated if we put some (not any) rvalue to the place where a rvalue should be placed. At the current step, it's not the case. So, no temporaries are generated.

Let's look at the next step, borrow().next. This is where a temporary is generated. Although borrow() returns a Ref<Node<T>>, which is perfectly fine to apply .next, but by definition, it's a method call, which is a rvalue. Since it is placed to the left of struct member access operator, which is the place for a lvalue, a temporary is generated for Ref<Node<T>>. It's scope extend until the enclosing of the current statement (i.e. the place of the semicolon).

The temporary scope of an expression is the scope that is used for the temporary variable that holds the result of that expression when used in a place context, unless it is promoted.
Apart from lifetime extension, the temporary scope of an expression is the smallest scope that contains the expression and is one of the following:

The second operand of a lazy boolean expression.

The entire function.

A statement.

The body of an if, while or loop expression.

The else block of an if expression.

The condition expression of an if or while expression, or a match guard.

The body expression for a match arm.

After complete the following evaluation on the right side of assignment, we assign it to the right side. However, Ref<Node<T>> still remains alive. This cause the problem. This is because when we have a shared reference to a memory location (in our case, it's a Node<T>), all its ancestors (i.e. other references / names that have direct or indirect access to it) are set to be read-only. So, the assignment operation is rejected by the compiler.

To illustrate a more general case, I use a picture from Chapter 5, Programming Rust, 2nd Edition.

Luckily, this problem can be easily solved by breaking the long statement into a shorter one.


let next = cur_pos.borrow().next.as_ref().unwrap().clone();
cur_pos = next;

cur_pos = {
    let borrowed = cur_pos.borrow();
    borrowed.next.as_ref().unwrap().clone()
};

The first fix works because the temporary is dropped at the first semicolon, so cur_pos becomes writable again. The second works because the temporary is bound to a local variable borrowed, which is dropped at the enclosing brace. However, the following fix does not work:

cur_pos = { cur_pos.borrow().next.as_ref().unwrap().clone() };

Braces do not constrain the scope of temporaries, so the temporary is still dropped at the semicolon.

Although I learned a lot due to this compiler error, it still gives me a lesson about Rust programming: Use new variables instead of write long expressions. New variable is all you need to avoid such compiler error.

Rust Borrow Checker: “New Variable is All You Need.”

RefMut is Not Mutable Reference

getline() is Not a C standard Library Function

Comments NOTHING

Cancel Reply