One of the most common criticisms of Rust is that it is difficult to develop applications that rely on shared pointers. It is true that Rust’s memory safety guarantees can make such algorithms harder to write but, as we will see, the standard library provides types that allow this behavior safely. In this article, we’ll see how to overcome the issue of shared pointers in Rust to increase efficiency.

This article is an extract from Rust High Performance, authored by Iban Eguia Moraza.

Overcoming the issue with the cell module

The Rust standard library has one interesting module, the std::cell module, that allows us to use objects with interior mutability. This means that we can have an immutable object and still mutate the data inside it. This, of course, would not comply with Rust’s usual mutability rules, but the cells make sure this works safely, either by checking the borrows at runtime or by moving or copying the underlying data in and out instead of handing out references.

Cells

Let’s start with the basic Cell structure. A Cell contains a mutable value, but that value can be changed without the Cell itself being mutable. It has three main methods of interest: set(), swap(), and replace(). The first allows us to set the contained value, replacing it with a new one. The previous value will be dropped (its destructor will run). That last bit is the only difference from the replace() method, which returns the previous value instead of dropping it. The swap() method, on the other hand, takes another Cell and swaps the values between the two. All this happens without the Cell needing to be mutable. Let’s see it with an example:

use std::cell::Cell;

#[derive(Copy, Clone)]
struct House {
    bedrooms: u8,
}

impl Default for House {
    fn default() -> Self {
        House { bedrooms: 1 }
    }
}

fn main() {
    let my_house = House { bedrooms: 2 };
    let my_dream_house = House { bedrooms: 5 };

    let my_cell = Cell::new(my_house);
    println!("My house has {} bedrooms.", my_cell.get().bedrooms);

    my_cell.set(my_dream_house);
    println!("My new house has {} bedrooms.", my_cell.get().bedrooms);

    let my_new_old_house = my_cell.replace(my_house);
    println!(
        "My house has {} bedrooms, it was better with {}",
        my_cell.get().bedrooms,
        my_new_old_house.bedrooms
    );

    let my_new_cell = Cell::new(my_dream_house);
    my_cell.swap(&my_new_cell);
    println!(
        "Yay! my current house has {} bedrooms! (my new house {})",
        my_cell.get().bedrooms,
        my_new_cell.get().bedrooms
    );

    let my_final_house = my_cell.take();
    println!(
        "My final house has {} bedrooms, the shared one {}",
        my_final_house.bedrooms,
        my_cell.get().bedrooms
    );
}

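As you can see in the example, reading the value with get() requires the contained type to be Copy, because get() returns a copy of the value. The set(), replace(), and swap() methods work with any type, but if you need references to non-Copy data you will want a RefCell, which we will see next. Continuing with this Cell example, running the code prints the following:

My house has 2 bedrooms.
My new house has 5 bedrooms.
My house has 2 bedrooms, it was better with 5
Yay! my current house has 5 bedrooms! (my new house 2)
My final house has 5 bedrooms, the shared one 1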

So we first create two houses, select one of them as the current one, and keep replacing the current and the new ones. As you might have seen, I also used the take() method, which is only available for types implementing the Default trait. This method returns the current value and replaces it with the default value. As you can see, you never mutate the value inside in place; you replace it with another value, and you can either retrieve the old value or lose it. Also, the get() method gives you a copy of the current value, not a reference to it. That’s why get() is only available when the contained type implements Copy. Since a Cell never hands out references to its contents, it does not need to check borrows dynamically at runtime.
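To pin down exactly where the Copy requirement applies, here is a minimal sketch using a Cell that holds a String, which is not Copy. In current Rust, replace() and take() move values in and out, so they work here, while get() would not compile. The rename() helper is a hypothetical name used only for illustration.

```rust
use std::cell::Cell;

// Hypothetical helper for illustration: stores a new name in a shared
// (non-mut) Cell<String> and hands back the previous one.
fn rename(name: &Cell<String>, new_name: &str) -> String {
    // replace() moves the old String out, so no Copy bound is needed.
    name.replace(new_name.to_owned())
}

fn main() {
    let name = Cell::new(String::from("old house"));

    let previous = rename(&name, "dream house");
    println!("previous: {}", previous);

    // take() returns the current value and leaves String::default()
    // (an empty string) behind.
    let current = name.take();
    println!("current: {}", current);
    println!("left behind: {:?}", name.into_inner());

    // Calling name.get() anywhere above would not compile:
    // get() requires the contained type to be Copy.
}
```

The value is always moved as a whole; at no point does the Cell hand out a reference to the String inside.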

RefCell

RefCell is similar to Cell, except that it is designed for non-Copy data. This means that it cannot simply copy the underlying object when you ask for it; it has to return a reference. In the same way, when you want to mutate the object inside, it returns a mutable reference. This only works because it checks dynamically, at runtime, whether a conflicting borrow exists before handing out a new one: a mutable borrow requires that no other borrow is active, and a shared borrow requires that no mutable borrow is active. If the check fails, the thread will panic.

Instead of using the get() method as in Cell, RefCell has two methods to get the underlying data: borrow() and borrow_mut(). The first returns a read-only borrow, and you can have as many of those as you want alive at the same time. The second returns a read-write borrow, and you can only have one alive at a time, following the mutability rules. If you call borrow_mut() while a borrow() is still alive, or borrow() while a borrow_mut() is still alive, the thread will panic.
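A minimal sketch of these two methods follows. The push_guest() helper and the guest list are made up for illustration; the point is that the RefCell itself is never declared mut, and that each borrow ends when its guard is dropped.

```rust
use std::cell::RefCell;

// Hypothetical helper: appends to a Vec held in a non-mut RefCell and
// returns the new length.
fn push_guest(guests: &RefCell<Vec<String>>, name: &str) -> usize {
    // The read-write guard only lives until the end of this statement.
    guests.borrow_mut().push(name.to_owned());
    let len = guests.borrow().len();
    len
}

fn main() {
    let guests = RefCell::new(Vec::new());
    push_guest(&guests, "Ana");
    push_guest(&guests, "Ben");

    {
        // Any number of read-only borrows may be alive at once.
        let first = guests.borrow();
        let second = guests.borrow();
        println!("{} guests, the first one is {}", first.len(), second[0]);
        // Calling guests.borrow_mut() here would panic, because `first`
        // and `second` are still alive.
    } // both read-only guards are dropped here

    // With no borrow alive, a read-write borrow is fine again.
    guests.borrow_mut().clear();
    println!("{} guests left", guests.borrow().len());
}
```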

There are two non-panicking alternatives to these borrows: try_borrow() and try_borrow_mut(). These two will try to borrow the data (the first read-only and the second read/write), and if there are incompatible borrows present, they will return a Result::Err, so that you can handle the error without panicking.
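As a rough illustration, the following sketch uses try_borrow_mut() to attempt a write and report failure instead of panicking. The try_increment() helper is a hypothetical name, not part of the standard library.

```rust
use std::cell::RefCell;

// Hypothetical helper: tries to increment the counter and reports whether
// the write was possible, instead of panicking on a conflicting borrow.
fn try_increment(counter: &RefCell<u32>) -> bool {
    match counter.try_borrow_mut() {
        Ok(mut value) => {
            *value += 1;
            true
        }
        // Another borrow is still alive, so we back off.
        Err(_) => false,
    }
}

fn main() {
    let counter = RefCell::new(0);

    let reader = counter.borrow();
    // A read-only borrow is alive, so the write attempt fails gracefully.
    println!("while reading: {}", try_increment(&counter));
    drop(reader);

    // No borrow is alive any more, so the write succeeds.
    println!("after dropping the reader: {}", try_increment(&counter));
    println!("value: {}", counter.borrow());
}
```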

Both Cell and RefCell have a get_mut() method, that will get a mutable reference to the element inside, but it requires the Cell / RefCell to be mutable, so it doesn’t make much sense if you need the Cell / RefCell to be immutable. Nevertheless, if in a part of the code you can actually have a mutable Cell / RefCell, you should use this method to change the contents, since it will check all rules statically at compile time, without runtime overhead.
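For instance, a function that already receives exclusive access can skip the runtime checks entirely. This is only a sketch, and reset() and its parameters are hypothetical names.

```rust
use std::cell::{Cell, RefCell};

// Hypothetical helper: with exclusive (&mut) access to the cells, the
// compiler already knows nobody else can be borrowing the contents, so
// get_mut() hands out plain &mut references with no runtime check.
fn reset(scores: &mut RefCell<Vec<u32>>, level: &mut Cell<u8>) {
    scores.get_mut().clear();
    *level.get_mut() = 1;
}

fn main() {
    let mut scores = RefCell::new(vec![10, 20]);
    let mut level = Cell::new(7);

    reset(&mut scores, &mut level);

    println!("{} scores, level {}", scores.borrow().len(), level.get());
}
```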

Interestingly enough, RefCell does not return a plain reference to the underlying data when we call borrow() or borrow_mut(). You would expect them to return &T and &mut T (where T is the wrapped element). Instead, they will return a Ref and a RefMut, respectively. This is to safely wrap the reference inside, so that the lifetimes get correctly calculated by the compiler without requiring references to live for the whole lifetime of the RefCell. They implement Deref into references, though, so thanks to Rust’s Deref coercion, you can use them as references.
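The following sketch shows the guard type and the coercion in action. The total() and sum_shared() functions are hypothetical; total() knows nothing about RefCell and simply takes a slice.

```rust
use std::cell::{Ref, RefCell};

// Takes a plain slice reference and knows nothing about RefCell.
fn total(values: &[u32]) -> u32 {
    values.iter().sum()
}

// Hypothetical helper: sums the contents of a shared RefCell.
fn sum_shared(cell: &RefCell<Vec<u32>>) -> u32 {
    // borrow() returns a Ref guard, not a &Vec<u32>.
    let guard: Ref<'_, Vec<u32>> = cell.borrow();
    // Deref coercion turns &Ref<Vec<u32>> into &Vec<u32> and then &[u32].
    total(&guard)
} // the guard is dropped here, which ends the borrow

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    println!("sum = {}", sum_shared(&cell));

    // The guard from sum_shared() is gone, so a read-write borrow works.
    cell.borrow_mut().push(4);
    println!("sum = {}", sum_shared(&cell));
}
```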

Overcoming the issue with the rc module

The std::rc module contains reference-counted pointers that can be used in single-threaded applications. They have very little overhead, thanks to counters not being atomic counters, but this means that using them in multithreaded applications could cause data races. Thus, Rust will stop you from sending them between threads at compile time. There are two structures in this module: Rc and Weak.

An Rc is an owning pointer to the heap. It is similar to a Box, except that the value can have several owners, which are tracked with a reference count. Cloning an Rc increases the count by 1; when an Rc goes out of scope, the count decreases by 1, and when it reaches 0, the contained object is dropped.
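The counting can be observed with Rc::strong_count(). The sketch below records the count at three points; count_steps() is a hypothetical helper written only for this illustration.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the strong count before cloning, while a
// second owner is alive, and after that second owner is dropped.
fn count_steps() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let before = Rc::strong_count(&first);

    // Rc::clone() only bumps the counter; the String is not copied.
    let second = Rc::clone(&first);
    let during = Rc::strong_count(&first);

    drop(second);
    let after = Rc::strong_count(&first);

    (before, during, after)
} // `first` is dropped here, the count reaches 0 and the String is freed

fn main() {
    let (before, during, after) = count_steps();
    println!("before: {}, during: {}, after: {}", before, during, after);
}
```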

Since an Rc is a shared reference, it cannot be mutated, but a common pattern is to use a Cell or a RefCell inside the Rc to allow for interior mutability.
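A minimal sketch of that pattern, assuming a made-up Account type and deposit() helper: two owners share the same value, and either of them can mutate it through the RefCell.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type, used only for illustration.
struct Account {
    balance: i64,
}

// Hypothetical helper: mutates the shared account through any owner.
fn deposit(account: &Rc<RefCell<Account>>, amount: i64) {
    // The Rc gives shared ownership, the RefCell gives interior mutability.
    account.borrow_mut().balance += amount;
}

fn main() {
    let account = Rc::new(RefCell::new(Account { balance: 100 }));
    let other_owner = Rc::clone(&account);

    deposit(&account, 50);
    deposit(&other_owner, 25);

    // Both owners point to the same account, so both deposits are visible.
    println!("balance: {}", account.borrow().balance);
}
```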

An Rc can be downgraded to a Weak pointer, which holds a non-owning reference to the value on the heap. When the last Rc goes away and the value inside is dropped, it does not matter whether there are still Weak pointers to it. This means that a Weak pointer will not always point to a valid value and therefore, for safety reasons, the only way to access the value behind a Weak pointer is to upgrade it to an Rc, which could fail. The upgrade() method returns an Option, which is None if the value has already been dropped.
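Here is a small sketch of downgrading and upgrading. The read() helper is a hypothetical name; it returns a copy of the value only if the value is still alive.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper: reads through a Weak pointer, returning None if
// the value has already been dropped.
fn read(weak: &Weak<String>) -> Option<String> {
    // upgrade() gives back an Rc only while at least one Rc still exists.
    weak.upgrade().map(|strong| (*strong).clone())
}

fn main() {
    let strong = Rc::new(String::from("still here"));
    let weak = Rc::downgrade(&strong);

    // The value is alive, so the upgrade succeeds.
    println!("{:?}", read(&weak));

    // Dropping the only Rc drops the String, even though `weak` exists.
    drop(strong);
    println!("{:?}", read(&weak));
}
```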

Let’s check all this by creating an example binary tree structure:

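use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Tree<T> {
    root: Node<T>,
}

struct Node<T> {
    // Weak, so that a child does not keep its parent alive. The children
    // are stored as Rc<RefCell<Node<T>>>, so downgrading one of those
    // pointers yields a Weak<RefCell<Node<T>>>.
    parent: Option<Weak<RefCell<Node<T>>>>,
    left: Option<Rc<RefCell<Node<T>>>>,
    right: Option<Rc<RefCell<Node<T>>>>,
    value: T,
}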

In this case, the tree has a root node, and each node can have up to two children. We call them left and right because binary trees are usually drawn with one child on each side. Each node has a pointer to each of its children, and it owns them. This means that when a node loses all references to it, it will be dropped, and its children with it (as long as nothing else holds an Rc to them).

Each child also has a pointer to its parent. The main issue with this is that, if the child had an Rc pointer to its parent, the two nodes would keep each other alive: their reference counts would never reach zero, and neither would ever be dropped. This is a reference cycle, and to avoid it, the pointer to the parent is a Weak pointer.
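To see the effect of the Weak parent pointer, here is a sketch under one stated assumption: every node, including the root, lives in an Rc<RefCell<..>>, so that parent links can be created with Rc::downgrade(). The Link alias and the leaf(), set_left(), and parent_value() helpers are hypothetical and not taken from the book.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Assumption for this sketch: all nodes, including the root, are stored
// in an Rc<RefCell<..>>.
type Link<T> = Rc<RefCell<Node<T>>>;

#[allow(dead_code)] // `left` and `right` are never read in this sketch
struct Node<T> {
    parent: Option<Weak<RefCell<Node<T>>>>,
    left: Option<Link<T>>,
    right: Option<Link<T>>,
    value: T,
}

// Hypothetical constructor for a node without parent or children.
fn leaf<T>(value: T) -> Link<T> {
    Rc::new(RefCell::new(Node {
        parent: None,
        left: None,
        right: None,
        value,
    }))
}

// Hypothetical helper: attaches `child` as the left child of `parent`.
fn set_left<T>(parent: &Link<T>, child: Link<T>) {
    // The child only gets a Weak pointer back to its parent.
    child.borrow_mut().parent = Some(Rc::downgrade(parent));
    // The parent owns the child through a strong Rc pointer.
    parent.borrow_mut().left = Some(child);
}

// Hypothetical helper: the parent's value, if the parent is still alive.
fn parent_value<T: Clone>(node: &Link<T>) -> Option<T> {
    let parent = node.borrow().parent.as_ref()?.upgrade()?;
    let value = parent.borrow().value.clone();
    Some(value)
}

fn main() {
    let root = leaf(1);
    let child = leaf(2);
    set_left(&root, Rc::clone(&child));

    println!("parent of child: {:?}", parent_value(&child));
    // The child's Weak link does not count as an owner of the root.
    println!("root strong count: {}", Rc::strong_count(&root));

    // Dropping the last strong reference drops the root node, even
    // though the child still holds a Weak pointer to it.
    drop(root);
    println!("parent after drop: {:?}", parent_value(&child));
}
```

Had the parent field been an Rc instead, the root's strong count would have been 2, and dropping the root variable would have left both nodes alive with no way to free them.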

So, you’ve finally understood how Rust manages shared pointers for complex structures, where the Rust borrow checker can make your coding experience much more difficult. If you found this article useful and would like to learn more such tips, head over to pick up the book, Rust High Performance, authored by Iban Eguia Moraza.
