5 Feb, 2022

Downcasting in Rust

As you progress in your journey as an intermediate rustacean and beyond, you will begin to see (and write) more and more code involving traits and generics. (You can write a lot of straightforward programs without getting too involved with those things.) Eventually, you will find yourself in a situation where you have a value of a generic type, or one that simply implements a trait, but you need to do something with the specific, concrete type.

Some people would say that this isn’t good practice, because you are piercing the abstraction boundary: if your function takes a generic T, then it should not need anything specific to a u8 or a String. Nonetheless, sometimes it is just easier to do a quick and dirty cast from the supertype to the subtype. That cast is called a downcast, because it goes down in the type hierarchy (the opposite is an upcast). Downcasting is quite commonplace: some of the most popular crates (such as anyhow), and even a type in std, offer functions for exactly this.
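For instance, with anyhow (assuming the crate is available as a dependency), recovering the concrete error type hidden behind an anyhow::Error looks roughly like this:

use anyhow::Error;

fn describe(err: &Error) {
    // try to recover the concrete error type that was erased into anyhow::Error
    if let Some(io_err) = err.downcast_ref::<std::io::Error>() {
        println!("an I/O error: {:?}", io_err.kind());
    } else {
        println!("some other error: {}", err);
    }
}

fn main() {
    let err = Error::from(std::io::Error::from(std::io::ErrorKind::NotFound));
    describe(&err);
}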

In this post we will take a look at a couple of solutions to that problem in Rust, and towards the end of the post, as in my previous rambling, we will go deep into the weeds to better understand trait objects.

Use an enum

While that’s technically not a solution to the topic of the post, it may be the solution for your particular case. I don’t want to spend too much time on this case because it’s well covered by the book. The code below shows a simple example, in case this is new to you:

enum Value {
    Int(isize),
    Text(String)
}

// take the supertype and use the subtype
fn print_value(val: Value) {
    match val {
        Value::Int(number) =>
            println!("A number w/ {} ones in binary", number.count_ones()),
        Value::Text(string) =>
            println!("A string as bytes: {:?}", string.as_bytes())
    }
}
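
A quick usage sketch:

fn main() {
    print_value(Value::Int(5));
    print_value(Value::Text(String::from("hello")));
}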

This is Rust’s way of implementing types that are an OR of two or more types (also known as sum types or tagged unions). There is a crucial limitation to this solution: it only works if you have a fixed group of types to consider. It may also be the case that you receive the generic type or trait object from somewhere else, so the enum solution does not apply.

A Quick Trait Recap

Since the rest of this post will involve traits, let’s do a quick recap. Something that took me a while to get used to is the fact that there are two very distinct ways of using traits. One is via generics: you can put a trait bound on a generic type with the T: Trait syntax, meaning that the generic type T has to implement Trait, otherwise the code won’t compile. The impl Trait syntax (in argument position) is just syntactic sugar for a generic type T with T: Trait as a bound. The following are equivalent:

fn use_generics<T: Trait>(x: T) {}
fn use_impl(x: impl Trait) {}

In this use of traits, what happens is straightforward. The compiler does the famous monomorphisation: it generates a non-generic copy of each generic function for every combination of concrete types with which it is called. The trait bounds don’t affect this process much, apart from causing a compile error if they are violated.
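As a rough illustration (not literal compiler output, and using Display only for convenience), a generic function called with two different concrete types ends up as two separate non-generic functions in the binary:

use std::fmt::Display;

fn announce<T: Display>(value: T) {
    println!("value: {}", value);
}

// after monomorphisation the binary conceptually contains something like
// (real symbol names are mangled):
//
//   fn announce_u8(value: u8) { println!("value: {}", value); }
//   fn announce_string(value: String) { println!("value: {}", value); }

fn main() {
    announce(5_u8);                  // instantiates announce::<u8>
    announce(String::from("hello")); // instantiates announce::<String>
}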

The second use of traits is the so-called trait object. A trait object is created via coercion (or explicitly via casting), when a value that implements the trait is passed somewhere a dyn Trait is expected, like so:

let trait_obj: &dyn Display = &5u8;
println!("I can display: {}", trait_obj);

// let x: u8 = *trait_obj;
// error: expected `u8`, found trait object `dyn std::fmt::Display`

As you can see, once we make a trait object, we can’t simply downcast it back to the original type with coercion or casting. The type has been erased, just like in the first case (generics with trait bounds). Despite this similarity, trait objects are nothing like the generic trait-bound types in the way they are physically represented.

First, they can only be used behind a pointer (like &T or Box<T>, for example), because the compiler cannot know their concrete type at compile time, and therefore cannot know how much space it would need to allocate for them on the stack (similar to str or slice types, which cannot be used directly). For example, imagine that a dyn Display could be a tiny u8 or a user-defined 2GB struct. As a result, trait objects are not Sized. Non-Sized types are known as unsized types or dynamically sized types. They are represented behind fat pointers, which are twice the size of a thin (regular) pointer: one word (8 bytes on a 64-bit machine) holds the address of the value, and the other holds the address of a vtable with metadata (more on this later). I want to note here that – contrary to what I thought – the object itself doesn’t have to be on the heap; an allocation is not automatically done when you create a trait object (e.g. like this: &10 as &dyn Debug).

Second, since the compiler doesn’t know the type at compile time, it also doesn’t know which concrete functions implement the trait for a particular trait object. This means that the addresses of these functions have to be resolved at runtime via the vtables mentioned above. This is called dynamic dispatch.
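As a small illustration (again using Display just for convenience), the function below is compiled exactly once; the Display::fmt call behind the {} is resolved at runtime through the vtable of whatever concrete type is behind the reference:

use std::fmt::Display;

// compiled once; the call to Display::fmt goes through the trait object's vtable
fn show(value: &dyn Display) {
    println!("showing: {}", value);
}

fn main() {
    show(&5_u8);                  // dispatches via u8's vtable
    show(&String::from("hello")); // dispatches via String's vtable
}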

A Solution with Trait-Bound Generics

Back to our problem, let’s start with the easier case, remaking the enum example but now with traits and static dispatch:

trait Value {
    fn as_int(&self) -> Option<isize> { None }
    fn as_text(&self) -> Option<&str> { None }
}

impl Value for isize {
    fn as_int(&self) -> Option<isize> { Some(*self) }
}

impl Value for String {
    fn as_text(&self) -> Option<&str> { Some(self) }
}

fn print_value<T: Value>(val: T) {
    if let Some(num) = val.as_int() {
        println!("A number w/ {} ones in binary", num.count_ones());
    }
    if let Some(string) = val.as_text() {
        println!("A string as bytes: {:?}", string.as_bytes());
    }
}

fn main() {
    print_value(110);
    print_value(String::from("hello"));
}

One advantage of this approach is that it works even if you can’t make changes to the code that sends you the T. Maybe you are working on a large codebase or a library and your users want your function to take a generic T, so you cannot change your function to receive an enum. In this case, the only (breaking) change to the API that you would have to make in your library is to add a bound T: Value. Then you just need to implement the trait for the relevant types. If a new type needs to be supported later, you only have to add a method with a default implementation to the trait (and implement the trait for that type), which is not a breaking change. Also, static dispatch is usually faster (the compiler knows which functions to call, so it doesn’t have to look them up in a vtable at runtime).
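For example, here is a sketch of such a non-breaking extension (the as_float method and the f64 support are hypothetical additions, not part of the earlier example):

trait Value {
    fn as_int(&self) -> Option<isize> { None }
    fn as_text(&self) -> Option<&str> { None }
    fn as_float(&self) -> Option<f64> { None } // new defaulted method: non-breaking
}

// supporting the new type is purely additive
impl Value for f64 {
    fn as_float(&self) -> Option<f64> { Some(*self) }
}

fn main() {
    println!("{:?}", 3.14_f64.as_float()); // Some(3.14)
}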

Still, this solution leaves a lot to be desired when it comes to flexibility. And monomorphisation has the downside of generating more code, so if you use the generic functions with lots of different types, your binary can get bloated.

The Any Trait

Now we will explore the dynamic language side of Rust with the Any trait. In dynamic languages, downcasting is trivial thanks to runtime reflection, more specifically, runtime metadata about types. In Rust, types are mostly a compile-time concept; they don’t really exist when your program is running. There is no general way of knowing the type of something: your values are just bytes in memory, without any metadata. The Any trait partially solves this problem.

With Any, we can write code like this to easily downcast values:

use std::any::Any;

fn log(value: &dyn Any) {
    match value.downcast_ref::<String>() {
        Some(text) => println!("Bytes of the string: {:?}", text.as_bytes()),
        None => println!("No string...")
    };
}

fn main() {
    // &T is coerced into &dyn Any:
    log(&String::from("hello"));
    log(&10);
}

The advantage of this solution with trait objects is that it is the most flexible, and the generated code is (probably) smaller because the compiler doesn’t have to create copies of the functions for each concrete use of a generic type. The cost is that it now needs to check vtables to find the function implementations for each trait object, so trait method calls are a bit slower.

The Any trait is implemented automatically for all types that do not contain non-'static references. When the trait is in scope, you can call the type_id method on a value to find out its type:

use std::any::{Any, TypeId};
let val: u8 = 10;
println!("Type ID of u8: {:?}", val.type_id());
assert_eq!(val.type_id(), TypeId::of::<u8>());

We can also use the is method if the value is a dyn Any trait object:

let x = &true as &dyn Any;
assert!(x.is::<bool>());
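
Besides references, boxed trait objects can be downcast by value with Box::<dyn Any>::downcast, which consumes the box and gives it back on failure:

use std::any::Any;

fn main() {
    let boxed: Box<dyn Any> = Box::new(String::from("hello"));
    // downcast consumes the Box and returns it untouched in the Err case
    match boxed.downcast::<String>() {
        Ok(text) => println!("got the String back: {}", text),
        Err(_) => println!("not a String"),
    }
}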

How Does Any Work?

The type metadata for Any is provided by TypeId, which relies on compiler intrinsics to do its job. That means that the implementation is part of rustc itself, not just a function like any other in std. All that we can see in the sources of std is:

#[rustc_const_unstable(feature = "const_type_id", issue = "77125")]
pub fn type_id<T: ?Sized + 'static>() -> u64;

Any call to TypeId::of::<T>() in your code will resolve to a constant, calculated at compile time. This is how reflection works in Rust. Now what about the downcast_ref and downcast_mut methods of Any? After checking that the correct type is being cast, they just do an unsafe cast of the pointer, as we can see in downcast_ref_unchecked:

unsafe { &*(self as *const dyn Any as *const T) } // self is &dyn Any
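
The safe downcast_ref is essentially that cast guarded by a type check; here is a simplified sketch of the same idea as a free function (not the actual std source):

use std::any::Any;

fn downcast_ref<T: Any>(value: &dyn Any) -> Option<&T> {
    if value.is::<T>() {
        // SAFETY: is::<T>() just confirmed that the erased type really is T
        Some(unsafe { &*(value as *const dyn Any as *const T) })
    } else {
        None
    }
}

fn main() {
    let v: &dyn Any = &String::from("hello");
    assert!(downcast_ref::<String>(v).is_some());
    assert!(downcast_ref::<u8>(v).is_none());
}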

One interesting thing that can be learned by looking at the code in std::any is that you can have an impl dyn Trait block, which implements methods only for the trait object version of the trait. That is, these methods are not available on a generic T with T: Trait, only on dyn Trait itself. Note also that a reference to a trait object does not automatically implement the trait (i.e. &dyn Trait: Trait can be false).

In fact, we can do the Any-style downcast with any trait; the problem is just that we need to know the concrete type ourselves:

let x = &String::from("hello") as &dyn Trait;
let y: &String = unsafe { &*(x as *const dyn Trait as *const String) };
println!("bytes: {:?}", y.as_bytes());

We could even implement our own unsafe downcast function on dyn Trait:

trait Trait {}
impl Trait for String {}
impl Trait for u8 {}

impl dyn Trait {
    // SAFETY: I hope you know what you're doing
    unsafe fn downcast<T>(&self) -> &T {
        &*(self as *const dyn Trait as *const T)
    }
}

fn main() {
    let a: &dyn Trait = &42_u8;
    let b: &dyn Trait = &String::from("hello");

    let _number: u8 = *unsafe { a.downcast::<u8>() };
    let _text: &str = unsafe { b.downcast::<String>() };
}

The bottom line is that we don’t even need the Any trait: what is really essential is TypeId::of::<T>(). Any just makes our lives easier by exposing this functionality through a stable API.
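To see that, here is a sketch of how the earlier downcast on dyn Trait could be made safe by pulling in Any as a supertrait and checking the TypeId before the cast (again, just an illustration):

use std::any::{Any, TypeId};

trait Trait: Any {}
impl Trait for String {}
impl Trait for u8 {}

impl dyn Trait {
    // a safe variant of the earlier downcast: check the TypeId before casting
    fn downcast<T: Any>(&self) -> Option<&T> {
        if self.type_id() == TypeId::of::<T>() {
            // SAFETY: the TypeId check guarantees the pointee really is a T
            Some(unsafe { &*(self as *const dyn Trait as *const T) })
        } else {
            None
        }
    }
}

fn main() {
    let a: &dyn Trait = &42_u8;
    assert!(a.downcast::<String>().is_none());
    assert_eq!(*a.downcast::<u8>().unwrap(), 42);
}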

Inspecting a Trait Object

As promised, now we will dive deep into the representation of trait objects. First, let’s check that the fat pointer is indeed wide:

println!("{}", std::mem::size_of::<&String>());
println!("{}", std::mem::size_of::<&[u8]>());
println!("{}", std::mem::size_of::<&dyn X>());

This will give us 8, 16 and 16, respectively, on a 64-bit machine. So the pointers to trait objects and slices are indeed wide.

The first word in the fat pointer for a trait object is the address of the value, and the second is the address of a vtable (for the slice, the second word is just the length). The vtable has the following layout (all fields are word-sized):

pointer to drop_in_place
size
align
pointer to fn 1
pointer to fn 2
pointer to fn N

The pointers to fn 1 etc. have the addresses of the concrete implementations of the trait functions. I saw references to this layout in a couple of blog posts, for example here, but I have not seen this in the official docs or in the language reference. So let’s verify that this is correct in practice.

First let’s create and implement a trait for String and u64:

trait Trait {
    fn do_something(&self);
    fn do_something_else(&self);
}

impl Trait for String {
    fn do_something(&self) { println!("a string: {}", self); }
    fn do_something_else(&self) { println!("a string: {}", self); }
}

impl Trait for u64 {
    fn do_something(&self) { println!("a u64: {}", self); }
    fn do_something_else(&self) { println!("a u64: {}", self); }
}

Please don’t use the following block of code. It’s a quick and dirty trick that works for this demonstration but shouldn’t be relied upon for anything serious. For example, it doesn’t work well for types smaller than a word, because we simply reinterpret the data as a slice of usize. That said, here is the function we will use to “inspect” a fat pointer, making some assumptions:

fn analyse_fatp<T: ?Sized>(p: *const T, datasize: usize, vtsize: usize) {
    let addr = &p as *const *const T as *const usize;
    let second = (addr as usize + std::mem::size_of::<usize>()) as *const usize;
    let datap = unsafe { *addr } as *const usize;
    let vtp = unsafe { *second } as *const usize;
    let data = unsafe { std::slice::from_raw_parts(datap, datasize) };
    let vtable = unsafe { std::slice::from_raw_parts(vtp, vtsize) };
    let vtable = vtable
        .iter()
        .map(|val| format!("0x{:x}", val))
        .collect::<Vec<_>>();

    println!("Addr of fat pointer (1st word): {:p}", addr);
    println!("Addr of fat pointer (2nd word): {:p}", second);
    println!("Addr of data:                   {:p}", datap);
    println!("Addr of vtable:                 {:p}", vtp);
    println!("Data:   {:?}", data);
    println!("VTable: {:?}", vtable);
}

Note that we used the trait bound T: ?Sized. All generic type parameters have an implicit trait bound of T: Sized, so we use ?Sized to opt out of it. This means that T does not have to be Sized, and indeed, since we want to pass a fat pointer to this function, the pointed-to object (a dyn Trait) will be unsized. In summary, this function takes the address of the fat pointer it receives and treats it as a 2-word value. The first word is the pointer to the data, the value itself. The second word is the pointer to the vtable. We pass the sizes of the data and vtable manually. We know the vtable has 5 entries because its layout is as mentioned before (address of drop_in_place, size and alignment of the data, address of do_something and address of do_something_else; each of these fields is 1 word, or 8 bytes, on 64-bit systems).

Now we create some trait objects and analyse the fat pointers with that function:

let obj: &dyn Trait = &String::from("hello");
dbg!(String::do_something as *const ());
dbg!(String::do_something_else as *const ());
analyse_fatp(obj, std::mem::size_of::<String>() / std::mem::size_of::<usize>(), 5);

let obj: &dyn Trait = &12_u64;
dbg!(u64::do_something as *const ());
dbg!(u64::do_something_else as *const ());
analyse_fatp(obj, std::mem::size_of::<u64>() / std::mem::size_of::<usize>(), 5);

Playground link. We get:

[src/main.rs:38] String::do_something as *const () = 0x00005576c9fe48c0
[src/main.rs:39] String::do_something_else as *const () = 0x00005576c9fe4940
[src/main.rs:43] u64::do_something as *const () = 0x00005576c9fe49c0
[src/main.rs:44] u64::do_something_else as *const () = 0x00005576c9fe4a40

Addr of fat pointer (1st word): 0x7ffc1d178fd0
Addr of fat pointer (2nd word): 0x7ffc1d178fd8
Addr of data:                   0x7ffc1d179438
Addr of vtable:                 0x5576ca02a150
Data:   [93968711141840, 5, 5]
VTable: ["0x5576c9fdfa80", "0x18", "0x8", "0x5576c9fe48c0", "0x5576c9fe4940"]

Addr of fat pointer (1st word): 0x7ffc1d178fd0
Addr of fat pointer (2nd word): 0x7ffc1d178fd8
Addr of data:                   0x5576ca01a250
Addr of vtable:                 0x5576ca02a1f8
Data:   [12]
VTable: ["0x5576c9fdf9b0", "0x8", "0x8", "0x5576c9fe49c0", "0x5576c9fe4a40"]

What to make of this? First, the data: for the String we have an address first (the address where the str data lives), then a length and a capacity (both 5). For the u64 it’s just a 12, as expected. Now the most interesting part, the vtables. Look how the addresses in the last two positions match the addresses of the functions that we printed with dbg!. The size and alignment are also correct: 0x18 is hexadecimal for 24, which is the size of a String in bytes. Both types have an alignment of 8, which is the size of a word.

As a word of caution, keep in mind that this is just an exercise. I don’t know if rustc gives us any guarantees that the vtable layout will always be like this. There is an RFC with the goal of exposing the metadata of fat pointers in a more user-friendly way, which already seems to be usable on nightly.
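A rough sketch of what that looks like (nightly only; the exact shape of the ptr_metadata feature may still change, so treat this as an assumption):

#![feature(ptr_metadata)]

use std::fmt::Debug;
use std::ptr;

fn main() {
    let obj: &dyn Debug = &String::from("hello");
    // metadata() returns a DynMetadata, which exposes the erased type's size and alignment
    let meta = ptr::metadata(obj as *const dyn Debug);
    println!("size: {}, align: {}", meta.size_of(), meta.align_of());
}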

As a bonus for the post, I wanted to include a wildly unsafe demonstration of how to build a trait object from scratch, using mem::transmute, and show how to call the functions above by their addresses. But as usual the post is already very long, so we will leave that for another time.