Babelmonkeys

home

Dynamically Sized Types in Rust

18 Mar 2014

Disclaimer: This is my understanding of the current plans for dynamically sized types. The actual implementation may vary.

Some time ago it was decided dynamically sized types (DST) are going to be introduced into the Rust programming language. This post is meant to be a high level introduction to the concept. A more in depth explanations is provided on Nicholas Matsakis' Blog.

What introducing DSTs means in particular is that the special ~[T] and ~str types are going to vanish from the language. Instead the new dynamically sized types [T] and Str are going to be introduced. Trait objects (written ~Trait) will also be expressed in terms of DST, introducing a dynamically sized Trait type.

The special characteristic of DSTs is that their size is only known at runtime, i.e. dynamically. Therefore, similarly to C's void type, a dynamically sized type can not be instantiated. DSTs are only meaningful when pointed to. Pointers to dynamically sized types are so called fat pointers. They consist of a pointer to the actual object and an additional word. For a pointer to [T] or Str the additional word is the length of the vector, or string. For a pointer to Trait the additional word is a pointer to a vtable for Trait.

The rest of this post will discuss the current and future situation concerning vectors, strings, and trait objects.

Now

In current Rust there is a set of special types, that do not follow the general composition rule that a ~U is an owned pointer to a U. These are ~[T], ~str, and ~Trait. The same is true for &U not being a reference to a U, for &[T], &str, and &Trait.

Vectors

Currently there are four vector related types:

Fixed-size vectors are the simplest vector type. They are allocated with an arbitrary but fixed size. This size is part of their type. They can and should be used whenever the necessary amount of storage is known at compile time. Unlike all other vector types they can be allocated on the stack. A simple usage example is:

1     let v: [int, ..4] = [1, 2, 3, 4];
2     let sum: int = v.iter().fold(0, |a, &b| a + b);
3     println!("{}", sum);

Unique vectors are currently the primarily used type for growable vectors. They can grow and shrink at runtime, when indexed into the index is, at runtime, checked against the dynamic length. Typical usage is something like:

1     let mut v: ~[int] = ~[];
2     v.push(42);
3     let answer: int = v.pop();

The Vec<T> type exists in preparation for a DST world. It is the destined replacement for the current ~[T]. It will implement all of the methods currently available for ~[T]. Therefore, it can be used as follows, to perform the same task as above:

1     let mut v: Vec<int> = Vec::new();
2     v.push(42);
3     let answer: int = v.pop();

Since there is no literal syntax for Vec<T> the vec!() macro is provided to enable easily creating new vectors. E.g. ~[1, 2, 3] becomes vec!(1, 2, 3).

A vector slice (&[T]) is a view into a vector. It enables you to work on a contiguous part of a vector (a slice). Typical use-cases for a slice are giving a function access to a vector, without copying it, or working on a subset of vector elements:

1     let v: [int, ..4] = [1, 2, 3, 4];
2     // Create a slice containing the last two elements
3     let s: &[int] = v.slice(2, 4);
4     // Add the last two elements
5     let sum: int = s.iter().fold(0, |a, &b| a + b);
6     println!("{}", sum);

Strings

In current Rust the two string related types are:

A unique string is semantically almost equivalent to a unique vector. The internal representation is actually just a ~[u8]. The main difference is that a ~str enforces that the contained bytes represent a valid UTF-8 string. It can be used as follows:

1     let mut s: ~str = ~"Hello";
2     s.push_str(" World");
3     println!("{}", s);

Likewise, a string slice is very similar to a vector slice. It provides a view of a sub-string. E.g.:

1     let s: &'static str = "Hello World";
2     let sub: &'static str = s.slice(6, 11);
3     println!("{}", sub);

Trait Objects

Trait objects allow you to refer to any object implementing a certain trait. This is useful if you need objects to implement certain functionality, but do not necessarily care about their exact type. A quick example is:

 1     trait Drive { fn drive(&self, where: &str); }
 2     struct Bycicle;
 3     struct Tardis;
 4 
 5     impl Bycicle {
 6         fn pedal(&self, w: &str) { println!("{}", w) }
 7     }
 8     impl Tardis {
 9         fn warp(&self, w: &str) { println!("{}", w) }
10     }
11 
12     impl Drive for Bycicle {
13         fn drive(&self, where: &str) { self.pedal(where) }
14     }
15     impl Drive for Tardis {
16         fn drive(&self, where: &str) { self.warp(where) }
17     }
18 
19     fn go_home(vehicle: ~Drive) {
20         vehicle.drive("home");
21     }
22 
23     fn main() {
24         go_home(~Tardis as ~Drive);
25     }

With dynamically sized types

Once DSTs are implmented composition will hold true for all types. The type ~[T] is simply an owning pointer to the dynamically sized type [T]. However, this implies that this type now has different semantics, as discussed in the following.

Vectors

With DSTs there are only three vector related types:

Fixed-size vectors and the Vec<T> type remain exactly the same after DSTs have been introduced. Much more interesting is how the ~[T], and &[T] types behave with DSTs.

~[T] is now an owning pointer to a dynamically sized vector. A literal to allocate an object of this type does no longer exist. Instead a literal written as ~[1i, 2, 3, 4] now has the type ~[int, ..4], i.e. it is an owning pointer to a fixed-size vector. A ~[T] is created from a fixed-sized vector by coercion (or explicit casting). This looks as follows:

1     let v1: ~[int, ..4] = ~[1, 2, 3, 4];
2     let v2: ~[int, ..3] = ~[5, 6, 7];
3     // Move v1 and v2 into vv
4     // The length is erased from the type
5     let vv: [~[int], ..2] = [v1 as ~[int], v2 as ~[int]];

&[T] is a reference to a dynamically sized vector. Conveniently this means it behaves almost exactly like a slice. A useful interpretation is that it points to a sub-vector of an existing one. A reference to a fixed-size vector (&[T, ..n]) can be coerced into this type. This allows for the following:

1     let v1: [int, ..4] = [1, 2, 3, 4];
2     let v2: [int, ..3] = [5, 6, 7];
3     let vv: [&[int], ..2] = [&v1 as &[int], &v2 as &[int]];

Strings

The situation for string types after DSTs are implemented is still a bit uncertain. Likely there will be two string types:

The StrBuf type will replace the ~str type. It will implement the same methods and be growable. The string example from above could look like this:

1     let mut s: StrBuf = StrBuf::new();
2     s.push_str("Hello");
3     s.push_str(" World");
4     println!("{}", s);

The Str type will be equivalent to the [u8] type, except for implementing different methods, and enforcing that the contained bytes make up a valid UTF-8 string.

Trait Objects

With respect to the above examples, trait objects will behave exactly the same. However, Trait will become a proper (dynamically sized) type, which makes it usable in generics. Another potential change is that &T: Trait will likely become coercible to &Trait.