Generic Types, Traits, and Lifetimes
For the sake of conciseness, I'm not explaining generics here. It is the same concept as in Java. This section will include Rust specific syntax and behaviors.
Generic Data Types
In Function Definitions
fn largest<T>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
let result = largest(&number_list);
println!("The largest number is {}", result);
let char_list = vec!['y', 'm', 'a', 'q'];
let result = largest(&char_list);
println!("The largest char is {}", result);
}
fn largest<T>
- declares a function calledlargest
that has a generic type,T
.(list: &[T])
- states the method takes in a parameter that is a slice type with the contents of the slice being of typeT
-> &T
- declares the funtion returns a reference to aT
type
This won't compile because there is no guarantee that T
has a >
method implemented:
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0369]: binary operation `>` cannot be applied to type `T`
--> src/main.rs:5:17
|
5 | if item > largest {
| ---- ^ ------- T
| |
| T
|
= note: `T` might need a bound for `std::cmp::PartialOrd`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0369`.
error: could not compile `chapter10`.
To learn more, run the command again with --verbose.
We'll need to use a trait, which we'll get to in a moment.
In Struct Definitions
We can also define structs using generics:
struct Point<T> {
x: T,
y: T,
}
fn main() {
let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
}
In Enum Definitions
We can also define enums using generics:
enum Result<T, E> {
Ok(T),
Err(E),
}
We say that "Struct
is generic over two types, T
and E
".
In Method Definitions
struct Point<T> {
x: T,
y: T,
}
impl<T> Point<T> {
fn x(&self) -> &T {
&self.x
}
}
We have to declare <T>
after impl
so we can say we're defining a method for a generic type. You could also define methods for a specific type:
impl Point<f32> {
fn distance_from_origin(&self) -> f32 {
(self.x.powi(2) + self.y.powi(2)).sqrt()
}
}
The generics used in a struct doesn't necessarily match those used in the struct's method signatures:
struct Point<T, U> {
x: T,
y: U,
}
impl<T, U> Point<T, U> {
fn mixup<V, W>(self, other: Point<V, W>) -> Point<T, W> {
Point {
x: self.x,
y: other.y,
}
}
}
Performance of Code Using Generics
Rust implements generics in a way that your code doesn't run any slower using generics types than it would using concrete types.
It achieves this by performing monomorphization at compile type.
Monomorphization - process of turning generic code into specific code by filling in the concrete types that are used when compiled.
Let's take a look at an example with the Option
enum:
let integer = Some(5);
let float = Some(5.0);
The compiler would see that the Option
enum is being used for two types: i32
and f64
. It expands the definition of Option<T>
into Option_i32
and Option_f64
.
The compiled code will look like this, with Option<T>
replaced:
enum Option_i32 {
Some(i32),
None,
}
enum Option_f64 {
Some(f64),
None,
}
fn main() {
let integer = Option_i32::Some(5);
let float = Option_f64::Some(5.0);
}
Traits: Defining Shared Behavior
Trait - tells the Rust compiler about a functionality that a type has that can be shared with other types.
We use traits to define shared behavior in an abstract way. We can use trait bounds to specific that a generic can be any type that has a certain behavior.
This is basically interfaces in Java, with some caveats.
Defining a Trait
A type’s behavior consists of the methods we can call on that type. Different types share the same behavior if we can call the same methods on all of those types. Trait definitions are a way to group method signatures together to define a set of behaviors necessary to accomplish some purpose.
pub trait Summary {
fn summarize(&self) -> String;
}
The methods have ;
at the end because the definition/implementation of the method will be defined by the types that implement this trait.
Implementing a Trait on a Type
Implementing a trait on a type is similar to implementing regular methonds. The main difference is after impl
, we put the trait name, then use the for
keyword, and then specify the name of the type you want to implement the trait for.
pub struct NewsArticle {
pub headline: String,
pub location: String,
pub author: String,
pub content: String,
}
impl Summary for NewsArticle {
fn summarize(&self) -> String {
format!("{}, by {} ({})", self.headline, self.author, self.location)
}
}
pub struct Tweet {
pub username: String,
pub content: String,
pub reply: bool,
pub retweet: bool,
}
impl Summary for Tweet {
fn summarize(&self) -> String {
format!("{}: {}", self.username, self.content)
}
}
We can then use summarize
as if it was defined in the type:
let tweet = Tweet {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
retweet: false,
};
println!("1 new tweet: {}", tweet.summarize());
We can only implement a trait on a type only if either the trait or the type is local to our crate.
- We can implement
Display
(from the standard library) forTweet
. - We can implement
Summary
forVec<T>
. - We cannot implement
Display
forVec<T>
.
We can't do the third because of coherence, or more specifically the orphan rule (the parent type is not present). This rule is to ensure no one else breaks your code and vice versa. Without this rule, two crates can create an implementation for the same type and Rust wouldn't know what to do.
Default Implementations
pub trait Summary {
fn summarize(&self) -> String {
String::from("(Read more...)")
}
}
Rather than a ;
, we can define the methods in the trait for a default implementation.
Traits as Parameter
We can use traits to define functions:
pub fn notify(item: &impl Summary) {
println!("Breaking news! {}", item.summarize());
}
Trait Bound Syntax
This is a syntatical sugar for trait bound syntax. The below is equivalent:
pub fn notify<T: Summary>(item: &T) {
println!("Breaking news! {}", item.summarize());
}
The former way is a more concise way, but the latter can express more complicated functions.
If we had a function that took in two Summary
data types, it would like this:
pub fn notify(item1: &impl Summary, item2: &impl Summary) {
We can pass in an implementation of summary to the first and second.
If we wanted to make sure that both arguments are of the same type (that implements Summary
), we need to do:
pub fn notify<T: Summary>(item1: &T, item2: &T) {
This ensures that both are of the same type (but implements Summary
).
Specifying Multiple Trait Bounds with the +
Syntax
pub fn notify(item: &(impl Summary + Display)) {
or
pub fn notify<T: Summary + Display>(item: &T) {
Clearer Trait Bounds with where
Clauses
A function with multiple generics, each with their own trait bounds, can get pretty verbose and hard to read:
fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {
Instead, Rust has the keyword where
:
fn some_function<T, U>(t: &T, u: &U) -> i32
where T: Display + Clone,
U: Clone + Debug
{
This makes the signature less cluttered, and human readable in a way.
Returning Types that Implement Traits
We can also use the impl Trait
syntax as a return type:
fn returns_summarizable() -> impl Summary {
Tweet {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
retweet: false,
}
}
This syntax only allows one type to be returned. The following would return an error at compile time:
fn returns_summarizable(switch: bool) -> impl Summary {
if switch {
NewsArticle {
headline: String::from(
"Penguins win the Stanley Cup Championship!",
),
location: String::from("Pittsburgh, PA, USA"),
author: String::from("Iceburgh"),
content: String::from(
"The Pittsburgh Penguins once again are the best \
hockey team in the NHL.",
),
}
} else {
Tweet {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
retweet: false,
}
}
}
NewsArticle
and Tweet
both implement Summary
but the function can only return one of these, due to limitation of the compiler. We will get to how we can make this work later.
Fixing the largest Function with Trait Bounds
Let's revisit our largest
function:
fn largest<T>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
We can add a trait bound to allow the use of >
:
fn largest<T: PartialOrd>(list: &[T]) -> T {
let mut largest = list[0];
for &item in list {
if item > largest {
largest = item;
}
}
largest
}
but another issue arises:
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0508]: cannot move out of type `[T]`, a non-copy slice
--> src/main.rs:2:23
|
2 | let mut largest = list[0];
| ^^^^^^^
| |
| cannot move out of here
| move occurs because `list[_]` has type `T`, which does not implement the `Copy` trait
| help: consider borrowing here: `&list[0]`
error[E0507]: cannot move out of a shared reference
--> src/main.rs:4:18
|
4 | for &item in list {
| ----- ^^^^
| ||
| |data moved here
| |move occurs because `item` has type `T`, which does not implement the `Copy` trait
| help: consider removing the `&`: `item`
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0507, E0508.
For more information about an error, try `rustc --explain E0507`.
error: could not compile `chapter10`.
To learn more, run the command again with --verbose.
Data types stored on the stack implement a Copy
trait, which allows us to move list[0]
into largest
. With generics, we do not know if the data type implements this trait.
Quick fix is to add another bound trait:
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
let mut largest = list[0];
for &item in list {
if item > largest {
largest = item;
}
}
largest
}
Other approaches:
- If we don't want to limit to
Copy
we can have it bound byClone
, and then clone the element in the logic. We would potentially be making more heap allocations in this case.
fn largest<T: PartialOrd + Clone>(list: &[T]) -> T {
let mut largest = list[0].clone();
for item in list {
if item > &largest {
largest = item.clone();
}
}
largest
}
- We can implement
largest
to return a reference toT
value. We do not needClone
orCopy
in this case.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list.iter() {
if item > largest {
largest = &item;
}
}
largest
}
Using Trait Bounds to Conditionally Implement Methods
We can conditionally create methods on types using trait bounds on impl
blocks.
use std::fmt::Display;
struct Pair<T> {
x: T,
y: T,
}
impl<T> Pair<T> {
fn new(x: T, y: T) -> Self {
Self { x, y }
}
}
impl<T: Display + PartialOrd> Pair<T> {
fn cmp_display(&self) {
if self.x >= self.y {
println!("The largest member is x = {}", self.x);
} else {
println!("The largest member is y = {}", self.y);
}
}
}
Here we define new
for all Pair<T>
, but define cmp_display
only for Pair<T>
where T
implements Display
and PartialOrd
.
We can also conditionally implement a trait for any type that implements another trait. This is called blanket implementations.
These are used a lot in the Rust library. An example - implementing the ToString
trait for types that implement Display
trait.
impl<T: Display> ToString for T {
// --snip--
}
Because the standard library has this blanket implementation, we can call to_string
on any type that implements the Display
trait.
Blanket implementations appear in the documentation for the trait in the “Implementors” section.
Validating References with Lifetimes
Every reference has a lifetime - the scope for which that reference is valid.
Most cases, the lifetime is inferred and implicit, just like how types are inferred most of the time. We need to annotate types when mutliple types are possible).
Similar to this, Rust requires us to annotate lifetimes when the lifetimes of references could be related in more than one way.
Rust requires us to annotate the relationships using generic lifetime parameters to ensure the references at runtime are valid.
Preventing Dangling References with Lifetimes
The main goal of lifetime is to prevent dangling references - which cause a program to reference data other than the data it was meant to reference:
{
let r;
{
let x = 5;
r = &x;
}
println!("r: {}", r);
}
- In the outer scope, we declare
r
- In the inner scope, we declare
x
- In the inner scope, we set
r
as a reference tox
- Inner scope ends
- We print
r
.
This won't compile because the value r
is trying to reference has gone out of scope. x
"didn't live long enough", since its scope ends in the inner scope. r
is still valid for the outer scope, so "it lives longer".
Rust uses a borrow checker to validate this.
Borrow Checker
The borrow checker compares scopes to determine whether all borrows are valid.
{
let r; // ---------+-- 'a
// |
{ // |
let x = 5; // -+-- 'b |
r = &x; // | |
} // -+ |
// |
println!("r: {}", r); // |
} // ---------+
lifetime of r is annotated with 'a
and lifetime of x is annotated with 'b
. Rust compares the two lifetimes and sees that the reference lives longer than the subject of the reference.
To fix this:
{
let x = 5; // ----------+-- 'b
// |
let r = &x; // --+-- 'a |
// | |
println!("r: {}", r); // | |
// --+ |
} // ----------+
Here r
can reference x
because its lifetime is shorter.
Generic Lifetimes in Functions
Working example - write a function that returns the longer of two string slices. The function should take in two string slices and return a single string slice.
fn main() {
let string1 = String::from("abcd");
let string2 = "xyz";
let result = longest(string1.as_str(), string2);
println!("The longest string is {}", result);
}
fn longest(x: &str, y: &str) -> &str {
if x.len() > y.len() {
x
} else {
y
}
}
The compiler gives us an error:
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0106]: missing lifetime specifier
--> src/main.rs:9:33
|
9 | fn longest(x: &str, y: &str) -> &str {
| ^ expected lifetime parameter
|
= help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0106`.
error: could not compile `chapter10`.
To learn more, run the command again with --verbose.
Rust cannot tell whether the reference being returned is x
or y
. We actually don't know either. The borrow checker cannot tell the concrete lifetimes of the parameters being passed in and the concrete lifetime being returned.
To fix this error, we’ll add generic lifetime parameters that define the relationship between the references so the borrow checker can perform its analysis.
Lifetime Annotation Syntax
The lifetime annotation doesn't change the lifetime references.
A function can accept a reference with any lifetime by specifying a generic lifetime parameter, just like functions can accept any type when the signature specifies a generic type parameter.
Lifetime annotations describe the relationships of the lifetimes of multiple references to each other without affecting the lifetimes.
The names of lifetime parameters must start with '
, usually lowercase and very short. Most people uses 'a
. We place the lifetime parameter annotations after &
, with a space before the reference's type.
Example:
&i32 // a reference
&'a i32 // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
A single lifetime annotation doesn't have much meaning. The annotations are meant to tell Rust how lifetime annotation parameters of multiple references relate to each other.
For example, in a function we can define two parameters with the same lifetime annotation parameter to indicate the two have the same lifetime.
Lifetime Annotations in Function Signatures
Like with generic type parameters, we define generic lifetime parameters inside angle brackets between the function name and parameter list.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() {
x
} else {
y
}
}
In the above example, we want the two parameters and the return type to all have the same lifetime, so we declare one lifetime annotation parameter and assign it to each of those.
The function signature now tells Rust that for some lifetime 'a
, the function takes two parameters (both of which are string slices with a lifetime of at least as long as lifetime of 'a
).
This means that the reference returned by the function is the same as the shorter of the two parameters' lifetimes.
Setting the parameter to 'a
doesn't change the lifetime of any reference, rather the value of 'a
is determined by the references.
When we pass the two concrete lifetimes of the references, the concrete lifetime that is substituted for 'a
is the scope of x
that overlaps with the scope of y
- aka it will be the shorter of the two lifetimes - aka it will be as long as both lifetimes are valid.
Let's take a look at some examples:
fn main() {
let string1 = String::from("long string is long");
{
let string2 = String::from("xyz");
let result = longest(string1.as_str(), string2.as_str());
println!("The longest string is {}", result);
}
}
string1
is valid until the end of the outer scope, string2
is valid until the end of the inner scope. This means result
is valid until the end of the inner scope.
fn main() {
let string1 = String::from("long string is long");
let result;
{
let string2 = String::from("xyz");
result = longest(string1.as_str(), string2.as_str());
}
println!("The longest string is {}", result);
}
This fails compilation.
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0597]: `string2` does not live long enough
--> src/main.rs:6:44
|
6 | result = longest(string1.as_str(), string2.as_str());
| ^^^^^^^ borrowed value does not live long enough
7 | }
| - `string2` dropped here while still borrowed
8 | println!("The longest string is {}", result);
| ------ borrow later used here
error: aborting due to previous error
For more information about this error, try `rustc --explain E0597`.
error: could not compile `chapter10`.
To learn more, run the command again with --verbose.
Rust says string2
didn't live long enough. It knows this because the function signature has generic lifetime annotation parameter among the parameters and return reference. It knows the lifetime of the returned reference is as long as string2
, so it cannot use the reference outside the scope of string2
.
Thinking in Terms of Lifetimes
The way you need to specify lifetime parameters depends on your function.
Example:
fn longest<'a>(x: &'a str, y: &str) -> &'a str {
x
}
We do not need an annotation on y
because the lifetime of y
has no relation with the lifetime of x
or the return value.
The lifetime parameter for the return type needs to match the lifetime parameter for one of the parameters when returning a reference from a function. If it didn't, that means it is referring to a value created in the function, meaning the value will become invalid outside of the function, making the compiler not happy.
This won't compile. Even though we have a lifetime parameter on the return type, it isn't related to one of the parameters of the function. More importantly, it is a dangling reference, so the whole lifetime parameter issue is moot anyway:
fn longest<'a>(x: &str, y: &str) -> &'a str {
let result = String::from("really long string");
result.as_str()
}
Giving us this error message:
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0515]: cannot return value referencing local variable `result`
--> src/main.rs:11:5
|
11 | result.as_str()
| ------^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `result` is borrowed here
error: aborting due to previous error
For more information about this error, try `rustc --explain E0515`.
error: could not compile `chapter10`.
To learn more, run the command again with --verbose.
Lifetime Annotations in Struct Definitions
It is possible for structs to hold references, but we would need to add a lifetime annotation on every reference in the struct's definition.
Quick refresher - why doesn't Rust have the lifetime to be equal to that of the struct?
The struct has a reference to a value and doesn't own it. It doesn't control when it becomes out of scope.
struct ImportantExcerpt<'a> {
part: &'a str,
}
fn main() {
let novel = String::from("Call me Ishmael. Some years ago...");
let first_sentence = novel.split('.').next().expect("Could not find a '.'");
let i = ImportantExcerpt {
part: first_sentence,
};
}
Here, ImportantExcerpt
has a field that is a string slice. Note the angle brackets after the struct name. This annotation means an instance of ImportantExcerpt
can’t outlive the reference it holds in its part field.
In the example above, ImportantExcerpt
is created in scope after novel
which is what provides the part
field its value. The struct doesn't go out of scope after novel
.
Lifetime Elision
The following method compiles without lifetime annotations:
fn first_word(s: &str) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
We expect the refernces to have lifetime annotations:
fn first_word<'a>(s: &'a str) -> &'a str {
but this got super repetitive. The compiler was made to infer the lifetimes in these specific examples.
Lifetime elision rules are these patterns programmed into Rust's analysis of references. If the compiler finds a specific pattern, we do not need to specify the lifetimes.
The rules do not provide full inferences, though. If there are some references without a lifetime annotation after the elision rules are applied, the compiler will fail.
- input lifetimes
- lifetimes on function/method parameters
- output lifetimes
- lifetiems on return values
The rules:
- Each parameter that is a reference gets its own lifetime annotation.
- If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters.
- If there are multiple input lifetime parameters, but one is
&self
or&mut self
(indicating this is a method, not a function), the lifetime ofself
is assigned to all output lifetime parameters.
Example 1
fn first_word(s: &str) -> &str {
fn first_word<'a>(s: &'a str) -> &str { // Rule 1
fn first_word<'a>(s: &'a str) -> &'a str { // Rule 2
The compiler stops, since all references have a lifetime.
Example 2
fn longest(x: &str, y: &str) -> &str {
fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &str { // Rule 1
Rule 2 doesn't apply since there is more than 1 input lifetime parameter.
Rule 3 doesn't apply since this is a function, not a method.
Compiler fails, since we couldn't figure out the lifetimes of all the signature.
Lifetime Annotations in Method Definitions
We implement methods on structs with lifetimes by using the same syntax as generics.
TBD
The Static Lifetime
'static
lifetime means the reference can live for the entire duration of the program.
All string literals have the 'static
lifetimes.
We can annotate strings like:
let s: &'static str = "I have a static lifetime.";
The text of the string is stored in the binary, so it's always available.
Double check if you really need static lifetime parameters....