I'm learning/experimenting with Rust, and in all the elegance that I find in this language, there is one peculiarity that baffles me and seems totally out of place.
Rust automatically dereferences pointers when making method calls. I made some tests to determine the exact behaviour:
struct X { val: i32 }
impl std::ops::Deref for X {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

trait M { fn m(self); }
impl M for i32 { fn m(self) { println!("i32::m()"); } }
impl M for X { fn m(self) { println!("X::m()"); } }
impl M for &X { fn m(self) { println!("&X::m()"); } }
impl M for &&X { fn m(self) { println!("&&X::m()"); } }
impl M for &&&X { fn m(self) { println!("&&&X::m()"); } }

trait RefM { fn refm(&self); }
impl RefM for i32 { fn refm(&self) { println!("i32::refm()"); } }
impl RefM for X { fn refm(&self) { println!("X::refm()"); } }
impl RefM for &X { fn refm(&self) { println!("&X::refm()"); } }
impl RefM for &&X { fn refm(&self) { println!("&&X::refm()"); } }
impl RefM for &&&X { fn refm(&self) { println!("&&&X::refm()"); } }

struct Y { val: i32 }
impl std::ops::Deref for Y {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

struct Z { val: Y }
impl std::ops::Deref for Z {
    type Target = Y;
    fn deref(&self) -> &Y { &self.val }
}

#[derive(Clone, Copy)]
struct A;
impl M for A { fn m(self) { println!("A::m()"); } }
impl M for &&&A { fn m(self) { println!("&&&A::m()"); } }
impl RefM for A { fn refm(&self) { println!("A::refm()"); } }
impl RefM for &&&A { fn refm(&self) { println!("&&&A::refm()"); } }

fn main() {
    // I'll use @ to denote left side of the dot operator
    (*X{val:42}).m(); // i32::m() , Self == @
    X{val:42}.m(); // X::m() , Self == @
    (&X{val:42}).m(); // &X::m() , Self == @
    (&&X{val:42}).m(); // &&X::m() , Self == @
    (&&&X{val:42}).m(); // &&&X::m() , Self == @
    (&&&&X{val:42}).m(); // &&&X::m() , Self == *@
    (&&&&&X{val:42}).m(); // &&&X::m() , Self == **@
    println!("-------------------------");
    (*X{val:42}).refm(); // i32::refm() , Self == @
    X{val:42}.refm(); // X::refm() , Self == @
    (&X{val:42}).refm(); // X::refm() , Self == *@
    (&&X{val:42}).refm(); // &X::refm() , Self == *@
    (&&&X{val:42}).refm(); // &&X::refm() , Self == *@
    (&&&&X{val:42}).refm(); // &&&X::refm(), Self == *@
    (&&&&&X{val:42}).refm(); // &&&X::refm(), Self == **@
    println!("-------------------------");
    Y{val:42}.refm(); // i32::refm() , Self == *@
    Z{val:Y{val:42}}.refm(); // i32::refm() , Self == **@
    println!("-------------------------");
    A.m(); // A::m() , Self == @
    // without the Copy trait, (&A).m() would be a compilation error:
    // cannot move out of borrowed content
    (&A).m(); // A::m() , Self == *@
    (&&A).m(); // &&&A::m() , Self == &@
    (&&&A).m(); // &&&A::m() , Self == @
    A.refm(); // A::refm() , Self == @
    (&A).refm(); // A::refm() , Self == *@
    (&&A).refm(); // A::refm() , Self == **@
    (&&&A).refm(); // &&&A::refm(), Self == @
}
So, it seems that, more or less:
- The compiler will insert as many dereference operators as necessary to invoke a method.
- The compiler, when resolving methods declared using &self (call-by-reference):
  - First tries calling for a single dereference of self
  - Then tries calling for the exact type of self
  - Then, tries inserting as many dereference operators as necessary for a match
- Methods declared using self (call-by-value) for type T behave as if they were declared using &self (call-by-reference) for type &T and called on the reference to whatever is on the left side of the dot operator.
- The above rules are first tried with raw built-in dereferencing, and if there's no match, the overload with Deref trait is used.
What are the exact auto-dereferencing rules? Can anyone give any formal rationale for such a design decision?
Answers
The behavior you're observing in Rust, known as auto-dereferencing, allows method calls to work seamlessly on references and values by automatically applying dereference operations as needed. This feature makes method calls more ergonomic and intuitive, especially in the presence of pointer-like types (e.g., Box, Rc, Arc, etc.). Let's dive deeper into the rules and rationale behind this design decision.
Auto-Dereferencing Rules
The rules governing auto-dereferencing in Rust can be summarized as follows:
- Method Resolution and Dereferencing:
  - When you call a method foo.bar(), Rust builds a list of candidate receiver types by repeatedly dereferencing the type of foo (built-in dereferencing for references, Deref for other types).
  - For each candidate type T, in order, Rust looks for a method bar whose receiver is T (by value), then &T, then &mut T.
  - The first match wins. If no candidate matches, the call is a compile error.
- Methods with &self, &mut self, and self:
  - Methods defined with &self can be called on values and references because Rust can automatically borrow the value or dereference the reference as needed.
  - Methods defined with &mut self follow similar logic, but the receiver must be mutably borrowable.
  - Methods defined with self take ownership of the value. Calling one through a reference means moving the value out from behind that reference, so this only compiles when the type is Copy (as the question's A example notes); otherwise the value has to be cloned first.
- Trait Method Resolution:
  - Trait methods take part in the same lookup. At each candidate type, inherent methods are checked before methods from visible traits.
  - A trait method can therefore be called on a reference if the underlying type implements the trait. If the trait is also implemented for the reference type itself, the implementation that matches first in the candidate order is chosen.
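The lookup order is easier to see with a trait implemented both for a type and for a reference to it. The following is a minimal sketch; `Tag` and `Describe` are hypothetical names introduced only for this illustration, mirroring the `RefM` experiment in the question.

```rust
// Hypothetical type and trait, used only to observe which impl is picked.
struct Tag;

trait Describe {
    fn describe(&self) -> &'static str;
}

impl Describe for Tag {
    fn describe(&self) -> &'static str { "Tag" }
}

impl<'a> Describe for &'a Tag {
    fn describe(&self) -> &'static str { "&Tag" }
}

// Receiver type Tag: no by-value match, so Rust auto-borrows to &Tag.
fn on_value() -> &'static str {
    let t = Tag;
    t.describe()
}

// Receiver type &Tag: matches `&self` of the Tag impl directly.
fn on_ref() -> &'static str {
    let t = Tag;
    (&t).describe()
}

// Receiver type &&Tag: matches `&self` of the &Tag impl directly.
fn on_ref_ref() -> &'static str {
    let t = Tag;
    (&&t).describe()
}

// Receiver type &&&Tag: nothing matches, so Rust dereferences once
// to &&Tag and finds the &Tag impl.
fn on_ref_ref_ref() -> &'static str {
    let t = Tag;
    (&&&t).describe()
}

fn main() {
    println!("{}", on_value());
    println!("{}", on_ref());
    println!("{}", on_ref_ref());
    println!("{}", on_ref_ref_ref());
}
```

Note that the receiver type is tried as-is before any dereference happens, which is why `(&&t).describe()` picks the `&Tag` impl rather than the `Tag` one.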
Examples
Here are some examples to illustrate these rules in practice:
Example 1: Simple Struct and Method Call
struct X {
    val: i32,
}

impl X {
    fn consume(self) {
        println!("X consumed with value {}", self.val);
    }
    fn borrow(&self) {
        println!("X borrowed with value {}", self.val);
    }
}

fn main() {
    let x = X { val: 42 };
    x.consume(); // Takes ownership of x
    // x.consume(); // Error: x has been moved
    let y = X { val: 100 };
    let y_ref = &y;
    y_ref.borrow(); // y_ref is already &X, so it matches &self directly
    y.borrow(); // Rust automatically borrows y as &y
}
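Calling a by-value method through a reference is where the Copy requirement shows up. Here is a small sketch of that case; `Small` and `Big` are hypothetical types made up for this example, not anything from the question.

```rust
// Hypothetical Copy type: a by-value method can be called through a reference.
#[derive(Clone, Copy)]
struct Small(i32);

impl Small {
    fn into_inner(self) -> i32 {
        self.0
    }
}

// Hypothetical non-Copy type: it owns a String, so it cannot be Copy.
#[derive(Clone)]
struct Big(String);

impl Big {
    fn into_inner(self) -> String {
        self.0
    }
}

fn copy_through_ref() -> i32 {
    let s = Small(7);
    let r = &s;
    // Auto-deref turns this into (*r).into_inner(); *r is copied, s stays valid.
    r.into_inner()
}

fn clone_through_ref() -> String {
    let b = Big(String::from("hi"));
    let r = &b;
    // r.into_inner() would not compile here: it would have to move the
    // value out from behind a shared reference. Cloning gives an owned Big.
    r.clone().into_inner()
}

fn main() {
    println!("{}", copy_through_ref());
    println!("{}", clone_through_ref());
}
```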
Example 2: Using Deref
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let x = MyBox(5);
    // `*x` is shorthand for `*Deref::deref(&x)`, which yields the inner value
    println!("Value inside MyBox: {}", *x);
}
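The example above uses an explicit `*x`. To see Deref taking part in a method call, here is a sketch that repeats the same `MyBox` definition and calls `String` and `str` methods on it; the helper function names are made up for this illustration.

```rust
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn boxed_len() -> usize {
    let b = MyBox(String::from("hello"));
    // MyBox<String> has no `len`, so Rust follows Deref to String
    // and finds String::len(&self) there.
    b.len()
}

fn boxed_upper() -> String {
    let b = MyBox(String::from("hello"));
    // Two Deref steps: MyBox<String> -> String -> str,
    // where str::to_uppercase(&self) is defined.
    b.to_uppercase()
}

fn main() {
    println!("{}", boxed_len());
    println!("{}", boxed_upper());
}
```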
Formal Rationale
The rationale for auto-dereferencing in Rust includes several considerations:
- Ergonomics:
  - Auto-dereferencing simplifies method calls by reducing the need for explicit dereferencing.
  - It allows methods to be defined on the owned type and still be easily callable on references, making the code more flexible and easier to read.
- Comparison with Other Languages:
  - C and C++ use a separate -> operator to call methods through a pointer. Rust has no such operator; automatic referencing and dereferencing lets the single . operator cover both cases, which resembles the uniform method-call syntax of higher-level languages such as Python.
- Flexibility:
  - Auto-dereferencing enables seamless interoperability between different kinds of pointer-like types (Box, Rc, Arc, etc.) and their underlying types.
  - It supports more expressive and concise code, especially in contexts where types need to interact with various levels of indirection.
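As a rough illustration of the flexibility point, the sketch below calls `Vec` and slice methods through `Box`, `Rc`, and `Arc` without any explicit dereference. The function names are made up for this example. It also shows a related convention in the standard library: operations on the pointer itself, such as `Rc::strong_count`, are associated functions rather than methods, so they cannot collide with methods reached through auto-deref.

```rust
use std::rc::Rc;
use std::sync::Arc;

fn lens() -> (usize, usize, usize) {
    let boxed = Box::new(vec![1, 2, 3]);
    let shared = Rc::new(vec![1, 2, 3, 4]);
    let atomic = Arc::new(vec![1, 2]);
    // Each call reaches Vec::len through the pointer's Deref impl.
    (boxed.len(), shared.len(), atomic.len())
}

fn nested_sum() -> i32 {
    let nested: Rc<Box<Vec<i32>>> = Rc::new(Box::new(vec![10, 20, 30]));
    // Rc -> Box -> Vec -> [i32]: `iter` is found on the slice.
    nested.iter().sum()
}

fn owners() -> usize {
    let first = Rc::new(vec![1]);
    let _second = Rc::clone(&first);
    // Called as an associated function, not as first.strong_count().
    Rc::strong_count(&first)
}

fn main() {
    println!("{:?}", lens());
    println!("{}", nested_sum());
    println!("{}", owners());
}
```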
Conclusion
Auto-dereferencing in Rust is a carefully designed feature that enhances the language's ergonomics and consistency without sacrificing safety. It allows developers to write more intuitive and flexible code while leveraging Rust's powerful type system and borrowing rules. The design decision to include auto-dereferencing strikes a balance between convenience and explicitness, aligning with Rust's overall goals of safety and performance.