How can I pass a reference to a stack variable to a thread?

ghz 21days ago ⋅ 14 views

I'm writing a WebSocket server where a web client connects to play chess against a multithreaded computer AI. The WebSocket server wants to pass a Logger object into the AI code. The Logger object is going to pipe down log lines from the AI to the web client. The Logger must contain a reference to the client connection.

I'm confused about how lifetimes interact with threads. I've reproduced the problem with a Wrapper struct parameterized by a type. The run_thread function tries to unwrap the value and log it.

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug> {
    val: T,
}

fn run_thread<T: Debug>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    run_thread(Wrapper::<i32> { val: -1 });
}

The wrapper argument lives on the stack, and its lifetime doesn't extend past run_thread's stack frame, even though the thread will be joined before the stack frame ends. I could copy the value off the stack:

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug + Send> {
    val: T,
}

fn run_thread<T: Debug + Send + 'static>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    run_thread(Wrapper::<i32> { val: -1 });
}

This will not work if T is a reference to a big object I don't want copied:

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug + Send> {
    val: T,
}

fn run_thread<T: Debug + Send + 'static>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    run_thread(Wrapper { val: &v });
}

Which results in:

error: `v` does not live long enough
  --> src/main.rs:22:32
   |
22 |     run_thread(Wrapper { val: &v });
   |                                ^ does not live long enough
23 | }
   | - borrowed value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...

The only solution I can think of is to use an Arc.

use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

struct Wrapper<T: Debug + Send + Sync + 'static> {
    arc_val: Arc<T>,
}

fn run_thread<T: Debug + Send + Sync + 'static>(wrapper: &Wrapper<T>) {
    let arc_val = wrapper.arc_val.clone();
    let thr = thread::spawn(move || {
        println!("{:?}", *arc_val);
    });

    thr.join();
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    let w = Wrapper { arc_val: Arc::new(v) };
    run_thread(&w);

    println!("{}", (*w.arc_val)[0]);
}

In my real program, it appears that both the Logger and the connection object must be placed in Arc wrappers. It seems annoying that the client is required to box the connection in an Arc when the fact that the code is parallelized is internal to the library. This is especially annoying because the lifetime of the connection is guaranteed to be greater than the lifetime of the worker threads.

Have I missed something?

Answer

You've correctly identified the core issue. Using Arc to share ownership of the Logger and connection objects is a sound way to handle this situation, but let's look at why the compiler insists on it and what alternatives exist.

The Problem

In your example, you are passing a reference (&v) to a thread, and the Rust compiler is complaining because the reference may not be valid for the lifetime of the thread. The key reason for this is that the thread you're spawning may outlive the scope of the reference in the main function, and Rust’s borrow checker enforces strict lifetime rules to prevent dangling references.

Even though the thread is joined before run_thread returns, the compiler does not take that join into account. The restriction comes from thread::spawn itself, which requires its closure and everything the closure captures to be 'static, meaning the captured values contain no borrows shorter than the entire program. An owned value such as a Vec<i32> satisfies 'static; a reference to a local variable does not. Nothing in the signature of spawn ties the thread's lifetime to the caller's stack frame, and the compiler checks the call against that signature rather than against what your code does afterward, so it errs on the side of caution.
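To make the 'static bound concrete, here is a minimal sketch. The helper debug_in_thread is a hypothetical name, not part of the question's code; it simply carries the same Send + 'static bounds that thread::spawn places on its closure.

```rust
use std::fmt::Debug;
use std::thread;

// Hypothetical helper with the same bounds thread::spawn imposes.
fn debug_in_thread<T: Debug + Send + 'static>(val: T) -> String {
    // `val` is moved into the thread; the formatted string is returned via join().
    thread::spawn(move || format!("{:?}", val)).join().unwrap()
}

fn main() {
    // An owned Vec borrows nothing, so it satisfies 'static and can be moved in.
    println!("{}", debug_in_thread(vec![1, 2, 3]));

    // A string literal is a &'static str, so this reference is accepted too.
    println!("{}", debug_in_thread("hello"));

    // A reference to a local is rejected by the borrow checker:
    // let local = vec![1, 2, 3];
    // debug_in_thread(&local);
}
```

The point is that 'static does not mean "lives forever"; it means "holds no short-lived borrows", which is why moving owned data into a thread works without any extra wrapper.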

Why Arc is Necessary

For data to be used from another thread in Rust, it must implement Send (so it can be safely transferred between threads), and for it to be referenced by multiple threads simultaneously it must also implement Sync. Those traits say nothing about how long the data lives, though. If you're trying to pass data by reference (like &v) to thread::spawn, you run into lifetime issues because the reference must be guaranteed to live long enough.

To solve this, you wrap the data in a reference-counted pointer like Arc (atomic reference-counted), which allows multiple threads to own and share the data. Arc ensures that the data remains alive as long as any thread needs it.

Here’s why Arc works:

  • Ownership is shared across threads because Arc is a smart pointer that tracks the reference count of the underlying data; the data is dropped only when the last Arc pointing to it is dropped.
  • An Arc<T> owns its contents rather than borrowing them, so it satisfies the 'static bound whenever T does.
  • Arc<T> is Send and Sync when T is both Send and Sync, and its reference count is updated atomically, so clones can be handed to other threads without violating Rust’s ownership rules.
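The reference counting described above can be observed directly. This is a small sketch; sum_in_threads is a hypothetical helper introduced only for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: shares one Vec among `n` worker threads and sums it in each.
fn sum_in_threads(data: Arc<Vec<i32>>, n: usize) -> Vec<i32> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Cloning the Arc increments the reference count; the Vec is not copied.
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();

    // Join every worker and collect its result.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let data = Arc::new(vec![1, 2, 3, 4]);
    let sums = sum_in_threads(Arc::clone(&data), 3);
    println!("{:?}", sums);

    // Every clone handed out has been dropped, so only `data` itself remains.
    println!("strong count: {}", Arc::strong_count(&data));
}
```

Each clone is a pointer-sized handle, so the cost of sharing is one atomic increment per thread rather than a copy of the data.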

Example with Arc

You are right that the solution is to wrap your data in an Arc. Here’s an improved version of your code that uses Arc:

use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

struct Wrapper<T: Debug + Send + Sync + 'static> {
    arc_val: Arc<T>,
}

// Borrow the wrapper so the caller keeps ownership of it after the call.
fn run_thread<T: Debug + Send + Sync + 'static>(wrapper: &Wrapper<T>) {
    // Cloning the Arc bumps the reference count; the Vec itself is not copied.
    let arc_val = Arc::clone(&wrapper.arc_val);
    let thr = thread::spawn(move || {
        println!("{:?}", arc_val);
    });

    thr.join().unwrap();
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    let wrapper = Wrapper { arc_val: Arc::new(v) };
    run_thread(&wrapper);

    // The caller still holds its own Arc, so the data is accessible after the thread finishes
    println!("{}", wrapper.arc_val[0]);
}

In this case, the Arc<T> ensures that the Vec remains alive for the duration of the thread's execution. The thread's clone of the Arc is dropped when its closure finishes, and the Vec is deallocated once the last Arc pointing to it has been dropped.

Why Not Just Use References?

In Rust, you cannot pass references to data into thread::spawn unless that data lives for the 'static lifetime. This is because the compiler cannot prove that the data will still be valid when the thread accesses it. With thread::spawn, it’s usually simpler to give the thread ownership of what it needs: move the value across the thread boundary, or move in a clone of an Arc (adding a Mutex when the shared data must be mutated) rather than passing references around.

The "Annoying" Part

You mentioned that it’s annoying to box the connection in an Arc, especially since the lifetime of the connection is guaranteed to be greater than the lifetime of the worker threads. The issue arises because thread::spawn gives the compiler no way to know when the thread will finish: the JoinHandle it returns can be dropped without ever being joined, and the thread then keeps running detached. Your guarantee lives in your head, not in the types, so the 'static requirement applies regardless.
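A short sketch of why joining cannot be assumed: the function below returns while its thread is still running. The name spawn_detached and the message text are made up for this illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical helper: spawns a worker and returns without joining it.
fn spawn_detached(tx: mpsc::Sender<String>) {
    let handle = thread::spawn(move || {
        // Still running after spawn_detached has returned.
        thread::sleep(Duration::from_millis(50));
        tx.send(String::from("worker finished after spawn_detached returned"))
            .unwrap();
    });

    // Dropping a JoinHandle does not stop the thread; it detaches it.
    drop(handle);
}

fn main() {
    let (tx, rx) = mpsc::channel();
    spawn_detached(tx);

    // The stack frame of spawn_detached is gone, yet the worker still reports in.
    let msg = rx.recv().unwrap();
    println!("{}", msg);
}
```

Had the worker borrowed a local variable of spawn_detached, that borrow would dangle during the sleep, which is the case the 'static bound rules out.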

While this may seem cumbersome, it’s a fundamental part of Rust’s memory safety model: Rust wants to make sure that you never end up with dangling references, especially in a multithreaded context where threads may outlive the scope of the reference they’re borrowing.

Alternatives and Workarounds

  1. Move Ownership of Data to Threads: Instead of trying to share a reference to the data, you can move ownership of the data into the thread by using Arc and Mutex (if mutability is required). For example:

    use std::sync::{Arc, Mutex};
    use std::thread;
    
    fn run_thread() {
        let data = Arc::new(Mutex::new(Vec::new()));
    
        let thr = thread::spawn({
            let data = Arc::clone(&data);
            move || {
                let mut data = data.lock().unwrap();
                data.push(42); // Modify the data
                println!("{:?}", data);
            }
        });
    
        thr.join().unwrap();
    }
    
    fn main() {
        run_thread();
    }
    

    Here, Mutex allows mutable access to the data, and Arc ensures that the data is shared safely across threads.

  2. Refactor the Design: If the connection’s lifetime is guaranteed to be longer than the threads, another approach is to let the worker threads borrow the connection instead of owning it. thread::spawn cannot express that guarantee, but scoped threads can (std::thread::scope, stable since Rust 1.63, or the crossbeam crate on older toolchains): every thread spawned inside a scope is joined before the scope returns, so the Logger can hold a plain reference to the connection and the caller never has to wrap anything in an Arc.

  3. Thread-Local Storage (TLS): The thread_local! macro, which creates a std::thread::LocalKey, gives each thread its own separate value. That suits per-thread state such as a log buffer rather than sharing one connection among threads, so it is probably overkill for your use case and might introduce additional complexity.
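Alternative 2, letting the worker borrow instead of own, can be sketched with std::thread::scope. This assumes a toolchain of Rust 1.63 or newer; it reuses the Wrapper shape from the question but drops the 'static bound, and run_thread returns the formatted string only so the result is easy to check.

```rust
use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug> {
    val: T,
}

// No 'static bound: T may be a reference into the caller's stack frame.
fn run_thread<T: Debug + Send>(wrapper: Wrapper<T>) -> String {
    thread::scope(|s| {
        // The scoped thread may capture non-'static data because the scope
        // joins every thread it spawned before thread::scope returns.
        let handle = s.spawn(move || format!("{:?}", wrapper.val));
        handle.join().unwrap()
    })
}

fn main() {
    let v: Vec<i32> = (0..5).collect();

    // Pass a plain reference; nothing is copied and no Arc is needed.
    let out = run_thread(Wrapper { val: &v });
    println!("{}", out);

    // `v` is still owned by main after the thread has finished.
    println!("{}", v[0]);
}
```

With this shape the parallelism stays internal to the library: the caller hands over an ordinary reference, and the join-before-return guarantee is enforced by the scope rather than by convention.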

Conclusion

You’ve understood the problem correctly. thread::spawn requires 'static data because it cannot tie a thread's lifetime to a borrow, so data shared with spawned threads has to be owned, typically through Arc. While it may feel inconvenient, Arc is a safe and idiomatic way to share data across threads in Rust. If the threads are always joined before the connection goes away, scoped threads let them borrow it instead.