How can I pass a reference to a stack variable to a thread?

ghz 7 months ago ⋅ 102 views

I'm writing a WebSocket server where a web client connects to play chess against a multithreaded computer AI. The WebSocket server wants to pass a Logger object into the AI code. The Logger object is going to pipe down log lines from the AI to the web client. The Logger must contain a reference to the client connection.

I'm confused about how lifetimes interact with threads. I've reproduced the problem with a Wrapper struct parameterized by a type. The run_thread function tries to unwrap the value and log it.

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug> {
    val: T,
}

fn run_thread<T: Debug>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    run_thread(Wrapper::<i32> { val: -1 });
}

This does not compile. The wrapper argument lives on the stack, and its lifetime doesn't extend past run_thread's stack frame, even though the thread will be joined before the stack frame ends. I could move the value off the stack and into the thread:

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug + Send> {
    val: T,
}

fn run_thread<T: Debug + Send + 'static>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    run_thread(Wrapper::<i32> { val: -1 });
}

This will not work if T is a reference to a big object I don't want copied:

use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug + Send> {
    val: T,
}

fn run_thread<T: Debug + Send + 'static>(wrapper: Wrapper<T>) {
    let thr = thread::spawn(move || {
        println!("{:?}", wrapper.val);
    });

    thr.join();
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    run_thread(Wrapper { val: &v });
}

Which results in:

error: `v` does not live long enough
  --> src/main.rs:22:32
   |
22 |     run_thread(Wrapper { val: &v });
   |                                ^ does not live long enough
23 | }
   | - borrowed value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...

The only solution I can think of is to use an Arc.

use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

struct Wrapper<T: Debug + Send + Sync + 'static> {
    arc_val: Arc<T>,
}

fn run_thread<T: Debug + Send + Sync + 'static>(wrapper: &Wrapper<T>) {
    let arc_val = wrapper.arc_val.clone();
    let thr = thread::spawn(move || {
        println!("{:?}", *arc_val);
    });

    thr.join();
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    let w = Wrapper { arc_val: Arc::new(v) };
    run_thread(&w);

    println!("{}", (*w.arc_val)[0]);
}

In my real program, it appears that both the Logger and the connection object must be placed in Arc wrappers. It seems annoying that the client is required to wrap the connection in an Arc when the fact that the code is parallelized is internal to the library. This is especially annoying because the lifetime of the connection is guaranteed to be greater than the lifetime of the worker threads.

Have I missed something?

Answers

In Rust, the key challenge you're facing is that every thread must hold valid references to the data it works with. A thread created with thread::spawn can outlive the scope that created it, so the closure passed to it must be 'static: it may own its data, but it may not borrow anything with a shorter lifetime. Sharing ownership through Arc is a common way to satisfy that bound.

Here’s a detailed explanation of how you can address your issue:

The Challenge

The primary challenge is ensuring that the data passed to the thread remains valid for the duration of the thread's execution. Rust's ownership model ensures memory safety, and part of that is preventing data races and use-after-free errors.

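thread::spawn requires the closure, and everything it captures, to be 'static because the thread may run longer than the function that spawned it. The compiler does not take the later join() call into account, so a borrowed &T is rejected unless the reference itself is 'static, even when the thread is joined before the function returns. The 'static bound does not mean the data must live for the whole program; owned values such as a Vec<i32> moved into the closure satisfy it.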
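
To illustrate the 'static requirement, here is a minimal sketch in which the worker thread takes ownership of its data instead of borrowing it. The helper name sum_on_thread is made up for this example and is not part of the question's code.

```rust
use std::thread;

// Hypothetical helper: sums a vector on a worker thread.
// `data` is moved into the closure, so the closure owns everything it uses
// and therefore satisfies the `'static` bound on `thread::spawn`.
fn sum_on_thread(data: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    // `join` returns a Result; a panic in the worker shows up as Err here.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let v: Vec<i32> = (0..1000).collect();
    // Capturing `&v` in the spawned closure instead would be rejected,
    // because that borrow is not `'static`.
    let total = sum_on_thread(v);
    println!("{}", total); // prints 499500
}
```

The trade-off is that the caller gives up the vector. Handing it back through the thread's return value is one way around that, and sharing it through Arc is another.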

The Arc Solution

Using Arc (Atomically Reference Counted) is a common, idiomatic way in Rust to share ownership of data across threads. Arc allows multiple owners of the same data and ensures that the data is not deallocated until the last Arc is dropped. An Arc<T> only hands out shared (&T) access, so it suits read-only or read-heavy data; to mutate the shared value, pair it with a synchronization type such as Mutex or RwLock.
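
If the shared value does need to be mutated, for instance a logger that collects lines from several workers, one possible sketch wraps it in Arc<Mutex<...>>. The helper collect_lines and the message format are assumptions made for illustration; they do not come from the question.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: each worker appends one line to a shared log.
fn collect_lines(workers: usize) -> Vec<String> {
    let log = Arc::new(Mutex::new(Vec::<String>::new()));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            // Each thread gets its own Arc handle to the same Mutex.
            let log = Arc::clone(&log);
            thread::spawn(move || {
                let mut guard = log.lock().expect("log mutex poisoned");
                guard.push(format!("worker {} done", id));
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // Threads may finish in any order, so sort before returning.
    let mut lines = log.lock().expect("log mutex poisoned").clone();
    lines.sort();
    lines
}

fn main() {
    for line in collect_lines(3) {
        println!("{}", line);
    }
}
```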

Here’s how you can use Arc in your specific context:

Example Code

use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

struct Wrapper<T: Debug + Send + Sync + 'static> {
    arc_val: Arc<T>,
}

fn run_thread<T: Debug + Send + Sync + 'static>(wrapper: &Wrapper<T>) {
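    // Clone the Arc (not the data) so the thread gets its own owning handle.
    let arc_val = Arc::clone(&wrapper.arc_val);
    // `move` transfers the cloned Arc into the closure, making it 'static.
    let thr = thread::spawn(move || {
        println!("{:?}", *arc_val);
    });

    // Wait for the thread and surface a panic instead of ignoring the Result.
    thr.join().expect("Thread panicked");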
}

fn main() {
    let mut v = Vec::new();
    for i in 0..1000 {
        v.push(i);
    }

    let w = Wrapper { arc_val: Arc::new(v) };
    run_thread(&w);

    println!("{}", (*w.arc_val)[0]);
}

Explanation

  1. Wrapper Struct:

    • The Wrapper struct now contains an Arc<T> instead of T. This allows the data to be shared across threads safely.
  2. run_thread Function:

    • The run_thread function takes a reference to a Wrapper<T>.
    • It clones the Arc to increase the reference count and passes the cloned Arc to the new thread.
    • The move keyword ensures that the closure takes ownership of the Arc, allowing it to be used inside the thread.
  3. Main Function:

    • A vector v is created and populated.
    • A Wrapper instance w is created with an Arc containing the vector v.
    • run_thread is called with a reference to w.
    • The program prints the first element of the vector from the Arc after the thread has run.
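
The reference-count behavior described in these steps can be observed with Arc::strong_count. The following is a small sketch; the helper observe_counts is a name invented for this example.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: returns the strong counts seen
// (before the clone, after the clone, after the join).
fn observe_counts(data: Vec<i32>) -> (usize, usize, usize) {
    let shared = Arc::new(data);
    let before = Arc::strong_count(&shared);

    // Cloning the Arc bumps the count; the Vec itself is not copied.
    let for_thread = Arc::clone(&shared);
    let after_clone = Arc::strong_count(&shared);

    let handle = thread::spawn(move || {
        // The closure owns `for_thread`, which is dropped when it finishes.
        for_thread.len()
    });
    handle.join().expect("worker thread panicked");

    // The worker's handle is gone, so only `shared` remains.
    let after_join = Arc::strong_count(&shared);
    (before, after_clone, after_join)
}

fn main() {
    let (before, after_clone, after_join) = observe_counts((0..1000).collect());
    println!("{} {} {}", before, after_clone, after_join); // prints 1 2 1
}
```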

Alternatives to Arc

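If the data is guaranteed to outlive the threads, as in your case, you do not need Arc at all. Scoped threads (std::thread::scope, stable since Rust 1.63; the crossbeam crate offers a similar API) let worker threads borrow from the enclosing stack frame, because the scope joins every thread it spawned before it returns. Raw pointers with unsafe code are another option, but they are generally not recommended due to the potential for undefined behavior and memory safety issues. Arc remains the safe choice when the threads may outlive the code that created the data.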
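
As a minimal sketch of the case where the data is known to outlive the workers, the question's Wrapper can hold a plain reference when the thread is spawned inside std::thread::scope (available since Rust 1.63). Returning a String from run_thread is a change made here only so the result is easy to check; the T: Sync bound is needed because the worker reads the value through a shared reference.

```rust
use std::fmt::Debug;
use std::thread;

struct Wrapper<T: Debug> {
    val: T,
}

// Formats the wrapped value on a scoped worker thread and returns the text.
// No `'static` bound is required: the scope joins the thread before returning.
fn run_thread<T: Debug + Sync>(wrapper: &Wrapper<T>) -> String {
    thread::scope(|s| {
        let handle = s.spawn(|| format!("{:?}", wrapper.val));
        handle.join().expect("worker thread panicked")
    })
}

fn main() {
    let v: Vec<i32> = (0..1000).collect();
    // The wrapper borrows `v`; nothing is copied and no Arc is involved.
    let w = Wrapper { val: &v };
    let text = run_thread(&w);
    println!("{}", text.len());
    // `v` is still owned by main and usable after the worker has finished.
    println!("{}", v[0]);
}
```

With this shape the caller of the library never sees an Arc; the parallelism stays an internal detail of run_thread.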

Conclusion

While it might seem cumbersome to wrap data in Arc for threading, thread::spawn needs some such guarantee to maintain Rust's promises about memory safety and data race prevention. Arc ensures that your data remains valid across multiple threads without risking unsafe memory access, and it is the usual choice when a thread may outlive the code that spawned it. When the threads are guaranteed to finish first, scoped threads (std::thread::scope) give the same safety while letting them borrow the data directly.