I have a program which loads a C++ library to query its version number via an exported function. The foreign function returns a C++ std::string, which I access through the cxx crate in Rust. The code behaves very differently when it is built in debug mode and in release mode.
unsafe {
let lib = Library::new("liblogic.so").unwrap();
let logic_getVersion: Symbol<unsafe extern "C" fn() -> *const CxxString> = lib.get(b"getVersion").unwrap();
println!("logic.getVersion={:?}", logic_getVersion);
let ptr_version = std::hint::black_box(logic_getVersion());
println!("ptr_version={:?}", ptr_version);
let cxx_version = std::hint::black_box(ptr_version.as_ref().unwrap());
println!("cxx_version={:p}", &cxx_version);
assert_eq!(ptr_version, &*cxx_version);
let ptr_version_cstr = std::hint::black_box(cxx_version.as_ptr()); // <= Release build gets invalid pointer here!!!
println!("ptr_version_cstr={:?}", ptr_version_cstr);
let mut strbuffer = [0_u8; 16];
println!("strbuffer.as_mut_ptr()={:?}", strbuffer.as_mut_ptr());
println!("strbuffer.len()={}", strbuffer.len());
std::ptr::copy(ptr_version_cstr, strbuffer.as_mut_ptr(), strbuffer.len());
println!("Logic version: {:?}", strbuffer);
let str_version = std::str::from_utf8(&strbuffer).unwrap();
println!("{str_version}");
}
Output of the debug build:
logic.getVersion=Symbol "getVersion"@0x76b6dac04c15 from "/opt/local/lib/liblogic.so"
ptr_version=0x7ffefd18bc70
cxx_version=0x7ffefd18bd20
ptr_version_cstr=0x55ad67ac3528
strbuffer.as_mut_ptr()=0x7ffefd18be10
strbuffer.len()=16
Logic version: [49, 46, 48, 46, 50, 55, 45, 83, 78, 65, 80, 83, 72, 79, 84, 0]
1.0.27-SNAPSHOT
Output of the release build:
logic.getVersion=Symbol "getVersion"@0x749eee404c15 from "/opt/local/lib/liblogic.so"
ptr_version=0x568a0b899510
cxx_version=0x7fff67901ec8
ptr_version_cstr=0x737265765f787863
strbuffer.as_mut_ptr()=0x7fff67901ef0
strbuffer.len()=16
Segmentation fault
What is apparent is that I get significantly different pointers each time. I'd expect ptr_version, cxx_version and ptr_version_cstr to be somewhat close to each other, which is true with the debug build. But the release build gives me a pointer that is visibly off:
ptr_version_cstr=0x737265765f787863
This is the result of a CxxString.as_ptr() call, made in order to get the C char* and finally access the characters of the version string.
Obviously, the crash happens in the std::ptr::copy call, which I added to try to copy the C string to a safe buffer before constructing a Rust str from it. Originally I constructed the str right from the buffer returned by the library, which also worked in debug and crashed in release.
I suspect Rust optimizes something away in release mode which would be essential for the program to not crash, so I added the std::hint::black_box calls, but they didn't help. I also suspect Rust moves the values because it doesn't know that it shouldn't touch memory managed by the guest library, so I tried to Pin the values, but that didn't help either.
Note that I have very aggressive optimization settings because I want to produce as small a binary as possible.
[profile.release]
lto = true
opt-level = "z"
strip = true
But turning them off doesn't help. I have no idea what's going on. Even if the library does something nasty, I couldn't explain why it works with the debug build.
UPDATE: I simplified the source to this (it turned out the previous version moved the CxxString, which should never happen), but it didn't help. It still looks better because I can get rid of raw pointer dereferences and use &CxxString, which is treated as a native Rust borrow.
let lib = unsafe { Library::new("liblogic.so").unwrap() };
let logic_getVersion: Symbol<unsafe extern "C" fn() -> &'a CxxString> = unsafe { lib.get(b"getVersion").unwrap() };
println!("logic.getVersion={:?}", logic_getVersion);
let cxx_version = pin!(std::hint::black_box(unsafe { logic_getVersion() }));
println!("&cxx_version={:p}", cxx_version);
let ptr_version_cstr = std::hint::black_box(cxx_version.as_ptr()); // <= Release build gets invalid pointer here!!!
println!("cxx_version.len()={}; ptr_version_cstr={:?}", cxx_version.len(), ptr_version_cstr);
println!("cxx_version.as_bytes().len()={}", cxx_version.as_bytes().len());
let end_index = cxx_version.as_bytes().iter().position(|&c| c == 0).unwrap();
println!("end_index={end_index}");
let buf_version = &cxx_version.as_bytes()[..end_index];
println!("buf_version={:?}", buf_version);
let str_version = std::str::from_utf8(buf_version).unwrap();
println!("{str_version}");
The current output:
logic.getVersion=Symbol "getVersion"@0x75becd804c15 from "/opt/local/lib/liblogic.so"
&cxx_version=0x7ffc12baf340
cxx_version.len()=3996997709695248755; ptr_version_cstr=0x7265765f78786326
cxx_version.as_bytes().len()=2913377524265152361
Segmentation fault
It's strange how .len() reports a surreal length for the string, but the debug build does the same and works:
logic.getVersion=Symbol "getVersion"@0x7c17a5204c15 from "/opt/local/lib/liblogic.so"
&cxx_version=0x7ffea69ea380
cxx_version.len()=9223372036854775825; ptr_version_cstr=0x5f0608e47938
cxx_version.as_bytes().len()=9223372036854775825
end_index=15
buf_version=[49, 46, 48, 46, 50, 55, 45, 83, 78, 65, 80, 83, 72, 79, 84]
1.0.27-SNAPSHOT
The cxx crate's documentation says to declare a bridge with #[cxx::bridge], but that wants to link the declared function at compile time, while the library is supposed to be loaded at runtime. I'm not sure whether I must declare a bridge for the crate to work, because it seems to bridge std::string anyway.
ANOTHER UPDATE: With GDB, I found that the bridge functions to the libstdc++ library's std::string (e.g. cxxbridge1$cxx_string$data) are completely optimized out. They are literally just 2 instructions that return immediately. In the debug build they are implemented properly, eventually calling C++ basic_string functions. Now the question is why Rust is optimizing out function calls that are actually used, and how to prevent this behaviour.
Answers
The behavior you're experiencing, where the release build produces invalid pointers and crashes, while the debug build works fine, suggests that there might be undefined behavior or memory corruption happening in your code. The optimization level and aggressive optimization settings in the release build might be exposing this issue more prominently.
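One hint in the posted output points toward undefined behavior rather than an over-eager optimizer: the bogus release-build values of ptr_version_cstr consist entirely of printable ASCII bytes. The sketch below decodes them as little-endian bytes (as they would sit in memory on x86-64). The helper name decode_as_ascii is made up for illustration and is not part of any crate.

```rust
/// Interpret a 64-bit value as eight little-endian bytes and render
/// printable ASCII bytes as characters, everything else as '.'.
/// `decode_as_ascii` is a hypothetical helper, not part of any crate.
fn decode_as_ascii(value: u64) -> String {
    value
        .to_le_bytes()
        .iter()
        .map(|&b| if b.is_ascii_graphic() || b == b' ' { b as char } else { '.' })
        .collect()
}

fn main() {
    // Values copied from the two release-build outputs in the question.
    println!("{}", decode_as_ascii(0x737265765f787863)); // prints "cxx_vers"
    println!("{}", decode_as_ascii(0x7265765f78786326)); // prints "&cxx_ver"
}
```

Both values are prefixes of the format strings used in the println! calls ("cxx_version=" and "&cxx_version="), so the release build appears to be reading string-literal data where it expects the internals of a std::string. That would be consistent with getVersion not actually returning a pointer or reference to a live std::string (for example, if it returns std::string by value), but this is a guess that has to be checked against the library's header.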
Here are some suggestions to debug and potentially fix the problem:
- Check for Undefined Behavior: Review your code for any instances of undefined behavior, such as accessing uninitialized memory, accessing out-of-bounds memory, or dereferencing null pointers. Even if the debug build seems to work, undefined behavior can manifest differently in release builds.
- Inspect Library Loading: Ensure that the library (liblogic.so) is being loaded correctly and that the symbols are resolved properly. Make sure that the library is compiled with compatible settings and that it exposes the getVersion function as expected.
- Review Pointer Handling: Double-check how you're handling pointers returned by the library functions. Ensure that you're not inadvertently dereferencing null pointers or accessing memory that has been deallocated.
- Review C++ Library Usage: If the Rust code is interacting with a C++ library through FFI, ensure that the interactions are correctly managed. Pay attention to memory ownership, lifetime management, and potential differences between debug and release builds in how memory is managed.
- Review Optimization Settings: Experiment with different optimization settings in the release build to see if changing them affects the behavior. Sometimes, extremely aggressive optimization settings can lead to unexpected behavior, especially if they interact poorly with certain code patterns or library calls.
- Check for Rust Compiler Bugs: Although rare, it's possible that the Rust compiler might be optimizing out necessary function calls incorrectly. If you suspect this might be the case, consider filing a bug report with the Rust project, providing a minimal reproducible example.
- Use Tools for Memory Analysis: Consider using tools like Valgrind or AddressSanitizer to perform runtime memory analysis on your release build. These tools can help identify memory errors such as use-after-free or buffer overflows that might be causing the segmentation fault.
- Check Rust FFI Documentation: Review the documentation of the cxx crate and Rust FFI in general to ensure that you're using them correctly. It's possible that there might be subtle nuances or requirements in how FFI is used that you might have overlooked.
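As an illustration of the pointer-handling and C++ interop points, here is a minimal sketch of one way to sidestep std::string at the boundary altogether: have the library export a plain C function that returns a NUL-terminated const char*, and read it with CStr. This assumes the library can be changed. The function get_version_cstr below is a hypothetical stand-in written in Rust so that the example runs on its own; it is not something liblogic.so is known to export.

```rust
use std::ffi::{c_char, CStr};

// Stand-in for a C-ABI function the library would have to export
// (hypothetical; the real liblogic.so is not known to provide it).
static VERSION: &[u8] = b"1.0.27-SNAPSHOT\0";

extern "C" fn get_version_cstr() -> *const c_char {
    VERSION.as_ptr() as *const c_char
}

/// Copy a NUL-terminated C string into an owned Rust String.
/// Returns None for a null pointer or for bytes that are not valid UTF-8.
///
/// Safety: `ptr` must be null or point to a valid NUL-terminated string.
unsafe fn read_version(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // CStr::from_ptr scans up to the terminating NUL, so no fixed-size
    // copy (like the 16-byte std::ptr::copy in the question) is needed.
    unsafe { CStr::from_ptr(ptr) }.to_str().ok().map(str::to_owned)
}

fn main() {
    let version = unsafe { read_version(get_version_cstr()) };
    println!("{version:?}"); // prints Some("1.0.27-SNAPSHOT")
}
```

With a runtime-loaded library, the symbol would be looked up with a signature such as unsafe extern "C" fn() -> *const c_char in place of the CxxString-returning one. The returned String owns its bytes, so it stays valid even after the library is unloaded.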
By carefully reviewing your code, checking for potential memory errors, and experimenting with different settings, you should be able to identify and resolve the issue causing the segmentation fault in the release build.