# Signal-Safe Stack Tracing - [Overview](#overview) - [Big-Picture](#big-picture) - [API](#api) - [Strategy](#strategy) - [Technical Requirements](#technical-requirements) - [Signal-Safe Tracing With `fork()` + `exec()`](#signal-safe-tracing-with-fork--exec) - [In the main program](#in-the-main-program) - [In the tracer program](#in-the-tracer-program) # Overview Stack traces from signal handlers can provide very helpful information for debugging application crashes, e.g. from SIGSEGV or SIGTRAP handlers. Signal handlers are really restrictive environments as your application could be interrupted by a signal at any point, including in the middle of malloc or buffered IO or while holding a lock. Doing a stack trace in a signal handler is possible but it requires a lot of care. This is difficult to do correctly and most examples online do this incorrectly. It is not possible to resolve debug symbols safely in the process from a signal handler without heroic effort. In order to produce a full trace there are three options: 1. Carefully save the object trace information to be resolved at a later time outside the signal handler 2. Write the object trace information to a file to be resolved later 3. Spawn a new process, communicate object trace information to that process, and have that process do the trace resolution For traces on segfaults, e.g., only options 2 and 3 are viable. The this guide will go over approach 3 and the cpptrace safe API. # Big-Picture In order to do this full process safely the way to go is collecting basic information in the signal handler and then either resolving later or handing that information to another process to resolve. It's not as simple as calling `cpptrace::generate_trace().print()` but this is what is needed to really do this safely as far as I can tell. FAQ: What's the worst that could happen if you call `cpptrace::generate_trace().print()` from a signal handler? In many cases you might be able to get away with it but you risk deadlocking or memory corruption. > [!IMPORTANT] > Currently signal-safe stack unwinding is only possible with `libunwind`, more details later. > [!IMPORTANT] > `_dl_find_object` is required for signal-safe stack tracing. This is a relatively recent addition to glibc, added in > glibc 2.35. > [!CAUTION] > Calls to shared objects can be lazy-loaded where the first call to the shared object invokes non-signal-safe functions > such as `malloc()`. Because of this, the signal safe api must be "warmed up" ahead of a signal handler. # API Cpptrace provides APIs for generating raw trace information safely and then also safely resolving those raw pointers to the most minimal object information needed to resolve later. ```cpp namespace cpptrace { // signal-safe std::size_t safe_generate_raw_trace(frame_ptr* buffer, std::size_t size, std::size_t skip = 0); // signal-safe std::size_t safe_generate_raw_trace(frame_ptr* buffer, std::size_t size, std::size_t skip, std::size_t max_depth); struct safe_object_frame { frame_ptr raw_address; frame_ptr address_relative_to_object_start; char object_path[CPPTRACE_PATH_MAX + 1]; object_frame resolve() const; // To be called outside a signal handler. Not signal safe. }; // signal-safe void get_safe_object_frame(frame_ptr address, safe_object_frame* out); // signal-safe bool can_signal_safe_unwind(); bool can_get_safe_object_frame(); } ``` `safe_generate_raw_trace` produces instruction pointers found while unwinding the stack. These addresses are sufficient information to resolve a stack trace in the currently running process after a signal handler exits, but they aren't sufficient for resolving outside of the currently running process, unless there is no position-independent code or shared-library code. To resolve outside the current process `safe_object_frame` information is needed. This contains the path to the object where the address is located as well as the address before address randomization. # Strategy Signal-safe tracing can be done three ways: - In a signal handler, call `safe_generate_raw_trace` and then outside a signal handler construct a `cpptrace:raw_trace` and resolve. - In a signal handler, call `safe_generate_raw_trace`, then write `cpptrace::safe_object_frame` information to a file to be resolved later. - In a signal handler, call `safe_generate_raw_trace`, `fork()` and `exec()` a process to handle the resolution, pass `cpptrace::safe_object_frame` information to that child through a pipe, and wait for the child to exit. It's not as simple as calling `cpptrace::generate_trace().print()`, I know, but these are truly the only ways to do this safely as far as I can tell. # Technical Requirements **Note:** Not all back-ends and platforms support these interfaces. If signal-safe unwinding isn't supported `safe_generate_raw_trace` will just produce an empty trace and if object information can't be resolved in a signal-safe way then `get_safe_object_frame` will not populate fields beyond the `raw_address`. `cpptrace::can_signal_safe_unwind` and `cpptrace::can_get_safe_object_frame` can be used to check for safe tracing support. Currently the only back-end that can unwind safely is libunwind. Currently, the only way I know to get `dladdr`'s information in a signal-safe manner is `_dl_find_object`, which doesn't exist on macos (or windows of course). If anyone knows ways to do these safely on other platforms, I'd be much appreciative. # Signal-Safe Tracing With `fork()` + `exec()` Of the three strategies, `fork()` + `exec()`, is the most technically involved and the only way to resolve while the signal handler is running. I think it's worthwhile to do a deep-dive into how to do this. In the source code, [`signal_demo.cpp`](../test/signal_demo.cpp) and [`signal_tracer.cpp`](../test/signal_tracer.cpp) provide a working example for what is described here. ## In the main program The main program handles most of the complexity for tracing from signal handlers: - Collecting a raw trace - Spawning a child process - Resolving raw frame pointers to minimal object frames - Sending that info to the other process Also note: A warmup for the library is done in main. A basic implementation is as follows: ```cpp #include #include #include #include #include // This is just a utility I like, it makes the pipe API more expressive. struct pipe_t { union { struct { int read_end; int write_end; }; int data[2]; }; }; void do_signal_safe_trace(cpptrace::frame_ptr* buffer, std::size_t count) { // Setup pipe and spawn child pipe_t input_pipe; pipe(input_pipe.data); const pid_t pid = fork(); if(pid == -1) { const char* fork_failure_message = "fork() failed\n"; write(STDERR_FILENO, fork_failure_message, strlen(fork_failure_message)); return; } if(pid == 0) { // child dup2(input_pipe.read_end, STDIN_FILENO); close(input_pipe.read_end); close(input_pipe.write_end); execl("signal_tracer", "signal_tracer", nullptr); const char* exec_failure_message = "exec(signal_tracer) failed: Make sure the signal_tracer executable is in " "the current working directory and the binary's permissions are correct.\n"; write(STDERR_FILENO, exec_failure_message, strlen(exec_failure_message)); _exit(1); } // Resolve to safe_object_frames and write those to the pipe for(std::size_t i = 0; i < count; i++) { cpptrace::safe_object_frame frame; cpptrace::get_safe_object_frame(buffer[i], &frame); write(input_pipe.write_end, &frame, sizeof(frame)); } close(input_pipe.read_end); close(input_pipe.write_end); // Wait for child waitpid(pid, nullptr, 0); } void handler(int signo, siginfo_t* info, void* context) { // Print basic message const char* message = "SIGSEGV occurred:\n"; write(STDERR_FILENO, message, strlen(message)); // Generate trace constexpr std::size_t N = 100; cpptrace::frame_ptr buffer[N]; std::size_t count = cpptrace::safe_generate_raw_trace(buffer, N); do_signal_safe_trace(buffer, count); // Up to you if you want to exit or continue or whatever _exit(1); } void warmup_cpptrace() { // This is done for any dynamic-loading shenanigans cpptrace::frame_ptr buffer[10]; std::size_t count = cpptrace::safe_generate_raw_trace(buffer, 10); cpptrace::safe_object_frame frame; cpptrace::get_safe_object_frame(buffer[0], &frame); } int main() { warmup_cpptrace(); // Setup signal handler struct sigaction action = { 0 }; action.sa_flags = 0; action.sa_sigaction = &handler; if (sigaction(SIGSEGV, &action, NULL) == -1) { perror("sigaction"); } /// ... } ``` ## In the tracer program The tracer program is quite simple. It just has to read `cpptrace::safe_object_frame`s from the pipe, resolve to `cpptrace::object_frame`s, and resolve an `object_trace`. ```cpp #include #include #include #include int main() { cpptrace::object_trace trace; while(true) { cpptrace::safe_object_frame frame; // fread used over read because a read() from a pipe might not read the full frame std::size_t res = fread(&frame, sizeof(frame), 1, stdin); if(res == 0) { break; } else if(res != 1) { std::cerr<<"Something went wrong while reading from the pipe"<