Sanitize Your C/C++ Program With Sanitizers
“There is no truth. There is only perception.”
— Gustave Flaubert
Prologue
Illegal memory accesses and memory leaks are among the most common bugs in C/C++ programming. However, neither the language nor a default build detects such errors for you, and these bugs can show up seemingly at random and be difficult to reproduce.
Recently, I was working on a program repair project where I had to fix bugs in C/C++ code. I found that sanitizers are very useful for detecting certain types of bugs, such as memory-related ones. In this article, I will show you how to use sanitizers to find hidden bugs in your C/C++ code.
The demonstration is done on Ubuntu 20.04 LTS in my WSL2 environment, but it should work everywhere as long as you have the compiler and the necessary tools installed.
Quite a lot of the content in this article is generated by Copilot.😉
Basic Knowledge
What are Sanitizers?
Sanitizers are a set of tools that help you find bugs in your code. They are built into the compiler to enable runtime checks that help you find bugs that are difficult to catch with static analysis tools. The most common sanitizers are:
- AddressSanitizer (ASan): Detects memory errors such as buffer overflows, use-after-free, and other memory corruption bugs.
- UndefinedBehaviorSanitizer (UBSan): Detects undefined behavior in your code.
- MemorySanitizer (MSan): Detects uninitialized memory reads.
- LeakSanitizer (LSan): Detects memory leaks. It already comes with AddressSanitizer.
- ThreadSanitizer (TSan): Detects data races in concurrent programs.
There are also finer-grained checks that can be enabled individually, such as UBSan's integer-divide-by-zero check.
How does it work?
Sanitizers work by instrumenting your code with additional checks. For example, AddressSanitizer adds a redzone around each memory allocation and checks if the program accesses memory outside the allocated region. If it does, the program is terminated and a report is generated.
How to install?
As mentioned earlier, sanitizers are built into the compiler, so you don’t need to install anything extra. You just need to pass the appropriate flags to the compiler to enable them.
Getting Started
Let’s start by writing some buggy C programs and then use sanitizers to unveil the bugs. ASan and UBSan are the most commonly used sanitizers, so we will focus on them. You can try the other sanitizers on your own.
Here is a list of the most common compiler flags to enable sanitizers:
-fsanitize=address : Enable AddressSanitizer
You can enable multiple sanitizers at once by separating them with a comma. For example:
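-fsanitize=address,undefined
Note that some sanitizers are not compatible with each other, so you may need to enable them separately. For example, AddressSanitizer cannot be combined with ThreadSanitizer or MemorySanitizer in the same build. Don’t know which others are incompatible? Just try it out.😉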
AddressSanitizer
AddressSanitizer is the most commonly used sanitizer, which detects the general memory errors in your code. Let’s start with a simple example:
First, compile the program without any sanitizers:
gcc -o test test.c && ./test
You will see that the program runs without any error or warning, and that’s how the bug goes unnoticed. Now, let’s compile the program with AddressSanitizer enabled:
gcc -fsanitize=address -o test test.c && ./test
Ka-boom!💥 The bug is now obvious: AddressSanitizer gives quite a comprehensive report. And notably, it comes with colors.
=================================================================
The report is quite detailed and tells you exactly what went wrong. You can see that the program tried to access the 11th character of the buffer, which is outside the allocated region. This is a classic buffer overflow bug, and AddressSanitizer caught it.
Besides buffer overflow, AddressSanitizer can also detect use-after-free bugs, double-free bugs, and other memory corruption bugs. It is a very powerful tool for finding memory-related bugs in your code.
UndefinedBehaviorSanitizer
UndefinedBehaviorSanitizer detects undefined behavior in your code. You may ask, what is undefined behavior? Undefined behavior is what you get when the program does something for which the C standard imposes no requirements, such as dividing an integer by zero or dereferencing a null pointer.
Without sanitizers, the program may simply crash with a floating point exception, which is not very informative. However, if you compile the program with UndefinedBehaviorSanitizer enabled, you will get a more detailed report.
gcc -fsanitize=undefined -o test test.c && ./test
Now we know exactly what went wrong, and we can fix the bug.
More Options
There are some more options you can use with sanitizers to get more detailed reports or to suppress certain warnings.
By default, the sanitizers react differently when an error is detected. On Linux, AddressSanitizer stops at the first error, prints its report, and exits with a non-zero status, while UndefinedBehaviorSanitizer prints a diagnostic and lets the program continue, which is useful if you want to see all the errors in one run. If you want AddressSanitizer to call abort() instead of exiting, for example to get a core dump or to stop in a debugger, set abort_on_error in the environment variable ASAN_OPTIONS.
ASAN_OPTIONS=abort_on_error=1
If you want to see more logs from a sanitizer, pass verbosity options through its environment variable, here LSAN_OPTIONS for LeakSanitizer.
LSAN_OPTIONS=verbosity=1:log_threads=1
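Finally, if the program you run loads dynamic libraries, AddressSanitizer may complain at startup that its runtime does not come first in the library list. You can turn that check off:
ASAN_OPTIONS=verify_asan_link_order=0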
When you want to specify multiple options, separate them with a comma (or a colon, as in the LSAN_OPTIONS example above).
ASAN_OPTIONS=abort_on_error=1,verify_asan_link_order=0
Epilogue
Sanitizers are really useful for uncovering those hidden 🐞 in your program. Use them to save your life. ᓚᘏᗢ