Semmle CTF 1: SEGV hunt
We created this CTF challenge to help you quickly learn Semmle CodeQL. The objective is to find a critical buffer overflow bug in glibc using CodeQL, our simple code query language. To capture the flag, you'll need to refine your query to increase its precision using this step-by-step guide.
The goal of this challenge is to find unsafe uses of alloca in the GNU C Library (glibc). alloca is used to allocate a buffer on the stack. It is usually implemented by simply subtracting the size parameter from the stack pointer and returning the new value of the stack pointer. This means that it has two important benefits:
The memory allocated by alloca is automatically freed when the current function returns.
It is extremely fast.
But alloca can also be unsafe because it does not check whether there is enough stack space left for the buffer. If the requested buffer size is too big, then alloca might return an invalid pointer. This can cause the application to crash with a SIGSEGV when it attempts to read or write the buffer. Therefore alloca is only intended to be used to allocate small buffers. It is the programmer's responsibility to check that the size isn't too big.
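To make the trade-off concrete, here is a minimal C sketch of a bounded alloca call. It assumes a glibc-style <alloca.h> header; the function name is_palindrome_small and the MAX_STACK_BYTES limit are made up for illustration and are not taken from glibc.

```c
#include <alloca.h>   /* assumption: glibc/Linux; other platforms may differ */
#include <stddef.h>
#include <string.h>

/* Hypothetical limit chosen for this sketch; not a glibc constant. */
#define MAX_STACK_BYTES 4096

/* Returns 1 if s reads the same forwards and backwards, 0 if it does not,
   and -1 if s is too long to copy onto the stack under the limit above. */
int is_palindrome_small(const char *s)
{
    size_t n = strlen(s);

    /* alloca performs no check of its own, so the caller bounds the size. */
    if (n + 1 > MAX_STACK_BYTES)
        return -1;

    /* The buffer lives on the stack and is released when this function returns. */
    char *rev = alloca(n + 1);
    for (size_t i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';

    return strcmp(s, rev) == 0;
}
```

Without the size check, a very long input would move the stack pointer past the end of the stack, and the first write to rev could crash the program.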
The GNU C Library contains hundreds of calls to alloca. In this challenge, you will use Semmle CodeQL to find those calls. Of course, many of those calls are safe, so the main goal of the challenge is to refine your query to reduce the number of false positives. If you follow the challenge all the way to the end, you might even find a bug in glibc that is reproducible from a standard command-line application.
The quickest way to get started with CodeQL is to use LGTM's query console. However, if you prefer, you can also install CodeQL and write your queries offline. Instructions for installing CodeQL are included at the end of this document.
If you get stuck, try searching our documentation and blog posts for help and ideas. Below are a few links to help you get started:
The challenge is split into several steps, each of which contains multiple questions. However, building one query per step is sufficient.
The correct way to use alloca in glibc is to first check that the allocation is safe by calling __libc_use_alloca. You can see a good example of this at getopt.c:252. That code uses __libc_use_alloca to check if it is safe to use alloca. If not, it uses malloc instead. In this step, you will identify calls to alloca that are safe because they are guarded by a call to __libc_use_alloca.
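The guard-then-fallback pattern described above can be sketched roughly as follows. This is not the code from getopt.c: sketch_use_alloca is a hypothetical stand-in for the glibc-internal __libc_use_alloca, and its fixed threshold is an assumption made only so the example compiles outside glibc.

```c
#include <alloca.h>   /* assumption: glibc/Linux */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for __libc_use_alloca; the real function is
   internal to glibc and its threshold logic is not reproduced here. */
#define SKETCH_ALLOCA_LIMIT 1024

static int sketch_use_alloca(size_t size)
{
    return size <= SKETCH_ALLOCA_LIMIT;
}

/* Copies data into a scratch buffer and returns the sum of its bytes,
   or -1 if the heap allocation fails. *used_heap reports which
   allocator was chosen. */
long checksum_with_scratch(const unsigned char *data, size_t n, int *used_heap)
{
    int on_stack = sketch_use_alloca(n);
    unsigned char *scratch;

    if (on_stack) {
        scratch = alloca(n);      /* small request: stack is acceptable */
    } else {
        scratch = malloc(n);      /* large request: fall back to the heap */
        if (scratch == NULL)
            return -1;
    }

    memcpy(scratch, data, n);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += scratch[i];

    if (!on_stack)
        free(scratch);            /* only heap memory needs an explicit free */

    *used_heap = !on_stack;
    return sum;
}
```

A query for this step would treat the alloca call in the first branch as safe, because it is only reachable when the guard function returned a nonzero value.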
In this step, you'll use a taint tracking query to find an unsafe call to alloca where the allocation size is controlled by a value read from a file.
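As an illustration of what such a query looks for, the C sketch below shows a size that originates in a file (the taint source) and reaches the size argument of alloca (the sink). The record format, the function name sum_record, and the SKETCH_MAX_RECORD limit are all hypothetical; this is not the glibc code the challenge targets. The bounds check in the middle is what the vulnerable pattern lacks.

```c
#include <alloca.h>   /* assumption: glibc/Linux */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limit for this sketch. */
#define SKETCH_MAX_RECORD 256

/* Reads one record from f: a 4-byte length in host byte order followed by
   that many payload bytes. Returns the sum of the payload bytes, or -1 if
   the record is truncated or its length exceeds SKETCH_MAX_RECORD. */
long sum_record(FILE *f)
{
    uint32_t len;

    /* Taint source: len comes straight from the file contents. */
    if (fread(&len, sizeof len, 1, f) != 1)
        return -1;

    /* Removing this check would let the file choose the allocation size. */
    if (len > SKETCH_MAX_RECORD)
        return -1;

    /* Sink: the file-derived value is used as the alloca size. */
    unsigned char *buf = alloca(len);
    if (fread(buf, 1, len, f) != len)
        return -1;

    long sum = 0;
    for (uint32_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```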
Question 50: The GNU C Library includes several command-line applications. (It contains 24 main functions.) Demonstrate that the bug is real by showing that you can trigger a SIGSEGV in one of these command-line applications.
We hope you enjoyed this challenge! If you are interested in continuing to use CodeQL for security research, we recommend installing CodeQL on your own computer. This will enable you to run queries offline. We have also provided these offline instructions for posterity, because the query results on LGTM will change over time as the source code evolves. The instructions below, however, use a snapshot corresponding to revision 3332218, the revision for which we designed this challenge.
To run CodeQL queries offline, follow these steps:
You can download other snapshots for offline use from LGTM. For example, you can download a snapshot for the latest revision of glibc here. Every project on LGTM provides a link for downloading its latest snapshot.