Understanding the Format String Vulnerability

Robin Sandhu
4 min read · Jun 17, 2021

If you have ever programmed in C, you are probably familiar with the printf function, which prints formatted data to the stdout stream. But even this simple-looking function can introduce a high-severity vulnerability if programmers are not careful. In this blog post, we’ll look at what exactly a format string vulnerability is and why it occurs.

What is a Format String Vulnerability?

A format string exploit occurs when the data in an input string is evaluated as a command by the application.

In technical terms, when data supplied to a program ends up being passed to printf as the format string, the program is exposed to a format string vulnerability.

But where is the problem?

Before looking at why passing a data string to printf as the format string makes a program vulnerable, we need to understand how optional arguments work in C.

Let’s take a look at the function signature of printf:

int printf(const char *format, ...);

The argument list of printf contains:
1. Format string
2. Zero or more optional arguments

‘...’ (the ellipsis) is special syntax in a C parameter list that allows any number of additional arguments to be passed to the function.

But how does that even work?

Spoiler: Pointers

The following code snippet shows an implementation of a function with optional arguments:
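A minimal sketch of such a function, assuming a hypothetical name `myprint` and assuming that every optional argument pair is an `int` followed by a `double`. It is laid out so that the line numbers cited in the explanation (6, 8, 10, 11 and 13) line up with it:

```c
#include <stdarg.h>
#include <stdio.h>
/* Hypothetical example: Narg = number of (int, double) pairs that follow. */
int myprint(int Narg, ...) {
    int i;
    va_list ap;                             /* cursor over the optional arguments */

    va_start(ap, Narg);                     /* position it right after Narg */
    for (i = 0; i < Narg; i++) {
        printf("%d ", va_arg(ap, int));     /* fetch an int, then advance */
        printf("%f\n", va_arg(ap, double)); /* fetch a double, then advance */
    }
    va_end(ap);                             /* clean up once all arguments are read */
    return i;                               /* number of pairs consumed */
}
```

Calling `myprint(2, 2, 3.5, 3, 4.5)` prints the two lines `2 3.500000` and `3 4.500000` and returns 2. Nothing stops a caller from passing a larger `Narg` than the number of pairs actually supplied, which is exactly the weakness discussed later.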

va_list (line 6) is the type of the pointer used to access the optional arguments.
va_start() (line 8) is a macro that computes the initial position of the va_list pointer from the last non-optional argument passed to the function (Narg in the above example).
va_arg() (lines 10 and 11) is a macro that returns the current optional argument and advances the va_list pointer according to the given data type.
va_end() (line 13) is called once all the arguments have been accessed.

So, it all boils down to having a pointer pointing at the start of the optional arguments in the stack and advancing the pointer on the basis of the datatype of those arguments.

Note: under the classic 32-bit x86 calling convention (cdecl), arguments are pushed onto the stack in reverse order, so the first argument ends up at the lowest address.

Figure: Stack layout of optional parameters

printf accesses its optional arguments in exactly the same way. But instead of taking the number of arguments explicitly, as we did with the Narg parameter, it relies on the format specifiers in the format string to decide how many times, and by how much, to advance the va_list pointer.
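To make that concrete, here is a toy sketch of a printf-like function. The name `mini_printf` is hypothetical, it supports only `%d`, `%s` and `%%`, and it is not how any real C library implements printf. What it shows is that the only thing driving the `va_arg()` calls is the format string itself:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical toy printf: handles only %d, %s and %%.
   Returns how many optional arguments it fetched. */
int mini_printf(const char *format, ...) {
    va_list ap;
    int fetched = 0;

    va_start(ap, format);
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%') {            /* ordinary character: print it as-is */
            putchar(*p);
            continue;
        }
        p++;                        /* look at the conversion character */
        if (*p == 'd') {
            printf("%d", va_arg(ap, int));      /* specifier says: fetch an int */
            fetched++;
        } else if (*p == 's') {
            fputs(va_arg(ap, char *), stdout);  /* specifier says: fetch a char * */
            fetched++;
        } else if (*p == '%') {
            putchar('%');           /* "%%" prints a literal percent sign */
        } else if (*p == '\0') {
            break;                  /* lone '%' at the very end of the string */
        } else {
            putchar('%');           /* unsupported conversion: echo it unchanged */
            putchar(*p);
        }
    }
    va_end(ap);
    return fetched;
}
```

For instance, `mini_printf("ID: %d, Name: %s\n", 100, "Bob")` prints `ID: 100, Name: Bob` and returns 2. The function never learns how many arguments the caller really passed; it simply trusts the specifiers.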

If you are still confused, just take a look at the example below:
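A minimal sketch of such a call, using made-up values (the helper `print_record` and the variables `id`, `name` and `age` are assumptions for illustration):

```c
#include <stdio.h>

/* Hypothetical example: one format string, three optional arguments. */
int print_record(void) {
    int id = 100;
    int age = 25;
    const char *name = "Bob Smith";

    /* %d fetches id, %s fetches name, %d fetches age, in that order. */
    return printf("ID: %d, Name: %s, Age: %d\n", id, name, age);
}
```

The three specifiers match the three optional arguments one for one, so every fetch lands on a value the caller actually supplied. The call prints `ID: 100, Name: Bob Smith, Age: 25`.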

Here we are simply printing a formatted string using the printf function. Now, take a look at the stack layout of this example.

As you can clearly see, the va_list pointer advances up the stack for each format specifier.

FYI

Table: Format specifiers available in C

What if some optional parameter is missing?

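The va_arg() macro has no mechanism for checking whether it has reached the end of the optional argument list. So it keeps fetching data from the stack by advancing the va_list pointer, and the program ends up disclosing whatever memory contents happen to be there. This is an information leak, not a memory leak in the usual sense of unreleased allocations.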

Well there you go, now you understand this vulnerability.

Identify what is vulnerable

Let’s start by passing too few arguments to printf:

Now, if you do something like this with a literal format string, a modern compiler will typically warn about the statement, because it can already see at compile time that the printf call is missing an optional argument. But if the format string is only supplied at runtime, the program has no mechanism for detecting the missing arguments.
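A minimal sketch of that runtime case, assuming a hypothetical helper `show_input` whose parameter `str` stands in for data that came from the user:

```c
#include <stdio.h>

/* Hypothetical helper: `str` stands for data that arrived at runtime. */
int show_input(const char *str) {
    /* First call: the format string is a literal, so this is safe. */
    printf("You entered: ");

    /* Second call: the input itself is the format string. VULNERABLE:
       any specifiers inside str make printf fetch arguments that were
       never passed. */
    return printf(str);
}
```

With ordinary input such as `hello` the function behaves as expected. With input such as `%x %x %x`, printf would fetch three values that the caller never supplied and print whatever occupies those slots, which is undefined behaviour in C.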

In the snippet above, the input variable (str) is passed as the format string to the second printf call, which is what makes that call vulnerable.

How to prevent Format String Vulnerability?

  • Specify format strings as part of the program, not as input.
  • Whenever possible, use a string literal as the format string.
  • Never print a string in C by passing it directly to printf as the format string. Instead, use the "%s" format specifier in the format string and pass the input string as an argument to printf.
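The last rule can be sketched as follows. The helper names are hypothetical, and `str` again stands for untrusted input:

```c
#include <stdio.h>

/* Hypothetical helpers: `str` stands for untrusted input. */
int show_input_safely(const char *str) {
    /* The format string is a literal; str is only ever data for %s. */
    return printf("%s", str);
}

int show_input_with_fputs(const char *str) {
    /* fputs performs no format processing at all. */
    return fputs(str, stdout);
}
```

With either helper, input such as `%x %x` is printed literally as `%x %x`, because the user-controlled bytes are never interpreted as format specifiers.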

Conclusion

In this blog post, we learned about the format string vulnerability: what it is, why it occurs, and a few ways of preventing it.

Format string vulnerabilities are rare these days because they are easy to detect, but they remain a very good example of the danger of mixing code with data.

In a future blog post, we will see how format string vulnerabilities can be exploited to gain root access.
