08.27.03


By John Viega and Matt Messier

Eavesdropping attacks are often easy to launch, but most people don't worry about them in their applications. Instead, they tend to worry about what malicious things can be done to the machine on which the application is running. Most people are far more worried about active attacks than they are about passive attacks.

Nearly every active attack out there is the result of some kind of input from an attacker. Secure programming is about making sure that inputs from bad people do not do bad things. Indeed, most of the soon-to-be-released Secure Programming Cookbook for C and C++ addresses how to deal with malicious inputs. For example, cryptography and a strong authentication protocol can help prevent attackers from capturing someone's login credentials and sending those credentials as input to the program.

If this entire cookbook focuses primarily on preventing malicious inputs, then why do we have a chapter of recipes specifically devoted to this topic? It's because this chapter is about one important class of defensive techniques: input validation.

In Recipe 3.3 below on preventing buffer overflows, and in all of the recipes in the book's "Input Validation" chapter, we assume that people are connected to our software, and that some of them may send malicious data (even if we think there is a trusted client on the other end). One thing we really care about is this: "What does our application do with that data? In particular, does the program take data that should be untrusted and do something potentially security-critical with it? More importantly, can any untrusted data be used to manipulate the application or the underlying system in a way that has security implications?"

Recipe 3.3: Preventing Buffer Overflows

Problem

C and C++ do not perform array-bounds checking, which turns out to be a security-critical issue, particularly in handling strings. The risks increase even more dramatically when user-controlled data is on the program stack (i.e., is a local variable).

Solution

There are many solutions to this problem, but none that are satisfying in every situation. You may want to rely on operational protections (such as StackGuard from Immunix), use a library for safe string handling, or even use a different programming language.

Discussion

Buffer overflows get a lot of attention in the technical world, partially because they constitute one of the largest classes of security problems in code, but also because they have been around for a long time, are easy to get rid of, and yet still are a huge problem.

Buffer overflows are generally very easy for a C or C++ programmer to understand. An experienced programmer has invariably written off the end of an array, or indexed into the wrong memory because he improperly checked the value of the index variable.

Because we assume that you are a C or C++ programmer, we won't insult your intelligence by explaining buffer overflows to you. If you do not already understand the concept, you can consult many other software security books, including Building Secure Software. In this recipe, we won't even focus so much on why buffer overflows are such a big deal. Other resources can help you understand that if you're insatiably curious. Instead, we'll focus on state-of-the-art strategies for mitigating these problems.

Most languages do not have this problem at all, because they ensure that writes to memory are always in bounds. Sometimes, this can be done at compile time, but generally it is done dynamically, right before data gets written. The C and C++ philosophy is different -- you are given the ability to eke out more speed, even if it means that you risk shooting yourself in the foot.

String Handling in C and C++

Unfortunately, in C and C++, it is not only possible to overflow buffers -- it is easy, particularly when dealing with strings. The problem is that C strings are not high-level data types; they are arrays of characters. The major consequence of this nonabstraction is that the language does not manage the length of strings; you have to do it yourself. The only time C ever cares about the length of a string is in the standard library, and the length is not related to the allocated size at all -- instead, it is delimited by a 0-valued (NULL) byte. Needless to say, this can be extremely error-prone.
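To make the gap between a string's length and its allocated size concrete, here is a minimal sketch. The helper name bytes_free() is our own invention for illustration, not a standard function; it assumes the caller knows the buffer's capacity, because the language will not supply it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports how many more characters fit in a buffer
 * of `capacity` bytes that currently holds a terminated string, leaving
 * room for the terminator. Returns 0 if no terminator is found within
 * `capacity` bytes, meaning the buffer does not hold a valid string. */
size_t bytes_free(const char *buf, size_t capacity)
{
    /* memchr() never looks past `capacity` bytes, unlike strlen(). */
    const char *end = memchr(buf, '\0', capacity);
    if (end == NULL)
        return 0;
    return capacity - (size_t)(end - buf) - 1;
}
```

For a declaration such as char buf[16] = "hello", strlen(buf) is 5 while sizeof buf is 16, and bytes_free(buf, sizeof buf) reports 10. Only the programmer can connect those two numbers.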

One of the simplest examples is the ANSI C standard library function, gets():

char *gets(char *str);

This function reads data from the standard input device into the memory pointed to by str until there is a newline or until the end of file is reached. It then returns a pointer to the buffer. In addition, the function NULL-terminates the buffer.

The problem with this function is that no matter how big the buffer is, an attacker can always stick more data into the buffer than it is designed to hold, simply by avoiding the newline.
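One common way to avoid the problem is to read with fgets(), which takes the buffer size as an argument. The sketch below is one possible wrapper, not the only approach; the name read_line() and its return-value convention are assumptions made for this example.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets(). Reads one line from `in` into
 * `buf`, which holds `size` bytes, and strips the trailing newline.
 * Returns 1 if the whole line fit, 0 if it was truncated (the rest of
 * the line is discarded), and -1 on end of file, error, or bad size. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size < 2 || size > INT_MAX)
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline stored: skip to the end of the line and note whether
     * any characters had to be thrown away. */
    int truncated = 0;
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

However much data the sender supplies, fgets() stores at most size - 1 characters plus the terminator, so the buffer cannot be overrun.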

If the buffer in question is a local variable or otherwise lives on the program stack, then the attacker can often force the program to execute arbitrary code by overwriting important data on the stack. This is called a stack-smashing attack. Even when the buffer is heap allocated (that is, it is allocated with malloc() or new), a buffer overflow can be security-critical if an attacker can write over critical data that happens to be in nearby memory.

There are plenty of other places where it is easy to overflow strings. Pretty much any time you perform an operation that writes to a "string," there is room for a problem. One famous example is strcpy():

char *strcpy(char *dst, const char *src);

This function copies bytes from the address indicated by src into the buffer pointed to by dst, up to and including the first NULL byte in src. Then it returns dst. No effort is made to ensure that the dst buffer is big enough to hold the contents of the src buffer. Because the language does not track allocated sizes, there is no way for the function to do so.
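Since strcpy() cannot know the destination size, the caller has to pass it along and check it. The following is a minimal sketch of that idea; checked_copy() is a hypothetical name, and refusing the copy outright (rather than truncating) is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst` only if the string and
 * its terminator fit in `dst_size` bytes. Returns 0 on success, or -1
 * if `src` is too long, in which case `dst` is left untouched. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* +1 for the terminator */
    if (needed > dst_size)
        return -1;
    memcpy(dst, src, needed);
    return 0;
}
```

A caller would typically write checked_copy(name, sizeof name, input) and treat a return value of -1 as invalid input.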

To help alleviate the problems with functions like strcpy() that have no way of determining whether the destination buffer is big enough to hold the result from their respective operations, there are also functions like strncpy():

char *strncpy(char *dst, const char *src, size_t len);

The strncpy() function is certainly an improvement over strcpy(), but there are still problems with it. Most notably, if the source buffer contains more data than the limit imposed by the len argument, the destination buffer will not be NULL-terminated. This leads to the need for the programmer to ensure that the destination buffer is NULL-terminated. Unfortunately, the programmer often forgets to do so. There are two reasons for this failure:

  • It's an additional step for what should be a simple operation.

  • Many programmers do not realize that the destination buffer may not be NULL-terminated.
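The extra step is small once it is wrapped in a helper. Here is a minimal sketch, assuming the caller passes the full size of the destination; the name copy_terminated() is ours, for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 bytes of `src` with
 * strncpy() and then terminates `dst` explicitly, which strncpy() does
 * not do when `src` is too long. Returns 1 if `src` was truncated and
 * 0 otherwise. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 1;                      /* no room for anything */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';          /* the step that is often forgotten */
    return strlen(src) >= dst_size;
}
```

Without the explicit terminator, copying "abcdef" into a four-byte buffer with a limit of 4 would leave the bytes 'a', 'b', 'c', 'd' and no terminator at all.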

The problems with strncpy() are further complicated by the fact that a similar function, strncat(), treats its length-limiting argument in a completely different manner. The difference in behavior serves only to confuse programmers, and more often than not, mistakes are made. Certainly, we recommend using strncpy() over using strcpy(); however, there are better solutions.
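To show the difference: the limit passed to strncat() is the maximum number of characters to append, not the size of the destination, and strncat() always writes a terminator after them. The caller therefore has to compute the remaining room. The sketch below does so; append_bounded() is a hypothetical name and its return convention is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends as much of `src` as fits in `dst`, a
 * buffer of `dst_size` bytes that already holds a terminated string.
 * Returns 0 if everything fit, 1 if `src` was truncated, and -1 if
 * `dst` has no terminator within `dst_size` bytes. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;

    /* Room left for new characters, excluding the terminator that
     * strncat() adds on its own. */
    size_t room = dst_size - (size_t)(end - dst) - 1;
    strncat(dst, src, room);
    return strlen(src) > room ? 1 : 0;
}
```

Passing sizeof dst directly as the third argument of strncat(), by analogy with strncpy(), is exactly the mistake this wrapper is meant to prevent.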

OpenBSD 2.4 introduced two new functions, strlcpy() and strlcat(), that behave consistently and tell the caller how much space the destination buffer would need for the operation to complete without truncating the result. For both functions, the length limit is the total size of the destination buffer, and the destination buffer is always NULL-terminated, even if the result must be truncated.
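Not every platform ships these functions, so here is a rough sketch of the copy semantics just described. It is written for illustration and is not the OpenBSD source; the name sketch_strlcpy() is hypothetical and chosen to avoid clashing with a system-provided strlcpy().

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for strlcpy(): copies at most size - 1 bytes of
 * `src` into `dst` and always terminates `dst` when size > 0. Returns
 * the length of `src`, so a result >= size means truncation occurred. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);
    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller can detect truncation with a single comparison, for example if (sketch_strlcpy(dst, src, sizeof dst) >= sizeof dst), and the returned length plus one is the buffer size that would have been sufficient.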

First appeared at the O'Reilly Network



About the Book:

Secure Programming Cookbook for C and C++ is an important new resource for developers serious about writing secure code for Unix® (including Linux®) and Windows® environments. This essential code companion covers a wide range of topics, including safe initialization, access control, input validation, symmetric and public key cryptography, cryptographic hashes and MACs, authentication and key exchange, PKI, random numbers, and anti-tampering.


Read this Newsletter at: http://www.cprogrammingtrends.com/2003/0827.html

-- CProgrammingTrends is an iEntry, Inc. publication --
2003 iEntry, Inc. All Rights Reserved