In Recipe 3.3 below on preventing buffer overflows, and in all of the recipes
in the book's "Input Validation" chapter, we assume that people are connected
to our software, and that some of them may send malicious data (even if we think
there is a trusted client on the other end). One thing we really care about is
this: "What does our application do with that data? In particular, does the program
take data that should be untrusted and do something potentially security-critical
with it? More importantly, can any untrusted data be used to manipulate the application
or the underlying system in a way that has security implications?"
Recipe 3.3: Preventing Buffer Overflows
Problem
C and C++ do not perform array-bounds checking, which turns out to be a security-critical
issue, particularly in handling strings. The risks increase even more dramatically
when user-controlled data is on the program stack (i.e., is a local variable).
Solution
There are many solutions to this problem, but none that is satisfactory in every
situation. You may want to rely on operational protections (such as StackGuard
from Immunix), use a library for safe string handling, or even use a different
programming language.
Discussion
Buffer overflows get a lot of attention in the technical world, partially because
they constitute one of the largest classes of security problems in code, but also
because they have been around for a long time, are easy to get rid of, and yet
still are a huge problem.
Buffer overflows are generally very easy for a C or C++ programmer to understand.
An experienced programmer has invariably written off the end of an array, or indexed
into the wrong memory because he improperly checked the value of the index variable.
Because we assume that you are a C or C++ programmer, we won't insult your
intelligence by explaining buffer overflows to you. If you do not already understand
the concept, you can consult many other software security books, including Building
Secure Software. In this recipe, we won't even focus so much on why buffer
overflows are such a big deal. Other resources can help you understand that if
you're insatiably curious. Instead, we'll focus on state-of-the-art strategies
for mitigating these problems.
Most languages do not have this problem at all, because they ensure that writes
to memory are always in bounds. Sometimes, this can be done at compile time, but
generally it is done dynamically, right before data gets written. The C and C++
philosophy is different -- you are given the ability to eke out more speed, even
if it means that you risk shooting yourself in the foot.
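To make the idea of a dynamic check concrete, here is a minimal sketch of what
"checking right before data gets written" might look like if you did it by hand
in C. The int_slice type and the checked_store() function are hypothetical names
invented for this illustration; they are not part of any standard library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type: a pointer bundled with the number of elements it
 * may legally address. C itself never records this length for you. */
typedef struct {
    int    *data;
    size_t  len;
} int_slice;

/* Store value at s.data[index] only if index is in bounds.
 * Returns true if the write happened, false if it was refused. */
bool checked_store(int_slice s, size_t index, int value)
{
    if (index >= s.len)
        return false;      /* out of bounds: refuse instead of corrupting memory */
    s.data[index] = value;
    return true;
}
```

The comparison on every write is the kind of cost that the C philosophy lets
you skip, which is exactly why skipping it is so easy to get wrong.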
String Handling in C and C++
Unfortunately, in C and C++, it is not only possible to overflow buffers --
it is easy, particularly when dealing with strings. The problem is that C strings
are not high-level data types; they are arrays of characters. The major consequence
of this nonabstraction is that the language does not manage the length of strings;
you have to do it yourself. The only time C ever cares about the length of a string
is in the standard library, and the length is not related to the allocated size
at all -- instead, it is delimited by a 0-valued (NULL) byte. Needless
to say, this can be extremely error-prone.
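The gap between "length of the string" and "size of the allocation" can be
sketched as follows. The space_left() helper is a hypothetical function written
for this example; note that the caller has to pass the capacity in, because
nothing in the language remembers it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many more characters (not counting the
 * terminator) fit into a buffer of `capacity` bytes that currently holds
 * the string in `buf`. The search for the 0-valued byte is confined to
 * the allocation, so a missing terminator cannot lead us past the end. */
size_t space_left(const char *buf, size_t capacity)
{
    const char *end = memchr(buf, '\0', capacity);
    if (end == NULL)
        return 0;          /* no terminator inside the buffer: nothing fits */
    return capacity - (size_t)(end - buf) - 1;
}
```

For a 16-byte array holding "hello", strlen() reports 5 while the allocation is
16 bytes; only the programmer knows the second number.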
One of the simplest examples is the ANSI C standard library function, gets():
char *gets(char *str);
This function reads data from the standard input device into the memory pointed
to by str until there is a newline or until the end of file is reached. It then
returns a pointer to the buffer. In addition, the function NULL-terminates the
buffer.
The problem with this function is that no matter how big the buffer is, an
attacker can always stick more data into the buffer than it is designed to hold,
simply by avoiding the newline.
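One common way to avoid this particular problem is to read with the standard
fgets() function, which takes the buffer size as an argument. This is a
substitute technique, not something the text above prescribes, and the
read_line() wrapper and its return-value convention are assumptions made for
this sketch.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets().
 * Reads one line from `in` into `buf`, never writing more than `size` bytes.
 * Returns  0 if a whole line was stored (newline stripped),
 *          1 if the line was too long and the excess was discarded,
 *         -1 on end of file, read error, or an unusable size. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || size > INT_MAX)
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    char *nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';        /* complete line: drop the newline */
        return 0;
    }

    /* No newline stored: discard the rest of the line, if there is any. */
    int c;
    int truncated = 0;
    while ((c = getc(in)) != EOF && c != '\n')
        truncated = 1;
    return truncated;
}
```

Unlike gets(), the attacker cannot make this function write past the end of the
buffer by withholding the newline; the worst outcome is a truncated line that
the caller can detect.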
If the buffer in question is a local variable or otherwise lives on the program
stack, then the attacker can often force the program to execute arbitrary code
by overwriting important data on the stack. This is called a stack-smashing
attack. Even when the buffer is heap allocated (that is, it is allocated with
malloc() or new), a buffer overflow can be security-critical if an
attacker can write over critical data that happens to be in nearby memory.
There are plenty of other places where it is easy to overflow strings. Pretty
much any time you perform an operation that writes to a "string," there
is room for a problem. One famous example is strcpy():
char *strcpy(char *dst, const char *src);
This function copies bytes from the address indicated by src into
the buffer pointed to by dst, up to and including the first NULL
byte in src. Then it returns dst. No effort is made
to ensure that the dst buffer is big enough to hold the contents
of the src buffer. Because the language does not track allocated
sizes, there is no way for the function to do so.
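Since strcpy() cannot know the size of dst, the caller has to supply it. A
minimal sketch of a copy that refuses to overflow might look like this;
copy_checked() is a hypothetical name, and failing outright (rather than
truncating) is a design choice made only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, terminator
 * included. Returns 0 on success, -1 if dst is too small, in which case
 * dst is left untouched. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t need = strlen(src) + 1;   /* characters plus the terminator */
    if (need > dst_size)
        return -1;
    memcpy(dst, src, need);
    return 0;
}
```

The check only works if dst_size really is the allocated size of dst, which is
still the programmer's responsibility.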
To help alleviate the problems with functions like strcpy() that
have no way of determining whether the destination buffer is big enough to hold
the result from their respective operations, there are also functions like strncpy():
char *strncpy(char *dst, const char *src, size_t len);
The strncpy() function is certainly an improvement over strcpy(),
but there are still problems with it. Most notably, if the source buffer contains
at least as much data as the limit imposed by the len argument (that is, if the
source string is len characters or longer), the destination buffer will not be
NULL-terminated. The programmer must therefore terminate the destination
buffer explicitly.
Unfortunately, the programmer often forgets to do so. There are two reasons for
this failure:
The problems with strncpy() are further complicated by the fact
that a similar function, strncat(), treats its length-limiting argument
in a completely different manner. The difference in behavior serves only to confuse
programmers, and more often than not, mistakes are made. Certainly, we recommend
using strncpy() over using strcpy(); however, there are
better solutions.
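The two conventions can be put side by side in a short sketch. The wrapper
names copy_truncating() and append_truncating() are hypothetical; the calls to
strncpy() and strncat() inside them are the standard ones. Both wrappers assume
that dst_size is the true allocated size of dst, and append_truncating() also
assumes that dst already holds a terminated string.

```c
#include <stddef.h>
#include <string.h>

/* strncpy(): the limit is the number of bytes to write, and no terminator
 * is added when src is that long or longer, so we add it ourselves. */
void copy_truncating(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* strncat(): the limit is the number of characters to append, and a
 * terminator is always written after them, so the limit must be the
 * space remaining minus one, not the total buffer size. */
void append_truncating(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    if (used + 1 >= dst_size)
        return;                         /* no room for even one character */
    strncat(dst, src, dst_size - used - 1);
}
```

Passing the full buffer size to strncat(), as one would to strncpy(), is the
kind of mistake the differing conventions invite.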
OpenBSD 2.4 introduced two new functions, strlcpy() and strlcat(),
that are consistent in their behavior, and they provide an indication back to
the caller of how much space in the destination buffer would be required to successfully
complete their respective operations without truncating the results. For both
functions, the length limit indicates the maximum size of the destination buffer,
and the destination buffer is always NULL-terminated, even if the
result must be truncated.
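Because these functions are not part of ISO C and are missing from some
platforms, the behavior described above is easiest to show with a stand-in. The
sketch_strlcpy() function below is a hypothetical reimplementation written for
illustration, not the OpenBSD source; it assumes the usual strlcpy() contract,
in which the return value is the length of the source string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(). `size` is the full size of dst.
 * Copies as much of src as fits, always terminates dst when size > 0,
 * and returns strlen(src) so the caller can detect truncation. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size != 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller detects truncation by comparing the return value with the buffer size:
if the result is greater than or equal to size, the copy was cut short, and the
result plus one is the buffer size that would have been needed.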
First appeared at the O'Reilly Network
About the Author:
Secure Programming Cookbook for C and C++ is an important new resource for developers
serious about writing secure code for Unix® (including Linux®) and Windows® environments.
This essential code companion covers a wide range of topics, including safe initialization,
access control, input validation, symmetric and public key cryptography, cryptographic
hashes and MACs, authentication and key exchange, PKI, random numbers, and anti-tampering.
Read this Newsletter at: http://www.cprogrammingtrends.com/2003/0827.html