Buffer overflow

From Citizendium
Revision as of 02:32, 9 May 2007 by imported>Pat Palmer (changed link to compiler)

In computing and computer security, a buffer overflow occurs when more data is written to a memory buffer than the buffer can hold. In programs that do not check for this condition, the excess data is written to memory beyond the buffer, overwriting other data. This error is one of the most common types of computer security flaw, and its prevalence is due to the widespread use of languages such as C, which perform no automatic bounds checking to prevent buffer overflows.

Other names for this error include "buffer overrun" and, when the overwritten buffer lies on the stack, "smashing the stack."[1]

Technical Explanation

An execution stack exists for every process running on a computer. Parts of the stack contain program variables, and other parts contain control information such as saved program counter addresses (return addresses). Many programs, often because of the nature of the language in which they were written, do not take adequate steps to ensure that invalid input cannot cause them to overwrite their own stacks. As a result, it is possible to coerce such programs into overwriting their stacks with data chosen by an attacker.

By overwriting the stack, an attacker may modify variables within the program or, by overwriting a saved program counter address, redirect execution to other code, potentially code that the attacker has placed on the stack.

The results range from a program crash to hijacking of the program's execution context (and therefore its security context). This simple concept has had profound implications for computer security.
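The overwrite described above can be illustrated without touching a real stack. The following is a minimal sketch, assuming a hypothetical struct fake_frame that stands in for a stack frame; the names unchecked_copy and saved_pc_was_overwritten are invented for this illustration. The copy is deliberately confined to the struct, so the demonstration itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a stack frame: a small buffer followed directly
   by a slot standing in for the saved program counter. */
struct fake_frame {
    char     buf[8];
    uint64_t saved_pc;
};

/* Copies len bytes to the start of the frame without comparing len to
   sizeof f->buf, which is the mistake an unchecked copy makes. */
void unchecked_copy(struct fake_frame *f, const void *input, size_t len)
{
    assert(len <= sizeof *f);   /* keeps the simulation inside the model */
    memcpy(f, input, len);
}

/* Returns 1 if the simulated saved program counter was changed by the copy,
   0 if it survived intact. */
int saved_pc_was_overwritten(const void *input, size_t len)
{
    const uint64_t legitimate = UINT64_C(0x1122334455667788); /* arbitrary value */
    struct fake_frame f;

    memset(&f, 0, sizeof f);
    f.saved_pc = legitimate;
    unchecked_copy(&f, input, len);
    return f.saved_pc != legitimate;
}
```

Input that fits in the 8-byte buffer leaves the simulated saved program counter alone; 16 bytes of input replace it with bytes chosen by whoever supplied the input.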

Attempts at Overcoming This Vulnerability

Attempts to overcome this vulnerability proactively (rather than by simply issuing software patches) have had limited success. Researchers in computer security have attempted to solve the buffer overflow problem both in software and in hardware. The best way to ensure that this attack vector is not successful is to write code that validates input wherever necessary.
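As an illustration of input validation, the sketch below assumes a hypothetical message format in which the first byte declares the length of the payload that follows. The function name extract_payload and the format are assumptions made for this example only. The declared length is checked against both the amount of data actually received and the size of the destination before anything is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical format: msg[0] declares the payload length, and the payload
   starts at msg[1]. Returns the number of bytes copied, or -1 if the
   message is rejected. */
int extract_payload(const unsigned char *msg, size_t msglen,
                    unsigned char *out, size_t outsize)
{
    size_t declared;

    assert(msg != NULL && out != NULL);
    if (msglen < 1)
        return -1;                  /* no length byte at all */
    declared = msg[0];
    if (declared > msglen - 1)
        return -1;                  /* claims more data than was received */
    if (declared > outsize)
        return -1;                  /* would not fit in the destination */
    memcpy(out, msg + 1, declared);
    return (int)declared;
}
```

Rejecting the message outright, rather than truncating it silently, is a design choice of this sketch; either way the copy can never exceed the destination buffer.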

Software Debugging Tools

Valgrind is an open source suite of tools designed to assist with debugging and profiling software. Its Memcheck tool tracks how a program allocates and accesses memory, which helps find the memory errors that make buffer overflow attacks possible.

It runs compiled code (not source code) on a simulated processor, working on many of the same principles as a software CPU emulator, and intercepts relevant function calls, allowing for fine-grained buffer overflow detection on the heap.[2] One drawback of Valgrind is its speed: because it acts as an emulator, code runs considerably slower than it would on native hardware.

Splint is another open source toolset which performs static program analysis of source code to detect common programming and security errors in C programs. It can be used with "plain old" source code, or with source code that has been specifically annotated. Splint can help detect a large number of errors before a program is deployed.[3]

These tools detect potential buffer overflows by two different means: Valgrind checks compiled executables as they run, while Splint analyzes source code before it is compiled.

By The Operating System

Some operating systems, most notably the Unix variant OpenBSD, employ address randomization in an attempt to thwart many buffer overflow attacks. In this method, the operating system attempts to place allocated memory at random addresses when a program calls the library function malloc() or the system call mmap(). This method foils attacks that assume some address relationship exists between blocks of memory, such as one object occupying the space immediately preceding another.

Similarly, OpenBSD attempts to insert so-called guard pages before and after allocated blocks of memory. Because guard pages are marked inaccessible in the memory map used by the memory management unit, the operating system is notified of any read or write to a guard page. Thus, buffer overflows that escape one allocated block are trapped before they can reach another.
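The guard page idea can be reproduced by hand with the POSIX calls mmap() and mprotect(). The following is a minimal sketch of the concept only, not OpenBSD's allocator; the function names alloc_with_guard and write_is_trapped are invented for this illustration, and the sketch assumes a POSIX system that supports anonymous mappings and fork(). The probing write runs in a child process so that the trap does not terminate the caller.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps one usable page followed by one inaccessible guard page.
   Stores the usable length in *usable_len and returns the usable page,
   or NULL on failure. */
unsigned char *alloc_with_guard(size_t *usable_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *m;

    assert(usable_len != NULL);
    if (page <= 0)
        return NULL;
    m = mmap(NULL, 2 * (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;
    /* Revoke all access to the second page: this is the guard page. */
    if (mprotect((unsigned char *)m + page, (size_t)page, PROT_NONE) != 0) {
        munmap(m, 2 * (size_t)page);
        return NULL;
    }
    *usable_len = (size_t)page;
    return m;
}

/* Writes one byte in a child process, either to the last usable byte
   (past_end == 0) or to the first byte of the guard page (past_end != 0).
   Returns 1 if the child failed to finish normally, 0 if the write
   succeeded, and -1 if the experiment could not be set up. */
int write_is_trapped(int past_end)
{
    size_t len;
    unsigned char *p = alloc_with_guard(&len);
    pid_t pid;
    int status = 0;
    int result = -1;

    if (p == NULL)
        return -1;
    pid = fork();
    if (pid == 0) {
        volatile unsigned char *q = p;
        q[past_end ? len : len - 1] = 0x41;
        _exit(0);                   /* reached only if the write was allowed */
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        result = !(WIFEXITED(status) && WEXITSTATUS(status) == 0);
    munmap(p, 2 * len);
    return result;
}
```

A write to the last usable byte succeeds, while a write one byte further lands on the guard page and the child is stopped by the operating system instead of silently corrupting a neighboring block.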

Other operating systems, such as Hewlett-Packard's MPE, attempted to manage memory more directly by recognizing program stack bounds and preventing memory writes within those bounds. This meant that programming languages that allowed modification of code during execution (e.g., the infamous COBOL "ALTER" verb) were stripped of that capability.

As Language Semantics or Library Functionality

One major cause of buffer overflow vulnerabilities in software systems has been the use of unsafe string manipulation functions, most notably C's strcpy() and strcat(). These functions copy bytes until they reach the terminating null character of the source string, without regard to the size of the destination buffer, and thus can overflow it. The programmer must check the input by hand before calling them, a practice known as "input validation."

The first improvements over these two functions were strncpy() and strncat(), which take as an extra parameter the maximum number of bytes to copy. However, the semantics of these functions are difficult for programmers to understand, and they have boundary cases that are commonly misunderstood; for example, strncpy() does not null-terminate the destination when the source is at least as long as the limit. More recently, the OpenBSD project has implemented the strlcpy() and strlcat() functions, which take the full size of the destination buffer and offer simplified semantics, and are presumably safer to use. These two functions have become common on other Unix-like operating systems.
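The simplified semantics can be sketched in a few lines. The helper below, bounded_copy, is a hypothetical function modeled on the documented behavior of strlcpy(); it is not OpenBSD's implementation, and it is written out here because strlcpy() is not available in every C library. It never writes more than dstsize bytes, null-terminates the result whenever dstsize is nonzero, and returns the length of the source so that the caller can detect truncation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modeled on strlcpy() semantics. Copies at most
   dstsize - 1 bytes of src into dst, terminates dst when dstsize > 0,
   and returns strlen(src). */
size_t bounded_copy(char *dst, size_t dstsize, const char *src)
{
    size_t srclen;

    assert(src != NULL);
    srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        assert(dst != NULL);
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 1 if src fit in dst without truncation, 0 otherwise. */
int copy_fits(char *dst, size_t dstsize, const char *src)
{
    return bounded_copy(dst, dstsize, src) < dstsize;
}
```

Because the return value is the length the copy would have needed, a single comparison against the buffer size tells the caller whether the input was truncated, which is the check that strcpy() gives no opportunity to make.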

Another approach to the same goal is simply to replace unsafe languages, such as C, with higher-level languages, such as Perl, Java, or many others. Proponents argue that, because these languages include data structures with automatic bounds checking and automatic memory management, programs written in them are less susceptible to buffer overflow attacks.

As Compiler Features

Main Article: canary value

Several groups have implemented security enhancements to compilers, hoping to produce more secure code without forcing programmers to change their applications' source code. Notable examples are StackGuard and ProPolice.

The method is simple. The compiler generates additional instructions so that the function prologue adds a so-called canary value to the stack frame between the return address and the local variables. This canary value is a random number chosen when the program begins. Additional instructions inserted into the function epilogue then check the canary value as it appears in the stack frame. If it is incorrect, the new instructions put the program into a fail-safe mode (usually immediate termination), so as to limit the program's worst-case behavior while under attack. Canary values work because most stack smashing attacks that successfully overwrite the return address will also overwrite the canary value, and it is unlikely that the attacker will be able to guess the canary value.[4]
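The prologue and epilogue checks can be imitated by hand to show why the scheme works. The following is a simulation under stated assumptions, not compiler output: struct guarded_frame is a hypothetical model of a stack frame, the canary is passed in as a parameter rather than chosen at program start, and the function run_guarded reports the result instead of terminating the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected stack frame: the canary sits between
   the local buffer and the slot standing in for the return address. */
struct guarded_frame {
    char     buf[8];
    uint64_t canary;
    uint64_t saved_pc;
};

/* Simulates one function call. Returns 1 if the canary is intact when the
   "epilogue" runs, 0 if an overflow has modified it. */
int run_guarded(const void *input, size_t len, uint64_t canary)
{
    struct guarded_frame f;

    assert(len <= sizeof f);        /* keeps the simulation inside the model */
    memset(&f, 0, sizeof f);
    f.canary = canary;              /* "prologue": place the canary */
    f.saved_pc = UINT64_C(0x1122334455667788); /* arbitrary value */

    memcpy(&f, input, len);         /* unchecked copy into the frame */

    return f.canary == canary;      /* "epilogue": verify the canary */
}
```

An overflow long enough to reach the simulated return address must pass through the canary first, so the epilogue check fails before the altered address would ever be used. Real implementations terminate the program at that point.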

At least four attacks have been developed against this sort of protection. [5]

In Hardware

Processor manufacturers have attempted to create a hardware solution to this problem, in which memory is divided into areas marked as containing instructions, which may be executed, and areas marked as containing data, which must never be executed. Code that an attacker has written into a data area such as the stack therefore cannot run. This solution, when used properly, can prevent buffer overflow attacks in many cases.

AMD developed and marketed this feature first, and named it the NX (No eXecute) bit. Intel's name for this feature is the XD (eXecute Disable) bit; the two technologies are functionally the same and serve the same purpose.

External Links

"Smashing the Stack for Fun and Profit": this article is somewhat dated, but it covers the flaw in great technical detail.

References