Princeton University
COS 217: Introduction to Programming Systems

Assignment 6: Buffer Overrun

Attack a program by exploiting a buffer overrun vulnerability.

Purpose

The purpose of this assignment is to help you learn (1) how programs are represented in machine code, (2) how stack frames are laid out in memory, and (3) how programs can be vulnerable to buffer-overrun attacks.

Background

We will provide you with a program, in both source-code form (hello.c) and executable binary form (hello). The file hello was produced from hello.c using the gcc command with the -O option.

The program asks for your name and then prints something like this (here "Bob" is the user's input, and the other lines are program output):

% hello
What is your name?
Bob
Thank you, Bob.
I recommend that you get a grade of D on this assignment.

However, the author of the program has inexplicably forgotten to do bounds-checking on the array into which the program reads its input, and so the program is vulnerable to attack.

Your Task

Your job is to provide input "data" to this program so that it prints something more like this:

% hello < data
What is your name?
Thank you, Bob.
I recommend that you get a grade of A on this assignment.

As you can see from reading the program, it is not designed to give anyone an A under any circumstances. However, it is programmed sloppily: it reads the input into a buffer, but forgets to check whether the input fits. This means that a too-long input can overwrite other important memory, and you can trick the program into giving you an A.
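To make the missing check concrete, here is a minimal sketch of a bounds-checked reader. It is not part of hello.c: the name readStringChecked, the FILE* parameter, and the choice to silently discard excess characters are our assumptions, made only for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define BUFSIZE 30

/* Read one line from in into s, storing at most BUFSIZE - 1
   characters followed by a terminating '\0'. Any further characters
   on the line are read and discarded. Return the number of
   characters stored. (Hypothetical helper, not in hello.c.) */
int readStringChecked(FILE *in, char *s) {
   int i = 0;
   int c;

   assert(in != NULL);
   assert(s != NULL);

   for (;;) {
      c = fgetc(in);
      if ((c == EOF) || (c == '\n'))
         break;
      if (i < BUFSIZE - 1)      /* the check that hello.c omits */
         s[i++] = (char)c;
   }
   s[i] = '\0';
   return i;
}
```

With a check like this, a too-long input is truncated instead of overwriting whatever lies beyond the end of the array.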

This assignment has several parts.

F. Fill in the blanks.

Copy this sentence to your readme file, and fill in the blanks such that the sentence is correct:

"If you were to use a buffer overrun attack to knowingly gain unauthorized access or to cause damage to other people's computers, the Computer Fraud and Abuse Act provides a maximum penalty of _______ years in prison for a first offense. However, the creator of the Melissa virus plea-bargained down to ______ months in prison."

D. Analyze the program.

Take the hello executable binary file that we have provided you, and use gdb to analyze its sections:
% gdb hello
(gdb) x/68i readString

Copy the resulting 68 lines of text into a text file named traces, and then annotate the code to explain what's going on. You should use the source code in hello.c as a reference, and indeed your annotation should just consist of showing how the machine code corresponds to the C code. You don't need an annotation for every line of machine code.

% gdb hello
(gdb) print &grade
(gdb) print grade

Place a diagram in your traces file showing the layout of the data section.

% gdb hello
(gdb) print &Name

Place a diagram in your traces file showing the layout of the bss section.

% gdb hello
(gdb) break *readString+73
(gdb) run
Type a name
(gdb) print $esp
(gdb) print $ebp
(gdb) x/??b $esp  (where ?? is the appropriate number of bytes)

Place a diagram of the stack frame layout in your traces file, indicating addresses relative to the stack pointer.

You may create your traces file jointly with one other student; if you do so, tell us who you worked with. You should do the rest of the assignment on your own (though, as usual, you may discuss problems and approaches with other students as long as you don't copy each other's programs).

C. Get the program to crash.

Write a C program named createdataC.c that produces a file named dataC, as simple as possible, that causes the hello program to generate a segmentation fault. Explain its principles of operation in one sentence as a comment within your createdataC.c program.
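Each createdata program is just a small program that writes bytes to a file. As a starting point, here is a sketch of a helper that writes a run of identical filler bytes. The name writeFiller is hypothetical, and the sketch deliberately does not say how many bytes you need; that depends on the stack frame layout you worked out in part D.

```c
#include <assert.h>
#include <stdio.h>

/* Write count copies of the byte fill to out.
   Return 1 on success, 0 on a write error.
   (Hypothetical helper for illustration.) */
int writeFiller(FILE *out, int count, int fill) {
   int i;

   assert(out != NULL);
   assert(count >= 0);

   for (i = 0; i < count; i++)
      if (fputc(fill, out) == EOF)
         return 0;
   return 1;
}
```

A main function could open dataC with fopen("dataC", "w"), call writeFiller with a count of your choosing, and then call fclose.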

B. Get the program to print "B".

Write a C program named createdataB.c that produces a file named dataB, as simple as possible, that causes the hello program to print your name and recommend a grade of "B". You can see by reading the program that, if your name is Andrew Appel, this is very easy to do. But probably your name isn't Andrew Appel.

Recommended method: overrun the buffer with a return address that jumps to a place inside of the main function.
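A return address is a 32-bit value that the IA-32 processor stores in memory least significant byte first, so your data file must contain its four bytes in that order. The following sketch shows one way to write such a value; the name putAddress is hypothetical, and the address you pass must come from your own gdb analysis.

```c
#include <assert.h>
#include <stdio.h>

/* Write the low 32 bits of addr to out, least significant byte
   first (little-endian order). Return 1 on success, 0 on a write
   error. (Hypothetical helper for illustration.) */
int putAddress(FILE *out, unsigned long addr) {
   int i;

   assert(out != NULL);

   for (i = 0; i < 4; i++)
      if (fputc((int)((addr >> (8 * i)) & 0xffUL), out) == EOF)
         return 0;
   return 1;
}
```

For example, putAddress(f, 0x11223344UL) writes the bytes 0x44, 0x33, 0x22, 0x11 in that order (the value here is a made-up placeholder, not an address in hello).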

A. Get the program to print "A".

Write a C program named createdataA.c that produces a file named dataA, as simple as possible, that causes the hello program to print your name and recommend a grade of "A".

Recommended method: overrun the buffer with a three-part byte-sequence: (1) your name, (2) a return address that points into the buffer, and (3) a short machine-language program that stores an 'A' into the right place and then jumps somewhere useful.
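Keep in mind how readString consumes its input: it stops only at a newline or at end-of-file, and it copies every other byte, including a zero byte, into the buffer. So your byte sequence may contain zero bytes (for example, to terminate your name), but it must not contain the byte 0x0a anywhere before its end. A small sketch of a check for that condition follows; the name isSingleLine is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if none of the n bytes of data is '\n', that is, if
   readString would consume all n bytes as a single line; return 0
   otherwise. (Hypothetical helper for illustration.) */
int isSingleLine(const unsigned char *data, size_t n) {
   size_t i;

   assert(data != NULL);

   for (i = 0; i < n; i++)
      if (data[i] == '\n')
         return 0;
   return 1;
}
```

If an address or machine-language instruction that you want to use happens to contain the byte 0x0a, you will need to choose a different address or a different instruction sequence.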

For parts B and A, if your name is very long, you may use just the first 15 characters of your name for the purposes of this assignment.

Implementation Notes:

  1. On some versions of Linux, every time the program is executed the initial stack pointer is in a different place. This makes it difficult to make an attack in which the return address points into the same data that was just read into the buffer on the stack. (Indeed, that is the purpose of varying the initial stack pointer!) However, you'll note that the data is copied from "buf" into "Name". You'll find that "Name" is reliably in the same place every time you (or we) run the program.
  2. On some versions of Linux, executing instructions from the data section causes a segmentation violation. The purpose of this is to defend against buffer overrun attacks! The "mprotect" call in our sample program is there to disable this protection. You're not required to understand or explain how this line works. Note, however, that this mechanism (even if we didn't disable it) would not defend against the "B" attack.
  3. When we grade this assignment, we will take the recommendation of the hello program into account, but this will not be the only criterion.
  4. If you work hard, you could create a data input that will exploit the buffer overrun to take over the grader's Linux process and do all sorts of damage. DO NOT DO THIS! Any deliberate attempt of this sort is a violation of the University's disciplinary code, and is also a violation of the Computer Fraud and Abuse Act (see section F above).
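For the curious, the mask in the mprotect call of note 2 can be explained with a little arithmetic. mprotect changes the protection of whole pages and requires a page-aligned starting address, so hello.c clears the low 12 bits of the address of Name. The sketch below shows that computation, assuming 4096-byte pages and 32-bit addresses; the name pageStart is hypothetical.

```c
#include <stdio.h>

/* Return the address of the first byte of the 4096-byte page that
   contains addr, by clearing the low 12 bits, as the mask
   0xfffff000 in hello.c does for a 32-bit address.
   (Hypothetical helper for illustration.) */
unsigned long pageStart(unsigned long addr) {
   return addr & 0xfffff000UL;
}
```

For instance, pageStart(0x080497a4UL) is 0x08049000 (a made-up address, not necessarily where Name lives). Although the call passes a length of 1, the protection change applies to the entire page containing that byte.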

Logistics

Create your programs on hats using the bash shell, xemacs, gcc, and gdb.

The directory /u/cos217/Assignment6 contains the hello.c and hello files. It also contains a makefile that you might find helpful during development.

Create a readme text file that contains:

Submit your work electronically on hats via the command:

/u/cos217/bin/i686/submit 6 traces createdataC.c createdataB.c createdataA.c readme

Grading

We will grade your work on correctness and design. We will consider understandability to be an important aspect of good design. To encourage good coding practices, we will compile using "gcc -Wall -ansi -pedantic" and take off points based on warning messages during compilation.

The hello.c Program

#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <stdlib.h>

#define BUFSIZE 30

char grade = 'D';
char Name[BUFSIZE];

void readString(char *s) {
   char buf[BUFSIZE];
   int i = 0; 
   int c;

   for (;;) {
      c = fgetc(stdin);
      if ((c == EOF) || (c == '\n')) 
         break;
      buf[i++] = c;
   }
   buf[i] = 0;

   for (i = 0; i < BUFSIZE; i++) 
      s[i] = buf[i];
}


int main(void) {
   mprotect((void*)((unsigned int)Name & 0xfffff000), 1,
            PROT_READ | PROT_WRITE | PROT_EXEC);

   printf("What is your name?\n");
   readString(Name);

   if (strcmp(Name, "Andrew Appel") == 0) 
      grade = 'B';

   printf("Thank you, %s.\n", Name);
   printf("I recommend that you get a grade of %c on this assignment.\n", 
          grade);

   exit(0);
}