COS 126: RSA Public-Key Cryptosystem

COS 126

RSA Public-Key Cryptosystem

Programming Assignment 10

Due: Wednesday, 11:59pm

Overview. Write a program to implement the RSA public-key cryptosystem. The RSA (Rivest-Shamir-Adleman) cryptosystem is widely used for secure communication in browsers, bank ATM machines, credit card machines, mobile phones, smart cards, and the Windows operating system. It works by manipulating large integers. In the first part of the assignment, you will write a C program that converts a text message into a sequence of integers. To thwart eavesdroppers, the RSA cryptosystem must manipulate large integers (hundreds of digits). The built-in C type int is only capable of dealing with 16 or 32 bit integers, providing little or no security. For the second part of the assignment, you will design, implement, and analyze an extended precision arithmetic data type that is capable of manipulating much larger integers. Finally, you will use this data type to write a client program that encrypts and decrypts messages using RSA.

Note: the training wheels are off on this assignment - you're starting with a blank screen, not our code. This means that you must pay particular attention to the process of building up your program from scratch. Consider carefully how your program should be structured and how you are going to implement the various functions before plunging in. Remember to thoroughly test and debug each function as you write it.

The RSA cryptosystem. As discussed in lecture S1, the RSA public key cryptosystem involves three integers d, e, and n that satisfy certain mathematical properties. The private key (d, n) is known only by Alice, while the public key (e, n) is published on the Internet. If Bob wants to send Alice a message, he encodes his message as an integer M that is between 0 and n-1. (Below, we describe what to do if Bob's message is longer than n-1.) Then he computes:

C = M^e mod n

and sends the integer C to Alice. As an example, if M = 2003, e = 7, d = 2563, n = 3713, then Bob computes

C = 2003⁷ mod 3713 = 129,350,063,142,700,422,208,187 mod 3713 = 746.

When Alice receives the encrypted communication C, she decrypts it by computing:

M = C^d mod n.

Continuing with the example above, Alice recovers the original message by computing:

M = 746²⁵⁶³ mod 3713 = 2003.

To check your arithmetic, you may use bc, maple, or xmaple.

Part 1 (converting between text and integers). Instead of writing separating programs to do this (as was originally proposed), you will write two extra interface functions XPinitASCII() and XPshowASCII(), as described below. We think this will make things easier.

Part 2 (extended precision arithmetic). The RSA cryptosystem is easily broken if the private key d or the modulus n are too small (e.g., 32 bit integers). The built-in C types int and long int can typically handle only 16 or 32 bit integers. Your main challenge is to design, implement, and analyze an extended precision arithmetic data type that can manipulate large (nonnegative) integers. To make it easier to check your work, w recommend working with integers represented using familiar decimal (base 10) notation. (In real applications, the base is typically chosen to be a power of 2 for performance reasons.) Your data type will support the following types of operations: addition, subtraction, multiplication, division, modular exponentiation, and comparison. You may use the header file XP.h.

Data structure. The first and most critical step is to choose an appropriate data structure to represent N-digit extended precision integers. One approach is to store each digit as an element in an array. To avoid pointers and memory allocation and deallocation headaches, consider packaging the array up in a struct as follows:

#define N 100         // maximum number of digits

typedef struct {
   unsigned char digit[2*N + 1];
} XP;

We declare our array to have slightly more than N elements to give ourselves a bit of extra scratch space to store intermediate results - the exact reason for using 2N + 1 will become clear later. Advanced students (who like pointers) may wish to make it a true first class ADT as in Sedgewick, p. 181.

Initialization. Implement a function XPinitDecimal() that takes a string of decimal digits as input, and returns the extended precision integer corresponding to this string. If you represent integers using decimal notation, this will be straightforward; if you use base 2 or 256, consider using Horner's method to do the conversion.

Display. Implement a function XPshowDecimal() that takes an extended precision integer as input and prints out a (decimal) representation of that integer.

Addition. Implement the function XPadd(). It should take two extended precision integers as input and return a third extended precision integer that is the sum of the two. Implement the method you learned in grade school. Observe that if a and b are 2N-digit integers, the sum will have at most 2N + 1 digits.

Once you've written the above three function, you should be able to write code like:

XP a, b, c;                       // declare extended precision variables
a = XPinitDecimal("123456789");   // a   = 123456789
b = XPinitDecimal("54545454");    // b   =  54545454
c = XPadd(a, b);                  // c   =  a + b
XPshowDecimal(c);                 // print 178002243

Random number generation. Implement a function XPrand() that takes as input an integer n and returns a pseudo-random n-digit extended precision integer by choosing each digit at random. This will be useful in debugging the subsequent code, and also in analyzing its complexity.

Parity. Implement the function XPisOdd() that takes an extended precision integer as input and return 1 if and only if it is odd. You may assume that the base is even, so that this amount to checking the parity of the least significant bit.

Comparisons. Implement the following functions: XPgreater(), XPless(), and XPeq(). They should take two extended precision integers as input and return 1 if and only if the first integer is greater than (less than, equal to) the second.

Subtraction. Implement the functions XPsub(). It should take two extended precision integers as input and return a third extended precision integer that is the difference of the two. You should check that the first integer is greater than or equal to the second before subtracting.

Multiplication. Implement the function XPmul(). It should take two extended precision integers as input and return a third extended precision integer that is the product of the two. Observe that if a and b are N-digit integers, the product will have at most 2N digits.

Division. Integer division is the most complicated of the arithmetic functions. Here is one of the simplest ways to compute q = a / b and r = a mod b, where a and b are extended precision integers, with b not equal to 0.

div(a, b)
   if (a < b)
      return (0, a)
   (q, r) = div(a, 2b)
   q = 2q
   if (r < b)
      return (q, r)
   else
      return (q + 1, r - b)

Note that you should use extended precision arithmetic to perform each of the intermediate computations. The natural way to implement this algorithm is using recursion. Observe that if a and b are 2N-digit numbers, then the intermediate results will never have more than 2N+1 digits.

Part 3 (encryption and decryption). Now it's time to put all your hard work to use. The encryption and decryption processes are identical, except that they involve different keys. Write a program rsa.c that takes in two command line arguments (the string "-d" or "-e, followed by the name of the encryption key), reads in ASCII text from standard input, and applies the RSA encryption/decryption procedure to the message. Note: if the original message is larger than the modulus, you should break up the message into appropriate sized chunks, convert each chunk into an integer between 0 and the modulus, and encrypt each chunk individually. It should work as follows.

% gcc126 rsa.c xp.c                                      # compile
% a.out -e alice.pub < message.txt > message.encrypted   # Bob encodes and encrypts
% mail alice < message.encrypted                         # Bob emails to Alice
% a.out -d alice.pri < message.encrypted                 # Alice decrypts and decodes

Text conversion. The RSA cryptosystem works on integer data, but Bob and Alice want to communicate with text. So before Bob sends a message to Alice, he converts it to a sequence of integers. Write a function XPinitASCII() that takes as input a string of characters, and converts it to an extended precision integer. Alice will need the inverse to transform integers back to ASCII text. Write a function XPshowASCII() that takes an extended precision integer as input, and prints out the corresponding ASCII text.

XP a;                             // declare extended precision variables
a = XPinitASCII("Hello");         // initialize a
XPshowASCII(a);                   // prints "Hello"

To perform encryption, the client program will break up the input message into chunks, and repeatedly call XPinitASCII() to encode the chunks into integers. To perform decryption, the client program will use the function XPshowASCII() to decode the integers after it has done the decryption.

RSA function. The key step is to implement the RSA function: given three extended precision integers a (the base), b (the exponent), and n (the modulus), compute c = a^b mod n. A natural way to compute c is to set x = 1; then multiply it by a, b times; then set c = x mod n. This method has two serious flaws.

First, observe that if the inputs a, b, and n are N-digit decimal integers, then the intermediate result x could contain as many as 10^N digits. To see the great importance of this, observe that if N is 30, the intermediate result could consume a terrabyte of memory! The following mathematical fact saves the day:

xy mod n = ((x mod n) * (y mod n)) mod n

The second problem is that repeating anything b times will be slow. If b is even moderately large (say 30 decimal digits), it will take centuries to execute, even on the fastest supercomputer. The following repeated squaring technique enables us to perform modular exponentiation in a reasonable amount of time.

  t = a ^ (b / 2) mod n
  c = t * t
  c = c mod n
  if (b is odd)
    c = c * a
    c = c mod n
  return c

Note here that the division is integer division, and each of the intermediate computations is performed using extended precision arithmetic. The exponentiation computation is naturally implemented using recursion. As above, if a, b, and c are N-digit numbers, then the intermediate results in this function will never have more than 2N digits. However, if c is 2N digits long, then the intermediate results in the mod (division) function might require as many as 2N + 1 digits. This explains why we allocate 2N + 1 digit.

Part 4 (analysis). Perform a complexity analysis of the following operations for N-digit extended precision arithmetic: addition, multiplication, division, and the rsa encryption, and decryption algorithms. For each operation, give the exponent and coefficient of the leading term in its asymptotic running time, e.g., 1.3 × 10^-5 N² seconds. Justify your answers with empirical and theoretical evidence. Using this analysis, estimate the largest input (measured in number of digits) that each of your functions could solve in 1 day (86,400 seconds). Depending on your conclusions, you may wish to modify your design and implementation to improve performance. You may find it helpful to use the program timing.c to estimate the running times of your functions.

Legal notice. It is a violation of US law to export your solution for this assignment to foreign governments or embargoed destinations (Cuba, Iran, Iraq, Libya, North Korea, Serbia, Sudan, Syria, and Taliban-controlled areas of Afghanistan as of January 2000). It is also illegal to import your solution into several countries, including France, Iran, Iraq, and Russia.

Extra credit. Write a program that compute two large pseudo-random prime numbers p and q of a specified number of digits. Compute φ = (p - 1) (q - 1), and select a small integer e that is relatively prime with φ. Use these to generate a public key (e, n) and its companion secret key (d, n).

This assignment was created by Bob Sedgewick and Kevin Wayne.