7.8 Cryptography
This section under major construction.
Cryptology. Cryptology is the science of secret communication. It has two main subfields: cryptography is the science of creating secret codes; cryptanalysis is the science of breaking codes. There are five pillars of cryptology:
 Confidentiality: keep communication private.
 Integrity: detect unauthorized alteration to communication.
 Authentication: confirm identity of sender.
 Authorization: establish level of access for trusted parties.
 Nonrepudiation: prove that communication was received.
We will focus primarily on confidentiality, the most romantic of these endeavors. Highly recommended reading for entertainment: The Code Book. Useful Flash demo: eSecurity history from rsa.com.
Some applications of crypto. Phil Zimmermann asserts "Cryptography used to be an obscure science, of little relevance to everyday life. Historically, it always had a special role in military and diplomatic communications. But in the Information Age, cryptography is about political power, and in particular, about the power relationship between a government and its people. It is about the right to privacy, freedom of speech, freedom of political association, freedom of the press, freedom from unreasonable search and seizure, freedom to be left alone." (Code Book, p. 296). Crypo benefits both ordinary citizens and terrorists. Enables ecommerce. Below is a table of activities that we would like to be able to implement digitally and securely. We all list a number of everyday analog implementation of each task.
Task  Analog Implementations 

Protect information  Code book, lock and key 
Identification  Driver's license, Social Security number, password, bioinformatics, secret handshake 
Contract  Handwritten signature, notary 
Money transfer  Coin, bill, check, credit card 
Public auction  Sealed envelope 
Public election  Anonymous ballot 
Poker  Cards with concealed backs 
Public lottery  Dice, coins, rockpaperscissors 
Anonymous communication  Pseudonym, ransom note 
A malicious adversary can sometimes subvert these analog implementations: forgery, lock picks, counterfeiters, card cheats, ballotstuffing, loaded dice.
Our goal. Our goal is to implement all of these tasks digitally and securely. We would also like to implement additional tasks that can't be done with physics! For example: play poker variant where dealer wins if no one has an Ace, have an anonymous election where everyone learns winner, but nothing else. Is any of this possible? If so, how? In the remainder of this section, we will give a flavor of modern (digital) cryptography, implement a few of these tasks, and sketch a few technical details.
History. Decryption of Mary Stuart's encrypted letters revealed her intent to assassinate Elizabeth I. In the 1800s, Edgar Allen Poe boasted that he could break anyone's cypher using frequency analysis. Alan Turing led a team at Bletchley Park which cracked the German Enigma cipher. Many historians believe this was the turning point of World War II. Here's an Enigma applet.
Security by obscurity. The Content Scrambling System (CSS) is used by Hollywood to encrypt DVDs. Each disc has three 40bit keys. Each DVD decoder has unique 40bit key. In principle it is "not possible" to play back on computer without disc. In 1999, two Norwegians (Canman and SoupaFrog, 1999) wrote a decryption algorithm that cracked the CSS system. CSS was a proprietary algorithm and Hollywood was banking on the fact that nobody would discover the algorithm. Moreover, the size of the keys was too small, so brute force attacks were possible. Other high profile failures due to ad hoc approach: GSM cell phones, Windows XP product activiation, RIAA digital music watermarking, VCR+ codes, and Adobe eBooks, Diebold AccuVoteTS electronic voting machines, ExxonMobil SpeedPass RFIDs.
In 1883, The Dutch linguist Auguste Kerckhoffs von Nieuwenhof embodied the underlying principle guiding modern cryptography in his paper Cryptographie militaire:
Il faut qu'il n'exige pas le secret, et qu'il puisse sans inconvenient tomber entre les mains de l'ennemi.
The system must not require secrecy and can be stolen by the enemy without causing trouble.
This is now known as Kerckhoffs' Principle. The security of a cryptosystem should not depend on keeping the algorithm secret, but only on keeping the numeric key secret. There are two primary distinctions between the algorithm and the numeric key. (Ed Felten) First, since we generate the numeric key at random, we can accurately model and quantify how long it would take an adversary to guess (under general technical conditions); in contrast, it is much harder to predict or quantify how long it would take an adverary to guess our algorithm. Second, it's easy to use different numeric keys for different purposes of people, or to stop using a key that has been compromised; it's more difficult to design new algorithms.
It says that systems based on "security by obscurity" are fatally flawed. This is equivalent to Shannon's maxim is "The enemy knows the system." The design of secure systems should be left to the experts. Despite this, we can still explore the basic ideas of cryptography that the experts use.
The participants. In keeping with the rich tradition of cryptographers, Alice and Bob are the two people trying to communicate securely over an insecure communication channel. We will assume that the message is already encoded in binary, so we can treat it as a (potentially huge) integer m. We let N denote the number of bits in the message m. Alice applies an encryption function E to the message, which produces another N bit integer E(m). Bob receives E(m) and applies his decryption function D to this. An obvious condition for this to make any sense is that D(E(m)) = m. In other words, Bob recovers the original message. Eve is a third party who wishes to intercept the message. Eve can observe E(m), so for the scheme to be secure, it should be prohibitively difficult for Eve to recover m from E(m) alone.
Private key cryptography. Private key = two parties share a secret key prior to their communication. Onetime pads (Chapter 1) are provably secure if the bits in the key are generated from a truly random source. It is also extremely easy to implement. Nevertheless, onetime pads have several mitigating factors that render it impractical in most situations. First, it is a challenge to generate truly random bits, free of biases and correlations. One must go outside the world of digital computers and extract them from some physical source (e.g., time between emission of particles due to radioactive decay, sound from a microphone, elapsed time between keystrokes). Such sources are often biased and we would need to take great care to prevent Eve from observing or tampering with the process. The scheme is called onetime since we need new key for each message or part of the message. If we reuse a onetime pad, then the system is no longer secure. Signature? Nonrepudiation? Perhaps the most limiting factor is key distribution. Alice and Bob must know each other and exchange the key sending the secret message. The Kremlin and White House used to communicate with each other using this method. A trusted courier would be sent across the Atlantic Ocean with a briefcase of onetime pads handcuffed to his arm. This method is ridiculously impractical for if Alice wants to purchase a product from Bob over the Internet.
Other private key encryption schemes. Data Encryption Standard (DES). Advanced Encryption Standard (AES, Rijndael algorithm). Blowfish. Methods are not provably secure like onetime pads, but have withstood the test of time of mathematical scrutiny. Efficient. However, these schemes suffer from the same keydistribution problem that plagues onetime pads. One emerging solution to the key distribution problem is to use quantum mechanics. This is known as Quantum Key Distribution. It is an unconditionally secure way for two parties to share a onetime pad. Moreover, there is an intrusion detection component so that if Eve observes even one bit, both parties will learn about the attempted eavesdropping.
Modern cryptography. The modern theory of cryptography leverages the theory of hard problems. The goal is to show that breaking security system is equivalent to solving some of the world's greatest unsolved problems! Bruce Schneier, a noted electronic security expert, wrote in Applied Cryptography, "It is insufficient to protect ourselves with laws, we need to protect ourselves with mathematics." The foundations of modern cryptography hinges on three crucial axioms and one important fact.
 Axiom 1. Players can toss coins. Crypto is impossible without randomness so this axiom is essential. In practice we can generate truly random bits by using quantum phenomenon or the radioactive decay of particles.
 Axiom 2. Players are computationally limited. We express this notion formally by restricting the participants (communicating parties and malicious adversaries) to use only polynomial time algorithms.
 Axiom 3. Factoring is hard computationally. We assume that it is not possible to factor an Nbit integer in time polynomial in N. Given an integer (e.g., 1541) it appears difficult to find its prime factorization. However, given the factors (e.g., 23 * 67) it is easy to multiply them out and obtain the original number. This is referred to as a "1way trapdoor function" since it is easy to go one way (from factors to product), but apparently hard to go the other way (from product to factors).
 Fact. Primality testing is easy computationally. MillerRabin primality testing algorithm. PRIMES in P proved in 2002.
If the three axioms above are valid, then digital cryptography exists. That is, it is possible to do all of the previous tasks digitally.
Public key cryptography. Public key cryptography is an amazing scheme that enables two parties to communicate securely, even if they've never met. It is the digital analog of a box with a combination lock. Suppose Alice wants to send Bob a message. First, Bob send the box to Alice with the padlock in the open position, without revealing the combination to anyone. Alice puts her message in the box, closes the combination lock, and sends it back to Bob. Eve may intercept the box in transit, but since she doesn't know the combination, she is unable to open it. She can try to guess the combination, but there are just too many possibilities. When the box arrives, Bob can open it, knowing that nobody else looked inside (unless they knew the combination).
To do this digitally, Bob has two keys (or combinations): his private key d is not revealed to anybody, his public key e is published in an Internet phonebook. We think of the keys as integers, but they are really just sequences of bits, say 1024. If Alice wants to send a message to Bob, she looks up Bob's public key e on the Internet. She uses e to encrypt her message and sends it to Bob. Bob uses his private key d to decrypt the message.
The idea of public key cryptography was first published in 1976 by Whitfield Diffie and Martin Hellman in their groundbreaking paper New Directions in Cryptography. This paper described a public key cryptosystem for the key distribution problem. The idea was apparently discovered independently by Ellis, Cocks, and Williamson in the UK at the Government Communications Headquarters (GCHQ) in the early 1970s, but their work remained a secret for two decades.
RSA cryptosystem.
We will describe the basic mechanics of the RSA cryptosystem, a scheme developed by Adleman, Rivest, and Shamir in 1978. Here's the RSA paper. RSA is very widely used today for secure Internet communication (browsers, S/MIME, SSL, S/WAN, PGP, Microsoft Outlook), operating systems (Sun, Microsoft, Apple, Novell) and hardware (cell phones, ATM machines, wireless Ethernet cards, Mondex smart cards, Palm Pilots). Then, we will give intuition for why it works and describe to implement it efficiently. The RSA cryptosystem involves modular arithmetic. Recall .....Key generation. To participate in the RSA cryptosystem, Bob must first generate a public and private key. He only needs to do this once, even if he plans to use the system many times.
 Select two large prime numbers p and q at random.
 Compute n = p × q.
 Select two integers e and d such that (m^{e})^{d} ≡ m (mod n) for all integers m.
As an example, we might choose the following parameters, although in practice we would need to use much larger integers to guarantee security.
p = 11, q = 29 n = 11 * 29 = 319 e = 3 d = 187
Encryption. Alice wants to send an Nbit secret message m to Bob. She obtains Bob's public key (e, n) from the Internet. Then she encrypts the message m using the encryption function E(m) = m^{e} (mod n), and sends E(m) to Bob.
m = 100 E(m) = 100^{3} (mod 319) = 254
Decryption. Bob receives the encrypted message c from Alice. Bob recalls his private key (d, n). Then he decrypts the ciphertext by applying the decryption function D(c) = c^{d} (mod n). Since Bob knows d, he can compute this function.
c = 254 D(c) = 254^{187} (mod 319) = 100
RSA simulator. RSA simulator.
Correctness. To make sure that Bob receives the original message, we must check that D(E(m)) = m. It worked in the example above where m = 100, E(100) = 254, D(254) = 100, but we need to be sure it works for all possible messages, and for all valid choices of e, d, and n. This follows in a straightforward way from the defintions and the way we chose e and d.
We have supressed one important detail  how to choose e and d so that the magic property holds.
Implementing the RSA cryptosystem. Implementing the RSA cryptosystem is a formidable engineering challenge. A successful implementation requires many ingenious algorithms and knowledge of several theorems in number theory. We will describe a barebones implemenation, but commercial implementations are more sophisticated.
Big integers. Can't use built in int or long types since numbers are too big. Need to reimplement the laws of arithmetic, e.g., addition, subtraction, multiplication, and division. Grade school algorithms are reasonably efficient for all of these operations, although there is always opportunity for improvement using clever algorithms.
Modular exponentation. How to perform modular exponentation: a^{b} (mod c). The naive method would be to repeatedly multiply a by itself, b times, and then divide by c and return the remainder. When a, b, and c are Nbit integers, this fails spectacularly for two reasons. First, the intermediate number a^{b} can be monstrously large. The number of digits can be exponential in N. When N = 50, this consumes 128TB memory. The second problem is that the number of multiplications also takes exponential time. So it will take forever.
A better alternative is to use repeated squaring. This idea dates back to at least 200 BCE according to Knuth. Program ModExp.java uses the following recurrence to compute a^{b} mod n:
 if b is zero: 1
 if b is even: (a^{b/2} * a^{b/2}) mod n
 if b is odd: (a * a^{b/2} * a^{b/2}) mod n
 if b is zero: 1
 if b is even: (a * b/2) + (a * b/2)
 if b is odd: (a * b/2) + (a * b/2) + a
Computing a random prime. To generate the key, we must have a method for generating a random Nbit prime, say N = 1024. One idea is to choose an Nbit integer at random and check if it is prime. If it is, then stop; otherwise repeat until you stumble upon one that is prime.
REPEAT x = random Nbit integer UNTIL (x is prime)
 Generating random numbers. (move to onetime pad?) Physical sources of randomness. The Java library SecureRandom is a pseudorandom number generator that generates cryptographically secure random numbers. This means that it is computationally intractable to predict future bits. Unlike a LFSR, you can't reverseengineer it.
 Computing the private exponent.
One final challenge is choosing the public and private keys.
In practice it is common use e = 65,537 as the public key.
But this means we need to find a private key that makes the magic
property hold. This turns out to be a well understood problem in
number theory and a unqiue d always exists provided gcd(e, (p1)(q1)) = 1.
We can use an extension of Euclid's algorithm
(see exercise xyz) for this purpose.
Easy to do using Java's BigInteger library for manipulating huge integers.
private final static SecureRandom random = new SecureRandom(); BigInteger ONE = new BigInteger("1"); BigInteger p = BigInteger.probablePrime(N/2, random); BigInteger q = BigInteger.probablePrime(N/2, random); BigInteger n = p.multiply(q); BigInteger phi = (p.subtract(ONE)).multiply(q.subtract(ONE)); BigInteger e = new BigInteger("65537"); BigInteger d = e.modInverse(phi); BigInteger rsa(BigInteger a, BigInteger b, BigInteger c) { return a.modPow(b, n); }
RSA Attacks. Cryptanalysis is the science of breaking secret codes. We describe a few common attacks to the RSA crpytosystem to give you the flavor of modern cryptanalysis.
 Factoring. The most obvious way to break the RSA cryptosystem is by factoring the modulus n. If Eve can factor n = pq, then she has exactly the same information as Bob, so she can efficiently compute his private exponent given the public one (using exactly the same algorithm that Bob used to compute his private exponent in the first place). Using a very sophisticated factoring algorithm known as the general number field sieve, researchers were recently able to factor RSA576, a 576bit (174 decimal digits) composite integer offered as a challenge problem by RSA Security. This effort required 100 workstations and 3 months of number crunching. The running time of this algorithm is superpolynomial but subexponential  O(exp(c (log n)^{1/3} (log log n)^{2/3})).
 Improper usage. The RSA system can also be broken if it is used improperly. For example, if Bob decides to use a small private exponent to lessen his computational burden for decryption, then he is sacrificing security. If d < 1/3 n^{1/4}, then can recover d in polynomial time (Wiener attack). Note that it is okay to use a small public exponent e, and 65,537 is common in practice. Another mistake is to allow two participants to share the same modulus n (even if neither party knows how to factor n). For example, suppose Bob and Ben have (d1, e1) and (d2, e2) for their private and public exponents, respectively, but they are both using n as their modulus. Then it is possible for either party to discover the other's private exponent (Simmon's attack).
 Side channel attack. Exploit physical information leaked from machine, including electromagnetic emanations, power consumption, diffuse visible light from CRT displays, and acoustic emanations. For example, in a timing attack Eve gleans information about Bob's private key by measuring the amount of time it takes for Bob to exponentiate. If Bob is using a highly optimized exponentiation routine, then Eve can discover enough information to reveal Bob's private key. Recently, Dan Boneh showed how to use this technique to break SSL on a LAN.
It is a longstanding open research question whether or not there is a way to break the RSA system without factoring or physical access. There is no guarantee that RSA is secure even if factoring is hard. Also, there are currently know guarantees that factoring is hard other Also, currently no mathematical guarantee that factoring is hard! FACTOR and its complementary problem NONFACTOR are both in NP. This makes it unlikely that FACTOR is NPcomplete since this would imply NP = coNP...
Semantic security. Other stronger notions of security. A public key cryptosystem is semantically secure if anything Eve can compute in polynomial time with the ciphertext can be computed without the ciphertext. Thus, observing the ciphertext provides no useful information. For example, we shouldn't be able to determine if the last bit of the plaintext is 0 or 1 or if the plaintext has more 1 bits than 0 bits. The RSA system is not semantically secure, and in fact no deterministic scheme can be. This is not just a theoretical shortcoming. To see why, suppose that Eve knows Alice is going to send Bob either the message ATTACK or RETREAT. Eve can encrypt both message using Bob's public key and then compare against the encrypted message that Alice sends to Bob. Thus, Alice can learn exactly which message was sent. Naive ideas like appending a random sequence of 0's and 1's to the plaintext before encrypting do not typically guarantee additional security.
Provably secure cryptosystems. It is a bit unsatisfying to be using a cryptosystem that is not provably as difficult as some hard problem, e.g., factoring. Theoretical highground = BlumGoldwasser (1985). Provably as hard as factoring, semantically secure. Based on the probabilistic encryption scheme of Goldwasser and Micali. Comparable in speed to RSA.
Electronic voting. Need a crypto scheme that makes it possible to confirm that your vote was correctly counted, without revealing whom the vote was for. Need the second condition to prevent someone from "buying" your vote since if they have no way to verify for whom you voted, they have no incentive to bribe you.
Zero knowledge. Alice wants to prove to Bob that a graph G is 3 colorable, but doesn't want to reveal any additional information. Example generalizes to many other problems since 3Color is NP complete.
Digital rights management. In the traditonal setting, the Alice, Bob, and Eve who are trying to communicate are human beings, and they use a computer to assist with the comptuation. An intriguing variant is when Alice and Bob are computers, and Eve is a human being. This is exactly the setting that the music industry envisions with digital rights management. In this case Alice is your computer, Bob is your speakers, and you are Eve. The music industry wants only your computer to be able to play the legally purchased music on your computer, but does not want you to be able to intercept the raw audio data. We can quickly imagine a world where there are restrictions for copying DVDs, runnig software, printing documents, and forwarding email. All of these restrictions will be enforced via cryptographic algorithms and protocols.
Goal: transform a program into an obfuscated version that computes the same function, but reveals no extra information (e.g., the source code) to a polynomialbounded adversary. Obfuscation not possible in general.
Security. Cryptography is only one part of overall computer security. This survey revealed that 70% of people would reveal their computer password in exchange for a chocolate bar. One security expert comments "using encryption on the Internet is the equilvant of arranging an armored car to deliver credit card information from someone living in a cardboard box to someone living on a park bench."
CAPTCHAs. Completely automated public Turing test to tell computers and humans apart. Reverse Turing test where computer is the judge, trying to distinguish between a human and a computer. New York Times article.
Q+A
Exercises
 Write a program to empirically determine the running time of the methods methods BigInteger.add, BigInteger.multiply, BigInteger.mod, and BigInteger.modExp. Try to model the running time of each operation as c N^{k} seconds for some constants c and k. Use BigInteger.rand to generate random input parameters. For add, multiply, and modular exponentiation use Nbit integers for all of the arguments; for division, use a Nbit numerator and an N/2bit denominator.
 Write a program RandomPrime.java that takes a commandline argument N and prints out an Nbit integer that is (probably) prime. Use BigInteger.probablePrime for primality testing and SecureRandom to generate cryptographically secure pseudorandom numbers.
 Estimate the running time of RandomPrime.java as a function of the number of bits N.
 Suppose that instead of using RandomPrime.java to choose a prime with N bits, you used the following strategy: generate all primes with at most N bits, and choose a random one. What will happen if N is large, say 512?
 Suppose that instead of using reapeated squaring to compute a^b mod c, you repeatedly multiply a to itself, b times, modding out by c. Estimate how long will it take if a, b, and c are N bit integers.
 What is the complexity of the following problem: given an even integer x, determine if x has any odd factors greater than one. Answer: polynomial  check whether x is a power of 2.
 What is the complexity of the following problem: Given an even integer x and another integer y, determine whether x has any odd factors between 3 and y. Answer: equivalent to factoring problem.
Creative Exercises
 Extended Euclid's algorithm.
Extend Euclid's algorithm for computing the greatest common divisor of p and q
to also compute coefficients a and b (possibly zero or negative) such
that ap + bq = gcd(p, q). Write a program
ExtendedEuclid.java that takes two command line parameters p and q and
outputs gcd(p, q) and a pair of integers a and b as described above.
EXTENDEDEUCLID(p, q) if q = 0 then return (p, 1, 0) (d', a', b') < EXTENDEDEUCLID(q, p % q) (d, a, b) < (d', b', a'  (p/q) b') return (d, x, y)
Hint: since your method needs to return three integers, consider using an array of three elements.
 Best rectangle. Given area A of a rectangle, find a rectangle with integer width and height whose area is A and such that the difference between the height and width are as close to each other as possible. For example, if A = 48, then the best rectangle is 6by8 and not 3by16 or 4by12. Show that if you could solve this problem, you could break the RSA cryptosystem.
 Buckets of water.
Given two buckets of capacity p and q, a receptacle of infinite capacity,
a water hose, and a drain, devise a method to get exactly k liters of water
into the receptacle using the following rules:
 You can fill either of the two buckets with the huse.
 You can empty either bucket to the drain.
 You can transfer water between the two buckets or either bucket and the receptacle until one is full or ther other is empty.
 Multiplicative inverse. Given a positive integer n, a multiplicative inverse mod b of an integer k is an integer x such that (k * x) % n = 1. Such an inverse exists if and only if gcd(k, n) = 1. Write a program Inverse.java that reads in two command line arguments k and n and computes the modular inverse if it exists. Hint: use the answer to the previous exercise. See also BigInteger.modInverse.
 Breaking the RSA cryptosystem.
One potential way to break the RSA cryptosystem is to compute
φ(n) given n. Recall that if n = pq, then φ(n) = (p1)(q1).
Show that computing φ(n) is equivalent to factoring.
Solution: obviously if you can factor n = pq, then computing φ(n) = (p1)(q1) is easy. To see the other direction, observer that n + 1  φ(n) = pq + 1  (p1)(q1) = p + q = n/q + q. Thus q^{2}  (n + 1  φ(n))q + n = 0. Assuming we know φ(n), we can solve the quadratic equation for q and recover one of the factor of n. We can recover the other factor p by computing n/q.
 Generating public and private RSA keys. Write a program RSA.java to generate a key pair for use with the RSA cryptosystem, determine two N/2 bit primes p and q. Set e = 65537, compute n = (p1)(q1), and find a number d such that (e * d) % n == 0. Assuming gcd(e, n) = 1, the inverse d will exist. Hint: use the RandomPrime.java to compute p and q, and use Inverse.java to compute d.
 Sophie Germaine primes. The security of the RSA cryptosystem appears to be improved if you use special types of primes for p and q. Specifically, a Sophie Germaine prime is a prime number p where (p1)/2 is also prime. Generate a public and private RSA key where p and q are Sophie Germaine primes. Investigate how long it takes to find such a prime as a function of the number of bits N.
 Fermat primality testing. The Fermat primality test is an algorithm that takes an odd integer n and reports that it is definitely composite or "likely" prime. By "likely", the algorithm is sometimes wrong, but not too often. Fermat's theorem says that if p is prime and gcd(a, p) = 1, then a^{p1} = 1 (mod p). A version of the converse is used as a crude primality test in the PGP cryptosystem: if 2^{p1} = 3^{p1} = 5^{p1} = 7^{p1} = 1 (mod p), then use p as a prime. Unfortunately, there are some numbers that satisfy this Fermat test, but are not prime (e.g., 29341, 46657, 75361).
 MillerRabin primality testing.
The MillerRabin algorithm is a randomized algorithm for determining
whether an odd integer n is prime. It takes a security parameter t
and outputs either prime or composite. If it outputs
composite, then n is definitely composite; if it outputs
prime, then n is probably prime, but the algorithm could
be wrong with probability 2^{t}.
boolean isProbablyPrime(BigInteger n, int t) { Compute r and s such that n1 = 2^{s}r and r is odd Repeat from 1 to t { Choose a random integer a such that 1 < a < n  1 Compute y = a^{r} mod n by repeated squaring If y ≠ 1 and y ≠ n1 { j = 1 while (j < s and y ≠ n1) y = y^{2} mod n if (y == 1) return false j = j + 1 if y ≠ n1 return false } return true }
 Factoring. Win $200,000 from RSA Security for factoring a 2048 bit number (616 digits). Factor a 64 bit number (32 bit RSA) using Program xyz in under a minute. How long to factor a 128 bit number?
 Pollard's rho method. Pollard's rho method is a randomized factoring algorithm that can factor 128 bit numbers in a reasonable amount of time, especially if the numbers have some small factors. It is based on the following fact: if d is the smallest nontrivial factor of N and x  y is a nontrivial multiple of d then gcd(xy, N) = d. A naive method would be to generate a bunch of random values x[1], x[2], ..., x[m] and compute gcd(x[i]x[j], N) for all pairs i and j. Pollard's rho method is an ingenious method way to find x and y without doing all of the pairwise computations. It works as follows: choose a and b at random between 1 and N1, and initialize x = y = a. Repeatedly update x = f(x), y = f(f(y)), where f(x) = x^{2} + b as long as gcd(xy, N) = 1. The gcd is a factor of N, but if you get unlucky, it could be equal to N. By randomly choosing a and b each time, we ensure that we never get too unlucky. Write a program PollardRho.java that takes a commandline argument N and uses the Pollard rho method to compute a prime factorization of N. Estimate the running time as a function of N.
 FeitThompson Conjecture. Disprove the FeitThompson conjecture: there are no two primes p and q such that (p^q  1) / (p  1) and (q^p  1) (q  1) have a common factor other than 1. Counterexample: (17, 3313) with common factor 112643.
 Karatsuba multiplication.
Write a program Karatsuba.java
that multiplies two integers using the
Karatsuba
algorithm.
This ingenious algorithm computes the product of two 2Nbit integers
using only three Nbit multiplications (and a linear amount of extra work).
To multiply x and y, break up x and y into Nbit chunks and use
the following identity:
xy = (a + 2^{N}b) (c + 2^{N} d) = ac + [(a+b)(c+d)  ac  bd] 2^{N} + bd 2^{2N}
Your recursive algorithm should compute the number of bits N and cutoff to the default BigInteger.multiply method when N is small (say 10,000) and apply the Karatsuba divideandconquer strategy otherwise. Investigate the optimal cutoff point and compare its effectiveness against BigInteger.multiply when N = 10 million.
 Factoring reduces to finding a factor. Given a function factor(N) that returns 1 if N is prime, and any nontrivial factor of N otherwise, write a function factorize(N) that returns the prime factorization of N.
 Perfect power. An integer N is a perfect power if N = p^{q} for two integers p ≥ 2 and q ≥ 2. Design an efficient algorithm (polynomial in the number of bits in N) to determine if N is a perfect power, and if so, find its prime factorization. Hint: for all q ≤ lg N binary search for p satisfying N = p^{q}.

Euler's conjecture.
In 1769 Euler conjectured that there are no
positive integer solutions to a^{4} + b^{4} + c^{4} = d^{4}.
Noam Elkies discovered the first counterexample
2682440^{4} + 15365639^{4} + 18796760^{4} =
20615673^{4}
over 218 years later.
Write a program Euler.java to disprove
Euler's conjecture.
The brute force solution outlined in Exercise XYZ won't work for two reasons:
(i) it will take too much time to find the solution using a quadruply nested loop,
and (ii) computing a^{4} will overflow a long since
the smallest such counterexample is
95800^{4} + 217519^{4} + 414560^{4} = 422481^{4}.
 Use the following idea. Iterate over all integers a and b between 1 and N and insert a^{4} + b^{4} into a hash table. Then, iterate over all integers c and d between 1 and N and search to see if d^{4}  c^{4} is in the hash table. Use extended precision integers to avoid overflow.
 Using extended precision integers can be a significant overhead over using primitive types. Instead of inserting a^{4} + b^{4} into the hash table, insert a^{4} + b^{4} modulo p, where p is some big prime, say XYZ. Then, iterate over all c and d and search for d^{4}  c^{4} modulo p. If there's a match, use extended precision arithmetic to check that it isn't just an coincidental collision. Hint: to avoid overflow when computing a^{4} + b^{4} modulo p, modulo out multiples of p after each multiplication.
 Fingerprinting. Alice and Bob maintain two copies of a large genomics database in different locations. For consistency, they want to be ble to compare whether the two databases are identical. We interpret the databases as Nbit integers, say A and B. Because N is very large, they can't afford to transmit the whole database. Instead, consider the following scheme for sending a fingerprint of the data that enables Alice and Bob to check if the data is inconsistent. Alice generates a random prime number p between 2 and, say, N^{2} and sends p and (A % p). This takes only O(log N) bits. Bob declares that A and B are the same if ((A % p) == (B % p)). The probability of having a false negative (a no that should have been a yes) from the scheme is zero. Show that the probability of having a false positive (a yes that should have been a no) goes to 0 as n goes to infinity. Hint: Use the fact that the number of primes less than n^{2} is at least c n^{2} / log n, for some constant c > 0. How many bits are sent?

Flipping a coin over the phone.
Alice and Bob are in the midst of a bitter divorce.
They have decided to flip a coin to see who will get custody of their
only son Carl. However, they refuse to see each other in person and
they don't want anyone else to know how the resolved the custody dispute.
In other, words, we want to devise a method to flip a fair coin over
a phone line or the Internet so that neither party can cheat. Here is
an elegant protocol:
 Alice multiplies together two or three large primes to sends the product N to Bob.
 Bob receives the integer N and responds with the number 2 or 3.
 Alice waits for a valid response from Bob and then sends Bob the prime factorization of N.
 If Bob guesses the correct number of factors, then he gets custody. Otherwise, assuming Alice follows the protocol, she wins custody.
Explain why the system works by answering each of the following questions. You may assume that there is no efficient way to determine whether a given integer N has at least 3 nontrivial factors (although this is an unresolved conjecture).
 How can Alice compute N efficiently?
 Why can't Bob efficiently determine the true answer on his own?
 How can Bob efficiently check that Alice sent him the correct factorization of N? In other words, what's to prevent Alice from revealing two factors (one of which is not prime) if Bob says 3, even if she multiplied three (or more) primes together?
 Poker over the phone. Use the bitcommitment scheme described above to develop a protocol to play poker over the phone, say between two parties.
 Discrete log. Let p be a prime number. The discrete log of a to the base b is the unique integer x between 0 and p1 such that a = b^{x} (mod p). For example, if p = 97, b = 5, and a = 35, then log_{5} 35 = 32 since 5^{32} = 35 (mod 97). Write a program DiscreteLog.java that takes three command line inputs a, b, and p, and computes log_{b} a modulo p via brute force search.
 Diffie Hellman. Let p be a prime number, and let a and b be two integers. Given p, an x, x^{a} (mod p) and x^{b} (mod p), the DiffieHellman problem is to compute x^{ab} (mod p).
 Rabin's cryptosystem. Select p, q to be prime such that p = 3 mod 4 and q = 3 mod 4. The public key is n = pq and the private key is (p, q). To encrypt, compute E(m) = m^{2} mod n. To decrypt compute D(c) = sqrt(c) mod n. How to compute square root: c = x^{2} mod n? Use extended Euclid's algorithm to find a, b such that ap + bq = 1. Compute r = c^{(p+1)/4} mod p and s = c^{(q+1)/4} mod q. Compute m = (aps + bqr) mod n and t = (aps  bqr) mod n. The four square roots of c are m, m mod n, t, and t mod n.
 Analog privatekey exchange. You are stranded on an island with a box, a padlock with key, and a copy of Introduction to Computer Science. You have a friend on another island who also has a box, a padlock with key, but wants to borrow your copy of the textbook. You can ship stuff via an unscrupulous courier service who will pillage anything inside the box if it is left unlocked. How can you get your book to your friend?
 Cryptographically secure hash functions.
SHA1 and MD5. Can compute it by converting string to bytes,
or when reading in bytes 1 at a time.
import java.security.MessageDigest; ... MessageDigest sha1 = MessageDigest.getInstance("SHA1"); sha1.update(s.getBytes()); Byte[] hash = sha1.digest();
 RSA in Java.
Builtin functionality for RSA or DSA. Untested code below.
// key generation KeyPairGenerator keygen = KeyPairGenerator.getInstance("DSA"); SecureRandom random = new SecureRandom(); keygen.initialize(512, random); KeyPair keys = keygen.generateKeyPair(); PublicKey pubkey = keys.getPublic(); PrivateKey prikey = keys.getPrivate(); // digital signing Signature signer = Signature.getInstance("DSA"); signer.initSign(prikey); signer.update(s.getBytes()); Byte[] signature = signer.sign(); // verifying Signature verifier = Signature.getInstance("DSA"); verifier.initVerify(pubkey); verifier.update(t.getBytes()); Boolean check = verifier.verify(signature);
 BlumBlumShub pseudorandom bit generator.
Choose two distinct
Nbit primes p and q such that p mod 4 = q mod 4 = 3.
Set n = pq and choose a starting value x_{0}
by selecting a random seed 1 < s < n such that
gcd(s, n) = 1.
Form the sequence of integers
x_{0} = s^{2} mod n and
x_{i+1} = x_{i} x_{i} mod n.
Use x_{i} % 2 as the sequence of pseudorandom bits.
No need to keep n secret.
Discovering any pattern (in poly time)
is provably as hard as factoring n.
Note: we still need to generate p, q, and s at random,
but these have only O(N) bits, and we will be able to
generate 2^N pseudorandom bits.
Can use as a onetime pad.
Can also be used for directly for publickey crypto
 VCR Plus decoding. Remote control scheme for recording programs on a VCR using special code printed in newspapers. Bad cryptography so easy to break. paper
 Pascal's triangle.
One way to compute the kth row of Pascal's triangle (for k > 2) is to compute
(2^{k} + 1)^{k+1} and take its binary representation k bits at a time.
1^{0} = 1 (1^{0} = 1 = 1) 11^{1} = 1 1 (3^{1} = 3 = 1 1) 101^{10} = 01 10 01 (5^{2} = 25 = 1 2 1) 101^{11} = 01 11 11 01 (5^{3} = 125 = 1 3 3 1) 1001^{100} = 001 100 110 100 001 1 4 6 4 1 (9^{4} = 6561 = 1 4 6 4 1) 10001^{101} = 0001 0101 1010 1010 0101 0001 (17^{5} = 1419857)
 BaileyBorweinPlouffe algorithm. Compute the ith binary digit of π without computing the earlier digits using the BBP algorithm which requires modular exponentiation.
 Secret sharing. Want to distribute a message to N people so that any 3 of them can recover the original message, but any 1 or 2 cannot. reference. Scientific American puzzle
 Mersenne prime. A Mersenne prime is a prime of the form M_p = 2^p  1, where p is an odd prime. To test whether M_p is prime, form the following sequence: s_0 = 4, s_i+1 = (s_i)^2  2 mod M_p. M_p is prime iff s_(p2) = 0 mod M_p. This is method known as the LucasLehmer primality test.