CS152-lecture-20210624

Asymmetric encryption #

A very useful type of encryption where encryption and decryption is done using key pairs, one public and one private. This can solve the key exchange problem, and also can be used to digitally sign messages.

Now most cryptography is symmetric, only a small amount of CPU cycles are spent on asymmetric cryptography.

RSA #

A revolutionary cryptosystem designed by Rivest, Shamir, Adleman at MIT in the 1970s.

Read more on RSA

It is called a “cryptosystem” because it is a collection of algorithms that work together to produce cryptographic services. It provides a permutation that is

  • easy to compute, given a public key, and
  • hard to invert without the key

RSA can be used for encryption,

\[\begin{aligned} \text{RSA Encryption} (x) = x^e \mod n \end{aligned}\]

where \( 0 < x < n \) where \( n \) is the upper bound.

The public key is the pair \( (e, n) \) .

\[\begin{aligned} \text{RSA Decryption} (y) = y^d \mod n \end{aligned}\]

The private key is the pair \( (d, n) \) . (Note that we haven’t defined what \( e,d,n \) are.)

Note: The parameter x is usually plaintext made up of chars, so it is considered as the bit sequence for each ASCII value.

The individual’s public key is published for all to see. So anyone can encrypt something, using anyone’s public key. Nobody can decrypt it without the private key. This is very different from symmetric cryptography.

RSA key generation (the magic) #

Key generation follows these steps:

  1. Choose random distinct primes \( p,q \) where \( |p| \approx |q| \) .
    1. Let \( n = pq \) , and
    2. \( \phi(n) = (p-1)(q-1) \) (the totient of \( n \) ).
  2. Choose \( e \) so that \( 1 < e < \phi(n) \) and \( \text{gcd} (e, \phi(n)) = 1 \) .
  3. Let \( d = e^{-1} \mod \phi(n) \)
  • The public key is \( (e, n) \)
  • The private key is \( (d, n) \)
  • \( e \) is usually picked to be something that is efficient, something like \( 2^{16} + 1 \) , (sometimes \( 3 \) ).
  • Recall that \( x^{-1} \mod n \) exists if and only if \( \text{gcd} (x, n) = 1 \) .
Note: Security increases the larger that \(p,q\) are. These values are usually around 2000-8000 bits long, depending on the time needed for security.

The best attack on RSA is to factor \( n \) into \( p \) and \( q \) and generate \( d \) .

Key gen and encryption example #

Key generation #

\[\begin{aligned} p &= 47 &\text{step 1}\\ q &= 11 \\ n &= pq \\ &= 517 \\ \phi(n) &= (p-1)(q-1) \\ &= 460 \\ e &= 3 &\text{step 2} \\ d = e^{-1} \mod \phi(n) &= 307 &\text{step 3} \\ \end{aligned}\]

Note

  • So our public key is \( (e, n) = (3, 517) \)
  • Our private key is \( (d, n) = (307, 517) \)
  • When picking \( e \) , check \( \text{gcd} (e, \phi(n)) = 1 \) .
  • To find the multiplicative inverse, we can use something like wolfram alpha: “inverse of 3 mod 460”, or you can use the egcd algorithm.

Encryption #

Lets encrypt the value \( 416 \) :

\[\begin{aligned} 416^3 \mod 517 = 80 \end{aligned}\]

Decryption #

To decrypt our ciphertext value \( 80 \) :

\[\begin{aligned} 80^{307} \mod 517 = 416 \end{aligned}\]

Why RSA works #

Euler’s theorem: When \(\text{gcd}(a, n) = 1 \) then \[a^{\phi(n)} \mod n = 1\]

We want to show that if we have a cipher text \( y \) , and we raise it to the decryption power \( d \) (all \( \mod n \) ) equals our original plaintext \( x \) .

\[\begin{aligned} y^d &= (x^e)^d \\ &= x^{ed} &&\text{aside } \\ && ed \mod \phi(n) &= 1\\ && ed &= 1 + k \phi(n) \\ &= x^{1 + k \phi(n)} \\ &= (x)(x)^{k \phi(n)} \\ &= (x)(x^{\phi(n)})^k &&\text{use Euler's theorem} \\ &= (x)(1)^k \\ &= x \end{aligned}\]

Practicality, algorithms on large numbers #

During key generation and encryption/decryption, we see a lot of exponentiation in the calculation. So a proper RSA implementation must use the most efficient algorithms.

Efficient algorithms

  • run in time proportional to length (number of bits used) rather than the value
  • use basic operations: add, multiply, divide
  • exponentiation, when implemented naively can be proportional to the value of the exponent. So that must be taken into consideration
  • multiplicative inverse, when implemented naively can be proportional to the value of the modulus, rather than the length.
  • finding primes, can be implemented poorly when testing to see if the number is indeed prime

Finding large primes #

Since finding large primes is the first step in RSA key generation, we need a way of picking large prime numbers.

It is fairly easy to prove that the number of primes is infinite.

Theorem: Primes are infinite.

Small sketch proof:

  • Assume that there are not an infinite number of primes (we’ll prove by contradiction)
  • Let \( P_1, P_2, \ldots, P_r \) be the list of all primes
  • Let \( n = P_1 \cdot P_2 \cdot \cdots \cdot P_r \)
  • \( n + 1 \) is bigger than every prime number in the list, and so is not prime
  • Let \( P \) be a prime divisor of \( n + 1 \)
  • \( n \) and \( n + 1 \) are both multiples of \( P \)
  • this is impossible, so there is an infinite number of primes.
    • we can’t have 2 consecutive numbers be multiples of the same \( P \) , they must be at least \( P \) apart

So knowing this, we know that there are prime numbers still even when we’re up around 4000 bit long numbers. There actually is the fact that about \( \frac{1}{n} \) of \( n \) -bit numbers are prime (Density of primes).

How to find them #

Number theory tells us that the probability

\[\begin{aligned} P(x^{\frac{P-1}{2}} \mod P = 1 \text{ or } P - 1 \mid \text{P is prime}) &= 1 \\ P(x^{\frac{P-1}{2}} \mod P = 1 \text{ or } P - 1 \mid \text{P is composite}) &\leq \frac{1}{2} \end{aligned}\]

So for example, lets choose \( q \) random \( x \) in the range \( 0 < x < P \) , and calculate \( x^{\frac{P-1}{2}} \mod P \) for each.

\[\begin{aligned} P(\text{all are 1 or } P - 1 \mid \text{P is prime}) &= 1 \\ P(\text{all are 1 or } P - 1 \mid \text{P is composite}) &= \left( \frac{1}{2} \right)^q \\ \end{aligned}\]

This is a probabilistic algorithm, meaning that there is a probability that the algorithm is incorrect.

Generating prime number algorithm #

// b    number of bits
// c    confidence level
// returns b-bit prime with false positive rate of 1/2^c
gen_prime(b, c):
    do:
        p = random odd number (b-bits long)
    while (!is_prime(p, c))
    return p

is_prime(p, c):
    for i from 1 to c:
        x = random where 0 < x < p
        if x^{(p-1)/2} mod p is not 1 or p - 1:
            return false
    return true (and saw at least 1 "P-1")

This algorithm fails when checking Carmichael numbers (because they pass this test but are composite). So the last return true is only really true if we’ve seen at least one \( P-1 \) value.

The line with the calculation of \( x^{\frac{p-1}{2}} \mod p \) is the most expensive computationally. The amount of times that calculation will happen is around \( \frac{1}{n} \) where the numbers you’re checking are \( n \) -bits long. There is an expected of 2 iterations to that calculation on a failure. So, is_prime should be called about about \( 2n \) times. The actual exponent \( \frac{P-1}{2} \) is about \( n \) bits long, and so there is about \( \frac{3}{2}n \) multiplications.

So if we have a 4000 bit long prime we’re looking for,

\[\begin{aligned} 8000 \cdot 6000 = 48,000,000 \text{ multiplications} \end{aligned}\]

where each of those multiplications is about 4000 bits long.

On the next programming assignment #

The last 2 assignments have been combined into 1, and will be spread over the next 2 weeks.

image_2021-06-24-14-58-42

Since the number one use of asymmetric cryptography is to establish a key to use in symmetric cryptography (the idea of hybrid cryptography), we will implement a key negotiation.

We will use

  • perm384hash256, and
  • BIGNUM a big number library part of OpenSSL

BIGNUM #

A BIGNUM example for calculating gcd:

#include <openssl/bn.h>
#include <stdio.h>

void gcd(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx)
{
    // create BIGNUM pointers
    BIGNUM * c = BN_new();
    BIGNUM * d = BN_new();
    BIGNUM * td = BN_new();

    // can't use assignment operator =, must use BN_copy
    BN_copy(c, a);
    BN_copy(d, b);

    // while d isn't 0
    while (!BN_is_zero(d))
    {
        BN_copy(td, d);             // td = d
        BN_mod(d, c, td, ctx);      // d = c % td
        BN_copy(c, td);             // c = td
    }

    // Copy results
    BN_copy(r, c);

    // free allocated BIGNUMs
    BN_free(c);
    BN_free(d);
    BN_free(td);
}

int main()
{
    // BIGNUM context
    BN_CTX *ctx = BN_CTX_new();

    // two 800 bit numbers, a and b
    unsigned char a[20] = {0xff,};
    unsigned char b[20] = {0xee,};

    // convert binary to BIGNUM
    BIGNUM *bna = BN_bin2bn(a, sizeof(a), NULL);
    BIGNUM *bnb = BN_bin2bn(b, sizeof(b), NULL);

    // two BIGNUMs for comparison
    BIGNUM *r_me = BN_new();
    BIGNUM *r_ssl = BN_new();

    // gcd(a, b)
    gcd(r_me, bna, bnb, ctx);
    BN_gcd(r_ssl, bna, bnb, ctx);

    // check if they're equal
    if (BN_cmp(r_ssl, r_me) != 0)
        printf("Not equal!\n");

    // convert BIGNUMs to strings
    char *as = BN_bn2dec(bna);
    char *bs = BN_bn2dec(bnb);
    char *rs = BN_bn2dec(r_me);
    printf("GCD of\n%s\nand\n%s\nis\n%s\n",as,bs,rs);
    return 0;
}

Some example output:

GCD of
1455792646560079078679451688838485039110401556480
and
1358739803456073806767488242915919369836374786048
is
97052843104005271911963445922565669274026770432