James Manger’s CCA on RSAES-OAEP

Introduction

98年，Daniel Blechenbacher把PKCS #1 v1.5 RSA给日了^[1]，所以PKCS #1 v2.0 RSA加入了这个Optimal Asymmetric Encryption Padding(OAEP)。

PKCS #1 v2.0 RSA says, “a chosen ciphertext attack is ineffective to against a plaintext-aware encryption scheme such as RSAES-OAEP”.

但是不料的是，01年，OAEP又被James Manger给日了^[2]。（后来又有了OAEP+）

？

本文即复现一波OAEP是怎么被日的。

RSAES-OAEP

一般来说，比较短的message（比如说用于DES、AES等对称密码加密用的密钥，都很短，64bit、128bit等等），直接用RSA对它pow(m, e, n)是不太安全的。其中的一个例子就是^[3]，存在一个平方根复杂度的算法来破解这个密钥。

所以，padding就显得十分重要。

OAEP就是一种对RSA加密的一种填充方案。

关于OAEP具体的流程，wiki^[4]上讲的很不错，但具体实现如何在这次的CCA里面并不是那么的关键。

先来关注一下OAEP的解密流程：

Screen Shot 2020-03-14 at 4.03.42 AM

解密的时候，先用pow(c, d, n)得到plaintext（这个图是从下往上的）。

然后由于这个OAEP的原因，plaintext的开头必须得是8个0bit（比如说模数为1024bit，那么这个解密出来的plaintext就必须得小于等于1014bit）。

如果不是这样的话（开头不是8个0bit），就会报错，返回integrity check fails。

如果是这样子的话（开头是8个bit），那么就再继续下去，看能不能通过后面的检测，没通过也报错，返回decryption error；通过了，就返回解密成功。

Chosen Ciphertext Attack

我们先令$k = \lceil{log_{2^8}{n}}\rceil$，也就是$n$的bytes数；$B = 2^{8k - 8}$。

注意到这边，它报错会有两种情况：

integrity check fails：pow(c, d, n) >= B
decryption fails：pow(c, d, n) < B

其实上，在这种decryption oracle下，我们是可以仅凭这两种报错信息来操作出明文的。

RSA一个很fishy的一点就是它的malleability。

即，$E(m) = m^e \pmod{n}$，那么$E(m)\cdot t^e = m^e \cdot t^e \pmod{n} = E(m\cdot t)$。

我们可以利用这一点来破解某个给定的密文$c \equiv m^e \pmod{n}$（其中$m < B$，由于OAEP的原因）。

虽然直接发送$c$给这个oracle是没什么用的（返回解密成功），但是我们可以发$f^e \cdot c \pmod {n}$过去，这样它pow(c, d, n)解密了就是$f \cdot m \pmod{n}$，显然这个样子的解密结果是不可能返回解密成功的（由于OAEP的检测）。

但我们可以从oracle的两种不同的报错信息来判断出这个$f \cdot m \pmod{n}$是在区间$[0, B)$还是$[B, n)$中。

这种攻击的一个前提是：$2B < n$，显然这基本是肯定的。

攻击流程如下：

Screen Shot 2020-03-14 at 4.53.14 AM

Screen Shot 2020-03-14 at 4.53.33 AM

Screen Shot 2020-03-14 at 4.53.52 AM

Screen Shot 2020-03-14 at 5.57.04 AM

具体的攻击流程，实在是有些难懂，我看这个看睡着了2次。。。

不过能感觉得出来，这个跟RSA parity oracle attack有很多相似之处，但这个稍微复杂一些。

Reproduction

先写了一个decryption oracle（OAEP的后面的操作省略了）:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


import os

from Crypto.Util.number import *

p = getPrime(512)
q = getPrime(512)
n = p*q
e = 0x10001
print(p, q, n, sep='\n')

m = bytes_to_long(os.urandom(16))
print(m)
c = pow(m, e, n)
print(c)
d = inverse(e, (p-1)*(q-1))

B = 2**(1024-8)
while True:
    inp = int(input("$ "))
    print(pow(inp, d, n) > B)

然后根据攻击流程写出exp：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71


from pwn import *

r = process(argv=["python3", "OAEP.py"])

e = 0x10001
p, q, n, m, c = [int(r.recvline()) for _ in range(5)]
# print(p,q,n,m,c)
B = 2**(1024-8)

# Step 1
f1 = 1
i = 0
while True:
    i += 1
    f1 = f1 * 2 % n  # start from 2
    payload = pow(f1, e, n) * c % n # Decrypt: f1 * m
    r.sendlineafter(b"$ ", str(payload))
    response = r.recvline(keepends=False)
    print(i, response)
    if response == b"True": # f1 * m > B
        break
    # else: continue

# Now, f1 * m is in [B, 2B), hence f1/2 * m is in [B/2, B)
f1_2 = f1 // 2
assert(f1_2*m >= B//2 and f1_2*m < B)


# Step 2
f2 = ((n+B)//B) * f1_2
i = 1
while True:
    if i > 1:
        f2 = f2 + f1_2 # f2 * m is in [n/2, n+B)
    i += 1
    payload = pow(f2, e, n) * c % n # Decrypt: f2 * m
    r.sendlineafter(b"$ ", str(payload))
    response = r.recvline(keepends=False)
    print(i, response)
    if response == b"False": # n < f2 * m < n + B
        break
    # else: continue

# Now, f2 * m is in [n, n+B)
print(f2*m >= n and f2*m < n+B)


ceiling = lambda a, b: a//b + (1 if a % b != 0 else 0)
# Step 3
# f2 * m is in [n, n+B), so m is in [n/f2, (n+B)/f2)
m_min = ceiling(n, f2)
m_max = (n+B) // f2
time = 0
while m_min != m_max:
    time += 1
    f_tmp = (2*B) // (m_max - m_min) # ftmp * m is in [2n, 2n+2B)
    i = (f_tmp * m_min) // n
    f3 = ceiling(i*n, m_min)
    payload = pow(f3, e, n) * c % n
    r.sendlineafter(b"$ ", str(payload))
    response = r.recvline(keepends=False)
    print(i, time, response)
    if response == b"True": # i*n + B < f3 * m < i*n + 2*B
        m_min = ceiling(i*n + B, f3)
    else:  # i*n < f3 * m < i*n + B
        m_max = (i*n+B) // f3

print(m_min == m == m_max)


r.interactive()

等一连串的调试信息刷过后，可以看到最后一句print(m_min == m == m_max)的输出是True。

PoC成功！

Extension

照这个思路的话，只要是所有满足$2B < n$的$B$，给了这样一个范围信息，都是可以操作出明文的？？

注意到Step 2是会在最多ceil(n / B)次交互后停下来的，所以这个B不能搞的太小了。如果太小了的话，f1会很小，导致f2增长地极慢，会让交互的次数变得很多。

修改了一下代码中的B，发现的确如此，有点意思。。

那如果$2B > n$呢？

Screen Shot 2020-03-14 at 5.59.00 AM

也是能操作的，不过更复杂。。

实际上，PKCS #1 v2.0已经意识到了这个问题，所以它明确指出了：

It is important that the error messages output in steps 4 [interger-to- octets conversion] and 5 [OAEP decoding] be the same.

不过这种攻击在某些情境下还是有可能发生的。。

Bleichenbacher, Daniel. “Chosen ciphertext attacks against protocols based on the RSA encryption standard PKCS# 1.” Annual International Cryptology Conference. Springer, Berlin, Heidelberg, 1998.
Manger, James. “A chosen ciphertext attack on RSA optimal asymmetric encryption padding (OAEP) as standardized in PKCS# 1 v2. 0.” Annual International Cryptology Conference. Springer, Berlin, Heidelberg, 2001.
Boneh, Dan, Antoine Joux, and Phong Q. Nguyen. “Why textbook ElGamal and RSA encryption are insecure.” International Conference on the Theory and Application of Cryptology and Information Security. Springer, Berlin, Heidelberg, 2000.
https://en.wikipedia.org/wiki/Optimal_asymmetric_encryption_padding