「Cryptography-Boneh」:Stream Cipher 2

2020-03-192021-12-28 Cryptography-Boneh 28 minutes read (About 4223 words) 0 visits

作为Stream Cipher的第二篇文章。
第一部分分析了基于Stream Cipher的两种攻击：第一种是Two time pad,第二种是对与其完整性的攻击，即流密码是可被篡改的。
第二部分具体说明了一些使用流密码加密的例子。包括分析基于软件的RC4流密码、基于硬件的CSS流密码和现代的安全流密码:eStream中的Salsa20。

Attack on OTP and stream ciphers

Attack1: two time pad is insecure

Never use strame cipher key more than once!!

why insecure

使用相同的PRG(k)加密不同明文时：
$$
C_1 \leftarrow m_1 \oplus \text{PRG(k)}\
C_2 \leftarrow m_2 \oplus \text{PRG(k)}
$$
Eavesdropper（窃听者）截获这两段密文 $C_1\ C_2$ ，对密文进行疑惑操作，可得： $C_1 \oplus C_2\rightarrow m_1\oplus m_2$ 。

在传输中，英语字母是用ASCII编码后再传输，所以这样的编码会带来很多redundancy（冗余），即根据 $m_1\oplus m_2$ ，可以得到 $m_1\ m_2$ 。

因此，当一个密钥会被使用多次时，就不应该直接用stream cipher，后面的章节会介绍multi-use ciphers。

Examples: Project Venona(1941-1946)

我们已经知道：加密应该用OTP，即一次性密钥。

但是，当时是通过人工掷骰子并记录得到密钥，工作费时费力。因此不得不用生成的密钥加密多条消息。

最后仅凭截获密文，就破译了3000多条消息。

Examples: MS-PPTP(Windows NT)

微软在Windows NT的PPTP协议（point to point transfer protocol）中使用的流密码是：RC4。

在这个协议中允许一个端系统向另一个端系统发送加密后的信息。

过程如下：

在一次对话连接中：主机想发送$m_1\ m_1\ m_3$ 三条消息进行查询，服务器想发送 $s_1\ s_1\ s_3$ 三条消息进行响应。
主机和服务器hava a shared key:k。
知道密钥不能加密多次，于是主机将三条消息进行concatenation（联结）： $m_1||m_2||m_3$ 。
主机用k作为密钥，得到G(k)，进行加密 $[m_1||m_2||m_3]\oplus\text{G(k)}$ 。
同样，服务器也将响应消息进行联结： $s_1||s_2||s_3$ 。
服务器也用k作为密钥，得到相同的G(k)，对响应消息进行加密 $[s_1||s_2||s_3]\oplus\text{G(k)}$ 。

因此，在一次对话中，主机和服务器都使用了相同的 G(k)进行加密，也就是 two time pad。

如何改进

主机和服务器have a shared pair of key, 即主机和服务器都使用不同的key进行加密。

Examples: 802.11b WEP

How it works

WEP(Wired Equivalent Privacy)，有效等效加密，是一种用于IEEE 802.11b的一种安全算法。这个算法设计的很糟糕，现已被WPA所淘汰。

WEP用于Wi-Fi通信，是他的的加密层。

WEP的算法过程如下：

主机和路由器进行通信：

主机和路由 have a shared key。
主机想要发送一段消息，包括明文m和其校验码CRC(m)。
PRG’s seed： IV||k, k is a long term key，IV is a counter.

Length of IV: 24 bits.

IV的作用：每一次传送数据包时，用IV来改变每次的密钥。
用(IV||k作为密钥，得到PRG(IV||k),使用流密码进行加密传输。
主机直接发送IV和密文。
路由器用收到的IV和k连接，用PRG(IV||k)，对密文解密。

Problems of IV

IV 导致的问题1: two time pad

Length of IV: 24 bits

Related IV after $2^{24}$ (16M frames)

【当发送16百万的帧后，PRG又会重复】
On some 802.11 cards: IV rests to 0 after power cycle.

【在有些单片机上，重启后IV会变成0：每次重启都会使用PRG(0||k)加密】

IV 导致的问题2: related keys

在PRG中，key for frame is related。(IV||k)，k是104 bits, IV 是24 bits，所以key的后104位总是相同的，不同密钥之间的差异也很小。

对RC4 PRG的攻击：

Fluhrer, Mantin, and Shamir在2001年:只需要监听 $10^6$ 帧即可破译密钥[1]。
Recent attacks：只需要监听4000帧，即可破译WEP网络中的密钥。

所以，密钥关联是灾难性的。

Avoid related keys！

A better construction

对于WEP，一种更好的做法是：将多个要发送的帧联结起来，得到 $m_1||m_2||…||m_n$ 长流，再用PRG对这个长流加密。

如上图所示，k扩展后，被分成很多段，用第一段加密第一帧，第二段加密第二帧……。

这样，加密每一帧的密钥都是一个伪随机密钥。(no relationship, and looks like random)。

当然，更好的解决方法是使用更强的加密算法（WPA2）。

Examples: disk encryption

另一个例子是硬盘加密，在这个例子中，你会发现：使用流密码对硬盘加密不是一个好的举措。

如果使用流密码：

Alice 想要给Bob写一封邮件，如上图所示。
- 邮件经过硬盘加密（流密码）后，存入内存块。
Alice 想要对存在这个硬盘中的邮件进行修改。
- Alice只把Bob改成了Eve，其他部分都没有变，如上图所示。
- 保存后，邮件再次经过硬盘加密（流密码）后，存入内存块。
Attacker：他得到了硬盘上最初的密文和修改后的密文。
- 通过分析，他发现两段密文只有前小部分不同。（用相同的流密码密钥加密，修改后，密文很容易看出变化）
- 虽然Attacker不知道Alice是怎么修改的，但是他知道了Alice修改的具体位置。
- $\Rightarrow$ attacker得到了他不应该知道的信息，即修改的具体位置。

在硬盘加密中，对于文本内容的修改前后，也使用了相同的密钥段加密不同的消息，即two time pad。

因此在硬盘加密中，不推荐使用流密码。

Two time pad: Summary

Never use stream cipher key more than once!!

Network traffic: negotiate new key for every session.
Disk encryption: typically do not use a stream cipher.

Attack2: no integrity(OTP is malleable)

OPTP和Stream Cipher一样，不提供完整性的保证，只提供机密性的保证。

如上图所示：

如果attacke截获：密文 $m\oplus k$ 。
并用sub-permutation key（子置换密钥）来对密文进行修改，得到新的密文：$(m\oplus k)\oplus p$
新的密文最后解密得到的明文是 $m\oplus p$ 。

所以对于有修改密文能力的攻击者来说，攻击者很容易修改密文，并且修改后的密文，对原本解密后的明文也有很大的影响。

具体攻击如下：

Bob想要发送一封邮件：消息开头是From: Bob，使用OTP加密后，发送密文。
Attacker：截获了这段密文
- 假设：attacker知道这封邮件是来自Bob。
- attacker想修改这封密文邮件，使得它来自others。
- 于是它用一个子置换密钥对原密文的特定位置（即Bob密文位置）进行操作，得到新的密文：From： Eve。
  - 这个子置换密钥是什么呢？
    
    如上图所示，Bob的ASCII码是 42 6F 62，Eve的ASCII码是 45 76 65。
    
    那么Bob $\oplus$ Eve的ASCII码是 07 19 07。
    
    因此这个子置换密钥是07 19 07。
最后收件人进行解密，得到的是明文：From：Eve。

对attacker来说，虽然他不能创建来自Eve的密文，但是他可以通过修改原本的密文，达到相同的目的。

Conclusion: Modifications to ciphertext are undertected and have predictable impact on plaintext.

Real-world Stream Ciphers

Old example(SW): RC4

RC4流密码，是Ron RivestRC4在1987年设计的。曾用于secure Web traffic(in the SSL/TLS protocol) 和wireless traffic (in the 802.11b WEP protocol).

It is designed to operate on 8-bit processors with little internal memory.

At the heart of the RC4 cipher is a PRG, called the RC4 PRG. The PRG maintains an internal

state consisting of an array S of 256 bytes plus two additional bytes i,j used as pointers into S.

【RC4 cipher的核心是一个PRG，called the RC4 PRG。这个PRG的内部状态是一个256字节的数组S和两个指向S数组的指针】

RC4 stream cipher generator

setup algorithms:
- 对S数组进行初始化，0-255都只出现一次入下图所示：
- 伪代码
stream generator:
- Once tha array S is initialized, the PRG generates pseudo-random output one byte at a time, using the following stream generator:
- The procedure runs for as long as necessary.
  
  Again, the index i runs linearly through the array while the index j jumps around.

Security of RC4

具体参见「A Graduate Course in Applied Cryptography」的p76-78

挖坑，有空填坑 cryptanalysis of RC4[2]

Weakness of RC4

Bias in initial output: Pr[2^nd^ byte=0]=2/256.
- 如果PRG是随机的，其概率应该是1/256。
  
  而Pr[2^nd^ byte=0]=2/256的结果是：消息的第二个字节不被加密的概率比正常情况多一倍。（0异或不变）
- 统计的真实情况是，不止第二个字节，前面很多字节的概率很不都均匀。
  
  因此，如果要使用RC4 PRG，从其输出的257个字节开始使用。
Prob. of (0,0) is 1/256^2^ +1/256^3^ .
- 如果PRG是随机的，00串出现的概率应该是(1/256)^2^ .
Related key attackes.

在上小节中「Examples: 802.11b WEP」，WEP使用的RC4流密码，related key对安全通信也是灾难性的。

Old example(HW): CSS (badly broken)

The Content Scrambling System (CSS) is a system used for protecting movies on DVD disks.

It uses a stream cipher, called the CSS stream cipher, to encrypt movie contents.

CSS was designed in the 1980’s when exportable encryption was restricted to 40-bits keys. As a result, CSS encrypts movies using a 40-bits key.

【1980的美国出口法，限制出口的加密算法只能是40位，于是CSS的密钥是40位】[amazing.jpg]

While ciphers using 40-bit keys are woefully insecure, we show that the CSS stream cipher is particularly weak and can be broken in far less time than an exhaustive search over all 2^40^ keys.

【虽然40位的密钥本来就不够安全，但是我们能用远小于穷举时间的方法破解CSS】

因为博主也是第一次学，很多东西也不了解。

所以概述性语言，我用教科书的原文记录，附注一些中文。望海涵～

Linear feedback shift register(LFSR)

CSS 流密码是由两个LFSR（线性反馈移位寄存器）组成的，除了CSS，还有很多硬件系统是基于LFSR进行加密操作，但无一例外，都被破解了。

DVD encryption (CSS)：2 LFSRs
GSM encryption (A5/,2): 3 LFSRs

【全球通信系统】
Bluetooth(E0): 4LFSRs

LFSR can be implemented in hardware with few transistors. And a result, stream ciphers built from LFSR are attractive for low-cost consumer electronics such as DVD players, cell phones, and Bluetooth devices.

【LFSR在硬件上运行很快，也很省电，所以虽然基于LFSR的算法都被破解了，但是改硬件有点困难，所以还是有很多系统在使用】

上图是一个8位LFSR{4,3,2,0}。

LFSR是由一组寄存器组成，LFSR每个周期输出一位（即0位）。

有一些位（如上图的4，3，2，0）称为tap positions(出头)，通过这些位计算出feedback bit(反馈位)。

反馈位和寄存器组的未输出位组成新的寄存器组。

伪代码如下：

所以基于LFSR的PGR的seed是寄存器组的初始状态。

how CSS works

CSS的seed=5 bytes=40 bits。

CSS由两个LFSR组成，如下图所示，一个17-bit LFSR，一个25-bit LFSR。

seed of LFSR:

17-bit LFSR: 1||first 2 bytes ，即17位。
25-bit LFSR: 1||last 3 bytes，即25位。

CSS过程：

两个LFSR分别运行8轮，输出8 bits。
两个8bits 相加mod 256（还有加上前一块的进位）即是CSS一轮的输出：one byte at a time.

Cryptanalysis of CSS (2^16 time attack)

当已知CSS PRG的输出时，我们通过穷举（2^40^ time）破解得到CSS的seed。

但还有一种更快的破解算法，只需要最多2^16^ 的尝试。

破解过程如下：

影片文件一般是MPEG文件，假设我们已知明文MPEG文件的前20个字节。（已知明文的prefix）
CSS是流密码，可以通过已知prefix还原出CSS的prefix，即CSS PRG的前20个字节的输出。
For all possible initial settings of 17-bit LFSR do:
- run 17-it LFSR to get 20 bytes of output.
  
  【先让17-bit输出20个字节】
- subtract from CSS prefix $\Rightarrow$ candidate 20 bytes output of 25-bit LFSR.
  
  【通过还原的CSS PRG的输出，得到25-bit输出的前20个字节】
- If consistent with 25-bit LFSR, found correct initial settings of both!!
  
  【具体是如何判别这20个字节是否是一个25-bit LFSR的输出呢？】
  1. 假设前三个字节是y1, y2, y3.
  2. 那么25-bit LFSR的initial :{1, y1 , y2, y3},其中y都是8 bits.
  3. 用这个初始状态生成20个字节，如果和得到的20个字节相同，则正确，否则再次枚举。
当得到了两个LFSR的seed, 就可以得到CSS PRG的全部输出，即可破解。

Modern stream ciphers: eStream

main idea

eStream PRG ： $\{0,1\}^s\times R \rightarrow \{0,1\}^n$ (n>>s)

seed: ${0,1}^s$

nonce: a non-repeating value for a given key.【就对于确定的seed,nonce绝不重复】
Encryption: $\text{E(k, m; r)}=\text{m}\oplus \text{PRG(k; r)} $

The pair (k,r) is never used more than once.

【PRG的输入是(k,r), 确保OTP】

eStram: Salsa 20(SW+HW)

Salsa20/12 and Salsa20/20 are fast stream ciphers designed by Dan Bernstein in 2005.

Salsa 20/12 is one of four Profile 1 stream cipher selected for the eStream portfolio of stream ciphers.

eStream is a project that identifies fast and secure stream ciphers that are appropriate for practicle use.

Variants of Salsa20/12 and Salsa20/20, called ChaCha12 and ChaCha20 respectively, were proposed by Bernstein in 2008.

These stream ciphers have been incorporated into several widely deployed protocols such as TLS and SSH.

Salsa20 PRG: $\{0,1\}^{128 \text { or } 256} \times\{0,1\}^{64} \longrightarrow\{0,1\}^{n}$ (max n = 2^73^ bits)

Salsa20 PRG $(\mathrm{k} ; \mathrm{r}):=\mathrm{H}(\mathrm{k},(\mathrm{r}, 0)) | \mathrm{H}(\mathrm{k},(\mathrm{r}, 1)) | \ldots$ 通过计数器，使得输出联结，可以输出as long as you want pseudorandom segment.

算法过程如上图所示：Salsa20 PRG $(\mathrm{k} ; \mathrm{r}):=\mathrm{H}(\mathrm{k},(\mathrm{r}, 0)) | \mathrm{H}(\mathrm{k},(\mathrm{r}, 1)) | \ldots$

32 bytes的{k,r,i}通过扩展得到64 bytes的{ $\tau_0,k,\tau_1,r,i,\tau_2,k,\tau_3$ }.
- k :16 bytes的seed.
- i: 8 bytes的index，计数器。
- r: 8 bytes的nonce.
- $\tau_{0,1,2,3}$ 都是4 bytes的常数，Salsa20算法说明书上公开确定的值。
64 bytes 通过h函数映射，十轮。
- h : invertible function designed to be fast on x86(SEE2).
  
  在x86上有SEE2指令可以直接运行h函数，所以很快。
- h是逆函数（也是公开的函数），输出可以通过逆运算得到其输入。
- h是一个一一映射的map，每一个64bytes的输入都有唯一对应的64 bytes的输出。
将第十轮H函数的输出和最开始的输入做加法运算，word by word(32位)，即每4 bytes相加。
- 为什么要有这一步？
  
  h是可逆运算，如果直接将函数的输出作为PRG的输出，那可以通过h的逆运算得到原64 bytes，也就得到了(k; r).

Is Salsa20 secure(unpredictable)

前文我们通过unpredictable来定义PRG的安全（下一篇文章会给出安全PRG更好的定义），所以Salsa20 安全吗？是否是不可预测的？

Unknown：no known provably secure PRGs.

【不能严格证明是一个安全PRG（后文会继续讲解什么是安全的PRG），如果严格证明了，也就证明了P=NP】
In reality： no known attacks bertter than exhaustive search.

【现实中还没有比穷举更快的算法】

Performance

通过下图的比较，如果在系统中需要使用流密码，建议使用eStream。

speed ：该算法每秒加密多少MB的数据。

Reference

S. Fluhrer, I. Mantin, and A. Shamir. Weaknesses in the key scheduling algorithm of RC4.

In proceedings of selected areas of cryptography (SAC), pages 1-24, 2001.
「A Graduate Course in Applied Cryptography」p76-78:挖坑待补充

「Cryptography-Boneh」:Stream Cipher 2

https://f7ed.com/2020/03/19/stanford-crypto-streamcipher2/

Author

f7ed

Posted on

2020-03-19

Updated on

2021-12-28

Licensed under

CC BY-NC-SA 4.0

「Cryptography-Boneh」:Stream Cipher 2

Attack on OTP and stream ciphers

Attack1: two time pad is insecure

why insecure

Examples: Project Venona(1941-1946)

Examples: MS-PPTP(Windows NT)

如何改进

Examples: 802.11b WEP

How it works

Problems of IV

A better construction

Examples: disk encryption

如果使用流密码：

Two time pad: Summary

Attack2: no integrity(OTP is malleable)

具体攻击如下：

Real-world Stream Ciphers

Old example(SW): RC4

RC4 stream cipher generator

Security of RC4

Weakness of RC4

Old example(HW): CSS (badly broken)

Linear feedback shift register(LFSR)

how CSS works

Cryptanalysis of CSS (2^16 time attack)

Modern stream ciphers: eStream

main idea

eStram: Salsa 20(SW+HW)

Is Salsa20 secure(unpredictable)

Performance

Reference

Author

Posted on

Updated on

Licensed under

Like this article? Support the author with

Comments

Catalogue