Disk encryption theory

From Wikipedia, the free encyclopedia

Jump to: navigation, search

Disk encryption is a special case of data at rest protection when the storage media is a sector-addressable device (e.g., a hard disk). This article presents cryptographic aspects of the problem. For discussion of different software packages and hardware devices devoted to this problem see disk encryption software and disk encryption hardware.

Contents

[edit] Problem definition

Disk encryption methods aim to provide three distinct properties:

  1. The data on the disk should remain confidential
  2. Data retrieval and storage should both be fast operations, no matter where on the disk the data is stored.
  3. The encryption method should not waste disk space.

The first property requires defining an adversary with respect to whom the data is being kept confidential. The strongest adversaries studied in the field of disk encryption have these abilities: they can read the raw contents of the disk at any time; they can request the disk to encrypt and store arbitrary files of their choosing; and they can modify unused sectors on the disk and then request their decryption. A method provides good confidentiality if the only information such an adversary can determine over time is whether the data in a sector has or has not changed since the last time they looked.

The second property requires dividing the disk into several sectors, usually 512 bytes (4096 bits) long, which are encrypted and decrypted independently of each other. In turn, if the data is to stay confidential, the encryption method must be tweakable -- no two sectors should be processed in exactly the same way. Otherwise, the adversary could decrypt any sector of the disk by copying it to an unused sector of the disk and requesting its decryption.

The third property is generally noncontroversial. However, it indirectly prohibits the use of stream ciphers, since stream ciphers require, for their security, that the same initial state not be used twice; and this would require an encryption method to store separate initial states for every sector on disk -- seemingly a clear waste of space. The alternative -- a block cipher -- is limited to a certain block size (usually 128 or 256 bits). Because of this, disk encryption chiefly studies chaining modes, which expand the block cipher by a factor of 16x or 32x. The considerations already listed make two well-known chaining modes unsuitable: ECB mode cannot be tweaked, and CTR mode turns block ciphers into stream ciphers.

These three properties do not provide any assurance of disk integrity -- that is, they don't tell you whether an adversary has been modifying your ciphertext. In part, this is because an absolute assurance of disk integrity is impossible: no matter what, an adversary could always revert the entire disk to a prior state, circumventing any such checks. If some non-absolute level of disk integrity is desired, it can be achieved within the encrypted disk on a file-by-file basis using message authentication codes.

[edit] CBC-based approaches

Cipher block chaining (CBC) is a common chaining mode in which the previous block's ciphertext is xored with the current block's plaintext before encryption:

C_i = E_K(C_{i-1} \oplus P_i)

Since there isn't a "previous block's ciphertext" for the first sector, an initialization vector (IV) must be used as C − 1. This, in turn, makes CBC tweakable in some ways.

CBC suffers from some major problems. Because the IV only directly affects the encryption and decryption of the first block on the sector, an adversary who can request the decryption of a free sector can also decrypt almost all of any other sector they choose. They simply copy the sector they're interested in to the free sector space, and request its decryption. Every block except for the first will be properly decrypted to its plaintext, revealing 15/16 or 31/32 of the original plaintext. This applies no matter how secure you make the IV: it is a property of CBC that every block except for the first one in the sector has an IV input which is public knowledge.

Even if an adversary cannot request the decryption of free sectors, if the IVs are predictable, and the adversary can modify a file that they think the user will store, the adversary could maliciously construct the file to leave a "watermark" on the disk, proving that it is, indeed, stored on disk. The "watermark" component could then be inserted in a code comment so that the user is never aware of its existence in the first place. The exact method of constructing the watermark depends on the exact function providing the IVs; but the general recipe is to create two sectors which are identical except for their first blocks b1 and b2; these two are then related to each other by b_2 \oplus IV_2 = b_1 \oplus IV_1. Thus, when these two sectors are encrypted, they both encrypt to the same thing, leaving a watermark on the disk. The exact pattern of "same-different-same-different" on disk can then be altered to make the watermark unique to a given file.

To protect against the watermarking attack, a stream cipher is often used to generate the IVs from the key, so that an adversary cannot predict them. In particular, the ESSIV approach discussed below uses a block cipher in CTR mode to generate the IVs. Another option would supplement this IV with a hash function applied to every block but the first. However, neither of these options guards against the decryption attack above.

[edit] LRW

In order to prevent such elaborate attacks, different modes of operation were introduced: tweakable narrow-block encryption (LRW and XEX) and wide-block encryption (CMC and EME).

Whereas a purpose of a usual block cipher EK is to mimic a random permutation for any secret key K, the purpose of tweakable encryption E_K^Tis to mimic a random permutation for any secret key K and any known tweak T. The tweakable narrow-block encryption (LRW)[1] is an instantiation of the mode of operations introduced by Liskov, Rivest, and Wagner[2] (see Theorem 2). This mode uses two keys: K is the key for the block cipher and F is an additional key of the same size as block. For example, for AES with a 256-bit key, K is a 256-bit number and F is a 128-bit number. Encrypting block P with logical index (tweak) I uses the following formula: E_K(P \oplus X) \oplus X, where X = F \otimes I. Here multiplication \otimes and addition \oplus are performed in the finite field (GF(2128) for AES). With some precomputation, only a single multiplication per sector is required (note that addition in a binary finite field is a simple bitwise addition, also known as xor): F \otimes I = F \otimes (I_0 \oplus \delta) = F \otimes I_0 \oplus F \otimes \delta, where F \otimes \delta are precomputed for all possible values of δ. This mode of operation needs only a single encryption per block and protects against all the above attacks except a minor leak: if the user changes a single plaintext block in a sector then only a single ciphertext block changes. (Note that this is not the same leak the ECB mode has: with LRW mode equal plaintexts in different positions are encrypted to different ciphertexts.)

Some security concerns exist with LRW, and this mode of operation has now been replaced by XTS.

LRW is employed by the FreeOTFE, Bestcrypt and dm-crypt disk encryption systems.

[edit] XEX

Another tweakable encryption mode XEX (Xor-Encrypt-Xor), was designed by Rogaway[3] to allow very efficient processing of consecutive blocks. The key K is divided into two parts of equal size: K = K_1 \| K_2. The tweak is represented as a combination of the sector address and index of the block inside the sector (the original XEX mode proposed by Rogaway[3] allows to have several indexes). To encrypt block j in sector I, the following formula is used C = E_{K_1}(P \oplus X) \oplus X, where X = E_{K_2}(I) \otimes \alpha^j and α is the primitive element of GF(2128) defined by polynomial x (0x2 in hexadecimal).

The basic blocks of the LRW mode (AES cipher and Galois field multiplication) are the same as the ones used in the Galois/Counter Mode (GCM) thus permitting a compact implementation of the universal LRW/XEX/GCM hardware.

[edit] XTS

XTS is XEX-based Tweaked CodeBook mode (TCB) with CipherText Stealing (CTS). Although XEX-TCB-CTS should be abbreviated as XTC, “C” was replaced with “S” (for “stealing”) to avoid confusion with the abbreviated ecstasy. Ciphertext stealing provides support for sectors with size not divisible by block size, for example, 520-byte sectors and 16-byte blocks. XTS-AES was standardized in 2007-12-19[1] as IEEE P1619 Standard for Cryptographic Protection of Data on Block-Oriented Storage Devices.

As of May 2009, XTS is supported by dm-crypt, FreeOTFE, TrueCrypt, DiskCryptor and OpenBSD softraid disk encryption software.

[edit] CMC and EME

CMC and EME protect even against the minor leak mentioned above. Unfortunately, the price is a twofold degradation of performance: each block must be encrypted twice; many consider this to be too high a cost, since the same leak on a sector level is unavoidable anyway.

CMC, introduced by Halevi and Rogaway, stands for CBC-mask-CBC: the whole sector encrypted in CBC mode (with C − 1 = EA(I)), the ciphertext is masked by xoring with 2(C'_0 \oplus C'_{k-1}), and decrypted in CBC mode starting from the last block. When the underlying block cipher is a strong pseudorandom permutation (PRP) then on the sector level the scheme is a tweakable PRP. One problem is that in order to decrypt P0 one must sequentially pass over all the data twice.

In order to solve this problem, Halevi and Rogaway introduced a parallelizable variant called EME (ECB-mask-ECB). It works in the following way:

  • the plaintexts are xored with L = EK(0), shifted by different amount to the left, and are encrypted: P'_i = E_K(P_i \oplus 2^i L);
  • the mask is calculated: M = M_P \oplus M_C, where M_P = I \;\oplus\; \bigoplus P'_i and MC = EK(MP);
  • intermediate ciphertexts are masked: C'_i = P'_i \oplus 2^i M for i = 1, \ldots, k-1 and C'_0 = M_C \oplus I \oplus \bigoplus_{i=1}^{k-1} C'_i;
  • the final ciphertexts are calculated: C_i = E_K(C'_i) \oplus 2^i L for i = 0, \ldots, k-1.

Note that unlike LRW and CMC there is only a single key K.

CMC and EME were considered for standardization by SISWG. CMC was rejected for technical considerations.[citation needed] EME is patented, and so is not favored to be a primary supported mode.[4]

[edit] ESSIV

Encrypted Salt-Sector Initialization Vector (ESSIV)[2] is a method for generating initialization vectors for block encryption to use in disk encryption. The usual methods for generating IVs are predictable sequences of numbers based on, for example, time stamp or sector number, and permits certain attacks such as a watermarking attack. ESSIV prevents such attacks by generating IVs from a combination of the sector number with the hash of the key. It is the combination with the key in form of a hash that makes the IV unpredictable.


  \begin{matrix}
  & IV(sector) & = & E_s(Sector), & where & s = Hash_K
  \end{matrix}

ESSIV was designed by Clemens Fruhwirth and has been integrated into the Linux kernel since version 2.6.10, though a similar scheme has been used to generate IVs for OpenBSD's swap encryption since 2000.[3] It is employed by the dm-crypt and FreeOTFE disk encryption systems to increase security.

[edit] See also

[edit] Sources

[edit] References

[edit] Endnotes

  1. ^  Latest SISWG and IEEE P1619 drafts and meeting information are on the P1619 home page [5].
  2. ^  M. Liskov, R. Rivest, and D. Wagner. Tweakable block ciphers [6], CRYPTO '02 (LNCS, volume 2442), 2002.
  3. ^ a  P. Rogaway, Efficient Instantiations of Tweakable Blockciphers and Refinements to Modes OCB and PMAC [7].
  4. ^  P. Rogaway, Block cipher mode of operation for constructing a wide-blocksize block cipher from a conventional block cipher, US Patent Application 20040131182 A1, [8]

[edit] Papers

[edit] External links

  • Security in Storage Working Group SISWG.
Personal tools