|  | ===================================== | 
|  | Filesystem-level encryption (fscrypt) | 
|  | ===================================== | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | fscrypt is a library which filesystems can hook into to support | 
|  | transparent encryption of files and directories. | 
|  |  | 
|  | Note: "fscrypt" in this document refers to the kernel-level portion, | 
|  | implemented in ``fs/crypto/``, as opposed to the userspace tool | 
|  | `fscrypt <https://github.com/google/fscrypt>`_.  This document only | 
|  | covers the kernel-level portion.  For command-line examples of how to | 
|  | use encryption, see the documentation for the userspace tool `fscrypt | 
|  | <https://github.com/google/fscrypt>`_.  Also, it is recommended to use | 
|  | the fscrypt userspace tool, or other existing userspace tools such as | 
|  | `fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key | 
|  | management system | 
|  | <https://source.android.com/security/encryption/file-based>`_, over | 
|  | using the kernel's API directly.  Using existing tools reduces the | 
|  | chance of introducing your own security bugs.  (Nevertheless, for | 
|  | completeness this documentation covers the kernel's API anyway.) | 
|  |  | 
|  | Unlike dm-crypt, fscrypt operates at the filesystem level rather than | 
|  | at the block device level.  This allows it to encrypt different files | 
|  | with different keys and to have unencrypted files on the same | 
|  | filesystem.  This is useful for multi-user systems where each user's | 
|  | data-at-rest needs to be cryptographically isolated from the others. | 
|  | However, except for filenames, fscrypt does not encrypt filesystem | 
|  | metadata. | 
|  |  | 
|  | Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated | 
|  | directly into supported filesystems --- currently ext4, F2FS, and | 
|  | UBIFS.  This allows encrypted files to be read and written without | 
|  | caching both the decrypted and encrypted pages in the pagecache, | 
|  | thereby nearly halving the memory used and bringing it in line with | 
|  | unencrypted files.  Similarly, half as many dentries and inodes are | 
|  | needed.  eCryptfs also limits encrypted filenames to 143 bytes, | 
|  | causing application compatibility issues; fscrypt allows the full 255 | 
|  | bytes (NAME_MAX).  Finally, unlike eCryptfs, the fscrypt API can be | 
|  | used by unprivileged users, with no need to mount anything. | 
|  |  | 
|  | fscrypt does not support encrypting files in-place.  Instead, it | 
|  | supports marking an empty directory as encrypted.  Then, after | 
|  | userspace provides the key, all regular files, directories, and | 
|  | symbolic links created in that directory tree are transparently | 
|  | encrypted. | 
|  |  | 
|  | Threat model | 
|  | ============ | 
|  |  | 
|  | Offline attacks | 
|  | --------------- | 
|  |  | 
|  | Provided that userspace chooses a strong encryption key, fscrypt | 
|  | protects the confidentiality of file contents and filenames in the | 
|  | event of a single point-in-time permanent offline compromise of the | 
|  | block device content.  fscrypt does not protect the confidentiality of | 
|  | non-filename metadata, e.g. file sizes, file permissions, file | 
|  | timestamps, and extended attributes.  Also, the existence and location | 
|  | of holes (unallocated blocks which logically contain all zeroes) in | 
|  | files is not protected. | 
|  |  | 
|  | fscrypt is not guaranteed to protect confidentiality or authenticity | 
|  | if an attacker is able to manipulate the filesystem offline prior to | 
|  | an authorized user later accessing the filesystem. | 
|  |  | 
|  | Online attacks | 
|  | -------------- | 
|  |  | 
|  | fscrypt (and storage encryption in general) can only provide limited | 
|  | protection, if any at all, against online attacks.  In detail: | 
|  |  | 
|  | fscrypt is only resistant to side-channel attacks, such as timing or | 
|  | electromagnetic attacks, to the extent that the underlying Linux | 
|  | Cryptographic API algorithms are.  If a vulnerable algorithm is used, | 
|  | such as a table-based implementation of AES, it may be possible for an | 
|  | attacker to mount a side channel attack against the online system. | 
|  | Side channel attacks may also be mounted against applications | 
|  | consuming decrypted data. | 
|  |  | 
|  | After an encryption key has been provided, fscrypt is not designed to | 
|  | hide the plaintext file contents or filenames from other users on the | 
|  | same system, regardless of the visibility of the keyring key. | 
|  | Instead, existing access control mechanisms such as file mode bits, | 
|  | POSIX ACLs, LSMs, or mount namespaces should be used for this purpose. | 
|  | Also note that as long as the encryption keys are *anywhere* in | 
|  | memory, an online attacker can necessarily compromise them by mounting | 
|  | a physical attack or by exploiting any kernel security vulnerability | 
|  | which provides an arbitrary memory read primitive. | 
|  |  | 
|  | While it is ostensibly possible to "evict" keys from the system, | 
|  | recently accessed encrypted files will remain accessible at least | 
|  | until the filesystem is unmounted or the VFS caches are dropped, e.g. | 
|  | using ``echo 2 > /proc/sys/vm/drop_caches``.  Even after that, if the | 
|  | RAM is compromised before being powered off, it will likely still be | 
|  | possible to recover portions of the plaintext file contents, if not | 
|  | some of the encryption keys as well.  (Since Linux v4.12, all | 
|  | in-kernel keys related to fscrypt are sanitized before being freed. | 
|  | However, userspace would need to do its part as well.) | 
|  |  | 
|  | Currently, fscrypt does not prevent a user from maliciously providing | 
|  | an incorrect key for another user's existing encrypted files.  A | 
|  | protection against this is planned. | 
|  |  | 
|  | Key hierarchy | 
|  | ============= | 
|  |  | 
|  | Master Keys | 
|  | ----------- | 
|  |  | 
|  | Each encrypted directory tree is protected by a *master key*.  Master | 
|  | keys can be up to 64 bytes long, and must be at least as long as the | 
|  | greater of the key length needed by the contents and filenames | 
|  | encryption modes being used.  For example, if AES-256-XTS is used for | 
|  | contents encryption, the master key must be 64 bytes (512 bits).  Note | 
|  | that the XTS mode is defined to require a key twice as long as that | 
|  | required by the underlying block cipher. | 
|  |  | 
|  | To "unlock" an encrypted directory tree, userspace must provide the | 
|  | appropriate master key.  There can be any number of master keys, each | 
|  | of which protects any number of directory trees on any number of | 
|  | filesystems. | 
|  |  | 
|  | Userspace should generate master keys either using a cryptographically | 
|  | secure random number generator, or by using a KDF (Key Derivation | 
|  | Function).  Note that whenever a KDF is used to "stretch" a | 
|  | lower-entropy secret such as a passphrase, it is critical that a KDF | 
|  | designed for this purpose be used, such as scrypt, PBKDF2, or Argon2. | 
|  |  | 
|  | Per-file keys | 
|  | ------------- | 
|  |  | 
|  | Master keys are not used to encrypt file contents or names directly. | 
|  | Instead, a unique key is derived for each encrypted file, including | 
|  | each regular file, directory, and symbolic link.  This has several | 
|  | advantages: | 
|  |  | 
|  | - In cryptosystems, the same key material should never be used for | 
|  | different purposes.  Using the master key as both an XTS key for | 
|  | contents encryption and as a CTS-CBC key for filenames encryption | 
|  | would violate this rule. | 
|  | - Per-file keys simplify the choice of IVs (Initialization Vectors) | 
|  | for contents encryption.  Without per-file keys, to ensure IV | 
|  | uniqueness both the inode and logical block number would need to be | 
|  | encoded in the IVs.  This would make it impossible to renumber | 
|  | inodes, which e.g. ``resize2fs`` can do when resizing an ext4 | 
|  | filesystem.  With per-file keys, it is sufficient to encode just the | 
|  | logical block number in the IVs. | 
|  | - Per-file keys strengthen the encryption of filenames, where IVs are | 
|  | reused out of necessity.  With a unique key per directory, IV reuse | 
|  | is limited to within a single directory. | 
|  | - Per-file keys allow individual files to be securely erased simply by | 
|  | securely erasing their keys.  (Not yet implemented.) | 
|  |  | 
|  | A KDF (Key Derivation Function) is used to derive per-file keys from | 
|  | the master key.  This is done instead of wrapping a randomly-generated | 
|  | key for each file because it reduces the size of the encryption xattr, | 
|  | which for some filesystems makes the xattr more likely to fit in-line | 
|  | in the filesystem's inode table.  With a KDF, only a 16-byte nonce is | 
|  | required --- long enough to make key reuse extremely unlikely.  A | 
|  | wrapped key, on the other hand, would need to be up to 64 bytes --- | 
|  | the length of an AES-256-XTS key.  Furthermore, currently there is no | 
|  | requirement to support unlocking a file with multiple alternative | 
|  | master keys or to support rotating master keys.  Instead, the master | 
|  | keys may be wrapped in userspace, e.g. as done by the `fscrypt | 
|  | <https://github.com/google/fscrypt>`_ tool. | 
|  |  | 
|  | The current KDF encrypts the master key using the 16-byte nonce as an | 
|  | AES-128-ECB key.  The output is used as the derived key.  If the | 
|  | output is longer than needed, then it is truncated to the needed | 
|  | length.  Truncation is the norm for directories and symlinks, since | 
|  | those use the CTS-CBC encryption mode which requires a key half as | 
|  | long as that required by the XTS encryption mode. | 
|  |  | 
|  | Note: this KDF meets the primary security requirement, which is to | 
|  | produce unique derived keys that preserve the entropy of the master | 
|  | key, assuming that the master key is already a good pseudorandom key. | 
|  | However, it is nonstandard and has some problems such as being | 
|  | reversible, so it is generally considered to be a mistake!  It may be | 
|  | replaced with HKDF or another more standard KDF in the future. | 
|  |  | 
|  | Encryption modes and usage | 
|  | ========================== | 
|  |  | 
|  | fscrypt allows one encryption mode to be specified for file contents | 
|  | and one encryption mode to be specified for filenames.  Different | 
|  | directory trees are permitted to use different encryption modes. | 
|  | Currently, the following pairs of encryption modes are supported: | 
|  |  | 
|  | - AES-256-XTS for contents and AES-256-CTS-CBC for filenames | 
|  | - AES-128-CBC for contents and AES-128-CTS-CBC for filenames | 
|  |  | 
|  | It is strongly recommended to use AES-256-XTS for contents encryption. | 
|  | AES-128-CBC was added only for low-powered embedded devices with | 
|  | crypto accelerators such as CAAM or CESA that do not support XTS. | 
|  |  | 
|  | New encryption modes can be added relatively easily, without changes | 
|  | to individual filesystems.  However, authenticated encryption (AE) | 
|  | modes are not currently supported because of the difficulty of dealing | 
|  | with ciphertext expansion. | 
|  |  | 
|  | For file contents, each filesystem block is encrypted independently. | 
|  | Currently, only the case where the filesystem block size is equal to | 
|  | the system's page size (usually 4096 bytes) is supported.  With the | 
|  | XTS mode of operation (recommended), the logical block number within | 
|  | the file is used as the IV.  With the CBC mode of operation (not | 
|  | recommended), ESSIV is used; specifically, the IV for CBC is the | 
|  | logical block number encrypted with AES-256, where the AES-256 key is | 
|  | the SHA-256 hash of the inode's data encryption key. | 
|  |  | 
|  | For filenames, the full filename is encrypted at once.  Because of the | 
|  | requirements to retain support for efficient directory lookups and | 
|  | filenames of up to 255 bytes, a constant initialization vector (IV) is | 
|  | used.  However, each encrypted directory uses a unique key, which | 
|  | limits IV reuse to within a single directory.  Note that IV reuse in | 
|  | the context of CTS-CBC encryption means that when the original | 
|  | filenames share a common prefix at least as long as the cipher block | 
|  | size (16 bytes for AES), the corresponding encrypted filenames will | 
|  | also share a common prefix.  This is undesirable; it may be fixed in | 
|  | the future by switching to an encryption mode that is a strong | 
|  | pseudorandom permutation on arbitrary-length messages, e.g. the HEH | 
|  | (Hash-Encrypt-Hash) mode. | 
|  |  | 
|  | Since filenames are encrypted with the CTS-CBC mode of operation, the | 
|  | plaintext and ciphertext filenames need not be multiples of the AES | 
|  | block size, i.e. 16 bytes.  However, the minimum size that can be | 
|  | encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes | 
|  | before being encrypted.  In addition, to reduce leakage of filename | 
|  | lengths via their ciphertexts, all filenames are NUL-padded to the | 
|  | next 4, 8, 16, or 32-byte boundary (configurable).  32 is recommended | 
|  | since this provides the best confidentiality, at the cost of making | 
|  | directory entries consume slightly more space.  Note that since NUL | 
|  | (``\0``) is not otherwise a valid character in filenames, the padding | 
|  | will never produce duplicate plaintexts. | 
|  |  | 
|  | Symbolic link targets are considered a type of filename and are | 
|  | encrypted in the same way as filenames in directory entries.  Each | 
|  | symlink also uses a unique key; hence, the hardcoded IV is not a | 
|  | problem for symlinks. | 
|  |  | 
|  | User API | 
|  | ======== | 
|  |  | 
|  | Setting an encryption policy | 
|  | ---------------------------- | 
|  |  | 
|  | The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an | 
|  | empty directory or verifies that a directory or regular file already | 
|  | has the specified encryption policy.  It takes in a pointer to a | 
|  | :c:type:`struct fscrypt_policy`, defined as follows:: | 
|  |  | 
|  | #define FS_KEY_DESCRIPTOR_SIZE  8 | 
|  |  | 
|  | struct fscrypt_policy { | 
|  | __u8 version; | 
|  | __u8 contents_encryption_mode; | 
|  | __u8 filenames_encryption_mode; | 
|  | __u8 flags; | 
|  | __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE]; | 
|  | }; | 
|  |  | 
|  | This structure must be initialized as follows: | 
|  |  | 
|  | - ``version`` must be 0. | 
|  |  | 
|  | - ``contents_encryption_mode`` and ``filenames_encryption_mode`` must | 
|  | be set to constants from ``<linux/fs.h>`` which identify the | 
|  | encryption modes to use.  If unsure, use | 
|  | FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode`` | 
|  | and FS_ENCRYPTION_MODE_AES_256_CTS (4) for | 
|  | ``filenames_encryption_mode``. | 
|  |  | 
|  | - ``flags`` must be set to a value from ``<linux/fs.h>`` which | 
|  | identifies the amount of NUL-padding to use when encrypting | 
|  | filenames.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3). | 
|  |  | 
|  | - ``master_key_descriptor`` specifies how to find the master key in | 
|  | the keyring; see `Adding keys`_.  It is up to userspace to choose a | 
|  | unique ``master_key_descriptor`` for each master key.  The e4crypt | 
|  | and fscrypt tools use the first 8 bytes of | 
|  | ``SHA-512(SHA-512(master_key))``, but this particular scheme is not | 
|  | required.  Also, the master key need not be in the keyring yet when | 
|  | FS_IOC_SET_ENCRYPTION_POLICY is executed.  However, it must be added | 
|  | before any files can be created in the encrypted directory. | 
|  |  | 
|  | If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY | 
|  | verifies that the file is an empty directory.  If so, the specified | 
|  | encryption policy is assigned to the directory, turning it into an | 
|  | encrypted directory.  After that, and after providing the | 
|  | corresponding master key as described in `Adding keys`_, all regular | 
|  | files, directories (recursively), and symlinks created in the | 
|  | directory will be encrypted, inheriting the same encryption policy. | 
|  | The filenames in the directory's entries will be encrypted as well. | 
|  |  | 
|  | Alternatively, if the file is already encrypted, then | 
|  | FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption | 
|  | policy exactly matches the actual one.  If they match, then the ioctl | 
|  | returns 0.  Otherwise, it fails with EEXIST.  This works on both | 
|  | regular files and directories, including nonempty directories. | 
|  |  | 
|  | Note that the ext4 filesystem does not allow the root directory to be | 
|  | encrypted, even if it is empty.  Users who want to encrypt an entire | 
|  | filesystem with one key should consider using dm-crypt instead. | 
|  |  | 
|  | FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors: | 
|  |  | 
|  | - ``EACCES``: the file is not owned by the process's uid, nor does the | 
|  | process have the CAP_FOWNER capability in a namespace with the file | 
|  | owner's uid mapped | 
|  | - ``EEXIST``: the file is already encrypted with an encryption policy | 
|  | different from the one specified | 
|  | - ``EINVAL``: an invalid encryption policy was specified (invalid | 
|  | version, mode(s), or flags) | 
|  | - ``ENOTDIR``: the file is unencrypted and is a regular file, not a | 
|  | directory | 
|  | - ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory | 
|  | - ``ENOTTY``: this type of filesystem does not implement encryption | 
|  | - ``EOPNOTSUPP``: the kernel was not configured with encryption | 
|  | support for this filesystem, or the filesystem superblock has not | 
|  | had encryption enabled on it.  (For example, to use encryption on an | 
|  | ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the | 
|  | kernel config, and the superblock must have had the "encrypt" | 
|  | feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O | 
|  | encrypt``.) | 
|  | - ``EPERM``: this directory may not be encrypted, e.g. because it is | 
|  | the root directory of an ext4 filesystem | 
|  | - ``EROFS``: the filesystem is readonly | 
|  |  | 
|  | Getting an encryption policy | 
|  | ---------------------------- | 
|  |  | 
|  | The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct | 
|  | fscrypt_policy`, if any, for a directory or regular file.  See above | 
|  | for the struct definition.  No additional permissions are required | 
|  | beyond the ability to open the file. | 
|  |  | 
|  | FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors: | 
|  |  | 
|  | - ``EINVAL``: the file is encrypted, but it uses an unrecognized | 
|  | encryption context format | 
|  | - ``ENODATA``: the file is not encrypted | 
|  | - ``ENOTTY``: this type of filesystem does not implement encryption | 
|  | - ``EOPNOTSUPP``: the kernel was not configured with encryption | 
|  | support for this filesystem | 
|  |  | 
|  | Note: if you only need to know whether a file is encrypted or not, on | 
|  | most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl | 
|  | and check for FS_ENCRYPT_FL, or to use the statx() system call and | 
|  | check for STATX_ATTR_ENCRYPTED in stx_attributes. | 
|  |  | 
|  | Getting the per-filesystem salt | 
|  | ------------------------------- | 
|  |  | 
|  | Some filesystems, such as ext4 and F2FS, also support the deprecated | 
|  | ioctl FS_IOC_GET_ENCRYPTION_PWSALT.  This ioctl retrieves a randomly | 
|  | generated 16-byte value stored in the filesystem superblock.  This | 
|  | value is intended to used as a salt when deriving an encryption key | 
|  | from a passphrase or other low-entropy user credential. | 
|  |  | 
|  | FS_IOC_GET_ENCRYPTION_PWSALT is deprecated.  Instead, prefer to | 
|  | generate and manage any needed salt(s) in userspace. | 
|  |  | 
|  | Adding keys | 
|  | ----------- | 
|  |  | 
|  | To provide a master key, userspace must add it to an appropriate | 
|  | keyring using the add_key() system call (see: | 
|  | ``Documentation/security/keys/core.rst``).  The key type must be | 
|  | "logon"; keys of this type are kept in kernel memory and cannot be | 
|  | read back by userspace.  The key description must be "fscrypt:" | 
|  | followed by the 16-character lower case hex representation of the | 
|  | ``master_key_descriptor`` that was set in the encryption policy.  The | 
|  | key payload must conform to the following structure:: | 
|  |  | 
|  | #define FS_MAX_KEY_SIZE 64 | 
|  |  | 
|  | struct fscrypt_key { | 
|  | u32 mode; | 
|  | u8 raw[FS_MAX_KEY_SIZE]; | 
|  | u32 size; | 
|  | }; | 
|  |  | 
|  | ``mode`` is ignored; just set it to 0.  The actual key is provided in | 
|  | ``raw`` with ``size`` indicating its size in bytes.  That is, the | 
|  | bytes ``raw[0..size-1]`` (inclusive) are the actual key. | 
|  |  | 
|  | The key description prefix "fscrypt:" may alternatively be replaced | 
|  | with a filesystem-specific prefix such as "ext4:".  However, the | 
|  | filesystem-specific prefixes are deprecated and should not be used in | 
|  | new programs. | 
|  |  | 
|  | There are several different types of keyrings in which encryption keys | 
|  | may be placed, such as a session keyring, a user session keyring, or a | 
|  | user keyring.  Each key must be placed in a keyring that is "attached" | 
|  | to all processes that might need to access files encrypted with it, in | 
|  | the sense that request_key() will find the key.  Generally, if only | 
|  | processes belonging to a specific user need to access a given | 
|  | encrypted directory and no session keyring has been installed, then | 
|  | that directory's key should be placed in that user's user session | 
|  | keyring or user keyring.  Otherwise, a session keyring should be | 
|  | installed if needed, and the key should be linked into that session | 
|  | keyring, or in a keyring linked into that session keyring. | 
|  |  | 
|  | Note: introducing the complex visibility semantics of keyrings here | 
|  | was arguably a mistake --- especially given that by design, after any | 
|  | process successfully opens an encrypted file (thereby setting up the | 
|  | per-file key), possessing the keyring key is not actually required for | 
|  | any process to read/write the file until its in-memory inode is | 
|  | evicted.  In the future there probably should be a way to provide keys | 
|  | directly to the filesystem instead, which would make the intended | 
|  | semantics clearer. | 
|  |  | 
|  | Access semantics | 
|  | ================ | 
|  |  | 
|  | With the key | 
|  | ------------ | 
|  |  | 
|  | With the encryption key, encrypted regular files, directories, and | 
|  | symlinks behave very similarly to their unencrypted counterparts --- | 
|  | after all, the encryption is intended to be transparent.  However, | 
|  | astute users may notice some differences in behavior: | 
|  |  | 
|  | - Unencrypted files, or files encrypted with a different encryption | 
|  | policy (i.e. different key, modes, or flags), cannot be renamed or | 
|  | linked into an encrypted directory; see `Encryption policy | 
|  | enforcement`_.  Attempts to do so will fail with EPERM.  However, | 
|  | encrypted files can be renamed within an encrypted directory, or | 
|  | into an unencrypted directory. | 
|  |  | 
|  | - Direct I/O is not supported on encrypted files.  Attempts to use | 
|  | direct I/O on such files will fall back to buffered I/O. | 
|  |  | 
|  | - The fallocate operations FALLOC_FL_COLLAPSE_RANGE, | 
|  | FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported | 
|  | on encrypted files and will fail with EOPNOTSUPP. | 
|  |  | 
|  | - Online defragmentation of encrypted files is not supported.  The | 
|  | EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with | 
|  | EOPNOTSUPP. | 
|  |  | 
|  | - The ext4 filesystem does not support data journaling with encrypted | 
|  | regular files.  It will fall back to ordered data mode instead. | 
|  |  | 
|  | - DAX (Direct Access) is not supported on encrypted files. | 
|  |  | 
|  | - The st_size of an encrypted symlink will not necessarily give the | 
|  | length of the symlink target as required by POSIX.  It will actually | 
|  | give the length of the ciphertext, which will be slightly longer | 
|  | than the plaintext due to NUL-padding and an extra 2-byte overhead. | 
|  |  | 
|  | - The maximum length of an encrypted symlink is 2 bytes shorter than | 
|  | the maximum length of an unencrypted symlink.  For example, on an | 
|  | EXT4 filesystem with a 4K block size, unencrypted symlinks can be up | 
|  | to 4095 bytes long, while encrypted symlinks can only be up to 4093 | 
|  | bytes long (both lengths excluding the terminating null). | 
|  |  | 
|  | Note that mmap *is* supported.  This is possible because the pagecache | 
|  | for an encrypted file contains the plaintext, not the ciphertext. | 
|  |  | 
|  | Without the key | 
|  | --------------- | 
|  |  | 
|  | Some filesystem operations may be performed on encrypted regular | 
|  | files, directories, and symlinks even before their encryption key has | 
|  | been provided: | 
|  |  | 
|  | - File metadata may be read, e.g. using stat(). | 
|  |  | 
|  | - Directories may be listed, in which case the filenames will be | 
|  | listed in an encoded form derived from their ciphertext.  The | 
|  | current encoding algorithm is described in `Filename hashing and | 
|  | encoding`_.  The algorithm is subject to change, but it is | 
|  | guaranteed that the presented filenames will be no longer than | 
|  | NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and | 
|  | will uniquely identify directory entries. | 
|  |  | 
|  | The ``.`` and ``..`` directory entries are special.  They are always | 
|  | present and are not encrypted or encoded. | 
|  |  | 
|  | - Files may be deleted.  That is, nondirectory files may be deleted | 
|  | with unlink() as usual, and empty directories may be deleted with | 
|  | rmdir() as usual.  Therefore, ``rm`` and ``rm -r`` will work as | 
|  | expected. | 
|  |  | 
|  | - Symlink targets may be read and followed, but they will be presented | 
|  | in encrypted form, similar to filenames in directories.  Hence, they | 
|  | are unlikely to point to anywhere useful. | 
|  |  | 
|  | Without the key, regular files cannot be opened or truncated. | 
|  | Attempts to do so will fail with ENOKEY.  This implies that any | 
|  | regular file operations that require a file descriptor, such as | 
|  | read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden. | 
|  |  | 
|  | Also without the key, files of any type (including directories) cannot | 
|  | be created or linked into an encrypted directory, nor can a name in an | 
|  | encrypted directory be the source or target of a rename, nor can an | 
|  | O_TMPFILE temporary file be created in an encrypted directory.  All | 
|  | such operations will fail with ENOKEY. | 
|  |  | 
|  | It is not currently possible to backup and restore encrypted files | 
|  | without the encryption key.  This would require special APIs which | 
|  | have not yet been implemented. | 
|  |  | 
|  | Encryption policy enforcement | 
|  | ============================= | 
|  |  | 
|  | After an encryption policy has been set on a directory, all regular | 
|  | files, directories, and symbolic links created in that directory | 
|  | (recursively) will inherit that encryption policy.  Special files --- | 
|  | that is, named pipes, device nodes, and UNIX domain sockets --- will | 
|  | not be encrypted. | 
|  |  | 
|  | Except for those special files, it is forbidden to have unencrypted | 
|  | files, or files encrypted with a different encryption policy, in an | 
|  | encrypted directory tree.  Attempts to link or rename such a file into | 
|  | an encrypted directory will fail with EPERM.  This is also enforced | 
|  | during ->lookup() to provide limited protection against offline | 
|  | attacks that try to disable or downgrade encryption in known locations | 
|  | where applications may later write sensitive data.  It is recommended | 
|  | that systems implementing a form of "verified boot" take advantage of | 
|  | this by validating all top-level encryption policies prior to access. | 
|  |  | 
|  | Implementation details | 
|  | ====================== | 
|  |  | 
|  | Encryption context | 
|  | ------------------ | 
|  |  | 
|  | An encryption policy is represented on-disk by a :c:type:`struct | 
|  | fscrypt_context`.  It is up to individual filesystems to decide where | 
|  | to store it, but normally it would be stored in a hidden extended | 
|  | attribute.  It should *not* be exposed by the xattr-related system | 
|  | calls such as getxattr() and setxattr() because of the special | 
|  | semantics of the encryption xattr.  (In particular, there would be | 
|  | much confusion if an encryption policy were to be added to or removed | 
|  | from anything other than an empty directory.)  The struct is defined | 
|  | as follows:: | 
|  |  | 
|  | #define FS_KEY_DESCRIPTOR_SIZE  8 | 
|  | #define FS_KEY_DERIVATION_NONCE_SIZE 16 | 
|  |  | 
|  | struct fscrypt_context { | 
|  | u8 format; | 
|  | u8 contents_encryption_mode; | 
|  | u8 filenames_encryption_mode; | 
|  | u8 flags; | 
|  | u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE]; | 
|  | u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE]; | 
|  | }; | 
|  |  | 
|  | Note that :c:type:`struct fscrypt_context` contains the same | 
|  | information as :c:type:`struct fscrypt_policy` (see `Setting an | 
|  | encryption policy`_), except that :c:type:`struct fscrypt_context` | 
|  | also contains a nonce.  The nonce is randomly generated by the kernel | 
|  | and is used to derive the inode's encryption key as described in | 
|  | `Per-file keys`_. | 
|  |  | 
|  | Data path changes | 
|  | ----------------- | 
|  |  | 
|  | For the read path (->readpage()) of regular files, filesystems can | 
|  | read the ciphertext into the page cache and decrypt it in-place.  The | 
|  | page lock must be held until decryption has finished, to prevent the | 
|  | page from becoming visible to userspace prematurely. | 
|  |  | 
|  | For the write path (->writepage()) of regular files, filesystems | 
|  | cannot encrypt data in-place in the page cache, since the cached | 
|  | plaintext must be preserved.  Instead, filesystems must encrypt into a | 
|  | temporary buffer or "bounce page", then write out the temporary | 
|  | buffer.  Some filesystems, such as UBIFS, already use temporary | 
|  | buffers regardless of encryption.  Other filesystems, such as ext4 and | 
|  | F2FS, have to allocate bounce pages specially for encryption. | 
|  |  | 
|  | Filename hashing and encoding | 
|  | ----------------------------- | 
|  |  | 
|  | Modern filesystems accelerate directory lookups by using indexed | 
|  | directories.  An indexed directory is organized as a tree keyed by | 
|  | filename hashes.  When a ->lookup() is requested, the filesystem | 
|  | normally hashes the filename being looked up so that it can quickly | 
|  | find the corresponding directory entry, if any. | 
|  |  | 
|  | With encryption, lookups must be supported and efficient both with and | 
|  | without the encryption key.  Clearly, it would not work to hash the | 
|  | plaintext filenames, since the plaintext filenames are unavailable | 
|  | without the key.  (Hashing the plaintext filenames would also make it | 
|  | impossible for the filesystem's fsck tool to optimize encrypted | 
|  | directories.)  Instead, filesystems hash the ciphertext filenames, | 
|  | i.e. the bytes actually stored on-disk in the directory entries.  When | 
|  | asked to do a ->lookup() with the key, the filesystem just encrypts | 
|  | the user-supplied name to get the ciphertext. | 
|  |  | 
|  | Lookups without the key are more complicated.  The raw ciphertext may | 
|  | contain the ``\0`` and ``/`` characters, which are illegal in | 
|  | filenames.  Therefore, readdir() must base64-encode the ciphertext for | 
|  | presentation.  For most filenames, this works fine; on ->lookup(), the | 
|  | filesystem just base64-decodes the user-supplied name to get back to | 
|  | the raw ciphertext. | 
|  |  | 
|  | However, for very long filenames, base64 encoding would cause the | 
|  | filename length to exceed NAME_MAX.  To prevent this, readdir() | 
|  | actually presents long filenames in an abbreviated form which encodes | 
|  | a strong "hash" of the ciphertext filename, along with the optional | 
|  | filesystem-specific hash(es) needed for directory lookups.  This | 
|  | allows the filesystem to still, with a high degree of confidence, map | 
|  | the filename given in ->lookup() back to a particular directory entry | 
|  | that was previously listed by readdir().  See :c:type:`struct | 
|  | fscrypt_digested_name` in the source for more details. | 
|  |  | 
|  | Note that the precise way that filenames are presented to userspace | 
|  | without the key is subject to change in the future.  It is only meant | 
|  | as a way to temporarily present valid filenames so that commands like | 
|  | ``rm -r`` work as expected on encrypted directories. |