devpts: Fix NULL pointer dereference in dcache_readdir()

Closing /dev/pts/ptmx removes the corresponding pty under /dev/pts/
without synchronizing against concurrent path walkers. This can lead to
'dcache_readdir()' tripping over a 'struct dentry' with a NULL 'd_inode'
field:

  | BUG: kernel NULL pointer dereference, address: 0000000000000000
  | #PF: supervisor read access in kernel mode
  | #PF: error_code(0x0000) - not-present page
  | PGD 0 P4D 0
  | SMP PTI
  | CPU: 9 PID: 179 Comm: ptmx Not tainted 5.4.0-rc1+ #5
  | RIP: 0010:dcache_readdir+0xe1/0x150
  | Code: 48 83 f8 01 74 a2 48 83 f8 02 74 eb 4d 8d a7 90 00 00 00 45 31 f6 eb 42 48 8b 43 30 48 8b 4d 08 48 89 ef 8b 53 24 48 8b 73 28 <44> 0f b7 08 4c 8b 55 00 4c 8b 40 40 66 41 c1 e9 0c 41 83 e1 0f e8
  | RSP: 0018:ffffa7df8044be58 EFLAGS: 00010286
  | RAX: 0000000000000000 RBX: ffff9511c78f3ec0 RCX: 0000000000000002
  | RDX: 0000000000000001 RSI: ffff9511c78f3ef8 RDI: ffffa7df8044bed0
  | RBP: ffffa7df8044bed0 R08: 0000000000000000 R09: 00000000004bc478
  | R10: ffff9511c877c6a8 R11: ffff9511c8dde600 R12: ffff9511c878c460
  | R13: ffff9511c878c3c0 R14: 0000000000000000 R15: ffff9511c9442cc0
  | FS:  00007fc5ea2e1700(0000) GS:ffff9511ca280000(0000) knlGS:0000000000000000
  | CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  | CR2: 0000000000000000 CR3: 0000000047d68002 CR4: 0000000000760ea0
  | DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  | DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  | PKRU: 55555554
  | Call Trace:
  |  iterate_dir+0x137/0x190
  |  ksys_getdents64+0x97/0x130
  |  ? iterate_dir+0x190/0x190
  |  __x64_sys_getdents64+0x11/0x20
  |  do_syscall_64+0x43/0x110
  |  entry_SYSCALL_64_after_hwframe+0x44/0xa9

In this case, one CPU is deleting the dentry and clearing the inode
pointer via:

	 devpts_pty_kill()
		-> dput()
			-> dentry_kill()
				-> __dentry_kill()
					-> dentry_unlink_inode()

whilst the other is traversing the directory an obtaining a reference
to the dentry being deleted via:

	sys_getdents64()
		-> iterate_dir()
			-> dcache_readdir()
				-> next_positive()

Prevent the race by acquiring the inode lock of the parent in
'devpts_pty_kill()' so that path walkers are held off until the dentry
has been completely torn down.

Will's fix didn't link to the commit it fixes so I tracked it down.
devpts_pty_kill() used to take inode_lock() before removing a pts
device. The inode_lock() got removed in
8ead9dd54716 ("devpts: more pty driver interface cleanups"). The
reasoning behind the removal seemed to be that the inode_lock() was only
needed because d_find_alias(inode) had to be called before that commit
to find the dentry that was supposed to be removed. Linus then changed
the pty driver to stash away the dentry and subsequently got rid of the
inode_lock(). However, it seems that the inode_lock() is needed to
protect against the race outlined above. So add it back.

Note that this bug had been brought up before in November 2018 before
(cf. [1]). But a fix never got merged because a proper commit wasn't
sent. The issue came back up when Will and I talked about it at Kernel
Recipes in Paris. So here is a fix which prevents the issue. I very much
vote we get this merged asap, since as an unprivileged user I can do:

unshare -U --map-root --mount
mount -t devpts devpts
./<run_reproducer_below

/* Reproducer */
Note that the reproducer will take a little while. Will reported usually
around 10s. For me it took a few minutes.

 #ifndef _GNU_SOURCE
 #define _GNU_SOURCE 1
 #endif
 #include <dirent.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <pthread.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/stat.h>
 #include <sys/types.h>
 #include <unistd.h>

 static void *readdir_thread(void *arg)
 {
 	DIR *d = arg;
 	struct dirent *dent;

 	for (;;) {
 		errno = 0;
 		dent = readdir(d);
 		if (!dent) {
 			if (errno)
 				perror("readdir");
 			break;
 		}
 		rewinddir(d);
 	}

 	return NULL;
 }

 int main(void)
 {
 	DIR *d;
 	pthread_t t;
 	int ret = 0;

 	d = opendir("/dev/pts");
 	if (!d) {
 		ret = errno;
 		perror("opendir");
 		exit(EXIT_FAILURE);
 	}

 	ret = pthread_create(&t, NULL, readdir_thread, d);
 	if (ret) {
 		errno = ret;
 		perror("pthread_create");
 		exit(EXIT_FAILURE);
 	}

 	for (;;) {
 		int dfd;
 		int fd;

 		dfd = dirfd(d);
 		if (dfd < 0) {
 			perror("dirfd");
 			break;
 		}

 		fd = openat(dirfd(d), "ptmx", O_RDONLY | O_NONBLOCK | O_CLOEXEC);
 		if (fd < 0) {
 			perror("openat");
 			break;
 		}
 		close(fd);
 	}

 	pthread_join(t, NULL);
 	closedir(d);
 	exit(EXIT_SUCCESS);
 }

/* References */
[1]: https://lore.kernel.org/r/20181109143744.GA12128@hc

Fixes: 8ead9dd54716 ("devpts: more pty driver interface cleanups")
Cc: <stable@vger.kernel.org>
Cc: Jan Glauber <jglauber@marvell.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
[christian.brauner@ubuntu.com: dig into history and add context and reproducer to commit message]
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
1 file changed