Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  JFS: commit_mutex cleanups
diff --git a/CREDITS b/CREDITS
index 9bf714a..29be6d1 100644
--- a/CREDITS
+++ b/CREDITS
@@ -24,6 +24,11 @@
 S: Iasi 6600
 S: Romania
 
+N: Mark Adler
+E: madler@alumni.caltech.edu
+W: http://alumnus.caltech.edu/~madler/
+D: zlib decompression
+
 N: Monalisa Agrawal
 E: magrawal@nortelnetworks.com
 D: Basic Interphase 5575 driver with UBR and ABR support.
@@ -523,11 +528,11 @@
 S: United Kingdom
 
 N: Luiz Fernando N. Capitulino
-E: lcapitulino@terra.com.br
-E: lcapitulino@prefeitura.sp.gov.br
-W: http://www.telecentros.sp.gov.br
-D: Little fixes and a lot of janitorial work
-S: E-GOV Telecentros SP
+E: lcapitulino@mandriva.com.br
+E: lcapitulino@gmail.com
+W: http://www.cpu.eti.br
+D: misc kernel hacking
+S: Mandriva
 S: Brazil
 
 N: Remy Card
@@ -1573,12 +1578,8 @@
 S: Czech Republic
 
 N: Niels Kristian Bech Jensen
-E: nkbj@image.dk
-W: http://www.image.dk/~nkbj
+E: nkbj1970@hotmail.com
 D: Miscellaneous kernel updates and fixes.
-S: Dr. Holsts Vej 34, lejl. 164
-S: DK-8230 Åbyhøj
-S: Denmark
 
 N: Michael K. Johnson
 E: johnsonm@redhat.com
@@ -3400,10 +3401,10 @@
 
 N: Thibaut Varene
 E: T-Bone@parisc-linux.org
-W: http://www.parisc-linux.org/
+W: http://www.parisc-linux.org/~varenet/
 P: 1024D/B7D2F063 E67C 0D43 A75E 12A5 BB1C  FA2F 1E32 C3DA B7D2 F063
 D: PA-RISC port minion, PDC and GSCPS2 drivers, debuglocks and other bits
-D: Some bits in an ARM port, S1D13XXX FB driver, random patches here and there
+D: Some ARM at91rm9200 bits, S1D13XXX FB driver, random patches here and there
 D: AD1889 sound driver
 S: Paris, France
 
diff --git a/Documentation/ABI/README b/Documentation/ABI/README
new file mode 100644
index 0000000..9feaf16
--- /dev/null
+++ b/Documentation/ABI/README
@@ -0,0 +1,77 @@
+This directory attempts to document the ABI between the Linux kernel and
+userspace, and the relative stability of these interfaces.  Due to the
+ever-changing nature of Linux, and the differing maturity levels, these
+interfaces should be used by userspace programs in different ways.
+
+We have four different levels of ABI stability, as shown by the four
+different subdirectories in this location.  Interfaces may change levels
+of stability according to the rules described below.
+
+The different levels of stability are:
+
+  stable/
+	This directory documents the interfaces that the developer has
+	defined to be stable.  Userspace programs are free to use these
+	interfaces with no restrictions, and backward compatibility for
+	them will be guaranteed for at least 2 years.  Most interfaces
+	(like syscalls) are expected to never change and always be
+	available.
+
+  testing/
+	This directory documents interfaces that are felt to be stable,
+	as the main development of this interface has been completed.
+	The interface can be changed to add new features, but the
+	current interface will not break by doing this, unless grave
+	errors or security problems are found in it.  Userspace
+	programs can start to rely on these interfaces, but they must be
+	aware of changes that can occur before these interfaces move to
+	be marked stable.  Programs that use these interfaces are
+	strongly encouraged to add their name to the description of
+	these interfaces, so that the kernel developers can easily
+	notify them if any changes occur (see the description of the
+	layout of the files below for details on how to do this.)
+
+  obsolete/
+	This directory documents interfaces that are still remaining in
+	the kernel, but are marked to be removed at some later point in
+	time.  The description of the interface will document the reason
+	why it is obsolete and when it can be expected to be removed.
+	The file Documentation/feature-removal-schedule.txt may describe
+	some of these interfaces, giving a schedule for when they will
+	be removed.
+
+  removed/
+	This directory contains a list of the old interfaces that have
+	been removed from the kernel.
+
+Every file in these directories will contain the following information:
+
+What:		Short description of the interface
+Date:		Date created
+KernelVersion:	Kernel version this feature first showed up in.
+Contact:	Primary contact for this interface (may be a mailing list)
+Description:	Long description of the interface and how to use it.
+Users:		All users of this interface who wish to be notified when
+		it changes.  This is very important for interfaces in
+		the "testing" stage, so that kernel developers can work
+		with userspace developers to ensure that things do not
+		break in ways that are unacceptable.  It is also
+		important to get feedback for these interfaces to make
+		sure they are working in a proper way and do not need to
+		be changed further.
+
+
+How things move between levels:
+
+Interfaces in stable may move to obsolete, as long as the proper
+notification is given.
+
+Interfaces may be removed from obsolete and the kernel as long as the
+documented amount of time has gone by.
+
+Interfaces in the testing state can move to the stable state when the
+developers feel they are finished.  They cannot be removed from the
+kernel tree without going through the obsolete state first.
+
+It's up to the developer to place their interfaces in the category they
+wish them to start out in.
diff --git a/Documentation/ABI/obsolete/devfs b/Documentation/ABI/obsolete/devfs
new file mode 100644
index 0000000..b8b8739
--- /dev/null
+++ b/Documentation/ABI/obsolete/devfs
@@ -0,0 +1,13 @@
+What:		devfs
+Date:		July 2005
+Contact:	Greg Kroah-Hartman <gregkh@suse.de>
+Description:
+	devfs has been unmaintained for a number of years, has unfixable
+	races, contains a naming policy within the kernel that is
+	against the LSB, and can be replaced by using udev.
+	The files fs/devfs/*, include/linux/devfs_fs*.h will be removed,
+	along with the assorted devfs function calls throughout the
+	kernel tree.
+
+Users:
+
diff --git a/Documentation/ABI/stable/syscalls b/Documentation/ABI/stable/syscalls
new file mode 100644
index 0000000..c3ae3e7
--- /dev/null
+++ b/Documentation/ABI/stable/syscalls
@@ -0,0 +1,10 @@
+What:		The kernel syscall interface
+Description:
+	This interface matches much of the POSIX interface and is based
+	on it and other Unix based interfaces.  It will only be added to
+	over time, and not have things removed from it.
+
+	Note that this interface is different for every architecture
+	that Linux supports.  Please see the architecture-specific
+	documentation for details on the syscall numbers that are to be
+	mapped to each syscall.
diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module
new file mode 100644
index 0000000..75be431
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-module
@@ -0,0 +1,30 @@
+What:		/sys/module
+Description:
+	The /sys/module tree consists of the following structure:
+
+	/sys/module/MODULENAME
+		The name of the module that is in the kernel.  This
+		module name will show up either if the module is built
+		directly into the kernel, or if it is loaded as a
+		dynamic module.
+
+	/sys/module/MODULENAME/parameters
+		This directory contains individual files that are each
+		individual parameters of the module that are able to be
+		changed at runtime.  See the individual module
+		documentation as to the contents of these parameters and
+		what they accomplish.
+
+		Note: The individual parameter names and values are not
+		considered stable, only the fact that they will be
+		placed in this location within sysfs.  See the
+		individual driver documentation for details as to the
+		stability of the different parameters.
+
+	/sys/module/MODULENAME/refcnt
+		If the module is able to be unloaded from the kernel, this file
+		will contain the current reference count of the module.
+
+		Note: If the module is built into the kernel, or if the
+		CONFIG_MODULE_UNLOAD kernel configuration value is not enabled,
+		this file will not be present.
diff --git a/Documentation/ABI/testing/sysfs-class b/Documentation/ABI/testing/sysfs-class
new file mode 100644
index 0000000..4b0cb89
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class
@@ -0,0 +1,16 @@
+What:		/sys/class/
+Date:		February 2006
+Contact:	Greg Kroah-Hartman <gregkh@suse.de>
+Description:
+		The /sys/class directory will consist of a group of
+		subdirectories describing individual classes of devices
+		in the kernel.  The individual directories will consist
+		of either subdirectories, or symlinks to other
+		directories.
+
+		All programs that use this directory tree must be able
+		to handle both subdirectories and symlinks in order to
+		work properly.
+
+Users:
+	udev <linux-hotplug-devel@lists.sourceforge.net>
diff --git a/Documentation/ABI/testing/sysfs-devices b/Documentation/ABI/testing/sysfs-devices
new file mode 100644
index 0000000..6a25671
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices
@@ -0,0 +1,25 @@
+What:		/sys/devices
+Date:		February 2006
+Contact:	Greg Kroah-Hartman <gregkh@suse.de>
+Description:
+		The /sys/devices tree contains a snapshot of the
+		internal state of the kernel device tree.  Devices will
+		be added and removed dynamically as the machine runs,
+		and between different kernel versions, the layout of the
+		devices within this tree will change.
+
+		Please do not rely on the format of this tree because of
+		this.  If a program wishes to find different things in
+		the tree, please use the /sys/class structure and rely
+		on the symlinks there to point to the proper location
+		within the /sys/devices tree of the individual devices.
+		Or rely on the uevent messages to notify programs of
+		devices being added and removed from this tree to find
+		the location of those devices.
+
+		Note that sometimes not all devices along the directory
+		chain will have emitted uevent messages, so userspace
+		programs must be able to handle such occurrences.
+
+Users:
+	udev <linux-hotplug-devel@lists.sourceforge.net>
diff --git a/Documentation/Changes b/Documentation/Changes
index b02f476..48827207 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -181,8 +181,8 @@
 --------------------
 
 A driver has been added to allow updating of Intel IA32 microcode,
-accessible as both a devfs regular file and as a normal (misc)
-character device.  If you are not using devfs you may need to:
+accessible as a normal (misc) character device.  If you are not using
+udev you may need to:
 
 mkdir /dev/cpu
 mknod /dev/cpu/microcode c 10 184
@@ -201,7 +201,9 @@
 udev
 ----
 udev is a userspace application for populating /dev dynamically with
-only entries for devices actually present. udev replaces devfs.
+only entries for devices actually present.  udev replaces the basic
+functionality of devfs, while allowing persistent device naming for
+devices.
 
 FUSE
 ----
@@ -231,18 +233,13 @@
 enable it to operate over diverse media layers.  If you use PPP,
 upgrade pppd to at least 2.4.0.
 
-If you are not using devfs, you must have the device file /dev/ppp
+If you are not using udev, you must have the device file /dev/ppp
 which can be made by:
 
 mknod /dev/ppp c 108 0
 
 as root.
 
-If you use devfsd and build ppp support as modules, you will need
-the following in your /etc/devfsd.conf file:
-
-LOOKUP	PPP	MODLOAD
-
 Isdn4k-utils
 ------------
 
diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle
index ce5d2c0..6d2412e 100644
--- a/Documentation/CodingStyle
+++ b/Documentation/CodingStyle
@@ -155,7 +155,83 @@
 See next chapter.
 
 
-		Chapter 5: Functions
+		Chapter 5: Typedefs
+
+Please don't use things like "vps_t".
+
+It's a _mistake_ to use typedef for structures and pointers. When you see a
+
+	vps_t a;
+
+in the source, what does it mean?
+
+In contrast, if it says
+
+	struct virtual_container *a;
+
+you can actually tell what "a" is.
+
+Lots of people think that typedefs "help readability". Not so. They are
+useful only for:
+
+ (a) totally opaque objects (where the typedef is actively used to _hide_
+     what the object is).
+
+     Example: "pte_t" etc. opaque objects that you can only access using
+     the proper accessor functions.
+
+     NOTE! Opaqueness and "accessor functions" are not good in themselves.
+     The reason we have them for things like pte_t etc. is that there
+     really is absolutely _zero_ portably accessible information there.
+
+ (b) Clear integer types, where the abstraction _helps_ avoid confusion
+     whether it is "int" or "long".
+
+     u8/u16/u32 are perfectly fine typedefs, although they fit into
+     category (d) better than here.
+
+     NOTE! Again - there needs to be a _reason_ for this. If something is
+     "unsigned long", then there's no reason to do
+
+	typedef unsigned long myflags_t;
+
+     but if there is a clear reason for why it under certain circumstances
+     might be an "unsigned int" and under other configurations might be
+     "unsigned long", then by all means go ahead and use a typedef.
+
+ (c) when you use sparse to literally create a _new_ type for
+     type-checking.
+
+ (d) New types which are identical to standard C99 types, in certain
+     exceptional circumstances.
+
+     Although it would only take a short amount of time for the eyes and
+     brain to become accustomed to the standard types like 'uint32_t',
+     some people object to their use anyway.
+
+     Therefore, the Linux-specific 'u8/u16/u32/u64' types and their
+     signed equivalents which are identical to standard types are
+     permitted -- although they are not mandatory in new code of your
+     own.
+
+     When editing existing code which already uses one or the other set
+     of types, you should conform to the existing choices in that code.
+
+ (e) Types safe for use in userspace.
+
+     In certain structures which are visible to userspace, we cannot
+     require C99 types and cannot use the 'u32' form above. Thus, we
+     use __u32 and similar types in all structures which are shared
+     with userspace.
+
+Maybe there are other cases too, but the rule should basically be to NEVER
+EVER use a typedef unless you can clearly match one of those rules.
+
+In general, a pointer, or a struct that has elements that can reasonably
+be directly accessed should _never_ be a typedef.
+
+
+		Chapter 6: Functions
 
 Functions should be short and sweet, and do just one thing.  They should
 fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
@@ -183,7 +259,7 @@
 to understand what you did 2 weeks from now.
 
 
-		Chapter 6: Centralized exiting of functions
+		Chapter 7: Centralized exiting of functions
 
 Albeit deprecated by some people, the equivalent of the goto statement is
 used frequently by compilers in form of the unconditional jump instruction.
@@ -220,7 +296,7 @@
 	return result;
 }
 
-		Chapter 7: Commenting
+		Chapter 8: Commenting
 
 Comments are good, but there is also a danger of over-commenting.  NEVER
 try to explain HOW your code works in a comment: it's much better to
@@ -240,7 +316,7 @@
 See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
 for details.
 
-		Chapter 8: You've made a mess of it
+		Chapter 9: You've made a mess of it
 
 That's OK, we all do.  You've probably been told by your long-time Unix
 user helper that "GNU emacs" automatically formats the C sources for
@@ -288,7 +364,7 @@
 remember: "indent" is not a fix for bad programming.
 
 
-		Chapter 9: Configuration-files
+		Chapter 10: Configuration-files
 
 For configuration options (arch/xxx/Kconfig, and all the Kconfig files),
 somewhat different indentation is used.
@@ -313,7 +389,7 @@
 experimental options should be denoted (EXPERIMENTAL).
 
 
-		Chapter 10: Data structures
+		Chapter 11: Data structures
 
 Data structures that have visibility outside the single-threaded
 environment they are created and destroyed in should always have
@@ -344,7 +420,7 @@
 have a reference count on it, you almost certainly have a bug.
 
 
-		Chapter 11: Macros, Enums and RTL
+		Chapter 12: Macros, Enums and RTL
 
 Names of macros defining constants and labels in enums are capitalized.
 
@@ -399,7 +475,7 @@
 covers RTL which is used frequently with assembly language in the kernel.
 
 
-		Chapter 12: Printing kernel messages
+		Chapter 13: Printing kernel messages
 
 Kernel developers like to be seen as literate. Do mind the spelling
 of kernel messages to make a good impression. Do not use crippled
@@ -410,7 +486,7 @@
 Printing numbers in parentheses (%d) adds no value and should be avoided.
 
 
-		Chapter 13: Allocating memory
+		Chapter 14: Allocating memory
 
 The kernel provides the following general purpose memory allocators:
 kmalloc(), kzalloc(), kcalloc(), and vmalloc().  Please refer to the API
@@ -429,7 +505,7 @@
 language.
 
 
-		Chapter 14: The inline disease
+		Chapter 15: The inline disease
 
 There appears to be a common misperception that gcc has a magic "make me
 faster" speedup option called "inline". While the use of inlines can be
@@ -457,7 +533,7 @@
 
 
 
-		Chapter 15: References
+		Appendix I: References
 
 The C Programming Language, Second Edition
 by Brian W. Kernighan and Dennis M. Ritchie.
@@ -481,4 +557,4 @@
 http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/
 
 --
-Last updated on 30 December 2005 by a community effort on LKML.
+Last updated on 30 April 2006.
diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt
index 7c717699..63392c9 100644
--- a/Documentation/DMA-mapping.txt
+++ b/Documentation/DMA-mapping.txt
@@ -698,12 +698,12 @@
 always going to be SAC addressable.
 
 The first thing your driver needs to do is query the PCI platform
-layer with your devices DAC addressing capabilities:
+layer if it is capable of handling your device's DAC addressing
+capabilities:
 
-	int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 mask);
+	int pci_dac_dma_supported(struct pci_dev *hwdev, u64 mask);
 
-This routine behaves identically to pci_set_dma_mask.  You may not
-use the following interfaces if this routine fails.
+You may not use the following interfaces if this routine fails.
 
 Next, DMA addresses using this API are kept track of using the
 dma64_addr_t type.  It is guaranteed to be big enough to hold any
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 5a2882d..66e1cf7 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -10,7 +10,8 @@
 	    kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
 	    procfs-guide.xml writing_usb_driver.xml \
 	    kernel-api.xml journal-api.xml lsm.xml usb.xml \
-	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml
+	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
+	    genericirq.xml
 
 ###
 # The build process is as follows (targets):
diff --git a/Documentation/DocBook/genericirq.tmpl b/Documentation/DocBook/genericirq.tmpl
new file mode 100644
index 0000000..0f4a4b6
--- /dev/null
+++ b/Documentation/DocBook/genericirq.tmpl
@@ -0,0 +1,474 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="Generic-IRQ-Guide">
+ <bookinfo>
+  <title>Linux generic IRQ handling</title>
+
+  <authorgroup>
+   <author>
+    <firstname>Thomas</firstname>
+    <surname>Gleixner</surname>
+    <affiliation>
+     <address>
+      <email>tglx@linutronix.de</email>
+     </address>
+    </affiliation>
+   </author>
+   <author>
+    <firstname>Ingo</firstname>
+    <surname>Molnar</surname>
+    <affiliation>
+     <address>
+      <email>mingo@elte.hu</email>
+     </address>
+    </affiliation>
+   </author>
+  </authorgroup>
+
+  <copyright>
+   <year>2005-2006</year>
+   <holder>Thomas Gleixner</holder>
+  </copyright>
+  <copyright>
+   <year>2005-2006</year>
+   <holder>Ingo Molnar</holder>
+  </copyright>
+
+  <legalnotice>
+   <para>
+     This documentation is free software; you can redistribute
+     it and/or modify it under the terms of the GNU General Public
+     License version 2 as published by the Free Software Foundation.
+   </para>
+
+   <para>
+     This program is distributed in the hope that it will be
+     useful, but WITHOUT ANY WARRANTY; without even the implied
+     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+     See the GNU General Public License for more details.
+   </para>
+
+   <para>
+     You should have received a copy of the GNU General Public
+     License along with this program; if not, write to the Free
+     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+     MA 02111-1307 USA
+   </para>
+
+   <para>
+     For more details see the file COPYING in the source
+     distribution of Linux.
+   </para>
+  </legalnotice>
+ </bookinfo>
+
+<toc></toc>
+
+  <chapter id="intro">
+    <title>Introduction</title>
+    <para>
+	The generic interrupt handling layer is designed to provide a
+	complete abstraction of interrupt handling for device drivers.
+	It is able to handle all the different types of interrupt controller
+	hardware. Device drivers use generic API functions to request, enable,
+	disable and free interrupts. The drivers do not have to know anything
+	about interrupt hardware details, so they can be used on different
+	platforms without code changes.
+    </para>
+    <para>
+  	This documentation is provided to developers who want to implement
+	an interrupt subsystem for their architecture, with the help
+	of the generic IRQ handling layer.
+    </para>
+  </chapter>
+
+  <chapter id="rationale">
+    <title>Rationale</title>
+	<para>
+	The original implementation of interrupt handling in Linux uses
+	the __do_IRQ() super-handler, which is able to deal with every
+	type of interrupt logic.
+	</para>
+	<para>
+	Originally, Russell King identified different types of handlers to
+	build a quite universal set for the ARM interrupt handler
+	implementation in Linux 2.5/2.6. He distinguished between:
+	<itemizedlist>
+	  <listitem><para>Level type</para></listitem>
+	  <listitem><para>Edge type</para></listitem>
+	  <listitem><para>Simple type</para></listitem>
+	</itemizedlist>
+	In the SMP world of the __do_IRQ() super-handler another type
+	was identified:
+	<itemizedlist>
+	  <listitem><para>Per CPU type</para></listitem>
+	</itemizedlist>
+	</para>
+	<para>
+	This split implementation of highlevel IRQ handlers allows us to
+	optimize the flow of the interrupt handling for each specific
+	interrupt type. This reduces complexity in that particular codepath
+	and allows the optimized handling of a given type.
+	</para>
+	<para>
+	The original general IRQ implementation used hw_interrupt_type
+	structures and their ->ack(), ->end() [etc.] callbacks to
+	differentiate the flow control in the super-handler. This leads to
+	a mix of flow logic and lowlevel hardware logic, and it also leads
+	to unnecessary code duplication: for example in i386, there is an
+	ioapic_level_irq and an ioapic_edge_irq irq-type which share many
+	of the lowlevel details but have different flow handling.
+	</para>
+	<para>
+	A more natural abstraction is the clean separation of the
+	'irq flow' and the 'chip details'.
+	</para>
+	<para>
+	Analysing a couple of architectures' IRQ subsystem implementations
+	reveals that most of them can use a generic set of 'irq flow'
+	methods and only need to add the chip level specific code.
+	The separation is also valuable for (sub)architectures
+	which need specific quirks in the irq flow itself but not in the
+	chip-details - and thus provides a more transparent IRQ subsystem
+	design.
+	</para>
+	<para>
+	Each interrupt descriptor is assigned its own highlevel flow
+	handler, which is normally one of the generic
+	implementations. (This highlevel flow handler implementation also
+	makes it simple to provide demultiplexing handlers which can be
+	found in embedded platforms on various architectures.)
+	</para>
+	<para>
+	The separation makes the generic interrupt handling layer more
+	flexible and extensible. For example, a (sub)architecture can
+	use a generic irq-flow implementation for 'level type' interrupts
+	and add a (sub)architecture specific 'edge type' implementation.
+	</para>
+	<para>
+	To make the transition to the new model easier and prevent the
+	breakage of existing implementations, the __do_IRQ() super-handler
+	is still available. This leads to a kind of duality for the time
+	being. Over time the new model should be used in more and more
+	architectures, as it enables smaller and cleaner IRQ subsystems.
+	</para>
+  </chapter>
+  <chapter id="bugs">
+    <title>Known Bugs And Assumptions</title>
+    <para>
+	None (knock on wood).
+    </para>
+  </chapter>
+
+  <chapter id="Abstraction">
+    <title>Abstraction layers</title>
+    <para>
+	There are three main levels of abstraction in the interrupt code:
+	<orderedlist>
+	  <listitem><para>Highlevel driver API</para></listitem>
+	  <listitem><para>Highlevel IRQ flow handlers</para></listitem>
+	  <listitem><para>Chiplevel hardware encapsulation</para></listitem>
+	</orderedlist>
+    </para>
+    <sect1>
+	<title>Interrupt control flow</title>
+	<para>
+	Each interrupt is described by an interrupt descriptor structure
+	irq_desc. The interrupt is referenced by an 'unsigned int' numeric
+	value which selects the corresponding interrupt description structure
+	in the descriptor structures array.
+	The descriptor structure contains status information and pointers
+	to the interrupt flow method and the interrupt chip structure
+	which are assigned to this interrupt.
+	</para>
+	<para>
+	Whenever an interrupt triggers, the lowlevel arch code calls into
+	the generic interrupt code by calling desc->handle_irq().
+	This highlevel IRQ handling function only uses desc->chip primitives
+	referenced by the assigned chip descriptor structure.
+	</para>
+    </sect1>
+    <sect1>
+	<title>Highlevel Driver API</title>
+	<para>
+	  The highlevel Driver API consists of the following functions:
+	  <itemizedlist>
+	  <listitem><para>request_irq()</para></listitem>
+	  <listitem><para>free_irq()</para></listitem>
+	  <listitem><para>disable_irq()</para></listitem>
+	  <listitem><para>enable_irq()</para></listitem>
+	  <listitem><para>disable_irq_nosync() (SMP only)</para></listitem>
+	  <listitem><para>synchronize_irq() (SMP only)</para></listitem>
+	  <listitem><para>set_irq_type()</para></listitem>
+	  <listitem><para>set_irq_wake()</para></listitem>
+	  <listitem><para>set_irq_data()</para></listitem>
+	  <listitem><para>set_irq_chip()</para></listitem>
+	  <listitem><para>set_irq_chip_data()</para></listitem>
+          </itemizedlist>
+	  See the autogenerated function documentation for details.
+	</para>
+    </sect1>
+    <sect1>
+	<title>Highlevel IRQ flow handlers</title>
+	<para>
+	  The generic layer provides a set of pre-defined irq-flow methods:
+	  <itemizedlist>
+	  <listitem><para>handle_level_irq</para></listitem>
+	  <listitem><para>handle_edge_irq</para></listitem>
+	  <listitem><para>handle_simple_irq</para></listitem>
+	  <listitem><para>handle_percpu_irq</para></listitem>
+	  </itemizedlist>
+	  The interrupt flow handlers (either predefined or architecture
+	  specific) are assigned to specific interrupts by the architecture
+	  either during bootup or during device initialization.
+	</para>
+	<sect2>
+	<title>Default flow implementations</title>
+	    <sect3>
+	 	<title>Helper functions</title>
+		<para>
+		The helper functions call the chip primitives and
+		are used by the default flow implementations.
+		The following helper functions are implemented (simplified excerpt):
+		<programlisting>
+default_enable(irq)
+{
+	desc->chip->unmask(irq);
+}
+
+default_disable(irq)
+{
+	if (!delay_disable(irq))
+		desc->chip->mask(irq);
+}
+
+default_ack(irq)
+{
+	chip->ack(irq);
+}
+
+default_mask_ack(irq)
+{
+	if (chip->mask_ack) {
+		chip->mask_ack(irq);
+	} else {
+		chip->mask(irq);
+		chip->ack(irq);
+	}
+}
+
+noop(irq)
+{
+}
+
+		</programlisting>
+	        </para>
+	    </sect3>
+	</sect2>
+	<sect2>
+	<title>Default flow handler implementations</title>
+	    <sect3>
+	 	<title>Default Level IRQ flow handler</title>
+		<para>
+		handle_level_irq provides a generic implementation
+		for level-triggered interrupts.
+		</para>
+		<para>
+		The following control flow is implemented (simplified excerpt):
+		<programlisting>
+desc->chip->start();
+handle_IRQ_event(desc->action);
+desc->chip->end();
+		</programlisting>
+		</para>
+   	    </sect3>
+	    <sect3>
+	 	<title>Default Edge IRQ flow handler</title>
+		<para>
+		handle_edge_irq provides a generic implementation
+		for edge-triggered interrupts.
+		</para>
+		<para>
+		The following control flow is implemented (simplified excerpt):
+		<programlisting>
+if (desc->status &amp; running) {
+	desc->chip->hold();
+	desc->status |= pending | masked;
+	return;
+}
+desc->chip->start();
+desc->status |= running;
+do {
+	if (desc->status &amp; masked)
+		desc->chip->enable();
+	desc->status &amp;= ~pending;
+	handle_IRQ_event(desc->action);
+} while (desc->status &amp; pending);
+desc->status &amp;= ~running;
+desc->chip->end();
+		</programlisting>
+		</para>
+   	    </sect3>
+	    <sect3>
+	 	<title>Default simple IRQ flow handler</title>
+		<para>
+		handle_simple_irq provides a generic implementation
+		for simple interrupts.
+		</para>
+		<para>
+		Note: The simple flow handler does not call any
+		handler/chip primitives.
+		</para>
+		<para>
+		The following control flow is implemented (simplified excerpt):
+		<programlisting>
+handle_IRQ_event(desc->action);
+		</programlisting>
+		</para>
+   	    </sect3>
+	    <sect3>
+	 	<title>Default per CPU flow handler</title>
+		<para>
+		handle_percpu_irq provides a generic implementation
+		for per CPU interrupts.
+		</para>
+		<para>
+		Per CPU interrupts are only available on SMP and
+		the handler provides a simplified version without
+		locking.
+		</para>
+		<para>
+		The following control flow is implemented (simplified excerpt):
+		<programlisting>
+desc->chip->start();
+handle_IRQ_event(desc->action);
+desc->chip->end();
+		</programlisting>
+		</para>
+   	    </sect3>
+	</sect2>
+	<sect2>
+	<title>Quirks and optimizations</title>
+	<para>
+	The generic functions are intended for 'clean' architectures and chips,
+	which have no platform-specific IRQ handling quirks. If an architecture
+	needs to implement quirks on the 'flow' level then it can do so by
+	overriding the highlevel irq-flow handler.
+	</para>
+	</sect2>
+	<sect2>
+	<title>Delayed interrupt disable</title>
+	<para>
+	This per interrupt selectable feature, which was introduced by Russell
+	King in the ARM interrupt implementation, does not mask an interrupt
+	at the hardware level when disable_irq() is called. The interrupt is
+	kept enabled and is masked in the flow handler when an interrupt event
+	happens. This prevents losing edge interrupts on hardware which does
+	not store an edge interrupt event while the interrupt is disabled at
+	the hardware level. When an interrupt arrives while the IRQ_DISABLED
+	flag is set, then the interrupt is masked at the hardware level and
+	the IRQ_PENDING bit is set. When the interrupt is re-enabled by
+	enable_irq() the pending bit is checked and if it is set, the
+	interrupt is resent either via hardware or by a software resend
+	mechanism. (It's necessary to enable CONFIG_HARDIRQS_SW_RESEND when
+	you want to use the delayed interrupt disable feature and your
+	hardware is not capable of retriggering an interrupt.)
+	The delayed interrupt disable can be runtime enabled, per interrupt,
+	by setting the IRQ_DELAYED_DISABLE flag in the irq_desc status field.
+	</para>
+	</sect2>
+    </sect1>
+    <sect1>
+	<title>Chiplevel hardware encapsulation</title>
+	<para>
+	The chip level hardware descriptor structure irq_chip
+	contains all the direct chip relevant functions, which
+	can be utilized by the irq flow implementations.
+	  <itemizedlist>
+	  <listitem><para>ack()</para></listitem>
+	  <listitem><para>mask_ack() - Optional, recommended for performance</para></listitem>
+	  <listitem><para>mask()</para></listitem>
+	  <listitem><para>unmask()</para></listitem>
+	  <listitem><para>retrigger() - Optional</para></listitem>
+	  <listitem><para>set_type() - Optional</para></listitem>
+	  <listitem><para>set_wake() - Optional</para></listitem>
+	  </itemizedlist>
+	These primitives are strictly intended to mean what they say: ack means
+	ACK, masking means masking of an IRQ line, etc. It is up to the flow
+	handler(s) to use these basic units of lowlevel functionality.
+	</para>
+    </sect1>
+  </chapter>
+
+  <chapter id="doirq">
+     <title>__do_IRQ entry point</title>
+     <para>
+ 	The original implementation __do_IRQ() is an alternative entry
+	point for all types of interrupts.
+     </para>
+     <para>
+	This handler turned out to be not suitable for all
+	interrupt hardware and was therefore reimplemented with split
+	functionality for edge/level/simple/percpu interrupts. This is not
+	only a functional optimization. It also shortens code paths for
+	interrupts.
+      </para>
+      <para>
+	To make use of the split implementation, replace the call to
+	__do_IRQ() by a call to desc->handle_irq() and associate
+	the appropriate handler function with desc->handle_irq.
+	In most cases the generic handler implementations should
+	be sufficient.
+     </para>
+  </chapter>
+
+  <chapter id="locking">
+     <title>Locking on SMP</title>
+     <para>
+	The locking of chip registers is up to the architecture that
+	defines the chip primitives. There is a chip->lock field that can be used
+	for serialization, but the generic layer does not touch it. The per-irq
+	structure is protected via desc->lock, by the generic layer.
+     </para>
+  </chapter>
+  <chapter id="structs">
+     <title>Structures</title>
+     <para>
+     This chapter contains the autogenerated documentation of the structures which are
+     used in the generic IRQ layer.
+     </para>
+!Iinclude/linux/irq.h
+  </chapter>
+
+  <chapter id="pubfunctions">
+     <title>Public Functions Provided</title>
+     <para>
+     This chapter contains the autogenerated documentation of the kernel API functions
+      which are exported.
+     </para>
+!Ekernel/irq/manage.c
+!Ekernel/irq/chip.c
+  </chapter>
+
+  <chapter id="intfunctions">
+     <title>Internal Functions Provided</title>
+     <para>
+     This chapter contains the autogenerated documentation of the internal functions.
+     </para>
+!Ikernel/irq/handle.c
+!Ikernel/irq/chip.c
+  </chapter>
+
+  <chapter id="credits">
+     <title>Credits</title>
+	<para>
+		The following people have contributed to this document:
+		<orderedlist>
+			<listitem><para>Thomas Gleixner<email>tglx@linutronix.de</email></para></listitem>
+			<listitem><para>Ingo Molnar<email>mingo@elte.hu</email></para></listitem>
+		</orderedlist>
+	</para>
+  </chapter>
+</book>
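
Reviewer note on the "Delayed interrupt disable" section of genericirq.tmpl: the flag handling described there can be followed more easily with a small user-space model. This is only a sketch under stated assumptions. The struct and the model_*() functions are hypothetical names invented for illustration, not kernel APIs, and the "resend" is reduced to running the handler once more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a single interrupt line (not a kernel structure). */
struct irq_model {
	bool disabled;		/* stands in for the IRQ_DISABLED flag */
	bool pending;		/* stands in for the IRQ_PENDING flag */
	bool hw_masked;		/* line is masked at the hardware level */
	unsigned int handled;	/* number of times the handler has run */
};

/* Delayed disable: only the software flag is set, hardware stays unmasked. */
static void model_disable_irq(struct irq_model *m)
{
	m->disabled = true;
}

/* Models the flow handler being entered for an interrupt event. */
static void model_irq_event(struct irq_model *m)
{
	/* In this model the line is only ever masked while it is disabled. */
	assert(!m->hw_masked || m->disabled);

	if (m->hw_masked)
		return;			/* masked lines deliver nothing */
	if (m->disabled) {
		m->hw_masked = true;	/* mask lazily, on the first event */
		m->pending = true;	/* remember the event for later */
		return;
	}
	m->handled++;			/* normal case: run the handler */
}

/* Re-enable the line and replay an event that arrived while disabled. */
static void model_enable_irq(struct irq_model *m)
{
	m->disabled = false;
	m->hw_masked = false;
	if (m->pending) {
		m->pending = false;
		m->handled++;		/* stands in for a hw or sw resend */
	}
}
```

In this model, an event that arrives between model_disable_irq() and model_enable_irq() is not lost: it is recorded as pending and handled once the line is enabled again, which is the behaviour the section describes for edge interrupts.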
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index ca02e04..1ae4dc0 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -62,6 +62,8 @@
      <sect1><title>Internal Functions</title>
 !Ikernel/exit.c
 !Ikernel/signal.c
+!Iinclude/linux/kthread.h
+!Ekernel/kthread.c
      </sect1>
 
      <sect1><title>Kernel objects manipulation</title>
@@ -114,9 +116,33 @@
      </sect1>
   </chapter>
 
+  <chapter id="kernel-lib">
+     <title>Basic Kernel Library Functions</title>
+
+     <para>
+       The Linux kernel provides more basic utility functions.
+     </para>
+
+     <sect1><title>Bitmap Operations</title>
+!Elib/bitmap.c
+!Ilib/bitmap.c
+     </sect1>
+
+     <sect1><title>Command-line Parsing</title>
+!Elib/cmdline.c
+     </sect1>
+
+     <sect1><title>CRC Functions</title>
+!Elib/crc16.c
+!Elib/crc32.c
+!Elib/crc-ccitt.c
+     </sect1>
+  </chapter>
+
   <chapter id="mm">
      <title>Memory Management in Linux</title>
      <sect1><title>The Slab Cache</title>
+!Iinclude/linux/slab.h
 !Emm/slab.c
      </sect1>
      <sect1><title>User Space Memory Access</title>
@@ -280,12 +306,13 @@
      <sect1><title>MTRR Handling</title>
 !Earch/i386/kernel/cpu/mtrr/main.c
      </sect1>
+
      <sect1><title>PCI Support Library</title>
 !Edrivers/pci/pci.c
 !Edrivers/pci/pci-driver.c
 !Edrivers/pci/remove.c
 !Edrivers/pci/pci-acpi.c
-<!-- kerneldoc does not understand to __devinit
+<!-- kerneldoc does not understand __devinit
 X!Edrivers/pci/search.c
  -->
 !Edrivers/pci/msi.c
@@ -314,9 +341,11 @@
      </sect1>
   </chapter>
 
-  <chapter id="devfs">
-     <title>The Device File System</title>
-!Efs/devfs/base.c
+  <chapter id="firmware">
+     <title>Firmware Interfaces</title>
+     <sect1><title>DMI Interfaces</title>
+!Edrivers/firmware/dmi_scan.c
+     </sect1>
   </chapter>
 
   <chapter id="sysfs">
@@ -331,6 +360,18 @@
 !Esecurity/security.c
   </chapter>
 
+  <chapter id="audit">
+     <title>Audit Interfaces</title>
+!Ekernel/audit.c
+!Ikernel/auditsc.c
+!Ikernel/auditfilter.c
+  </chapter>
+
+  <chapter id="accounting">
+     <title>Accounting Framework</title>
+!Ikernel/acct.c
+  </chapter>
+
   <chapter id="pmfuncs">
      <title>Power Management</title>
 !Ekernel/power/pm.c
@@ -390,7 +431,6 @@
      </sect1>
   </chapter>
 
-
   <chapter id="blkdev">
      <title>Block Devices</title>
 !Eblock/ll_rw_blk.c
@@ -401,6 +441,14 @@
 !Edrivers/char/misc.c
   </chapter>
 
+  <chapter id="parportdev">
+     <title>Parallel Port Devices</title>
+!Iinclude/linux/parport.h
+!Edrivers/parport/ieee1284.c
+!Edrivers/parport/share.c
+!Idrivers/parport/daisy.c
+  </chapter>
+
   <chapter id="viddev">
      <title>Video4Linux</title>
 !Edrivers/media/video/videodev.c
diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl
index 158ffe9..644c388 100644
--- a/Documentation/DocBook/kernel-locking.tmpl
+++ b/Documentation/DocBook/kernel-locking.tmpl
@@ -1590,7 +1590,7 @@
     <para>
       Our final dilemma is this: when can we actually destroy the
       removed element?  Remember, a reader might be stepping through
-      this element in the list right now: it we free this element and
+      this element in the list right now: if we free this element and
       the <symbol>next</symbol> pointer changes, the reader will jump
       off into garbage and crash.  We need to wait until we know that
       all the readers who were traversing the list when we deleted the
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl
index f869b03..e97c323 100644
--- a/Documentation/DocBook/libata.tmpl
+++ b/Documentation/DocBook/libata.tmpl
@@ -169,6 +169,22 @@
 
 	</sect2>
 
+	<sect2><title>PIO data read/write</title>
+	<programlisting>
+void (*data_xfer) (struct ata_device *, unsigned char *, unsigned int, int);
+	</programlisting>
+
+	<para>
+All bmdma-style drivers must implement this hook.  This is the low-level
+operation that actually copies the data bytes during a PIO data
+transfer.
+Typically the driver
+will choose one of ata_pio_data_xfer_noirq(), ata_pio_data_xfer(), or
+ata_mmio_data_xfer().
+	</para>
+
+	</sect2>
+
 	<sect2><title>ATA command execute</title>
 	<programlisting>
 void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf);
@@ -204,11 +220,10 @@
 	<programlisting>
 u8   (*check_status)(struct ata_port *ap);
 u8   (*check_altstatus)(struct ata_port *ap);
-u8   (*check_err)(struct ata_port *ap);
 	</programlisting>
 
 	<para>
-	Reads the Status/AltStatus/Error ATA shadow register from
+	Reads the Status/AltStatus ATA shadow register from
 	hardware.  On some hardware, reading the Status register has
 	the side effect of clearing the interrupt condition.
 	Most drivers for taskfile-based hardware use
@@ -269,23 +284,6 @@
 
 	</sect2>
 
-	<sect2><title>Reset ATA bus</title>
-	<programlisting>
-void (*phy_reset) (struct ata_port *ap);
-	</programlisting>
-
-	<para>
-	The very first step in the probe phase.  Actions vary depending
-	on the bus type, typically.  After waking up the device and probing
-	for device presence (PATA and SATA), typically a soft reset
-	(SRST) will be performed.  Drivers typically use the helper
-	functions ata_bus_reset() or sata_phy_reset() for this hook.
-	Many SATA drivers use sata_phy_reset() or call it from within
-	their own phy_reset() functions.
-	</para>
-
-	</sect2>
-
 	<sect2><title>Control PCI IDE BMDMA engine</title>
 	<programlisting>
 void (*bmdma_setup) (struct ata_queued_cmd *qc);
@@ -354,16 +352,74 @@
 
 	</sect2>
 
-	<sect2><title>Timeout (error) handling</title>
+	<sect2><title>Exception and probe handling (EH)</title>
 	<programlisting>
 void (*eng_timeout) (struct ata_port *ap);
+void (*phy_reset) (struct ata_port *ap);
 	</programlisting>
 
 	<para>
-This is a high level error handling function, called from the
-error handling thread, when a command times out.  Most newer
-hardware will implement its own error handling code here.  IDE BMDMA
-drivers may use the helper function ata_eng_timeout().
+Deprecated.  Use ->error_handler() instead.
+	</para>
+
+	<programlisting>
+void (*freeze) (struct ata_port *ap);
+void (*thaw) (struct ata_port *ap);
+	</programlisting>
+
+	<para>
+ata_port_freeze() is called when HSM violations or some other
+condition disrupts normal operation of the port.  A frozen port
+is not allowed to perform any operation until the port is
+thawed, which usually follows a successful reset.
+	</para>
+
+	<para>
+The optional ->freeze() callback can be used for freezing the port
+hardware-wise (e.g. mask interrupt and stop DMA engine).  If a
+port cannot be frozen hardware-wise, the interrupt handler
+must ack and clear interrupts unconditionally while the port
+is frozen.
+	</para>
+	<para>
+The optional ->thaw() callback is called to perform the opposite of ->freeze():
+prepare the port for normal operation once again.  Unmask interrupts,
+start DMA engine, etc.
+	</para>
+
+	<programlisting>
+void (*error_handler) (struct ata_port *ap);
+	</programlisting>
+
+	<para>
+->error_handler() is a driver's hook into probe, hotplug, recovery,
+and other exceptional conditions.  The primary responsibility of an
+implementation is to call ata_do_eh() or ata_bmdma_drive_eh() with a set
+of EH hooks as arguments:
+	</para>
+
+	<para>
+'prereset' hook (may be NULL) is called during an EH reset, before any other actions
+are taken.
+	</para>
+
+	<para>
+'postreset' hook (may be NULL) is called after the EH reset is performed.
+	</para>
+
+	<para>
+Based on existing conditions, severity of the problem, and hardware
+capabilities, either 'softreset' (may be NULL) or 'hardreset' (may be NULL)
+will be called to perform the low-level EH reset.
+	</para>
+
+	<programlisting>
+void (*post_internal_cmd) (struct ata_queued_cmd *qc);
+	</programlisting>
+
+	<para>
+Perform any hardware-specific actions necessary to finish processing
+after executing a probe-time or EH-time command via ata_exec_internal().
 	</para>
 
 	</sect2>
diff --git a/Documentation/DocBook/mtdnand.tmpl b/Documentation/DocBook/mtdnand.tmpl
index 6e463d0..a8c8cce 100644
--- a/Documentation/DocBook/mtdnand.tmpl
+++ b/Documentation/DocBook/mtdnand.tmpl
@@ -109,7 +109,7 @@
 		for most of the implementations. These functions can be replaced by the
 		board driver if neccecary. Those functions are called via pointers in the
 		NAND chip description structure. The board driver can set the functions which
-		should be replaced by board dependend functions before calling nand_scan().
+		should be replaced by board dependent functions before calling nand_scan().
 		If the function pointer is NULL on entry to nand_scan() then the pointer
 		is set to the default function which is suitable for the detected chip type.
 		</para></listitem>
@@ -133,7 +133,7 @@
 	  	[REPLACEABLE]</para><para>
 		Replaceable members hold hardware related functions which can be 
 		provided by the board driver. The board driver can set the functions which
-		should be replaced by board dependend functions before calling nand_scan().
+		should be replaced by board dependent functions before calling nand_scan().
 		If the function pointer is NULL on entry to nand_scan() then the pointer
 		is set to the default function which is suitable for the detected chip type.
 		</para></listitem>
@@ -156,9 +156,8 @@
      	<title>Basic board driver</title>
 	<para>
 		For most boards it will be sufficient to provide just the
-		basic functions and fill out some really board dependend
+		basic functions and fill out some really board dependent
 		members in the nand chip description structure.
-		See drivers/mtd/nand/skeleton for reference.
 	</para>
 	<sect1>
 		<title>Basic defines</title>
@@ -189,9 +188,9 @@
 	<sect1>
 		<title>Partition defines</title>
 		<para>
-			If you want to divide your device into parititions, then
-			enable the configuration switch CONFIG_MTD_PARITIONS and define
-			a paritioning scheme suitable to your board.
+			If you want to divide your device into partitions, then
+			enable the configuration switch CONFIG_MTD_PARTITIONS and define
+			a partitioning scheme suitable to your board.
 		</para>
 		<programlisting>
 #define NUM_PARTITIONS 2
@@ -1295,7 +1294,9 @@
      </para>
 !Idrivers/mtd/nand/nand_base.c
 !Idrivers/mtd/nand/nand_bbt.c
-!Idrivers/mtd/nand/nand_ecc.c
+<!-- No internal functions for kernel-doc:
+X!Idrivers/mtd/nand/nand_ecc.c
+-->
   </chapter>
 
   <chapter id="credits">
diff --git a/Documentation/DocBook/videobook.tmpl b/Documentation/DocBook/videobook.tmpl
index fdff984..b629da3 100644
--- a/Documentation/DocBook/videobook.tmpl
+++ b/Documentation/DocBook/videobook.tmpl
@@ -976,7 +976,7 @@
   <title>Interrupt Handling</title>
   <para>
         Our example handler is for an ISA bus device. If it was PCI you would be
-        able to share the interrupt and would have set SA_SHIRQ to indicate a 
+        able to share the interrupt and would have set IRQF_SHARED to indicate a
         shared IRQ. We pass the device pointer as the interrupt routine argument. We
         don't need to since we only support one card but doing this will make it
         easier to upgrade the driver for multiple devices in the future.
diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt
index bf1cf98d..0256805 100644
--- a/Documentation/IPMI.txt
+++ b/Documentation/IPMI.txt
@@ -10,7 +10,7 @@
 It provides for dynamic discovery of sensors in the system and the
 ability to monitor the sensors and be informed when the sensor's
 values change or go outside certain boundaries.  It also has a
-standardized database for field-replacable units (FRUs) and a watchdog
+standardized database for field-replaceable units (FRUs) and a watchdog
 timer.
 
 To use this, you need an interface to an IPMI controller in your
@@ -64,7 +64,7 @@
 IPMI defines a standard watchdog timer.  You can enable this with the
 'IPMI Watchdog Timer' config option.  If you compile the driver into
 the kernel, then via a kernel command-line option you can have the
-watchdog timer start as soon as it intitializes.  It also have a lot
+watchdog timer start as soon as it initializes.  It also has a lot
 of other options, see the 'Watchdog' section below for more details.
 Note that you can also have the watchdog continue to run if it is
 closed (by default it is disabled on close).  Go into the 'Watchdog
diff --git a/Documentation/IRQ.txt b/Documentation/IRQ.txt
new file mode 100644
index 0000000..1011e71
--- /dev/null
+++ b/Documentation/IRQ.txt
@@ -0,0 +1,22 @@
+What is an IRQ?
+
+An IRQ is an interrupt request from a device.
+Currently they can come in over a pin, or over a packet.
+Several devices may be connected to the same pin thus
+sharing an IRQ.
+
+An IRQ number is a kernel identifier used to talk about a hardware
+interrupt source.  Typically this is an index into the global irq_desc
+array, but except for what linux/interrupt.h implements, the details
+are architecture specific.
+
+An IRQ number is an enumeration of the possible interrupt sources on a
+machine.  Typically what is enumerated is the number of input pins on
+all of the interrupt controllers in the system.  In the case of ISA
+what is enumerated are the 16 input pins on the two i8259 interrupt
+controllers.
+
+Architectures can assign additional meaning to the IRQ numbers, and
+are encouraged to in the case where there is any manual configuration
+of the hardware involved.  The ISA IRQs are a classic example of
+assigning this kind of additional meaning.
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 49e27cc..1d50cf0 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -144,9 +144,47 @@
 	whether the increased speed is worth it.
 
 8.	Although synchronize_rcu() is a bit slower than is call_rcu(),
-	it usually results in simpler code.  So, unless update performance
-	is important or the updaters cannot block, synchronize_rcu()
-	should be used in preference to call_rcu().
+	it usually results in simpler code.  So, unless update
+	performance is critically important or the updaters cannot block,
+	synchronize_rcu() should be used in preference to call_rcu().
+
+	An especially important property of the synchronize_rcu()
+	primitive is that it automatically self-limits: if grace periods
+	are delayed for whatever reason, then the synchronize_rcu()
+	primitive will correspondingly delay updates.  In contrast,
+	code using call_rcu() should explicitly limit update rate in
+	cases where grace periods are delayed, as failing to do so can
+	result in excessive realtime latencies or even OOM conditions.
+
+	Ways of gaining this self-limiting property when using call_rcu()
+	include:
+
+	a.	Keeping a count of the number of data-structure elements
+		used by the RCU-protected data structure, including those
+		waiting for a grace period to elapse.  Enforce a limit
+		on this number, stalling updates as needed to allow
+		previously deferred frees to complete.
+
+		Alternatively, limit only the number awaiting deferred
+		free rather than the total number of elements.
+
+	b.	Limiting update rate.  For example, if updates occur only
+		once per hour, then no explicit rate limiting is required,
+		unless your system is already badly broken.  The dcache
+		subsystem takes this approach -- updates are guarded
+		by a global lock, limiting their rate.
+
+	c.	Trusted update -- if updates can only be done manually by
+		superuser or some other trusted user, then it might not
+		be necessary to automatically limit them.  The theory
+		here is that superuser already has lots of ways to crash
+		the machine.
+
+	d.	Use call_rcu_bh() rather than call_rcu(), in order to take
+		advantage of call_rcu_bh()'s faster grace periods.
+
+	e.	Periodically invoke synchronize_rcu(), permitting a limited
+		number of updates per grace period.
 
 9.	All RCU list-traversal primitives, which include
 	list_for_each_rcu(), list_for_each_entry_rcu(),
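
Reviewer note on item 8a of checklist.txt: the idea of counting elements that await a grace period and stalling updaters at a limit can be sketched as a small user-space model. This is an illustration under stated assumptions, not kernel code. The struct and the defer_*() functions are hypothetical names, and a real updater would need atomic operations or a lock around the counter and would invoke call_rcu() where the comment indicates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for deferred frees (not a kernel structure). */
struct defer_limiter {
	unsigned int pending;	/* elements waiting for a grace period */
	unsigned int limit;	/* most elements allowed to wait at once */
};

/*
 * Updater side: returns true if one more deferred free may be queued now
 * (real code would then call call_rcu()); false means the updater should
 * stall until earlier deferred frees have completed.
 */
static bool defer_try_queue(struct defer_limiter *l)
{
	if (l->pending >= l->limit)
		return false;
	l->pending++;
	return true;
}

/* Callback side: one deferred free has completed after its grace period. */
static void defer_complete(struct defer_limiter *l)
{
	assert(l->pending > 0);
	l->pending--;
}
```

With a limit of 2, a third defer_try_queue() call fails until defer_complete() has run once, so the update rate follows the grace-period rate in the same way synchronize_rcu() limits it implicitly.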
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index e4c3815..a494859 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -7,7 +7,7 @@
 implementations.  It creates an rcutorture kernel module that can
 be loaded to run a torture test.  The test periodically outputs
 status messages via printk(), which can be examined via the dmesg
-command (perhaps grepping for "rcutorture").  The test is started
+command (perhaps grepping for "torture").  The test is started
 when the module is loaded, and stops when the module is unloaded.
 
 However, actually setting this config option to "y" results in the system
@@ -35,6 +35,19 @@
 		be printed -only- when the module is unloaded, and this
 		is the default.
 
+shuffle_interval
+		The number of seconds to keep the test threads affinitied
+		to a particular subset of the CPUs.  Used in conjunction
+		with test_no_idle_hz.
+
+test_no_idle_hz	Whether or not to test the ability of RCU to operate in
+		a kernel that disables the scheduling-clock interrupt to
+		idle CPUs.  Boolean parameter, "1" to test, "0" otherwise.
+
+torture_type	The type of RCU to test: "rcu" for the rcu_read_lock()
+		API, "rcu_bh" for the rcu_read_lock_bh() API, and "srcu"
+		for the "srcu_read_lock()" API.
+
 verbose		Enable debug printk()s.  Default is disabled.
 
 
@@ -42,14 +55,14 @@
 
 The statistics output is as follows:
 
-	rcutorture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
-	rcutorture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
-	rcutorture: Reader Pipe:  1466408 9747 0 0 0 0 0 0 0 0 0
-	rcutorture: Reader Batch:  1464477 11678 0 0 0 0 0 0 0 0
-	rcutorture: Free-Block Circulation:  1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
-	rcutorture: --- End of test
+	rcu-torture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
+	rcu-torture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
+	rcu-torture: Reader Pipe:  1466408 9747 0 0 0 0 0 0 0 0 0
+	rcu-torture: Reader Batch:  1464477 11678 0 0 0 0 0 0 0 0
+	rcu-torture: Free-Block Circulation:  1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
+	rcu-torture: --- End of test
 
-The command "dmesg | grep rcutorture:" will extract this information on
+The command "dmesg | grep torture:" will extract this information on
 most systems.  On more esoteric configurations, it may be necessary to
 use other commands to access the output of the printk()s used by
 the RCU torture test.  The printk()s use KERN_ALERT, so they should
@@ -115,8 +128,9 @@
 	modprobe rcutorture
 	sleep 100
 	rmmod rcutorture
-	dmesg | grep rcutorture:
+	dmesg | grep torture:
 
 The output can be manually inspected for the error flag of "!!!".
 One could of course create a more elaborate script that automatically
-checked for such errors.
+checked for such errors.  The "rmmod" command forces a "SUCCESS" or
+"FAILURE" indication to be printk()ed.
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 07cb93b..318df44 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -184,7 +184,17 @@
 	blocking, it registers a function and argument which are invoked
 	after all ongoing RCU read-side critical sections have completed.
 	This callback variant is particularly useful in situations where
-	it is illegal to block.
+	it is illegal to block or where update-side performance is
+	critically important.
+
+	However, the call_rcu() API should not be used lightly, as use
+	of the synchronize_rcu() API generally results in simpler code.
+	In addition, the synchronize_rcu() API has the nice property
+	of automatically limiting update rate should grace periods
+	be delayed.  This property results in system resilience in face
+	of denial-of-service attacks.  Code using call_rcu() should limit
+	update rate in order to gain this same sort of resilience.  See
+	checklist.txt for some approaches to limiting the update rate.
 
 rcu_assign_pointer()
 
@@ -677,8 +687,9 @@
 	+	spin_lock(&listmutex);
 		list_for_each_entry(p, head, lp) {
 			if (p->key == key) {
-				list_del(&p->list);
+	-			list_del(&p->list);
 	-			write_unlock(&listmutex);
+	+			list_del_rcu(&p->list);
 	+			spin_unlock(&listmutex);
 	+			synchronize_rcu();
 				kfree(p);
@@ -726,7 +737,7 @@
  5   write_lock(&listmutex);            5   spin_lock(&listmutex);
  6   list_for_each_entry(p, head, lp) { 6   list_for_each_entry(p, head, lp) {
  7     if (p->key == key) {             7     if (p->key == key) {
- 8       list_del(&p->list);            8       list_del(&p->list);
+ 8       list_del(&p->list);            8       list_del_rcu(&p->list);
  9       write_unlock(&listmutex);      9       spin_unlock(&listmutex);
                                        10       synchronize_rcu();
 10       kfree(p);                     11       kfree(p);
@@ -790,7 +801,6 @@
 
 RCU grace period:
 
-	synchronize_kernel (deprecated)
 	synchronize_net
 	synchronize_sched
 	synchronize_rcu
diff --git a/Documentation/README.DAC960 b/Documentation/README.DAC960
index 98ea617..0e8f618 100644
--- a/Documentation/README.DAC960
+++ b/Documentation/README.DAC960
@@ -78,9 +78,9 @@
 terms are in use in the Mylex documentation; I have chosen to standardize on
 the more generic "Logical Drive" and "Drive Group".
 
-DAC960 RAID disk devices are named in the style of the Device File System
-(DEVFS).  The device corresponding to Logical Drive D on Controller C is
-referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
+DAC960 RAID disk devices are named in the style of the obsolete Device File
+System (DEVFS).  The device corresponding to Logical Drive D on Controller C
+is referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
 through /dev/rd/cCdDp7.  For example, partition 3 of Logical Drive 5 on
 Controller 2 is referred to as /dev/rd/c2d5p3.  Note that unlike with SCSI
 disks the device names will not change in the event of a disk drive failure.
diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist
new file mode 100644
index 0000000..a10bfb6
--- /dev/null
+++ b/Documentation/SubmitChecklist
@@ -0,0 +1,63 @@
+Linux Kernel patch submittal checklist
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here are some basic things that developers should do if they want to see their
+kernel patch submissions accepted more quickly.
+
+These are all above and beyond the documentation that is provided in
+Documentation/SubmittingPatches and elsewhere regarding submitting Linux
+kernel patches.
+
+
+
+1: Builds cleanly with applicable or modified CONFIG options =y, =m, and
+   =n.  No gcc warnings/errors, no linker warnings/errors.
+
+2: Passes allnoconfig, allmodconfig
+
+3: Builds on multiple CPU architectures by using local cross-compile tools
+   or something like PLM at OSDL.
+
+4: ppc64 is a good architecture for cross-compilation checking because it
+   tends to use `unsigned long' for 64-bit quantities.
+
+5: Matches kernel coding style(!)
+
+6: Any new or modified CONFIG options don't muck up the config menu.
+
+7: All new Kconfig options have help text.
+
+8: Has been carefully reviewed with respect to relevant Kconfig
+   combinations.  This is very hard to get right with testing -- brainpower
+   pays off here.
+
+9: Checks cleanly with sparse.
+
+10: Use 'make checkstack' and 'make namespacecheck' and fix any problems
+    that they find.  Note: checkstack does not point out problems explicitly,
+    but any one function that uses more than 512 bytes on the stack is a
+    candidate for change.
+
+11: Include kernel-doc to document global kernel APIs.  (Not required for
+    static functions, but OK there also.) Use 'make htmldocs' or 'make
+    mandocs' to check the kernel-doc and fix any issues.
+
+12: Has been tested with CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
+    CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
+    CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_SPINLOCK_SLEEP all simultaneously
+    enabled.
+
+13: Has been build- and runtime tested with and without CONFIG_SMP and
+    CONFIG_PREEMPT.
+
+14: If the patch affects IO/Disk, etc: has been tested with and without
+    CONFIG_LBD.
+
+15: All codepaths have been exercised with all lockdep features enabled.
+
+16: All new /proc entries are documented under Documentation/
+
+17: All new kernel boot parameters are documented in
+    Documentation/kernel-parameters.txt.
+
+18: All new module parameters are documented with MODULE_PARM_DESC()
diff --git a/Documentation/accounting/delay-accounting.txt b/Documentation/accounting/delay-accounting.txt
new file mode 100644
index 0000000..be215e5
--- /dev/null
+++ b/Documentation/accounting/delay-accounting.txt
@@ -0,0 +1,110 @@
+Delay accounting
+----------------
+
+Tasks encounter delays in execution when they wait
+for some kernel resource to become available e.g. a
+runnable task may wait for a free CPU to run on.
+
+The per-task delay accounting functionality measures
+the delays experienced by a task while
+
+a) waiting for a CPU (while being runnable)
+b) waiting for completion of synchronous block I/O initiated by the task
+c) swapping in pages
+
+and makes these statistics available to userspace through
+the taskstats interface.
+
+Such delays provide feedback for setting a task's cpu priority,
+io priority and rss limit values appropriately. Long delays for
+important tasks could be a trigger for raising their corresponding priority.
+
+The functionality, through its use of the taskstats interface, also provides
+delay statistics aggregated for all tasks (or threads) belonging to a
+thread group (corresponding to a traditional Unix process). This is a commonly
+needed aggregation that is more efficiently done by the kernel.
+
+Userspace utilities, particularly resource management applications, can also
+aggregate delay statistics into arbitrary groups. To enable this, delay
+statistics of a task are available both during its lifetime as well as on its
+exit, ensuring continuous and complete monitoring can be done.
+
+
+Interface
+---------
+
+Delay accounting uses the taskstats interface which is described
+in detail in a separate document in this directory. Taskstats returns a
+generic data structure to userspace corresponding to per-pid and per-tgid
+statistics. The delay accounting functionality populates specific fields of
+this structure. See
+     include/linux/taskstats.h
+for a description of the fields pertaining to delay accounting.
+It will generally be in the form of counters returning the cumulative
+delay seen for cpu, sync block I/O, swapin etc.
+
+Taking the difference of two successive readings of a given
+counter (say cpu_delay_total) for a task will give the delay
+experienced by the task waiting for the corresponding resource
+in that interval.
+
+When a task exits, records containing the per-task statistics
+are sent to userspace without requiring a command. If it is the last exiting
+task of a thread group, the per-tgid statistics are also sent. More details
+are given in the taskstats interface description.
+
+The getdelays.c userspace utility in this directory allows simple commands to
+be run and the corresponding delay statistics to be displayed. It also serves
+as an example of using the taskstats interface.
+
+Usage
+-----
+
+Compile the kernel with
+	CONFIG_TASK_DELAY_ACCT=y
+	CONFIG_TASKSTATS=y
+
+Enable the accounting at boot time by adding
+the following to the kernel boot options
+	delayacct
+
+and after the system has booted up, use a utility
+similar to getdelays.c to access the delays
+seen by a given task or a task group (tgid).
+The utility also allows a given command to be
+executed and the corresponding delays to be
+seen.
+
+General format of the getdelays command
+
+getdelays [-t tgid] [-p pid] [-c cmd...]
+
+
+Get delays, since system boot, for pid 10
+# ./getdelays -p 10
+(output similar to next case)
+
+Get sum of delays, since system boot, for all pids with tgid 5
+# ./getdelays -t 5
+
+
+CPU	count	real total	virtual total	delay total
+	7876	92005750	100000000	24001500
+IO	count	delay total
+	0	0
+MEM	count	delay total
+	0	0
+
+Get delays seen in executing a given simple command
+# ./getdelays -c ls /
+
+bin   data1  data3  data5  dev  home  media  opt   root  srv        sys  usr
+boot  data2  data4  data6  etc  lib   mnt    proc  sbin  subdomain  tmp  var
+
+
+CPU	count	real total	virtual total	delay total
+	6	4000250		4000000		0
+IO	count	delay total
+	0	0
+MEM	count	delay total
+	0	0
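
Reviewer note on the "Interface" section of delay-accounting.txt: the statement that the difference of two successive counter readings gives the delay in that interval can be written out as a short helper. This is a minimal sketch. The struct delay_sample type and its field names are hypothetical and are not taken from include/linux/taskstats.h, and the unit is whatever the kernel reports for the counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of one resource's counters at a point in time. */
struct delay_sample {
	uint64_t count;		/* number of delay events seen so far */
	uint64_t delay_total;	/* cumulative delay, in the kernel's unit */
};

/* Delay accumulated between two successive readings of the same task. */
static uint64_t delay_in_interval(const struct delay_sample *prev,
				  const struct delay_sample *cur)
{
	/* The counters are cumulative, so a later reading is never smaller. */
	assert(cur->delay_total >= prev->delay_total);
	return cur->delay_total - prev->delay_total;
}

/* Mean delay per event in the interval, or 0 if no event occurred. */
static uint64_t avg_delay_in_interval(const struct delay_sample *prev,
				      const struct delay_sample *cur)
{
	uint64_t events = cur->count - prev->count;

	return events ? delay_in_interval(prev, cur) / events : 0;
}
```

A monitoring tool would read the counters periodically and pass the previous and current readings to these helpers. The mean per event is an extra convenience that the document itself does not define.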
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
new file mode 100644
index 0000000..795ca39
--- /dev/null
+++ b/Documentation/accounting/getdelays.c
@@ -0,0 +1,396 @@
+/* getdelays.c
+ *
+ * Utility to get per-pid and per-tgid delay accounting statistics
+ * Also illustrates usage of the taskstats interface
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2005
+ * Copyright (C) Balbir Singh, IBM Corp. 2006
+ * Copyright (c) Jay Lan, SGI. 2006
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <unistd.h>
+#include <poll.h>
+#include <string.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/socket.h>
+#include <sys/types.h>
+#include <signal.h>
+
+#include <linux/genetlink.h>
+#include <linux/taskstats.h>
+
+/*
+ * Generic macros for dealing with netlink sockets. Might be duplicated
+ * elsewhere. It is recommended that commercial grade applications use
+ * libnl or libnetlink and use the interfaces provided by the library
+ */
+#define GENLMSG_DATA(glh)	((void *)((char *)NLMSG_DATA(glh) + GENL_HDRLEN))
+#define GENLMSG_PAYLOAD(glh)	(NLMSG_PAYLOAD(glh, 0) - GENL_HDRLEN)
+#define NLA_DATA(na)		((void *)((char*)(na) + NLA_HDRLEN))
+#define NLA_PAYLOAD(len)	(len - NLA_HDRLEN)
+
+#define err(code, fmt, arg...) do { printf(fmt, ##arg); exit(code); } while (0)
+int done = 0;
+int rcvbufsz=0;
+
+char name[100];
+int dbg=0, print_delays=0;
+__u64 stime, utime;
+#define PRINTF(fmt, arg...) {			\
+	    if (dbg) {				\
+		printf(fmt, ##arg);		\
+	    }					\
+	}
+
+/* Maximum size of response requested or message sent */
+#define MAX_MSG_SIZE	256
+/* Maximum number of cpus expected to be specified in a cpumask */
+#define MAX_CPUS	32
+/* Maximum length of pathname to log file */
+#define MAX_FILENAME	256
+
+struct msgtemplate {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[MAX_MSG_SIZE];
+};
+
+char cpumask[100+6*MAX_CPUS];
+
+/*
+ * Create a raw netlink socket and bind
+ */
+static int create_nl_socket(int protocol)
+{
+	int fd;
+	struct sockaddr_nl local;
+
+	fd = socket(AF_NETLINK, SOCK_RAW, protocol);
+	if (fd < 0)
+		return -1;
+
+	if (rcvbufsz)
+		if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
+				&rcvbufsz, sizeof(rcvbufsz)) < 0) {
+			printf("Unable to set socket rcv buf size to %d\n",
+			       rcvbufsz);
+			goto error;	/* close fd instead of leaking it */
+		}
+
+	memset(&local, 0, sizeof(local));
+	local.nl_family = AF_NETLINK;
+
+	if (bind(fd, (struct sockaddr *) &local, sizeof(local)) < 0)
+		goto error;
+
+	return fd;
+error:
+	close(fd);
+	return -1;
+}
+
+
+int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
+	     __u8 genl_cmd, __u16 nla_type,
+	     void *nla_data, int nla_len)
+{
+	struct nlattr *na;
+	struct sockaddr_nl nladdr;
+	int r, buflen;
+	char *buf;
+
+	struct msgtemplate msg;
+
+	msg.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
+	msg.n.nlmsg_type = nlmsg_type;
+	msg.n.nlmsg_flags = NLM_F_REQUEST;
+	msg.n.nlmsg_seq = 0;
+	msg.n.nlmsg_pid = nlmsg_pid;
+	msg.g.cmd = genl_cmd;
+	msg.g.version = 0x1;
+	na = (struct nlattr *) GENLMSG_DATA(&msg);
+	na->nla_type = nla_type;
+	na->nla_len = nla_len + 1 + NLA_HDRLEN;
+	memcpy(NLA_DATA(na), nla_data, nla_len);
+	msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
+
+	buf = (char *) &msg;
+	buflen = msg.n.nlmsg_len ;
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+	while ((r = sendto(sd, buf, buflen, 0, (struct sockaddr *) &nladdr,
+			   sizeof(nladdr))) < buflen) {
+		if (r > 0) {
+			buf += r;
+			buflen -= r;
+		} else if (errno != EAGAIN)
+			return -1;
+	}
+	return 0;
+}
+
+
+/*
+ * Probe the controller in genetlink to find the family id
+ * for the TASKSTATS family
+ */
+int get_family_id(int sd)
+{
+	struct {
+		struct nlmsghdr n;
+		struct genlmsghdr g;
+		char buf[256];
+	} ans;
+
+	int id = 0, rc;	/* 0 means "family id not found" */
+	struct nlattr *na;
+	int rep_len;
+
+	strcpy(name, TASKSTATS_GENL_NAME);
+	rc = send_cmd(sd, GENL_ID_CTRL, getpid(), CTRL_CMD_GETFAMILY,
+			CTRL_ATTR_FAMILY_NAME, (void *)name,
+			strlen(TASKSTATS_GENL_NAME)+1);
+	if (rc < 0)
+		return 0;	/* nothing was sent, so do not block in recv() */
+	rep_len = recv(sd, &ans, sizeof(ans), 0);
+	if (rep_len < 0 || ans.n.nlmsg_type == NLMSG_ERROR ||
+	    !NLMSG_OK((&ans.n), rep_len))
+		return 0;
+	na = (struct nlattr *) GENLMSG_DATA(&ans);
+	na = (struct nlattr *) ((char *) na + NLA_ALIGN(na->nla_len));
+	if (na->nla_type == CTRL_ATTR_FAMILY_ID) {
+		id = *(__u16 *) NLA_DATA(na);
+	}
+	return id;
+}
+
+void print_delayacct(struct taskstats *t)
+{
+	printf("\n\nCPU   %15s%15s%15s%15s\n"
+	       "      %15llu%15llu%15llu%15llu\n"
+	       "IO    %15s%15s\n"
+	       "      %15llu%15llu\n"
+	       "MEM   %15s%15s\n"
+	       "      %15llu%15llu\n\n",
+	       "count", "real total", "virtual total", "delay total",
+	       t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
+	       t->cpu_delay_total,
+	       "count", "delay total",
+	       t->blkio_count, t->blkio_delay_total,
+	       "count", "delay total", t->swapin_count, t->swapin_delay_total);
+}
+
+int main(int argc, char *argv[])
+{
+	int c, rc, rep_len, aggr_len, len2, cmd_type;
+	__u16 id;
+	__u32 mypid;
+
+	struct nlattr *na;
+	int nl_sd = -1;
+	int len = 0;
+	pid_t tid = 0;
+	pid_t rtid = 0;
+
+	int fd = 0;
+	int count = 0;
+	int write_file = 0;
+	int maskset = 0;
+	char logfile[MAX_FILENAME];
+	int loop = 0;
+
+	struct msgtemplate msg;
+
+	while (1) {
+		c = getopt(argc, argv, "dw:r:m:t:p:v:l");
+		if (c < 0)
+			break;
+
+		switch (c) {
+		case 'd':
+			printf("print delayacct stats ON\n");
+			print_delays = 1;
+			break;
+		case 'w':
+			snprintf(logfile, sizeof(logfile), "%s", optarg);
+			printf("write to file %s\n", logfile);
+			write_file = 1;
+			break;
+		case 'r':
+			rcvbufsz = atoi(optarg);
+			printf("receive buf size %d\n", rcvbufsz);
+			if (rcvbufsz < 0)
+				err(1, "Invalid rcv buf size\n");
+			break;
+		case 'm':
+			strncpy(cpumask, optarg, sizeof(cpumask) - 1);
+			maskset = 1;
+			printf("cpumask %s maskset %d\n", cpumask, maskset);
+			break;
+		case 't':
+			tid = atoi(optarg);
+			if (!tid)
+				err(1, "Invalid tgid\n");
+			cmd_type = TASKSTATS_CMD_ATTR_TGID;
+			print_delays = 1;
+			break;
+		case 'p':
+			tid = atoi(optarg);
+			if (!tid)
+				err(1, "Invalid pid\n");
+			cmd_type = TASKSTATS_CMD_ATTR_PID;
+			print_delays = 1;
+			break;
+		case 'v':
+			printf("debug on\n");
+			dbg = 1;
+			break;
+		case 'l':
+			printf("listen forever\n");
+			loop = 1;
+			break;
+		default:
+			printf("Unknown option %d\n", c);
+			exit(-1);
+		}
+	}
+
+	if (write_file) {
+		fd = open(logfile, O_WRONLY | O_CREAT | O_TRUNC,
+			  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+		if (fd == -1) {
+			perror("Cannot open output file");
+			exit(1);
+		}
+	}
+
+	if ((nl_sd = create_nl_socket(NETLINK_GENERIC)) < 0)
+		err(1, "error creating Netlink socket\n");
+
+
+	mypid = getpid();
+	id = get_family_id(nl_sd);
+	if (!id) {
+		printf("Error getting family id, errno %d\n", errno);
+		goto err;
+	}
+	PRINTF("family id %d\n", id);
+
+	if (maskset) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      TASKSTATS_CMD_ATTR_REGISTER_CPUMASK,
+			      &cpumask, sizeof(cpumask));
+		PRINTF("Sent register cpumask, retval %d\n", rc);
+		if (rc < 0) {
+			printf("error sending register cpumask\n");
+			goto err;
+		}
+	}
+
+	if (tid) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      cmd_type, &tid, sizeof(__u32));
+		PRINTF("Sent pid/tgid, retval %d\n", rc);
+		if (rc < 0) {
+			printf("error sending tid/tgid cmd\n");
+			goto done;
+		}
+	}
+
+	do {
+		int i;
+
+		rep_len = recv(nl_sd, &msg, sizeof(msg), 0);
+		PRINTF("received %d bytes\n", rep_len);
+
+		if (rep_len < 0) {
+			printf("nonfatal reply error: errno %d\n", errno);
+			continue;
+		}
+		if (msg.n.nlmsg_type == NLMSG_ERROR ||
+		    !NLMSG_OK((&msg.n), rep_len)) {
+			printf("fatal reply error,  errno %d\n", errno);
+			goto done;
+		}
+
+		PRINTF("nlmsghdr size=%zu, nlmsg_len=%u, rep_len=%d\n",
+		       sizeof(struct nlmsghdr), msg.n.nlmsg_len, rep_len);
+
+
+		rep_len = GENLMSG_PAYLOAD(&msg.n);
+
+		na = (struct nlattr *) GENLMSG_DATA(&msg);
+		len = 0;
+		i = 0;
+		while (len < rep_len) {
+			len += NLA_ALIGN(na->nla_len);
+			switch (na->nla_type) {
+			case TASKSTATS_TYPE_AGGR_TGID:
+				/* Fall through */
+			case TASKSTATS_TYPE_AGGR_PID:
+				aggr_len = NLA_PAYLOAD(na->nla_len);
+				len2 = 0;
+				/* For nested attributes, na follows */
+				na = (struct nlattr *) NLA_DATA(na);
+				done = 0;
+				while (len2 < aggr_len) {
+					switch (na->nla_type) {
+					case TASKSTATS_TYPE_PID:
+						rtid = *(int *) NLA_DATA(na);
+						if (print_delays)
+							printf("PID\t%d\n", rtid);
+						break;
+					case TASKSTATS_TYPE_TGID:
+						rtid = *(int *) NLA_DATA(na);
+						if (print_delays)
+							printf("TGID\t%d\n", rtid);
+						break;
+					case TASKSTATS_TYPE_STATS:
+						count++;
+						if (print_delays)
+							print_delayacct((struct taskstats *) NLA_DATA(na));
+						if (fd) {
+							if (write(fd, NLA_DATA(na), na->nla_len) < 0) {
+								err(1,"write error\n");
+							}
+						}
+						if (!loop)
+							goto done;
+						break;
+					default:
+						printf("Unknown nested nla_type %d\n", na->nla_type);
+						break;
+					}
+					len2 += NLA_ALIGN(na->nla_len);
+					na = (struct nlattr *) ((char *) na + NLA_ALIGN(na->nla_len));
+				}
+				break;
+
+			default:
+				printf("Unknown nla_type %d\n", na->nla_type);
+				break;
+			}
+			na = (struct nlattr *) ((char *) GENLMSG_DATA(&msg) + len);
+		}
+	} while (loop);
+done:
+	if (maskset) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      TASKSTATS_CMD_ATTR_DEREGISTER_CPUMASK,
+			      &cpumask, sizeof(cpumask));
+		printf("Sent deregister mask, retval %d\n", rc);
+		if (rc < 0)
+			err(rc, "error sending deregister cpumask\n");
+	}
+err:
+	close(nl_sd);
+	if (fd)
+		close(fd);
+	return 0;
+}
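The nested-attribute loop in main() above walks a buffer of length-prefixed
attributes. A standalone sketch of such a walk is given below. It uses a
hypothetical struct attr_hdr and ATTR_ALIGN macro that only mirror the layout
of struct nlattr (16-bit length covering header and payload, 16-bit type,
4-byte alignment) so that it builds without kernel headers; real code would
use struct nlattr and NLA_ALIGN, or better, libnl. count_attrs() is likewise
an illustrative name, not something getdelays.c provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct nlattr: len covers header + payload. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Attributes start on 4-byte boundaries. */
#define ATTR_ALIGN(n)	(((size_t)(n) + 3) & ~(size_t)3)

/*
 * Count the attributes of the given type in buf[0..buflen).
 * Each step advances by the current attribute's own aligned length.
 * The walk stops early on a truncated or malformed attribute.
 */
static int count_attrs(const unsigned char *buf, size_t buflen, uint16_t type)
{
	size_t off = 0;
	int found = 0;

	while (buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));	/* avoid unaligned reads */
		if (hdr.len < sizeof(hdr) || hdr.len > buflen - off)
			break;
		if (hdr.type == type)
			found++;
		off += ATTR_ALIGN(hdr.len);
		if (off > buflen)
			break;
	}
	return found;
}
```

The length checks are there because a reply shorter than its headers claim
would otherwise send the walk past the end of the receive buffer.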
diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
new file mode 100644
index 0000000..92ebf29
--- /dev/null
+++ b/Documentation/accounting/taskstats.txt
@@ -0,0 +1,181 @@
+Per-task statistics interface
+-----------------------------
+
+
+Taskstats is a netlink-based interface for sending per-task and
+per-process statistics from the kernel to userspace.
+
+Taskstats was designed for the following benefits:
+
+- efficiently provide statistics during lifetime of a task and on its exit
+- unified interface for multiple accounting subsystems
+- extensibility for use by future accounting patches
+
+Terminology
+-----------
+
+"pid", "tid" and "task" are used interchangeably and refer to the standard
+Linux task defined by struct task_struct.  per-pid stats are the same as
+per-task stats.
+
+"tgid", "process" and "thread group" are used interchangeably and refer to the
+tasks that share an mm_struct i.e. the traditional Unix process. Despite the
+use of tgid, there is no special treatment for the task that is thread group
+leader - a process is deemed alive as long as it has any task belonging to it.
+
+Usage
+-----
+
+To get statistics during a task's lifetime, userspace opens a unicast netlink
+socket (NETLINK_GENERIC family) and sends commands specifying a pid or a tgid.
+The response contains statistics for a task (if pid is specified) or the sum of
+statistics for all tasks of the process (if tgid is specified).
+
+To obtain statistics for tasks which are exiting, the userspace listener
+sends a register command and specifies a cpumask. Whenever a task exits on
+one of the cpus in the cpumask, its per-pid statistics are sent to the
+registered listener. Using cpumasks limits the data received by any one
+listener and assists in flow control over the netlink interface; this is
+explained in more detail below.
+
+If the exiting task is the last thread exiting its thread group,
+an additional record containing the per-tgid stats is also sent to userspace.
+The latter contains the sum of per-pid stats for all threads in the thread
+group, both past and present.
+
+getdelays.c is a simple utility demonstrating usage of the taskstats interface
+for reporting delay accounting statistics. Users can register cpumasks,
+send commands and process responses, listen for per-tid/tgid exit data,
+write the data received to a file and do basic flow control by increasing
+receive buffer sizes.
+
+Interface
+---------
+
+The user-kernel interface is encapsulated in include/linux/taskstats.h
+
+To avoid this documentation becoming obsolete as the interface evolves, only
+an outline of the current version is given. taskstats.h always overrides the
+description here.
+
+struct taskstats is the common accounting structure for both per-pid and
+per-tgid data. It is versioned and can be extended by each accounting subsystem
+that is added to the kernel. The fields and their semantics are defined in the
+taskstats.h file.
+
+The data exchanged between user and kernel space is a netlink message belonging
+to the NETLINK_GENERIC family and using the netlink attributes interface.
+The messages are in the format
+
+    +----------+- - -+-------------+-------------------+
+    | nlmsghdr | Pad |  genlmsghdr | taskstats payload |
+    +----------+- - -+-------------+-------------------+
+
+
+The taskstats payload is one of the following three kinds:
+
+1. Commands: Sent from user to kernel. Commands to get data on
+a pid/tgid consist of one attribute, of type TASKSTATS_CMD_ATTR_PID/TGID,
+containing a u32 pid or tgid in the attribute payload. The pid/tgid denotes
+the task/process for which userspace wants statistics.
+
+Commands to register/deregister interest in exit data from a set of cpus
+consist of one attribute, of type
+TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK and contain a cpumask in the
+attribute payload. The cpumask is specified as an ascii string of
+comma-separated cpu ranges e.g. to listen to exit data from cpus 1,2,3,5,7,8
+the cpumask would be "1-3,5,7-8". If userspace forgets to deregister interest
+in cpus before closing the listening socket, the kernel cleans up its interest
+set over time. However, for the sake of efficiency, an explicit deregistration
+is advisable.
+
+2. Response for a command: sent from the kernel in response to a userspace
+command. The payload is a series of three attributes of type:
+
+a) TASKSTATS_TYPE_AGGR_PID/TGID: attribute containing no payload, indicating
+that a pid/tgid will be followed by some stats.
+
+b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
+is being returned.
+
+c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
+same structure is used for both per-pid and per-tgid stats.
+
+3. New message sent by kernel whenever a task exits. The payload consists of a
+   series of attributes of the following type:
+
+a) TASKSTATS_TYPE_AGGR_PID: indicates next two attributes will be pid+stats
+b) TASKSTATS_TYPE_PID: contains exiting task's pid
+c) TASKSTATS_TYPE_STATS: contains the exiting task's per-pid stats
+d) TASKSTATS_TYPE_AGGR_TGID: indicates next two attributes will be tgid+stats
+e) TASKSTATS_TYPE_TGID: contains tgid of process to which task belongs
+f) TASKSTATS_TYPE_STATS: contains the per-tgid stats for exiting task's process
+
+
+per-tgid stats
+--------------
+
+Taskstats provides per-process stats, in addition to per-task stats, since
+resource management is often done at a process granularity and aggregating task
+stats in userspace alone is inefficient and potentially inaccurate (due to lack
+of atomicity).
+
+However, maintaining per-process, in addition to per-task stats, within the
+kernel has space and time overheads. To address this, the taskstats code
+accumulates each exiting task's statistics into a process-wide data structure.
+When the last task of a process exits, the accumulated process-level data is
+also sent to userspace (along with the per-task data).
+
+When a user queries per-tgid data, the stats of all live threads in the group
+are summed and added to the accumulated total for previously exited threads of
+the same thread group.
+
+Extending taskstats
+-------------------
+
+There are two ways to extend the taskstats interface to export more
+per-task/process stats as patches to collect them get added to the kernel
+in future:
+
+1. Adding more fields to the end of the existing struct taskstats. Backward
+   compatibility is ensured by the version number within the
+   structure. Userspace will use only the fields of the struct that correspond
+   to the version it is using.
+
+2. Defining separate statistic structs and using the netlink attributes
+   interface to return them. Since userspace processes each netlink attribute
+   independently, it can always ignore attributes whose type it does not
+   understand (because it is using an older version of the interface).
+
+
+Choosing between 1. and 2. is a matter of trading off flexibility and
+overhead. If only a few fields need to be added, then 1. is the preferable
+path since the kernel and userspace don't need to incur the overhead of
+processing new netlink attributes. But if the new fields expand the existing
+struct too much, requiring disparate userspace accounting utilities to
+unnecessarily receive large structures whose fields are of no interest, then
+extending the attributes structure would be worthwhile.
+
+Flow control for taskstats
+--------------------------
+
+When the rate of task exits becomes large, a listener may not be able to keep
+up with the kernel's rate of sending per-tid/tgid exit data leading to data
+loss. This possibility gets compounded when the taskstats structure gets
+extended and the number of cpus grows large.
+
+To avoid losing statistics, userspace should do one or more of the following:
+
+- increase the receive buffer sizes for the netlink sockets opened by
+listeners to receive exit data.
+
+- create more listeners and reduce the number of cpus being listened to by
+each listener. In the extreme case, there could be one listener for each cpu.
+Users may also consider setting the cpu affinity of the listener to the subset
+of cpus to which it listens, especially if they are listening to just one cpu.
+
+If, despite these measures, userspace receives ENOBUFS errors indicating
+overflow of the receive buffers, it should take measures to handle the
+loss of data.
+
+----
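The cpumask string format described in taskstats.txt above ("1-3,5,7-8" for
cpus 1,2,3,5,7,8) can be produced with a small helper such as the one sketched
below. format_cpumask() is a hypothetical name, not part of the taskstats
interface or of getdelays.c, and the sketch assumes the caller passes the cpu
numbers sorted in ascending order without duplicates.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a sorted, duplicate-free list of cpu numbers as comma-separated
 * ranges. Returns 0 on success, -1 if buf is too small.
 */
static int format_cpumask(const int *cpus, size_t n, char *buf, size_t buflen)
{
	size_t used = 0;
	size_t i = 0;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	while (i < n) {
		size_t j = i;
		int w;

		/* Extend the range while the next cpu is consecutive. */
		while (j + 1 < n && cpus[j + 1] == cpus[j] + 1)
			j++;

		if (j > i)
			w = snprintf(buf + used, buflen - used, "%s%d-%d",
				     used ? "," : "", cpus[i], cpus[j]);
		else
			w = snprintf(buf + used, buflen - used, "%s%d",
				     used ? "," : "", cpus[i]);
		if (w < 0 || (size_t)w >= buflen - used)
			return -1;	/* output error or truncated */
		used += (size_t)w;
		i = j + 1;
	}
	return 0;
}
```

The resulting string, including its terminating NUL, is what would go into
the payload of a TASKSTATS_CMD_ATTR_REGISTER_CPUMASK attribute.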
diff --git a/Documentation/arm/IXP4xx b/Documentation/arm/IXP4xx
index d4c6d3a..43edb4e 100644
--- a/Documentation/arm/IXP4xx
+++ b/Documentation/arm/IXP4xx
@@ -85,7 +85,7 @@
 2) If > 64MB of memory space is required, the IXP4xx can be 
    configured to use indirect registers to access PCI This allows 
    for up to 128MB (0x48000000 to 0x4fffffff) of memory on the bus. 
-   The disadvantadge of this is that every PCI access requires 
+   The disadvantage of this is that every PCI access requires 
    three local register accesses plus a spinlock, but in some 
    cases the performance hit is acceptable. In addition, you cannot 
    mmap() PCI devices in this case due to the indirect nature
diff --git a/Documentation/arm/Samsung-S3C24XX/Overview.txt b/Documentation/arm/Samsung-S3C24XX/Overview.txt
index 8c6ee68..3e46d2a 100644
--- a/Documentation/arm/Samsung-S3C24XX/Overview.txt
+++ b/Documentation/arm/Samsung-S3C24XX/Overview.txt
@@ -7,11 +7,13 @@
 ------------
 
   The Samsung S3C24XX range of ARM9 System-on-Chip CPUs are supported
-  by the 's3c2410' architecture of ARM Linux. Currently the S3C2410 and
-  the S3C2440 are supported CPUs.
+  by the 's3c2410' architecture of ARM Linux. Currently the S3C2410,
+  S3C2440 and S3C2442 devices are supported.
 
   Support for the S3C2400 series is in progress.
 
+  Support for the S3C2412 and S3C2413 CPUs is being merged.
+
 
 Configuration
 -------------
@@ -43,9 +45,18 @@
 
     Samsung's own development board, geared for PDA work.
 
+  Samsung/Aiji SMDK2412
+
+    The S3C2412 version of the SMDK2440.
+
+  Samsung/Aiji SMDK2413
+
+    The S3C2413 version of the SMDK2440.
+
   Samsung/Meritech SMDK2440
 
-    The S3C2440 compatible version of the SMDK2440
+    The S3C2440 compatible version of the SMDK2440, which has the
+    option of an S3C2440 or S3C2442 CPU module.
 
   Thorcom VR1000
 
@@ -211,24 +222,6 @@
   Lucas Correia Villa Real (S3C2400 port)
 
 
-Document Changes
-----------------
-
-  05 Sep 2004 - BJD - Added Document Changes section
-  05 Sep 2004 - BJD - Added Klaus Fetscher to list of contributors
-  25 Oct 2004 - BJD - Added Dimitry Andric to list of contributors
-  25 Oct 2004 - BJD - Updated the MTD from the 2.6.9 merge
-  21 Jan 2005 - BJD - Added rx3715, added Shannon to contributors
-  10 Feb 2005 - BJD - Added Guillaume Gourat to contributors
-  02 Mar 2005 - BJD - Added SMDK2440 to list of machines
-  06 Mar 2005 - BJD - Added Christer Weinigel
-  08 Mar 2005 - BJD - Added LCVR to list of people, updated introduction
-  08 Mar 2005 - BJD - Added section on adding machines
-  09 Sep 2005 - BJD - Added section on platform data
-  11 Feb 2006 - BJD - Added I2C, RTC and Watchdog sections
-  11 Feb 2006 - BJD - Added Osiris machine, and S3C2400 information
-
-
 Document Author
 ---------------
 
diff --git a/Documentation/arm/Samsung-S3C24XX/S3C2412.txt b/Documentation/arm/Samsung-S3C24XX/S3C2412.txt
new file mode 100644
index 0000000..cb82a7f
--- /dev/null
+++ b/Documentation/arm/Samsung-S3C24XX/S3C2412.txt
@@ -0,0 +1,120 @@
+		S3C2412 ARM Linux Overview
+		==========================
+
+Introduction
+------------
+
+  The S3C2412 is part of the S3C24XX range of ARM9 System-on-Chip CPUs
+  from Samsung. This part has an ARM926-EJS core, capable of running up
+  to 266MHz (see data-sheet for more information).
+
+
+Clock
+-----
+
+  The core clock code provides a set of clocks to the drivers, and allows
+  for source selection and a number of other features.
+
+
+Power
+-----
+
+  No support for suspend/resume to RAM in the current system.
+
+
+DMA
+---
+
+  No current support for DMA.
+
+
+GPIO
+----
+
+  There is support for setting the GPIO to input/output/special function
+  and reading or writing to them.
+
+
+UART
+----
+
+  The UART hardware is similar to the S3C2440, and is supported by the
+  s3c2410 driver in the drivers/serial directory.
+
+
+NAND
+----
+
+  The NAND hardware is similar to the S3C2440, and is supported by the
+  s3c2410 driver in the drivers/mtd/nand directory.
+
+
+USB Host
+--------
+
+  The USB hardware is similar to the S3C2410, with extended clock source
+  control. The OHCI portion is supported by the ohci-s3c2410 driver, and
+  the clock control selection is supported by the core clock code.
+
+
+USB Device
+----------
+
+  No current support in the kernel.
+
+
+IRQs
+----
+
+  All the standard and external interrupt sources are supported. The
+  extra sub-sources are not yet supported.
+
+
+RTC
+---
+
+  The RTC hardware is similar to the S3C2410, and is supported by the
+  s3c2410-rtc driver.
+
+
+Watchdog
+--------
+
+  The watchdog hardware is the same as the S3C2410, and is supported by
+  the s3c2410_wdt driver.
+
+
+MMC/SD/SDIO
+-----------
+
+  No current support for the MMC/SD/SDIO block.
+
+IIC
+---
+
+  The IIC hardware is the same as the S3C2410, and is supported by the
+  i2c-s3c24xx driver.
+
+
+IIS
+---
+
+  No current support for the IIS interface.
+
+
+SPI
+---
+
+  No current support for the SPI interfaces.
+
+
+ATA
+---
+
+  No current support for the on-board ATA block.
+
+
+Document Author
+---------------
+
+Ben Dooks, (c) 2006 Simtec Electronics
diff --git a/Documentation/arm/Samsung-S3C24XX/S3C2413.txt b/Documentation/arm/Samsung-S3C24XX/S3C2413.txt
new file mode 100644
index 0000000..ab2a888
--- /dev/null
+++ b/Documentation/arm/Samsung-S3C24XX/S3C2413.txt
@@ -0,0 +1,21 @@
+		S3C2413 ARM Linux Overview
+		==========================
+
+Introduction
+------------
+
+  The S3C2413 is an extended version of the S3C2412, with a camera
+  interface and mobile DDR memory support. See the S3C2412 support
+  documentation for more information.
+
+
+Camera Interface
+----------------
+
+  This block is currently not supported.
+
+
+Document Author
+---------------
+
+Ben Dooks, (c) 2006 Simtec Electronics
diff --git a/Documentation/arm/Sharp-LH/ADC-LH7-Touchscreen b/Documentation/arm/Sharp-LH/ADC-LH7-Touchscreen
new file mode 100644
index 0000000..1e6a23f
--- /dev/null
+++ b/Documentation/arm/Sharp-LH/ADC-LH7-Touchscreen
@@ -0,0 +1,61 @@
+README on the ADC/Touchscreen Controller
+========================================
+
+The LH79524 and LH7A404 include a built-in Analog to Digital
+controller (ADC) that is used to process input from a touchscreen.
+The driver only implements a four-wire touch panel protocol.
+
+The touchscreen driver is maintenance free except for the pen-down or
+touch threshold.  Some resistive displays and board combinations may
+require tuning of this threshold.  The driver exposes some of its
+internal state in the sys filesystem.  If the kernel is configured
+with CONFIG_SYSFS and sysfs is mounted at /sys, there will be a
+directory
+
+  /sys/devices/platform/adc-lh7.0
+
+containing these files.
+
+  -r--r--r--    1 root     root         4096 Jan  1 00:00 samples
+  -rw-r--r--    1 root     root         4096 Jan  1 00:00 threshold
+  -r--r--r--    1 root     root         4096 Jan  1 00:00 threshold_range
+
+The threshold is the current touch threshold.  It defaults to 750 on
+most targets.
+
+  # cat threshold
+  750
+
+The threshold_range contains the range of valid values for the
+threshold.  Values outside of this range will be silently ignored.
+
+  # cat threshold_range
+  0 1023
+
+To change the threshold, write a value to the threshold file.
+
+  # echo 500 > threshold
+  # cat threshold
+  500
+
+The samples file contains the 12 most recently sampled values from the
+ADC.  The values below are typical of the last samples taken after the
+pen has been released.  The first two and last two samples are for
+detecting whether or not the pen is down.  The third through sixth are
+X coordinate samples.  The seventh through tenth are Y coordinate
+samples.
+
+  # cat samples
+  1023 1023 0 0 0 0 530 529 530 529 1023 1023
+
+To determine a reasonable threshold, press on the touch panel with an
+appropriate stylus and read the values from samples.
+
+  # cat samples
+  1023 676 92 103 101 102 855 919 922 922 1023 679
+
+The first and eleventh samples are discarded.  Thus, the important
+values are the second and twelfth which are used to determine if the
+pen is down.  When both are below the threshold, the driver registers
+that the pen is down.  When either is above the threshold, it
+registers that the pen is up.
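The pen-down rule described in ADC-LH7-Touchscreen above can be sketched as a
small predicate. This is an illustration only, not the driver's code:
ADC_NSAMPLES and pen_is_down() are hypothetical names, and a sample exactly
equal to the threshold is treated as "not below", which the README does not
specify.

```c
#define ADC_NSAMPLES 12

/*
 * Returns 1 if the pen is considered down, 0 otherwise.
 * The second and twelfth samples (indices 1 and 11) decide the state;
 * the first and eleventh samples (indices 0 and 10) are discarded.
 */
static int pen_is_down(const int samples[ADC_NSAMPLES], int threshold)
{
	return samples[1] < threshold && samples[11] < threshold;
}
```

Fed the two sample lines from the README with the default threshold of 750,
the predicate reports pen up for the released case (1023 and 1023) and pen
down for the pressed case (676 and 679). With the threshold lowered to 500,
the pressed case would no longer register, which is why the threshold may
need tuning per panel.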
diff --git a/Documentation/arm/Sharp-LH/LCDPanels b/Documentation/arm/Sharp-LH/LCDPanels
new file mode 100644
index 0000000..fb1b21c
--- /dev/null
+++ b/Documentation/arm/Sharp-LH/LCDPanels
@@ -0,0 +1,59 @@
+README on the LCD Panels
+========================
+
+Configuration options for several LCD panels, available from Logic PD,
+are included in the kernel source.  This README will help you
+understand the configuration data and give you some guidance for
+adding support for other panels if you wish.
+
+
+lcd-panels.h
+------------
+
+There is no way, at present, to detect which panel is attached to the
+system at runtime.  Thus the kernel configuration is static.  The file
+arch/arm/mach-lh7a40x/lcd-panels.h (or similar) defines all of the
+panel specific parameters.
+
+It should be possible for this data to be shared among several device
+families.  The current layout may be insufficiently general, but it is
+amenable to improvement.
+
+
+PIXEL_CLOCK
+-----------
+
+The panel data sheets will give a range of acceptable pixel clocks.
+The fundamental LCDCLK input frequency is divided down by a PCD
+constant in field '.tim2'.  It may happen that it is impossible to set
+the pixel clock within this range.  A clock which is too slow will
+tend to flicker.  For the highest quality image, set the clock as high
+as possible.
+
+
+MARGINS
+-------
+
+These values may be difficult to glean from the panel data sheet.  In
+the case of the Sharp panels, the upper margin is explicitly called
+out as a specific number of lines from the top of the frame.  The
+other values may not matter as much as the panels tend to
+automatically center the image.
+
+
+Sync Sense
+----------
+
+The sense of the hsync and vsync pulses may be called out in the data
+sheet.  On one panel, the sense of these pulses determines the height
+of the visible region on the panel.  Most of the Sharp panels use
+negative sense sync pulses set by the TIM2_IHS and TIM2_IVS bits in
+'.tim2'.
+
+
+Pel Layout
+----------
+
+The Sharp color TFT panels are all configured for 16 bit direct color
+modes.  The amba-lcd driver sets the pel mode to 565 for 5 bits each
+of red and blue and 6 bits of green.
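The 565 pel layout mentioned in LCDPanels above packs one pixel into 16 bits.
A minimal sketch of such packing follows. pack_rgb565() is a hypothetical
helper, and the bit order shown (red in the top five bits, blue in the bottom
five) is an assumption: the README gives only the bit widths, and the actual
channel order depends on how the controller is configured.

```c
#include <stdint.h>

/*
 * Pack 8-bit-per-channel colour into a 16-bit 565 pel:
 * bits 15-11 red, bits 10-5 green, bits 4-0 blue (assumed order).
 * The low-order bits of each channel are dropped.
 */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
	return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Green gets the extra bit, so pure green is 0x07E0 while pure red and pure blue
are 0xF800 and 0x001F respectively under this assumed order.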
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 23a1c24..2a63d56 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -157,13 +157,13 @@
 	smp_mb__before_atomic_dec();
 	atomic_dec(&obj->ref_count);
 
-It makes sure that all memory operations preceeding the atomic_dec()
+It makes sure that all memory operations preceding the atomic_dec()
 call are strongly ordered with respect to the atomic counter
-operation.  In the above example, it guarentees that the assignment of
+operation.  In the above example, it guarantees that the assignment of
 "1" to obj->dead will be globally visible to other cpus before the
 atomic counter decrement.
 
-Without the explicitl smp_mb__before_atomic_dec() call, the
+Without the explicit smp_mb__before_atomic_dec() call, the
 implementation could legally allow the atomic counter update visible
 to other cpus before the "obj->dead = 1;" assignment.
 
@@ -173,11 +173,11 @@
 (smp_mb__{before,after}_atomic_inc()).
 
 A missing memory barrier in the cases where they are required by the
-atomic_t implementation above can have disasterous results.  Here is
-an example, which follows a pattern occuring frequently in the Linux
+atomic_t implementation above can have disastrous results.  Here is
+an example, which follows a pattern occurring frequently in the Linux
 kernel.  It is the use of atomic counters to implement reference
 counting, and it works such that once the counter falls to zero it can
-be guarenteed that no other entity can be accessing the object:
+be guaranteed that no other entity can be accessing the object:
 
 static void obj_list_add(struct obj *obj)
 {
@@ -291,9 +291,9 @@
 size.  The endianness of the bits within each "unsigned long" are the
 native endianness of the cpu.
 
-	void set_bit(unsigned long nr, volatils unsigned long *addr);
-	void clear_bit(unsigned long nr, volatils unsigned long *addr);
-	void change_bit(unsigned long nr, volatils unsigned long *addr);
+	void set_bit(unsigned long nr, volatile unsigned long *addr);
+	void clear_bit(unsigned long nr, volatile unsigned long *addr);
+	void change_bit(unsigned long nr, volatile unsigned long *addr);
 
 These routines set, clear, and change, respectively, the bit number
 indicated by "nr" on the bit mask pointed to by "ADDR".
@@ -301,9 +301,9 @@
 They must execute atomically, yet there are no implicit memory barrier
 semantics required of these interfaces.
 
-	int test_and_set_bit(unsigned long nr, volatils unsigned long *addr);
-	int test_and_clear_bit(unsigned long nr, volatils unsigned long *addr);
-	int test_and_change_bit(unsigned long nr, volatils unsigned long *addr);
+	int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
+	int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
+	int test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
 
 Like the above, except that these routines return a boolean which
 indicates whether the changed bit was set _BEFORE_ the atomic bit
@@ -335,7 +335,7 @@
 		/* ... */;
 	obj->killed = 1;
 
-The implementation of test_and_set_bit() must guarentee that
+The implementation of test_and_set_bit() must guarantee that
 "obj->dead = 1;" is visible to cpus before the atomic memory operation
 done by test_and_set_bit() becomes visible.  Likewise, the atomic
 memory operation done by test_and_set_bit() must become visible before
@@ -474,7 +474,7 @@
 strictly orders all subsequent memory operations (including
 the cas()) with respect to itself, things will be fine.
 
-Said another way, _atomic_dec_and_lock() must guarentee that
+Said another way, _atomic_dec_and_lock() must guarantee that
 a counter dropping to zero is never made visible before the
 spinlock being acquired.
 
diff --git a/Documentation/console/console.txt b/Documentation/console/console.txt
new file mode 100644
index 0000000..d3e1744
--- /dev/null
+++ b/Documentation/console/console.txt
@@ -0,0 +1,144 @@
+Console Drivers
+===============
+
+The Linux kernel has 2 general types of console drivers.  The first type is
+assigned by the kernel to all the virtual consoles during the boot process.
+This type will be called 'system driver', and only one system driver is allowed
+to exist. The system driver is persistent and it can never be unloaded, though
+it may become inactive.
+
+The second type has to be explicitly loaded and unloaded. This will be called
+'modular driver' by this document. Multiple modular drivers can coexist at
+any time with each driver sharing the console with other drivers including
+the system driver. However, modular drivers cannot take over the console
+that is currently occupied by another modular driver. (Exception: Drivers that
+call take_over_console() will succeed in the takeover regardless of the type
+of driver occupying the consoles.) They can only take over the console that is
+occupied by the system driver. By the same token, if the modular driver is
+released by the console, the system driver will take over.
+
+Modular drivers, from the programmer's point of view, have to call:
+
+	 take_over_console() - load and bind driver to console layer
+	 give_up_console() - unbind and unload driver
+
+In newer kernels, the following are also available:
+
+	 register_con_driver()
+	 unregister_con_driver()
+
+If sysfs is enabled, the contents of /sys/class/vtconsole can be
+examined. This shows the console backends currently registered by the
+system which are named vtcon<n> where <n> is an integer from 0 to 15. Thus:
+
+       ls /sys/class/vtconsole
+       .  ..  vtcon0  vtcon1
+
+Each directory in /sys/class/vtconsole has 3 files:
+
+     ls /sys/class/vtconsole/vtcon0
+     .  ..  bind  name  uevent
+
+What do these files signify?
+
+     1. bind - this is a read/write file. It shows the status of the driver if
+        read, or acts to bind or unbind the driver to the virtual consoles
+        when written to. The possible values are:
+
+	0 - means the driver is not bound and if echo'ed, commands the driver
+	    to unbind
+
+        1 - means the driver is bound and if echo'ed, commands the driver to
+	    bind
+
+     2. name - read-only file. Shows the name of the driver in this format:
+
+	cat /sys/class/vtconsole/vtcon0/name
+	(S) VGA+
+
+	    '(S)' stands for a (S)ystem driver, i.e., it cannot be directly
+	    commanded to bind or unbind
+
+	    'VGA+' is the name of the driver
+
+	cat /sys/class/vtconsole/vtcon1/name
+	(M) frame buffer device
+
+	    In this case, '(M)' stands for a (M)odular driver, one that can be
+	    directly commanded to bind or unbind.
+
+     3. uevent - ignore this file
+
+When unbinding, the modular driver is detached first, and then the system
+driver takes over the consoles vacated by the driver. Binding, on the other
+hand, will bind the driver to the consoles that are currently occupied by a
+system driver.
+
+NOTE1: Binding and unbinding must be selected in Kconfig. It's under:
+
+Device Drivers -> Character devices -> Support for binding and unbinding
+console drivers
+
+NOTE2: If any of the virtual consoles are in KD_GRAPHICS mode, then binding or
+unbinding will not succeed. An example of an application that sets the console
+to KD_GRAPHICS is X.
+
+How useful is this feature? This is very useful for console driver
+developers. By unbinding the driver from the console layer, one can unload the
+driver, make changes, recompile, reload and rebind the driver without any need
+for rebooting the kernel. For regular users who may want to switch from
+framebuffer console to VGA console and vice versa, this feature also makes
+this possible. (NOTE NOTE NOTE: Please read fbcon.txt under Documentation/fb
+for more details).
+
+Notes for developers:
+=====================
+
+take_over_console() is now broken up into:
+
+     register_con_driver()
+     bind_con_driver() - private function
+
+give_up_console() is a wrapper to unregister_con_driver(), and a driver must
+be fully unbound for this call to succeed. con_is_bound() will check if the
+driver is bound or not.
+
+Guidelines for console driver writers:
+=====================================
+
+In order for binding to and unbinding from the console to properly work,
+console drivers must follow these guidelines:
+
+1. All drivers, except system drivers, must call either register_con_driver()
+   or take_over_console(). register_con_driver() will just add the driver to
+   the console's internal list. It won't take over the
+   console. take_over_console(), as its name implies, will also take over (or
+   bind to) the console.
+
+2. All resources allocated during con->con_init() must be released in
+   con->con_deinit().
+
+3. All resources allocated in con->con_startup() must be released when the
+   driver, which was previously bound, becomes unbound.  The console layer
+   does not have a complementary call to con->con_startup() so it's up to the
+   driver to check when it's legal to release these resources. Calling
+   con_is_bound() in con->con_deinit() will help.  If the call returns
+   false, then it's safe to release the resources.  This balance has to be
+   ensured because con->con_startup() can be called again when a request to
+   rebind the driver to the console arrives.
+
+4. Upon exit of the driver, ensure that the driver is totally unbound. If the
+   condition is satisfied, then the driver must call unregister_con_driver()
+   or give_up_console().
+
+5. unregister_con_driver() can also be called on conditions which make it
+   impossible for the driver to service console requests.  This can happen
+   with the framebuffer console that suddenly lost all of its drivers.
+
+The current crop of console drivers should still work correctly, but binding
+and unbinding them may cause problems. With minimal fixes, these drivers can
+be made to work correctly.
+
+==========================
+Antonino Daplas <adaplas@pol.net>
+
diff --git a/Documentation/devices.txt b/Documentation/devices.txt
index b369a8c..4aaf68f 100644
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -3,7 +3,7 @@
 
 	     Maintained by Torben Mathiasen <device@lanana.org>
 
-		      Last revised: 25 January 2005
+		      Last revised: 15 May 2006
 
 This list is the Linux Device List, the official registry of allocated
 device numbers and /dev directory nodes for the Linux operating
@@ -94,7 +94,6 @@
 		  9 = /dev/urandom	Faster, less secure random number gen.
 		 10 = /dev/aio		Asyncronous I/O notification interface
 		 11 = /dev/kmsg		Writes to this come out as printk's
-		 12 = /dev/oldmem	Access to crash dump from kexec kernel
   1 block	RAM disk
 		  0 = /dev/ram0		First RAM disk
 		  1 = /dev/ram1		Second RAM disk
@@ -262,13 +261,13 @@
 		NOTE: These devices permit both read and write access.
 
   7 block	Loopback devices
-		  0 = /dev/loop0	First loopback device
-		  1 = /dev/loop1	Second loopback device
+		  0 = /dev/loop0	First loop device
+		  1 = /dev/loop1	Second loop device
 		    ...
 
-		The loopback devices are used to mount filesystems not
+		The loop devices are used to mount filesystems not
 		associated with block devices.	The binding to the
-		loopback devices is handled by mount(8) or losetup(8).
+		loop devices is handled by mount(8) or losetup(8).
 
   8 block	SCSI disk devices (0-15)
 		  0 = /dev/sda		First SCSI disk whole disk
@@ -943,7 +942,7 @@
 		240 = /dev/ftlp		FTL on 16th Memory Technology Device 
 
 		Partitions are handled in the same way as for IDE
-		disks (see major number 3) expect that the partition
+		disks (see major number 3) except that the partition
 		limit is 15 rather than 63 per disk (same as SCSI.)
 
  45 char	isdn4linux ISDN BRI driver
@@ -1168,7 +1167,7 @@
 		The filename of the encrypted container and the passwords
 		are sent via ioctls (using the sdmount tool) to the master
 		node which then activates them via one of the
-		/dev/scramdisk/x nodes for loopback mounting (all handled
+		/dev/scramdisk/x nodes for loop mounting (all handled
 		through the sdmount tool).
 
 		Requested by: andy@scramdisklinux.org
@@ -2538,18 +2537,32 @@
 		  0 = /dev/usb/lp0	First USB printer
 		    ...
 		 15 = /dev/usb/lp15	16th USB printer
-		 16 = /dev/usb/mouse0	First USB mouse
-		    ...
-		 31 = /dev/usb/mouse15	16th USB mouse
-		 32 = /dev/usb/ez0	First USB firmware loader
-		    ...
-		 47 = /dev/usb/ez15	16th USB firmware loader
 		 48 = /dev/usb/scanner0	First USB scanner
 		    ...
 		 63 = /dev/usb/scanner15 16th USB scanner
 		 64 = /dev/usb/rio500	Diamond Rio 500
 		 65 = /dev/usb/usblcd	USBLCD Interface (info@usblcd.de)
 		 66 = /dev/usb/cpad0	Synaptics cPad (mouse/LCD)
+		 96 = /dev/usb/hiddev0	1st USB HID device
+		    ...
+		111 = /dev/usb/hiddev15	16th USB HID device
+		112 = /dev/usb/auer0	1st auerswald ISDN device
+		    ...
+		127 = /dev/usb/auer15	16th auerswald ISDN device
+		128 = /dev/usb/brlvgr0	First Braille Voyager device
+		    ...
+		131 = /dev/usb/brlvgr3	Fourth Braille Voyager device
+		132 = /dev/usb/idmouse	ID Mouse (fingerprint scanner) device
+		133 = /dev/usb/sisusbvga1	First SiSUSB VGA device
+		    ...
+		140 = /dev/usb/sisusbvga8	Eighth SiSUSB VGA device
+		144 = /dev/usb/lcd	USB LCD device
+		160 = /dev/usb/legousbtower0	1st USB Legotower device
+		    ...
+		175 = /dev/usb/legousbtower15	16th USB Legotower device
+		240 = /dev/usb/dabusb0	First dabusb device
+		    ...
+		243 = /dev/usb/dabusb3	Fourth dabusb device
 
 180 block	USB block devices
 		0 = /dev/uba		First USB block device
@@ -2710,6 +2723,17 @@
 		  1 = /dev/cpu/1/msr		MSRs on CPU 1
 		    ...
 
+202 block	Xen Virtual Block Device
+		  0 = /dev/xvda       First Xen VBD whole disk
+		  16 = /dev/xvdb      Second Xen VBD whole disk
+		  32 = /dev/xvdc      Third Xen VBD whole disk
+		    ...
+		  240 = /dev/xvdp     Sixteenth Xen VBD whole disk
+
+                Partitions are handled in the same way as for IDE
+                disks (see major number 3) except that the limit on
+                partitions is 15.
+
 203 char	CPU CPUID information
 		  0 = /dev/cpu/0/cpuid		CPUID on CPU 0
 		  1 = /dev/cpu/1/cpuid		CPUID on CPU 1
@@ -2747,11 +2771,27 @@
 		 46 = /dev/ttyCPM0		PPC CPM (SCC or SMC) - port 0
 		    ...
 		 47 = /dev/ttyCPM5		PPC CPM (SCC or SMC) - port 5
-		 50 = /dev/ttyIOC40		Altix serial card
+		 50 = /dev/ttyIOC0		Altix serial card
 		    ...
-		 81 = /dev/ttyIOC431		Altix serial card
-		 82 = /dev/ttyVR0               NEC VR4100 series SIU
-		 83 = /dev/ttyVR1               NEC VR4100 series DSIU
+		 81 = /dev/ttyIOC31		Altix serial card
+		 82 = /dev/ttyVR0		NEC VR4100 series SIU
+		 83 = /dev/ttyVR1		NEC VR4100 series DSIU
+		 84 = /dev/ttyIOC84		Altix ioc4 serial card
+		    ...
+		 115 = /dev/ttyIOC115		Altix ioc4 serial card
+		 116 = /dev/ttySIOC0		Altix ioc3 serial card
+		    ...
+		 147 = /dev/ttySIOC31		Altix ioc3 serial card
+		 148 = /dev/ttyPSC0		PPC PSC - port 0
+		    ...
+		 153 = /dev/ttyPSC5		PPC PSC - port 5
+		 154 = /dev/ttyAT0		ATMEL serial port 0
+		    ...
+		 169 = /dev/ttyAT15		ATMEL serial port 15
+		 170 = /dev/ttyNX0		Hilscher netX serial port 0
+		    ...
+		 185 = /dev/ttyNX15		Hilscher netX serial port 15
+		 186 = /dev/ttyJ0		JTAG1 DCC protocol based serial port emulation
 
 205 char	Low-density serial ports (alternate device)
 		  0 = /dev/culu0		Callout device for ttyLU0
@@ -2786,8 +2826,8 @@
 		 50 = /dev/cuioc40		Callout device for ttyIOC40
 		    ...
 		 81 = /dev/cuioc431		Callout device for ttyIOC431
-		 82 = /dev/cuvr0                Callout device for ttyVR0
-		 83 = /dev/cuvr1                Callout device for ttyVR1
+		 82 = /dev/cuvr0		Callout device for ttyVR0
+		 83 = /dev/cuvr1		Callout device for ttyVR1
 
 
 206 char	OnStream SC-x0 tape devices
@@ -2897,7 +2937,6 @@
 		    ...
 		196 = /dev/dvb/adapter3/video0    first video decoder of fourth card
 
-
 216 char	Bluetooth RFCOMM TTY devices
 		  0 = /dev/rfcomm0		First Bluetooth RFCOMM TTY device
 		  1 = /dev/rfcomm1		Second Bluetooth RFCOMM TTY device
@@ -3002,12 +3041,43 @@
 		ioctl()'s can be used to rewind the tape regardless of
 		the device used to access it.
 
-231 char	InfiniBand MAD
+231 char	InfiniBand
 		0 = /dev/infiniband/umad0
 		1 = /dev/infiniband/umad1
-		 ...
+		  ...
+		63 = /dev/infiniband/umad63    64th InfiniBand MAD device
+		64 = /dev/infiniband/issm0     First InfiniBand IsSM device
+		65 = /dev/infiniband/issm1     Second InfiniBand IsSM device
+		  ...
+		127 = /dev/infiniband/issm63    64th InfiniBand IsSM device
+		128 = /dev/infiniband/uverbs0   First InfiniBand verbs device
+		129 = /dev/infiniband/uverbs1   Second InfiniBand verbs device
+		  ...
+		159 = /dev/infiniband/uverbs31  32nd InfiniBand verbs device
 
-232-239		UNASSIGNED
+232 char	Biometric Devices
+		0 = /dev/biometric/sensor0/fingerprint	first fingerprint sensor on first device
+		1 = /dev/biometric/sensor0/iris		first iris sensor on first device
+		2 = /dev/biometric/sensor0/retina	first retina sensor on first device
+		3 = /dev/biometric/sensor0/voiceprint	first voiceprint sensor on first device
+		4 = /dev/biometric/sensor0/facial	first facial sensor on first device
+		5 = /dev/biometric/sensor0/hand		first hand sensor on first device
+		  ...
+		10 = /dev/biometric/sensor1/fingerprint	first fingerprint sensor on second device
+		  ...
+		20 = /dev/biometric/sensor2/fingerprint	first fingerprint sensor on third device
+		  ...
+
+233 char	PathScale InfiniPath interconnect
+		0 = /dev/ipath        Primary device for programs (any unit)
+		1 = /dev/ipath0       Access specifically to unit 0
+		2 = /dev/ipath1       Access specifically to unit 1
+		  ...
+		4 = /dev/ipath3       Access specifically to unit 3
+		129 = /dev/ipath_sma    Device used by Subnet Management Agent
+		130 = /dev/ipath_diag   Device used by diagnostics programs
+
+234-239		UNASSIGNED
 
 240-254 char	LOCAL/EXPERIMENTAL USE
 240-254 block	LOCAL/EXPERIMENTAL USE
@@ -3021,6 +3091,28 @@
 		This major is reserved to assist the expansion to a
 		larger number space.  No device nodes with this major
 		should ever be created on the filesystem.
+		(This is probably not true anymore, but I'll leave it
+		for now /Torben)
+
+---LARGE MAJORS!!!!!---
+
+256 char	Equinox SST multi-port serial boards
+		   0 = /dev/ttyEQ0	First serial port on first Equinox SST board
+		 127 = /dev/ttyEQ127	Last serial port on first Equinox SST board
+		 128 = /dev/ttyEQ128	First serial port on second Equinox SST board
+		  ...
+		1027 = /dev/ttyEQ1027	Last serial port on eighth Equinox SST board
+
+256 block	Resident Flash Disk Flash Translation Layer
+		  0 = /dev/rfda		First RFD FTL layer
+		 16 = /dev/rfdb		Second RFD FTL layer
+		  ...
+		240 = /dev/rfdp		16th RFD FTL layer
+
+257 char	Phoenix Technologies Cryptographic Services Driver
+		  0 = /dev/ptlsec	Crypto Services Driver
+
+
 
  ****	ADDITIONAL /dev DIRECTORY ENTRIES
 
diff --git a/Documentation/digiepca.txt b/Documentation/digiepca.txt
index 88820fe..f2560e2 100644
--- a/Documentation/digiepca.txt
+++ b/Documentation/digiepca.txt
@@ -2,7 +2,7 @@
 http://www.digi.com for PCI cards.  They no longer maintain this driver,
 and have no 2.6 driver for ISA cards.
 
-This driver requires a number of user-space tools.  They can be aquired from
+This driver requires a number of user-space tools.  They can be acquired from
 http://www.digi.com, but only works with 2.4 kernels.
 
 
diff --git a/Documentation/driver-model/overview.txt b/Documentation/driver-model/overview.txt
index ac4a7a7..2050c9f 100644
--- a/Documentation/driver-model/overview.txt
+++ b/Documentation/driver-model/overview.txt
@@ -18,7 +18,7 @@
 (sometimes just a list) for the devices they control. There wasn't any
 uniformity across the different bus types.
 
-The current driver model provides a comon, uniform data model for describing
+The current driver model provides a common, uniform data model for describing
 a bus and the devices that can appear under the bus. The unified bus
 model includes a set of common attributes which all busses carry, and a set
 of common callbacks, such as device discovery during bus probing, bus
diff --git a/Documentation/drivers/edac/edac.txt b/Documentation/drivers/edac/edac.txt
index 70d96a6..7b3d969 100644
--- a/Documentation/drivers/edac/edac.txt
+++ b/Documentation/drivers/edac/edac.txt
@@ -35,15 +35,14 @@
 to generate parity.  Some vendors do not do this, and thus the parity bit
 can "float" giving false positives.
 
-The PCI Parity EDAC device has the ability to "skip" known flaky
-cards during the parity scan. These are set by the parity "blacklist"
-interface in the sysfs for PCI Parity. (See the PCI section in the sysfs
-section below.) There is also a parity "whitelist" which is used as
-an explicit list of devices to scan, while the blacklist is a list
-of devices to skip.
+[There are patches in the kernel queue which will allow for storage of
+quirks of PCI devices reporting false parity positives. The 2.6.18
+kernel should have those patches included. When that becomes available,
+then EDAC will be patched to utilize that information to "skip" such
+devices.]
 
-EDAC will have future error detectors that will be added or integrated
-into EDAC in the following list:
+EDAC will have future error detectors that will be integrated with
+EDAC or added to it, in the following list:
 
 	MCE	Machine Check Exception
 	MCA	Machine Check Architecture
@@ -93,22 +92,24 @@
 there currently reside 2 'edac' components:
 
 	mc	memory controller(s) system
-	pci	PCI status system
+	pci	PCI control and status system
 
 
 ============================================================================
 Memory Controller (mc) Model
 
 First a background on the memory controller's model abstracted in EDAC.
-Each mc device controls a set of DIMM memory modules. These modules are
+Each 'mc' device controls a set of DIMM memory modules. These modules are
 laid out in a Chip-Select Row (csrowX) and Channel table (chX). There can
-be multiple csrows and two channels.
+be multiple csrows and multiple channels.
 
 Memory controllers allow for several csrows, with 8 csrows being a typical value.
 Yet, the actual number of csrows depends on the electrical "loading"
 of a given motherboard, memory controller and DIMM characteristics.
 
 Dual channels allows for 128 bit data transfers to the CPU from memory.
+Some newer chipsets allow for more than 2 channels, like Fully Buffered DIMMs
+(FB-DIMMs). The following example will assume 2 channels:
 
 
 		Channel 0	Channel 1
@@ -234,23 +235,15 @@
 	The time period, in milliseconds, for polling for error information.
 	Too small a value wastes resources.  Too large a value might delay
 	necessary handling of errors and might loose valuable information for
-	locating the error.  1000 milliseconds (once each second) is about
-	right for most uses.
+	locating the error.  1000 milliseconds (once each second) is the current
+	default. Systems which require all the bandwidth they can get may
+	increase this.
 
 	LOAD TIME: module/kernel parameter: poll_msec=[0|1]
 
 	RUN TIME: echo "1000" >/sys/devices/system/edac/mc/poll_msec
 
 
-Module Version read-only attribute file:
-
-	'mc_version'
-
-	The EDAC CORE module's version and compile date are shown here to
-	indicate what EDAC is running.
-
-
-
 ============================================================================
 'mcX' DIRECTORIES
 
@@ -284,35 +277,6 @@
 
 
 
-DIMM capability attribute file:
-
-	'edac_capability'
-
-	The EDAC (Error Detection and Correction) capabilities/modes of
-	the memory controller hardware.
-
-
-DIMM Current Capability attribute file:
-
-	'edac_current_capability'
-
-	The EDAC capabilities available with the hardware
-	configuration.  This may not be the same as "EDAC capability"
-	if the correct memory is not used.  If a memory controller is
-	capable of EDAC, but DIMMs without check bits are in use, then
-	Parity, SECDED, S4ECD4ED capabilities will not be available
-	even though the memory controller might be capable of those
-	modes with the proper memory loaded.
-
-
-Memory Type supported on this controller attribute file:
-
-	'supported_mem_type'
-
-	This attribute file displays the memory type, usually
-	buffered and unbuffered DIMMs.
-
-
 Memory Controller name attribute file:
 
 	'mc_name'
@@ -321,16 +285,6 @@
 	that is being utilized.
 
 
-Memory Controller Module name attribute file:
-
-	'module_name'
-
-	This attribute file displays the memory controller module name,
-	version and date built.  The name of the memory controller
-	hardware - some drivers work with multiple controllers and
-	this field shows which hardware is present.
-
-
 Total memory managed by this memory controller attribute file:
 
 	'size_mb'
@@ -432,6 +386,9 @@
 
 	This attribute file will display what type of memory is currently
 	on this csrow. Normally, either buffered or unbuffered memory.
+	Examples:
+		Registered-DDR
+		Unbuffered-DDR
 
 
 EDAC Mode of operation attribute file:
@@ -446,8 +403,13 @@
 
 	'dev_type'
 
-	This attribute file will display what type of DIMM device is
-	being utilized. Example:  x4
+	This attribute file will display what type of DRAM device is
+	being utilized on this DIMM.
+	Examples:
+		x1
+		x2
+		x4
+		x8
 
 
 Channel 0 CE Count attribute file:
@@ -522,10 +484,10 @@
 If logging for UEs and CEs are enabled then system logs will have
 error notices indicating errors that have been detected:
 
-MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
+EDAC MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
 channel 1 "DIMM_B1": amd76x_edac
 
-MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
+EDAC MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
 channel 1 "DIMM_B1": amd76x_edac
 
 
@@ -610,64 +572,4 @@
 
 
 
-PCI Device Whitelist:
-
-	'pci_parity_whitelist'
-
-	This control file allows for an explicit list of PCI devices to be
-	scanned for parity errors. Only devices found on this list will
-	be examined.  The list is a line of hexadecimal VENDOR and DEVICE
-	ID tuples:
-
-	1022:7450,1434:16a6
-
-	One or more can be inserted, separated by a comma.
-
-	To write the above list doing the following as one command line:
-
-	echo "1022:7450,1434:16a6"
-		> /sys/devices/system/edac/pci/pci_parity_whitelist
-
-
-
-	To display what the whitelist is, simply 'cat' the same file.
-
-
-PCI Device Blacklist:
-
-	'pci_parity_blacklist'
-
-	This control file allows for a list of PCI devices to be
-	skipped for scanning.
-	The list is a line of hexadecimal VENDOR and DEVICE ID tuples:
-
-	1022:7450,1434:16a6
-
-	One or more can be inserted, separated by a comma.
-
-	To write the above list doing the following as one command line:
-
-	echo "1022:7450,1434:16a6"
-		> /sys/devices/system/edac/pci/pci_parity_blacklist
-
-
-	To display what the whitelist currently contains,
-	simply 'cat' the same file.
-
 =======================================================================
-
-PCI Vendor and Devices IDs can be obtained with the lspci command. Using
-the -n option lspci will display the vendor and device IDs. The system
-administrator will have to determine which devices should be scanned or
-skipped.
-
-
-
-The two lists (white and black) are prioritized. blacklist is the lower
-priority and will NOT be utilized when a whitelist has been set.
-Turn OFF a whitelist by an empty echo command:
-
-	echo > /sys/devices/system/edac/pci/pci_parity_whitelist
-
-and any previous blacklist will be utilized.
-
diff --git a/Documentation/fb/fbcon.txt b/Documentation/fb/fbcon.txt
index 08dce0f..f373df1 100644
--- a/Documentation/fb/fbcon.txt
+++ b/Documentation/fb/fbcon.txt
@@ -135,10 +135,10 @@
 
 	The angle can be changed anytime afterwards by 'echoing' the same
 	numbers to any one of the 2 attributes found in
-	/sys/class/graphics/fb{x}
+	 /sys/class/graphics/fbcon
 
-		con_rotate     - rotate the display of the active console
-		con_rotate_all - rotate the display of all consoles
+		rotate     - rotate the display of the active console
+		rotate_all - rotate the display of all consoles
 
 	Console rotation will only become available if Console Rotation
 	Support is compiled in your kernel.
@@ -148,5 +148,177 @@
 	Actually, the underlying fb driver is totally ignorant of console
 	rotation.
 
----
+C. Attaching, Detaching and Unloading
+
+Before going into how to attach, detach and unload the framebuffer console, an
+illustration of the dependencies may help.
+
+The console layer, as with most subsystems, needs a driver that interfaces with
+the hardware. Thus, in a VGA console:
+
+console ---> VGA driver ---> hardware.
+
+Assuming the VGA driver can be unloaded, one must first unbind the VGA driver
+from the console layer before unloading the driver.  The VGA driver cannot be
+unloaded if it is still bound to the console layer. (See
+Documentation/console/console.txt for more information).
+
+This is more complicated in the case of the framebuffer console (fbcon),
+because fbcon is an intermediate layer between the console and the drivers:
+
+console ---> fbcon ---> fbdev drivers ---> hardware
+
+The fbdev drivers cannot be unloaded if they are bound to fbcon, and fbcon cannot
+be unloaded if it's bound to the console layer.
+
+So to unload the fbdev drivers, one must first unbind fbcon from the console,
+then unbind the fbdev drivers from fbcon.  Fortunately, unbinding fbcon from
+the console layer will automatically unbind framebuffer drivers from
+fbcon. Thus, there is no need to explicitly unbind the fbdev drivers from
+fbcon.
+
+So, how do we unbind fbcon from the console? Part of the answer is in
+Documentation/console/console.txt. To summarize:
+
+Echo a value to the bind file that represents the framebuffer console
+driver. So assuming vtcon1 represents fbcon, then:
+
+echo 1 > /sys/class/vtconsole/vtcon1/bind - attach framebuffer console to
+                                            console layer
+echo 0 > /sys/class/vtconsole/vtcon1/bind - detach framebuffer console from
+                                            console layer
+
+If fbcon is detached from the console layer, your boot console driver (which is
+usually VGA text mode) will take over.  A few drivers (rivafb and i810fb) will
+restore VGA text mode for you.  With the rest, before detaching fbcon, you
+must take a few additional steps to make sure that your VGA text mode is
+restored properly. The following is one of several methods that you can use:
+
+1. Download or install vbetool.  This utility is included with most
+   distributions nowadays, and is usually part of the suspend/resume tool.
+
+2. In your kernel configuration, ensure that CONFIG_FRAMEBUFFER_CONSOLE is set
+   to 'y' or 'm'. Enable one or more of your favorite framebuffer drivers.
+
+3. Boot into text mode and as root run:
+
+	vbetool vbestate save > <vga state file>
+
+	The above command saves the register contents of your graphics
+	hardware to <vga state file>.  You need to do this step only once as
+	the state file can be reused.
+
+4. If fbcon is compiled as a module, load fbcon by doing:
+
+       modprobe fbcon
+
+5. Now to detach fbcon:
+
+       vbetool vbestate restore < <vga state file> && \
+       echo 0 > /sys/class/vtconsole/vtcon1/bind
+
+6. That's it, you're back to VGA mode. And if you compiled fbcon as a module,
+   you can unload it by 'rmmod fbcon'
+
+7. To reattach fbcon:
+
+       echo 1 > /sys/class/vtconsole/vtcon1/bind
+
+8. Once fbcon is unbound, all drivers registered to the system will also
+become unbound.  This means that fbcon and individual framebuffer drivers
+can be unloaded or reloaded at will. Reloading the drivers or fbcon will
+automatically bind the console, fbcon and the drivers together. Unloading
+all the drivers without unloading fbcon will make it impossible for the
+console to bind fbcon.
+
+Notes for vesafb users:
+=======================
+
+Unfortunately, if your bootline includes a vga=xxx parameter that sets the
+hardware in graphics mode, such as when loading vesafb, vgacon will not load.
+Instead, vgacon will replace the default boot console with dummycon, and you
+won't get any display after detaching fbcon. Your machine is still alive, so
+you can reattach vesafb. However, to reattach vesafb, you need to do one of
+the following:
+
+Variation 1:
+
+    a. Before detaching fbcon, do
+
+       vbetool vbestate save > <vesa state file> # do once for each vesafb mode,
+						# the file can be reused
+
+    b. Detach fbcon as in step 5.
+
+    c. Attach fbcon
+
+        vbetool vbestate restore < <vesa state file> && \
+	echo 1 > /sys/class/vtconsole/vtcon1/bind
+
+Variation 2:
+
+    a. Before detaching fbcon, do:
+	echo <ID> > /sys/class/tty/console/bind
+
+
+       vbetool vbemode get
+
+    b. Take note of the mode number
+
+    c. Detach fbcon as in step 5.
+
+    d. Attach fbcon:
+
+       vbetool vbemode set <mode number> && \
+       echo 1 > /sys/class/vtconsole/vtcon1/bind
+
+Samples:
+========
+
+Here are 2 sample bash scripts that you can use to bind or unbind the
+framebuffer console driver if you are on an x86 box:
+
+---------------------------------------------------------------------------
+#!/bin/bash
+# Unbind fbcon
+
+# Change this to where your actual vgastate file is located
+# Or Use VGASTATE=$1 to indicate the state file at runtime
+VGASTATE=/tmp/vgastate
+
+# path to vbetool
+VBETOOL=/usr/local/bin
+
+
+for (( i = 0; i < 16; i++))
+do
+  if test -x /sys/class/vtconsole/vtcon$i; then
+      if [ `cat /sys/class/vtconsole/vtcon$i/name | grep -c "frame buffer"` \
+           = 1 ]; then
+	    if test -x $VBETOOL/vbetool; then
+	       echo Unbinding vtcon$i
+	       $VBETOOL/vbetool vbestate restore < $VGASTATE
+	       echo 0 > /sys/class/vtconsole/vtcon$i/bind
+	    fi
+      fi
+  fi
+done
+
+---------------------------------------------------------------------------
+#!/bin/bash
+# Bind fbcon
+
+for (( i = 0; i < 16; i++))
+do
+  if test -x /sys/class/vtconsole/vtcon$i; then
+      if [ `cat /sys/class/vtconsole/vtcon$i/name | grep -c "frame buffer"` \
+           = 1 ]; then
+	  echo Binding vtcon$i
+	  echo 1 > /sys/class/vtconsole/vtcon$i/bind
+      fi
+  fi
+done
+---------------------------------------------------------------------------
+
+--
 Antonino Daplas <adaplas@pol.net>
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 43ab119..9d3a077 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -6,17 +6,6 @@
 
 ---------------------------
 
-What:	devfs
-When:	July 2005
-Files:	fs/devfs/*, include/linux/devfs_fs*.h and assorted devfs
-	function calls throughout the kernel tree
-Why:	It has been unmaintained for a number of years, has unfixable
-	races, contains a naming policy within the kernel that is
-	against the LSB, and can be replaced by using udev.
-Who:	Greg Kroah-Hartman <greg@kroah.com>
-
----------------------------
-
 What:	RAW driver (CONFIG_RAW_DRIVER)
 When:	December 2005
 Why:	declared obsolete since kernel 2.6.3
@@ -33,27 +22,12 @@
 
 ---------------------------
 
-What:	RCU API moves to EXPORT_SYMBOL_GPL
-When:	April 2006
-Files:	include/linux/rcupdate.h, kernel/rcupdate.c
-Why:	Outside of Linux, the only implementations of anything even
-	vaguely resembling RCU that I am aware of are in DYNIX/ptx,
-	VM/XA, Tornado, and K42.  I do not expect anyone to port binary
-	drivers or kernel modules from any of these, since the first two
-	are owned by IBM and the last two are open-source research OSes.
-	So these will move to GPL after a grace period to allow
-	people, who might be using implementations that I am not aware
-	of, to adjust to this upcoming change.
-Who:	Paul E. McKenney <paulmck@us.ibm.com>
-
----------------------------
-
 What:	raw1394: requests of type RAW1394_REQ_ISO_SEND, RAW1394_REQ_ISO_LISTEN
-When:	November 2005
+When:	November 2006
 Why:	Deprecated in favour of the new ioctl-based rawiso interface, which is
 	more efficient.  You should really be using libraw1394 for raw1394
 	access anyway.
-Who:	Jody McIntyre <scjody@steamballoon.com>
+Who:	Jody McIntyre <scjody@modernduck.com>
 
 ---------------------------
 
@@ -81,14 +55,6 @@
 
 ---------------------------
 
-What:	remove EXPORT_SYMBOL(insert_resource)
-When:	April 2006
-Files:	kernel/resource.c
-Why:	No modular usage in the kernel.
-Who:	Adrian Bunk <bunk@stusta.de>
-
----------------------------
-
 What:	PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
 When:	November 2005
 Files:	drivers/pcmcia/: pcmcia_ioctl.c
@@ -147,16 +113,6 @@
 
 ---------------------------
 
-What:	au1x00_uart driver
-When:	January 2006
-Why:	The 8250 serial driver now has the ability to deal with the differences
-	between the standard 8250 family of UARTs and their slightly strange
-	brother on Alchemy SOCs.  The loss of features is not considered an
-	issue.
-Who:	Ralf Baechle <ralf@linux-mips.org>
-
----------------------------
-
 What:   eepro100 network driver
 When:   January 2007
 Why:    replaced by the e100 driver
@@ -192,14 +148,13 @@
 
 ---------------------------
 
-What:	remove EXPORT_SYMBOL(tasklist_lock)
-When:	August 2006
-Files:	kernel/fork.c
-Why:	tasklist_lock protects the kernel internal task list.  Modules have
-	no business looking at it, and all instances in drivers have been due
-	to use of too-lowlevel APIs.  Having this symbol exported prevents
-	moving to more scalable locking schemes for the task list.
-Who:	Christoph Hellwig <hch@lst.de>
+What:	Unused EXPORT_SYMBOL/EXPORT_SYMBOL_GPL exports
+	(temporary transition config option provided until then)
+	The transition config option will also be removed at the same time.
+When:	before 2.6.19
+Why:	Unused symbols both increase the size of the kernel binary
+	and are often a sign of a "wrong API".
+Who:	Arjan van de Ven <arjan@linux.intel.com>
 
 ---------------------------
 
@@ -212,15 +167,6 @@
 
 ---------------------------
 
-What:	Support for NEC DDB5074 and DDB5476 evaluation boards.
-When:	June 2006
-Why:	Board specific code doesn't build anymore since ~2.6.0 and no
-	users have complained indicating there is no more need for these
-	boards.  This should really be considered a last call.
-Who:	Ralf Baechle <ralf@linux-mips.org>
-
----------------------------
-
 What:	USB driver API moves to EXPORT_SYMBOL_GPL
 When:	Febuary 2008
 Files:	include/linux/usb.h, drivers/usb/core/driver.c
@@ -248,3 +194,67 @@
 Who:	Nick Piggin <npiggin@suse.de>
 
 ---------------------------
+
+What:	Support for the MIPS EV96100 evaluation board
+When:	September 2006
+Why:	Has not built since at least November 15, 2003, and apparently
+	has no user base left.
+Who:	Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What:	Support for the Momentum / PMC-Sierra Jaguar ATX evaluation board
+When:	September 2006
+Why:	Has not built for quite some time, and was never popular,
+	due to the platform being replaced by successor models.  Apparently
+	no user base left.  It also is one of the last users of
+	WANT_PAGE_VIRTUAL.
+Who:	Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What:	Support for the Momentum Ocelot, Ocelot 3, Ocelot C and Ocelot G
+When:	September 2006
+Why:	Some no longer build and apparently there is no user base left
+	for these platforms.
+Who:	Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What:	Support for MIPS Technologies' Atlas and SEAD evaluation boards
+When:	September 2006
+Why:	Some no longer build and apparently there is no user base left
+	for these platforms.  Hardware out of production for several years.
+Who:	Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What:	Support for the IT8172-based platforms, ITE 8172G and Globespan IVR
+When:	September 2006
+Why:	Code has not built since at least 2.6.0, and apparently there is
+	no user base left for these platforms.  Hardware out of production
+	for several years and hardly a trace of the manufacturer left on
+	the net.
+Who:	Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What:	Interrupt only SA_* flags
+When:	January 2007
+Why:	The interrupt related SA_* flags are replaced by IRQF_* to move them
+	out of the signal namespace.
+
+Who:	Thomas Gleixner <tglx@linutronix.de>
+
+---------------------------
+
+What:	i2c-ite and i2c-algo-ite drivers
+When:	September 2006
+Why:	These drivers have never compiled since they were added to the kernel
+	tree 5 years ago. This feature removal can be reevaluated if
+	someone shows interest in the drivers, fixes them and takes over
+	maintenance.
+	http://marc.theaimsgroup.com/?l=linux-mips&m=115040510817448
+Who:	Jean Delvare <khali@linux-fr.org>
+
+---------------------------
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 1045da5..247d7f6 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -99,7 +99,7 @@
 	int (*sync_fs)(struct super_block *sb, int wait);
 	void (*write_super_lockfs) (struct super_block *);
 	void (*unlockfs) (struct super_block *);
-	int (*statfs) (struct super_block *, struct kstatfs *);
+	int (*statfs) (struct dentry *, struct kstatfs *);
 	int (*remount_fs) (struct super_block *, int *, char *);
 	void (*clear_inode) (struct inode *);
 	void (*umount_begin) (struct super_block *);
@@ -142,15 +142,16 @@
 
 --------------------------- file_system_type ---------------------------
 prototypes:
-	struct super_block *(*get_sb) (struct file_system_type *, int,
-			const char *, void *);
+	int (*get_sb) (struct file_system_type *, int,
+		       const char *, void *, struct vfsmount *);
 	void (*kill_sb) (struct super_block *);
 locking rules:
 		may block	BKL
 get_sb		yes		yes
 kill_sb		yes		yes
 
-->get_sb() returns error or a locked superblock (exclusive on ->s_umount).
+->get_sb() returns error or 0 with locked superblock attached to the vfsmount
+(exclusive on ->s_umount).
 ->kill_sb() takes a write-locked superblock, does all shutdown work on it,
 unlocks and drops the reference.
 
diff --git a/Documentation/filesystems/automount-support.txt b/Documentation/filesystems/automount-support.txt
index 58c65a1..7cac200 100644
--- a/Documentation/filesystems/automount-support.txt
+++ b/Documentation/filesystems/automount-support.txt
@@ -19,7 +19,7 @@
 
  (2) Have the follow_link() op do the following steps:
 
-     (a) Call do_kern_mount() to call the appropriate filesystem to set up a
+     (a) Call vfs_kern_mount() to call the appropriate filesystem to set up a
          superblock and gain a vfsmount structure representing it.
 
      (b) Copy the nameidata provided as an argument and substitute the dentry
diff --git a/Documentation/filesystems/configfs/configfs_example.c b/Documentation/filesystems/configfs/configfs_example.c
index 3d4713a..2d6a14a 100644
--- a/Documentation/filesystems/configfs/configfs_example.c
+++ b/Documentation/filesystems/configfs/configfs_example.c
@@ -264,6 +264,15 @@
 };
 
 
+struct simple_children {
+	struct config_group group;
+};
+
+static inline struct simple_children *to_simple_children(struct config_item *item)
+{
+	return item ? container_of(to_config_group(item), struct simple_children, group) : NULL;
+}
+
 static struct config_item *simple_children_make_item(struct config_group *group, const char *name)
 {
 	struct simple_child *simple_child;
@@ -304,7 +313,13 @@
 "items have only one attribute that is readable and writeable.\n");
 }
 
+static void simple_children_release(struct config_item *item)
+{
+	kfree(to_simple_children(item));
+}
+
 static struct configfs_item_operations simple_children_item_ops = {
+	.release	= simple_children_release,
 	.show_attribute	= simple_children_attr_show,
 };
 
@@ -345,10 +360,6 @@
  * children of its own.
  */
 
-struct simple_children {
-	struct config_group group;
-};
-
 static struct config_group *group_children_make_group(struct config_group *group, const char *name)
 {
 	struct simple_children *simple_children;
diff --git a/Documentation/filesystems/devfs/ChangeLog b/Documentation/filesystems/devfs/ChangeLog
deleted file mode 100644
index e5aba52..0000000
--- a/Documentation/filesystems/devfs/ChangeLog
+++ /dev/null
@@ -1,1977 +0,0 @@
-/* -*- auto-fill -*-                                                         */
-===============================================================================
-Changes for patch v1
-
-- creation of devfs
-
-- modified miscellaneous character devices to support devfs
-===============================================================================
-Changes for patch v2
-
-- bug fix with manual inode creation
-===============================================================================
-Changes for patch v3
-
-- bugfixes
-
-- documentation improvements
-
-- created a couple of scripts (one to save&restore a devfs and the
-  other to set up compatibility symlinks)
-
-- devfs support for SCSI discs. New name format is: sd_hHcCiIlL
-===============================================================================
-Changes for patch v4
-
-- bugfix for the directory reading code
-
-- bugfix for compilation with kerneld
-
-- devfs support for generic hard discs
-
-- rationalisation of the various watchdog drivers
-===============================================================================
-Changes for patch v5
-
-- support for mounting directly from entries in the devfs (it doesn't
-  need to be mounted to do this), including the root filesystem.
-  Mounting of swap partitions also works. Hence, now if you set
-  CONFIG_DEVFS_ONLY to 'Y' then you won't be able to access your discs
-  via ordinary device nodes. Naturally, the default is 'N' so that you
-  can still use your old device nodes.  If you want to mount from devfs
-  entries, make sure you use: append = "root=/dev/sd_..." in your
-  lilo.conf. It seems LILO looks for the device number (major&minor)
-  and writes that into the kernel image :-( 
-
-- support for character memory devices (/dev/null, /dev/zero, /dev/full
-  and so on). Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-===============================================================================
-Changes for patch v6
-
-- support for subdirectories
-
-- support for symbolic links (created by devfs_mk_symlink(), no
-  support yet for creation via symlink(2))
-
-- SCSI disc naming now cast in stone, with the format:
-  /dev/sd/c0b1t2u3	controller=0, bus=1, ID=2, LUN=3, whole disc
-  /dev/sd/c0b1t2u3p4	controller=0, bus=1, ID=2, LUN=3, 4th partition
-
-- loop devices now appear in devfs
-
-- tty devices, console, serial ports, etc. now appear in devfs
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- bugs with mounting devfs-only devices now fixed
-===============================================================================
-Changes for patch v7
-
-- SCSI CD-ROMS, tapes and generic devices now appear in devfs
-===============================================================================
-Changes for patch v8
-
-- bugfix with no-rewind SCSI tapes
-
-- RAMDISCs now appear in devfs
-
-- better cleaning up of devfs entries created by various modules
-
-- interface change to <devfs_register>
-===============================================================================
-Changes for patch v9
-
-- the v8 patch was corrupted somehow, which would affect the patch for
-  linux/fs/filesystems.c
-  I've also fixed the v8 patch file on the WWW
-
-- MetaDevices (/dev/md*) should now appear in devfs
-===============================================================================
-Changes for patch v10
-
-- bugfix in meta device support for devfs
-
-- created this ChangeLog file
-
-- added devfs support to the floppy driver
-
-- added support for creating sockets in a devfs
-===============================================================================
-Changes for patch v11
-
-- added DEVFS_FL_HIDE_UNREG flag
-
-- incorporated better patch for ttyname() in libc 5.4.43 from H.J. Lu.
-
-- interface change to <devfs_mk_symlink>
-
-- support for creating symlinks with symlink(2)
-
-- parallel port printer (/dev/lp*) now appears in devfs
-===============================================================================
-Changes for patch v12
-
-- added inode check to <devfs_fill_file> function
-
-- improved devfs support when mounting from devfs
-
-- added call to <<release>> operation when removing swap areas on
-  devfs devices
-
-- increased NR_SUPER to 128 to support large numbers of devfs mounts
-  (for chroot(2) gaols)
-
-- fixed bug in SCSI disc support: was generating incorrect minors if
-  SCSI ID's did not start at 0 and increase by 1
-
-- support symlink traversal when mounting root
-===============================================================================
-Changes for patch v13
-
-- added devfs support to soundcard driver
-  Thanks to Eric Dumas <dumas@linux.eu.org> and
-  C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- added devfs support to the joystick driver
-
-- loop driver now has it's own subdirectory "/dev/loop/"
-
-- created <devfs_get_flags> and <devfs_set_flags> functions
-
-- fix problem with SCSI disc compatibility names (sd{a,b,c,d,e,f})
-  which assumes ID's start at 0 and increase by 1. Also only create
-  devfs entries for SCSI disc partitions which actually exist
-  Show new names in partition check
-  Thanks to Jakub Jelinek <jj@sunsite.ms.mff.cuni.cz>
-===============================================================================
-Changes for patch v14
-
-- bug fix in floppy driver: would not compile without
-  CONFIG_DEVFS_FS='Y'
-  Thanks to Jurgen Botz <jbotz@nova.botz.org>
-
-- bug fix in loop driver
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- do not create devfs entries for printers not configured
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- do not create devfs entries for serial ports not present
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- ensure <tty_register_devfs> is exported from tty_io.c
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- allow unregistering of devfs symlink entries
-
-- fixed bug in SCSI disc naming introduced in last patch version
-===============================================================================
-Changes for patch v15
-
-- ported to kernel 2.1.81
-===============================================================================
-Changes for patch v16
-
-- created <devfs_set_symlink_destination> function
-
-- moved DEVFS_SUPER_MAGIC into header file
-
-- added DEVFS_FL_HIDE flag
-
-- created <devfs_get_maj_min>
-
-- created <devfs_get_handle_from_inode>
-
-- fixed bugs in searching by major&minor
-
-- changed interface to <devfs_unregister>, <devfs_fill_file> and
-  <devfs_find_handle>
-
-- fixed inode times when symlink created with symlink(2)
-
-- change tty driver to do auto-creation of devfs entries
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to
-  devfs
-
-- updated libc 5.4.43 patch for ttyname()
-===============================================================================
-Changes for patch v17
-
-- added CONFIG_DEVFS_TTY_COMPAT
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- bugfix in devfs support for drivers/char/lp.c
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- clean up serial driver so that PCMCIA devices unregister correctly
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to
-  devfs [was missing in patch v16]
-
-- updated libc 5.4.43 patch for ttyname() [was missing in patch v16]
-
-- all SCSI devices now registered in /dev/sg
-
-- support removal of devfs entries via unlink(2)
-===============================================================================
-Changes for patch v18
-
-- added floppy/?u720 floppy entry
-
-- fixed kerneld support for entries in devfs subdirectories
-
-- incorporated latest patch for ttyname() in libc 5.4.43 from H.J. Lu.
-===============================================================================
-Changes for patch v19
-
-- bug fix when looking up unregistered entries: kerneld was not called
-
-- fixes for kernel 2.1.86 (now requires 2.1.86)
-===============================================================================
-Changes for patch v20
-
-- only create available floppy entries
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new IDE naming scheme following SCSI format (i.e. /dev/id/c0b0t0u0p1
-  instead of /dev/hda1)
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new XT disc naming scheme following SCSI format (i.e. /dev/xd/c0t0p1
-  instead of /dev/xda1)
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new non-standard CD-ROM names (i.e. /dev/sbp/c#t#)
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- allow symlink traversal when mounting the root filesystem
-
-- Create entries for MD devices at MD init
-  Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v21
-
-- ported to kernel 2.1.91
-===============================================================================
-Changes for patch v22
-
-- SCSI host number patch ("scsihosts=" kernel option)
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-===============================================================================
-Changes for patch v23
-
-- Fixed persistence bug with device numbers for manually created
-  device files
-
-- Fixed problem with recreating symlinks with different content
-
-- Added CONFIG_DEVFS_MOUNT (mount devfs on /dev at boot time)
-===============================================================================
-Changes for patch v24
-
-- Switched from CONFIG_KERNELD to CONFIG_KMOD: module autoloading
-  should now work again
-
-- Hide entries which are manually unlinked
-
-- Always invalidate devfs dentry cache when registering entries
-
-- Support removal of devfs directories via rmdir(2)
-
-- Ensure directories created by <devfs_mk_dir> are visible
-
-- Default no access for "other" for floppy device
-===============================================================================
-Changes for patch v25
-
-- Updates to CREDITS file and minor IDE numbering change
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- Invalidate devfs dentry cache when making directories
-
-- Invalidate devfs dentry cache when removing entries
-
-- More informative message if root FS mount fails when devfs
-  configured
-
-- Fixed persistence bug with fifos
-===============================================================================
-Changes for patch v26
-
-- ported to kernel 2.1.97
-
-- Changed serial directory from "/dev/serial" to "/dev/tts" and
-  "/dev/consoles" to "/dev/vc" to be more friendly to new procps
-===============================================================================
-Changes for patch v27
-
-- Added support for IDE4 and IDE5
-  Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- Documented "scsihosts=" boot parameter
-
-- Print process command when debugging kerneld/kmod
-
-- Added debugging for register/unregister/change operations
-
-- Added "devfs=" boot options
-
-- Hide unregistered entries by default
-===============================================================================
-Changes for patch v28
-
-- No longer lock/unlock superblock in <devfs_put_super> (cope with
-  recent VFS interface change)
-
-- Do not automatically change ownership/protection of /dev/tty
-
-- Drop negative dentries when they are released
-
-- Manage dcache more efficiently
-===============================================================================
-Changes for patch v29
-
-- Added DEVFS_FL_AUTO_DEVNUM flag
-===============================================================================
-Changes for patch v30
-
-- No longer set unnecessary methods
-
-- Ported to kernel 2.1.99-pre3
-===============================================================================
-Changes for patch v31
-
-- Added PID display to <call_kerneld> debugging message
-
-- Added "diread" and "diwrite" options
-
-- Ported to kernel 2.1.102
-
-- Fixed persistence problem with permissions
-===============================================================================
-Changes for patch v32
-
-- Fixed devfs support in drivers/block/md.c
-===============================================================================
-Changes for patch v33
-
-- Support legacy device nodes
-
-- Fixed bug where recreated inodes were hidden
-
-- New IDE naming scheme: everything is under /dev/ide
-===============================================================================
-Changes for patch v34
-
-- Improved debugging in <get_vfs_inode>
-
-- Prevent duplicate calls to <devfs_mk_dir> in SCSI layer
-
-- No longer free old dentries in <devfs_mk_dir>
-
-- Free all dentries for a given entry when deleting inodes
-===============================================================================
-Changes for patch v35
-
-- Ported to kernel 2.1.105 (sound driver changes)
-===============================================================================
-Changes for patch v36
-
-- Fixed sound driver port
-===============================================================================
-Changes for patch v37
-
-- Minor documentation tweaks
-===============================================================================
-Changes for patch v38
-
-- More documentation tweaks
-
-- Fix for sound driver port
-
-- Removed ttyname-patch (grab libc 5.4.44 instead)
-
-- Ported to kernel 2.1.107-pre2 (loop driver fix)
-===============================================================================
-Changes for patch v39
-
-- Ported to kernel 2.1.107 (hd.c hunk broke due to spelling "fixes"). Sigh
-
-- Removed many #ifdef's, replaced with trickery in include/devfs_fs.h
-===============================================================================
-Changes for patch v40
-
-- Fix for sound driver port
-
-- Limit auto-device numbering to majors 128 to 239
-===============================================================================
-Changes for patch v41
-
-- Fixed inode times persistence problem
-===============================================================================
-Changes for patch v42
-
-- Ported to kernel 2.1.108 (drivers/scsi/hosts.c hunk broke)
-===============================================================================
-Changes for patch v43
-
-- Fixed spelling in <devfs_readlink> debug
-
-- Fixed bug in <devfs_setup> parsing "dilookup"
-
-- More #ifdef's removed
-
-- Supported Sparc keyboard (/dev/kbd)
-
-- Supported DSP56001 digital signal processor (/dev/dsp56k)
-
-- Supported Apple Desktop Bus (/dev/adb)
-
-- Supported Coda network file system (/dev/cfs*)
-===============================================================================
-Changes for patch v44
-
-- Fixed devfs inode leak when manually recreating inodes
-
-- Fixed permission persistence problem when recreating inodes
-===============================================================================
-Changes for patch v45
-
-- Ported to kernel 2.1.110
-===============================================================================
-Changes for patch v46
-
-- Ported to kernel 2.1.112-pre1
-
-- Removed harmless "unused variable" compiler warning
-
-- Fixed modes for manually recreated device nodes
-===============================================================================
-Changes for patch v47
-
-- Added NULL devfs inode warning in <devfs_read_inode>
-
-- Force all inode nlink values to 1
-===============================================================================
-Changes for patch v48
-
-- Added "dimknod" option
-
-- Set inode nlink to 0 when freeing dentries
-
-- Added support for virtual console capture devices (/dev/vcs*)
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Fixed modes for manually recreated symlinks
-===============================================================================
-Changes for patch v49
-
-- Ported to kernel 2.1.113
-===============================================================================
-Changes for patch v50
-
-- Fixed bugs in recreated directories and symlinks
-===============================================================================
-Changes for patch v51
-
-- Improved robustness of rc.devfs script
-  Thanks to Roderich Schupp <rsch@experteam.de>
-
-- Fixed bugs in recreated device nodes
-
-- Fixed bug in currently unused <devfs_get_handle_from_inode>
-
-- Defined new <devfs_handle_t> type
-
-- Improved debugging when getting entries
-
-- Fixed bug where directories could be emptied
-
-- Ported to kernel 2.1.115
-===============================================================================
-Changes for patch v52
-
-- Replaced dummy .epoch inode with .devfsd character device
-
-- Modified rc.devfs to take account of above change
-
-- Removed spurious driver warning messages when CONFIG_DEVFS_FS=n
-
-- Implemented devfsd protocol revision 0
-===============================================================================
-Changes for patch v53
-
-- Ported to kernel 2.1.116 (kmod change broke hunk)
-
-- Updated Documentation/Configure.help
-
-- Test and tty pattern patch for rc.devfs script
-  Thanks to Roderich Schupp <rsch@experteam.de>
-
-- Added soothing message to warning in <devfs_d_iput>
-===============================================================================
-Changes for patch v54
-
-- Ported to kernel 2.1.117
-
-- Fixed default permissions in sound driver
-
-- Added support for frame buffer devices (/dev/fb*)
-===============================================================================
-Changes for patch v55
-
-- Ported to kernel 2.1.119
-
-- Use GCC extensions for structure initialisations
-
-- Implemented async open notification
-
-- Incremented devfsd protocol revision to 1
-===============================================================================
-Changes for patch v56
-
-- Ported to kernel 2.1.120-pre3
-
-- Moved async open notification to end of <devfs_open>
-===============================================================================
-Changes for patch v57
-
-- Ported to kernel 2.1.121
-
-- Prepended "/dev/" to module load request
-
-- Renamed <call_kerneld> to <call_kmod>
-
-- Created sample modules.conf file
-===============================================================================
-Changes for patch v58
-
-- Fixed typo "AYSNC" -> "ASYNC"
-===============================================================================
-Changes for patch v59
-
-- Added open flag for files
-===============================================================================
-Changes for patch v60
-
-- Ported to kernel 2.1.123-pre2
-===============================================================================
-Changes for patch v61
-
-- Set i_blocks=0 and i_blksize=1024 in <devfs_read_inode>
-===============================================================================
-Changes for patch v62
-
-- Ported to kernel 2.1.123
-===============================================================================
-Changes for patch v63
-
-- Ported to kernel 2.1.124-pre2
-===============================================================================
-Changes for patch v64
-
-- Fixed Unix98 pty support
-
-- Increased buffer size in <get_partition_list> to avoid crash and
-  burn
-===============================================================================
-Changes for patch v65
-
-- More Unix98 pty support fixes
-
-- Added test for empty <<name>> in <devfs_find_handle>
-
-- Renamed <generate_path> to <devfs_generate_path> and published
-
-- Created /dev/root symlink
-  Thanks to Roderich Schupp <rsch@ExperTeam.de>
-  with further modifications by me
-===============================================================================
-Changes for patch v66
-
-- Yet more Unix98 pty support fixes (now tested)
-
-- Created <devfs_get_fops>
-
-- Support media change checks when CONFIG_DEVFS_ONLY=y
-
-- Abolished Unix98-style PTY names for old PTY devices
-===============================================================================
-Changes for patch v67
-
-- Added inline declaration for dummy <devfs_generate_path>
-
-- Removed spurious "unable to register... in devfs" messages when
-  CONFIG_DEVFS_FS=n
-
-- Fixed misc. devices when CONFIG_DEVFS_FS=n
-
-- Limit auto-device numbering to majors 144 to 239
-===============================================================================
-Changes for patch v68
-
-- Hide unopened virtual consoles from directory listings
-
-- Added support for video capture devices
-
-- Ported to kernel 2.1.125
-===============================================================================
-Changes for patch v69
-
-- Fix for CONFIG_VT=n
-===============================================================================
-Changes for patch v70
-
-- Added support for non-OSS/Free sound cards
-===============================================================================
-Changes for patch v71
-
-- Ported to kernel 2.1.126-pre2
-===============================================================================
-Changes for patch v72
-
-- #ifdef's for CONFIG_DEVFS_DISABLE_OLD_NAMES removed
-===============================================================================
-Changes for patch v73
-
-- CONFIG_DEVFS_DISABLE_OLD_NAMES replaced with "nocompat" boot option
-
-- CONFIG_DEVFS_BOOT_OPTIONS removed: boot options always available
-===============================================================================
-Changes for patch v74
-
-- Removed CONFIG_DEVFS_MOUNT and "mount" boot option and replaced with
-  "nomount" boot option
-
-- Documentation updates
-
-- Updated sample modules.conf
-===============================================================================
-Changes for patch v75
-
-- Updated sample modules.conf
-
-- Remount devfs after initrd finishes
-
-- Ported to kernel 2.1.127
-
-- Added support for ISDN
-  Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v76
-
-- Updated an email address in ChangeLog
-
-- CONFIG_DEVFS_ONLY replaced with "only" boot option
-===============================================================================
-Changes for patch v77
-
-- Added DEVFS_FL_REMOVABLE flag
-
-- Check for disc change when listing directories with removable media
-  devices
-
-- Use DEVFS_FL_REMOVABLE in sd.c
-
-- Ported to kernel 2.1.128
-===============================================================================
-Changes for patch v78
-
-- Only call <scan_dir_for_removable> on first call to <devfs_readdir>
-
-- Ported to kernel 2.1.129-pre5
-
-- ISDN support improvements
-  Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v79
-
-- Ported to kernel 2.1.130
-
-- Renamed miscdevice "apm" to "apm_bios" to be consistent with
-  devices.txt
-===============================================================================
-Changes for patch v80
-
-- Ported to kernel 2.1.131
-
-- Updated <devfs_rmdir> for VFS change in 2.1.131
-===============================================================================
-Changes for patch v81
-
-- Fixed permissions on /dev/ptmx
-===============================================================================
-Changes for patch v82
-
-- Ported to kernel 2.1.132-pre4
-
-- Changed initial permissions on /dev/pts/*
-
-- Created <devfs_mk_compat>
-
-- Added "symlinks" boot option
-
-- Changed devfs_register_blkdev() back to register_blkdev() for IDE
-
-- Check for partitions on removable media in <devfs_lookup>
-===============================================================================
-Changes for patch v83
-
-- Fixed support for ramdisc when using string-based root FS name
-
-- Ported to kernel 2.2.0-pre1
-===============================================================================
-Changes for patch v84
-
-- Ported to kernel 2.2.0-pre7
-===============================================================================
-Changes for patch v85
-
-- Compile fixes for driver/sound/sound_common.c (non-module) and
-  drivers/isdn/isdn_common.c
-  Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-
-- Added support for registering regular files
-
-- Created <devfs_set_file_size>
-
-- Added /dev/cpu/mtrr as an alternative interface to /proc/mtrr
-
-- Update devfs inodes from entries if not changed through FS
-===============================================================================
-Changes for patch v86
-
-- Ported to kernel 2.2.0-pre9
-===============================================================================
-Changes for patch v87
-
-- Fixed bug when mounting non-devfs devices in a devfs
-===============================================================================
-Changes for patch v88
-
-- Fixed <devfs_fill_file> to only initialise temporary inodes
-
-- Trap for NULL fops in <devfs_register>
-
-- Return -ENODEV in <devfs_fill_file> for non-driver inodes
-
-- Fixed bug when unswapping non-devfs devices in a devfs
-===============================================================================
-Changes for patch v89
-
-- Switched to C data types in include/linux/devfs_fs.h
-
-- Switched from PATH_MAX to DEVFS_PATHLEN
-
-- Updated Documentation/filesystems/devfs/modules.conf to take account
-  of reverse scanning (!) by modprobe
-
-- Ported to kernel 2.2.0
-===============================================================================
-Changes for patch v90
-
-- CONFIG_DEVFS_DISABLE_OLD_TTY_NAMES replaced with "nottycompat" boot
-  option
-
-- CONFIG_DEVFS_TTY_COMPAT removed: existing "symlinks" boot option now
-  controls this. This means you must have libc 5.4.44 or later, or a
-  recent version of libc 6 if you use the "symlinks" option
-===============================================================================
-Changes for patch v91
-
-- Switch from <devfs_mk_symlink> to <devfs_mk_compat> in
-  drivers/char/vc_screen.c to fix problems with Midnight Commander
-===============================================================================
-Changes for patch v92
-
-- Ported to kernel 2.2.2-pre5
-===============================================================================
-Changes for patch v93
-
-- Modified <sd_name> in drivers/scsi/sd.c to cope with devices that
-  don't exist (which happens with new RAID autostart code printk()s)
-===============================================================================
-Changes for patch v94
-
-- Fixed bug in joystick driver: only first joystick was registered
-===============================================================================
-Changes for patch v95
-
-- Fixed another bug in joystick driver
-
-- Fixed <devfsd_read> to not overrun event buffer
-===============================================================================
-Changes for patch v96
-
-- Ported to kernel 2.2.5-2
-
-- Created <devfs_auto_unregister>
-
-- Fixed bugs: compatibility entries were not unregistered for:
-    loop driver
-    floppy driver
-    RAMDISC driver
-    IDE tape driver
-    SCSI CD-ROM driver
-    SCSI HDD driver
-===============================================================================
-Changes for patch v97
-
-- Fixed bugs: compatibility entries were not unregistered for:
-    ALSA sound driver
-    partitions in generic disc driver
-
-- Don't return unregistred entries in <devfs_find_handle>
-
-- Panic in <devfs_unregister> if entry unregistered
-
-- Don't panic in <devfs_auto_unregister> for duplicates
-===============================================================================
-Changes for patch v98
-
-- Don't unregister already unregistered entries in <unregister>
-
-- Register entry in <sd_detect>
-
-- Unregister entry in <sd_detach>
-
-- Changed to <devfs_*register_chrdev> in drivers/char/tty_io.c
-
-- Ported to kernel 2.2.7
-===============================================================================
-Changes for patch v99
-
-- Ported to kernel 2.2.8
-
-- Fixed bug in drivers/scsi/sd.c when >16 SCSI discs
-
-- Disable warning messages when unable to read partition table for
-  removable media
-===============================================================================
-Changes for patch v100
-
-- Ported to kernel 2.3.1-pre5
-
-- Added "oops-on-panic" boot option
-
-- Improved debugging in <devfs_register> and <devfs_unregister>
-
-- Register entry in <sr_detect>
-
-- Unregister entry in <sr_detach>
-
-- Register entry in <sg_detect>
-
-- Unregister entry in <sg_detach>
-
-- Added support for ALSA drivers
-===============================================================================
-Changes for patch v101
-
-- Ported to kernel 2.3.2
-===============================================================================
-Changes for patch v102
-
-- Update serial driver to register PCMCIA entries
-  Thanks to Roch-Alexandre Nomine-Beguin <roch@samarkand.infini.fr>
-
-- Updated an email address in ChangeLog
-
-- Hide virtual console capture entries from directory listings when
-  corresponding console device is not open
-===============================================================================
-Changes for patch v103
-
-- Ported to kernel 2.3.3
-===============================================================================
-Changes for patch v104
-
-- Added documentation for some functions
-
-- Added "doc" target to fs/devfs/Makefile
-
-- Added "v4l" directory for video4linux devices
-
-- Replaced call to <devfs_unregister> in <sd_detach> with call to
-  <devfs_register_partitions>
-
-- Moved registration for sr and sg drivers from detect() to attach()
-  methods
-
-- Register entries in <st_attach> and unregister in <st_detach>
-
-- Work around IDE driver treating CD-ROM as gendisk
-
-- Use <sed> instead of <tr> in rc.devfs
-
-- Updated ToDo list
-
-- Removed "oops-on-panic" boot option: now always Oops
-===============================================================================
-Changes for patch v105
-
-- Unregister SCSI host from <scsi_host_no_list> in <scsi_unregister>
-  Thanks to Zoltán Böszörményi <zboszor@mail.externet.hu>
-
-- Don't save /dev/log in rc.devfs
-
-- Ported to kernel 2.3.4-pre1
-===============================================================================
-Changes for patch v106
-
-- Fixed silly typo in drivers/scsi/st.c
-
-- Improved debugging in <devfs_register>
-===============================================================================
-Changes for patch v107
-
-- Added "diunlink" and "nokmod" boot options
-
-- Removed superfluous warning message in <devfs_d_iput>
-===============================================================================
-Changes for patch v108
-
-- Remove entries when unloading sound module
-===============================================================================
-Changes for patch v109
-
-- Ported to kernel 2.3.6-pre2
-===============================================================================
-Changes for patch v110
-
-- Took account of change to <d_alloc_root>
-===============================================================================
-Changes for patch v111
-
-- Created separate event queue for each mounted devfs
-
-- Removed <devfs_invalidate_dcache>
-
-- Created new ioctl()s for devfsd
-
-- Incremented devfsd protocol revision to 3
-
-- Fixed bug when re-creating directories: contents were lost
-
-- Block access to inodes until devfsd updates permissions
-===============================================================================
-Changes for patch v112
-
-- Modified patch so it applies against 2.3.5 and 2.3.6
-
-- Updated an email address in ChangeLog
-
-- Do not automatically change ownership/protection of /dev/tty<n>
-
-- Updated sample modules.conf
-
-- Switched to sending process uid/gid to devfsd
-
-- Renamed <call_kmod> to <try_modload>
-
-- Added DEVFSD_NOTIFY_LOOKUP event
-
-- Added DEVFSD_NOTIFY_CHANGE event
-
-- Added DEVFSD_NOTIFY_CREATE event
-
-- Incremented devfsd protocol revision to 4
-
-- Moved kernel-specific stuff to include/linux/devfs_fs_kernel.h
-===============================================================================
-Changes for patch v113
-
-- Ported to kernel 2.3.9
-
-- Restricted permissions on some block devices
-===============================================================================
-Changes for patch v114
-
-- Added support for /dev/netlink
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Return EISDIR rather than EINVAL for read(2) on directories
-
-- Ported to kernel 2.3.10
-===============================================================================
-Changes for patch v115
-
-- Added support for all remaining character devices
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Cleaned up netlink support
-===============================================================================
-Changes for patch v116
-
-- Added support for /dev/parport%d
-  Thanks to Tim Waugh <tim@cyberelk.demon.co.uk>
-
-- Fixed parallel port ATAPI tape driver
-
-- Fixed Atari SLM laser printer driver
-===============================================================================
-Changes for patch v117
-
-- Added support for COSA card
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Fixed drivers/char/ppdev.c: missing #include <linux/init.h>
-
-- Fixed drivers/char/ftape/zftape/zftape-init.c
-  Thanks to Vladimir Popov <mashgrad@usa.net>
-===============================================================================
-Changes for patch v118
-
-- Ported to kernel 2.3.15-pre3
-
-- Fixed bug in loop driver
-
-- Unregister /dev/lp%d entries in drivers/char/lp.c
-  Thanks to Maciej W. Rozycki <macro@ds2.pg.gda.pl>
-===============================================================================
-Changes for patch v119
-
-- Ported to kernel 2.3.16
-===============================================================================
-Changes for patch v120
-
-- Fixed bug in drivers/scsi/scsi.c
-
-- Added /dev/ppp
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Ported to kernel 2.3.17
-===============================================================================
-Changes for patch v121
-
-- Fixed bug in drivers/block/loop.c
-
-- Ported to kernel 2.3.18
-===============================================================================
-Changes for patch v122
-
-- Ported to kernel 2.3.19
-===============================================================================
-Changes for patch v123
-
-- Ported to kernel 2.3.20
-===============================================================================
-Changes for patch v124
-
-- Ported to kernel 2.3.21
-===============================================================================
-Changes for patch v125
-
-- Created <devfs_get_info>, <devfs_set_info>,
-  <devfs_get_first_child> and <devfs_get_next_sibling>
-  Added <<dir>> parameter to <devfs_register>, <devfs_mk_compat>,
-  <devfs_mk_dir> and <devfs_find_handle>
-  Work sponsored by SGI
-
-- Fixed apparent bug in COSA driver
-
-- Re-instated "scsihosts=" boot option
-===============================================================================
-Changes for patch v126
-
-- Always create /dev/pts if CONFIG_UNIX98_PTYS=y
-
-- Fixed call to <devfs_mk_dir> in drivers/block/ide-disk.c
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Allow multiple unregistrations
-
-- Created /dev/scsi hierarchy
-  Work sponsored by SGI
-===============================================================================
-Changes for patch v127
-
-Work sponsored by SGI
-
-- No longer disable devpts if devfs enabled (caveat emptor)
-
-- Added flags array to struct gendisk and removed code from
-  drivers/scsi/sd.c
-
-- Created /dev/discs hierarchy
-===============================================================================
-Changes for patch v128
-
-Work sponsored by SGI
-
-- Created /dev/cdroms hierarchy
-===============================================================================
-Changes for patch v129
-
-Work sponsored by SGI
-
-- Removed compatibility entries for sound devices
-
-- Removed compatibility entries for printer devices
-
-- Removed compatibility entries for video4linux devices
-
-- Removed compatibility entries for parallel port devices
-
-- Removed compatibility entries for frame buffer devices
-===============================================================================
-Changes for patch v130
-
-Work sponsored by SGI
-
-- Added major and minor number to devfsd protocol
-
-- Incremented devfsd protocol revision to 5
-
-- Removed compatibility entries for SoundBlaster CD-ROMs
-
-- Removed compatibility entries for netlink devices
-
-- Removed compatibility entries for SCSI generic devices
-
-- Removed compatibility entries for SCSI tape devices
-===============================================================================
-Changes for patch v131
-
-Work sponsored by SGI
-
-- Support info pointer for all devfs entry types
-
-- Added <<info>> parameter to <devfs_mk_dir> and <devfs_mk_symlink>
-
-- Removed /dev/st hierarchy
-
-- Removed /dev/sg hierarchy
-
-- Removed compatibility entries for loop devices
-
-- Removed compatibility entries for IDE tape devices
-
-- Removed compatibility entries for SCSI CD-ROMs
-
-- Removed /dev/sr hierarchy
-===============================================================================
-Changes for patch v132
-
-Work sponsored by SGI
-
-- Removed compatibility entries for floppy devices
-
-- Removed compatibility entries for RAMDISCs
-
-- Removed compatibility entries for meta-devices
-
-- Removed compatibility entries for SCSI discs
-
-- Created <devfs_make_root>
-
-- Removed /dev/sd hierarchy
-
-- Support "../" when searching devfs namespace
-
-- Created /dev/ide/host* hierarchy
-
-- Supported IDE hard discs in /dev/ide/host* hierarchy
-
-- Removed compatibility entries for IDE discs
-
-- Removed /dev/ide/hd hierarchy
-
-- Supported IDE CD-ROMs in /dev/ide/host* hierarchy
-
-- Removed compatibility entries for IDE CD-ROMs
-
-- Removed /dev/ide/cd hierarchy
-===============================================================================
-Changes for patch v133
-
-Work sponsored by SGI
-
-- Created <devfs_get_unregister_slave>
-
-- Fixed bug in fs/partitions/check.c when rescanning
-===============================================================================
-Changes for patch v134
-
-Work sponsored by SGI
-
-- Removed /dev/sd, /dev/sr, /dev/st and /dev/sg directories
-
-- Removed /dev/ide/hd directory
-
-- Exported <devfs_get_parent>
-
-- Created <devfs_register_tape> and /dev/tapes hierarchy
-
-- Removed /dev/ide/mt hierarchy
-
-- Removed /dev/ide/fd hierarchy
-
-- Ported to kernel 2.3.25
-===============================================================================
-Changes for patch v135
-
-Work sponsored by SGI
-
-- Removed compatibility entries for virtual console capture devices
-
-- Removed unused <devfs_set_symlink_destination>
-
-- Removed compatibility entries for serial devices
-
-- Removed compatibility entries for console devices
-
-- Do not hide entries from devfsd or children
-
-- Removed DEVFS_FL_TTY_COMPAT flag
-
-- Removed "nottycompat" boot option
-
-- Removed <devfs_mk_compat>
-===============================================================================
-Changes for patch v136
-
-Work sponsored by SGI
-
-- Moved BSD pty devices to /dev/pty
-
-- Added DEVFS_FL_WAIT flag
-===============================================================================
-Changes for patch v137
-
-Work sponsored by SGI
-
-- Really fixed bug in fs/partitions/check.c when rescanning
-
-- Support new "disc" naming scheme in <get_removable_partition>
-
-- Allow NULL fops in <devfs_register>
-
-- Removed redundant name functions in SCSI disc and IDE drivers
-===============================================================================
-Changes for patch v138
-
-Work sponsored by SGI
-
-- Fixed old bugs in drivers/block/paride/pt.c, drivers/char/tpqic02.c,
-  drivers/net/wan/cosa.c and drivers/scsi/scsi.c
-  Thanks to Sergey Kubushin <ksi@ksi-linux.com>
-
-- Fall back to major table if NULL fops given to <devfs_register>
-===============================================================================
-Changes for patch v139
-
-Work sponsored by SGI
-
-- Corrected and moved <get_blkfops> and <get_chrfops> declarations
-  from arch/alpha/kernel/osf_sys.c to include/linux/fs.h
-
-- Removed name function from struct gendisk
-
-- Updated devfs FAQ
-===============================================================================
-Changes for patch v140
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.27
-===============================================================================
-Changes for patch v141
-
-Work sponsored by SGI
-
-- Bug fix in arch/m68k/atari/joystick.c
-
-- Moved ISDN and capi devices to /dev/isdn
-===============================================================================
-Changes for patch v142
-
-Work sponsored by SGI
-
-- Bug fix in drivers/block/ide-probe.c (patch confusion)
-===============================================================================
-Changes for patch v143
-
-Work sponsored by SGI
-
-- Bug fix in drivers/block/blkpg.c:partition_name()
-===============================================================================
-Changes for patch v144
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.29
-
-- Removed calls to <devfs_register> from cdu31a, cm206, mcd and mcdx
-  CD-ROM drivers: generic driver handles this now
-
-- Moved joystick devices to /dev/joysticks
-===============================================================================
-Changes for patch v145
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.30-pre3
-
-- Register whole-disc entry even for invalid partition tables
-
-- Fixed bug in mounting root FS when initrd enabled
-
-- Fixed device entry leak with IDE CD-ROMs
-
-- Fixed compile problem with drivers/isdn/isdn_common.c
-
-- Moved COSA devices to /dev/cosa
-
-- Support fifos when unregistering
-
-- Created <devfs_register_series> and used in many drivers
-
-- Moved Coda devices to /dev/coda
-
-- Moved parallel port IDE tapes to /dev/pt
-
-- Moved parallel port IDE generic devices to /dev/pg
-===============================================================================
-Changes for patch v146
-
-Work sponsored by SGI
-
-- Removed obsolete DEVFS_FL_COMPAT and DEVFS_FL_TOLERANT flags
-
-- Fixed compile problem with fs/coda/psdev.c
-
-- Reinstate change to <devfs_register_blkdev> in
-  drivers/block/ide-probe.c now that fs/isofs/inode.c is fixed
-
-- Switched to <devfs_register_blkdev> in drivers/block/floppy.c,
-  drivers/scsi/sr.c and drivers/block/md.c
-
-- Moved DAC960 devices to /dev/dac960
-===============================================================================
-Changes for patch v147
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.32-pre4
-===============================================================================
-Changes for patch v148
-
-Work sponsored by SGI
-
-- Removed kmod support: use devfsd instead
-
-- Moved miscellaneous character devices to /dev/misc
-===============================================================================
-Changes for patch v149
-
-Work sponsored by SGI
-
-- Ensure include/linux/joystick.h is OK for user-space
-
-- Improved debugging in <get_vfs_inode>
-
-- Ensure dentries created by devfsd will be cleaned up
-===============================================================================
-Changes for patch v150
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.34
-===============================================================================
-Changes for patch v151
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.35-pre1
-
-- Created <devfs_get_name>
-===============================================================================
-Changes for patch v152
-
-Work sponsored by SGI
-
-- Updated sample modules.conf
-
-- Ported to kernel 2.3.36-pre1
-===============================================================================
-Changes for patch v153
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.42
-
-- Removed <devfs_fill_file>
-===============================================================================
-Changes for patch v154
-
-Work sponsored by SGI
-
-- Took account of device number changes for /dev/fb*
-===============================================================================
-Changes for patch v155
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.43-pre8
-
-- Moved /dev/tty0 to /dev/vc/0
-
-- Moved sequence number formatting from <_tty_make_name> to drivers
-===============================================================================
-Changes for patch v156
-
-Work sponsored by SGI
-
-- Fixed breakage in drivers/scsi/sd.c due to recent SCSI changes
-===============================================================================
-Changes for patch v157
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.45
-===============================================================================
-Changes for patch v158
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.46-pre2
-===============================================================================
-Changes for patch v159
-
-Work sponsored by SGI
-
-- Fixed drivers/block/md.c
-  Thanks to Mike Galbraith <mikeg@weiden.de>
-
-- Documentation fixes
-
-- Moved device registration from <lp_init> to <lp_register>
-  Thanks to Tim Waugh <twaugh@redhat.com>
-===============================================================================
-Changes for patch v160
-
-Work sponsored by SGI
-
-- Fixed drivers/char/joystick/joystick.c
-  Thanks to Vojtech Pavlik <vojtech@suse.cz>
-
-- Documentation updates
-
-- Fixed arch/i386/kernel/mtrr.c if procfs and devfs not enabled
-
-- Fixed drivers/char/stallion.c
-===============================================================================
-Changes for patch v161
-
-Work sponsored by SGI
-
-- Remove /dev/ide when ide-mod is unloaded
-
-- Fixed bug in drivers/block/ide-probe.c when secondary but no primary
-
-- Added DEVFS_FL_NO_PERSISTENCE flag
-
-- Used new DEVFS_FL_NO_PERSISTENCE flag for Unix98 pty slaves
-
-- Removed unnecessary call to <update_devfs_inode_from_entry> in
-  <devfs_readdir>
-
-- Only set auto-ownership for /dev/pty/s*
-===============================================================================
-Changes for patch v162
-
-Work sponsored by SGI
-
-- Set inode->i_size to correct size for symlinks
-  Thanks to Jeremy Fitzhardinge <jeremy@goop.org>
-
-- Only give lookup() method to directories to comply with new VFS
-  assumptions
-
-- Remove unnecessary tests in symlink methods
-
-- Don't kill existing block ops in <devfs_read_inode>
-
-- Restore auto-ownership for /dev/pty/m*
-===============================================================================
-Changes for patch v163
-
-Work sponsored by SGI
-
-- Don't create missing directories in <devfs_find_handle>
-
-- Removed Documentation/filesystems/devfs/mk-devlinks
-
-- Updated Documentation/filesystems/devfs/README
-===============================================================================
-Changes for patch v164
-
-Work sponsored by SGI
-
-- Fixed CONFIG_DEVFS breakage in drivers/char/serial.c introduced in
-  linux-2.3.99-pre6-7
-===============================================================================
-Changes for patch v165
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.99-pre6
-===============================================================================
-Changes for patch v166
-
-Work sponsored by SGI
-
-- Added CONFIG_DEVFS_MOUNT
-===============================================================================
-Changes for patch v167
-
-Work sponsored by SGI
-
-- Updated Documentation/filesystems/devfs/README
-
-- Updated sample modules.conf
-===============================================================================
-Changes for patch v168
-
-Work sponsored by SGI
-
-- Disabled multi-mount capability (use VFS bindings instead)
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v169
-
-Work sponsored by SGI
-
-- Removed multi-mount code
-
-- Removed compatibility macros: VFS has changed too much
-===============================================================================
-Changes for patch v170
-
-Work sponsored by SGI
-
-- Updated README from master HTML file
-
-- Merged devfs inode into devfs entry
-===============================================================================
-Changes for patch v171
-
-Work sponsored by SGI
-
-- Updated sample modules.conf
-
-- Removed dead code in <devfs_register> which used to call
-  <free_dentries>
-
-- Ported to kernel 2.4.0-test2-pre3
-===============================================================================
-Changes for patch v172
-
-Work sponsored by SGI
-
-- Changed interface to <devfs_register>
-
-- Changed interface to <devfs_register_series>
-===============================================================================
-Changes for patch v173
-
-Work sponsored by SGI
-
-- Simplified interface to <devfs_mk_symlink>
-
-- Simplified interface to <devfs_mk_dir>
-
-- Simplified interface to <devfs_find_handle>
-===============================================================================
-Changes for patch v174
-
-Work sponsored by SGI
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v175
-
-Work sponsored by SGI
-
-- DocBook update for fs/devfs/base.c
-  Thanks to Tim Waugh <twaugh@redhat.com>
-
-- Removed stale fs/tunnel.c (was never used or completed)
-===============================================================================
-Changes for patch v176
-
-Work sponsored by SGI
-
-- Updated ToDo list
-
-- Removed sample modules.conf: now distributed with devfsd
-
-- Updated README from master HTML file
-
-- Ported to kernel 2.4.0-test3-pre4 (which had devfs-patch-v174)
-===============================================================================
-Changes for patch v177
-
-- Updated README from master HTML file
-
-- Documentation cleanups
-
-- Ensure <devfs_generate_path> terminates string for root entry
-  Thanks to Tim Jansen <tim@tjansen.de>
-
-- Exported <devfs_get_name> to modules
-
-- Make <devfs_mk_symlink> send events to devfsd
-
-- Cleaned up option processing in <devfs_setup>
-
-- Fixed bugs in handling symlinks: could leak or cause Oops
-
-- Cleaned up directory handling by separating fops
-  Thanks to Alexander Viro <viro@parcelfarce.linux.theplanet.co.uk>
-===============================================================================
-Changes for patch v178
-
-- Fixed handling of inverted options in <devfs_setup>
-===============================================================================
-Changes for patch v179
-
-- Adjusted <try_modload> to account for <devfs_generate_path> fix
-===============================================================================
-Changes for patch v180
-
-- Fixed !CONFIG_DEVFS_FS stub declaration of <devfs_get_info>
-===============================================================================
-Changes for patch v181
-
-- Answered question posed by Al Viro and removed his comments from <devfs_open>
-
-- Moved setting of registered flag after other fields are changed
-
-- Fixed race between <devfsd_close> and <devfsd_notify_one>
-
-- Global VFS changes added bogus BKL to devfsd_close(): removed
-
-- Widened locking in <devfs_readlink> and <devfs_follow_link>
-
-- Replaced <devfsd_read> stack usage with <devfsd_ioctl> kmalloc
-
-- Simplified locking in <devfsd_ioctl> and fixed memory leak
-===============================================================================
-Changes for patch v182
-
-- Created <devfs_*alloc_major> and <devfs_*alloc_devnum>
-
-- Removed broken devnum allocation and use <devfs_alloc_devnum>
-
-- Fixed old devnum leak by calling new <devfs_dealloc_devnum>
-
-- Created <devfs_*alloc_unique_number>
-
-- Fixed number leak for /dev/cdroms/cdrom%d
-
-- Fixed number leak for /dev/discs/disc%d
-===============================================================================
-Changes for patch v183
-
-- Fixed bug in <devfs_setup> which could hang boot process
-===============================================================================
-Changes for patch v184
-
-- Documentation typo fix for fs/devfs/util.c
-
-- Fixed drivers/char/stallion.c for devfs
-
-- Added DEVFSD_NOTIFY_DELETE event
-
-- Updated README from master HTML file
-
-- Removed #include <asm/segment.h> from fs/devfs/base.c
-===============================================================================
-Changes for patch v185
-
-- Made <block_semaphore> and <char_semaphore> in fs/devfs/util.c
-  private
-
-- Fixed inode table races by removing it and using inode->u.generic_ip
-  instead
-
-- Moved <devfs_read_inode> into <get_vfs_inode>
-
-- Moved <devfs_write_inode> into <devfs_notify_change>
-===============================================================================
-Changes for patch v186
-
-- Fixed race in <devfs_do_symlink> for uni-processor
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v187
-
-- Fixed drivers/char/stallion.c for devfs
-
-- Fixed drivers/char/rocket.c for devfs
-
-- Fixed bug in <devfs_alloc_unique_number>: limited to 128 numbers
-===============================================================================
-Changes for patch v188
-
-- Updated major masks in fs/devfs/util.c up to Linus' "no new majors"
-  proclamation. Block: were 126 now 122 free, char: were 26 now 19 free
-
-- Updated README from master HTML file
-
-- Removed remnant of multi-mount support in <devfs_mknod>
-
-- Removed unused DEVFS_FL_SHOW_UNREG flag
-===============================================================================
-Changes for patch v189
-
-- Removed nlink field from struct devfs_inode
-
-- Removed auto-ownership for /dev/pty/* (BSD ptys) and used
-  DEVFS_FL_CURRENT_OWNER|DEVFS_FL_NO_PERSISTENCE for /dev/pty/s* (just
-  like Unix98 pty slaves) and made /dev/pty/m* rw-rw-rw- access
-===============================================================================
-Changes for patch v190
-
-- Updated README from master HTML file
-
-- Replaced BKL with global rwsem to protect symlink data (quick and
-  dirty hack)
-===============================================================================
-Changes for patch v191
-
-- Replaced global rwsem for symlink with per-link refcount
-===============================================================================
-Changes for patch v192
-
-- Removed unnecessary #ifdef CONFIG_DEVFS_FS from arch/i386/kernel/mtrr.c
-
-- Ported to kernel 2.4.10-pre11
-
-- Set inode->i_mapping->a_ops for block nodes in <get_vfs_inode>
-===============================================================================
-Changes for patch v193
-
-- Went back to global rwsem for symlinks (refcount scheme no good)
-===============================================================================
-Changes for patch v194
-
-- Fixed overrun in <devfs_link> by removing function (not needed)
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v195
-
-- Fixed buffer underrun in <try_modload>
-
-- Moved down_read() from <search_for_entry_in_dir> to <find_entry>
-===============================================================================
-Changes for patch v196
-
-- Fixed race in <devfsd_ioctl> when setting event mask
-  Thanks to Kari Hurtta <hurtta@leija.mh.fmi.fi>
-
-- Avoid deadlock in <devfs_follow_link> by using temporary buffer
-===============================================================================
-Changes for patch v197
-
-- First release of new locking code for devfs core (v1.0)
-
-- Fixed bug in drivers/cdrom/cdrom.c
-===============================================================================
-Changes for patch v198
-
-- Discard temporary buffer, now use "%s" for dentry names
-
-- Don't generate path in <try_modload>: use fake entry instead
-
-- Use "existing" directory in <_devfs_make_parent_for_leaf>
-
-- Use slab cache rather than fixed buffer for devfsd events
-===============================================================================
-Changes for patch v199
-
-- Removed obsolete usage of DEVFS_FL_NO_PERSISTENCE
-
-- Send DEVFSD_NOTIFY_REGISTERED events in <devfs_mk_dir>
-
-- Fixed locking bug in <devfs_d_revalidate_wait> due to typo
-
-- Do not send CREATE, CHANGE, ASYNC_OPEN or DELETE events from devfsd
-  or children
-===============================================================================
-Changes for patch v200
-
-- Ported to kernel 2.5.1-pre2
-===============================================================================
-Changes for patch v201
-
-- Fixed bug in <devfsd_read>: was dereferencing freed pointer
-===============================================================================
-Changes for patch v202
-
-- Fixed bug in <devfsd_close>: was dereferencing freed pointer
-
-- Added process group check for devfsd privileges
-===============================================================================
-Changes for patch v203
-
-- Use SLAB_ATOMIC in <devfsd_notify_de> from <devfs_d_delete>
-===============================================================================
-Changes for patch v204
-
-- Removed long obsolete rc.devfs
-
-- Return old entry in <devfs_mk_dir> for 2.4.x kernels
-
-- Updated README from master HTML file
-
-- Increment refcount on module in <check_disc_changed>
-
-- Created <devfs_get_handle> and exported <devfs_put>
-
-- Increment refcount on module in <devfs_get_ops>
-
-- Created <devfs_put_ops> and used where needed to fix races
-
-- Added clarifying comments in response to preliminary EMC code review
-
-- Added poisoning to <devfs_put>
-
-- Improved debugging messages
-
-- Fixed unregister bugs in drivers/md/lvm-fs.c
-===============================================================================
-Changes for patch v205
-
-- Corrected (made useful) debugging message in <unregister>
-
-- Moved <kmem_cache_create> in <mount_devfs_fs> to <init_devfs_fs>
-
-- Fixed drivers/md/lvm-fs.c to create "lvm" entry
-
-- Added magic number to guard against scribbling drivers
-
-- Only return old entry in <devfs_mk_dir> if a directory
-
-- Defined macros for error and debug messages
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v206
-
-- Added support for multiple Compaq cpqarray controllers
-
-- Fixed (rare, old) race in <devfs_lookup>
-===============================================================================
-Changes for patch v207
-
-- Fixed deadlock bug in <devfs_d_revalidate_wait>
-
-- Tag VFS deletable in <devfs_mk_symlink> if handle ignored
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v208
-
-- Added KERN_* to remaining messages
-
-- Cleaned up declaration of <stat_read>
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v209
-
-- Updated README from master HTML file
-
-- Removed silently introduced calls to lock_kernel() and
-  unlock_kernel() due to recent VFS locking changes. BKL isn't
-  required in devfs
-
-- Changed <devfs_rmdir> to allow later additions if not yet empty
-
-- Added calls to <devfs_register_partitions> in drivers/block/blkpg.c
-  <add_partition> and <del_partition>
-
-- Fixed bug in <devfs_alloc_unique_number>: was clearing beyond
-  bitfield
-
-- Fixed bitfield data type for <devfs_*alloc_devnum>
-
-- Made major bitfield type and initialiser 64 bit safe
-===============================================================================
-Changes for patch v210
-
-- Updated fs/devfs/util.c to fix shift warning on 64 bit machines
-  Thanks to Anton Blanchard <anton@samba.org>
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v211
-
-- Do not put miscellaneous character devices in /dev/misc if they
-  specify their own directory (i.e. contain a '/' character)
-
-- Copied macro for error messages from fs/devfs/base.c to
-  fs/devfs/util.c and made use of this macro
-
-- Removed 2.4.x compatibility code from fs/devfs/base.c
-===============================================================================
-Changes for patch v212
-
-- Added BKL to <devfs_open> because drivers still need it
-===============================================================================
-Changes for patch v213
-
-- Protected <scan_dir_for_removable> and <get_removable_partition>
-  from changing directory contents
-===============================================================================
-Changes for patch v214
-
-- Switched to ISO C structure field initialisers
-
-- Switch to set_current_state() and move before add_wait_queue()
-
-- Updated README from master HTML file
-
-- Fixed devfs entry leak in <devfs_readdir> when *readdir fails
-===============================================================================
-Changes for patch v215
-
-- Created <devfs_find_and_unregister>
-
-- Switched many functions from <devfs_find_handle> to
-  <devfs_find_and_unregister>
-
-- Switched many functions from <devfs_find_handle> to <devfs_get_handle>
-===============================================================================
-Changes for patch v216
-
-- Switched arch/ia64/sn/io/hcl.c from <devfs_find_handle> to
-  <devfs_get_handle>
-
-- Removed deprecated <devfs_find_handle>
-===============================================================================
-Changes for patch v217
-
-- Exported <devfs_find_and_unregister> and <devfs_only> to modules
-
-- Updated README from master HTML file
-
-- Fixed module unload race in <devfs_open>
-===============================================================================
-Changes for patch v218
-
-- Removed DEVFS_FL_AUTO_OWNER flag
-
-- Switched lingering structure field initialiser to ISO C
-
-- Added locking when setting/clearing flags
-
-- Documentation fix in fs/devfs/util.c
diff --git a/Documentation/filesystems/devfs/README b/Documentation/filesystems/devfs/README
deleted file mode 100644
index aabfba2..0000000
--- a/Documentation/filesystems/devfs/README
+++ /dev/null
@@ -1,1959 +0,0 @@
-Devfs (Device File System) FAQ
-
-
-Linux Devfs (Device File System) FAQ
-Richard Gooch
-20-AUG-2002
-
-
-Document languages:
-
-
-
-
-
-
-
------------------------------------------------------------------------------
-
-NOTE: the master copy of this document is available online at:
-
-http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
-and looks much better than the text version distributed with the
-kernel sources. A mirror site is available at:
-
-http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html
-
-There is also an optional daemon that may be used with devfs. You can
-find out more about it at:
-
-http://www.atnf.csiro.au/~rgooch/linux/
-
-A mailing list is available which you may subscribe to. Send email to
-majordomo@oss.sgi.com with the following line in the body of the
-message:
-
-subscribe devfs
-To unsubscribe, send the message body:
-unsubscribe devfs
-instead. The list is archived at
-
-http://oss.sgi.com/projects/devfs/archive/.
-
------------------------------------------------------------------------------
-
-Contents
-
-
-What is it?
-
-Why do it?
-
-Who else does it?
-
-How it works
-
-Operational issues (essential reading)
-
-Instructions for the impatient
-Permissions persistence across reboots
-Dealing with drivers without devfs support
-All the way with Devfs
-Other Issues
-Kernel Naming Scheme
-Devfsd Naming Scheme
-Old Compatibility Names
-SCSI Host Probing Issues
-
-
-
-Device drivers currently ported
-
-Allocation of Device Numbers
-
-Questions and Answers
-
-Making things work
-Alternatives to devfs
-What I don't like about devfs
-How to report bugs
-Strange kernel messages
-Compilation problems with devfsd
-
-
-Other resources
-
-Translations of this document
-
-
------------------------------------------------------------------------------
-
-
-What is it?
-
-Devfs is an alternative to "real" character and block special devices
-on your root filesystem. Kernel device drivers can register devices by
-name rather than major and minor numbers. These devices will appear in
-devfs automatically, with whatever default ownership and
-protection the driver specified. A daemon (devfsd) can be used to
-override these defaults. Devfs has been in the kernel since 2.3.46.
-
-NOTE that devfs is entirely optional. If you prefer the old
-disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the
-default). In this case, nothing will change.  ALSO NOTE that if you do
-enable devfs, the defaults are such that full compatibility is
-maintained with the old device names.
-
-There are two aspects to devfs: one is the underlying device
-namespace, which is a namespace just like any mounted filesystem. The
-other aspect is the filesystem code which provides a view of the
-device namespace. The reason I make a distinction is because devfs
-can be mounted many times, with each mount showing the same device
-namespace. Changes made are global to all mounted devfs filesystems.
-Also, because the devfs namespace exists without any devfs mounts, you
-can easily mount the root filesystem by referring to an entry in the
-devfs namespace.
-
-
-The cost of devfs is a small increase in kernel code size and memory
-usage: about 7 pages of code (some of that in __init sections) and 72
-bytes for each entry in the namespace. A modest system has only a
-couple of hundred device entries, so this costs a few more
-pages. Compare this with the suggestion to put /dev on a
-ramdisc.
-
-On a typical machine, the cost is under 0.2 percent. On a modest
-system with 64 MBytes of RAM, the cost is under 0.1 percent.  The
-accusations of "bloatware" levelled at devfs are not justified.
-
------------------------------------------------------------------------------
-
-
-Why do it?
-
-There are several problems that devfs addresses. Some of these
-problems are more serious than others (depending on your point of
-view), and some can be solved without devfs. However, the totality of
-these problems really calls out for devfs.
-
-The choice is between a patchwork of inefficient user-space solutions,
-which are complex and likely to be fragile, and a simple and efficient
-devfs which is robust.
-
-There have been many counter-proposals to devfs, all seeking to
-provide some of the benefits without actually implementing devfs. So
-far there has been an absence of code and no proposed alternative has
-been able to provide all the features that devfs does. Further,
-alternative proposals require far more complexity in user-space (and
-still deliver less functionality than devfs). Some people have the
-mantra of reducing "kernel bloat", but don't consider the effects on
-user-space.
-
-A good solution limits the total complexity of kernel-space and
-user-space.
-
-
-Major&minor allocation
-
-The existing scheme requires the allocation of major and minor device
-numbers for each and every device. This means that a central
-co-ordinating authority is required to issue these device numbers
-(unless you're developing a "private" device driver), in order to
-preserve uniqueness. Devfs shifts the burden to a namespace. This may
-not seem like a huge benefit, but actually it is. Since driver authors
-will naturally choose a device name which reflects the functionality
-of the device, there is far less potential for namespace conflict.
-Solving this requires a kernel change.
-
-/dev management
-
-Because you currently access devices through device nodes, these must
-be created by the system administrator. For standard devices you can
-usually find a MAKEDEV programme which creates all (hundreds!) of
-these nodes. This means that changes in the kernel must be reflected by
-changes in the MAKEDEV programme, or else the system administrator
-creates device nodes by hand.
-
-The basic problem is that there are two separate databases of
-major and minor numbers. One is in the kernel and one is in /dev (or
-in a MAKEDEV programme, if you want to look at it that way). This is
-duplication of information, which is not good practice.
-Solving this requires a kernel change.
-
-/dev growth
-
-A typical /dev has over 1200 nodes! Most of these devices simply don't
-exist because the hardware is not available. A huge /dev increases the
-time to access devices (I'm just referring to the dentry lookup times
-and the time taken to read inodes off disc: the next subsection shows
-some more horrors).
-
-An example of how big /dev can grow is if we consider SCSI devices:
-
-host           6  bits  (say up to 64 hosts on a really big machine)
-channel        4  bits  (say up to 16 SCSI buses per host)
-id             4  bits
-lun            3  bits
-partition      6  bits
-TOTAL          23 bits
-
-
-This requires 8 Mega (1024*1024) inodes if we want to store all
-possible device nodes. Even if we scrap everything but id,partition
-and assume a single host adapter with a single SCSI bus and only one
-logical unit per SCSI target (id), that's still 10 bits or 1024
-inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so
-that's 256 kBytes of inode storage on disc (assuming real inodes take
-a similar amount of space as VFS inodes). This is actually not so bad,
-because disc is cheap these days. Embedded systems would care about
-256 kBytes of /dev inodes, but you could argue that embedded systems
-would have hand-tuned /dev directories. I've had to do just that on my
-embedded systems, but I would rather just leave it to devfs.
-
-Another issue is the time taken to look up an inode when first
-referenced. Not only does this take time scanning through a list in
-memory, it also incurs seek times to read the inodes off disc.
-This could be solved in user-space using a clever programme which
-scanned the kernel logs and deleted /dev entries which are not
-available and created them when they were available. This programme
-would need to be run every time a new module was loaded, which would
-slow things down a lot.
-
-There is an existing programme called scsidev which will automatically
-create device nodes for SCSI devices. It can do this by scanning files
-in /proc/scsi. Unfortunately, to extend this idea to other device
-nodes would require significant modifications to existing drivers (so
-they too would provide information in /proc). This is a non-trivial
-change (I should know: devfs has had to do something similar). Once
-you go to this much effort, you may as well use devfs itself (which
-also provides this information).  Furthermore, such a system would
-likely be implemented in an ad-hoc fashion, as different drivers will
-provide their information in different ways.
-
-Devfs is much cleaner, because it (naturally) has a uniform mechanism
-to provide this information: the device nodes themselves!
-
-
-Node to driver file_operations translation
-
-There is an important difference between the way disc-based character
-and block nodes and devfs entries make the connection between an entry
-in /dev and the actual device driver.
-
-With the current 8 bit major and minor numbers the connection between
-disc-based c&b nodes and per-major drivers is done through a
-fixed-length table of 128 entries. The various filesystem types set
-the inode operations for c&b nodes to {chr,blk}dev_inode_operations,
-so when a device is opened a few quick levels of indirection bring us
-to the driver file_operations.
-
-For miscellaneous character devices a second step is required: there
-is a scan for the driver entry with the same minor number as the file
-that was opened, and the appropriate minor open method is called. This
-scanning is done *every time* you open a device node. Potentially, you
-may be searching through dozens of misc. entries before you find your
-open method. While not an enormous performance overhead, this does
-seem pointless.
-
-Linux *must* move beyond the 8 bit major and minor barrier,
-somehow. If we simply increase each to 16 bits, then the indexing
-scheme used for major driver lookup becomes untenable, because the
-major tables (one each for character and block devices) would need to
-be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit
-systems). So we would have to use a scheme like that used for
-miscellaneous character devices, which means the search time goes up
-linearly with the average number of major device drivers on your
-system. Not all "devices" are hardware, some are higher-level drivers
-like KGI, so you can get more "devices" without adding hardware.
-You can improve this by creating an ordered (balanced:-)
-binary tree, in which case your search time becomes log(N).
-Alternatively, you can use hashing to speed up the search.
-But why do that search at all if you don't have to? Once again, it
-seems pointless.
-
-Note that devfs doesn't use the major&minor system. For devfs
-entries, the connection is made when you look up the /dev entry. When
-devfs_register() is called, a record holding the entry name and the
-file_operations is appended to an internal table. If the dentry cache doesn't
-have the /dev entry already, this internal table is scanned to get the
-file_operations, and an inode is created. If the dentry cache already
-has the entry, there is *no lookup time* (other than the dentry scan
-itself, but we can't avoid that anyway, and besides Linux dentries
-cream other OS's which don't have them:-). Furthermore, the number of
-node entries in a devfs is only the number of available device
-entries, not the number of *conceivable* entries. Even if you remove
-unnecessary entries in a disc-based /dev, the number of conceivable
-entries remains the same: you just limit yourself in order to save
-space.
-
-Devfs provides a fast connection between a VFS node and the device
-driver, in a scalable way.
-
-/dev as a system administration tool
-
-Right now /dev contains a list of conceivable devices, most of which I
-don't have. Devfs only shows those devices available on my
-system. This means that listing /dev is a handy way of checking what
-devices are available.
-
-Major&minor size
-
-Existing major and minor numbers are limited to 8 bits each. This is
-now a limiting factor for some drivers, particularly the SCSI disc
-driver, which consumes a single major number. Only 16 discs are
-supported, and each disc may have only 15 partitions. Maybe this isn't
-a problem for you, but some of us are building huge Linux systems with
-disc arrays. With devfs an arbitrary pointer can be associated with
-each device entry, which can be used to give an effective 32 bit
-device identifier (i.e. that's like having a 32 bit minor
-number). Since this is private to the kernel, there are no C library
-compatibility issues which you would have with increasing major and
-minor number sizes. See the section on "Allocation of Device Numbers"
-for details on maintaining compatibility with userspace.
-
-Solving this requires a kernel change.
-
-Since writing this, the kernel has been modified so that the SCSI disc
-driver has more major numbers allocated to it and now supports up to
-128 discs. Since these major numbers are non-contiguous (a result of
-unplanned expansion), the implementation is a little more cumbersome
-than originally.
-
-Just like the changes to IPv4 to fix impending limitations in the
-address space, people find ways around the limitations. In the long
-run, however, solutions like IPv6 or devfs can't be put off forever.
-
-Read-only root filesystem
-
-Having your device nodes on the root filesystem means that you can't
-operate properly with a read-only root filesystem. This is because you
-want to change ownerships and protections of tty devices. Existing
-practice prevents you using a CD-ROM as your root filesystem for a
-*real* system. Sure, you can boot off a CD-ROM, but you can't change
-tty ownerships, so it's only good for installing.
-
-Also, you can't use a shared NFS root filesystem for a cluster of
-discless Linux machines (having tty ownerships changed on a common
-/dev is not good). Nor can you embed your root filesystem in a
-ROM-FS.
-
-You can get around this by creating a RAMDISC at boot time, making
-an ext2 filesystem in it, mounting it somewhere and copying the
-contents of /dev into it, then unmounting it and mounting it over
-/dev.
-
-A devfs is a cleaner way of solving this.
-
-Non-Unix root filesystem
-
-Non-Unix filesystems (such as NTFS) can't be used for a root
-filesystem because they variously don't support character and block
-special files or symbolic links. You can't have a separate disc-based
-or RAMDISC-based filesystem mounted on /dev because you need device
-nodes before you can mount these. Devfs can be mounted without any
-device nodes. Devlinks won't work because symlinks aren't supported.
-An alternative solution is to use initrd to mount a RAMDISC initial
-root filesystem (which is populated with a minimal set of device
-nodes), and then construct a new /dev in another RAMDISC, and finally
-switch to your non-Unix root filesystem. This requires clever boot
-scripts and a fragile and conceptually complex boot procedure.
-
-Devfs solves this in a robust and conceptually simple way.
-
-PTY security
-
-Current pseudo-tty (pty) devices are owned by root and read-writable
-by everyone. The user of a pty-pair cannot change
-ownership/protections without being suid-root.
-
-This could be solved with a secure user-space daemon which runs as
-root and does the actual creation of pty-pairs. Such a daemon would
-require modification to *every* programme that wants to use this new
-mechanism. It also slows down creation of pty-pairs.
-
-An alternative is to create a new open_pty() syscall which does much
-the same thing as the user-space daemon. Once again, this requires
-modifications to pty-handling programmes.
-
-The devfs solution allows a device driver to "tag" certain device
-files so that when an unopened device is opened, the ownerships are
-changed to the current euid and egid of the opening process, and the
-protections are changed to the default registered by the driver. When
-the device is closed ownership is set back to root and protections are
-set back to read-write for everybody. No programme need be changed.
-The devpts filesystem provides this auto-ownership feature for Unix98
-ptys. It doesn't support old-style pty devices, nor does it have all
-the other features of devfs.
-
-Intelligent device management
-
-Devfs implements a simple yet powerful protocol for communication with
-a device management daemon (devfsd) which runs in user space. It is
-possible to send a message (either synchronously or asynchronously) to
-devfsd on any event, such as registration/unregistration of device
-entries, opening and closing devices, looking up inodes, scanning
-directories and more. This has many possibilities. Some of these are
-already implemented. See:
-
-
-http://www.atnf.csiro.au/~rgooch/linux/
-
-Device entry registration events can be used by devfsd to change
-permissions of newly-created device nodes. This is one mechanism to
-control device permissions.
-
-Device entry registration/unregistration events can be used to run
-programmes or scripts. This can be used to provide automatic mounting
-of filesystems when new block device media is inserted into the
-drive.
-
-Asynchronous device open and close events can be used to implement
-clever permissions management. For example, the default permissions on
-/dev/dsp do not allow everybody to read from the device. This is
-sensible, as you don't want some remote user recording what you say at
-your console. However, the console user is also prevented from
-recording. This behaviour is not desirable. With asynchronous device
-open and close events, you can have devfsd run a programme or script
-when console devices are opened to change the ownerships for *other*
-device nodes (such as /dev/dsp). On closure, you can run a different
-script to restore permissions. An advantage of this scheme over
-modifying the C library tty handling is that this works even if your
-programme crashes (how many times have you seen the utmp database with
-lingering entries for non-existent logins?).
-
-Synchronous device open events can be used to perform intelligent
-device access protections. Before the device driver open() method is
-called, the daemon must first validate the open attempt, by running an
-external programme or script. This is far more flexible than access
-control lists, as access can be determined on the basis of other
-system conditions instead of just the UID and GID.
-
-Inode lookup events can be used to authenticate module autoload
-requests. Instead of using kmod directly, the event is sent to
-devfsd which can implement an arbitrary authentication before loading
-the module itself.
-
-Inode lookup events can also be used to construct arbitrary
-namespaces, without having to resort to populating devfs with symlinks
-to devices that don't exist.
-
-Speculative Device Scanning
-
-Consider an application (like cdparanoia) that wants to find all
-CD-ROM devices on the system (SCSI, IDE and other types), whether or
-not their respective modules are loaded. The application must
-speculatively open certain device nodes (such as /dev/sr0 for the SCSI
-CD-ROMs) in order to make sure the module is loaded. This requires
-that all Linux distributions follow the standard device naming scheme
-(last time I looked RedHat did things differently). Devfs solves the
-naming problem.
-
-The same application also wants to see which devices are actually
-available on the system. With the existing system it needs to read the
-/dev directory and speculatively open each /dev/sr* device to
-determine if the device exists or not. With a large /dev this is an
-inefficient operation, especially if there are many /dev/sr* nodes. A
-solution like scsidev could reduce the number of /dev/sr* entries (but
-of course that also requires all that inefficient directory scanning).
-
-With devfs, the application can open the /dev/sr directory
-(which triggers the module autoloading if required), and proceed to
-read /dev/sr. Since only the available devices will have
-entries, there are no inefficiencies in directory scanning or device
-openings.
-
------------------------------------------------------------------------------
-
-Who else does it?
-
-FreeBSD has a devfs implementation. Solaris and AIX each have a
-pseudo-devfs (something akin to scsidev but for all devices, with some
-unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's
-IRIX 6.4 and above also have a device filesystem.
-
-While we shouldn't just automatically do something because others do
-it, we should not ignore the work of others either. FreeBSD has a lot
-of competent people working on it, so their opinion should not be
-blithely ignored.
-
------------------------------------------------------------------------------
-
-
-How it works
-
-Registering device entries
-
-For every entry (device node) in a devfs-based /dev a driver must call
-devfs_register(). This adds the name of the device entry, the
-file_operations structure pointer and a few other things to an
-internal table. Device entries may be added and removed at any
-time. When a device entry is registered, it automagically appears in
-any mounted devfs.
-
-Inode lookup
-
-When a lookup operation on an entry is performed and there is no
-driver information for that entry, devfs will attempt to call
-devfsd. If still no driver information can be found then a negative
-dentry is yielded and the next stage operation will be called by the
-VFS (such as create() or mknod() inode methods). If driver information
-can be found, an inode is created (if one does not exist already) and
-all is well.
-
-Manually creating device nodes
-
-The mknod() method allows you to create an ordinary named pipe in the
-devfs, or you can create a character or block special inode if one
-does not already exist. You may wish to create a character or block
-special inode so that you can set permissions and ownership. Later, if
-a device driver registers an entry with the same name, the
-permissions, ownership and times are retained. This is how you can set
-the protections on a device even before the driver is loaded. Once you
-create an inode it appears in the directory listing.
-
-Unregistering device entries
-
-A device driver calls devfs_unregister() to unregister an entry.
-
-Chroot() gaols
-
-2.2.x kernels
-
-The semantics of inode creation are different when devfs is mounted
-with the "explicit" option. Now, when a device entry is registered, it
-will not appear until you use mknod() to create the device. It doesn't
-matter if you mknod() before or after the device is registered with
-devfs_register(). The purpose of this behaviour is to support
-chroot(2) gaols, where you want to mount a minimal devfs inside the
-gaol. Only the devices you specifically want to be available (through
-your mknod() setup) will be accessible.
-
-2.4.x kernels
-
-As of kernel 2.3.99, the VFS has had the ability to rebind parts of
-the global filesystem namespace into another part of the namespace.
-This now works even at the leaf-node level, which means that
-individual files and device nodes may be bound into other parts of the
-namespace. This is like making links, but better, because it works
-across filesystems (unlike hard links) and works through chroot()
-gaols (unlike symbolic links).
-
-Because of these improvements to the VFS, the multi-mount capability
-in devfs is no longer needed. The administrator may create a minimal
-device tree inside a chroot(2) gaol by using VFS bindings. As this
-provides most of the features of the devfs multi-mount capability, I
-removed the multi-mount support code (after issuing an RFC). This
-yielded code size reductions and simplifications.
-
-If you want to construct a minimal chroot() gaol, the following
-command should suffice:
-
-mount --bind /dev/null /gaol/dev/null
-
-
-Repeat for other device nodes you want to expose. Simple!
-
------------------------------------------------------------------------------
-
-
-Operational issues
-
-
-Instructions for the impatient
-
-Nobody likes reading documentation. People just want to get in there
-and play. So this section tells you quickly the steps you need to take
-to run with devfs mounted over /dev. Skip these steps and you will end
-up with a nearly unbootable system. Subsequent sections describe the
-issues in more detail, and discuss non-essential configuration
-options.
-
-Devfsd
-OK, if you're reading this, I assume you want to play with
-devfs. First you should ensure that /usr/src/linux contains a
-recent kernel source tree. Then you need to compile devfsd, the device
-management daemon, available at
-
-http://www.atnf.csiro.au/~rgooch/linux/.
-Because the kernel has a naming scheme
-which is quite different from the old naming scheme, you need to
-install devfsd so that software and configuration files that use the
-old naming scheme will not break.
-
-Compile and install devfsd. You will be provided with a default
-configuration file /etc/devfsd.conf which will provide
-compatibility symlinks for the old naming scheme. Don't change this
-config file unless you know what you're doing. Even if you think you
-do know what you're doing, don't change it until you've followed all
-the steps below and booted a devfs-enabled system and verified that it
-works.
-
-Now edit your main system boot script so that devfsd is started at the
-very beginning (before any filesystem
-checks). /etc/rc.d/rc.sysinit is often the main boot script
-on systems with SysV-style boot scripts. On systems with BSD-style
-boot scripts it is often /etc/rc. Also check
-/sbin/rc.
-
-NOTE that the line you put into the boot
-script should be exactly:
-
-/sbin/devfsd /dev
-
-DO NOT use some special daemon-launching
-programme, otherwise the boot script may not wait for devfsd to finish
-initialising.
-
-System Libraries
-There may still be some problems because of broken software making
-assumptions about device names. In particular, some software does not
-handle devices which are symbolic links. If you are running a libc 5
-based system, install libc 5.4.44 (if you have libc 5.4.46, go back to
-libc 5.4.44, which is actually correct). If you are running a glibc
-based system, make sure you have glibc 2.1.3 or later.
-
-/etc/securetty
-PAM (Pluggable Authentication Modules) is supposed to be a flexible
-mechanism for providing better user authentication and access to
-services. Unfortunately, it's also fragile, complex and undocumented
-(check out RedHat 6.1, and probably other distributions as well). PAM
-has problems with symbolic links. Append the following lines to your
-/etc/securetty file:
-
-vc/1
-vc/2
-vc/3
-vc/4
-vc/5
-vc/6
-vc/7
-vc/8
-
-This will not weaken security. If you have a version of util-linux
-earlier than 2.10.h, please upgrade to 2.10.h or later. If you
-absolutely cannot upgrade, then also append the following lines to
-your /etc/securetty file:
-
-1
-2
-3
-4
-5
-6
-7
-8
-
-This may potentially weaken security by allowing root logins over the
-network (a password is still required, though). However, since there
-are problems with dealing with symlinks, I'm suspicious of the level
-of security offered in any case.
-
-XFree86
-While not essential, it's probably a good idea to upgrade to XFree86
-4.0, as patches went in to make it more devfs-friendly. If you don't,
-you'll probably need to apply the following patch to
-/etc/security/console.perms so that ordinary users can run
-startx. Note that not all distributions have this file (e.g. Debian),
-so if it's not present, don't worry about it.
-
---- /etc/security/console.perms.orig    Sat Apr 17 16:26:47 1999 
-+++ /etc/security/console.perms Fri Feb 25 23:53:55 2000 
-@@ -14,7 +14,7 @@ 
- # man 5 console.perms 
-
- # file classes -- these are regular expressions 
--<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] 
-+<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] 
-
- # device classes -- these are shell-style globs 
- <floppy>=/dev/fd[0-1]* 
-
-If the patch does not apply, then replace the line:
-
-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-
-with:
-
-<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-
-
-Disable devpts
-I've had a report of devpts mounted on /dev/pts not working
-correctly. Since devfs will also manage /dev/pts, there is no
-need to mount devpts as well. You should either edit your
-/etc/fstab so devpts is not mounted, or disable devpts from
-your kernel configuration.
-
-Unsupported drivers
-Not all drivers have devfs support. If you depend on one of these
-drivers, you will need to create a script or tarfile that you can use
-at boot time to create device nodes as appropriate. There is a
-section which describes this. Another
-section lists the drivers which have
-devfs support.
-
-/dev/mouse
-
-Many distributions configure /dev/mouse to be the mouse device
-for XFree86 and GPM. I actually think this is a bad idea, because it
-adds another level of indirection. When looking at a config file, if
-you see /dev/mouse you're left wondering which mouse
-is being referred to. Hence I recommend putting the actual mouse
-device (for example /dev/psaux) into your
-/etc/X11/XF86Config file (and similarly for the GPM
-configuration file).
-
-Alternatively, use the same technique used for unsupported drivers
-described above.
-
-The Kernel
-Finally, you need to make sure devfs is compiled into your kernel. Set
-CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y by
-using your favourite configuration tool (e.g. make config or
-make xconfig), then run make clean and recompile your kernel and
-modules. At boot, devfs will be mounted onto /dev.
-
-If you encounter problems booting (for example if you forgot a
-configuration step), you can pass devfs=nomount at the kernel
-boot command line. This will prevent the kernel from mounting devfs at
-boot time onto /dev.
-
-In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting
-devfs onto /dev is completely safe, and requires no
-configuration changes. One exception to take note of is when
-LABEL= directives are used in /etc/fstab. In this
-case you will be unable to boot properly. This is because the
-mount(8) programme uses /proc/partitions as part of
-the volume label search process, and the device names it finds are not
-available, because setting CONFIG_DEVFS_FS=y changes the names in
-/proc/partitions, irrespective of whether devfs is mounted.
-
-Now you've finished all the steps required. You're now ready to boot
-your shiny new kernel. Enjoy.
-
-Changing the configuration
-
-OK, you've now booted a devfs-enabled system, and everything works.
-Now you may feel like changing the configuration (common targets are
-/etc/fstab and /etc/devfsd.conf). Since you have a
-system that works, if you make any changes and it doesn't work, you
-now know that you only have to restore your configuration files to the
-default and it will work again.
-
-
-Permissions persistence across reboots
-
-If you don't use mknod(2) to create a device file, nor use chmod(2) or
-chown(2) to change the ownerships/permissions, the inode ctime will
-remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime
-later than this has had its ownership/permissions changed. Hence, a
-simple script or programme may be used to tar up all changed inodes,
-prior to shutdown. Although effective, many consider this approach a
-kludge.
-
-A much better approach is to use devfsd to save and restore
-permissions. It may be configured to record changes in permissions and
-will save them in a database (in fact a directory tree), and restore
-these upon boot. This is an efficient method and results in immediate
-saving of current permissions (unlike the tar approach, which saves
-permissions at some unspecified future time).
-
-The default configuration file supplied with devfsd has config entries
-which you may uncomment to enable persistence management.
-
-If you decide to use the tar approach anyway, be aware that tar will
-first unlink(2) an inode before creating a new device node. The
-unlink(2) has the effect of breaking the connection between a devfs
-entry and the device driver. If you use the "devfs=only" boot option,
-you lose access to the device driver, requiring you to reload the
-module. I consider this a bug in tar (there is no real need to
-unlink(2) the inode first).
-
-Alternatively, you can use devfsd to provide more sophisticated
-management of device permissions. You can use devfsd to store
-permissions for whole groups of devices with a single configuration
-entry, rather than the conventional single entry per device entry.
-
-Permissions database stored in mounted-over /dev
-
-If you wish to save and restore your device permissions into the
-disc-based /dev while still mounting devfs onto /dev
-you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or
-later), which has the VFS binding facility. You need to do the
-following to set this up:
-
-
-
-make sure the kernel does not mount devfs at boot time
-
-
-make sure you have a correct /dev/console entry in your
-root file-system (where your disc-based /dev lives)
-
-create the /dev-state directory
-
-
-add the following lines near the very beginning of your boot
-scripts:
-
-mount --bind /dev /dev-state
-mount -t devfs none /dev
-devfsd /dev
-
-
-
-
-add the following lines to your /etc/devfsd.conf file:
-
-REGISTER	^pt[sy]		IGNORE
-CREATE		^pt[sy]		IGNORE
-CHANGE		^pt[sy]		IGNORE
-DELETE		^pt[sy]		IGNORE
-REGISTER	.*		COPY	/dev-state/$devname $devpath
-CREATE		.*		COPY	$devpath /dev-state/$devname
-CHANGE		.*		COPY	$devpath /dev-state/$devname
-DELETE		.*		CFUNCTION GLOBAL unlink /dev-state/$devname
-RESTORE		/dev-state
-
-Note that the sample devfsd.conf file contains these lines,
-as well as other sample configurations you may find useful. See the
-devfsd distribution
-
-
-reboot.
-
-
-
-
-Permissions database stored in normal directory
-
-If you are using an older kernel which doesn't support VFS binding,
-then you won't be able to have the permissions database in a
-mounted-over /dev. However, you can still use a regular
-directory to store the database. The sample /etc/devfsd.conf
-file above may still be used. You will need to create the
-/dev-state directory prior to installing devfsd. If you have
-old permissions in /dev, then just copy (or move) the device
-nodes over to the new directory.
-
-Which method is better?
-
-The best method is to have the permissions database stored in the
-mounted-over /dev. This is because you will not need to copy
-device nodes over to /dev-state, and because it allows you to
-switch between devfs and non-devfs kernels, without requiring you to
-copy permissions between /dev-state (for devfs) and
-/dev (for non-devfs).
-
-
-Dealing with drivers without devfs support
-
-Currently, not all device drivers in the kernel have been modified to
-use devfs. Device drivers which do not yet have devfs support will not
-automagically appear in devfs. The simplest way to create device nodes
-for these drivers is to unpack a tarfile containing the required
-device nodes. You can do this in your boot scripts. All your drivers
-will now work as before.
-
-Hopefully for most people devfs will have enough support so that they
-can mount devfs directly over /dev without losing most functionality
-(i.e. losing access to various devices). As of 22-JAN-1998 (devfs
-patch version 10) I am now running this way. All the devices I have
-are available in devfs, so I don't lose anything.
-
-WARNING: if your configuration requires the old-style device names
-(e.g. /dev/hda1 or /dev/sda1), you must install devfsd and configure
-it to maintain compatibility entries. It is almost certain that you
-will require this. Note that the kernel creates a compatibility entry
-for the root device, so you don't need initrd.
-
-Note that you no longer need to mount devpts if you use Unix98 PTYs,
-as devfs can manage /dev/pts itself. This saves you some RAM, as you
-don't need to compile and install devpts. Note that some versions of
-glibc have a bug with Unix98 pty handling on devfs systems. Contact
-the glibc maintainers for a fix. Glibc 2.1.3 has the fix.
-
-Note also that apart from editing /etc/fstab, other things will need
-to be changed if you *don't* install devfsd. Some software (like the X
-server) hard-wire device names in their source. It really is much
-easier to install devfsd so that compatibility entries are created.
-You can then slowly migrate your system to using the new device names
-(for example, by starting with /etc/fstab), and then limiting the
-compatibility entries that devfsd creates.
-
-IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD
-BEFORE YOU BOOT A DEVFS-ENABLED KERNEL!
-
-Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of
-reports back. Many of these are because people are trying to run
-without devfsd, and hence some things break. Please just run devfsd if
-things break. I want to concentrate on real bugs rather than
-misconfiguration problems at the moment. If people are willing to fix
-bugs/false assumptions in other code (e.g. glibc, X server) and submit
-that to the respective maintainers, that would be great.
-
-
-All the way with Devfs
-
-The devfs kernel patch creates a rationalised device tree. As stated
-above, if you want to keep using the old /dev naming scheme,
-you just need to configure devfsd appropriately (see the man
-page). People who prefer the old names can ignore this section. For
-those of us who like the rationalised names and an uncluttered
-/dev, read on.
-
-If you don't run devfsd, or don't enable compatibility entry
-management, then you will have to configure your system to use the new
-names. For example, you will then need to edit your
-/etc/fstab to use the new disc naming scheme. If you want to
-be able to boot non-devfs kernels, you will need compatibility
-symlinks in the underlying disc-based /dev pointing back to
-the old-style names for when you boot a kernel without devfs.
-
-You can selectively decide which devices you want compatibility
-entries for. For example, you may only want compatibility entries for
-BSD pseudo-terminal devices (otherwise you'll have to patch your C
-library or use Unix98 ptys instead). It's just a matter of putting
-the correct regular expression into /etc/devfsd.conf.
-
-There are other choices of naming schemes that you may prefer. For
-example, I don't use the kernel-supplied
-names, because they are too verbose. A common misconception is
-that the kernel-supplied names are meant to be used directly in
-configuration files. This is not the case. They are designed to
-reflect the layout of the devices attached and to provide easy
-classification.
-
-If you like the kernel-supplied names, that's fine. If you don't then
-you should be using devfsd to construct a namespace more to your
-liking. Devfsd has built-in code to construct a
-namespace that is both logical and easy to
-manage. In essence, it creates a convenient abbreviation of the
-kernel-supplied namespace.
-
-You are of course free to build your own namespace. Devfsd has all the
-infrastructure required to make this easy for you. All you need do is
-write a script. You can even write some C code and devfsd can load the
-shared object as a callable extension.
-
-
-Other Issues
-
-The init programme
-Another thing to take note of is whether your init programme
-creates a Unix socket /dev/telinit. Some versions of init
-create /dev/telinit so that the telinit programme can
-communicate with the init process. If you have such a system you need
-to make sure that devfs is mounted over /dev *before* init
-starts. In other words, you can't leave the mounting of devfs to
-/etc/rc, since this is executed after init. Other
-versions of init require a named pipe /dev/initctl
-which must exist *before* init starts. Once again, you need to
-mount devfs and then create the named pipe *before* init
-starts.
-
-The default behaviour now is not to mount devfs onto /dev at
-boot time for 2.3.x and later kernels. You can correct this with the
-"devfs=mount" boot option. This solves any problems with init,
-and also prevents the dreaded:
-
-Cannot open initial console
-
-message. For 2.2.x kernels where you need to apply the devfs patch,
-the default is to mount.
-
-If you have automatic mounting of devfs onto /dev then you
-may need to create /dev/initctl in your boot scripts. The
-following lines should suffice:
-
-mknod /dev/initctl p
-kill -SIGUSR1 1       # tell init that /dev/initctl now exists
-
-Alternatively, if you don't want the kernel to mount devfs onto
-/dev, then the following procedure is a guideline for how to get
-around /dev/initctl problems:
-
-# cd /sbin
-# mv init init.real
-# cat > init
-#! /bin/sh
-mount -n -t devfs none /dev
-mknod /dev/initctl p
-exec /sbin/init.real "$@"
-[control-D]
-# chmod a+x init
-
-Note that newer versions of init create /dev/initctl
-automatically, so you don't have to worry about this.
-
-Module autoloading
-You will need to configure devfsd to enable module
-autoloading. The following lines should be placed in your
-/etc/devfsd.conf file:
-
-LOOKUP	.*		MODLOAD
-
-
-As of devfsd-v1.3.10, a generic /etc/modules.devfs
-configuration file is installed, which is used by the MODLOAD
-action. This should be sufficient for most configurations. If you
-require further configuration, edit your /etc/modules.conf
-file. The way module autoloading works with devfs is:
-
-
-a process attempts to lookup a device node (e.g. /dev/fred)
-
-
-if that device node does not exist, the full pathname is passed to
-devfsd as a string
-
-
-devfsd will pass the string to the modprobe programme (provided the
-configuration line shown above is present), and specifies that
-/etc/modules.devfs is the configuration file
-
-
-/etc/modules.devfs includes /etc/modules.conf to
-access local configurations
-
-modprobe will search its configuration files, looking for an alias
-that translates the pathname into a module name
-
-
-the translated pathname is then used to load the module.
-
-
-If you wanted a lookup of /dev/fred to load the
-mymod module, you would require the following configuration
-line in /etc/modules.conf:
-
-alias    /dev/fred    mymod
-
-The /etc/modules.devfs configuration file provides many such
-aliases for standard device names. If you look closely at this file,
-you will note that some modules require multiple alias configuration
-lines. This is required to support module autoloading for old and new
-device names.
-
-Mounting root off a devfs device
-If you wish to mount root off a devfs device when you pass the
-"devfs=only" boot option, then you need to pass in the
-"root=<device>" option to the kernel when booting. If you use
-LILO, then you must have this in lilo.conf:
-
-append = "root=<device>"
-
-Surprised? Yep, so was I. It turns out if you have (as most people
-do):
-
-root = <device>
-
-
-then LILO will determine the device number of <device> and will
-write that device number into a special place in the kernel image
-before starting the kernel, and the kernel will use that device number
-to mount the root filesystem. So, using the "append" variety ensures
-that LILO passes the root filesystem device as a string, which devfs
-can then use.
-
-Note that this isn't an issue if you don't pass "devfs=only".
-
-TTY issues
-The ttyname(3) function in some versions of the C library makes
-false assumptions about device entries which are symbolic links.  The
-tty(1) programme is one that depends on this function.  I've
-written a patch to libc 5.4.43 which fixes this. This has been
-included in libc 5.4.44 and a similar fix is in glibc 2.1.3.
-
-
-Kernel Naming Scheme
-
-The kernel provides a default naming scheme. This scheme is designed
-to make it easy to search for specific devices or device types, and to
-view the available devices. Some device types (such as hard discs),
-have a directory of entries, making it easy to see what devices of
-that class are available. Often, the entries are symbolic links into a
-directory tree that reflects the topology of available devices. The
-topological tree is useful for finding how your devices are arranged.
-
-Below is a list of the naming schemes for the most common drivers. A
-list of reserved device names is
-available for reference. Please send email to
-rgooch@atnf.csiro.au to obtain an allocation. Please be
-patient (the maintainer is busy). An alternative name may be allocated
-instead of the requested name, at the discretion of the maintainer.
-
-Disc Devices
-
-All discs, whether SCSI, IDE or whatever, are placed under the
-/dev/discs hierarchy:
-
-	/dev/discs/disc0	first disc
-	/dev/discs/disc1	second disc
-
-
-Each of these entries is a symbolic link to the directory for that
-device. The device directory contains:
-
-	disc	for the whole disc
-	part*	for individual partitions
-
-
-CD-ROM Devices
-
-All CD-ROMs, whether SCSI, IDE or whatever, are placed under the
-/dev/cdroms hierarchy:
-
-	/dev/cdroms/cdrom0	first CD-ROM
-	/dev/cdroms/cdrom1	second CD-ROM
-
-
-Each of these entries is a symbolic link to the real device entry for
-that device.
-
-Tape Devices
-
-All tapes, whether SCSI, IDE or whatever, are placed under the
-/dev/tapes hierarchy:
-
-	/dev/tapes/tape0	first tape
-	/dev/tapes/tape1	second tape
-
-
-Each of these entries is a symbolic link to the directory for that
-device. The device directory contains:
-
-	mt			for mode 0
-	mtl			for mode 1
-	mtm			for mode 2
-	mta			for mode 3
-	mtn			for mode 0, no rewind
-	mtln			for mode 1, no rewind
-	mtmn			for mode 2, no rewind
-	mtan			for mode 3, no rewind
-
-
-SCSI Devices
-
-To uniquely identify any SCSI device requires the following
-information:
-
-  controller	(host adapter)
-  bus		(SCSI channel)
-  target	(SCSI ID)
-  unit		(Logical Unit Number)
-
-
-All SCSI devices are placed under /dev/scsi (assuming devfs
-is mounted on /dev). Hence, a SCSI device with the following
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
-	/dev/scsi/host1/bus2/target3/lun4	device directory
-
-
-Inside this directory, a number of device entries may be created,
-depending on which SCSI device-type drivers were installed.
-
-See the section on the disc naming scheme to see what entries the SCSI
-disc driver creates.
-
-See the section on the tape naming scheme to see what entries the SCSI
-tape driver creates.
-
-The SCSI CD-ROM driver creates:
-
-	cd
-
-
-The SCSI generic driver creates:
-
-	generic
-
-
-IDE Devices
-
-To uniquely identify any IDE device requires the following
-information:
-
-  controller
-  bus		(aka. primary/secondary)
-  target	(aka. master/slave)
-  unit
-
-
-All IDE devices are placed under /dev/ide, and use a similar
-naming scheme to the SCSI subsystem.
-
-XT Hard Discs
-
-All XT discs are placed under /dev/xd. The first XT disc has
-the directory /dev/xd/disc0.
-
-TTY devices
-
-The tty devices now appear as:
-
-  New name                   Old-name                   Device Type
-  --------                   --------                   -----------
-  /dev/tts/{0,1,...}         /dev/ttyS{0,1,...}         Serial ports
-  /dev/cua/{0,1,...}         /dev/cua{0,1,...}          Call out devices
-  /dev/vc/0                  /dev/tty                   Current virtual console
-  /dev/vc/{1,2,...}          /dev/tty{1...63}           Virtual consoles
-  /dev/vcc/{0,1,...}         /dev/vcs{1...63}           Virtual consoles
-  /dev/pty/m{0,1,...}        /dev/ptyp??                PTY masters
-  /dev/pty/s{0,1,...}        /dev/ttyp??                PTY slaves
-
-
-RAMDISCS
-
-The RAMDISCS are placed in their own directory, and are named thus:
-
-  /dev/rd/{0,1,2,...}
-
-
-Meta Devices
-
-The meta devices are placed in their own directory, and are named
-thus:
-
-  /dev/md/{0,1,2,...}
-
-
-Floppy discs
-
-Floppy discs are placed in the /dev/floppy directory.
-
-Loop devices
-
-Loop devices are placed in the /dev/loop directory.
-
-Sound devices
-
-Sound devices are placed in the /dev/sound directory
-(audio, sequencer, ...).
-
-
-Devfsd Naming Scheme
-
-Devfsd provides a naming scheme which is a convenient abbreviation of
-the kernel-supplied namespace. In some
-cases, the kernel-supplied naming scheme is quite convenient, so
-devfsd does not provide another naming scheme. The convenience names
-that devfsd creates are in fact the same names as the original devfs
-kernel patch created (before Linus mandated the Big Name
-Change). These are referred to as "new compatibility entries".
-
-In order to configure devfsd to create these convenience names, the
-following lines should be placed in your /etc/devfsd.conf:
-
-REGISTER	.*		MKNEWCOMPAT
-UNREGISTER	.*		RMNEWCOMPAT
-
-This will cause devfsd to create (and destroy) symbolic links which
-point to the kernel-supplied names.
-
-SCSI Hard Discs
-
-All SCSI discs are placed under /dev/sd (assuming devfs is
-mounted on /dev). Hence, a SCSI disc with the following
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
-	/dev/sd/c1b2t3u4	for the whole disc
-	/dev/sd/c1b2t3u4p5	for the 5th partition
-	/dev/sd/c1b2t3u4p5s6	for the 6th slice in the 5th partition
-
-
-SCSI Tapes
-
-All SCSI tapes are placed under /dev/st. A similar naming
-scheme is used as for SCSI discs. A SCSI tape with the
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
-	/dev/st/c1b2t3u4m0	for mode 0
-	/dev/st/c1b2t3u4m1	for mode 1
-	/dev/st/c1b2t3u4m2	for mode 2
-	/dev/st/c1b2t3u4m3	for mode 3
-	/dev/st/c1b2t3u4m0n	for mode 0, no rewind
-	/dev/st/c1b2t3u4m1n	for mode 1, no rewind
-	/dev/st/c1b2t3u4m2n	for mode 2, no rewind
-	/dev/st/c1b2t3u4m3n	for mode 3, no rewind
-
-
-SCSI CD-ROMs
-
-All SCSI CD-ROMs are placed under /dev/sr. A similar naming
-scheme is used as for SCSI discs. A SCSI CD-ROM with the
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
-	/dev/sr/c1b2t3u4
-
-
-SCSI Generic Devices
-
-The generic (aka. raw) interfaces for all SCSI devices are placed under
-/dev/sg. A similar naming scheme is used as for SCSI discs. A
-SCSI generic device with the parameters: c=1,b=2,t=3,u=4 would appear
-as:
-
-	/dev/sg/c1b2t3u4
-
-
-IDE Hard Discs
-
-All IDE discs are placed under /dev/ide/hd, using a similar
-convention to SCSI discs. The following mappings exist between the new
-and the old names:
-
-	/dev/hda	/dev/ide/hd/c0b0t0u0
-	/dev/hdb	/dev/ide/hd/c0b0t1u0
-	/dev/hdc	/dev/ide/hd/c0b1t0u0
-	/dev/hdd	/dev/ide/hd/c0b1t1u0
-
-
-IDE Tapes
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/mt directory.
-
-IDE CD-ROM
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/cd directory.
-
-IDE Floppies
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/fd directory.
-
-XT Hard Discs
-
-All XT discs are placed under /dev/xd. The first XT disc
-would appear as /dev/xd/c0t0.
-
-
-Old Compatibility Names
-
-The old compatibility names are the legacy device names, such as
-/dev/hda, /dev/sda, /dev/rtc and so on.
-Devfsd can be configured to create compatibility symlinks so that you
-may continue to use the old names in your configuration files and so
-that old applications will continue to function correctly.
-
-In order to configure devfsd to create these legacy names, the
-following lines should be placed in your /etc/devfsd.conf:
-
-REGISTER	.*		MKOLDCOMPAT
-UNREGISTER	.*		RMOLDCOMPAT
-
-This will cause devfsd to create (and destroy) symbolic links which
-point to the kernel-supplied names.
-
-
------------------------------------------------------------------------------
-
-
-Device drivers currently ported
-
-- All miscellaneous character devices support devfs (this is done
-  transparently through misc_register())
-
-- SCSI discs and generic hard discs
-
-- Character memory devices (null, zero, full and so on)
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Loop devices (/dev/loop?)
- 
-- TTY devices (console, serial ports, terminals and pseudo-terminals)
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- SCSI tapes (/dev/scsi and /dev/tapes)
-
-- SCSI CD-ROMs (/dev/scsi and /dev/cdroms)
-
-- SCSI generic devices (/dev/scsi)
-
-- RAMDISCS (/dev/ram?)
-
-- Meta Devices (/dev/md*)
-
-- Floppy discs (/dev/floppy)
-
-- Parallel port printers (/dev/printers)
-
-- Sound devices (/dev/sound)
-  Thanks to Eric Dumas <dumas@linux.eu.org> and
-  C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Joysticks (/dev/joysticks)
-
-- Sparc keyboard (/dev/kbd)
-
-- DSP56001 digital signal processor (/dev/dsp56k)
-
-- Apple Desktop Bus (/dev/adb)
-
-- Coda network file system (/dev/cfs*)
-
-- Virtual console capture devices (/dev/vcc)
-  Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Frame buffer devices (/dev/fb)
-
-- Video capture devices (/dev/v4l)
-
-
------------------------------------------------------------------------------
-
-
-Allocation of Device Numbers
-
-Devfs allows you to write a driver which doesn't need to allocate a
-device number (major&minor numbers) for the internal operation of the
-kernel. However, there are a number of userspace programmes that use
-the device number as a unique handle for a device. An example is the
-find programme, which uses device numbers to determine whether
-an inode is on a different filesystem than another inode. The device
-number used is the one for the block device which a filesystem is
-using. To preserve compatibility with userspace programmes, block
-devices using devfs need to have unique device numbers allocated to
-them. Furthermore, POSIX specifies device numbers, so some kind of
-device number needs to be presented to userspace.
-
-The simplest option (especially when porting drivers to devfs) is to
-keep using the old major and minor numbers. Devfs will take whatever
-values are given for major&minor and pass them onto userspace.
-
-This device number is a 16 bit number, so this leaves plenty of space
-for large numbers of discs and partitions. This scheme can also be
-used for character devices, in particular the tty devices, which are
-currently limited to 256 pseudo-ttys (this limits the total number of
-simultaneous xterms and remote logins).  Note that the device number
-is limited to the range 36864-61439 (majors 144-239), in order to
-avoid any possible conflicts with existing official allocations.
-
-Please note that using dynamically allocated block device numbers may
-break the NFS daemons (both user and kernel mode), which expect dev_t
-for a given device to be constant over the lifetime of remote mounts.
-
-A final note on this scheme: since it doesn't increase the size of
-device numbers, there are no compatibility issues with userspace.
-
------------------------------------------------------------------------------
-
-
-Questions and Answers
-
-
-Making things work
-Alternatives to devfs
-What I don't like about devfs
-How to report bugs
-Strange kernel messages
-Compilation problems with devfsd
-
-
-
-Making things work
-
-Here are some common questions and answers.
-
-
-
-Devfsd doesn't start
-
-Make sure you have compiled and installed devfsd
-Make sure devfsd is being started from your boot
-scripts
-Make sure you have configured your kernel to enable devfs (see
-below)
-Make sure devfs is mounted (see below)
-
-
-Devfsd is not managing all my permissions
-
-Make sure you are capturing the appropriate events. For example,
-device entries created by the kernel generate REGISTER events,
-but those created by devfsd generate CREATE events.
-
-
-Devfsd is not capturing all REGISTER events
-
-See the previous entry: you may need to capture CREATE events.
-
-
-X will not start
-
-Make sure you followed the steps 
-outlined above.
-
-
-Why don't my network devices appear in devfs?
-
-This is not a bug. Network devices have their own, completely separate
-namespace. They are accessed via socket(2) and
-setsockopt(2) calls, and thus require no device nodes. I have
-raised the possibilty of moving network devices into the device
-namespace, but have had no response.
-
-
-How can I test if I have devfs compiled into my kernel?
-
-All filesystems built-in or currently loaded are listed in
-/proc/filesystems. If you see a devfs entry, then
-you know that devfs was compiled into your kernel. If you have
-correctly configured and rebuilt your kernel, then devfs will be
-built-in. If you think you've configured it in, but
-/proc/filesystems doesn't show it, you've made a mistake.
-Common mistakes include:
-
-Using a 2.2.x kernel without applying the devfs patch (if you
-don't know how to patch your kernel, use 2.4.x instead, don't bother
-asking me how to patch)
-Forgetting to set CONFIG_EXPERIMENTAL=y
-Forgetting to set CONFIG_DEVFS_FS=y
-Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs
-to be automatically mounted at boot)
-Editing your .config manually, instead of using make
-config or make xconfig
-Forgetting to run make dep; make clean after changing the
-configuration and before compiling
-Forgetting to compile your kernel and modules
-Forgetting to install your kernel
-Forgetting to install your modules
-
-Please check twice that you've done all these steps before sending in
-a bug report.
-
-
-
-How can I test if devfs is mounted on /dev?
-
-The device filesystem will always create an entry called
-".devfsd", which is used to communicate with the daemon. Even
-if the daemon is not running, this entry will exist. Testing for the
-existence of this entry is the approved method of determining if devfs
-is mounted or not. Note that the type of entry (i.e. regular file,
-character device, named pipe, etc.) may change without notice. Only
-the existence of the entry should be relied upon.
-
-
-When I start devfsd, I see the error:
-Error opening file: ".devfsd"   No such file or directory?
-
-This means that devfs is not mounted. Make sure you have devfs mounted.
-
-
-How do I mount devfs?
-
-First make sure you have devfs compiled into your kernel (see
-above). Then you will either need to:
-
-set CONFIG_DEVFS_MOUNT=y in your kernel config
-pass devfs=mount to your boot loader
-mount devfs manually in your boot scripts with:
-mount -t none devfs /dev
-
-
-
-Mount by volume LABEL=<label> doesn't work with
-devfs
-
-Most probably you are not mounting devfs onto /dev. What
-happens is that if your kernel config has CONFIG_DEVFS_FS=y
-then the contents of /proc/partitions will have the devfs
-names (such as scsi/host0/bus0/target0/lun0/part1). The
-contents of /proc/partitions are used by mount(8) when
-mounting by volume label. If devfs is not mounted on /dev,
-then mount(8) will fail to find devices. The solution is to
-make sure that devfs is mounted on /dev. See above for how to
-do that.
-
-
-I have extra or incorrect entries in /dev
-
-You may have stale entries in your dev-state area. Check for a
-RESTORE configuration line in your devfsd configuration
-(typically /etc/devfsd.conf). If you have this line, check
-the contents of the specified directory for stale entries. Remove
-any entries which are incorrect, then reboot.
-
-
-I get "Unable to open initial console" messages at boot
-
-This usually happens when you don't have devfs automounted onto
-/dev at boot time, and there is no valid
-/dev/console entry on your root file-system. Create a valid
-/dev/console device node.
-
-
-
-
-
-Alternatives to devfs
-
-I've attempted to collate all the anti-devfs proposals and explain
-their limitations. Under construction.
-
-
-Why not just pass device create/remove events to a daemon?
-
-Here the suggestion is to develop an API in the kernel so that devices
-can register create and remove events, and a daemon listens for those
-events. The daemon would then populate/depopulate /dev (which
-resides on disc).
-
-This has several limitations:
-
-
-it only works for modules loaded and unloaded (or devices inserted
-and removed) after the kernel has finished booting. Without a database
-of events, there is no way the daemon could fully populate
-/dev
-
-
-if you add a database to this scheme, the question is then how to
-present that database to user-space. If you make it a list of strings
-with embedded event codes which are passed through a pipe to the
-daemon, then this is only of use to the daemon. I would argue that the
-natural way to present this data is via a filesystem (since many of
-the events will be of a hierarchical nature), such as devfs.
-Presenting the data as a filesystem makes it easy for the user to see
-what is available and also makes it easy to write scripts to scan the
-"database"
-
-
-the tight binding between device nodes and drivers is no longer
-possible (requiring the otherwise perfectly avoidable
-table lookups)
-
-
-you cannot catch inode lookup events on /dev which means
-that module autoloading requires device nodes to be created. This is a
-problem, particularly for drivers where only a few inodes are created
-from a potentially large set
-
-
-this technique can't be used when the root FS is mounted
-read-only
-
-
-
-
-Just implement a better scsidev
-
-This suggestion involves taking the scsidev programme and
-extending it to scan for all devices, not just SCSI devices. The
-scsidev programme works by scanning /proc/scsi
-
-Problems:
-
-
-the kernel does not currently provide a list of all devices
-available. Not all drivers register entries in /proc or
-generate kernel messages
-
-
-there is no uniform mechanism to register devices other than the
-devfs API
-
-
-implementing such an API is then the same as the
-proposal above
-
-
-
-
-Put /dev on a ramdisc
-
-This suggestion involves creating a ramdisc and populating it with
-device nodes and then mounting it over /dev.
-
-Problems:
-
-
-
-this doesn't help when mounting the root filesystem, since you
-still need a device node to do that
-
-
-if you want to use this technique for the root device node as
-well, you need to use initrd. This complicates the booting sequence
-and makes it significantly harder to administer and configure. The
-initrd is essentially opaque, robbing the system administrator of easy
-configuration
-
-
-insufficient information is available to correctly populate the
-ramdisc. So we come back to the
-proposal above to "solve" this
-
-
-a ramdisc-based solution would take more kernel memory, since the
-backing store would be (at best) normal VFS inodes and dentries, which
-take 284 bytes and 112 bytes, respectively, for each entry. Compare
-that to 72 bytes for devfs
-
-
-
-
-Do nothing: there's no problem
-
-Sometimes people can be heard to claim that the existing scheme is
-fine. This is what they're ignoring:
-
-
-device number size (8 bits each for major and minor) is a real
-limitation, and must be fixed somehow. Systems with large numbers of
-SCSI devices, for example, will continue to consume the remaining
-unallocated major numbers. USB will also need to push beyond the 8 bit
-minor limitation
-
-
-simply increasing the device number size is insufficient. Apart
-from causing a lot of pain, it doesn't solve the management issues
-of a /dev with thousands or more device nodes
-
-
-ignoring the problem of a huge /dev will not make it go
-away, and dismisses the legitimacy of a large number of people who
-want a dynamic /dev
-
-
-the standard response then becomes: "write a device management
-daemon", which brings us back to the
-proposal above
-
-
-
-
-What I don't like about devfs
-
-Here are some common complaints about devfs, and some suggestions and
-solutions that may make it more palatable for you. I can't please
-everybody, but I do try :-)
-
-I hate the naming scheme
-
-First, remember that no naming scheme will please everybody. You hate
-the scheme, others love it. Who's to say who's right and who's wrong?
-Ultimately, the person who writes the code gets to choose, and what
-exists now is a combination of the choices made by the
-devfs author and the
-kernel maintainer (Linus).
-
-However, not all is lost. If you want to create your own naming
-scheme, it is a simple matter to write a standalone script, hack
-devfsd, or write a script called by devfsd. You can create whatever
-naming scheme you like.
-
-Further, if you want to remove all traces of the devfs naming scheme
-from /dev, you can mount devfs elsewhere (say
-/devfs) and populate /dev with links into
-/devfs. This population can be automated using devfsd if you
-wish.
-
-You can even use the VFS binding facility to make the links, rather
-than using symbolic links. This way, you don't even have to see the
-"destination" of these symbolic links.
-
-Devfs puts policy into the kernel
-
-There's already policy in the kernel. Device numbers are in fact
-policy (why should the kernel dictate what device numbers I use?).
-Face it, some policy has to be in the kernel. The real difference
-between device names as policy and device numbers as policy is that
-no one will use device numbers directly, because device
-numbers are devoid of meaning to humans and are ugly. At least with
-the devfs device names, (even though you can add your own naming
-scheme) some people will use the devfs-supplied names directly. This
-offends some people :-)
-
-Devfs is bloatware
-
-This is not even remotely true. As shown above,
-both code and data size are quite modest.
-
-
-How to report bugs
-
-If you have (or think you have) a bug with devfs, please follow the
-steps below:
-
-
-
-make sure you have enabled debugging output when configuring your
-kernel. You will need to set (at least) the following config options:
-
-CONFIG_DEVFS_DEBUG=y
-CONFIG_DEBUG_KERNEL=y
-CONFIG_DEBUG_SLAB=y
-
-
-
-please make sure you have the latest devfs patches applied. The
-latest kernel version might not have the latest devfs patches applied
-yet (Linus is very busy)
-
-
-save a copy of your complete kernel logs (preferably by
-using the dmesg programme) for later inclusion in your bug
-report. You may need to use the -s switch to increase the
-internal buffer size so you can capture all the boot messages.
-Don't edit or trim the dmesg output
-
-
-
-
-try booting with devfs=dall passed to the kernel boot
-command line (read the documentation on your bootloader on how to do
-this), and save the result to a file. This may be quite verbose, and
-it may overflow the messages buffer, but try to get as much of it as
-you can
-
-
-send a copy of your devfsd configuration file(s)
-
-send the bug report to me first.
-Don't expect that I will see it if you post it to the linux-kernel
-mailing list. Include all the information listed above, plus
-anything else that you think might be relevant. Put the string
-devfs somewhere in the subject line, so my mail filters mark
-it as urgent
-
-
-
-
-Here is a general guide on how to ask questions in a way that greatly
-improves your chances of getting a reply:
-
-http://www.tuxedo.org/~esr/faqs/smart-questions.html. If you have
-a bug to report, you should also read
-
-http://www.chiark.greenend.org.uk/~sgtatham/bugs.html.
-
-
-Strange kernel messages
-
-You may see devfs-related messages in your kernel logs. Below are some
-messages and what they mean (and what you should do about them, if
-anything).
-
-
-
-devfs_register(fred): could not append to parent, err: -17
-
-You need to check what the error code means, but usually 17 means
-EEXIST. This means that a driver attempted to create an entry
-fred in a directory, but there already was an entry with that
-name. This is often caused by flawed boot scripts which untar a bunch
-of inodes into /dev, as a way to restore permissions. This
-message is harmless, as the device nodes will still
-provide access to the driver (unless you use the devfs=only
-boot option, which is only for dedicated souls:-). If you want to get
-rid of these annoying messages, upgrade to devfsd-v1.3.20 and use the
-recommended RESTORE directive to restore permissions.
-
-
-devfs_mk_dir(bill): using old entry in dir: c1808724 ""
-
-This is similar to the message above, except that a driver attempted
-to create a directory named bill, and the parent directory
-has an entry with the same name. In this case, to ensure that drivers
-continue to work properly, the old entry is re-used and given to the
-driver. In 2.5 kernels, the driver is given a NULL entry, and thus,
-under rare circumstances, may not create the require device nodes.
-The solution is the same as above.
-
-
-
-
-
-Compilation problems with devfsd
-
-Usually, you can compile devfsd just by typing in
-make in the source directory, followed by a make
-install (as root). Sometimes, you may have problems, particularly
-on broken configurations.
-
-
-
-error messages relating to DEVFSD_NOTIFY_DELETE
-
-This happened because you have an ancient set of kernel headers
-installed in /usr/include/linux or /usr/src/linux.
-Install kernel 2.4.10 or later. You may need to pass the
-KERNEL_DIR variable to make (if you did not install
-the new kernel sources as /usr/src/linux), or you may copy
-the devfs_fs.h file in the kernel source tree into
-/usr/include/linux.
-
-
-
-
------------------------------------------------------------------------------
-
-
-Other resources
-
-
-
-Douglas Gilbert has written a useful document at
-
-http://www.torque.net/sg/devfs_scsi.html which
-explores the SCSI subsystem and how it interacts with devfs
-
-
-Douglas Gilbert has written another useful document at
-
-http://www.torque.net/scsi/SCSI-2.4-HOWTO/ which
-discusses the Linux SCSI subsystem in 2.4.
-
-
-Johannes Erdfelt has started a discussion paper on Linux and
-hot-swap devices, describing what the requirements are for a scalable
-solution and how and why he's used devfs+devfsd. Note that this is an
-early draft only, available in plain text form at:
-
-http://johannes.erdfelt.com/hotswap.txt.
-Johannes has promised a HTML version will follow.
-
-
-I presented an invited 
-paper
-at the
-
-2nd Annual Storage Management Workshop held in Miamia, Florida,
-U.S.A. in October 2000.
-
-
-
-
------------------------------------------------------------------------------
-
-
-Translations of this document
-
-This document has been translated into other languages.
-
-
-
-
-The document master (in English) by rgooch@atnf.csiro.au is
-available at
-
-http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
-
-
-
-A Korean translation by viatoris@nownuri.net is available at
-
-http://your.destiny.pe.kr/devfs/devfs.html
-
-
-
-
------------------------------------------------------------------------------
-Most flags courtesy of ITA's 
-Flags of All Countries
-used with permission. 
diff --git a/Documentation/filesystems/devfs/ToDo b/Documentation/filesystems/devfs/ToDo
deleted file mode 100644
index afd5a8f..0000000
--- a/Documentation/filesystems/devfs/ToDo
+++ /dev/null
@@ -1,40 +0,0 @@
-		Device File System (devfs) ToDo List
-
-		Richard Gooch <rgooch@atnf.csiro.au>
-
-			      3-JUL-2000
-
-This is a list of things to be done for better devfs support in the
-Linux kernel. If you'd like to contribute to the devfs, please have a
-look at this list for anything that is unallocated. Also, if there are
-items missing (surely), please contact me so I can add them to the
-list (preferably with your name attached to them:-).
-
-
-- >256 ptys
-  Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Amiga floppy driver (drivers/block/amiflop.c)
-
-- Atari floppy driver (drivers/block/ataflop.c)
-
-- SWIM3 (Super Woz Integrated Machine 3) floppy driver (drivers/block/swim3.c)
-
-- Amiga ZorroII ramdisc driver (drivers/block/z2ram.c)
-
-- Parallel port ATAPI CD-ROM (drivers/block/paride/pcd.c)
-
-- Parallel port ATAPI floppy (drivers/block/paride/pf.c)
-
-- AP1000 block driver (drivers/ap1000/ap.c, drivers/ap1000/ddv.c)
-
-- Archimedes floppy (drivers/acorn/block/fd1772.c)
-
-- MFM hard drive (drivers/acorn/block/mfmhd.c)
-
-- I2O block device (drivers/message/i2o/i2o_block.c)
-
-- ST-RAM device (arch/m68k/atari/stram.c)
-
-- Raw devices
-
diff --git a/Documentation/filesystems/devfs/boot-options b/Documentation/filesystems/devfs/boot-options
deleted file mode 100644
index df3d33b..0000000
--- a/Documentation/filesystems/devfs/boot-options
+++ /dev/null
@@ -1,65 +0,0 @@
-/* -*- auto-fill -*-                                                         */
-
-		Device File System (devfs) Boot Options
-
-		Richard Gooch <rgooch@atnf.csiro.au>
-
-			      18-AUG-2001
-
-
-When CONFIG_DEVFS_DEBUG is enabled, you can pass several boot options
-to the kernel to debug devfs. The boot options are prefixed by
-"devfs=", and are separated by commas. Spaces are not allowed. The
-syntax looks like this:
-
-devfs=<option1>,<option2>,<option3>
-
-and so on. For example, if you wanted to turn on debugging for module
-load requests and device registration, you would do:
-
-devfs=dmod,dreg
-
-You may prefix "no" to any option. This will invert the option.
-
-
-Debugging Options
-=================
-
-These requires CONFIG_DEVFS_DEBUG to be enabled.
-Note that all debugging options have 'd' as the first character. By
-default all options are off. All debugging output is sent to the
-kernel logs. The debugging options do not take effect until the devfs
-version message appears (just prior to the root filesystem being
-mounted).
-
-These are the options:
-
-dmod		print module load requests to <request_module>
-
-dreg		print device register requests to <devfs_register>
-
-dunreg		print device unregister requests to <devfs_unregister>
-
-dchange		print device change requests to <devfs_set_flags>
-
-dilookup	print inode lookup requests
-
-diget		print VFS inode allocations
-
-diunlink	print inode unlinks
-
-dichange	print inode changes
-
-dimknod		print calls to mknod(2)
-
-dall		some debugging turned on
-
-
-Other Options
-=============
-
-These control the default behaviour of devfs. The options are:
-
-mount		mount devfs onto /dev at boot time
-
-only		disable non-devfs device nodes for devfs-capable drivers
diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt
index afb1335..4aecc9b 100644
--- a/Documentation/filesystems/ext3.txt
+++ b/Documentation/filesystems/ext3.txt
@@ -113,6 +113,14 @@
 grpquota
 usrquota
 
+bh		(*)	ext3 associates buffer heads to data pages to
+nobh			(a) cache disk block mapping information
+			(b) link pages into transaction to provide
+			    ordering guarantees.
+			"bh" option forces use of buffer heads.
+			"nobh" option tries to avoid associating buffer
+			heads (supported only for "writeback" mode).
+
 
 Specification
 =============
diff --git a/Documentation/filesystems/fuse.txt b/Documentation/filesystems/fuse.txt
index 33f7431..a584f05 100644
--- a/Documentation/filesystems/fuse.txt
+++ b/Documentation/filesystems/fuse.txt
@@ -18,6 +18,14 @@
   user.  NOTE: this is not the same as mounts allowed with the "user"
   option in /etc/fstab, which is not discussed here.
 
+Filesystem connection:
+
+  A connection between the filesystem daemon and the kernel.  The
+  connection exists until either the daemon dies, or the filesystem is
+  umounted.  Note that detaching (or lazy umounting) the filesystem
+  does _not_ break the connection, in this case it will exist until
+  the last reference to the filesystem is released.
+
 Mount owner:
 
   The user who does the mounting.
@@ -86,16 +94,20 @@
   The default is infinite.  Note that the size of read requests is
   limited anyway to 32 pages (which is 128kbyte on i386).
 
-Sysfs
-~~~~~
+Control filesystem
+~~~~~~~~~~~~~~~~~~
 
-FUSE sets up the following hierarchy in sysfs:
+There's a control filesystem for FUSE, which can be mounted by:
 
-  /sys/fs/fuse/connections/N/
+  mount -t fusectl none /sys/fs/fuse/connections
 
-where N is an increasing number allocated to each new connection.
+Mounting it under the '/sys/fs/fuse/connections' directory makes it
+backwards compatible with earlier versions.
 
-For each connection the following attributes are defined:
+Under the fuse control filesystem each connection has a directory
+named by a unique number.
+
+For each connection the following files exist within this directory:
 
  'waiting'
 
@@ -110,7 +122,47 @@
   connection.  This means that all waiting requests will be aborted an
   error returned for all aborted and new requests.
 
-Only a privileged user may read or write these attributes.
+Only the owner of the mount may read or write these files.
+
+Interrupting filesystem operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If a process issuing a FUSE filesystem request is interrupted, the
+following will happen:
+
+  1) If the request is not yet sent to userspace AND the signal is
+     fatal (SIGKILL or unhandled fatal signal), then the request is
+     dequeued and returns immediately.
+
+  2) If the request is not yet sent to userspace AND the signal is not
+     fatal, then an 'interrupted' flag is set for the request.  When
+     the request has been successfully transferred to userspace and
+     this flag is set, an INTERRUPT request is queued.
+
+  3) If the request is already sent to userspace, then an INTERRUPT
+     request is queued.
+
+INTERRUPT requests take precedence over other requests, so the
+userspace filesystem will receive queued INTERRUPTs before any others.
+
+The userspace filesystem may ignore the INTERRUPT requests entirely,
+or may honor them by sending a reply to the _original_ request, with
+the error set to EINTR.
+
+It is also possible that there's a race between processing the
+original request and its INTERRUPT request.  There are two possibilities:
+
+  1) The INTERRUPT request is processed before the original request is
+     processed
+
+  2) The INTERRUPT request is processed after the original request has
+     been answered
+
+If the filesystem cannot find the original request, it should wait for
+some timeout and/or a number of new requests to arrive, after which it
+should reply to the INTERRUPT request with an EAGAIN error.  In case
+1) the INTERRUPT request will be requeued.  In case 2) the INTERRUPT
+reply will be ignored.
 
 Aborting a filesystem connection
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -139,8 +191,8 @@
   - Use forced umount (umount -f).  Works in all cases but only if
     filesystem is still attached (it hasn't been lazy unmounted)
 
-  - Abort filesystem through the sysfs interface.  Most powerful
-    method, always works.
+  - Abort filesystem through the FUSE control filesystem.  Most
+    powerful method, always works.
 
 How do non-privileged mounts work?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -304,25 +356,7 @@
  |                                    |     for "file"]
  |                                    |    *DEADLOCK*
 
-The solution for this is to allow requests to be interrupted while
-they are in userspace:
-
- |      [interrupted by signal]       |
- |    <fuse_unlink()                  |
- |    [release semaphore]             |    [semaphore acquired]
- |  <sys_unlink()                     |
- |                                    |    >fuse_unlink()
- |                                    |      [queue req on fc->pending]
- |                                    |      [wake up fc->waitq]
- |                                    |      [sleep on req->waitq]
-
-If the filesystem daemon was single threaded, this will stop here,
-since there's no other thread to dequeue and execute the request.
-In this case the solution is to kill the FUSE daemon as well.  If
-there are multiple serving threads, you just have to kill them as
-long as any remain.
-
-Moral: a filesystem which deadlocks, can soon find itself dead.
+The solution for this is to allow the filesystem to be aborted.
 
 Scenario 2 - Tricky deadlock
 ----------------------------
@@ -355,24 +389,14 @@
  |                                    |           [lock page]
  |                                    |           * DEADLOCK *
 
-Solution is again to let the the request be interrupted (not
-elaborated further).
+Solution is basically the same as above.
 
-An additional problem is that while the write buffer is being
-copied to the request, the request must not be interrupted.  This
-is because the destination address of the copy may not be valid
-after the request is interrupted.
+An additional problem is that while the write buffer is being copied
+to the request, the request must not be interrupted/aborted.  This is
+because the destination address of the copy may not be valid after the
+request has returned.
 
-This is solved with doing the copy atomically, and allowing
-interruption while the page(s) belonging to the write buffer are
-faulted with get_user_pages().  The 'req->locked' flag indicates
-when the copy is taking place, and interruption is delayed until
-this flag is unset.
-
-Scenario 3 - Tricky deadlock with asynchronous read
----------------------------------------------------
-
-The same situation as above, except thread-1 will wait on page lock
-and hence it will be uninterruptible as well.  The solution is to
-abort the connection with forced umount (if mount is attached) or
-through the abort attribute in sysfs.
+This is solved by doing the copy atomically, and allowing abort
+while the page(s) belonging to the write buffer are faulted with
+get_user_pages().  The 'req->locked' flag indicates when the copy is
+taking place, and abort is delayed until this flag is unset.
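The three interruption cases that the fuse.txt hunk above documents can be
restated as a small decision function.  This is a minimal userspace sketch,
not kernel code: the enum and function names are hypothetical and do not
appear in the FUSE sources; they only mirror the documented rules.

```c
#include <stdbool.h>

/* Hypothetical names, chosen for illustration only. */
enum fuse_intr_action {
	FUSE_INTR_DEQUEUE,         /* case 1: dequeue the request, return at once */
	FUSE_INTR_FLAG_REQUEST,    /* case 2: set 'interrupted'; INTERRUPT is
				      queued once the request reaches userspace */
	FUSE_INTR_QUEUE_INTERRUPT  /* case 3: queue an INTERRUPT request now */
};

/*
 * Decide what happens when the process that issued a request is
 * interrupted, following the three cases listed in fuse.txt.
 */
static inline enum fuse_intr_action
fuse_intr_decide(bool sent_to_userspace, bool signal_is_fatal)
{
	if (sent_to_userspace)
		return FUSE_INTR_QUEUE_INTERRUPT;
	if (signal_is_fatal)
		return FUSE_INTR_DEQUEUE;
	return FUSE_INTR_FLAG_REQUEST;
}
```

Whether the signal is fatal only matters while the request is still queued in
the kernel; once it has been handed to the daemon, the outcome is the same for
every signal.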
diff --git a/Documentation/filesystems/inotify.txt b/Documentation/filesystems/inotify.txt
index 6d50190..59a919f 100644
--- a/Documentation/filesystems/inotify.txt
+++ b/Documentation/filesystems/inotify.txt
@@ -69,17 +69,135 @@
 	int inotify_rm_watch (int fd, __u32 mask);
 
 
-(iii) Internal Kernel Implementation
+(iii) Kernel Interface
 
-Each inotify instance is associated with an inotify_device structure.
+Inotify's kernel API consists of a set of functions for managing watches and an
+event callback.
+
+To use the kernel API, you must first initialize an inotify instance with a set
+of inotify_operations.  You are given an opaque inotify_handle, which you use
+for any further calls to inotify.
+
+    struct inotify_handle *ih = inotify_init(&my_inotify_ops);
+
+You must provide a function for processing events and a function for destroying
+the inotify watch.
+
+    void handle_event(struct inotify_watch *watch, u32 wd, u32 mask,
+    	              u32 cookie, const char *name, struct inode *inode)
+
+	watch - the pointer to the inotify_watch that triggered this call
+	wd - the watch descriptor
+	mask - describes the event that occurred
+	cookie - an identifier for synchronizing events
+	name - the dentry name for affected files in a directory-based event
+	inode - the affected inode in a directory-based event
+
+    void destroy_watch(struct inotify_watch *watch)
+
+You may add watches by providing a pre-allocated and initialized inotify_watch
+structure and specifying the inode to watch along with an inotify event mask.
+You must pin the inode during the call.  You will likely wish to embed the
+inotify_watch structure in a structure of your own which contains other
+information about the watch.  Once you add an inotify watch, it is immediately
+subject to removal depending on filesystem events.  You must grab a reference if
+you depend on the watch hanging around after the call.
+
+    inotify_init_watch(&my_watch->iwatch);
+    get_inotify_watch(&my_watch->iwatch);	// optional
+    s32 wd = inotify_add_watch(ih, &my_watch->iwatch, inode, mask);
+    put_inotify_watch(&my_watch->iwatch);	// optional
+
+You may use the watch descriptor (wd) or the address of the inotify_watch for
+other inotify operations.  You must not directly read or manipulate data in the
+inotify_watch.  Additionally, you must not call inotify_add_watch() more than
+once for a given inotify_watch structure, unless you have first called either
+inotify_rm_watch() or inotify_rm_wd().
+
+To determine if you have already registered a watch for a given inode, you may
+call inotify_find_watch(), which gives you both the wd and the watch pointer for
+the inotify_watch, or an error if the watch does not exist.
+
+    wd = inotify_find_watch(ih, inode, &watchp);
+
+You may use container_of() on the watch pointer to access your own data
+associated with a given watch.  When an existing watch is found,
+inotify_find_watch() bumps the refcount before releasing its locks.  You must
+put that reference with:
+
+    put_inotify_watch(watchp);
+
+Call inotify_find_update_watch() to update the event mask for an existing watch.
+inotify_find_update_watch() returns the wd of the updated watch, or an error if
+the watch does not exist.
+
+    wd = inotify_find_update_watch(ih, inode, mask);
+
+An existing watch may be removed by calling either inotify_rm_watch() or
+inotify_rm_wd().
+
+    int ret = inotify_rm_watch(ih, &my_watch->iwatch);
+    int ret = inotify_rm_wd(ih, wd);
+
+A watch may be removed while executing your event handler with the following:
+
+    inotify_remove_watch_locked(ih, iwatch);
+
+Call inotify_destroy() to remove all watches from your inotify instance and
+release it.  If there are no outstanding references, inotify_destroy() will call
+your destroy_watch op for each watch.
+
+    inotify_destroy(ih);
+
+When inotify removes a watch, it sends an IN_IGNORED event to your callback.
+You may use this event as an indication to free the watch memory.  Note that
+inotify may remove a watch due to filesystem events, as well as by your request.
+If you use IN_ONESHOT, inotify will remove the watch after the first event, at
+which point you may call the final put_inotify_watch().
+
+(iv) Kernel Interface Prototypes
+
+	struct inotify_handle *inotify_init(struct inotify_operations *ops);
+
+	void inotify_init_watch(struct inotify_watch *watch);
+
+	s32 inotify_add_watch(struct inotify_handle *ih,
+		              struct inotify_watch *watch,
+			      struct inode *inode, u32 mask);
+
+	s32 inotify_find_watch(struct inotify_handle *ih, struct inode *inode,
+			       struct inotify_watch **watchp);
+
+	s32 inotify_find_update_watch(struct inotify_handle *ih,
+				      struct inode *inode, u32 mask);
+
+	int inotify_rm_wd(struct inotify_handle *ih, u32 wd);
+
+	int inotify_rm_watch(struct inotify_handle *ih,
+			     struct inotify_watch *watch);
+
+	void inotify_remove_watch_locked(struct inotify_handle *ih,
+					 struct inotify_watch *watch);
+
+	void inotify_destroy(struct inotify_handle *ih);
+
+	void get_inotify_watch(struct inotify_watch *watch);
+	void put_inotify_watch(struct inotify_watch *watch);
+
+
+(v) Internal Kernel Implementation
+
+Each inotify instance is represented by an inotify_handle structure.
+Inotify's userspace consumers also have an inotify_device which is
+associated with the inotify_handle, and on which events are queued.
 
 Each watch is associated with an inotify_watch structure.  Watches are chained
-off of each associated device and each associated inode.
+off of each associated inotify_handle and each associated inode.
 
-See fs/inotify.c for the locking and lifetime rules.
+See fs/inotify.c and fs/inotify_user.c for the locking and lifetime rules.
 
 
-(iv) Rationale
+(vi) Rationale
 
 Q: What is the design decision behind not tying the watch to the open fd of
    the watched object?
@@ -145,7 +263,7 @@
    file descriptor-based one that allows basic file I/O and poll/select.
    Obtaining the fd and managing the watches could have been done either via a
    device file or a family of new system calls.  We decided to implement a
-   family of system calls because that is the preffered approach for new kernel
+   family of system calls because that is the preferred approach for new kernel
    interfaces.  The only real difference was whether we wanted to use open(2)
    and ioctl(2) or a couple of new system calls.  System calls beat ioctls.
 
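The inotify.txt text above recommends embedding the inotify_watch in a
structure of your own and recovering it with container_of().  The pattern can
be sketched in plain userspace C as follows.  This is an illustration under
stated assumptions: struct inotify_watch here is an empty stand-in for the
opaque kernel structure, and struct my_watch, its label field, and
to_my_watch() are hypothetical names.

```c
#include <stddef.h>

/* Stand-in for the opaque kernel structure; the real layout is private. */
struct inotify_watch {
	int placeholder;
};

/* Hypothetical client structure that embeds the watch. */
struct my_watch {
	const char *label;           /* client-private data */
	struct inotify_watch iwatch; /* embedded member handed to inotify */
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing my_watch from the pointer inotify hands back. */
static inline struct my_watch *to_my_watch(struct inotify_watch *watch)
{
	return container_of(watch, struct my_watch, iwatch);
}
```

An event handler that receives the inotify_watch pointer would call a helper
like to_my_watch() to reach its own per-watch data without keeping a separate
lookup table.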
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 2f38846..5531694 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -50,10 +50,11 @@
 success and negative number in case of error (-EINVAL unless you have more
 informative error value to report).  Call it foo_fill_super().  Now declare
 
-struct super_block foo_get_sb(struct file_system_type *fs_type,
-	int flags, const char *dev_name, void *data)
+int foo_get_sb(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *data, struct vfsmount *mnt)
 {
-	return get_sb_bdev(fs_type, flags, dev_name, data, ext2_fill_super);
+	return get_sb_bdev(fs_type, flags, dev_name, data, foo_fill_super,
+			   mnt);
 }
 
 (or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of
diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.txt b/Documentation/filesystems/ramfs-rootfs-initramfs.txt
index 60ab61e..25981e2 100644
--- a/Documentation/filesystems/ramfs-rootfs-initramfs.txt
+++ b/Documentation/filesystems/ramfs-rootfs-initramfs.txt
@@ -70,11 +70,13 @@
 What is rootfs?
 ---------------
 
-Rootfs is a special instance of ramfs, which is always present in 2.6 systems.
-(It's used internally as the starting and stopping point for searches of the
-kernel's doubly-linked list of mount points.)
+Rootfs is a special instance of ramfs (or tmpfs, if that's enabled), which is
+always present in 2.6 systems.  You can't unmount rootfs for approximately the
+same reason you can't kill the init process; rather than having special code
+to check for and handle an empty list, it's smaller and simpler for the kernel
+to just make sure certain lists can't become empty.
 
-Most systems just mount another filesystem over it and ignore it.  The
+Most systems just mount another filesystem over rootfs and ignore it.  The
 amount of space an empty instance of ramfs takes up is tiny.
 
 What is initramfs?
@@ -92,14 +94,16 @@
 
 All this differs from the old initrd in several ways:
 
-  - The old initrd was a separate file, while the initramfs archive is linked
-    into the linux kernel image.  (The directory linux-*/usr is devoted to
-    generating this archive during the build.)
+  - The old initrd was always a separate file, while the initramfs archive is
+    linked into the linux kernel image.  (The directory linux-*/usr is devoted
+    to generating this archive during the build.)
 
   - The old initrd file was a gzipped filesystem image (in some file format,
-    such as ext2, that had to be built into the kernel), while the new
+    such as ext2, that needed a driver built into the kernel), while the new
     initramfs archive is a gzipped cpio archive (like tar only simpler,
-    see cpio(1) and Documentation/early-userspace/buffer-format.txt).
+    see cpio(1) and Documentation/early-userspace/buffer-format.txt).  The
+    kernel's cpio extraction code is not only extremely small, it's also
+    __init data that can be discarded during the boot process.
 
   - The program run by the old initrd (which was called /initrd, not /init) did
     some setup and then returned to the kernel, while the init program from
@@ -124,13 +128,14 @@
 
 The 2.6 kernel build process always creates a gzipped cpio format initramfs
 archive and links it into the resulting kernel binary.  By default, this
-archive is empty (consuming 134 bytes on x86).  The config option
-CONFIG_INITRAMFS_SOURCE (for some reason buried under devices->block devices
-in menuconfig, and living in usr/Kconfig) can be used to specify a source for
-the initramfs archive, which will automatically be incorporated into the
-resulting binary.  This option can point to an existing gzipped cpio archive, a
-directory containing files to be archived, or a text file specification such
-as the following example:
+archive is empty (consuming 134 bytes on x86).
+
+The config option CONFIG_INITRAMFS_SOURCE (for some reason buried under
+devices->block devices in menuconfig, and living in usr/Kconfig) can be used
+to specify a source for the initramfs archive, which will automatically be
+incorporated into the resulting binary.  This option can point to an existing
+gzipped cpio archive, a directory containing files to be archived, or a text
+file specification such as the following example:
 
   dir /dev 755 0 0
   nod /dev/console 644 0 0 c 5 1
@@ -146,23 +151,84 @@
 Run "usr/gen_init_cpio" (after the kernel build) to get a usage message
 documenting the above file format.
 
-One advantage of the text file is that root access is not required to
+One advantage of the configuration file is that root access is not required to
 set permissions or create device nodes in the new archive.  (Note that those
 two example "file" entries expect to find files named "init.sh" and "busybox" in
 a directory called "initramfs", under the linux-2.6.* directory.  See
 Documentation/early-userspace/README for more details.)
 
-The kernel does not depend on external cpio tools, gen_init_cpio is created
-from usr/gen_init_cpio.c which is entirely self-contained, and the kernel's
-boot-time extractor is also (obviously) self-contained.  However, if you _do_
-happen to have cpio installed, the following command line can extract the
-generated cpio image back into its component files:
+The kernel does not depend on external cpio tools.  If you specify a
+directory instead of a configuration file, the kernel's build infrastructure
+creates a configuration file from that directory (usr/Makefile calls
+scripts/gen_initramfs_list.sh), and proceeds to package up that directory
+using the config file (by feeding it to usr/gen_init_cpio, which is created
+from usr/gen_init_cpio.c).  The kernel's build-time cpio creation code is
+entirely self-contained, and the kernel's boot-time extractor is also
+(obviously) self-contained.
+
+The one thing you might need external cpio utilities installed for is creating
+or extracting your own preprepared cpio files to feed to the kernel build
+(instead of a config file or directory).
+
+The following command line can extract a cpio image (created either by the
+script below or by the kernel build) back into its component files:
 
   cpio -i -d -H newc -F initramfs_data.cpio --no-absolute-filenames
 
+The following shell script can create a prebuilt cpio archive you can
+use in place of the above config file:
+
+  #!/bin/sh
+
+  # Copyright 2006 Rob Landley <rob@landley.net> and TimeSys Corporation.
+  # Licensed under GPL version 2
+
+  if [ $# -ne 2 ]
+  then
+    echo "usage: mkinitramfs directory imagename.cpio.gz"
+    exit 1
+  fi
+
+  if [ -d "$1" ]
+  then
+    echo "creating $2 from $1"
+    (cd "$1"; find . | cpio -o -H newc | gzip) > "$2"
+  else
+    echo "First argument must be a directory"
+    exit 1
+  fi
+
+Note: The cpio man page contains some bad advice that will break your initramfs
+archive if you follow it.  It says "A typical way to generate the list
+of filenames is with the find command; you should give find the -depth option
+to minimize problems with permissions on directories that are unwritable or not
+searchable."  Don't do this when creating initramfs.cpio.gz images, it won't
+work.  The Linux kernel cpio extractor won't create files in a directory that
+doesn't exist, so the directory entries must go before the files that go in
+those directories.  The above script gets them in the right order.
+
+External initramfs images:
+--------------------------
+
+If the kernel has initrd support enabled, an external cpio.gz archive can also
+be passed into a 2.6 kernel in place of an initrd.  In this case, the kernel
+will autodetect the type (initramfs, not initrd) and extract the external cpio
+archive into rootfs before trying to run /init.
+
+This has the memory efficiency advantages of initramfs (no ramdisk block
+device) but the separate packaging of initrd (which is nice if you have
+non-GPL code you'd like to run from initramfs, without conflating it with
+the GPL licensed Linux kernel binary).
+
+It can also be used to supplement the kernel's built-in initramfs image.  The
+files in the external archive will overwrite any conflicting files in
+the built-in initramfs archive.  Some distributors also prefer to customize
+a single kernel image with task-specific initramfs images, without recompiling.
+
 Contents of initramfs:
 ----------------------
 
+An initramfs archive is a complete self-contained root filesystem for Linux.
 If you don't already understand what shared libraries, devices, and paths
 you need to get a minimal root filesystem up and running, here are some
 references:
@@ -176,13 +242,36 @@
 
 I use uClibc (http://www.uclibc.org) and busybox (http://www.busybox.net)
 myself.  These are LGPL and GPL, respectively.  (A self-contained initramfs
-package is planned for the busybox 1.2 release.)
+package is planned for the busybox 1.3 release.)
 
 In theory you could use glibc, but that's not well suited for small embedded
 uses like this.  (A "hello world" program statically linked against glibc is
 over 400k.  With uClibc it's 7k.  Also note that glibc dlopens libnss to do
 name lookups, even when otherwise statically linked.)
 
+A good first step is to get initramfs to run a statically linked "hello world"
+program as init, and test it under an emulator like qemu (www.qemu.org) or
+User Mode Linux, like so:
+
+  cat > hello.c << EOF
+  #include <stdio.h>
+  #include <unistd.h>
+
+  int main(int argc, char *argv[])
+  {
+    printf("Hello world!\n");
+    sleep(999999999);
+  }
+  EOF
+  gcc -static hello.c -o init
+  echo init | cpio -o -H newc | gzip > test.cpio.gz
+  # Testing external initramfs using the initrd loading mechanism.
+  qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero
+
+When debugging a normal root filesystem, it's nice to be able to boot with
+"init=/bin/sh".  The initramfs equivalent is "rdinit=/bin/sh", and it's
+just as useful.
+
 Why cpio rather than tar?
 -------------------------
 
@@ -241,7 +330,7 @@
 Future directions:
 ------------------
 
-Today (2.6.14), initramfs is always compiled in, but not always used.  The
+Today (2.6.16), initramfs is always compiled in, but not always used.  The
 kernel falls back to legacy boot code that is reached only if initramfs does
 not contain an /init program.  The fallback is legacy code, there to ensure a
 smooth transition and allowing early boot functionality to gradually move to
@@ -258,8 +347,9 @@
 
 This kind of complexity (which inevitably includes policy) is rightly handled
 in userspace.  Both klibc and busybox/uClibc are working on simple initramfs
-packages to drop into a kernel build, and when standard solutions are ready
-and widely deployed, the kernel's legacy early boot code will become obsolete
-and a candidate for the feature removal schedule.
+packages to drop into a kernel build.
 
-But that's a while off yet.
+The klibc package has now been accepted into Andrew Morton's 2.6.17-mm tree.
+The kernel's current early boot code (partition detection, etc) will probably
+be migrated into a default initramfs, automatically created and used by the
+kernel build.
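The ordering rule in the initramfs note above (a directory entry must precede the files inside it, which is why "find -depth" output breaks the archive) can be checked mechanically. The following is only a minimal sketch in C; first_misordered() is a hypothetical helper name, not part of the kernel or of cpio, and it assumes the path list is exactly what would be piped into "cpio -o".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the index of the first entry whose parent
 * directory has not appeared earlier in the list, or -1 if the order is
 * safe for an extractor that will not create missing directories.
 * Entries without a '/' (such as "." or "init") have no parent to check.
 */
static int first_misordered(const char *const *paths, int n)
{
	for (int i = 0; i < n; i++) {
		const char *slash = strrchr(paths[i], '/');
		size_t plen;
		int found = 0;

		if (!slash)
			continue;
		plen = (size_t)(slash - paths[i]);
		/* Look for an earlier entry that is exactly the parent path. */
		for (int j = 0; j < i && !found; j++)
			found = strlen(paths[j]) == plen &&
				strncmp(paths[j], paths[i], plen) == 0;
		if (!found)
			return i;
	}
	return -1;
}
```

For the list ".", "./dev", "./dev/console" (plain find order) the helper reports no problem, while the same names in "find -depth" order are rejected at the first entry, because "./dev/console" arrives before "./dev".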
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 3a2e552..1cb7e8b 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -113,8 +113,8 @@
 struct file_system_type {
 	const char *name;
 	int fs_flags;
-        struct super_block *(*get_sb) (struct file_system_type *, int,
-                                       const char *, void *);
+        int (*get_sb) (struct file_system_type *, int,
+                       const char *, void *, struct vfsmount *);
         void (*kill_sb) (struct super_block *);
         struct module *owner;
         struct file_system_type * next;
@@ -211,7 +211,7 @@
         int (*sync_fs)(struct super_block *sb, int wait);
         void (*write_super_lockfs) (struct super_block *);
         void (*unlockfs) (struct super_block *);
-        int (*statfs) (struct super_block *, struct kstatfs *);
+        int (*statfs) (struct dentry *, struct kstatfs *);
         int (*remount_fs) (struct super_block *, int *, char *);
         void (*clear_inode) (struct inode *);
         void (*umount_begin) (struct super_block *);
diff --git a/Documentation/hwmon/abituguru b/Documentation/hwmon/abituguru
new file mode 100644
index 0000000..b2c0d61
--- /dev/null
+++ b/Documentation/hwmon/abituguru
@@ -0,0 +1,87 @@
+Kernel driver abituguru
+=======================
+
+Supported chips:
+  * Abit uGuru revision 1-3 (Hardware Monitor part only)
+    Prefix: 'abituguru'
+    Addresses scanned: ISA 0x0E0
+    Datasheet: Not available, this driver is based on reverse engineering.
+	A "Datasheet" has been written based on the reverse engineering; it
+	should be available in the same directory as this file under the name
+	abituguru-datasheet.
+    Note:
+	The uGuru is a microcontroller with onboard firmware which programs
+	it to behave as a hwmon IC. There are many different revisions of the
+	firmware and thus effectively many different revisions of the uGuru.
+	Below is an incomplete list of which revisions are used on which
+	motherboards:
+	uGuru 1.00    ~ 1.24    (AI7, KV8-MAX3, AN7) (1)
+	uGuru 2.0.0.0 ~ 2.0.4.2 (KV8-PRO)
+	uGuru 2.1.0.0 ~ 2.1.2.8 (AS8, AV8, AA8, AG8, AA8XE, AX8)
+	uGuru 2.2.0.0 ~ 2.2.0.6 (AA8 Fatal1ty)
+	uGuru 2.3.0.0 ~ 2.3.0.9 (AN8)
+	uGuru 3.0.0.0 ~ 3.0.1.2 (AW8, AL8, NI8)
+	uGuru 4.xxxxx?          (AT8 32X) (2)
+	1) For revision 2 and 3 uGurus the driver can autodetect the sensor
+	   type (Volt or Temp) for bank1 sensors; for revision 1 uGurus this
+	   does not always work. For these uGurus the autodetection can be
+	   overridden with the bank1_types module param. For all 3 known
+	   revision 1 motherboards the correct use of this param is:
+	   bank1_types=1,1,0,0,0,0,0,2,0,0,0,0,2,0,0,1
+	   You may also need to specify the fan_sensors option for these boards:
+	   fan_sensors=5
+	2) The current version of the abituguru driver is known to NOT work
+	   on these motherboards.
+
+Authors:
+	Hans de Goede <j.w.r.degoede@hhs.nl>,
+	(Initial reverse engineering done by Olle Sandberg
+	 <ollebull@gmail.com>)
+
+
+Module Parameters
+-----------------
+
+* force: bool		Force detection. Note that this parameter only causes the
+			detection to be skipped; if the uGuru can't be read,
+			the module initialization (insmod) will still fail.
+* bank1_types: int[]	Bank1 sensortype autodetection override:
+			  -1 autodetect (default)
+			   0 volt sensor
+			   1 temp sensor
+			   2 not connected
+* fan_sensors: int	Tell the driver how many fan speed sensors there are
+			on your motherboard. Default: 0 (autodetect).
+* pwms: int		Tell the driver how many fan speed controls (fan
+			pwms) your motherboard has. Default: 0 (autodetect).
+* verbose: int		How verbose should the driver be? (0-3):
+			   0 normal output
+			   1 + verbose error reporting
+			   2 + sensors type probing info (default)
+			   3 + retryable error reporting
+			Default: 2 (the driver is still in the testing phase)
+
+Note: if you need any of the first three options above, please insmod the
+driver with verbose set to 3 and mail me <j.w.r.degoede@hhs.nl> the output of:
+dmesg | grep abituguru
+
+
+Description
+-----------
+
+This driver supports the hardware monitoring features of the Abit uGuru chip
+found on Abit uGuru featuring motherboards (most modern Abit motherboards).
+
+The uGuru chip in reality is a Winbond W83L950D in disguise (despite Abit
+claiming it is "a new microprocessor designed by the ABIT Engineers").
+Unfortunately this doesn't help since the W83L950D is a generic
+microcontroller with a custom Abit application running on it.
+
+Despite Abit not releasing any information regarding the uGuru, Olle
+Sandberg <ollebull@gmail.com> has managed to reverse engineer the sensor part
+of the uGuru. Without his work this driver would not have been possible.
+
+Known Issues
+------------
+
+The voltage and frequency control parts of the Abit uGuru are not supported.
diff --git a/Documentation/hwmon/abituguru-datasheet b/Documentation/hwmon/abituguru-datasheet
new file mode 100644
index 0000000..aef5a9b
--- /dev/null
+++ b/Documentation/hwmon/abituguru-datasheet
@@ -0,0 +1,312 @@
+uGuru datasheet
+===============
+
+First of all, nothing I know about uGuru is based on any help, hints or
+datasheet from Abit. The data I have on uGuru I assembled through
+my weak knowledge of "backwards engineering".
+And just for the record, you may have noticed uGuru isn't a chip developed by
+Abit, as they claim it to be. It's really just a microcontroller (uC) created by
+Winbond (W83L950D). And no, reading the manual for this specific uC or
+mailing Winbond for help won't give any useful data about uGuru, as it is
+the program inside the uC that is responding to calls.
+
+Olle Sandberg <ollebull@gmail.com>, 2005-05-25
+
+
+Original version by Olle Sandberg who did the heavy lifting of the initial
+reverse engineering. This version has been almost fully rewritten for clarity
+and extended with write support and info on more databanks. The write support
+was once again reverse engineered by Olle; the additional databanks have been
+reverse engineered by me. I would like to express my thanks to Olle; this
+document and the Linux driver could not have been written without his efforts.
+
+Note: because of the lack of specs only the sensors part of the uGuru is
+described here and not the CPU / RAM / etc voltage & frequency control.
+
+Hans de Goede <j.w.r.degoede@hhs.nl>, 28-01-2006
+
+
+Detection
+=========
+
+As far as is known, the uGuru always uses the (ISA) I/O-ports
+0xE0 and 0xE4, so we don't have to scan any port-range, just check what the two
+ports are holding for detection. We will refer to 0xE0 as CMD (command-port)
+and 0xE4 as DATA because Abit refers to them with these names.
+
+If DATA holds 0x00 or 0x08 and CMD holds 0x00 or 0xAC a uGuru could be
+present. We have to check for two different values at data-port, because
+after a reboot uGuru will hold 0x00 here, but if the driver is removed and
+later on attached again data-port will hold 0x08, more about this later.
+
+After wider testing of the Linux kernel driver some variants of the uGuru have
+turned up which will hold 0x00 instead of 0xAC at the CMD port, thus we also
+have to test CMD for two different values. On these uGurus DATA will initially
+hold 0x09 and will only hold 0x08 after reading CMD first, so CMD must be read
+first!
+
+To be really sure a uGuru is present a test read of one or more register
+sets should be done.
+
+
+Reading / Writing
+=================
+
+Addressing
+----------
+
+The uGuru has a number of different addressing levels. The first addressing
+level we will call banks. A bank holds data for one or more sensors. The data
+in a bank for a sensor is one or more bytes large.
+
+The number of bytes is fixed for a given bank. You should always read or write
+that many bytes: reading / writing more will fail, and the results when writing
+fewer than the number of bytes for a given bank are undetermined.
+
+See below for all known bank addresses, numbers of sensors in that bank,
+number of bytes data per sensor and contents/meaning of those bytes.
+
+Although both this document and the kernel driver have kept the sensor
+terminology for the addressing within a bank, this is not 100% correct: in
+bank 0x24 for example the addressing within the bank selects a PWM output, not
+a sensor.
+
+Notice that some banks have both a read and a write address. This is how the
+uGuru determines if a read from or a write to the bank is taking place, thus
+when reading you should always use the read address and when writing the
+write address. The write address is always one (1) more than the read address.
+
+
+uGuru ready
+-----------
+
+Before you can read from or write to the uGuru you must first put the uGuru
+in "ready" mode.
+
+To put the uGuru in ready mode first write 0x00 to DATA and then wait for DATA
+to hold 0x09; DATA should read 0x09 within 250 read cycles.
+
+Next CMD _must_ be read and should hold 0xAC. Usually CMD will hold 0xAC on the
+first read, but sometimes it takes a while before CMD holds 0xAC and thus it
+has to be read a number of times (max 50).
+
+After reading CMD, DATA should hold 0x08 which means that the uGuru is ready
+for input. As above DATA will usually hold 0x08 the first read but not always.
+This step can be skipped, but it is undetermined what happens if the uGuru has
+not yet reported 0x08 at DATA and you proceed with writing a bank address.
+
+
+Sending bank and sensor addresses to the uGuru
+----------------------------------------------
+
+First the uGuru must be in "ready" mode as described above, DATA should hold
+0x08 indicating that the uGuru wants input, in this case the bank address.
+
+Next write the bank address to DATA. After the bank address has been written
+wait for DATA to hold 0x08 again indicating that it wants / is ready for
+more input (max 250 reads).
+
+Once DATA holds 0x08 again write the sensor address to CMD.
+
+
+Reading
+-------
+
+First send the bank and sensor addresses as described above.
+Then for each byte of data you want to read wait for DATA to hold 0x01
+which indicates that the uGuru is ready to be read (max 250 reads) and once
+DATA holds 0x01 read the byte from CMD.
+
+Once all bytes have been read DATA will hold 0x09, but there is no reason to
+test for this. Notice that the number of bytes is bank address dependent; see
+above and below.
+
+After completing a successful read it is advised to put the uGuru back in
+ready mode, so that it is ready for the next read / write cycle. This way
+if your program / driver is unloaded and later loaded again the detection
+algorithm described above will still work.
+
+
+
+Writing
+-------
+
+First send the bank and sensor addresses as described above.
+Then for each byte of data you want to write wait for DATA to hold 0x00
+which indicates that the uGuru is ready to be written (max 250 reads) and
+once DATA holds 0x00 write the byte to CMD.
+
+Once all bytes have been written wait for DATA to hold 0x01 (max 250 reads);
+don't ask why this is the way it is.
+
+Once DATA holds 0x01 read CMD; it should hold 0xAC now.
+
+After completing a successful write it is advised to put the uGuru back in
+ready mode, so that it is ready for the next read / write cycle. This way
+if your program / driver is unloaded and later loaded again the detection
+algorithm described above will still work.
+
+
+Gotchas
+-------
+
+After wider testing of the Linux kernel driver some variants of the uGuru have
+turned up which do not hold 0x08 at DATA within 250 reads after writing the
+bank address. With these versions this happens quite frequently; using larger
+timeouts doesn't help, they just go offline for a second or 2, doing some
+internal calibration or whatever. Your code should be prepared to handle
+this and in case of no response in this specific case just go to sleep for a
+while and then retry.
+
+
+Address Map
+===========
+
+Bank 0x20 Alarms (R)
+--------------------
+This bank contains 0 sensors, i.e. the sensor address is ignored (but must be
+written); just use 0. Bank 0x20 contains 3 bytes:
+
+Byte 0:
+This byte holds the alarm flags for sensor 0-7 of Sensor Bank1, with bit 0
+corresponding to sensor 0, 1 to 1, etc.
+
+Byte 1:
+This byte holds the alarm flags for sensor 8-15 of Sensor Bank1, with bit 0
+corresponding to sensor 8, 1 to 9, etc.
+
+Byte 2:
+This byte holds the alarm flags for sensor 0-5 of Sensor Bank2, with bit 0
+corresponding to sensor 0, 1 to 1, etc.
+
+
+Bank 0x21 Sensor Bank1 Values / Readings (R)
+--------------------------------------------
+This bank contains 16 sensors, for each sensor it contains 1 byte.
+So far the following sensors are known to be available on all motherboards:
+Sensor  0 CPU temp
+Sensor  1 SYS temp
+Sensor  3 CPU core volt
+Sensor  4 DDR volt
+Sensor 10 DDR Vtt volt
+Sensor 15 PWM temp
+
+Byte 0:
+This byte holds the reading from the sensor. Sensors in Bank1 can be either
+volt or temp sensors; this is motherboard specific. The uGuru however does
+seem to know (be programmed with) what kind of sensor is attached; see the
+Sensor Bank1 Settings description.
+
+Volt sensors use a linear scale: a reading of 0 corresponds with 0 volt and a
+reading of 255 with 3494 mV. The sensors for higher voltages however are
+connected through a division circuit. The currently known division circuits
+in use result in ranges of: 0-4361mV, 0-6248mV or 0-14510mV. 3.3 volt sources
+use the 0-4361mV range, 5 volt the 0-6248mV and 12 volt the 0-14510mV.
+
+Temp sensors also use a linear scale: a reading of 0 corresponds with 0 degrees
+Celsius and a reading of 255 with 255 degrees Celsius.
+
+
+Bank 0x22 Sensor Bank1 Settings (R)
+Bank 0x23 Sensor Bank1 Settings (W)
+-----------------------------------
+
+This bank contains 16 sensors, for each sensor it contains 3 bytes. Each
+set of 3 bytes contains the settings for the sensor with the same sensor
+address in Bank 0x21.
+
+Byte 0:
+Alarm behaviour for the selected sensor. A 1 enables the described behaviour.
+Bit 0: Give an alarm if measured temp is over the warning threshold	(RW) *
+Bit 1: Give an alarm if measured volt is over the max threshold		(RW) **
+Bit 2: Give an alarm if measured volt is under the min threshold	(RW) **
+Bit 3: Beep if alarm							(RW)
+Bit 4: 1 if alarm cause measured temp is over the warning threshold	(R)
+Bit 5: 1 if alarm cause measured volt is over the max threshold		(R)
+Bit 6: 1 if alarm cause measured volt is under the min threshold	(R)
+Bit 7: Volt sensor: Shutdown if alarm persists for more than 4 seconds	(RW)
+       Temp sensor: Shutdown if temp is over the shutdown threshold	(RW)
+
+*  This bit is only honored/used by the uGuru if a temp sensor is connected
+** This bit is only honored/used by the uGuru if a volt sensor is connected
+Note with some trickery this can be used to find out what kind of sensor is
+detected; see the Linux kernel driver for an example with many comments on
+how to do this.
+
+Byte 1:
+Temp sensor: warning threshold  (scale as bank 0x21)
+Volt sensor: min threshold      (scale as bank 0x21)
+
+Byte 2:
+Temp sensor: shutdown threshold (scale as bank 0x21)
+Volt sensor: max threshold      (scale as bank 0x21)
+
+
+Bank 0x24 PWM outputs for FAN's (R)
+Bank 0x25 PWM outputs for FAN's (W)
+-----------------------------------
+
+This bank contains 3 "sensors", for each sensor it contains 5 bytes.
+Sensor 0 usually controls the CPU fan
+Sensor 1 usually controls the NB (or chipset for single chip) fan
+Sensor 2 usually controls the System fan
+
+Byte 0:
+Flag 0x80 to enable control, Fan runs at 100% when disabled.
+low nibble (temp)sensor address at bank 0x21 used for control.
+
+Byte 1:
+0-255 = 0-12v (linear), specify voltage at which fan will rotate when under
+low threshold temp (specified in byte 3)
+
+Byte 2:
+0-255 = 0-12v (linear), specify voltage at which fan will rotate when above
+high threshold temp (specified in byte 4)
+
+Byte 3:
+Low threshold temp  (scale as bank 0x21)
+
+Byte 4:
+High threshold temp (scale as bank 0x21)
+
+
+Bank 0x26 Sensors Bank2 Values / Readings (R)
+---------------------------------------------
+
+This bank contains 6 sensors (AFAIK), for each sensor it contains 1 byte.
+So far the following sensors are known to be available on all motherboards:
+Sensor 0: CPU fan speed
+Sensor 1: NB (or chipset for single chip) fan speed
+Sensor 2: SYS fan speed
+
+Byte 0:
+This byte holds the reading from the sensor. 0-255 = 0-15300 RPM (linear)
+
+
+Bank 0x27 Sensors Bank2 Settings (R)
+Bank 0x28 Sensors Bank2 Settings (W)
+------------------------------------
+
+This bank contains 6 sensors (AFAIK), for each sensor it contains 2 bytes.
+
+Byte 0:
+Alarm behaviour for the selected sensor. A 1 enables the described behaviour.
+Bit 0: Give an alarm if measured rpm is under the min threshold	(RW)
+Bit 3: Beep if alarm						(RW)
+Bit 7: Shutdown if alarm persists for more than 4 seconds	(RW)
+
+Byte 1:
+min threshold (scale as bank 0x26)
+
+
+Warning for the adventurous
+===========================
+
+A word of caution to those who want to experiment and see if they can figure
+the voltage / clock programming out: I tried reading and only reading banks
+0-0x30 with the reading code used for the sensor banks (0x20-0x28) and this
+resulted in a _permanent_ reprogramming of the voltages. Luckily I had the
+sensors part configured so that it would shut down my system on any out of spec
+voltages, which probably saved my computer (after a reboot I managed to
+immediately enter the bios and reload the defaults). This probably means that
+the read/write cycle for the non sensor part is different from the sensor part.
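The linear scales given in the Address Map of the datasheet above (3494 mV at a raw reading of 255 and the three divider ranges for bank 0x21, 15300 at 255 for bank 0x26, one alarm bit per sensor in bank 0x20) reduce to a few integer conversions. The following is a rough sketch in C under those stated figures only; all function and constant names are made up for illustration and are not taken from the abituguru driver, and the fan unit is assumed to be RPM because the bank 0x27 settings speak of "measured rpm".

```c
#include <stdint.h>

/* Full-scale values in mV for a raw reading of 255, as listed in the text. */
enum {
	UGURU_RANGE_DIRECT = 3494,	/* no division circuit */
	UGURU_RANGE_3V3    = 4361,	/* 3.3 volt sources */
	UGURU_RANGE_5V     = 6248,	/* 5 volt sources */
	UGURU_RANGE_12V    = 14510,	/* 12 volt sources */
};

/* Bank 0x21 volt sensor: linear, 0 -> 0 mV, 255 -> full_scale_mv (rounded). */
static unsigned int uguru_raw_to_mv(uint8_t raw, unsigned int full_scale_mv)
{
	return (raw * full_scale_mv + 127u) / 255u;
}

/* Bank 0x26 fan sensor: 0-255 maps linearly to 0-15300 (assumed RPM). */
static unsigned int uguru_raw_to_rpm(uint8_t raw)
{
	return raw * 15300u / 255u;
}

/*
 * Bank 0x20: bytes 0 and 1 hold the alarm flags for bank1 sensors 0-15,
 * with bit 0 of each byte corresponding to the lowest numbered sensor.
 */
static int uguru_bank1_alarm(const uint8_t alarms[3], unsigned int sensor)
{
	if (sensor > 15)
		return 0;
	return (alarms[sensor / 8] >> (sensor % 8)) & 1;
}
```

Bank 0x21 temp sensors need no conversion, since one count is one degree Celsius. Which divider range applies to a given volt sensor is motherboard specific, so the caller has to supply it.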
diff --git a/Documentation/hwmon/lm70 b/Documentation/hwmon/lm70
new file mode 100644
index 0000000..2bdd3fe
--- /dev/null
+++ b/Documentation/hwmon/lm70
@@ -0,0 +1,31 @@
+Kernel driver lm70
+==================
+
+Supported chip:
+  * National Semiconductor LM70
+    Datasheet: http://www.national.com/pf/LM/LM70.html
+
+Author:
+        Kaiwan N Billimoria <kaiwan@designergraphix.com>
+
+Description
+-----------
+
+This driver implements support for the National Semiconductor LM70
+temperature sensor.
+
+The LM70 temperature sensor chip supports a single temperature sensor.
+It communicates with a host processor (or microcontroller) via an
+SPI/Microwire Bus interface.
+
+Communication with the LM70 is simple: when the temperature is to be sensed,
+the driver accesses the LM70 using SPI communication: 16 SCLK cycles
+comprise the MOSI/MISO loop. At the end of the transfer, the 11-bit 2's
+complement digital temperature (sent via the SIO line), is available in the
+driver for interpretation. This driver makes use of the kernel's in-core
+SPI support.
+
+Thanks to
+---------
+Jean Delvare <khali@linux-fr.org> for mentoring the hwmon-side driver
+development.
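The lm70 description above says that a 16-clock transfer yields an 11-bit 2's complement temperature. As a sketch of how such a word might be interpreted, the C function below assumes that the 11 data bits are left-justified in the 16-bit word (bits 15..5) and that one count is 0.25 degrees Celsius. Both assumptions come from my reading of the LM70 datasheet rather than from the text above, so check them against the datasheet; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Convert a raw 16-bit LM70 transfer to millidegrees Celsius.
 * Assumed layout: bits 15..5 hold the signed temperature, bits 4..0
 * carry no temperature information and are masked off.
 */
static int lm70_raw_to_millidegrees(uint16_t raw)
{
	int v = raw & 0xFFE0;		/* keep the 11 data bits */

	if (v & 0x8000)
		v -= 0x10000;		/* sign-extend from 16 bits */
	return v / 32 * 250;		/* 1 count = 0.25 degC */
}
```

Masking before dividing keeps the division exact, so negative readings are not pulled toward zero by C's truncating integer division.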
diff --git a/Documentation/hwmon/lm83 b/Documentation/hwmon/lm83
index 061d9ed..f7aad14 100644
--- a/Documentation/hwmon/lm83
+++ b/Documentation/hwmon/lm83
@@ -7,6 +7,10 @@
     Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e
     Datasheet: Publicly available at the National Semiconductor website
                http://www.national.com/pf/LM/LM83.html
+  * National Semiconductor LM82
+    Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e
+    Datasheet: Publicly available at the National Semiconductor website
+               http://www.national.com/pf/LM/LM82.html
 
 
 Author: Jean Delvare <khali@linux-fr.org>
@@ -15,10 +19,11 @@
 -----------
 
 The LM83 is a digital temperature sensor. It senses its own temperature as
-well as the temperature of up to three external diodes. It is compatible
-with many other devices such as the LM84 and all other ADM1021 clones.
-The main difference between the LM83 and the LM84 in that the later can
-only sense the temperature of one external diode.
+well as the temperature of up to three external diodes. The LM82 is
+a stripped down version of the LM83 that only supports one external diode.
+Both are compatible with many other devices such as the LM84 and all
+other ADM1021 clones. The main difference between the LM83 and the LM84
+is that the latter can only sense the temperature of one external diode.
 
 Using the adm1021 driver for a LM83 should work, but only two temperatures
 will be reported instead of four.
@@ -30,12 +35,16 @@
 
 Confirmed motherboards:
     SBS         P014
+    SBS         PSL09
 
 Unconfirmed motherboards:
     Gigabyte    GA-8IK1100
     Iwill       MPX2
     Soltek      SL-75DRV5
 
+The LM82 is confirmed to have been found on most AMD Geode reference
+designs and test platforms.
+
 The driver has been successfully tested by Magnus Forsström, who I'd
 like to thank here. More testers will be of course welcome.
 
diff --git a/Documentation/hwmon/smsc47m192 b/Documentation/hwmon/smsc47m192
new file mode 100644
index 0000000..45d6453
--- /dev/null
+++ b/Documentation/hwmon/smsc47m192
@@ -0,0 +1,102 @@
+Kernel driver smsc47m192
+========================
+
+Supported chips:
+  * SMSC LPC47M192 and LPC47M997
+    Prefix: 'smsc47m192'
+    Addresses scanned: I2C 0x2c - 0x2d
+    Datasheet: The datasheet for LPC47M192 is publicly available from
+               http://www.smsc.com/
+               The LPC47M997 is compatible for hardware monitoring.
+
+Author: Hartmut Rick <linux@rick.claranet.de>
+        Special thanks to Jean Delvare for careful checking
+        of the code and many helpful comments and suggestions.
+
+
+Description
+-----------
+
+This driver implements support for the hardware sensor capabilities
+of the SMSC LPC47M192 and LPC47M997 Super-I/O chips.
+
+These chips support 3 temperature channels and 8 voltage inputs
+as well as CPU voltage VID input.
+
+They also have fan monitoring and control capabilities, but
+these features are accessed via the ISA bus and are not supported by this
+driver. Use the 'smsc47m1' driver for fan monitoring and control.
+
+Voltages and temperatures are measured by an 8-bit ADC. The resolution
+of the temperatures is 1 bit per degree C.
+Voltages are scaled such that the nominal voltage corresponds to
+192 counts, i.e. 3/4 of the full range. Thus the available range for
+each voltage channel is 0V ... 255/192*(nominal voltage), the resolution
+is 1 bit per (nominal voltage)/192.
+Both voltage and temperature values are scaled by 1000; the sysfs files
+show voltages in mV and temperatures in units of 0.001 degC.
+
+The +12V analog voltage input channel (in4_input) is multiplexed with
+bit 4 of the encoded CPU voltage. This means that you either get
+a +12V voltage measurement or a 5 bit CPU VID, but not both.
+The default setting is to use the pin as 12V input, and use only 4 bit VID.
+This driver assumes that the information in the configuration register
+is correct, i.e. that the BIOS has updated the configuration if
+the motherboard has this input wired to VID4.
+
+The temperature and voltage readings are updated once every 1.5 seconds.
+Reading them more often repeats the same values.
+
+
+sysfs interface
+---------------
+
+in0_input	- +2.5V voltage input
+in1_input	- CPU voltage input (nominal 2.25V)
+in2_input	- +3.3V voltage input
+in3_input	- +5V voltage input
+in4_input	- +12V voltage input (may be missing if used as VID4)
+in5_input	- Vcc voltage input (nominal 3.3V)
+		  This is the supply voltage of the sensor chip itself.
+in6_input	- +1.5V voltage input
+in7_input	- +1.8V voltage input
+
+in[0-7]_min,
+in[0-7]_max	- lower and upper alarm thresholds for in[0-7]_input reading
+
+		  All voltages are read and written in mV.
+
+in[0-7]_alarm	- alarm flags for voltage inputs
+		  These files read '1' in case of alarm, '0' otherwise.
+
+temp1_input	- chip temperature measured by on-chip diode
+temp[2-3]_input	- temperature measured by external diodes (one of these would
+		  typically be wired to the diode inside the CPU)
+
+temp[1-3]_min,
+temp[1-3]_max	- lower and upper alarm thresholds for temperatures
+
+temp[1-3]_offset - temperature offset registers
+		  The chip adds the offsets stored in these registers to
+		  the corresponding temperature readings.
+		  Note that temp1 and temp2 offsets share the same register,
+		  they cannot both be different from zero at the same time.
+		  Writing a non-zero number to one of them will reset the other
+		  offset to zero.
+
+		  All temperatures and offsets are read and written in
+		  units of 0.001 degC.
+
+temp[1-3]_alarm - alarm flags for temperature inputs, '1' in case of alarm,
+		  '0' otherwise.
+temp[2-3]_input_fault - diode fault flags for temperature inputs 2 and 3.
+		  A fault is detected if the two pins for the corresponding
+		  sensor are open or shorted, or either of the two is shorted
+		  to ground or Vcc. '1' indicates a diode fault.
+
+cpu0_vid	- CPU voltage as received from the CPU
+
+vrm		- CPU VID standard used for decoding CPU voltage
+
+		  The *_min, *_max, *_offset and vrm files can be read and
+		  written, all others are read-only.
diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface
index a0d0ab2..d1d390a 100644
--- a/Documentation/hwmon/sysfs-interface
+++ b/Documentation/hwmon/sysfs-interface
@@ -3,15 +3,15 @@
 
 The libsensors library offers an interface to the raw sensors data
 through the sysfs interface. See libsensors documentation and source for
-more further information. As of writing this document, libsensors
-(from lm_sensors 2.8.3) is heavily chip-dependant. Adding or updating
+further information. As of writing this document, libsensors
+(from lm_sensors 2.8.3) is heavily chip-dependent. Adding or updating
 support for any given chip requires modifying the library's code.
 This is because libsensors was written for the procfs interface
 older kernel modules were using, which wasn't standardized enough.
 Recent versions of libsensors (from lm_sensors 2.8.2 and later) have
 support for the sysfs interface, though.
 
-The new sysfs interface was designed to be as chip-independant as
+The new sysfs interface was designed to be as chip-independent as
 possible.
 
 Note that motherboards vary widely in the connections to sensor chips.
@@ -24,7 +24,7 @@
 can change from motherboard to motherboard, the conversions cannot be
 hard coded into the driver and have to be done in user space.
 
-For this reason, even if we aim at a chip-independant libsensors, it will
+For this reason, even if we aim at a chip-independent libsensors, it will
 still require a configuration file (e.g. /etc/sensors.conf) for proper
 values conversion, labeling of inputs and hiding of unused inputs.
 
@@ -39,15 +39,16 @@
 this standard.
 
 Note that this standard isn't completely established yet, so it is subject
-to changes, even important ones. One more reason to use the library instead
-of accessing sysfs files directly.
+to changes. If you are writing a new hardware monitoring driver whose
+features don't seem to fit in this interface, please contact us with your
+extension proposal. Keep in mind that backward compatibility must be
+preserved.
 
 Each chip gets its own directory in the sysfs /sys/devices tree.  To
-find all sensor chips, it is easier to follow the symlinks from
-/sys/i2c/devices/
+find all sensor chips, it is easier to follow the device symlinks from
+/sys/class/hwmon/hwmon*.
 
-All sysfs values are fixed point numbers.  To get the true value of some
-of the values, you should divide by the specified value.
+All sysfs values are fixed point numbers.
 
 There is only one value per file, unlike the older /proc specification.
 The common scheme for files naming is: <type><number>_<item>. Usual
@@ -69,28 +70,40 @@
 
 -------------------------------------------------------------------------
 
+[0-*]	denotes any positive number starting from 0
+[1-*]	denotes any positive number starting from 1
+RO	read only value
+RW	read/write value
+
+Read/write values may be read-only for some chips, depending on the
+hardware implementation.
+
+All entries are optional, and should only be created in a given driver
+if the chip has the feature.
+
 ************
 * Voltages *
 ************
 
-in[0-8]_min	Voltage min value.
+in[0-*]_min	Voltage min value.
 		Unit: millivolt
-		Read/Write
+		RW
 		
-in[0-8]_max	Voltage max value.
+in[0-*]_max	Voltage max value.
 		Unit: millivolt
-		Read/Write
+		RW
 		
-in[0-8]_input	Voltage input value.
+in[0-*]_input	Voltage input value.
 		Unit: millivolt
-		Read only
+		RO
+		Voltage measured on the chip pin.
 		Actual voltage depends on the scaling resistors on the
 		motherboard, as recommended in the chip datasheet.
 		This varies by chip and by motherboard.
 		Because of this variation, values are generally NOT scaled
 		by the chip driver, and must be done by the application.
 		However, some drivers (notably lm87 and via686a)
-		do scale, with various degrees of success.
+		do scale, because of internal resistors built into a chip.
 		These drivers will output the actual voltage.
 
 		Typical usage:
@@ -104,58 +117,72 @@
 			in7_*	varies
 			in8_*	varies
 
-cpu[0-1]_vid	CPU core reference voltage.
+cpu[0-*]_vid	CPU core reference voltage.
 		Unit: millivolt
-		Read only.
+		RO
 		Not always correct.
 
 vrm		Voltage Regulator Module version number. 
-		Read only.
-		Two digit number, first is major version, second is
-		minor version.
+		RW (but changing it should no longer be necessary)
+		Originally the VRM standard version multiplied by 10, but now
+		an arbitrary number, as not all standards have a version
+		number.
 		Affects the way the driver calculates the CPU core reference
 		voltage from the vid pins.
 
+Also see the Alarms section for status flags associated with voltages.
+
 
 ********
 * Fans *
 ********
 
-fan[1-3]_min	Fan minimum value
+fan[1-*]_min	Fan minimum value
 		Unit: revolution/min (RPM)
-		Read/Write.
+		RW
 
-fan[1-3]_input	Fan input value.
+fan[1-*]_input	Fan input value.
 		Unit: revolution/min (RPM)
-		Read only.
+		RO
 
-fan[1-3]_div	Fan divisor.
+fan[1-*]_div	Fan divisor.
 		Integer value in powers of two (1, 2, 4, 8, 16, 32, 64, 128).
+		RW
 		Some chips only support values 1, 2, 4 and 8.
 		Note that this is actually an internal clock divisor, which
 		affects the measurable speed range, not the read value.
 
+Also see the Alarms section for status flags associated with fans.
+
+
 *******
 * PWM *
 *******
 
-pwm[1-3]	Pulse width modulation fan control.
+pwm[1-*]	Pulse width modulation fan control.
 		Integer value in the range 0 to 255
-		Read/Write
+		RW
 		255 is max or 100%.
 
-pwm[1-3]_enable
+pwm[1-*]_enable
 		Switch PWM on and off.
 		Not always present even if fan*_pwm is.
-		0 to turn off
-		1 to turn on in manual mode
-		2 to turn on in automatic mode
-		Read/Write
+		0: turn off
+		1: turn on in manual mode
+		2+: turn on in automatic mode
+		Check individual chip documentation files for automatic mode details.
+		RW
+
+pwm[1-*]_mode
+		0: DC mode
+		1: PWM mode
+		RW
 
 pwm[1-*]_auto_channels_temp
 		Select which temperature channels affect this PWM output in
 		auto mode. Bitfield, 1 is temp1, 2 is temp2, 4 is temp3 etc...
 		Which values are possible depend on the chip used.
+		RW
 
 pwm[1-*]_auto_point[1-*]_pwm
 pwm[1-*]_auto_point[1-*]_temp
@@ -163,6 +190,7 @@
 		Define the PWM vs temperature curve. Number of trip points is
 		chip-dependent. Use this for chips which associate trip points
 		to PWM output channels.
+		RW
 
 OR
 
@@ -172,50 +200,57 @@
 		Define the PWM vs temperature curve. Number of trip points is
 		chip-dependent. Use this for chips which associate trip points
 		to temperature channels.
+		RW
 
 
 ****************
 * Temperatures *
 ****************
 
-temp[1-3]_type	Sensor type selection.
+temp[1-*]_type	Sensor type selection.
 		Integers 1 to 4 or thermistor Beta value (typically 3435)
-		Read/Write.
+		RW
 		1: PII/Celeron Diode
 		2: 3904 transistor
 		3: thermal diode
 		4: thermistor (default/unknown Beta)
 		Not all types are supported by all chips
 
-temp[1-4]_max	Temperature max value.
-		Unit: millidegree Celcius
-		Read/Write value.
+temp[1-*]_max	Temperature max value.
+		Unit: millidegree Celsius (or millivolt, see below)
+		RW
 
-temp[1-3]_min	Temperature min value.
-		Unit: millidegree Celcius
-		Read/Write value.
+temp[1-*]_min	Temperature min value.
+		Unit: millidegree Celsius
+		RW
 
-temp[1-3]_max_hyst
+temp[1-*]_max_hyst
 		Temperature hysteresis value for max limit.
-		Unit: millidegree Celcius
+		Unit: millidegree Celsius
 		Must be reported as an absolute temperature, NOT a delta
 		from the max value.
-		Read/Write value.
+		RW
 
-temp[1-4]_input Temperature input value.
-		Unit: millidegree Celcius
-		Read only value.
+temp[1-*]_input Temperature input value.
+		Unit: millidegree Celsius
+		RO
 
-temp[1-4]_crit	Temperature critical value, typically greater than
+temp[1-*]_crit	Temperature critical value, typically greater than
 		corresponding temp_max values.
-		Unit: millidegree Celcius
-		Read/Write value.
+		Unit: millidegree Celsius
+		RW
 
-temp[1-2]_crit_hyst
+temp[1-*]_crit_hyst
 		Temperature hysteresis value for critical limit.
-		Unit: millidegree Celcius
+		Unit: millidegree Celsius
 		Must be reported as an absolute temperature, NOT a delta
 		from the critical value.
+		RW
+
+temp[1-4]_offset
+		Temperature offset which is added to the temperature reading
+		by the chip.
+		Unit: millidegree Celsius
 		Read/Write value.
 
 		If there are multiple temperature sensors, temp1_* is
@@ -225,6 +260,17 @@
 		itself, for example the thermal diode inside the CPU or
 		a thermistor nearby.
 
+Some chips measure temperature using external thermistors and an ADC, and
+report the temperature measurement as a voltage. Converting this voltage
+back to a temperature (or the other way around for limits) requires
+mathematical functions not available in the kernel, so the conversion
+must occur in user space. For these chips, all temp* files described
+above should contain values expressed in millivolt instead of millidegree
+Celsius. In other words, such temperature channels are handled as voltage
+channels by the driver.
+
+Also see the Alarms section for status flags associated with temperatures.
+
 
 ************
 * Currents *
@@ -233,25 +279,88 @@
 Note that no known chip provides current measurements as of writing,
 so this part is theoretical, so to say.
 
-curr[1-n]_max	Current max value
+curr[1-*]_max	Current max value
 		Unit: milliampere
-		Read/Write.
+		RW
 
-curr[1-n]_min	Current min value.
+curr[1-*]_min	Current min value.
 		Unit: milliampere
-		Read/Write.
+		RW
 
-curr[1-n]_input	Current input value
+curr[1-*]_input	Current input value
 		Unit: milliampere
-		Read only.
+		RO
 
 
-*********
-* Other *
-*********
+**********
+* Alarms *
+**********
+
+Each channel or limit may have an associated alarm file, containing a
+boolean value. 1 means that an alarm condition exists, 0 means no alarm.
+
+Usually a given chip will either use channel-related alarms, or
+limit-related alarms, not both. The driver should just reflect the hardware
+implementation.
+
+in[0-*]_alarm
+fan[1-*]_alarm
+temp[1-*]_alarm
+		Channel alarm
+		0: no alarm
+		1: alarm
+		RO
+
+OR
+
+in[0-*]_min_alarm
+in[0-*]_max_alarm
+fan[1-*]_min_alarm
+temp[1-*]_min_alarm
+temp[1-*]_max_alarm
+temp[1-*]_crit_alarm
+		Limit alarm
+		0: no alarm
+		1: alarm
+		RO
+
+Each input channel may have an associated fault file. This can be used
+to notify open diodes, unconnected fans etc. where the hardware
+supports it. When this boolean has value 1, the measurement for that
+channel should not be trusted.
+
+in[0-*]_input_fault
+fan[1-*]_input_fault
+temp[1-*]_input_fault
+		Input fault condition
+		0: no fault occurred
+		1: fault condition
+		RO
+
+Some chips also offer the possibility to get beeped when an alarm occurs:
+
+beep_enable	Master beep enable
+		0: no beeps
+		1: beeps
+		RW
+
+in[0-*]_beep
+fan[1-*]_beep
+temp[1-*]_beep
+		Channel beep
+		0: disable
+		1: enable
+		RW
+
+In theory, a chip could provide per-limit beep masking, but no such chip
+has been seen so far.
+
+Old drivers provided a different, non-standard interface to alarms and
+beeps. These interface files are deprecated, but will be kept around
+for compatibility reasons:
 
 alarms		Alarm bitmask.
-		Read only.
+		RO
 		Integer representation of one to four bytes.
 		A '1' bit means an alarm.
 		Chips should be programmed for 'comparator' mode so that
@@ -259,35 +368,26 @@
 		if it is still valid.
 		Generally a direct representation of a chip's internal
 		alarm registers; there is no standard for the position
-		of individual bits.
+		of individual bits. For this reason, the use of this
+		interface file for new drivers is discouraged. Use
+		individual *_alarm and *_fault files instead.
 		Bits are defined in kernel/include/sensors.h.
 
-alarms_in	Alarm bitmask relative to in (voltage) channels
-		Read only
-		A '1' bit means an alarm, LSB corresponds to in0 and so on
-		Prefered to 'alarms' for newer chips
-
-alarms_fan	Alarm bitmask relative to fan channels
-		Read only
-		A '1' bit means an alarm, LSB corresponds to fan1 and so on
-		Prefered to 'alarms' for newer chips
-
-alarms_temp	Alarm bitmask relative to temp (temperature) channels
-		Read only
-		A '1' bit means an alarm, LSB corresponds to temp1 and so on
-		Prefered to 'alarms' for newer chips
-
-beep_enable	Beep/interrupt enable
-		0 to disable.
-		1 to enable.
-		Read/Write
-
 beep_mask	Bitmask for beep.
-		Same format as 'alarms' with the same bit locations.
-		Read/Write
+		Same format as 'alarms' with the same bit locations,
+		use discouraged for the same reason. Use individual
+		*_beep files instead.
+		RW
+
+
+*********
+* Other *
+*********
 
 eeprom		Raw EEPROM data in binary form.
-		Read only.
+		RO
 
 pec		Enable or disable PEC (SMBus only)
-		Read/Write
+		0: disable
+		1: enable
+		RW
diff --git a/Documentation/hwmon/userspace-tools b/Documentation/hwmon/userspace-tools
index 2622aac..19900a8 100644
--- a/Documentation/hwmon/userspace-tools
+++ b/Documentation/hwmon/userspace-tools
@@ -6,31 +6,32 @@
 are also connected directly through the ISA bus.
 
 The kernel drivers make the data from the sensor chips available in the /sys
-virtual filesystem. Userspace tools are then used to display or set or the
-data in a more friendly manner.
+virtual filesystem. Userspace tools are then used to display the measured
+values or configure the chips in a more friendly manner.
 
 Lm-sensors
 ----------
 
-Core set of utilites that will allow you to obtain health information,
+Core set of utilities that will allow you to obtain health information,
 setup monitoring limits etc. You can get them on their homepage
 http://www.lm-sensors.nu/ or as a package from your Linux distribution.
 
 If from website:
-Get lmsensors from project web site. Please note, you need only userspace
-part, so compile with "make user_install" target.
+Get lm-sensors from project web site. Please note, you need only userspace
+part, so compile with "make user" and install with "make user_install".
 
 General hints to get things working:
 
 0) get lm-sensors userspace utils
-1) compile all drivers in I2C section as modules in your kernel
+1) compile all drivers in I2C and Hardware Monitoring sections as modules
+   in your kernel
 2) run sensors-detect script, it will tell you what modules you need to load.
 3) load them and run "sensors" command, you should see some results.
 4) fix sensors.conf, labels, limits, fan divisors
 5) if any more problems consult FAQ, or documentation
 
-Other utilites
---------------
+Other utilities
+---------------
 
 If you want some graphical indicators of system health look for applications
 like: gkrellm, ksensors, xsensors, wmtemp, wmsensors, wmgtemp, ksysguardd,
diff --git a/Documentation/hwmon/w83791d b/Documentation/hwmon/w83791d
new file mode 100644
index 0000000..83a3836
--- /dev/null
+++ b/Documentation/hwmon/w83791d
@@ -0,0 +1,113 @@
+Kernel driver w83791d
+=====================
+
+Supported chips:
+  * Winbond W83791D
+    Prefix: 'w83791d'
+    Addresses scanned: I2C 0x2c - 0x2f
+    Datasheet: http://www.winbond-usa.com/products/winbond_products/pdfs/PCIC/W83791Da.pdf
+
+Author: Charles Spirakis <bezaur@gmail.com>
+
+This driver was derived from the w83781d.c and w83792d.c source files.
+
+Credits:
+  w83781d.c:
+    Frodo Looijaard <frodol@dds.nl>,
+    Philip Edelbrock <phil@netroedge.com>,
+    and Mark Studebaker <mdsxyz123@yahoo.com>
+  w83792d.c:
+    Chunhao Huang <DZShen@Winbond.com.tw>,
+    Rudolf Marek <r.marek@sh.cvut.cz>
+
+Module Parameters
+-----------------
+
+* init boolean
+  (default 0)
+  Use 'init=1' to have the driver do extra software initializations.
+  The default behavior is to do the minimum initialization possible
+  and depend on the BIOS to properly set up the chip. If you know you
+  have a w83791d and you're having problems, try init=1 before trying
+  reset=1.
+
+* reset boolean
+  (default 0)
+  Use 'reset=1' to reset the chip (via index 0x40, bit 7). The default
+  behavior is no chip reset to preserve BIOS settings.
+
+* force_subclients=bus,caddr,saddr,saddr
+  This is used to force the i2c addresses for subclients of
+  a certain chip. Example usage is `force_subclients=0,0x2f,0x4a,0x4b'
+  to force the subclients of chip 0x2f on bus 0 to i2c addresses
+  0x4a and 0x4b.
+
+
+Description
+-----------
+
+This driver implements support for the Winbond W83791D chip.
+
+Detection of the chip can sometimes be foiled because it can be in an
+internal state that allows no clean access (Bank with ID register is not
+currently selected). If you know the address of the chip, use a 'force'
+parameter; this will put it into a more well-behaved state first.
+
+The driver implements three temperature sensors, five fan rotation speed
+sensors, and ten voltage sensors.
+
+Temperatures are measured in degrees Celsius and measurement resolution is 1
+degC for temp1 and 0.5 degC for temp2 and temp3. An alarm is triggered when
+the temperature gets higher than the Overtemperature Shutdown value; it stays
+on until the temperature falls below the Hysteresis value.
+
+Fan rotation speeds are reported in RPM (rotations per minute). An alarm is
+triggered if the rotation speed has dropped below a programmable limit. Fan
+readings can be divided by a programmable divider (1, 2, 4, 8 for fan 1/2/3
+and 1, 2, 4, 8, 16, 32, 64 or 128 for fan 4/5) to give the readings more
+range or accuracy.
+
+Voltage sensors (also known as IN sensors) report their values in millivolts.
+An alarm is triggered if the voltage has crossed a programmable minimum
+or maximum limit.
+
+Alarms are provided as output from a "realtime status register". The
+following bits are defined:
+
+bit - alarm on:
+0  - Vcore
+1  - VINR0
+2  - +3.3VIN
+3  - 5VDD
+4  - temp1
+5  - temp2
+6  - fan1
+7  - fan2
+8  - +12VIN
+9  - -12VIN
+10 - -5VIN
+11 - fan3
+12 - chassis
+13 - temp3
+14 - VINR1
+15 - reserved
+16 - tart1
+17 - tart2
+18 - tart3
+19 - VSB
+20 - VBAT
+21 - fan4
+22 - fan5
+23 - reserved
+
+When an alarm goes off, you can be warned by a beeping signal through your
+computer speaker. It is possible to enable all beeping globally, or only
+the beeping for some alarms.
+
+The driver only reads the chip values every 3 seconds; reading them more
+often will do no harm, but will return 'old' values.
+
+W83791D TODO:
+-------------
+Provide a patch for per-file alarms as discussed on the mailing list
+Provide a patch for smart-fan control (still need appropriate motherboard/fans)
diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801
index fd4b271..e46c234 100644
--- a/Documentation/i2c/busses/i2c-i801
+++ b/Documentation/i2c/busses/i2c-i801
@@ -21,8 +21,7 @@
 Module Parameters
 -----------------
 
-* force_addr: int
-  Forcibly enable the ICH at the given address. EXTREMELY DANGEROUS!
+None.
 
 
 Description
diff --git a/Documentation/i2c/busses/i2c-nforce2 b/Documentation/i2c/busses/i2c-nforce2
index d751282..cd49c42 100644
--- a/Documentation/i2c/busses/i2c-nforce2
+++ b/Documentation/i2c/busses/i2c-nforce2
@@ -7,6 +7,8 @@
   * nForce3 250Gb MCP          10de:00E4 
   * nForce4 MCP                10de:0052
   * nForce4 MCP-04             10de:0034
+  * nForce4 MCP51              10de:0264
+  * nForce4 MCP55              10de:0368
 
 Datasheet: not publically available, but seems to be similar to the
            AMD-8111 SMBus 2.0 adapter.
diff --git a/Documentation/i2c/busses/i2c-ocores b/Documentation/i2c/busses/i2c-ocores
new file mode 100644
index 0000000..cfcebb1
--- /dev/null
+++ b/Documentation/i2c/busses/i2c-ocores
@@ -0,0 +1,51 @@
+Kernel driver i2c-ocores
+
+Supported adapters:
+  * OpenCores.org I2C controller by Richard Herveille (see datasheet link)
+    Datasheet: http://www.opencores.org/projects.cgi/web/i2c/overview
+
+Author: Peter Korsgaard <jacmet@sunsite.dk>
+
+Description
+-----------
+
+i2c-ocores is an i2c bus driver for the OpenCores.org I2C controller
+IP core by Richard Herveille.
+
+Usage
+-----
+
+i2c-ocores uses the platform bus, so you need to provide a struct
+platform_device with the base address and interrupt number. The
+dev.platform_data of the device should also point to a struct
+ocores_i2c_platform_data (see linux/i2c-ocores.h) describing the
+distance between registers and the input clock speed.
+
+E.g. something like:
+
+static struct resource ocores_resources[] = {
+	[0] = {
+		.start	= MYI2C_BASEADDR,
+		.end	= MYI2C_BASEADDR + 8,
+		.flags	= IORESOURCE_MEM,
+	},
+	[1] = {
+		.start	= MYI2C_IRQ,
+		.end	= MYI2C_IRQ,
+		.flags	= IORESOURCE_IRQ,
+	},
+};
+
+static struct ocores_i2c_platform_data myi2c_data = {
+	.regstep	= 2,		/* two bytes between registers */
+	.clock_khz	= 50000,	/* input clock of 50MHz */
+};
+
+static struct platform_device myi2c = {
+	.name			= "ocores-i2c",
+	.dev = {
+		.platform_data	= &myi2c_data,
+	},
+	.num_resources		= ARRAY_SIZE(ocores_resources),
+	.resource		= ocores_resources,
+};
diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4
index a1c8f58..9214763 100644
--- a/Documentation/i2c/busses/i2c-piix4
+++ b/Documentation/i2c/busses/i2c-piix4
@@ -6,6 +6,8 @@
     Datasheet: Publicly available at the Intel website
   * ServerWorks OSB4, CSB5, CSB6 and HT-1000 southbridges
     Datasheet: Only available via NDA from ServerWorks
+  * ATI IXP southbridges IXP200, IXP300, IXP400
+    Datasheet: Not publicly available
   * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge
     Datasheet: Publicly available at the SMSC website http://www.smsc.com
 
@@ -21,8 +23,6 @@
   Forcibly enable the PIIX4. DANGEROUS!
 * force_addr: int
   Forcibly enable the PIIX4 at the given address. EXTREMELY DANGEROUS!
-* fix_hstcfg: int
-  Fix config register. Needed on some boards (Force CPCI735).
 
 
 Description
@@ -63,10 +63,36 @@
 The PIIX/PIIX3 does not implement an SMBus or I2C bus, so you can't use
 this driver on those mainboards.
 
-The ServerWorks Southbridges, the Intel 440MX, and the Victory766 are
+The ServerWorks Southbridges, the Intel 440MX, and the Victory66 are
 identical to the PIIX4 in I2C/SMBus support.
 
-A few OSB4 southbridges are known to be misconfigured by the BIOS. In this
-case, you have you use the fix_hstcfg module parameter. Do not use it
-unless you know you have to, because in some cases it also breaks
-configuration on southbridges that don't need it.
+If you own a Force CPCI735 motherboard or another OSB4-based system, you may
+need to change the SMBus Interrupt Select register so the SMBus controller
+uses the SMI mode.
+
+1) Use lspci command and locate the PCI device with the SMBus controller:
+   00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 4f)
+   The line may vary for different chipsets. Please consult the driver source
+   for all possible PCI ids (and lspci -n to match them). Let's assume the
+   device is located at 00:0f.0.
+2) Now you just need to change the value in the 0xD2 register. Get it first with
+   command: lspci -xxx -s 00:0f.0
+   If the value is 0x3 then you need to change it to 0x1
+   setpci  -s 00:0f.0 d2.b=1
+
+Please note that you don't need to do that in all cases, just when the SMBus is
+not working properly.
+
+
+Hardware-specific issues
+------------------------
+
+This driver will refuse to load on IBM systems with an Intel PIIX4 SMBus.
+Some of these machines have an RFID EEPROM (24RF08) connected to the SMBus,
+which can easily get corrupted due to a state machine bug. These are mostly
+Thinkpad laptops, but desktop systems may also be affected. We have no list
+of all affected systems, so the only safe solution was to prevent access to
+the SMBus on all IBM systems (detected using DMI data).
+
+For additional information, read:
+http://www2.lm-sensors.nu/~lm78/cvs/lm_sensors2/README.thinkpad
diff --git a/Documentation/i2c/busses/i2c-sis96x b/Documentation/i2c/busses/i2c-sis96x
index 00a009b..08d7b2d 100644
--- a/Documentation/i2c/busses/i2c-sis96x
+++ b/Documentation/i2c/busses/i2c-sis96x
@@ -42,8 +42,8 @@
 chipsets as well: 635, and 635T. If anyone owns a board with those chips
 AND is willing to risk crashing & burning an otherwise well-behaved kernel
 in the name of progress... please contact me at <mhoffman@lightlink.com> or
-via the project's mailing list: <lm-sensors@lm-sensors.org>.  Please
-send bug reports and/or success stories as well.
+via the project's mailing list: <i2c@lm-sensors.org>.  Please send bug
+reports and/or success stories as well.
 
 
 TO DOs
diff --git a/Documentation/i2c/busses/scx200_acb b/Documentation/i2c/busses/scx200_acb
index f50e699..7c07883d 100644
--- a/Documentation/i2c/busses/scx200_acb
+++ b/Documentation/i2c/busses/scx200_acb
@@ -2,14 +2,31 @@
 
 Author: Christer Weinigel <wingel@nano-system.com>
 
+The driver supersedes the older, never merged driver named i2c-nscacb.
+
 Module Parameters
 -----------------
 
-* base: int
+* base: up to 4 ints
   Base addresses for the ACCESS.bus controllers on SCx200 and SC1100 devices
 
+  By default the driver uses two base addresses 0x820 and 0x840.
+  If you want only one base address, specify the second as 0 so as to
+  override this default.
+
 Description
 -----------
 
 Enable the use of the ACCESS.bus controller on the Geode SCx200 and
 SC1100 processors and the CS5535 and CS5536 Geode companion devices.
+
+Device-specific notes
+---------------------
+
+The SC1100 WRAP boards are known to use base addresses 0x810 and 0x820.
+If the scx200_acb driver is built into the kernel, add the following
+parameter to your boot command line:
+  scx200_acb.base=0x810,0x820
+If the scx200_acb driver is built as a module, add the following line to
+the file /etc/modprobe.conf instead:
+  options scx200_acb base=0x810,0x820
diff --git a/Documentation/ia64/aliasing.txt b/Documentation/ia64/aliasing.txt
new file mode 100644
index 0000000..38f9a52
--- /dev/null
+++ b/Documentation/ia64/aliasing.txt
@@ -0,0 +1,208 @@
+	         MEMORY ATTRIBUTE ALIASING ON IA-64
+
+			   Bjorn Helgaas
+		       <bjorn.helgaas@hp.com>
+			    May 4, 2006
+
+
+MEMORY ATTRIBUTES
+
+    Itanium supports several attributes for virtual memory references.
+    The attribute is part of the virtual translation, i.e., it is
+    contained in the TLB entry.  The ones of most interest to the Linux
+    kernel are:
+
+	WB		Write-back (cacheable)
+	UC		Uncacheable
+	WC		Write-coalescing
+
+    System memory typically uses the WB attribute.  The UC attribute is
+    used for memory-mapped I/O devices.  The WC attribute is uncacheable
+    like UC is, but writes may be delayed and combined to increase
+    performance for things like frame buffers.
+
+    The Itanium architecture requires that we avoid accessing the same
+    page with both a cacheable mapping and an uncacheable mapping[1].
+
+    The design of the chipset determines which attributes are supported
+    on which regions of the address space.  For example, some chipsets
+    support either WB or UC access to main memory, while others support
+    only WB access.
+
+MEMORY MAP
+
+    Platform firmware describes the physical memory map and the
+    supported attributes for each region.  At boot-time, the kernel uses
+    the EFI GetMemoryMap() interface.  ACPI can also describe memory
+    devices and the attributes they support, but Linux/ia64 currently
+    doesn't use this information.
+
+    The kernel uses the efi_memmap table returned from GetMemoryMap() to
+    learn the attributes supported by each region of physical address
+    space.  Unfortunately, this table does not completely describe the
+    address space because some machines omit some or all of the MMIO
+    regions from the map.
+
+    The kernel maintains another table, kern_memmap, which describes the
+    memory Linux is actually using and the attribute for each region.
+    This contains only system memory; it does not contain MMIO space.
+
+    The kern_memmap table typically contains only a subset of the system
+    memory described by the efi_memmap.  Linux/ia64 can't use all memory
+    in the system because of constraints imposed by the identity mapping
+    scheme.
+
+    The efi_memmap table is preserved unmodified because the original
+    boot-time information is required for kexec.
+
+KERNEL IDENTITY MAPPINGS
+
+    Linux/ia64 identity mappings are done with large pages, currently
+    either 16MB or 64MB, referred to as "granules."  Cacheable mappings
+    are speculative[2], so the processor can read any location in the
+    page at any time, independent of the programmer's intentions.  This
+    means that to avoid attribute aliasing, Linux can create a cacheable
+    identity mapping only when the entire granule supports cacheable
+    access.
+
+    Therefore, kern_memmap contains only full granule-sized regions that
+    can be referenced safely by an identity mapping.
+
+    Uncacheable mappings are not speculative, so the processor will
+    generate UC accesses only to locations explicitly referenced by
+    software.  This allows UC identity mappings to cover granules that
+    are only partially populated, or populated with a combination of UC
+    and WB regions.
+
+USER MAPPINGS
+
+    User mappings are typically done with 16K or 64K pages.  The smaller
+    page size allows more flexibility because only 16K or 64K has to be
+    homogeneous with respect to memory attributes.
+
+POTENTIAL ATTRIBUTE ALIASING CASES
+
+    There are several ways the kernel creates new mappings:
+
+    mmap of /dev/mem
+
+	This uses remap_pfn_range(), which creates user mappings.  These
+	mappings may be either WB or UC.  If the region being mapped
+	happens to be in kern_memmap, meaning that it may also be mapped
+	by a kernel identity mapping, the user mapping must use the same
+	attribute as the kernel mapping.
+
+	If the region is not in kern_memmap, the user mapping should use
+	an attribute reported as being supported in the EFI memory map.
+
+	Since the EFI memory map does not describe MMIO on some
+	machines, this should use an uncacheable mapping as a fallback.
+
+    mmap of /sys/class/pci_bus/.../legacy_mem
+
+	This is very similar to mmap of /dev/mem, except that legacy_mem
+	only allows mmap of the one megabyte "legacy MMIO" area for a
+	specific PCI bus.  Typically this is the first megabyte of
+	physical address space, but it may be different on machines with
+	several VGA devices.
+
+	"X" uses this to access VGA frame buffers.  Using legacy_mem
+	rather than /dev/mem allows multiple instances of X to talk to
+	different VGA cards.
+
+	The /dev/mem mmap constraints apply.
+
+	However, since this is for mapping legacy MMIO space, WB access
+	does not make sense.  This matters on machines without legacy
+	VGA support: these machines may have WB memory for the entire
+	first megabyte (or even the entire first granule).
+
+	On these machines, we could mmap legacy_mem as WB, which would
+	be safe in terms of attribute aliasing, but X has no way of
+	knowing that it is accessing regular memory, not a frame buffer,
+	so the kernel should fail the mmap rather than doing it with WB.
+
+    read/write of /dev/mem
+
+	This uses copy_from_user(), which implicitly uses a kernel
+	identity mapping.  This is obviously safe for things in
+	kern_memmap.
+
+	There may be corner cases of things that are not in kern_memmap,
+	but could be accessed this way.  For example, registers in MMIO
+	space are not in kern_memmap, but could be accessed with a UC
+	mapping.  This would not cause attribute aliasing.  But
+	registers typically can be accessed only with four-byte or
+	eight-byte accesses, and the copy_from_user() path doesn't allow
+	any control over the access size, so this would be dangerous.
+
+    ioremap()
+
+	This returns a kernel identity mapping for use inside the
+	kernel.
+
+	If the region is in kern_memmap, we should use the attribute
+	specified there.  Otherwise, if the EFI memory map reports that
+	the entire granule supports WB, we should use that (granules
+	that are partially reserved or occupied by firmware do not appear
+	in kern_memmap).  Otherwise, we should use a UC mapping.
+
+PAST PROBLEM CASES
+
+    mmap of various MMIO regions from /dev/mem by "X" on Intel platforms
+
+      The EFI memory map may not report these MMIO regions.
+
+      These must be allowed so that X will work.  This means that
+      when the EFI memory map is incomplete, every /dev/mem mmap must
+      succeed.  It may create either WB or UC user mappings, depending
+      on whether the region is in kern_memmap or the EFI memory map.
+
+    mmap of 0x0-0xA0000 /dev/mem by "hwinfo" on HP sx1000 with VGA enabled
+
+      See https://bugzilla.novell.com/show_bug.cgi?id=140858.
+
+      The EFI memory map reports the following attributes:
+        0x00000-0x9FFFF WB only
+        0xA0000-0xBFFFF UC only (VGA frame buffer)
+        0xC0000-0xFFFFF WB only
+
+      This mmap is done with user pages, not kernel identity mappings,
+      so it is safe to use WB mappings.
+
+      The kernel VGA driver may ioremap the VGA frame buffer at 0xA0000,
+      which will use a granule-sized UC mapping covering 0-0xFFFFF.  This
+      granule covers some WB-only memory, but since UC is non-speculative,
+      the processor will never generate an uncacheable reference to the
+      WB-only areas unless the driver explicitly touches them.
+
+    mmap of 0x0-0xFFFFF legacy_mem by "X"
+
+      If the EFI memory map reports this entire range as WB, there
+      is no VGA MMIO hole, and the mmap should fail or be done with
+      a WB mapping.
+
+      There's no easy way for X to determine whether the 0xA0000-0xBFFFF
+      region is a frame buffer or just memory, so I think it's best to
+      just fail this mmap request rather than using a WB mapping.  As
+      far as I know, there's no need to map legacy_mem with WB
+      mappings.
+
+      Otherwise, a UC mapping of the entire region is probably safe.
+      The VGA hole means the region will not be in kern_memmap.  The
+      HP sx1000 chipset doesn't support UC access to the memory surrounding
+      the VGA hole, but X doesn't need that area anyway and should not
+      reference it.
+
+    mmap of 0xA0000-0xBFFFF legacy_mem by "X" on HP sx1000 with VGA disabled
+
+      The EFI memory map reports the following attributes:
+        0x00000-0xFFFFF WB only (no VGA MMIO hole)
+
+      This is a special case of the previous case, and the mmap should
+      fail for the same reason as above.
+
+NOTES
+
+    [1] SDM rev 2.2, vol 2, sec 4.4.1.
+    [2] SDM rev 2.2, vol 2, sec 4.4.6.
diff --git a/Documentation/infiniband/ipoib.txt b/Documentation/infiniband/ipoib.txt
index 5c5a4cc..1870355 100644
--- a/Documentation/infiniband/ipoib.txt
+++ b/Documentation/infiniband/ipoib.txt
@@ -1,10 +1,10 @@
 IP OVER INFINIBAND
 
   The ib_ipoib driver is an implementation of the IP over InfiniBand
-  protocol as specified by the latest Internet-Drafts issued by the
-  IETF ipoib working group.  It is a "native" implementation in the
-  sense of setting the interface type to ARPHRD_INFINIBAND and the
-  hardware address length to 20 (earlier proprietary implementations
+  protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
+  working group.  It is a "native" implementation in the sense of
+  setting the interface type to ARPHRD_INFINIBAND and the hardware
+  address length to 20 (earlier proprietary implementations
   masqueraded to the kernel as ethernet interfaces).
 
 Partitions and P_Keys
@@ -53,3 +53,7 @@
 
   IETF IP over InfiniBand (ipoib) Working Group
     http://ietf.org/html.charters/ipoib-charter.html
+  Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
+    http://ietf.org/rfc/rfc4391.txt
+  IP over InfiniBand (IPoIB) Architecture (RFC 4392)
+    http://ietf.org/rfc/rfc4392.txt
diff --git a/Documentation/initrd.txt b/Documentation/initrd.txt
index 7de1c80..b1b6440 100644
--- a/Documentation/initrd.txt
+++ b/Documentation/initrd.txt
@@ -67,8 +67,7 @@
     as the last process has closed it, all data is freed and /dev/initrd
     can't be opened anymore.
 
-  root=/dev/ram0   (without devfs)
-  root=/dev/rd/0   (with devfs)
+  root=/dev/ram0
 
     initrd is mounted as root, and the normal boot procedure is followed,
     with the RAM disk still mounted as root.
@@ -90,8 +89,7 @@
 procedure should create the /initrd directory.
 
 If initrd will not be mounted in some cases, its content is still
-accessible if the following device has been created (note that this
-does not work if using devfs):
+accessible if the following device has been created:
 
 # mknod /dev/initrd b 1 250 
 # chmod 400 /dev/initrd
@@ -119,8 +117,7 @@
     (if space is critical, you may want to use the Minix FS instead of Ext2)
  3) mount the file system, e.g.
     # mount -t ext2 -o loop initrd /mnt
- 4) create the console device (not necessary if using devfs, but it can't
-    hurt to do it anyway):
+ 4) create the console device:
     # mkdir /mnt/dev
     # mknod /mnt/dev/console c 5 1
  5) copy all the files that are needed to properly use the initrd
@@ -152,12 +149,7 @@
 
   root=/dev/ram0 init=/linuxrc rw
 
-if not using devfs, or
-
-  root=/dev/rd/0 init=/linuxrc rw
-
-if using devfs. (rw is only necessary if writing to the initrd file
-system.)
+(rw is only necessary if writing to the initrd file system.)
 
 With LOADLIN, you simply execute
 
@@ -217,9 +209,9 @@
 # exec chroot . what-follows <dev/console >dev/console 2>&1
 
 Where what-follows is a program under the new root, e.g. /sbin/init
-If the new root file system will be used with devfs and has no valid
-/dev directory, devfs must be mounted before invoking chroot in order to
-provide /dev/console.
+If the new root file system will be used with udev and has no valid
+/dev directory, udev must be initialized before invoking chroot in order
+to provide /dev/console.
 
 Note: implementation details of pivot_root may change with time. In order
 to ensure compatibility, the following points should be observed:
@@ -236,7 +228,7 @@
 disk can be freed:
 
 # umount /initrd
-# blockdev --flushbufs /dev/ram0    # /dev/rd/0 if using devfs
+# blockdev --flushbufs /dev/ram0
 
 It is also possible to use initrd with an NFS-mounted root, see the
 pivot_root(8) man page for details.
diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt
index 171a44e..edc04d7 100644
--- a/Documentation/ioctl-number.txt
+++ b/Documentation/ioctl-number.txt
@@ -85,7 +85,9 @@
 					<mailto:maassen@uni-freiburg.de>
 'C'	all	linux/soundcard.h
 'D'	all	asm-s390/dasd.h
+'E'	all	linux/input.h
 'F'	all	linux/fb.h
+'H'	all	linux/hiddev.h
 'I'	all	linux/isdn.h
 'J'	00-1F	drivers/scsi/gdth_ioctl.h
 'K'	all	linux/kd.h
@@ -117,7 +119,6 @@
 'c'	00-7F	linux/comstats.h	conflict!
 'c'	00-7F	linux/coda.h		conflict!
 'd'	00-FF	linux/char/drm/drm/h	conflict!
-'d'	00-1F	linux/devfs_fs.h	conflict!
 'd'	00-DF	linux/video_decoder.h	conflict!
 'd'	F0-FF	linux/digi1.h
 'e'	all	linux/digi1.h		conflict!
diff --git a/Documentation/irqflags-tracing.txt b/Documentation/irqflags-tracing.txt
new file mode 100644
index 0000000..6a44487
--- /dev/null
+++ b/Documentation/irqflags-tracing.txt
@@ -0,0 +1,57 @@
+IRQ-flags state tracing
+
+started by Ingo Molnar <mingo@redhat.com>
+
+The "irq-flags tracing" feature "traces" hardirq and softirq state, in
+that it gives interested subsystems an opportunity to be notified of
+every hardirqs-off/hardirqs-on, softirqs-off/softirqs-on event that
+happens in the kernel.
+
+CONFIG_TRACE_IRQFLAGS_SUPPORT is needed for CONFIG_PROVE_SPIN_LOCKING
+and CONFIG_PROVE_RW_LOCKING to be offered by the generic lock debugging
+code. Otherwise only CONFIG_PROVE_MUTEX_LOCKING and
+CONFIG_PROVE_RWSEM_LOCKING will be offered on an architecture - these
+are locking APIs that are not used in IRQ context. (the one exception
+for rwsems is worked around)
+
+Architecture support for this is certainly not in the "trivial"
+category, because lots of lowlevel assembly code deals with irq-flags
+state changes. But an architecture can be irq-flags-tracing enabled in a
+rather straightforward and risk-free manner.
+
+Architectures that want to support this need to do a couple of
+code-organizational changes first:
+
+- move their irq-flags manipulation code from their asm/system.h header
+  to asm/irqflags.h
+
+- rename local_irq_disable()/etc to raw_local_irq_disable()/etc. so that
+  the linux/irqflags.h code can inject callbacks and can construct the
+  real local_irq_disable()/etc APIs.
+
+- add and enable TRACE_IRQFLAGS_SUPPORT in their arch level Kconfig file
+
+Then a couple of functional changes are needed as well to implement
+irq-flags-tracing support:
+
+- in lowlevel entry code add (build-conditional) calls to the
+  trace_hardirqs_off()/trace_hardirqs_on() functions. The lock validator
+  closely guards whether the 'real' irq-flags matches the 'virtual'
+  irq-flags state, and complains loudly (and turns itself off) if the
+  two do not match. Usually most of the time for arch support for
+  irq-flags-tracing is spent in this state: look at the lockdep
+  complaint, try to figure out the assembly code we did not cover yet,
+  fix and repeat. Once the system has booted up and works without a
+  lockdep complaint in the irq-flags-tracing functions, arch support is
+  complete.
+- if the architecture has non-maskable interrupts then those need to be
+  excluded from the irq-tracing [and lock validation] mechanism via
+  lockdep_off()/lockdep_on().
+
+In general there is no risk from having an incomplete irq-flags-tracing
+implementation in an architecture: lockdep will detect that and will
+turn itself off. I.e. the lock validator will still be reliable. There
+should be no crashes due to irq-tracing bugs (except if the assembly
+changes break other code by modifying conditions or registers that
+they shouldn't).
+
diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset
index 85a64de..fa0d4cc 100644
--- a/Documentation/isdn/README.gigaset
+++ b/Documentation/isdn/README.gigaset
@@ -124,7 +124,8 @@
 
      You can use some configuration tool of your distribution to configure this
      "modem" or configure pppd/wvdial manually. There are some example ppp
-     configuration files and chat scripts in the gigaset-VERSION/ppp directory.
+     configuration files and chat scripts in the gigaset-VERSION/ppp directory
+     in the driver packages from http://sourceforge.net/projects/gigaset307x/.
      Please note that the USB drivers are not able to change the state of the
      control lines (the M105 driver can be configured to use some undocumented
      control requests, if you really need the control lines, though). This means
@@ -164,8 +165,8 @@
 
      If you want both of these at once, you are out of luck.
 
-     You can also use /sys/module/<name>/parameters/cidmode for changing
-     the CID mode setting (<name> is usb_gigaset or bas_gigaset).
+     You can also use /sys/class/tty/ttyGxy/cidmode for changing the CID mode
+     setting (ttyGxy is ttyGU0 or ttyGB0).
 
 
 3.   Troubleshooting
diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index a9c00fa..14ef3868 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -1123,6 +1123,14 @@
 	$(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE).  The user may
 	override this value on the command line if desired.
 
+    INSTALL_MOD_STRIP
+
+	If this variable is specified, it will cause modules to be stripped
+	after they are installed.  If INSTALL_MOD_STRIP is '1', then the
+	default option --strip-debug will be used.  Otherwise,
+	INSTALL_MOD_STRIP will be used as the option(s) to the strip command.
+
+
 === 8 Makefile language
 
 The kernel Makefiles are designed to run with GNU Make.  The Makefiles
diff --git a/Documentation/kdump/gdbmacros.txt b/Documentation/kdump/gdbmacros.txt
index dcf5580..9b9b454 100644
--- a/Documentation/kdump/gdbmacros.txt
+++ b/Documentation/kdump/gdbmacros.txt
@@ -175,7 +175,7 @@
 document trapinfo
 	Run info threads and lookup pid of thread #1
 	'trapinfo <pid>' will tell you by which trap & possibly
-	addresthe kernel paniced.
+	address the kernel panicked.
 end
 
 
diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 212cf3c..08bafa8 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -1,155 +1,325 @@
-Documentation for kdump - the kexec-based crash dumping solution
+================================================================
+Documentation for Kdump - The kexec-based Crash Dumping Solution
 ================================================================
 
-DESIGN
-======
+This document includes overview, setup and installation, and analysis
+information.
 
-Kdump uses kexec to reboot to a second kernel whenever a dump needs to be
-taken. This second kernel is booted with very little memory. The first kernel
-reserves the section of memory that the second kernel uses. This ensures that
-on-going DMA from the first kernel does not corrupt the second kernel.
+Overview
+========
 
-All the necessary information about Core image is encoded in ELF format and
-stored in reserved area of memory before crash. Physical address of start of
-ELF header is passed to new kernel through command line parameter elfcorehdr=.
+Kdump uses kexec to quickly boot to a dump-capture kernel whenever a
+dump of the system kernel's memory needs to be taken (for example, when
+the system panics). The system kernel's memory image is preserved across
+the reboot and is accessible to the dump-capture kernel.
 
-On i386, the first 640 KB of physical memory is needed to boot, irrespective
-of where the kernel loads. Hence, this region is backed up by kexec just before
-rebooting into the new kernel.
+You can use common Linux commands, such as cp and scp, to copy the
+memory image to a dump file on the local disk, or across the network to
+a remote system.
 
-In the second kernel, "old memory" can be accessed in two ways.
+Kdump and kexec are currently supported on the x86, x86_64, and ppc64
+architectures.
 
-- The first one is through a /dev/oldmem device interface. A capture utility
-  can read the device file and write out the memory in raw format. This is raw
-  dump of memory and analysis/capture tool should be intelligent enough to
-  determine where to look for the right information. ELF headers (elfcorehdr=)
-  can become handy here.
+When the system kernel boots, it reserves a small section of memory for
+the dump-capture kernel. This ensures that ongoing Direct Memory Access
+(DMA) from the system kernel does not corrupt the dump-capture kernel.
+The kexec -p command loads the dump-capture kernel into this reserved
+memory.
 
-- The second interface is through /proc/vmcore. This exports the dump as an ELF
-  format file which can be written out using any file copy command
-  (cp, scp, etc). Further, gdb can be used to perform limited debugging on
-  the dump file. This method ensures methods ensure that there is correct
-  ordering of the dump pages (corresponding to the first 640 KB that has been
-  relocated).
+On x86 machines, the first 640 KB of physical memory is needed to boot,
+regardless of where the kernel loads. Therefore, kexec backs up this
+region just before rebooting into the dump-capture kernel.
 
-SETUP
-=====
+All of the necessary information about the system kernel's core image is
+encoded in the ELF format, and stored in a reserved area of memory
+before a crash. The physical address of the start of the ELF header is
+passed to the dump-capture kernel through the elfcorehdr= boot
+parameter.
 
-1) Download the upstream kexec-tools userspace package from
-   http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz.
+With the dump-capture kernel, you can access the memory image, or "old
+memory," in two ways:
 
-   Apply the latest consolidated kdump patch on top of kexec-tools-1.101
-   from http://lse.sourceforge.net/kdump/. This arrangment has been made
-   till all the userspace patches supporting kdump are integrated with
-   upstream kexec-tools userspace.
+- Through a /dev/oldmem device interface. A capture utility can read the
+  device file and write out the memory in raw format. This is a raw dump
+  of memory. Analysis and capture tools must be intelligent enough to
+  determine where to look for the right information.
 
-2) Download and build the appropriate (2.6.13-rc1 onwards) vanilla kernels.
-   Two kernels need to be built in order to get this feature working.
-   Following are the steps to properly configure the two kernels specific
-   to kexec and kdump features:
-
-  A) First kernel or regular kernel:
-  ----------------------------------
-   a) Enable "kexec system call" feature (in Processor type and features).
-      CONFIG_KEXEC=y
-   b) Enable "sysfs file system support" (in Pseudo filesystems).
-      CONFIG_SYSFS=y
-   c) make
-   d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
-      Use appropriate values for X and Y. Y denotes how much memory to reserve
-      for the second kernel, and X denotes at what physical address the
-      reserved memory section starts. For example: "crashkernel=64M@16M".
+- Through /proc/vmcore. This exports the dump as an ELF-format file that
+  you can write out using file copy commands such as cp or scp. Further,
+  you can use analysis tools such as the GNU Debugger (GDB) and the Crash
+  tool to debug the dump file. This method ensures that the dump pages are
+  correctly ordered.
 
 
-  B) Second kernel or dump capture kernel:
-  ---------------------------------------
-   a) For i386 architecture enable Highmem support
-      CONFIG_HIGHMEM=y
-   b) Enable "kernel crash dumps" feature (under "Processor type and features")
-      CONFIG_CRASH_DUMP=y
-   c) Make sure a suitable value for "Physical address where the kernel is
-      loaded" (under "Processor type and features"). By default this value
-      is 0x1000000 (16MB) and it should be same as X (See option d above),
-      e.g., 16 MB or 0x1000000.
-      CONFIG_PHYSICAL_START=0x1000000
-   d) Enable "/proc/vmcore support" (Optional, under "Pseudo filesystems").
-      CONFIG_PROC_VMCORE=y
+Setup and Installation
+======================
 
-3) After booting to regular kernel or first kernel, load the second kernel
-   using the following command:
+Install kexec-tools and the Kdump patch
+---------------------------------------
 
-   kexec -p <second-kernel> --args-linux --elf32-core-headers
-   --append="root=<root-dev> init 1 irqpoll maxcpus=1"
+1) Log in as the root user.
 
-   Notes:
-   ======
-     i) <second-kernel> has to be a vmlinux image ie uncompressed elf image.
-        bzImage will not work, as of now.
-    ii) --args-linux has to be speicfied as if kexec it loading an elf image,
-        it needs to know that the arguments supplied are of linux type.
-   iii) By default ELF headers are stored in ELF64 format to support systems
-        with more than 4GB memory. Option --elf32-core-headers forces generation
-        of ELF32 headers. The reason for this option being, as of now gdb can
-        not open vmcore file with ELF64 headers on a 32 bit systems. So ELF32
-        headers can be used if one has non-PAE systems and hence memory less
-        than 4GB.
-    iv) Specify "irqpoll" as command line parameter. This reduces driver
-         initialization failures in second kernel due to shared interrupts.
-     v) <root-dev> needs to be specified in a format corresponding to the root
-        device name in the output of mount command.
-    vi) If you have built the drivers required to mount root file system as
-        modules in <second-kernel>, then, specify
-        --initrd=<initrd-for-second-kernel>.
-   vii) Specify maxcpus=1 as, if during first kernel run, if panic happens on
-        non-boot cpus, second kernel doesn't seem to be boot up all the cpus.
-        The other option is to always built the second kernel without SMP
-        support ie CONFIG_SMP=n
+2) Download the kexec-tools user-space package from the following URL:
 
-4) After successfully loading the second kernel as above, if a panic occurs
-   system reboots into the second kernel. A module can be written to force
-   the panic or "ALT-SysRq-c" can be used initiate a crash dump for testing
-   purposes.
+   http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz
 
-5) Once the second kernel has booted, write out the dump file using
+3) Unpack the tarball with the tar command, as follows:
+
+   tar xvpzf kexec-tools-1.101.tar.gz
+
+4) Download the latest consolidated Kdump patch from the following URL:
+
+   http://lse.sourceforge.net/kdump/
+
+   (This location is being used until all the user-space Kdump patches
+   are integrated with the kexec-tools package.)
+
+5) Change to the kexec-tools-1.101 directory, as follows:
+
+   cd kexec-tools-1.101
+
+6) Apply the consolidated patch to the kexec-tools-1.101 source tree
+   with the patch command, as follows. (Modify the path to the downloaded
+   patch as necessary.)
+
+   patch -p1 < /path-to-kdump-patch/kexec-tools-1.101-kdump.patch
+
+7) Configure the package, as follows:
+
+   ./configure
+
+8) Compile the package, as follows:
+
+   make
+
+9) Install the package, as follows:
+
+   make install
+
+
+Download and build the system and dump-capture kernels
+------------------------------------------------------
+
+Download the mainline (vanilla) kernel source code (2.6.13-rc1 or newer)
+from http://www.kernel.org. Two kernels must be built: a system kernel
+and a dump-capture kernel. Use the following steps to configure these
+kernels with the necessary kexec and Kdump features:
+
+System kernel
+-------------
+
+1) Enable "kexec system call" in "Processor type and features."
+
+   CONFIG_KEXEC=y
+
+2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
+   filesystems." This is usually enabled by default.
+
+   CONFIG_SYSFS=y
+
+   Note that "sysfs file system support" might not appear in the "Pseudo
+   filesystems" menu if "Configure standard kernel features (for small
+   systems)" is not enabled in "General Setup." In this case, check the
+   .config file itself to ensure that sysfs is turned on, as follows:
+
+   grep 'CONFIG_SYSFS' .config
+
+3) Enable "Compile the kernel with debug info" in "Kernel hacking."
+
+   CONFIG_DEBUG_INFO=y
+
+   This causes the kernel to be built with debug symbols. The dump
+   analysis tools require a vmlinux with debug symbols in order to read
+   and analyze a dump file.
+
+4) Make and install the kernel and its modules. Update the boot loader
+   (such as grub, yaboot, or lilo) configuration files as necessary.
+
+5) Boot the system kernel with the boot parameter "crashkernel=Y@X",
+   where Y specifies how much memory to reserve for the dump-capture kernel
+   and X specifies the beginning of this reserved memory. For example,
+   "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
+   starting at physical address 0x01000000 for the dump-capture kernel.
+
+   On x86 and x86_64, use "crashkernel=64M@16M".
+
+   On ppc64, use "crashkernel=128M@32M".
+
+
+The dump-capture kernel
+-----------------------
+
+1) Under "General setup," append "-kdump" to the current string in
+   "Local version."
+
+2) On x86, enable high memory support under "Processor type and
+   features":
+
+   CONFIG_HIGHMEM64G=y
+   or
+   CONFIG_HIGHMEM4G=y
+
+3) On x86 and x86_64, disable symmetric multi-processing support
+   under "Processor type and features":
+
+   CONFIG_SMP=n
+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
+   when loading the dump-capture kernel, see section "Load the Dump-capture
+   Kernel".)
+
+4) On ppc64, disable NUMA and EEH support and enable EMBEDDED support:
+
+   CONFIG_NUMA=n
+   CONFIG_EMBEDDED=y
+   CONFIG_EEH=n
+
+5) Enable "kernel crash dumps" support under "Processor type and
+   features":
+
+   CONFIG_CRASH_DUMP=y
+
+6) Use a suitable value for "Physical address where the kernel is
+   loaded" (under "Processor type and features"). This only appears when
+   "kernel crash dumps" is enabled. By default this value is 0x1000000
+   (16MB). It should be the same as X in the "crashkernel=Y@X" boot
+   parameter discussed above.
+
+   On x86 and x86_64, use "CONFIG_PHYSICAL_START=0x1000000".
+
+   On ppc64 the value is automatically set at 32MB when
+   CONFIG_CRASH_DUMP is set.
+
+7) Optionally enable "/proc/vmcore support" under "Filesystems" ->
+   "Pseudo filesystems".
+
+   CONFIG_PROC_VMCORE=y
+   (CONFIG_PROC_VMCORE is set by default when CONFIG_CRASH_DUMP is selected.)
+
+8) Make and install the kernel and its modules. DO NOT add this kernel
+   to the boot loader configuration files.
+
+
+Load the Dump-capture Kernel
+============================
+
+After booting to the system kernel, load the dump-capture kernel using
+the following command:
+
+   kexec -p <dump-capture-kernel> \
+   --initrd=<initrd-for-dump-capture-kernel> --args-linux \
+   --append="root=<root-dev> init 1 irqpoll"
+
+
+Notes on loading the dump-capture kernel:
+
+* <dump-capture-kernel> must be a vmlinux image (that is, an
+  uncompressed ELF image). bzImage does not work at this time.
+
+* By default, the ELF headers are stored in ELF64 format to support
+  systems with more than 4GB memory. The --elf32-core-headers option can
+  be used to force the generation of ELF32 headers. This is necessary
+  because GDB currently cannot open vmcore files with ELF64 headers on
+  32-bit systems. ELF32 headers can be used on non-PAE systems (that is,
+  systems with less than 4GB of memory).
+
+* The "irqpoll" boot parameter reduces driver initialization failures
+  due to shared interrupts in the dump-capture kernel.
+
+* You must specify <root-dev> in the format corresponding to the root
+  device name in the output of the mount command.
+
+* "init 1" boots the dump-capture kernel into single-user mode without
+  networking. If you want networking, use "init 3."
+
+
+Kernel Panic
+============
+
+After successfully loading the dump-capture kernel as previously
+described, the system will reboot into the dump-capture kernel if a
+system crash is triggered.  Trigger points are located in panic(),
+die(), die_nmi() and in the sysrq handler (ALT-SysRq-c).
+
+The following conditions will execute a crash trigger point:
+
+If a hard lockup is detected and "NMI watchdog" is configured, the system
+will boot into the dump-capture kernel ( die_nmi() ).
+
+If die() is called, and it happens to be a thread with pid 0 or 1, or die()
+is called inside interrupt context or die() is called and panic_on_oops is set,
+the system will boot into the dump-capture kernel.
+
+On powerpc systems, when a soft-reset is generated, die() is called by all CPUs and the system will boot into the dump-capture kernel.
+
+For testing purposes, you can trigger a crash by using "ALT-SysRq-c",
+"echo c > /proc/sysrq-trigger", or by writing a module to force the panic.
+
+Write Out the Dump File
+=======================
+
+After the dump-capture kernel is booted, write out the dump file with
+the following command:
 
    cp /proc/vmcore <dump-file>
 
-   Dump memory can also be accessed as a /dev/oldmem device for a linear/raw
-   view.  To create the device, type:
+You can also access dumped memory as a /dev/oldmem device for a linear
+and raw view. To create the device, use the following command:
 
-   mknod /dev/oldmem c 1 12
+   mknod /dev/oldmem c 1 12
 
-   Use "dd" with suitable options for count, bs and skip to access specific
-   portions of the dump.
+Use the dd command with suitable options for count, bs, and skip to
+access specific portions of the dump.
 
-   Entire memory:  dd if=/dev/oldmem of=oldmem.001
+To see the entire memory, use the following command:
+
+   dd if=/dev/oldmem of=oldmem.001
 
 
-ANALYSIS
+Analysis
 ========
-Limited analysis can be done using gdb on the dump file copied out of
-/proc/vmcore. Use vmlinux built with -g and run
 
-  gdb vmlinux <dump-file>
+Before analyzing the dump image, you should reboot into a stable kernel.
 
-Stack trace for the task on processor 0, register display, memory display
-work fine.
+You can do limited analysis using GDB on the dump file copied out of
+/proc/vmcore. Use the debug vmlinux built with -g and run the following
+command:
 
-Note: gdb cannot analyse core files generated in ELF64 format for i386.
+   gdb vmlinux <dump-file>
 
-Latest "crash" (crash-4.0-2.18) as available on Dave Anderson's site
-http://people.redhat.com/~anderson/ works well with kdump format.
+Stack trace for the task on processor 0, register display, and memory
+display work fine.
+
+Note: GDB cannot analyze core files generated in ELF64 format for x86.
+On systems with a maximum of 4GB of memory, you can generate
+ELF32-format headers by passing the --elf32-core-headers option to
+kexec when loading the dump-capture kernel.
+
+You can also use the Crash utility to analyze dump files in Kdump
+format. Crash is available on Dave Anderson's site at the following URL:
+
+   http://people.redhat.com/~anderson/
 
 
-TODO
-====
-1) Provide a kernel pages filtering mechanism so that core file size is not
-   insane on systems having huge memory banks.
-2) Relocatable kernel can help in maintaining multiple kernels for crashdump
-   and same kernel as the first kernel can be used to capture the dump.
+To Do
+=====
+
+1) Provide a kernel pages filtering mechanism, so core file size is not
+   extreme on systems with huge memory banks.
+
+2) Relocatable kernel can help in maintaining multiple kernels for
+   crash_dump, and the same kernel as the system kernel can be used to
+   capture the dump.
 
 
-CONTACT
+Contact
 =======
+
 Vivek Goyal (vgoyal@in.ibm.com)
 Maneesh Soni (maneesh@in.ibm.com)
+
+
+Trademark
+=========
+
+Linux is a trademark of Linus Torvalds in the United States, other
+countries, or both.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b3a6187..e11f772 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -35,7 +35,6 @@
 	APM	Advanced Power Management support is enabled.
 	AX25	Appropriate AX.25 support is enabled.
 	CD	Appropriate CD support is enabled.
-	DEVFS	devfs support is enabled.
 	DRM	Direct Rendering Management support is enabled.
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
@@ -61,6 +60,7 @@
 	MTD	MTD support is enabled.
 	NET	Appropriate network support is enabled.
 	NUMA	NUMA support is enabled.
+	GENERIC_TIME The generic timeofday code is enabled.
 	NFS	Appropriate NFS support is enabled.
 	OSS	OSS sound support is enabled.
 	PARIDE	The ParIDE subsystem is enabled.
@@ -147,6 +147,9 @@
 	acpi_irq_isa=	[HW,ACPI] If irq_balance, mark listed IRQs used by ISA
 			Format: <irq>,<irq>...
 
+	acpi_os_name=	[HW,ACPI] Tell ACPI BIOS the name of the OS
+			Format: To spoof as Windows 98: ="Microsoft Windows"
+
 	acpi_osi=	[HW,ACPI] empty param disables _OSI
 
 	acpi_serialize	[HW,ACPI] force serialization of AML methods
@@ -176,6 +179,11 @@
 			override platform specific driver.
 			See also Documentation/acpi-hotkey.txt.
 
+	acpi_pm_good	[IA-32,X86-64]
+			Override the pmtimer bug detection: force the kernel
+			to assume that this machine's pmtimer latches its value
+			and always returns good values.
+
 	enable_timer_pin_1 [i386,x86-64]
 			Enable PIN 1 of APIC timer
 			Can be useful to work around chipset bugs
@@ -338,10 +346,11 @@
 			Value can be changed at runtime via
 				/selinux/checkreqprot.
 
- 	clock=		[BUGS=IA-32,HW] gettimeofday timesource override.
-			Forces specified timesource (if avaliable) to be used
-			when calculating gettimeofday(). If specicified
-			timesource is not avalible, it defaults to PIT.
+	clock=		[BUGS=IA-32, HW] gettimeofday clocksource override.
+			[Deprecated]
+			Forces specified clocksource (if available) to be used
+			when calculating gettimeofday(). If specified
+			clocksource is not available, it defaults to PIT.
 			Format: { pit | tsc | cyclone | pmtmr }
 
 	disable_8254_timer
@@ -426,12 +435,20 @@
 
 	debug		[KNL] Enable kernel debugging (events log level).
 
+	debug_locks_verbose=
+			[KNL] verbose self-tests
+			Format=<0|1>
+			Print debugging info while doing the locking API
+			self-tests.
+			We default to 0 (no extra messages); setting it to
+			1 will print _a lot_ more information - normally
+			only useful to kernel developers.
+
 	decnet=		[HW,NET]
 			Format: <area>[,<node>]
 			See also Documentation/networking/decnet.txt.
 
-	devfs=		[DEVFS]
-			See Documentation/filesystems/devfs/boot-options.
+	delayacct	[KNL] Enable per-task delay accounting
 
 	dhash_entries=	[KNL]
 			Set number of hash buckets for dentry cache.
@@ -1402,6 +1419,15 @@
 			If enabled at boot time, /selinux/disable can be used
 			later to disable prior to initial policy load.
 
+	selinux_compat_net=
+			[SELINUX] Set initial selinux_compat_net flag value.
+			Format: { "0" | "1" }
+			0 -- use new secmark-based packet controls
+			1 -- use legacy packet controls
+			Default value is 0 (preferred).
+			Value can be changed at runtime via
+			/selinux/compat_net.
+
 	serialnumber	[BUGS=IA-32]
 
 	sg_def_reserved_size=	[SCSI]
@@ -1605,6 +1631,10 @@
 
 	time		Show timing data prefixed to each printk message line
 
+	clocksource=	[GENERIC_TIME] Override the default clocksource
+			Override the default clocksource and use the clocksource
+			with the name specified.
+
 	tipar.timeout=	[HW,PPT]
 			Set communications timeout in tenths of a second
 			(default 15).
@@ -1646,6 +1676,10 @@
 	usbhid.mousepoll=
 			[USBHID] The interval which mice are to be polled at.
 
+	vdso=		[IA-32]
+			vdso=1: enable VDSO (default)
+			vdso=0: disable VDSO mapping
+
 	video=		[FB] Frame buffer configuration
 			See Documentation/fb/modedb.txt.
 
@@ -1662,9 +1696,14 @@
 			decrease the size and leave more room for directly
 			mapped kernel RAM.
 
-	vmhalt=		[KNL,S390]
+	vmhalt=		[KNL,S390] Perform z/VM CP command after system halt.
+			Format: <command>
 
-	vmpoff=		[KNL,S390]
+	vmpanic=	[KNL,S390] Perform z/VM CP command after kernel panic.
+			Format: <command>
+
+	vmpoff=		[KNL,S390] Perform z/VM CP command after power off.
+			Format: <command>
 
 	waveartist=	[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma2>
diff --git a/Documentation/keys-request-key.txt b/Documentation/keys-request-key.txt
index 22488d7..c1f64fd 100644
--- a/Documentation/keys-request-key.txt
+++ b/Documentation/keys-request-key.txt
@@ -3,16 +3,23 @@
 			      ===================
 
 The key request service is part of the key retention service (refer to
-Documentation/keys.txt). This document explains more fully how that the
-requesting algorithm works.
+Documentation/keys.txt).  This document explains more fully how the requesting
+algorithm works.
 
 The process starts by either the kernel requesting a service by calling
-request_key():
+request_key*():
 
 	struct key *request_key(const struct key_type *type,
 				const char *description,
 				const char *callout_string);
 
+or:
+
+	struct key *request_key_with_auxdata(const struct key_type *type,
+					     const char *description,
+					     const char *callout_string,
+					     void *aux);
+
 Or by userspace invoking the request_key system call:
 
 	key_serial_t request_key(const char *type,
@@ -20,16 +27,26 @@
 				 const char *callout_info,
 				 key_serial_t dest_keyring);
 
-The main difference between the two access points is that the in-kernel
-interface does not need to link the key to a keyring to prevent it from being
-immediately destroyed. The kernel interface returns a pointer directly to the
-key, and it's up to the caller to destroy the key.
+The main difference between the access points is that the in-kernel interface
+does not need to link the key to a keyring to prevent it from being immediately
+destroyed.  The kernel interface returns a pointer directly to the key, and
+it's up to the caller to destroy the key.
+
+The request_key_with_auxdata() call is like the in-kernel request_key() call,
+except that it permits auxiliary data to be passed to the upcaller (the default
+is NULL).  This is only useful for those key types that define their own upcall
+mechanism rather than using /sbin/request-key.
 
 The userspace interface links the key to a keyring associated with the process
 to prevent the key from going away, and returns the serial number of the key to
 the caller.
 
 
+The following example assumes that the key types involved don't define their
+own upcall mechanisms.  If they do, then those should be substituted for the
+forking and execution of /sbin/request-key.
+
+
 ===========
 THE PROCESS
 ===========
@@ -40,8 +57,8 @@
      interface].
 
  (2) request_key() searches the process's subscribed keyrings to see if there's
-     a suitable key there. If there is, it returns the key. If there isn't, and
-     callout_info is not set, an error is returned. Otherwise the process
+     a suitable key there.  If there is, it returns the key.  If there isn't,
+     and callout_info is not set, an error is returned.  Otherwise the process
      proceeds to the next step.
 
  (3) request_key() sees that A doesn't have the desired key yet, so it creates
@@ -62,7 +79,7 @@
      instantiation.
 
  (7) The program may want to access another key from A's context (say a
-     Kerberos TGT key). It just requests the appropriate key, and the keyring
+     Kerberos TGT key).  It just requests the appropriate key, and the keyring
      search notes that the session keyring has auth key V in its bottom level.
 
      This will permit it to then search the keyrings of process A with the
@@ -79,10 +96,11 @@
 (10) The program then exits 0 and request_key() deletes key V and returns key
      U to the caller.
 
-This also extends further. If key W (step 7 above) didn't exist, key W would be
-created uninstantiated, another auth key (X) would be created (as per step 3)
-and another copy of /sbin/request-key spawned (as per step 4); but the context
-specified by auth key X will still be process A, as it was in auth key V.
+This also extends further.  If key W (step 7 above) didn't exist, key W would
+be created uninstantiated, another auth key (X) would be created (as per step
+3) and another copy of /sbin/request-key spawned (as per step 4); but the
+context specified by auth key X will still be process A, as it was in auth key
+V.
 
 This is because process A's keyrings can't simply be attached to
 /sbin/request-key at the appropriate places because (a) execve will discard two
@@ -118,17 +136,17 @@
 
  (2) It considers all the non-keyring keys within that keyring and, if any key
      matches the criteria specified, calls key_permission(SEARCH) on it to see
-     if the key is allowed to be found. If it is, that key is returned; if
+     if the key is allowed to be found.  If it is, that key is returned; if
      not, the search continues, and the error code is retained if of higher
      priority than the one currently set.
 
  (3) It then considers all the keyring-type keys in the keyring it's currently
-     searching. It calls key_permission(SEARCH) on each keyring, and if this
+     searching.  It calls key_permission(SEARCH) on each keyring, and if this
      grants permission, it recurses, executing steps (2) and (3) on that
      keyring.
 
 The process stops immediately a valid key is found with permission granted to
-use it. Any error from a previous match attempt is discarded and the key is
+use it.  Any error from a previous match attempt is discarded and the key is
 returned.
 
 When search_process_keyrings() is invoked, it performs the following searches
@@ -153,7 +171,7 @@
 returned.
 
 Only if all these fail does the whole thing fail with the highest priority
-error. Note that several errors may have come from LSM.
+error.  Note that several errors may have come from LSM.
 
 The error priority is:
 
diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index aaa01b0..e373f02 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -19,6 +19,7 @@
 	- Key overview
 	- Key service overview
 	- Key access permissions
+	- SELinux support
 	- New procfs files
 	- Userspace system call interface
 	- Kernel services
@@ -232,6 +233,39 @@
 the key or having the sysadmin capability is sufficient.
 
 
+===============
+SELINUX SUPPORT
+===============
+
+The security class "key" has been added to SELinux so that mandatory access
+controls can be applied to keys created within various contexts.  This support
+is preliminary, and is likely to change quite significantly in the near future.
+Currently, all of the basic permissions explained above are provided in SELinux
+as well; SELinux is simply invoked after all basic permission checks have been
+performed.
+
+The value of the file /proc/self/attr/keycreate influences the labeling of
+newly-created keys.  If the contents of that file correspond to an SELinux
+security context, then the key will be assigned that context.  Otherwise, the
+key will be assigned the current context of the task that invoked the key
+creation request.  Tasks must be granted explicit permission to assign a
+particular context to newly-created keys, using the "create" permission in the
+key security class.
+
+The default keyrings associated with users will be labeled with the default
+context of the user if and only if the login programs have been instrumented to
+properly initialize keycreate during the login process.  Otherwise, they will
+be labeled with the context of the login program itself.
+
+Note, however, that the default keyrings associated with the root user are
+labeled with the default kernel context, since they are created early in the
+boot process, before root has a chance to log in.
+
+The keyrings associated with new threads are each labeled with the context of
+their associated thread, and both session and process keyrings are handled
+similarly.
+
+
 ================
 NEW PROCFS FILES
 ================
@@ -241,9 +275,17 @@
 
  (*) /proc/keys
 
-     This lists all the keys on the system, giving information about their
-     type, description and permissions. The payload of the key is not available
-     this way:
+     This lists the keys that are currently viewable by the task reading the
+     file, giving information about their type, description and permissions.
+     It is not possible to view the payload of the key this way, though some
+     information about it may be given.
+
+     The only keys included in the list are those that grant View permission to
+     the reading process whether or not it possesses them.  Note that LSM
+     security checks are still performed, and may further filter out keys that
+     the current process is not authorised to view.
+
+     The contents of the file look like this:
 
 	SERIAL   FLAGS  USAGE EXPY PERM     UID   GID   TYPE      DESCRIPTION: SUMMARY
 	00000001 I-----    39 perm 1f3f0000     0     0 keyring   _uid_ses.0: 1/4
@@ -271,7 +313,7 @@
  (*) /proc/key-users
 
      This file lists the tracking data for each user that has at least one key
-     on the system. Such data includes quota information and statistics:
+     on the system.  Such data includes quota information and statistics:
 
 	[root@andromeda root]# cat /proc/key-users
 	0:     46 45/45 1/100 13/10000
@@ -738,6 +780,17 @@
     See also Documentation/keys-request-key.txt.
 
 
+(*) To search for a key, passing auxiliary data to the upcaller, call:
+
+	struct key *request_key_with_auxdata(const struct key_type *type,
+					     const char *description,
+					     const char *callout_string,
+					     void *aux);
+
+    This is identical to request_key(), except that the auxiliary data is
+    passed to the key_type->request_key() op if it exists.
+
+
 (*) When it is no longer required, the key should be released using:
 
 	void key_put(struct key *key);
@@ -935,6 +988,16 @@
      It is not safe to sleep in this method; the caller may hold spinlocks.
 
 
+ (*) void (*revoke)(struct key *key);
+
+     This method is optional.  It is called to discard part of the payload
+     data upon a key being revoked.  The caller will have the key semaphore
+     write-locked.
+
+     It is safe to sleep in this method, though care should be taken to avoid
+     a deadlock against the key semaphore.
+
+
  (*) void (*destroy)(struct key *key);
 
      This method is optional. It is called to discard the payload data on a key
@@ -979,6 +1042,24 @@
      as might happen when the userspace buffer is accessed.
 
 
+ (*) int (*request_key)(struct key *key, struct key *authkey, const char *op,
+			void *aux);
+
+     This method is optional.  If provided, request_key() and
+     request_key_with_auxdata() will invoke this function rather than
+     upcalling to /sbin/request-key to operate upon a key of this type.
+
+     The aux parameter is as passed to request_key_with_auxdata() or is NULL
+     otherwise.  Also passed are the key to be operated upon, the
+     authorisation key for this operation and the operation type (currently
+     only "create").
+
+     This function should return only when the upcall is complete.  Upon return
+     the authorisation key will be revoked, and the target key will be
+     negatively instantiated if it is still uninstantiated.  The error will be
+     returned to the caller of request_key*().
+
+
 ============================
 REQUEST-KEY CALLBACK SERVICE
 ============================
diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
new file mode 100644
index 0000000..00d9360
--- /dev/null
+++ b/Documentation/lockdep-design.txt
@@ -0,0 +1,197 @@
+Runtime locking correctness validator
+=====================================
+
+started by Ingo Molnar <mingo@redhat.com>
+additions by Arjan van de Ven <arjan@linux.intel.com>
+
+Lock-class
+----------
+
+The basic object the validator operates upon is a 'class' of locks.
+
+A class of locks is a group of locks that are logically the same with
+respect to locking rules, even if the locks may have multiple (possibly
+tens of thousands of) instantiations. For example a lock in the inode
+struct is one class, while each inode has its own instantiation of that
+lock class.
+
+The validator tracks the 'state' of lock-classes, and it tracks
+dependencies between different lock-classes. The validator maintains a
+rolling proof that the state and the dependencies are correct.
+
+Unlike a lock instantiation, the lock-class itself never goes away: when
+a lock-class is used for the first time after bootup it gets registered,
+and all subsequent uses of that lock-class will be attached to this
+lock-class.
+
+State
+-----
+
+The validator tracks lock-class usage history into 5 separate state bits:
+
+- 'ever held in hardirq context'                    [ == hardirq-safe   ]
+- 'ever held in softirq context'                    [ == softirq-safe   ]
+- 'ever held with hardirqs enabled'                 [ == hardirq-unsafe ]
+- 'ever held with softirqs and hardirqs enabled'    [ == softirq-unsafe ]
+
+- 'ever used'                                       [ == !unused        ]
+
+Single-lock state rules:
+------------------------
+
+A softirq-unsafe lock-class is automatically hardirq-unsafe as well. The
+following states are exclusive, and only one of them is allowed to be
+set for any lock-class:
+
+ <hardirq-safe> and <hardirq-unsafe>
+ <softirq-safe> and <softirq-unsafe>
+
+The validator detects and reports lock usage that violates these
+single-lock state rules.
+
+Multi-lock dependency rules:
+----------------------------
+
+The same lock-class must not be acquired twice, because this could lead
+to lock recursion deadlocks.
+
+Furthermore, two locks may not be taken in different order:
+
+ <L1> -> <L2>
+ <L2> -> <L1>
+
+because this could lead to lock inversion deadlocks. (The validator
+finds such dependencies in arbitrary complexity, i.e. there can be any
+other locking sequence between the acquire-lock operations; the
+validator will still track all dependencies between locks.)
+
+Furthermore, the following usage based lock dependencies are not allowed
+between any two lock-classes:
+
+   <hardirq-safe>   ->  <hardirq-unsafe>
+   <softirq-safe>   ->  <softirq-unsafe>
+
+The first rule comes from the fact that a hardirq-safe lock could be
+taken by a hardirq context, interrupting a hardirq-unsafe lock - and
+thus could result in a lock inversion deadlock. Likewise, a softirq-safe
+lock could be taken by a softirq context, interrupting a softirq-unsafe
+lock.
+
+The above rules are enforced for any locking sequence that occurs in the
+kernel: when acquiring a new lock, the validator checks whether there is
+any rule violation between the new lock and any of the held locks.
+
+When a lock-class changes its state, the following aspects of the above
+dependency rules are enforced:
+
+- if a new hardirq-safe lock is discovered, we check whether it
+  took any hardirq-unsafe lock in the past.
+
+- if a new softirq-safe lock is discovered, we check whether it took
+  any softirq-unsafe lock in the past.
+
+- if a new hardirq-unsafe lock is discovered, we check whether any
+  hardirq-safe lock took it in the past.
+
+- if a new softirq-unsafe lock is discovered, we check whether any
+  softirq-safe lock took it in the past.
+
+(Again, we do these checks too on the basis that an interrupt context
+could interrupt _any_ of the irq-unsafe or hardirq-unsafe locks, which
+could lead to a lock inversion deadlock - even if that lock scenario did
+not trigger in practice yet.)
+
+Exception: Nested data dependencies leading to nested locking
+-------------------------------------------------------------
+
+There are a few cases where the Linux kernel acquires more than one
+instance of the same lock-class. Such cases typically happen when there
+is some sort of hierarchy within objects of the same type. In these
+cases there is an inherent "natural" ordering between the two objects
+(defined by the properties of the hierarchy), and the kernel grabs the
+locks in this fixed order on each of the objects.
+
+An example of such an object hierarchy that results in "nested locking"
+is that of a "whole disk" block-dev object and a "partition" block-dev
+object; the partition is "part of" the whole device and as long as one
+always takes the whole disk lock as a higher lock than the partition
+lock, the lock ordering is fully correct. The validator does not
+automatically detect this natural ordering, as the locking rule behind
+the ordering is not static.
+
+In order to teach the validator about this correct usage model, new
+versions of the various locking primitives were added that allow you to
+specify a "nesting level". An example call, for the block device mutex,
+looks like this:
+
+enum bdev_bd_mutex_lock_class
+{
+       BD_MUTEX_NORMAL,
+       BD_MUTEX_WHOLE,
+       BD_MUTEX_PARTITION
+};
+
+ mutex_lock_nested(&bdev->bd_contains->bd_mutex, BD_MUTEX_PARTITION);
+
+In this case the locking is done on a bdev object that is known to be a
+partition.
+
+The validator treats a lock that is taken in such a nested fashion as a
+separate (sub)class for the purposes of validation.
+
+Note: When changing code to use the _nested() primitives, be careful and
+check really thoroughly that the hierarchy is correctly mapped; otherwise
+you can get false positives or false negatives.
+
+Proof of 100% correctness:
+--------------------------
+
+The validator achieves perfect, mathematical 'closure' (proof of locking
+correctness) in the sense that for every simple, standalone single-task
+locking sequence that occurred at least once during the lifetime of the
+kernel, the validator proves it with a 100% certainty that no
+combination and timing of these locking sequences can cause any class of
+lock related deadlock. [*]
+
+I.e. complex multi-CPU and multi-task locking scenarios do not have to
+occur in practice to prove a deadlock: only the simple 'component'
+locking chains have to occur at least once (anytime, in any
+task/context) for the validator to be able to prove correctness. (For
+example, complex deadlocks that would normally need more than 3 CPUs and
+a very unlikely constellation of tasks, irq-contexts and timings to
+occur, can be detected on a plain, lightly loaded single-CPU system as
+well!)
+
+This radically decreases the complexity of locking related QA of the
+kernel: what has to be done during QA is to trigger as many "simple"
+single-task locking dependencies in the kernel as possible, at least
+once, to prove locking correctness - instead of having to trigger every
+possible combination of locking interaction between CPUs, combined with
+every possible hardirq and softirq nesting scenario (which is impossible
+to do in practice).
+
+[*] assuming that the validator itself is 100% correct, and no other
+    part of the system corrupts the state of the validator in any way.
+    We also assume that all NMI/SMM paths [which could interrupt
+    even hardirq-disabled codepaths] are correct and do not interfere
+    with the validator. We also assume that the 64-bit 'chain hash'
+    value is unique for every lock-chain in the system. Also, lock
+    recursion must not be higher than 20.
+
+Performance:
+------------
+
+The above rules require _massive_ amounts of runtime checking. If we did
+that for every lock taken and for every irqs-enable event, it would
+render the system practically unusably slow. The complexity of checking
+is O(N^2), so even with just a few hundred lock-classes we'd have to do
+tens of thousands of checks for every event.
+
+This problem is solved by checking any given 'locking scenario' (unique
+sequence of locks taken after each other) only once. A simple stack of
+held locks is maintained, and a lightweight 64-bit hash value is
+calculated, which hash is unique for every lock chain. The hash value,
+when the chain is validated for the first time, is then put into a hash
+table, which hash-table can be checked in a lockfree manner. If the
+locking chain occurs again later on, the hash table tells us that we
+don't have to validate the chain again.
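
The chain-hash caching described in the Performance section above can be
sketched in userspace C as follows. This is only an illustration under stated
assumptions, not the validator's code: the FNV-1a hash stands in for whatever
64-bit hash the validator really computes, the fixed-size open-addressing
table is not lockfree, and every name here (chain_hash, check_chain,
validate_chain_slow, CHAIN_TABLE_SIZE) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_TABLE_SIZE 1024u	/* hypothetical size; must be a power of two */

/* Hashes of chains that were already validated; 0 marks an empty slot. */
static uint64_t chain_table[CHAIN_TABLE_SIZE];

/* Counts how often the expensive validation path actually ran. */
unsigned long full_validations;

/* FNV-1a over the held lock-class ids (assumed stand-in hash function). */
uint64_t chain_hash(const uint32_t *classes, size_t depth)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < depth; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (classes[i] >> (8 * b)) & 0xffu;
			h *= 0x100000001b3ULL;
		}
	}
	return h ? h : 1;	/* keep 0 reserved for "empty slot" */
}

/* Returns true if the hash was already cached, false if it was just added. */
bool chain_lookup_or_insert(uint64_t hash)
{
	size_t slot = (size_t)(hash & (CHAIN_TABLE_SIZE - 1));

	for (size_t probes = 0; probes < CHAIN_TABLE_SIZE; probes++) {
		if (chain_table[slot] == hash)
			return true;
		if (chain_table[slot] == 0) {
			chain_table[slot] = hash;
			return false;
		}
		slot = (slot + 1) & (CHAIN_TABLE_SIZE - 1);
	}
	assert(0 && "chain table full");
	return false;
}

/* Placeholder for the expensive dependency and state-rule checks. */
void validate_chain_slow(const uint32_t *classes, size_t depth)
{
	(void)classes;
	(void)depth;
	full_validations++;
}

/*
 * Called with the current stack of held lock classes.  Returns true when
 * the chain was seen before and the expensive checks were skipped.
 */
bool check_chain(const uint32_t *classes, size_t depth)
{
	uint64_t hash = chain_hash(classes, depth);

	if (chain_lookup_or_insert(hash))
		return true;
	validate_chain_slow(classes, depth);
	return false;
}
```

Calling check_chain() twice with the same stack of class ids runs the slow
path only on the first call. Like the document's footnote, the sketch treats
the hash as unique per chain: two different chains that collided would share
one cache entry and the second would go unvalidated.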
diff --git a/Documentation/md.txt b/Documentation/md.txt
index 03a13c4..0668f9d 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -200,6 +200,17 @@
      This can be written only while the array is being assembled, not
      after it is started.
 
+  layout
+     The "layout" for the array for the particular level.  This is
+     simply a number that is interpreted differently by different
+     levels.  It can be written while assembling an array.
+
+  resync_start
+     The point at which resync should start.  If no resync is needed,
+     this will be a very large number.  At array creation it will
+     default to 0, though starting the array as 'clean' will
+     set it much larger.
+
    new_dev
      This file can be written but not read.  The value written should
      be a block device number as major:minor.  e.g. 8:0
@@ -207,6 +218,54 @@
      available.  It will then appear at md/dev-XXX (depending on the
      name of the device) and further configuration is then possible.
 
+   safe_mode_delay
+     When an md array has seen no write requests for a certain period
+     of time, it will be marked as 'clean'.  When another write
+     request arrives, the array is marked as 'dirty' before the write
+     commences.  This is known as 'safe_mode'.
+     The 'certain period' is controlled by this file which stores the
+     period as a number of seconds.  The default is 200msec (0.200).
+     Writing a value of 0 disables safemode.
+
+   array_state
+     This file contains a single word which describes the current
+     state of the array.  In many cases, the state can be set by
+     writing the word for the desired state, however some states
+     cannot be explicitly set, and some transitions are not allowed.
+
+     clear
+         No devices, no size, no level
+         Writing is equivalent to STOP_ARRAY ioctl
+     inactive
+         May have some settings, but array is not active
+            all IO results in error
+         When written, doesn't tear down array, but just stops it
+     suspended (not supported yet)
+         All IO requests will block. The array can be reconfigured.
+         Writing this, if accepted, will block until array is quiescent
+     readonly
+         no resync can happen.  no superblocks get written.
+         write requests fail
+     read-auto
+         like readonly, but behaves like 'clean' on a write request.
+
+     clean - no pending writes, but otherwise active.
+         When written to inactive array, starts without resync
+         If a write request arrives then
+           if metadata is known, mark 'dirty' and switch to 'active'.
+           if not known, block and switch to write-pending
+         If written to an active array that has pending writes, then fails.
+     active
+         fully active: IO and resync can be happening.
+         When written to inactive array, starts with resync
+
+     write-pending
+         clean, but writes are blocked waiting for 'active' to be written.
+
+     active-idle
+         like active, but no writes have been seen for a while (safe_mode_delay).
+
+
    sync_speed_min
    sync_speed_max
      This are similar to /proc/sys/dev/raid/speed_limit_{min,max}
@@ -250,10 +309,18 @@
 	      faulty   - device has been kicked from active use due to
                          a detected fault
 	      in_sync  - device is a fully in-sync member of the array
+	      writemostly - device will only be subject to read
+		         requests if there are no other options.
+			 This applies only to raid1 arrays.
 	      spare    - device is working, but not a full member.
 			 This includes spares that are in the process
 			 of being recoverred to
 	This list make grow in future.
+	This can be written to.
+	Writing "faulty"  simulates a failure on the device.
+	Writing "remove" removes the device from the array.
+	Writing "writemostly" sets the writemostly flag.
+	Writing "-writemostly" clears the writemostly flag.
 
       errors
 	An approximate count of read errors that have been detected on
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index c61d8b8..46b9b38 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -19,6 +19,7 @@
      - Control dependencies.
      - SMP barrier pairing.
      - Examples of memory barrier sequences.
+     - Read memory barriers vs load speculation.
 
  (*) Explicit kernel barriers.
 
@@ -248,7 +249,7 @@
      we may get either of:
 
 	STORE *A = X; Y = LOAD *A;
-	STORE *A = Y;
+	STORE *A = Y = X;
 
 
 =========================
@@ -261,9 +262,14 @@
 CPU to restrict the order.
 
 Memory barriers are such interventions.  They impose a perceived partial
-ordering between the memory operations specified on either side of the barrier.
-They request that the sequence of memory events generated appears to other
-parts of the system as if the barrier is effective on that CPU.
+ordering over the memory operations on either side of the barrier.
+
+Such enforcement is important because the CPUs and other devices in a system
+can use a variety of tricks to improve performance - including reordering,
+deferral and combination of memory operations; speculative loads; speculative
+branch prediction and various types of caching.  Memory barriers are used to
+override or suppress these tricks, allowing the code to sanely control the
+interaction of multiple CPUs and/or devices.
 
 
 VARIETIES OF MEMORY BARRIER
@@ -281,7 +287,7 @@
      A write barrier is a partial ordering on stores only; it is not required
      to have any effect on loads.
 
-     A CPU can be viewed as as commiting a sequence of store operations to the
+     A CPU can be viewed as committing a sequence of store operations to the
      memory system as time progresses.  All stores before a write barrier will
      occur in the sequence _before_ all the stores after the write barrier.
 
@@ -344,9 +350,12 @@
 
  (4) General memory barriers.
 
-     A general memory barrier is a combination of both a read memory barrier
-     and a write memory barrier.  It is a partial ordering over both loads and
-     stores.
+     A general memory barrier gives a guarantee that all the LOAD and STORE
+     operations specified before the barrier will appear to happen before all
+     the LOAD and STORE operations specified after the barrier with respect to
+     the other components of the system.
+
+     A general memory barrier is a partial ordering over both loads and stores.
 
      General memory barriers imply both read and write memory barriers, and so
      can substitute for either.
@@ -409,7 +418,7 @@
      indirect effect will be the order in which the second CPU sees the effects
      of the first CPU's accesses occur, but see the next point:
 
- (*) There is no guarantee that the a CPU will see the correct order of effects
+ (*) There is no guarantee that a CPU will see the correct order of effects
      from a second CPU's accesses, even _if_ the second CPU uses a memory
      barrier, unless the first CPU _also_ uses a matching memory barrier (see
      the subsection on "SMP Barrier Pairing").
@@ -457,8 +466,8 @@
 isn't, and this behaviour can be observed on certain real CPUs (such as the DEC
 Alpha).
 
-To deal with this, a data dependency barrier must be inserted between the
-address load and the data load:
+To deal with this, a data dependency barrier or better must be inserted
+between the address load and the data load:
 
 	CPU 1		CPU 2
 	===============	===============
@@ -480,7 +489,7 @@
 variable B might be stored in an even-numbered cache line.  Then, if the
 even-numbered bank of the reading CPU's cache is extremely busy while the
 odd-numbered bank is idle, one can see the new value of the pointer P (&B),
-but the old value of the variable B (1).
+but the old value of the variable B (2).
 
 
 Another example of where data dependency barriers might by required is where a
@@ -546,9 +555,9 @@
 	===============	===============
 	a = 1;
 	<write barrier>
-	b = 2;		x = a;
+	b = 2;		x = b;
 			<read barrier>
-			y = b;
+			y = a;
 
 Or:
 
@@ -563,6 +572,18 @@
 Basically, the read barrier always has to be there, even though it can be of
 the "weaker" type.
 
+[!] Note that the stores before the write barrier would normally be expected to
+match the loads after the read barrier or data dependency barrier, and vice
+versa:
+
+	CPU 1                           CPU 2
+	===============                 ===============
+	a = 1;           }----   --->{  v = c;
+	b = 2;           }    \ /    {  w = d;
+	<write barrier>        \        <read barrier>
+	c = 3;           }    / \    {  x = a;
+	d = 4;           }----   --->{  y = b;
+
 
 EXAMPLES OF MEMORY BARRIER SEQUENCES
 ------------------------------------
@@ -581,7 +602,7 @@
 
 This sequence of events is committed to the memory coherence system in an order
 that the rest of the system might perceive as the unordered set of { STORE A,
-STORE B, STORE C } all occuring before the unordered set of { STORE D, STORE E
+STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
 }:
 
 	+-------+       :      :
@@ -600,8 +621,8 @@
 	|       |       +------+
 	+-------+       :      :
 	                   |
-	                   | Sequence in which stores committed to memory system
-	                   | by CPU 1
+	                   | Sequence in which stores are committed to the
+	                   | memory system by CPU 1
 	                   V
 
 
@@ -683,14 +704,12 @@
 	                               |        :       :       |       |
 	                               |        :       :       | CPU 2 |
 	                               |        +-------+       |       |
-	                                \       | X->9  |------>|       |
-	                                 \      +-------+       |       |
-	                                  ----->| B->2  |       |       |
-	                                        +-------+       |       |
-	     Makes sure all effects --->    ddddddddddddddddd   |       |
-	     prior to the store of C            +-------+       |       |
-	     are perceptible to                 | B->2  |------>|       |
-	     successive loads                   +-------+       |       |
+	                               |        | X->9  |------>|       |
+	                               |        +-------+       |       |
+	  Makes sure all effects --->   \   ddddddddddddddddd   |       |
+	  prior to the store of C        \      +-------+       |       |
+	  are perceptible to              ----->| B->2  |------>|       |
+	  subsequent loads                      +-------+       |       |
 	                                        :       :       +-------+
 
 
@@ -699,73 +718,239 @@
 
 	CPU 1			CPU 2
 	=======================	=======================
+		{ A = 0, B = 9 }
 	STORE A=1
-	STORE B=2
-	STORE C=3
 	<write barrier>
-	STORE D=4
-	STORE E=5
-				LOAD A
+	STORE B=2
 				LOAD B
-				LOAD C
-				LOAD D
-				LOAD E
+				LOAD A
 
 Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in
 some effectively random order, despite the write barrier issued by CPU 1:
 
-	+-------+       :      :
-	|       |       +------+
-	|       |------>| C=3  | }
-	|       |  :    +------+ }
-	|       |  :    | A=1  | }
-	|       |  :    +------+ }
-	| CPU 1 |  :    | B=2  | }---
-	|       |       +------+ }   \
-	|       |   wwwwwwwwwwwww}    \
-	|       |       +------+ }     \          :       :       +-------+
-	|       |  :    | E=5  | }      \         +-------+       |       |
-	|       |  :    +------+ }       \      { | C->3  |------>|       |
-	|       |------>| D=4  | }        \     { +-------+    :  |       |
-	|       |       +------+           \    { | E->5  |    :  |       |
-	+-------+       :      :            \   { +-------+    :  |       |
-	                           Transfer  -->{ | A->1  |    :  | CPU 2 |
-	                          from CPU 1    { +-------+    :  |       |
-	                           to CPU 2     { | D->4  |    :  |       |
-	                                        { +-------+    :  |       |
-	                                        { | B->2  |------>|       |
-	                                          +-------+       |       |
-	                                          :       :       +-------+
+	+-------+       :      :                :       :
+	|       |       +------+                +-------+
+	|       |------>| A=1  |------      --->| A->0  |
+	|       |       +------+      \         +-------+
+	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
+	|       |       +------+        |       +-------+
+	|       |------>| B=2  |---     |       :       :
+	|       |       +------+   \    |       :       :       +-------+
+	+-------+       :      :    \   |       +-------+       |       |
+	                             ---------->| B->2  |------>|       |
+	                                |       +-------+       | CPU 2 |
+	                                |       | A->0  |------>|       |
+	                                |       +-------+       |       |
+	                                |       :       :       +-------+
+	                                 \      :       :
+	                                  \     +-------+
+	                                   ---->| A->1  |
+	                                        +-------+
+	                                        :       :
 
 
-If, however, a read barrier were to be placed between the load of C and the
-load of D on CPU 2, then the partial ordering imposed by CPU 1 will be
-perceived correctly by CPU 2.
+If, however, a read barrier were to be placed between the load of B and the
+load of A on CPU 2:
 
-	+-------+       :      :
-	|       |       +------+
-	|       |------>| C=3  | }
-	|       |  :    +------+ }
-	|       |  :    | A=1  | }---
-	|       |  :    +------+ }   \
-	| CPU 1 |  :    | B=2  | }    \
-	|       |       +------+       \
-	|       |   wwwwwwwwwwwwwwww    \
-	|       |       +------+         \        :       :       +-------+
-	|       |  :    | E=5  | }        \       +-------+       |       |
-	|       |  :    +------+ }---      \    { | C->3  |------>|       |
-	|       |------>| D=4  | }   \      \   { +-------+    :  |       |
-	|       |       +------+      \      -->{ | B->2  |    :  |       |
-	+-------+       :      :       \        { +-------+    :  |       |
-	                                \       { | A->1  |    :  | CPU 2 |
-	                                 \        +-------+       |       |
-	   At this point the read ---->   \   rrrrrrrrrrrrrrrrr   |       |
-	   barrier causes all effects      \      +-------+       |       |
-	   prior to the storage of C        \   { | E->5  |    :  |       |
-	   to be perceptible to CPU 2        -->{ +-------+    :  |       |
-	                                        { | D->4  |------>|       |
-	                                          +-------+       |       |
-	                                          :       :       +-------+
+	CPU 1			CPU 2
+	=======================	=======================
+		{ A = 0, B = 9 }
+	STORE A=1
+	<write barrier>
+	STORE B=2
+				LOAD B
+				<read barrier>
+				LOAD A
+
+then the partial ordering imposed by CPU 1 will be perceived correctly by CPU
+2:
+
+	+-------+       :      :                :       :
+	|       |       +------+                +-------+
+	|       |------>| A=1  |------      --->| A->0  |
+	|       |       +------+      \         +-------+
+	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
+	|       |       +------+        |       +-------+
+	|       |------>| B=2  |---     |       :       :
+	|       |       +------+   \    |       :       :       +-------+
+	+-------+       :      :    \   |       +-------+       |       |
+	                             ---------->| B->2  |------>|       |
+	                                |       +-------+       | CPU 2 |
+	                                |       :       :       |       |
+	                                |       :       :       |       |
+	  At this point the read ---->   \  rrrrrrrrrrrrrrrrr   |       |
+	  barrier causes all effects      \     +-------+       |       |
+	  prior to the storage of B        ---->| A->1  |------>|       |
+	  to be perceptible to CPU 2            +-------+       |       |
+	                                        :       :       +-------+
+
+
+To illustrate this more completely, consider what could happen if the code
+contained a load of A either side of the read barrier:
+
+	CPU 1			CPU 2
+	=======================	=======================
+		{ A = 0, B = 9 }
+	STORE A=1
+	<write barrier>
+	STORE B=2
+				LOAD B
+				LOAD A [first load of A]
+				<read barrier>
+				LOAD A [second load of A]
+
+Even though the two loads of A both occur after the load of B, they may come
+up with different values:
+
+	+-------+       :      :                :       :
+	|       |       +------+                +-------+
+	|       |------>| A=1  |------      --->| A->0  |
+	|       |       +------+      \         +-------+
+	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
+	|       |       +------+        |       +-------+
+	|       |------>| B=2  |---     |       :       :
+	|       |       +------+   \    |       :       :       +-------+
+	+-------+       :      :    \   |       +-------+       |       |
+	                             ---------->| B->2  |------>|       |
+	                                |       +-------+       | CPU 2 |
+	                                |       :       :       |       |
+	                                |       :       :       |       |
+	                                |       +-------+       |       |
+	                                |       | A->0  |------>| 1st   |
+	                                |       +-------+       |       |
+	  At this point the read ---->   \  rrrrrrrrrrrrrrrrr   |       |
+	  barrier causes all effects      \     +-------+       |       |
+	  prior to the storage of B        ---->| A->1  |------>| 2nd   |
+	  to be perceptible to CPU 2            +-------+       |       |
+	                                        :       :       +-------+
+
+
+But it may be that the update to A from CPU 1 becomes perceptible to CPU 2
+before the read barrier completes anyway:
+
+	+-------+       :      :                :       :
+	|       |       +------+                +-------+
+	|       |------>| A=1  |------      --->| A->0  |
+	|       |       +------+      \         +-------+
+	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
+	|       |       +------+        |       +-------+
+	|       |------>| B=2  |---     |       :       :
+	|       |       +------+   \    |       :       :       +-------+
+	+-------+       :      :    \   |       +-------+       |       |
+	                             ---------->| B->2  |------>|       |
+	                                |       +-------+       | CPU 2 |
+	                                |       :       :       |       |
+	                                 \      :       :       |       |
+	                                  \     +-------+       |       |
+	                                   ---->| A->1  |------>| 1st   |
+	                                        +-------+       |       |
+	                                    rrrrrrrrrrrrrrrrr   |       |
+	                                        +-------+       |       |
+	                                        | A->1  |------>| 2nd   |
+	                                        +-------+       |       |
+	                                        :       :       +-------+
+
+
+The guarantee is that the second load will always come up with A == 1 if the
+load of B came up with B == 2.  No such guarantee exists for the first load of
+A; that may come up with either A == 0 or A == 1.
+
+
+READ MEMORY BARRIERS VS LOAD SPECULATION
+----------------------------------------
+
+Many CPUs speculate with loads: that is, they see that they will need to load an
+item from memory, and they find a time where they're not using the bus for any
+other loads, and so do the load in advance - even though they haven't actually
+got to that point in the instruction execution flow yet.  This permits the
+actual load instruction to potentially complete immediately because the CPU
+already has the value to hand.
+
+It may turn out that the CPU didn't actually need the value - perhaps because a
+branch circumvented the load - in which case it can discard the value or just
+cache it for later use.
+
+Consider:
+
+	CPU 1	   		CPU 2
+	=======================	=======================
+	 	   		LOAD B
+	 	   		DIVIDE		} Divide instructions generally
+	 	   		DIVIDE		} take a long time to perform
+	 	   		LOAD A
+
+Which might appear as this:
+
+	                                        :       :       +-------+
+	                                        +-------+       |       |
+	                                    --->| B->2  |------>|       |
+	                                        +-------+       | CPU 2 |
+	                                        :       :DIVIDE |       |
+	                                        +-------+       |       |
+	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
+	division speculates on the              +-------+   ~   |       |
+	LOAD of A                               :       :   ~   |       |
+	                                        :       :DIVIDE |       |
+	                                        :       :   ~   |       |
+	Once the divisions are complete -->     :       :   ~-->|       |
+	the CPU can then perform the            :       :       |       |
+	LOAD with immediate effect              :       :       +-------+
+
+
+Placing a read barrier or a data dependency barrier just before the second
+load:
+
+	CPU 1	   		CPU 2
+	=======================	=======================
+	 	   		LOAD B
+	 	   		DIVIDE
+	 	   		DIVIDE
+				<read barrier>
+	 	   		LOAD A
+
+will force any value speculatively obtained to be reconsidered to an extent
+dependent on the type of barrier used.  If there was no change made to the
+speculated memory location, then the speculated value will just be used:
+
+	                                        :       :       +-------+
+	                                        +-------+       |       |
+	                                    --->| B->2  |------>|       |
+	                                        +-------+       | CPU 2 |
+	                                        :       :DIVIDE |       |
+	                                        +-------+       |       |
+	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
+	division speculates on the              +-------+   ~   |       |
+	LOAD of A                               :       :   ~   |       |
+	                                        :       :DIVIDE |       |
+	                                        :       :   ~   |       |
+	                                        :       :   ~   |       |
+	                                    rrrrrrrrrrrrrrrr~   |       |
+	                                        :       :   ~   |       |
+	                                        :       :   ~-->|       |
+	                                        :       :       |       |
+	                                        :       :       +-------+
+
+
+but if there was an update or an invalidation from another CPU pending, then
+the speculation will be cancelled and the value reloaded:
+
+	                                        :       :       +-------+
+	                                        +-------+       |       |
+	                                    --->| B->2  |------>|       |
+	                                        +-------+       | CPU 2 |
+	                                        :       :DIVIDE |       |
+	                                        +-------+       |       |
+	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
+	division speculates on the              +-------+   ~   |       |
+	LOAD of A                               :       :   ~   |       |
+	                                        :       :DIVIDE |       |
+	                                        :       :   ~   |       |
+	                                        :       :   ~   |       |
+	                                    rrrrrrrrrrrrrrrrr   |       |
+	                                        +-------+       |       |
+	The speculation is discarded --->   --->| A->1  |------>|       |
+	and an updated value is                 +-------+       |       |
+	retrieved                               :       :       +-------+
 
 
 ========================
@@ -830,10 +1015,9 @@
 There are some more advanced barrier functions:
 
  (*) set_mb(var, value)
- (*) set_wmb(var, value)
 
-     These assign the value to the variable and then insert at least a write
-     barrier after it, depending on the function.  They aren't guaranteed to
+     This assigns the value to the variable and then inserts at least a write
+     barrier after it, depending on the function.  It isn't guaranteed to
      insert anything more than a compiler barrier in a UP compilation.
 
 
@@ -901,7 +1085,7 @@
 ===============================
 
 Some of the other functions in the linux kernel imply memory barriers, amongst
-which are locking, scheduling and memory allocation functions.
+which are locking and scheduling functions.
 
 This specification is a _minimum_ guarantee; any particular architecture may
 provide more substantial guarantees, but these may not be relied upon outside
@@ -966,6 +1150,20 @@
     barriers is that the effects instructions outside of a critical section may
     seep into the inside of the critical section.
 
+A LOCK followed by an UNLOCK may not be assumed to be a full memory barrier
+because it is possible for an access preceding the LOCK to happen after the
+LOCK, and an access following the UNLOCK to happen before the UNLOCK, and the
+two accesses can themselves then cross:
+
+	*A = a;
+	LOCK
+	UNLOCK
+	*B = b;
+
+may occur as:
+
+	LOCK, STORE *B, STORE *A, UNLOCK
+
 Locks and semaphores may not provide any guarantee of ordering on UP compiled
 systems, and so cannot be counted on in such a situation to actually achieve
 anything at all - especially with respect to I/O accesses - unless combined
@@ -1016,8 +1214,6 @@
 
  (*) schedule() and similar imply full memory barriers.
 
- (*) Memory allocation and release functions imply full memory barriers.
-
 
 =================================
 INTER-CPU LOCKING BARRIER EFFECTS
@@ -1269,9 +1465,8 @@
 
 On a UP system - where this wouldn't be a problem - the smp_mb() is just a
 compiler barrier, thus making sure the compiler emits the instructions in the
-right order without actually intervening in the CPU.  Since there there's only
-one CPU, that CPU's dependency ordering logic will take care of everything
-else.
+right order without actually intervening in the CPU.  Since there's only one
+CPU, that CPU's dependency ordering logic will take care of everything else.
 
 
 ATOMIC OPERATIONS
@@ -1448,9 +1643,9 @@
 
      The PCI bus, amongst others, defines an I/O space concept - which on such
      CPUs as i386 and x86_64 cpus readily maps to the CPU's concept of I/O
-     space.  However, it may also mapped as a virtual I/O space in the CPU's
-     memory map, particularly on those CPUs that don't support alternate
-     I/O spaces.
+     space.  However, it may also be mapped as a virtual I/O space in the CPU's
+     memory map, particularly on those CPUs that don't support alternate I/O
+     spaces.
 
      Accesses to this space may be fully synchronous (as on i386), but
      intermediary bridges (such as the PCI host bridge) may not fully honour
diff --git a/Documentation/mips/time.README b/Documentation/mips/time.README
index 70bc0dd..69ddc5c 100644
--- a/Documentation/mips/time.README
+++ b/Documentation/mips/time.README
@@ -65,7 +65,7 @@
 	1. (optional) set up RTC routines
 	2. (optional) calibrate and set the mips_counter_frequency
 
-  b) board_timer_setup - a function pointer.  Invoked at the end of time_init()
+  b) plat_timer_setup - a function pointer.  Invoked at the end of time_init()
 	1. (optional) over-ride any decisions made in time_init()
 	2. set up the irqaction for timer interrupt.
 	3. enable the timer interrupt
@@ -116,19 +116,17 @@
 
   If you supply board_time_init(), set the function poointer.
 
-  Set the function pointer board_timer_setup() (mandatory)
 
-
-Step 3: implement rtc routines, board_time_init() and board_timer_setup()
+Step 3: implement rtc routines, board_time_init() and plat_timer_setup()
   if needed.
 
-  board_time_init() - 
+  board_time_init() -
   	a) (optional) set up RTC routines, 
         b) (optional) calibrate and set the mips_counter_frequency
  	    (only needed if you intended to use fixed_rate_gettimeoffset
  	     or use cpu counter as timer interrupt source)
 
-  board_timer_setup() - 
+  plat_timer_setup() -
  	a) (optional) over-write any choices made above by time_init().
  	b) machine specific code should setup the timer irqaction.
  	c) enable the timer interrupt
diff --git a/Documentation/networking/README.ipw2200 b/Documentation/networking/README.ipw2200
index acb30c5..4f2a40f 100644
--- a/Documentation/networking/README.ipw2200
+++ b/Documentation/networking/README.ipw2200
@@ -14,8 +14,8 @@
 
 README.ipw2200
 
-Version: 1.0.8
-Date   : October 20, 2005
+Version: 1.1.2
+Date   : March 30, 2006
 
 
 Index
@@ -103,7 +103,7 @@
 
 1.1. Overview of Features
 -----------------------------------------------
-The current release (1.0.8) supports the following features:
+The current release (1.1.2) supports the following features:
 
 + BSS mode (Infrastructure, Managed)
 + IBSS mode (Ad-Hoc)
@@ -247,8 +247,8 @@
 % cat /sys/bus/pci/drivers/ipw2200/debug_level
 
 Will report the current debug level of the driver's logging subsystem 
-(only available if CONFIG_IPW_DEBUG was configured when the driver was 
-built).
+(only available if CONFIG_IPW2200_DEBUG was configured when the driver
+was built).
 
 You can set the debug level via:
 
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 8d8b4e5..afac780 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -1,7 +1,7 @@
 
 		Linux Ethernet Bonding Driver HOWTO
 
-		Latest update: 21 June 2005
+		Latest update: 24 April 2006
 
 Initial release : Thomas Davis <tadavis at lbl.gov>
 Corrections, HA extensions : 2000/10/03-15 :
@@ -12,6 +12,8 @@
   - Jay Vosburgh <fubar at us dot ibm dot com>
 
 Reorganized and updated Feb 2005 by Jay Vosburgh
+Added Sysfs information: 2006/04/24
+  - Mitch Williams <mitch.a.williams at intel.com>
 
 Introduction
 ============
@@ -38,61 +40,62 @@
 2. Bonding Driver Options
 
 3. Configuring Bonding Devices
-3.1	Configuration with sysconfig support
-3.1.1		Using DHCP with sysconfig
-3.1.2		Configuring Multiple Bonds with sysconfig
-3.2	Configuration with initscripts support
-3.2.1		Using DHCP with initscripts
-3.2.2		Configuring Multiple Bonds with initscripts
-3.3	Configuring Bonding Manually
+3.1	Configuration with Sysconfig Support
+3.1.1		Using DHCP with Sysconfig
+3.1.2		Configuring Multiple Bonds with Sysconfig
+3.2	Configuration with Initscripts Support
+3.2.1		Using DHCP with Initscripts
+3.2.2		Configuring Multiple Bonds with Initscripts
+3.3	Configuring Bonding Manually with Ifenslave
 3.3.1		Configuring Multiple Bonds Manually
+3.4	Configuring Bonding Manually via Sysfs
 
-5. Querying Bonding Configuration
-5.1	Bonding Configuration
-5.2	Network Configuration
+4. Querying Bonding Configuration
+4.1	Bonding Configuration
+4.2	Network Configuration
 
-6. Switch Configuration
+5. Switch Configuration
 
-7. 802.1q VLAN Support
+6. 802.1q VLAN Support
 
-8. Link Monitoring
-8.1	ARP Monitor Operation
-8.2	Configuring Multiple ARP Targets
-8.3	MII Monitor Operation
+7. Link Monitoring
+7.1	ARP Monitor Operation
+7.2	Configuring Multiple ARP Targets
+7.3	MII Monitor Operation
 
-9. Potential Trouble Sources
-9.1	Adventures in Routing
-9.2	Ethernet Device Renaming
-9.3	Painfully Slow Or No Failed Link Detection By Miimon
+8. Potential Trouble Sources
+8.1	Adventures in Routing
+8.2	Ethernet Device Renaming
+8.3	Painfully Slow Or No Failed Link Detection By Miimon
 
-10. SNMP agents
+9. SNMP agents
 
-11. Promiscuous mode
+10. Promiscuous mode
 
-12. Configuring Bonding for High Availability
-12.1	High Availability in a Single Switch Topology
-12.2	High Availability in a Multiple Switch Topology
-12.2.1		HA Bonding Mode Selection for Multiple Switch Topology
-12.2.2		HA Link Monitoring for Multiple Switch Topology
+11. Configuring Bonding for High Availability
+11.1	High Availability in a Single Switch Topology
+11.2	High Availability in a Multiple Switch Topology
+11.2.1		HA Bonding Mode Selection for Multiple Switch Topology
+11.2.2		HA Link Monitoring for Multiple Switch Topology
 
-13. Configuring Bonding for Maximum Throughput
-13.1	Maximum Throughput in a Single Switch Topology
-13.1.1		MT Bonding Mode Selection for Single Switch Topology
-13.1.2		MT Link Monitoring for Single Switch Topology
-13.2	Maximum Throughput in a Multiple Switch Topology
-13.2.1		MT Bonding Mode Selection for Multiple Switch Topology
-13.2.2		MT Link Monitoring for Multiple Switch Topology
+12. Configuring Bonding for Maximum Throughput
+12.1	Maximum Throughput in a Single Switch Topology
+12.1.1		MT Bonding Mode Selection for Single Switch Topology
+12.1.2		MT Link Monitoring for Single Switch Topology
+12.2	Maximum Throughput in a Multiple Switch Topology
+12.2.1		MT Bonding Mode Selection for Multiple Switch Topology
+12.2.2		MT Link Monitoring for Multiple Switch Topology
 
-14. Switch Behavior Issues
-14.1	Link Establishment and Failover Delays
-14.2	Duplicated Incoming Packets
+13. Switch Behavior Issues
+13.1	Link Establishment and Failover Delays
+13.2	Duplicated Incoming Packets
 
-15. Hardware Specific Considerations
-15.1	IBM BladeCenter
+14. Hardware Specific Considerations
+14.1	IBM BladeCenter
 
-16. Frequently Asked Questions
+15. Frequently Asked Questions
 
-17. Resources and Links
+16. Resources and Links
 
 
 1. Bonding Driver Installation
@@ -156,6 +159,9 @@
 onwards) do not have /usr/include/linux symbolically linked to the
 default kernel source include directory.
 
+SECOND IMPORTANT NOTE:
+	If you plan to configure bonding using sysfs, you do not need
+to use ifenslave.
 
 2. Bonding Driver Options
 =========================
@@ -270,7 +276,7 @@
 		In bonding version 2.6.2 or later, when a failover
 		occurs in active-backup mode, bonding will issue one
 		or more gratuitous ARPs on the newly active slave.
-		One gratutious ARP is issued for the bonding master
+		One gratuitous ARP is issued for the bonding master
 		interface and each VLAN interfaces configured above
 		it, provided that the interface has at least one IP
 		address configured.  Gratuitous ARPs issued for VLAN
@@ -377,7 +383,7 @@
 		When a link is reconnected or a new slave joins the
 		bond the receive traffic is redistributed among all
 		active slaves in the bond by initiating ARP Replies
-		with the selected mac address to each of the
+		with the selected MAC address to each of the
 		clients. The updelay parameter (detailed below) must
 		be set to a value equal or greater than the switch's
 		forwarding delay so that the ARP Replies sent to the
@@ -498,11 +504,12 @@
 3. Configuring Bonding Devices
 ==============================
 
-	There are, essentially, two methods for configuring bonding:
-with support from the distro's network initialization scripts, and
-without.  Distros generally use one of two packages for the network
-initialization scripts: initscripts or sysconfig.  Recent versions of
-these packages have support for bonding, while older versions do not.
+	You can configure bonding using either your distro's network
+initialization scripts, or manually using either ifenslave or the
+sysfs interface.  Distros generally use one of two packages for the
+network initialization scripts: initscripts or sysconfig.  Recent
+versions of these packages have support for bonding, while older
+versions do not.
 
 	We will first describe the options for configuring bonding for
 distros using versions of initscripts and sysconfig with full or
@@ -530,7 +537,7 @@
 	If this returns any matches, then your initscripts or
 sysconfig has support for bonding.
 
-3.1 Configuration with sysconfig support
+3.1 Configuration with Sysconfig Support
 ----------------------------------------
 
 	This section applies to distros using a version of sysconfig
@@ -538,7 +545,7 @@
 
 	SuSE SLES 9's networking configuration system does support
 bonding, however, at this writing, the YaST system configuration
-frontend does not provide any means to work with bonding devices.
+front end does not provide any means to work with bonding devices.
 Bonding devices can be managed by hand, however, as follows.
 
 	First, if they have not already been configured, configure the
@@ -660,7 +667,7 @@
 	Note that the template does not document the various BONDING_
 settings described above, but does describe many of the other options.
 
-3.1.1 Using DHCP with sysconfig
+3.1.1 Using DHCP with Sysconfig
 -------------------------------
 
 	Under sysconfig, configuring a device with BOOTPROTO='dhcp'
@@ -670,7 +677,7 @@
 the slave devices.  Without active slaves, the DHCP requests are not
 sent to the network.
 
-3.1.2 Configuring Multiple Bonds with sysconfig
+3.1.2 Configuring Multiple Bonds with Sysconfig
 -----------------------------------------------
 
 	The sysconfig network initialization system is capable of
@@ -685,7 +692,7 @@
 options in the ifcfg-bondX file, it is not necessary to add them to
 the system /etc/modules.conf or /etc/modprobe.conf configuration file.
 
-3.2 Configuration with initscripts support
+3.2 Configuration with Initscripts Support
 ------------------------------------------
 
 	This section applies to distros using a version of initscripts
@@ -756,7 +763,7 @@
 will restart the networking subsystem and your bond link should be now
 up and running.
 
-3.2.1 Using DHCP with initscripts
+3.2.1 Using DHCP with Initscripts
 ---------------------------------
 
 	Recent versions of initscripts (the version supplied with
@@ -768,7 +775,7 @@
 and add a line consisting of "TYPE=Bonding".  Note that the TYPE value
 is case sensitive.
 
-3.2.2 Configuring Multiple Bonds with initscripts
+3.2.2 Configuring Multiple Bonds with Initscripts
 -------------------------------------------------
 
 	At this writing, the initscripts package does not directly
@@ -784,8 +791,8 @@
 exhibiting this problem, it will be impossible to configure multiple
 bonds with differing parameters.
 
-3.3 Configuring Bonding Manually
---------------------------------
+3.3 Configuring Bonding Manually with Ifenslave
+-----------------------------------------------
 
 	This section applies to distros whose network initialization
 scripts (the sysconfig or initscripts package) do not have specific
@@ -889,11 +896,139 @@
 	This may be repeated any number of times, specifying a new and
 unique name in place of bond1 for each subsequent instance.
 
+3.4 Configuring Bonding Manually via Sysfs
+------------------------------------------
 
-5. Querying Bonding Configuration 
+	Starting with version 3.0, Channel Bonding may be configured
+via the sysfs interface.  This interface allows dynamic configuration
+of all bonds in the system without unloading the module.  It also
+allows for adding and removing bonds at runtime.  Ifenslave is no
+longer required, though it is still supported.
+
+	Use of the sysfs interface allows you to use multiple bonds
+with different configurations without having to reload the module.
+It also allows you to use multiple, differently configured bonds when
+bonding is compiled into the kernel.
+
+	You must have the sysfs filesystem mounted to configure
+bonding this way.  The examples in this document assume that you
+are using the standard mount point for sysfs, i.e., /sys.  If your
+sysfs filesystem is mounted elsewhere, you will need to adjust the
+example paths accordingly.
+
+Creating and Destroying Bonds
+-----------------------------
+To add a new bond foo:
+# echo +foo > /sys/class/net/bonding_masters
+
+To remove an existing bond bar:
+# echo -bar > /sys/class/net/bonding_masters
+
+To show all existing bonds:
+# cat /sys/class/net/bonding_masters
+
+NOTE: due to the 4K size limitation of sysfs files, this list may be
+truncated if you have more than a few hundred bonds.  This is unlikely
+to occur under normal operating conditions.
+
+Adding and Removing Slaves
+--------------------------
+	Interfaces may be enslaved to a bond using the file
+/sys/class/net/<bond>/bonding/slaves.  The semantics for this file
+are the same as for the bonding_masters file.
+
+To enslave interface eth0 to bond bond0:
+# ifconfig bond0 up
+# echo +eth0 > /sys/class/net/bond0/bonding/slaves
+
+To free slave eth0 from bond bond0:
+# echo -eth0 > /sys/class/net/bond0/bonding/slaves
+
+	NOTE: The bond must be up before slaves can be added.  All
+slaves are freed when the interface is brought down.
+
+	When an interface is enslaved to a bond, symlinks between the
+two are created in the sysfs filesystem.  In this case, you would get
+/sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and
+/sys/class/net/eth0/master pointing to /sys/class/net/bond0.
+
+	This means that you can tell quickly whether or not an
+interface is enslaved by looking for the master symlink.  Thus:
+# echo -eth0 > /sys/class/net/eth0/master/bonding/slaves
+will free eth0 from whatever bond it is enslaved to, regardless of
+the name of the bond interface.
+
+Changing a Bond's Configuration
+-------------------------------
+	Each bond may be configured individually by manipulating the
+files located in /sys/class/net/<bond name>/bonding
+
+	The names of these files correspond directly with the command-
+line parameters described elsewhere in this file, and, with the
+exception of arp_ip_target, they accept the same values.  To see the
+current setting, simply cat the appropriate file.
+
+	A few examples will be given here; for specific usage
+guidelines for each parameter, see the appropriate section in this
+document.
+
+To configure bond0 for balance-alb mode:
+# ifconfig bond0 down
+# echo 6 > /sys/class/net/bond0/bonding/mode
+ - or -
+# echo balance-alb > /sys/class/net/bond0/bonding/mode
+	NOTE: The bond interface must be down before the mode can be
+changed.
+
+To enable MII monitoring on bond0 with a 1 second interval:
+# echo 1000 > /sys/class/net/bond0/bonding/miimon
+	NOTE: If ARP monitoring is enabled, it will be disabled when MII
+monitoring is enabled, and vice-versa.
+
+To add ARP targets:
+# echo +192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
+# echo +192.168.0.101 > /sys/class/net/bond0/bonding/arp_ip_target
+	NOTE:  up to 10 target addresses may be specified.
+
+To remove an ARP target:
+# echo -192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
+
+Example Configuration
+---------------------
+	We begin with the same example that is shown in section 3.3,
+executed with sysfs, and without using ifenslave.
+
+	To make a simple bond of two e100 devices (presumed to be eth0
+and eth1), and have it persist across reboots, edit the appropriate
+file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the
+following:
+
+modprobe bonding
+modprobe e100
+echo balance-alb > /sys/class/net/bond0/bonding/mode
+ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
+echo 100 > /sys/class/net/bond0/bonding/miimon
+echo +eth0 > /sys/class/net/bond0/bonding/slaves
+echo +eth1 > /sys/class/net/bond0/bonding/slaves
+
+	To add a second bond, with two e1000 interfaces in
+active-backup mode, using ARP monitoring, add the following lines to
+your init script:
+
+modprobe e1000
+echo +bond1 > /sys/class/net/bonding_masters
+echo active-backup > /sys/class/net/bond1/bonding/mode
+ifconfig bond1 192.168.2.1 netmask 255.255.255.0 up
+echo +192.168.2.100 > /sys/class/net/bond1/bonding/arp_ip_target
+echo 2000 > /sys/class/net/bond1/bonding/arp_interval
+echo +eth2 > /sys/class/net/bond1/bonding/slaves
+echo +eth3 > /sys/class/net/bond1/bonding/slaves
+
+
+4. Querying Bonding Configuration 
 =================================
 
-5.1 Bonding Configuration
+4.1 Bonding Configuration
 -------------------------
 
 	Each bonding device has a read-only file residing in the
@@ -923,7 +1058,7 @@
 	The precise format and contents will change depending upon the
 bonding configuration, state, and version of the bonding driver.
 
-5.2 Network configuration
+4.2 Network configuration
 -------------------------
 
 	The network configuration can be inspected using the ifconfig
@@ -958,7 +1093,7 @@
           collisions:0 txqueuelen:100
           Interrupt:9 Base address:0x1400
 
-6. Switch Configuration
+5. Switch Configuration
 =======================
 
 	For this section, "switch" refers to whatever system the
@@ -991,7 +1126,7 @@
 with another EtherChannel group.
 
 
-7. 802.1q VLAN Support
+6. 802.1q VLAN Support
 ======================
 
 	It is possible to configure VLAN devices over a bond interface
@@ -1042,7 +1177,7 @@
 mode, which might not be what you want.
 
 
-8. Link Monitoring
+7. Link Monitoring
 ==================
 
 	The bonding driver at present supports two schemes for
@@ -1053,7 +1188,7 @@
 bonding driver itself, it is not possible to enable both ARP and MII
 monitoring simultaneously.
 
-8.1 ARP Monitor Operation
+7.1 ARP Monitor Operation
 -------------------------
 
 	The ARP monitor operates as its name suggests: it sends ARP
@@ -1071,7 +1206,7 @@
 shows the ARP requests and replies on the network, then it may be that
 your device driver is not updating last_rx and trans_start.
 
-8.2 Configuring Multiple ARP Targets
+7.2 Configuring Multiple ARP Targets
 ------------------------------------
 
 	While ARP monitoring can be done with just one target, it can
@@ -1094,7 +1229,7 @@
 options bond0 arp_interval=60 arp_ip_target=192.168.0.100
 
 
-8.3 MII Monitor Operation
+7.3 MII Monitor Operation
 -------------------------
 
 	The MII monitor monitors only the carrier state of the local
@@ -1120,14 +1255,14 @@
 and ethtool requests), then the MII monitor will assume the link is
 up.
 
-9. Potential Sources of Trouble
+8. Potential Sources of Trouble
 ===============================
 
-9.1 Adventures in Routing
+8.1 Adventures in Routing
 -------------------------
 
 	When bonding is configured, it is important that the slave
-devices not have routes that supercede routes of the master (or,
+devices not have routes that supersede routes of the master (or,
 generally, not have routes at all).  For example, suppose the bonding
 device bond0 has two slaves, eth0 and eth1, and the routing table is
 as follows:
@@ -1154,11 +1289,11 @@
 
 	The solution here is simply to insure that slaves do not have
 routes of their own, and if for some reason they must, those routes do
-not supercede routes of their master.  This should generally be the
+not supersede routes of their master.  This should generally be the
 case, but unusual configurations or errant manual or automatic static
 route additions may cause trouble.
 
-9.2 Ethernet Device Renaming
+8.2 Ethernet Device Renaming
 ----------------------------
 
 	On systems with network configuration scripts that do not
@@ -1207,7 +1342,7 @@
 place.  Full documentation on this can be found in the modprobe.conf
 and modprobe manual pages.
 
-9.3. Painfully Slow Or No Failed Link Detection By Miimon
+8.3. Painfully Slow Or No Failed Link Detection By Miimon
 ---------------------------------------------------------
 
 	By default, bonding enables the use_carrier option, which
@@ -1235,7 +1370,7 @@
 beyond other ports of a switch, or if a switch is refusing to pass
 traffic while still maintaining carrier on.
 
-10. SNMP agents
+9. SNMP agents
 ===============
 
 	If running SNMP agents, the bonding driver should be loaded
@@ -1281,7 +1416,7 @@
 and SNMP functions such as Interface_Scan_Next will report that
 association.
 
-11. Promiscuous mode
+10. Promiscuous mode
 ====================
 
 	When running network monitoring tools, e.g., tcpdump, it is
@@ -1308,7 +1443,7 @@
 the active slave changes (e.g., due to a link failure), the
 promiscuous setting will be propagated to the new active slave.
 
-12. Configuring Bonding for High Availability
+11. Configuring Bonding for High Availability
 =============================================
 
 	High Availability refers to configurations that provide
@@ -1318,7 +1453,7 @@
 (i.e., the network always works), even though other configurations
 could provide higher throughput.
 
-12.1 High Availability in a Single Switch Topology
+11.1 High Availability in a Single Switch Topology
 --------------------------------------------------
 
 	If two hosts (or a host and a single switch) are directly
@@ -1332,7 +1467,7 @@
 	See Section 13, "Configuring Bonding for Maximum Throughput"
 for information on configuring bonding with one peer device.
 
-12.2 High Availability in a Multiple Switch Topology
+11.2 High Availability in a Multiple Switch Topology
 ----------------------------------------------------
 
 	With multiple switches, the configuration of bonding and the
@@ -1359,7 +1494,7 @@
 the outside world ("port3" on each switch).  There is no technical
 reason that this could not be extended to a third switch.
 
-12.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
 -------------------------------------------------------------
 
 	In a topology such as the example above, the active-backup and
@@ -1381,7 +1516,7 @@
 	necessary for some specific one-way traffic to reach both
 	independent networks, then the broadcast mode may be suitable.
 
-12.2.2 HA Link Monitoring Selection for Multiple Switch Topology
+11.2.2 HA Link Monitoring Selection for Multiple Switch Topology
 ----------------------------------------------------------------
 
 	The choice of link monitoring ultimately depends upon your
@@ -1402,10 +1537,10 @@
 target to query.
 
 
-13. Configuring Bonding for Maximum Throughput
+12. Configuring Bonding for Maximum Throughput
 ==============================================
 
-13.1 Maximizing Throughput in a Single Switch Topology
+12.1 Maximizing Throughput in a Single Switch Topology
 ------------------------------------------------------
 
 	In a single switch configuration, the best method to maximize
@@ -1476,7 +1611,7 @@
 mode is described below.
 
 
-13.1.1 MT Bonding Mode Selection for Single Switch Topology
+12.1.1 MT Bonding Mode Selection for Single Switch Topology
 -----------------------------------------------------------
 
 	This configuration is the easiest to set up and to understand,
@@ -1607,7 +1742,7 @@
 	device driver must support changing the hardware address while
 	the device is open.
 
-13.1.2 MT Link Monitoring for Single Switch Topology
+12.1.2 MT Link Monitoring for Single Switch Topology
 ----------------------------------------------------
 
 	The choice of link monitoring may largely depend upon which
@@ -1616,7 +1751,7 @@
 the MII monitor (which does not provide as high a level of end to end
 assurance as the ARP monitor).
 
-13.2 Maximum Throughput in a Multiple Switch Topology
+12.2 Maximum Throughput in a Multiple Switch Topology
 -----------------------------------------------------
 
 	Multiple switches may be utilized to optimize for throughput
@@ -1651,7 +1786,7 @@
 can be equipped with an additional network device connected to an
 external network; this host then additionally acts as a gateway.
 
-13.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
 -------------------------------------------------------------
 
 	In actual practice, the bonding mode typically employed in
@@ -1664,7 +1799,7 @@
 mode allows individual connections between two hosts to effectively
 utilize greater than one interface's bandwidth.
 
-13.2.2 MT Link Monitoring for Multiple Switch Topology
+12.2.2 MT Link Monitoring for Multiple Switch Topology
 ------------------------------------------------------
 
 	Again, in actual practice, the MII monitor is most often used
@@ -1674,10 +1809,10 @@
 needed as the number of systems involved grows (remember that each
 host in the network is configured with bonding).
 
-14. Switch Behavior Issues
+13. Switch Behavior Issues
 ==========================
 
-14.1 Link Establishment and Failover Delays
+13.1 Link Establishment and Failover Delays
 -------------------------------------------
 
 	Some switches exhibit undesirable behavior with regard to the
@@ -1712,7 +1847,7 @@
 to not activate a backup interface immediately after a link goes down.
 Failover may be delayed via the downdelay bonding module option.
 
-14.2 Duplicated Incoming Packets
+13.2 Duplicated Incoming Packets
 --------------------------------
 
 	It is not uncommon to observe a short burst of duplicated
@@ -1751,14 +1886,14 @@
 most Cisco switches, the privileged command "clear mac address-table
 dynamic" will accomplish this).
 
-15. Hardware Specific Considerations
+14. Hardware Specific Considerations
 ====================================
 
 	This section contains additional information for configuring
 bonding on specific hardware platforms, or for interfacing bonding
 with particular switches or other devices.
 
-15.1 IBM BladeCenter
+14.1 IBM BladeCenter
 --------------------
 
 	This applies to the JS20 and similar systems.
@@ -1861,7 +1996,7 @@
 avoid fail-over delay issues when using bonding.
 
 	
-16. Frequently Asked Questions
+15. Frequently Asked Questions
 ==============================
 
 1.  Is it SMP safe?
@@ -1925,7 +2060,7 @@
 support specific features (described in the appropriate section under
 module parameters, above).
 
-	In 802.3ad mode, it works with with systems that support IEEE
+	In 802.3ad mode, it works with systems that support IEEE
 802.3ad Dynamic Link Aggregation.  Most managed and many unmanaged
 switches currently available support 802.3ad.
 
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index f12007b..d46338a 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -362,6 +362,13 @@
 	not receive a window scaling option from them.
 	Default: 0
 
+tcp_slow_start_after_idle - BOOLEAN
+	If set, provide RFC2861 behavior and time out the congestion
+	window after an idle period.  An idle period is defined as
+	the current RTO.  If unset, the congestion window will not
+	be timed out after an idle period.
+	Default: 1
+
 IP Variables:
 
 ip_local_port_range - 2 INTEGERS
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
new file mode 100644
index 0000000..4ccdbca
--- /dev/null
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -0,0 +1,143 @@
+/proc/sys/net/ipv4/vs/* Variables:
+
+am_droprate - INTEGER
+        default 10
+
+        It sets the always mode drop rate, which is used in mode 3
+        of the drop_packet defense.
+
+amemthresh - INTEGER
+        default 1024
+
+        It sets the available memory threshold (in pages), which is
+        used in the automatic modes of defense. When there is not
+        enough available memory, the respective strategy will be
+        enabled and the variable is automatically set to 2, otherwise
+        the strategy is disabled and the variable is set to 1.
+
+cache_bypass - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        If it is enabled, forward packets to the original destination
+        directly when no cache server is available and the destination
+        address is not local (iph->daddr is RTN_UNICAST). It is mostly
+        used in transparent web cache clusters.
+
+debug_level - INTEGER
+	0          - transmission error messages (default)
+	1          - non-fatal error messages
+	2          - configuration
+	3          - destination trash
+	4          - drop entry
+	5          - service lookup
+	6          - scheduling
+	7          - connection new/expire, lookup and synchronization
+	8          - state transition
+	9          - binding destination, template checks and applications
+	10         - IPVS packet transmission
+	11         - IPVS packet handling (ip_vs_in/ip_vs_out)
+	12 or more - packet traversal
+
+	Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
+
+	Higher debugging levels include the messages for lower debugging
+	levels, so setting debug level 2 includes level 0, 1 and 2
+	messages. Thus, logging becomes more and more verbose the higher
+	the level.
+
+drop_entry - INTEGER
+        0  - disabled (default)
+
+        The drop_entry defense randomly drops entries in the
+        connection hash table in order to reclaim some memory for
+        new connections. In the current code, the drop_entry
+        procedure can be activated every second; it then randomly
+        scans 1/32 of the whole table and drops entries that are in
+        the SYN-RECV/SYNACK state, which should be effective against
+        SYN flooding attacks.
+
+        The valid values of drop_entry are from 0 to 3, where 0 means
+        that this strategy is always disabled, 1 and 2 mean automatic
+        modes (when there is not enough available memory, the strategy
+        is enabled and the variable is automatically set to 2,
+        otherwise the strategy is disabled and the variable is set to
+        1), and 3 means that the strategy is always enabled.
+
+drop_packet - INTEGER
+        0  - disabled (default)
+
+        The drop_packet defense is designed to drop 1/rate packets
+        before forwarding them to real servers. If the rate is 1, then
+        all incoming packets are dropped.
+
+        The value definition is the same as that of drop_entry. In
+        the automatic mode, the rate is determined by the following
+        formula: rate = amemthresh / (amemthresh - available_memory)
+        when available memory is less than the available memory
+        threshold. When mode 3 is set, the always mode drop rate
+        is controlled by /proc/sys/net/ipv4/vs/am_droprate.
+
+expire_nodest_conn - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        The default value is 0: the load balancer will silently drop
+        packets when their destination server is not available. This
+        may be useful when a user-space monitoring program deletes
+        the destination server (because of server overload or wrong
+        detection) and adds the server back later, so that the
+        connections to the server can continue.
+
+        If this feature is enabled, the load balancer will expire the
+        connection immediately when a packet arrives and its
+        destination server is not available; the client program
+        will then be notified that the connection is closed. This is
+        equivalent to the feature some people require to flush
+        connections when their destination is not available.
+
+expire_quiescent_template - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	When set to a non-zero value, the load balancer will expire
+	persistent templates when the destination server is quiescent.
+	This may be useful, when a user makes a destination server
+	quiescent by setting its weight to 0 and it is desired that
+	subsequent otherwise persistent connections are sent to a
+	different destination server.  By default new persistent
+	connections are allowed to quiescent destination servers.
+
+	If this feature is enabled, the load balancer will expire the
+	persistence template if it is to be used to schedule a new
+	connection and the destination server is quiescent.
+
+nat_icmp_send - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        It controls sending ICMP error messages (ICMP_DEST_UNREACH)
+        for VS/NAT when the load balancer receives packets from real
+        servers but the connection entries don't exist.
+
+secure_tcp - INTEGER
+        0  - disabled (default)
+
+        The secure_tcp defense uses a more complicated state
+        transition table and some possible short timeouts for each
+        state. In VS/NAT, it delays entering the ESTABLISHED state
+        until the real server starts to send data and ACK packets
+        (after the 3-way handshake).
+
+        The value definition is the same as that of drop_entry or
+        drop_packet.
+
+sync_threshold - INTEGER
+        default 3
+
+        It sets the synchronization threshold, which is the minimum
+        number of incoming packets that a connection needs to receive
+        before the connection will be synchronized. A connection will
+        be synchronized every time the number of its incoming packets
+        modulo 50 equals the threshold. The range of the threshold is
+        from 0 to 49.
diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt
index 3c0a5ba..847cedb 100644
--- a/Documentation/networking/netdevices.txt
+++ b/Documentation/networking/netdevices.txt
@@ -42,9 +42,9 @@
 	Context: nominally process, but don't sleep inside an rwlock
 
 dev->hard_start_xmit:
-	Synchronization: dev->xmit_lock spinlock.
+	Synchronization: netif_tx_lock spinlock.
 	When the driver sets NETIF_F_LLTX in dev->features this will be
-	called without holding xmit_lock. In this case the driver 
+	called without holding netif_tx_lock. In this case the driver
 	has to lock by itself when needed. It is recommended to use a try lock
 	for this and return -1 when the spin lock fails. 
 	The locking there should also properly protect against 
@@ -62,12 +62,12 @@
 	  Only valid when NETIF_F_LLTX is set.
 
 dev->tx_timeout:
-	Synchronization: dev->xmit_lock spinlock.
+	Synchronization: netif_tx_lock spinlock.
 	Context: BHs disabled
 	Notes: netif_queue_stopped() is guaranteed true
 
 dev->set_multicast_list:
-	Synchronization: dev->xmit_lock spinlock.
+	Synchronization: netif_tx_lock spinlock.
 	Context: BHs disabled
 
 dev->poll:
diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt
index 278771c..44f2f76 100644
--- a/Documentation/networking/pktgen.txt
+++ b/Documentation/networking/pktgen.txt
@@ -74,7 +74,7 @@
  pgset "pkt_size 9014"   sets packet size to 9014
  pgset "frags 5"         packet will consist of 5 fragments
  pgset "count 200000"    sets number of packets to send, set to zero
-                         for continious sends untill explicitl stopped.
+                         for continuous sends until explicitly stopped.
 
  pgset "delay 5000"      adds delay to hard_start_xmit(). nanoseconds
 
diff --git a/Documentation/networking/tuntap.txt b/Documentation/networking/tuntap.txt
index 76750fb..839cbb7 100644
--- a/Documentation/networking/tuntap.txt
+++ b/Documentation/networking/tuntap.txt
@@ -39,10 +39,13 @@
      mknod /dev/net/tun c 10 200
   
   Set permissions:
-     e.g. chmod 0700 /dev/net/tun
-     if you want the device only accessible by root. Giving regular users the
-     right to assign network devices is NOT a good idea. Users could assign
-     bogus network interfaces to trick firewalls or administrators.
+     e.g. chmod 0666 /dev/net/tun
+     There's no harm in allowing the device to be accessible by non-root users,
+     since CAP_NET_ADMIN is required for creating network devices or for 
+     connecting to network devices which aren't owned by the user in question.
+     If you want to create persistent devices and give ownership of them to 
+     unprivileged users, then you need the /dev/net/tun device to be usable by
+     those users.
 
   Driver module autoloading
 
diff --git a/Documentation/nfsroot.txt b/Documentation/nfsroot.txt
index d56dc71..3cc953c 100644
--- a/Documentation/nfsroot.txt
+++ b/Documentation/nfsroot.txt
@@ -4,15 +4,16 @@
 Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
 Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
 Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
+Updated 2006 by Horms <horms@verge.net.au>
 
 
 
-If you want to use a diskless system, as an X-terminal or printer
-server for example, you have to put your root filesystem onto a
-non-disk device. This can either be a ramdisk (see initrd.txt in
-this directory for further information) or a filesystem mounted
-via NFS. The following text describes on how to use NFS for the
-root filesystem. For the rest of this text 'client' means the
+In order to use a diskless system, such as an X-terminal or printer server,
+it is necessary for the root filesystem to be present on a
+non-disk device. This may be an initramfs (see Documentation/filesystems/
+ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/initrd.txt) or a
+filesystem mounted via NFS. The following text describes how to use NFS
+for the root filesystem. For the rest of this text 'client' means the
 diskless system, and 'server' means the NFS server.
 
 
@@ -21,11 +22,13 @@
 1.) Enabling nfsroot capabilities
     -----------------------------
 
-In order to use nfsroot you have to select support for NFS during
-kernel configuration. Note that NFS cannot be loaded as a module
-in this case. The configuration script will then ask you whether
-you want to use nfsroot, and if yes what kind of auto configuration
-system you want to use. Selecting both BOOTP and RARP is safe.
+In order to use nfsroot, NFS client support needs to be selected as
+built-in during configuration. Once this has been selected, the nfsroot
+option will become available, which should also be selected.
+
+In the networking options, kernel level autoconfiguration can be selected,
+along with the types of autoconfiguration to support. Selecting all of
+DHCP, BOOTP and RARP is safe.
 
 
 
@@ -33,11 +36,10 @@
 2.) Kernel command line
     -------------------
 
-When the kernel has been loaded by a boot loader (either by loadlin,
-LILO or a network boot program) it has to be told what root fs device
-to use, and where to find the server and the name of the directory
-on the server to mount as root. This can be established by a couple
-of kernel command line parameters:
+When the kernel has been loaded by a boot loader (see below) it needs to be
+told what root fs device to use and, in the case of nfsroot, where to find
+both the server and the name of the directory on the server to mount as root.
+This can be established using the following kernel command line parameters:
 
 
 root=/dev/nfs
@@ -49,23 +51,21 @@
 
 nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
 
-  If the `nfsroot' parameter is NOT given on the command line, the default
-  "/tftpboot/%s" will be used.
+  If the `nfsroot' parameter is NOT given on the command line,
+  the default "/tftpboot/%s" will be used.
 
-  <server-ip>	Specifies the IP address of the NFS server. If this field
-		is not given, the default address as determined by the
-		`ip' variable (see below) is used. One use of this
-		parameter is for example to allow using different servers
-		for RARP and NFS. Usually you can leave this blank.
+  <server-ip>	Specifies the IP address of the NFS server.
+		The default address is determined by the `ip' parameter
+		(see below). This parameter allows the use of different
+		servers for IP autoconfiguration and NFS.
 
-  <root-dir>	Name of the directory on the server to mount as root. If
-		there is a "%s" token in the string, the token will be
-		replaced by the ASCII-representation of the client's IP
-		address.
+  <root-dir>	Name of the directory on the server to mount as root.
+		If there is a "%s" token in the string, it will be
+		replaced by the ASCII-representation of the client's
+		IP address.
 
   <nfs-options>	Standard NFS options. All options are separated by commas.
-		If the options field is not given, the following defaults
-		will be used:
+		The following defaults are used:
 			port		= as given by server portmap daemon
 			rsize		= 1024
 			wsize		= 1024
@@ -81,129 +81,174 @@
 ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
 
   This parameter tells the kernel how to configure IP addresses of devices
-  and also how to set up the IP routing table. It was originally called `nfsaddrs',
-  but now the boot-time IP configuration works independently of NFS, so it
-  was renamed to `ip' and the old name remained as an alias for compatibility
-  reasons.
+  and also how to set up the IP routing table. It was originally called
+  `nfsaddrs', but now the boot-time IP configuration works independently of
+  NFS, so it was renamed to `ip' and the old name remained as an alias for
+  compatibility reasons.
 
   If this parameter is missing from the kernel command line, all fields are
   assumed to be empty, and the defaults mentioned below apply. In general
-  this means that the kernel tries to configure everything using both
-  RARP and BOOTP (depending on what has been enabled during kernel confi-
-  guration, and if both what protocol answer got in first).
-
-  <client-ip>	IP address of the client. If empty, the address will either
-		be determined by RARP or BOOTP. What protocol is used de-
-		pends on what has been enabled during kernel configuration
-		and on the <autoconf> parameter. If this parameter is not
-		empty, neither RARP nor BOOTP will be used.
-
-  <server-ip>	IP address of the NFS server. If RARP is used to determine
-		the client address and this parameter is NOT empty only
-		replies from the specified server are accepted. To use
-		different RARP and NFS server, specify your RARP server
-		here (or leave it blank), and specify your NFS server in
-		the `nfsroot' parameter (see above). If this entry is blank
-		the address of the server is used which answered the RARP
-		or BOOTP request.
-
-  <gw-ip>	IP address of a gateway if the server is on a different
-		subnet. If this entry is empty no gateway is used and the
-		server is assumed to be on the local network, unless a
-		value has been received by BOOTP.
-
-  <netmask>	Netmask for local network interface. If this is empty,
-		the netmask is derived from the client IP address assuming
-		classful addressing, unless overridden in BOOTP reply.
-
-  <hostname>	Name of the client. If empty, the client IP address is
-		used in ASCII notation, or the value received by BOOTP.
-
-  <device>	Name of network device to use. If this is empty, all
-		devices are used for RARP and BOOTP requests, and the
-		first one we receive a reply on is configured. If you have
-		only one device, you can safely leave this blank.
-
-  <autoconf>	Method to use for autoconfiguration. If this is either
-		'rarp' or 'bootp', the specified protocol is used.
-		If the value is 'both' or empty, both protocols are used
-		so far as they have been enabled during kernel configura-
-		tion. 'off' means no autoconfiguration.
+  this means that the kernel tries to configure everything using
+  autoconfiguration.
 
   The <autoconf> parameter can appear alone as the value to the `ip'
   parameter (without all the ':' characters before) in which case auto-
   configuration is used.
 
+  <client-ip>	IP address of the client.
+
+  		Default:  Determined using autoconfiguration.
+
+  <server-ip>	IP address of the NFS server. If RARP is used to determine
+		the client address and this parameter is NOT empty only
+		replies from the specified server are accepted.
+
+		Only required for NFS root. That is, autoconfiguration
+		will not be triggered if it is missing and NFS root is not
+		in operation.
+
+		Default: Determined using autoconfiguration.
+		         The address of the autoconfiguration server is used.
+
+  <gw-ip>	IP address of a gateway if the server is on a different subnet.
+
+		Default: Determined using autoconfiguration.
+
+  <netmask>	Netmask for local network interface. If unspecified
+		the netmask is derived from the client IP address assuming
+		classful addressing.
+
+		Default:  Determined using autoconfiguration.
+
+  <hostname>	Name of the client. May be supplied by autoconfiguration,
+  		but its absence will not trigger autoconfiguration.
+
+  		Default: Client IP address is used in ASCII notation.
+
+  <device>	Name of network device to use.
+
+		Default: If the host only has one device, it is used.
+			 Otherwise the device is determined using
+			 autoconfiguration. This is done by sending
+			 autoconfiguration requests out of all devices,
+			 and using the device that received the first reply.
+
+  <autoconf>	Method to use for autoconfiguration. In the case of options
+                which specify multiple autoconfiguration protocols,
+		requests are sent using all protocols, and the first one
+		to reply is used.
+
+		Only autoconfiguration protocols that have been compiled
+		into the kernel will be used, regardless of the value of
+		this option.
+
+                  off or none: don't use autoconfiguration (default)
+		  on or any:   use any protocol available in the kernel
+		  dhcp:        use DHCP
+		  bootp:       use BOOTP
+		  rarp:        use RARP
+		  both:        use both BOOTP and RARP but not DHCP
+		               (old option kept for backwards compatibility)
+
+                Default: any
 
 
 
-3.) Kernel loader
-    -------------
 
-To get the kernel into memory different approaches can be used. They
-depend on what facilities are available:
+3.) Boot Loader
+    -----------
+
+To get the kernel into memory different approaches can be used.
+They depend on various facilities being available:
 
 
-3.1)  Writing the kernel onto a floppy using dd:
-	As always you can just write the kernel onto a floppy using dd,
-	but then it's not possible to use kernel command lines at all.
-	To substitute the 'root=' parameter, create a dummy device on any
-	linux system with major number 0 and minor number 255 using mknod:
+3.1)  Booting from a floppy using syslinux
 
-		mknod /dev/boot255 c 0 255
+	When building kernels, an easy way to create a boot floppy that uses
+	syslinux is to use the zdisk or bzdisk make targets which use zimage
+	and bzimage images respectively. Both targets accept the
+	FDARGS parameter which can be used to set the kernel command line.
 
-	Then copy the kernel zImage file onto a floppy using dd:
+	e.g.
+	   make bzdisk FDARGS="root=/dev/nfs"
 
-		dd if=/usr/src/linux/arch/i386/boot/zImage of=/dev/fd0
+   	Note that the user running this command will need to have
+     	access to the floppy drive device, /dev/fd0
 
-	And finally use rdev to set the root device:
+     	For more information on syslinux, including how to create bootdisks
+     	for prebuilt kernels, see http://syslinux.zytor.com/
 
-		rdev /dev/fd0 /dev/boot255
+	N.B: Previously it was possible to write a kernel directly to
+	     a floppy using dd, configure the boot device using rdev, and
+	     boot using the resulting floppy. Linux no longer supports this
+	     method of booting.
 
-	You can then remove the dummy device /dev/boot255 again. There
-	is no real device available for it.
-	The other two kernel command line parameters cannot be substi-
-	tuted with rdev. Therefore, using this method the kernel will
-	by default use RARP and/or BOOTP, and if it gets an answer via
-	RARP will mount the directory /tftpboot/<client-ip>/ as its
-	root. If it got a BOOTP answer the directory name in that answer
-	is used.
+3.2) Booting from a cdrom using isolinux
+
+     	When building kernels, an easy way to create a bootable cdrom that
+     	uses isolinux is to use the isoimage target which uses a bzimage
+     	image. Like zdisk and bzdisk, this target accepts the FDARGS
+     	parameter which can be used to set the kernel command line.
+
+	e.g.
+	  make isoimage FDARGS="root=/dev/nfs"
+
+     	The resulting iso image will be arch/<ARCH>/boot/image.iso
+     	This can be written to a cdrom using a variety of tools including
+     	cdrecord.
+
+	e.g.
+	  cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso
+
+     	For more information on isolinux, including how to create bootdisks
+     	for prebuilt kernels, see http://syslinux.zytor.com/
 
 3.2) Using LILO
-	When using LILO you can specify all necessary command line
-	parameters with the 'append=' command in the LILO configuration
-	file. However, to use the 'root=' command you also need to
-	set up a dummy device as described in 3.1 above. For how to use
-	LILO and its 'append=' command please refer to the LILO
-	documentation.
+	When using LILO all the necessary command line parameters may be
+	specified using the 'append=' directive in the LILO configuration
+	file.
+
+	However, to use the 'root=' directive you also need to create
+	a dummy root device, which may be removed after LILO is run.
+
+	mknod /dev/boot255 c 0 255
+
+	For information on configuring LILO, please refer to its documentation.
 
 3.3) Using GRUB
-	When you use GRUB, you simply append the parameters after the kernel
-	specification: "kernel <kernel> <parameters>" (without the quotes).
+	When using GRUB, kernel parameters are simply appended after the kernel
+	specification: kernel <kernel> <parameters>
 
 3.4) Using loadlin
-	When you want to boot Linux from a DOS command prompt without
-	having a local hard disk to mount as root, you can use loadlin.
-	I was told that it works, but haven't used it myself yet. In
-	general you should be able to create a kernel command line simi-
-	lar to how LILO is doing it. Please refer to the loadlin docu-
-	mentation for further information.
+	loadlin may be used to boot Linux from a DOS command prompt without
+	requiring a local hard disk to mount as root. This has not been
+	thoroughly tested by the authors of this document, but in general
+	it should be possible to configure the kernel command line similarly
+	to the configuration of LILO.
+
+	Please refer to the loadlin documentation for further information.
 
 3.5) Using a boot ROM
-	This is probably the most elegant way of booting a diskless
-	client. With a boot ROM the kernel gets loaded using the TFTP
-	protocol. As far as I know, no commercial boot ROMs yet
-	support booting Linux over the network, but there are two
-	free implementations of a boot ROM available on sunsite.unc.edu
-	and its mirrors. They are called 'netboot-nfs' and 'etherboot'.
-	Both contain everything you need to boot a diskless Linux client.
+	This is probably the most elegant way of booting a diskless client.
+	With a boot ROM the kernel is loaded using the TFTP protocol. The
+	authors of this document are not aware of any commercial boot
+	ROMs that support booting Linux over the network. However, there
+	are two free implementations of a boot ROM, netboot-nfs and
+	etherboot, both of which are available on sunsite.unc.edu, and both
+	of which contain everything you need to boot a diskless Linux client.
 
 3.6) Using pxelinux
-	Using pxelinux you specify the kernel you built with
+	Pxelinux may be used to boot Linux using the PXE boot loader
+	which is present on many modern network cards.
+
+	When using pxelinux, the kernel image is specified using
 	"kernel <relative-path-below /tftpboot>". The nfsroot parameters
 	are passed to the kernel by adding them to the "append" line.
-	You may perhaps also want to fine tune the console output,
-	see Documentation/serial-console.txt for serial console help.
+	It is common to use a serial console in conjunction with pxelinux;
+	see Documentation/serial-console.txt for more information.
+
+	For more information on pxelinux, including how to create bootdisks
+	for prebuilt kernels, see http://syslinux.zytor.com/
 
 
 
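The colon-separated layout of the `ip=' parameter described in the nfsroot.txt
hunk above can be illustrated with a small user-space sketch. This is not the
kernel's own parser; the names `struct ip_param' and `parse_ip_param' and the
64-byte field limit are assumptions made for illustration only. It splits a
value into the seven documented fields, leaves omitted fields empty, and
treats a value without any ':' as the <autoconf> method on its own.

```c
#include <string.h>

#define IP_FIELDS 7	/* client, server, gw, netmask, hostname, device, autoconf */

/* Hypothetical container: one NUL-terminated string per field. */
struct ip_param {
	char field[IP_FIELDS][64];
};

/*
 * Split the value of "ip=" into its fields.
 * Returns 0 on success, -1 if a field is too long or there are
 * more than seven fields.
 */
int parse_ip_param(const char *value, struct ip_param *out)
{
	const char *p = value;
	int i = 0;

	memset(out, 0, sizeof(*out));

	/* A value without ':' is the <autoconf> method on its own. */
	if (strchr(value, ':') == NULL) {
		if (strlen(value) >= sizeof(out->field[0]))
			return -1;
		strcpy(out->field[IP_FIELDS - 1], value);
		return 0;
	}

	while (i < IP_FIELDS) {
		size_t len = strcspn(p, ":");	/* length up to next ':' or end */

		if (len >= sizeof(out->field[i]))
			return -1;
		memcpy(out->field[i], p, len);
		out->field[i][len] = '\0';
		i++;
		p += len;
		if (*p == '\0')
			return 0;	/* trailing fields stay empty */
		p++;			/* skip the ':' separator */
	}
	return -1;			/* more than seven fields */
}
```

For example, parsing "192.0.2.10:192.0.2.1::255.255.255.0:client1:eth0:off"
leaves the <gw-ip> field empty, so the documented default for that field would
apply.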
diff --git a/Documentation/pci.txt b/Documentation/pci.txt
index 66bbbf1..2b395e4 100644
--- a/Documentation/pci.txt
+++ b/Documentation/pci.txt
@@ -213,11 +213,19 @@
 
    See Documentation/IO-mapping.txt for how to access device memory.
 
-   You still need to call request_region() for I/O regions and
-request_mem_region() for memory regions to make sure nobody else is using the
-same device.
+   The device driver needs to call pci_request_region() to make sure
+no other device is already using the same resource. The driver is expected
+to determine MMIO and IO Port resource availability _before_ calling
+pci_enable_device().  Conversely, drivers should call pci_release_region()
+_after_ calling pci_disable_device(). The idea is to prevent two devices
+colliding on the same address range.
 
-   All interrupt handlers should be registered with SA_SHIRQ and use the devid
+Generic flavors of pci_request_region() are request_mem_region()
+(for MMIO ranges) and request_region() (for IO Port ranges).
+Use these for address resources that are not described by "normal" PCI
+interfaces (e.g. BAR).
+
+   All interrupt handlers should be registered with IRQF_SHARED and use the devid
 to map IRQs to devices (remember that all PCI interrupts are shared).
 
 
diff --git a/Documentation/pcmcia/crc32hash.c b/Documentation/pcmcia/crc32hash.c
new file mode 100644
index 0000000..cbc36d2
--- /dev/null
+++ b/Documentation/pcmcia/crc32hash.c
@@ -0,0 +1,32 @@
+/* crc32hash.c - derived from linux/lib/crc32.c, GNU GPL v2 */
+/* Usage example:
+$ ./crc32hash "Dual Speed"
+*/
+
+#include <string.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <stdlib.h>
+
+unsigned int crc32(unsigned char const *p, unsigned int len)
+{
+	int i;
+	unsigned int crc = 0;
+	while (len--) {
+		crc ^= *p++;
+		for (i = 0; i < 8; i++)
+			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
+	}
+	return crc;
+}
+
+int main(int argc, char **argv) {
+	unsigned int result;
+	if (argc != 2) {
+		printf("no string passed as argument\n");
+		return -1;
+	}
+	result = crc32((unsigned char const *)argv[1], strlen(argv[1]));
+	printf("0x%x\n", result);
+	return 0;
+}
diff --git a/Documentation/pcmcia/devicetable.txt b/Documentation/pcmcia/devicetable.txt
index 3351c035..199afd1 100644
--- a/Documentation/pcmcia/devicetable.txt
+++ b/Documentation/pcmcia/devicetable.txt
@@ -27,37 +27,7 @@
 The hex value after "pa" is the hash of product ID string 1, after "pb" for
 string 2 and so on.
 
-Alternatively, you can use this small tool to determine the crc32 hash.
-simply pass the string you want to evaluate as argument to this program,
-e.g.
+Alternatively, you can use crc32hash (see Documentation/pcmcia/crc32hash.c)
+to determine the crc32 hash.  Simply pass the string you want to evaluate
+as argument to this program, e.g.:
 $ ./crc32hash "Dual Speed"
-
--------------------------------------------------------------------------
-/* crc32hash.c - derived from linux/lib/crc32.c, GNU GPL v2 */
-#include <string.h>
-#include <stdio.h>
-#include <ctype.h>
-#include <stdlib.h>
-
-unsigned int crc32(unsigned char const *p, unsigned int len)
-{
-	int i;
-	unsigned int crc = 0;
-	while (len--) {
-		crc ^= *p++;
-		for (i = 0; i < 8; i++)
-			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
-	}
-	return crc;
-}
-
-int main(int argc, char **argv) {
-	unsigned int result;
-	if (argc != 2) {
-		printf("no string passed as argument\n");
-		return -1;
-	}
-	result = crc32(argv[1], strlen(argv[1]));
-	printf("0x%x\n", result);
-	return 0;
-}
diff --git a/Documentation/pi-futex.txt b/Documentation/pi-futex.txt
new file mode 100644
index 0000000..5d61dac
--- /dev/null
+++ b/Documentation/pi-futex.txt
@@ -0,0 +1,121 @@
+Lightweight PI-futexes
+----------------------
+
+We are calling them lightweight for 3 reasons:
+
+ - in the user-space fastpath a PI-enabled futex involves no kernel work
+   (or any other PI complexity) at all. No registration, no extra kernel
+   calls - just pure fast atomic ops in userspace.
+
+ - even in the slowpath, the system call and scheduling pattern is very
+   similar to normal futexes.
+
+ - the in-kernel PI implementation is streamlined around the mutex
+   abstraction, with strict rules that keep the implementation
+   relatively simple: only a single owner may own a lock (i.e. no
+   read-write lock support), only the owner may unlock a lock, no
+   recursive locking, etc.
+
+Priority Inheritance - why?
+---------------------------
+
+The short reply: user-space PI helps achieve/improve determinism for
+user-space applications. In the best-case, it can help achieve
+determinism and well-bound latencies. Even in the worst-case, PI will
+improve the statistical distribution of locking related application
+delays.
+
+The longer reply:
+-----------------
+
+Firstly, sharing locks between multiple tasks is a common programming
+technique that often cannot be replaced with lockless algorithms. As we
+can see in the kernel [which is quite a complex program in itself],
+lockless structures are rather the exception than the norm - the current
+ratio of lockless vs. locky code for shared data structures is somewhere
+between 1:10 and 1:100. Lockless is hard, and the complexity of lockless
+algorithms often endangers the ability to do robust reviews of said code.
+I.e. critical RT apps often choose lock structures to protect critical
+data structures, instead of lockless algorithms. Furthermore, there are
+cases (like shared hardware, or other resource limits) where lockless
+access is mathematically impossible.
+
+Media players (such as Jack) are an example of reasonable application
+design with multiple tasks (with multiple priority levels) sharing
+short-held locks: for example, a highprio audio playback thread is
+combined with medium-prio construct-audio-data threads and low-prio
+display-colory-stuff threads. Add video and decoding to the mix and
+we've got even more priority levels.
+
+So once we accept that synchronization objects (locks) are an
+unavoidable fact of life, and once we accept that multi-task userspace
+apps have a very fair expectation of being able to use locks, we've got
+to think about how to offer the option of a deterministic locking
+implementation to user-space.
+
+Most of the technical counter-arguments against doing priority
+inheritance only apply to kernel-space locks. But user-space locks are
+different, there we cannot disable interrupts or make the task
+non-preemptible in a critical section, so the 'use spinlocks' argument
+does not apply (user-space spinlocks have the same priority inversion
+problems as other user-space locking constructs). Fact is, pretty much
+the only technique that currently enables good determinism for userspace
+locks (such as futex-based pthread mutexes) is priority inheritance:
+
+Currently (without PI), if a high-prio and a low-prio task share a lock
+[this is a quite common scenario for most non-trivial RT applications],
+even if all critical sections are coded carefully to be deterministic
+(i.e. all critical sections are short in duration and only execute a
+limited number of instructions), the kernel cannot guarantee any
+deterministic execution of the high-prio task: any medium-priority task
+could preempt the low-prio task while it holds the shared lock and
+executes the critical section, and could delay it indefinitely.
+
+Implementation:
+---------------
+
+As mentioned before, the userspace fastpath of PI-enabled pthread
+mutexes involves no kernel work at all - they behave quite similarly to
+normal futex-based locks: a 0 value means unlocked, and a value==TID
+means locked. (This is the same method as used by list-based robust
+futexes.) Userspace uses atomic ops to lock/unlock these mutexes without
+entering the kernel.
+
+To handle the slowpath, we have added two new futex ops:
+
+  FUTEX_LOCK_PI
+  FUTEX_UNLOCK_PI
+
+If the lock-acquire fastpath fails, [i.e. an atomic transition from 0 to
+TID fails], then FUTEX_LOCK_PI is called. The kernel does all the
+remaining work: if there is no futex-queue attached to the futex address
+yet then the code looks up the task that owns the futex [it has put its
+own TID into the futex value], and attaches a 'PI state' structure to
+the futex-queue. The pi_state includes an rt-mutex, which is a PI-aware,
+kernel-based synchronization object. The 'other' task is made the owner
+of the rt-mutex, and the FUTEX_WAITERS bit is atomically set in the
+futex value. Then this task tries to lock the rt-mutex, on which it
+blocks. Once it returns, it has the mutex acquired, and it sets the
+futex value to its own TID and returns. Userspace has no other work to
+perform - it now owns the lock, and the futex value contains
+FUTEX_WAITERS|TID.
+
+If the unlock side fastpath succeeds, [i.e. userspace manages to do a
+TID -> 0 atomic transition of the futex value], then no kernel work is
+triggered.
+
+If the unlock fastpath fails (because the FUTEX_WAITERS bit is set),
+then FUTEX_UNLOCK_PI is called, and the kernel unlocks the futex on the
+behalf of userspace - and it also unlocks the attached
+pi_state->rt_mutex and thus wakes up any potential waiters.
+
+Note that under this approach, contrary to previous PI-futex approaches,
+there is no prior 'registration' of a PI-futex. [which is not quite
+possible anyway, due to existing ABI properties of pthread mutexes.]
+
+Also, under this scheme, 'robustness' and 'PI' are two orthogonal
+properties of futexes, and all four combinations are possible: futex,
+robust-futex, PI-futex, robust+PI-futex.
+
+More details about priority inheritance can be found in
+Documentation/rtmutex.txt.
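The fastpath/slowpath split described in pi-futex.txt above could look roughly
like this from user space. This is a minimal sketch, not the glibc or kernel
implementation: the helper names `self_tid', `pi_lock' and `pi_unlock' are
hypothetical, and error handling (for example retrying on EINTR, or handling
an owner that died) is omitted. Only FUTEX_LOCK_PI, FUTEX_UNLOCK_PI and the
0 / TID value convention come from the text. Linux only.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: the calling thread's TID as stored in the futex. */
static uint32_t self_tid(void)
{
	return (uint32_t)syscall(SYS_gettid);
}

/* Returns 0 on success, -1 (with errno set) if the slowpath fails. */
int pi_lock(_Atomic uint32_t *futex)
{
	uint32_t expected = 0;

	/* Fastpath: atomic 0 -> TID transition, no kernel entry. */
	if (atomic_compare_exchange_strong(futex, &expected, self_tid()))
		return 0;

	/* Slowpath: the kernel blocks us and boosts the owner if needed. */
	return (int)syscall(SYS_futex, futex, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Returns 0 on success, -1 (with errno set) if the slowpath fails. */
int pi_unlock(_Atomic uint32_t *futex)
{
	uint32_t expected = self_tid();

	/* Fastpath: atomic TID -> 0; fails when FUTEX_WAITERS is set. */
	if (atomic_compare_exchange_strong(futex, &expected, 0))
		return 0;

	/* Slowpath: the kernel hands the lock to the top waiter. */
	return (int)syscall(SYS_futex, futex, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```

In the uncontended case both calls complete in the compare-and-exchange and
never enter the kernel, which is the property the text calls "lightweight".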
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
index f987afe..fba1e05 100644
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
@@ -135,96 +135,6 @@
 
 FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from
 scratch. That probably means stop accepting upstream requests, the
-actual policy of what to do with them beeing specific to a given
-driver. It's acceptable for a network driver to just drop packets
-while a block driver is expected to block the queue so no request is
-lost. (Use IDE as an example on how to do that). FREEZE requires no
-power state change, and it's expected for drivers to be able to
-quickly transition back to operating state.
-
-SUSPEND -- like FREEZE, but also put hardware into low-power state. If
-there's need to distinguish several levels of sleep, additional flag
-is probably best way to do that.
-
-Transitions are only from a resumed state to a suspended state, never
-between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen,
-FREEZE -> SUSPEND or SUSPEND -> FREEZE can not).
-
-All events are:
-
-[NOTE NOTE NOTE: If you are driver author, you should not care; you
-should only look at event, and ignore flags.]
-
-#Prepare for suspend -- userland is still running but we are going to
-#enter suspend state. This gives drivers chance to load firmware from
-#disk and store it in memory, or do other activities taht require
-#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these
-#are forbiden once the suspend dance is started.. event = ON, flags =
-#PREPARE_TO_SUSPEND
-
-Apm standby -- prepare for APM event. Quiesce devices to make life
-easier for APM BIOS. event = FREEZE, flags = APM_STANDBY
-
-Apm suspend -- same as APM_STANDBY, but it we should probably avoid
-spinning down disks. event = FREEZE, flags = APM_SUSPEND
-
-System halt, reboot -- quiesce devices to make life easier for BIOS. event
-= FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT
-
-System shutdown -- at least disks need to be spun down, or data may be
-lost. Quiesce devices, just to make life easier for BIOS. event =
-FREEZE, flags = SYSTEM_SHUTDOWN
-
-Kexec    -- turn off DMAs and put hardware into some state where new
-kernel can take over. event = FREEZE, flags = KEXEC
-
-Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake
-may need to be enabled on some devices. This actually has at least 3
-subtypes, system can reboot, enter S4 and enter S5 at the end of
-swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT,
-SYSTEM_SHUTDOWN, SYSTEM_S4
-
-Suspend to ram  -- put devices into low power state. event = SUSPEND,
-flags = SUSPEND_TO_RAM
-
-Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put
-devices into low power mode, but you must be able to reinitialize
-device from scratch in resume method. This has two flavors, its done
-once on suspending kernel, once on resuming kernel. event = FREEZE,
-flags = DURING_SUSPEND or DURING_RESUME
-
-Device detach requested from /sys -- deinitialize device; proably same as
-SYSTEM_SHUTDOWN, I do not understand this one too much. probably event
-= FREEZE, flags = DEV_DETACH.
-
-#These are not really events sent:
-#
-#System fully on -- device is working normally; this is probably never
-#passed to suspend() method... event = ON, flags = 0
-#
-#Ready after resume -- userland is now running, again. Time to free any
-#memory you ate during prepare to suspend... event = ON, flags =
-#READY_AFTER_RESUME
-#
-
-
-pm_message_t meaning
-
-pm_message_t has two fields. event ("major"), and flags.  If driver
-does not know event code, it aborts the request, returning error. Some
-drivers may need to deal with special cases based on the actual type
-of suspend operation being done at the system level. This is why
-there are flags.
-
-Event codes are:
-
-ON -- no need to do anything except special cases like broken
-HW.
-
-# NOTIFICATION -- pretty much same as ON?
-
-FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from
-scratch. That probably means stop accepting upstream requests, the
 actual policy of what to do with them being specific to a given
 driver. It's acceptable for a network driver to just drop packets
 while a block driver is expected to block the queue so no request is
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt
index d7814a11..823b2cf 100644
--- a/Documentation/power/swsusp.txt
+++ b/Documentation/power/swsusp.txt
@@ -18,10 +18,11 @@
  *
  * (*) suspend/resume support is needed to make it safe.
  *
- * If you have any filesystems on USB devices mounted before suspend,
+ * If you have any filesystems on USB devices mounted before software suspend,
  * they won't be accessible after resume and you may lose data, as though
- * you have unplugged the USB devices with mounted filesystems on them
- * (see the FAQ below for details).
+ * you have unplugged the USB devices with mounted filesystems on them;
+ * see the FAQ below for details.  (This is not true for more traditional
+ * power states like "standby", which normally don't turn USB off.)
 
 You need to append resume=/dev/your_swap_partition to kernel command
 line. Then you suspend by
@@ -204,7 +205,7 @@
 distinctions between SUSPEND and FREEZE.
 
 A: Doing SUSPEND when you are asked to do FREEZE is always correct,
-but it may be unneccessarily slow. If you want USB to stay simple,
+but it may be unnecessarily slow. If you want your driver to stay simple,
 slowness may not matter to you. It can always be fixed later.
 
 For devices like disk it does matter, you do not want to spindown for
@@ -349,25 +350,72 @@
 
 A: If you want to see any non-error kernel messages on the virtual
 terminal the kernel switches to during suspend, you have to set the
-kernel console loglevel to at least 5, for example by doing
+kernel console loglevel to at least 4 (KERN_WARNING), for example by
+doing
 
-	echo 5 > /proc/sys/kernel/printk
+	# save the old loglevel
+	read LOGLEVEL DUMMY < /proc/sys/kernel/printk
+	# set the loglevel so we see the progress bar.
+	# if the level is higher than needed, we leave it alone.
+	if [ $LOGLEVEL -lt 5 ]; then
+	        echo 5 > /proc/sys/kernel/printk
+	fi
+
+        IMG_SZ=0
+        read IMG_SZ < /sys/power/image_size
+        echo -n disk > /sys/power/state
+        RET=$?
+        #
+        # the logic here is:
+        # if image_size > 0 (without kernel support, IMG_SZ will be zero),
+        # then try again with image_size set to zero.
+	if [ $RET -ne 0 -a $IMG_SZ -ne 0 ]; then # try again with minimal image size
+                echo 0 > /sys/power/image_size
+                echo -n disk > /sys/power/state
+                RET=$?
+        fi
+
+	# restore previous loglevel
+	echo $LOGLEVEL > /proc/sys/kernel/printk
+	exit $RET
 
 Q: Is this true that if I have a mounted filesystem on a USB device and
 I suspend to disk, I can lose data unless the filesystem has been mounted
 with "sync"?
 
-A: That's right.  It depends on your hardware, and it could be true even for
-suspend-to-RAM.  In fact, even with "-o sync" you can lose data if your
-programs have information in buffers they haven't written out to disk.
+A: That's right ... if you disconnect that device, you may lose data.
+In fact, even with "-o sync" you can lose data if your programs have
+information in buffers they haven't written out to a disk you disconnect,
+or if you disconnect before the device finished saving data you wrote.
 
-If you're lucky, your hardware will support low-power modes for USB
-controllers while the system is asleep.  Lots of hardware doesn't,
-however.  Shutting off the power to a USB controller is equivalent to
-unplugging all the attached devices.
+Software suspend normally powers down USB controllers, which is equivalent
+to disconnecting all USB devices attached to your system.
+
+Your system might well support low-power modes for its USB controllers
+while the system is asleep, maintaining the connection, using true sleep
+modes like "suspend-to-RAM" or "standby".  (Don't write "disk" to the
+/sys/power/state file; write "standby" or "mem".)  We've not seen any
+hardware that can use these modes through software suspend, although in
+theory some systems might support "platform" or "firmware" modes that
+won't break the USB connections.
 
 Remember that it's always a bad idea to unplug a disk drive containing a
-mounted filesystem.  With USB that's true even when your system is asleep!
-The safest thing is to unmount all USB-based filesystems before suspending
-and remount them after resuming.
+mounted filesystem.  That's true even when your system is asleep!  The
+safest thing is to unmount all filesystems on removable media (such as USB,
+Firewire, CompactFlash, MMC, external SATA, or even IDE hotplug bays)
+before suspending; then remount them after resuming.
 
+Q: I upgraded the kernel from 2.6.15 to 2.6.16. Both kernels were
+compiled with the similar configuration files. Anyway I found that
+suspend to disk (and resume) is much slower on 2.6.16 compared to
+2.6.15. Any idea for why that might happen or how can I speed it up?
+
+A: This is because the size of the suspend image is now greater than
+for 2.6.15 (by saving more data we can get a more responsive system
+after resume).
+
+There's the /sys/power/image_size knob that controls the size of the
+image.  If you set it to 0 (e.g. by echo 0 > /sys/power/image_size as
+root), the 2.6.15 behavior should be restored.  If it is still too
+slow, take a look at suspend.sf.net -- userland suspend is faster and
+supports LZF compression to speed it up further.
diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt
index 43a889f..d859faa 100644
--- a/Documentation/power/video.txt
+++ b/Documentation/power/video.txt
@@ -90,6 +90,7 @@
 Model                           hack (or "how to do it")
 ------------------------------------------------------------------------------
 Acer Aspire 1406LC		ole's late BIOS init (7), turn off DRI
+Acer TM 230			s3_bios (2)
 Acer TM 242FX			vbetool (6)
 Acer TM C110			video_post (8)
 Acer TM C300                    vga=normal (only suspend on console, not in X), vbetool (6) or video_post (8)
@@ -115,6 +116,7 @@
 Dell Inspiron 4000		??? (*)
 Dell Inspiron 500m		??? (*)
 Dell Inspiron 510m		???
+Dell Inspiron 5150		vbetool needed (6)
 Dell Inspiron 600m		??? (*)
 Dell Inspiron 8200		??? (*)
 Dell Inspiron 8500		??? (*)
@@ -125,6 +127,7 @@
 HP Pavilion ZD7000		vbetool post needed, need open-source nv driver for X
 HP Omnibook XE3	athlon version	none (1)
 HP Omnibook XE3GC		none (1), video is S3 Savage/IX-MV
+HP Omnibook XE3L-GF		vbetool (6)
 HP Omnibook 5150		none (1), (S1 also works OK)
 IBM TP T20, model 2647-44G	none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X work.
 IBM TP A31 / Type 2652-M5G      s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(]
@@ -157,6 +160,7 @@
 Sony Vaio vgn-S580BH		vga=normal, but suspend from X. Console will be blank unless you return to X.
 Sony Vaio vgn-FS115B		s3_bios (2),s3_mode (4)
 Toshiba Libretto L5		none (1)
+Toshiba Libretto 100CT/110CT    vbetool (6)
 Toshiba Portege 3020CT		s3_mode (3)
 Toshiba Satellite 4030CDT	s3_mode (3) (S1 also works OK)
 Toshiba Satellite 4080XCDT      s3_mode (3) (S1 also works OK)
diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index 217e517..3c62e66 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1436,9 +1436,9 @@
                interrupts = <1d 3>;
                interrupt-parent = <40000>;
                num-channels = <4>;
-               channel-fifo-len = <24>;
+               channel-fifo-len = <18>;
                exec-units-mask = <000000fe>;
-               descriptor-types-mask = <073f1127>;
+               descriptor-types-mask = <012b0ebf>;
        };
 
 
diff --git a/Documentation/ramdisk.txt b/Documentation/ramdisk.txt
index 7c25584..52f75b7 100644
--- a/Documentation/ramdisk.txt
+++ b/Documentation/ramdisk.txt
@@ -6,7 +6,7 @@
 	1) Overview
 	2) Kernel Command Line Parameters
 	3) Using "rdev -r"
-	4) An Example of Creating a Compressed RAM Disk 
+	4) An Example of Creating a Compressed RAM Disk
 
 
 1) Overview
@@ -34,7 +34,7 @@
 compatibility reasons, but it may be removed in the future.
 
 The new RAM disk also has the ability to load compressed RAM disk images,
-allowing one to squeeze more programs onto an average installation or 
+allowing one to squeeze more programs onto an average installation or
 rescue floppy disk.
 
 
@@ -51,7 +51,7 @@
 	===================
 
 This parameter tells the RAM disk driver how many bytes to use per block.  The
-default is 512.
+default is 1024 (BLOCK_SIZE).
 
 
 3) Using "rdev -r"
@@ -70,7 +70,7 @@
 ./arch/i386/kernel/setup.c:#define RAMDISK_PROMPT_FLAG          0x8000
 ./arch/i386/kernel/setup.c:#define RAMDISK_LOAD_FLAG            0x4000
 
-Consider a typical two floppy disk setup, where you will have the 
+Consider a typical two floppy disk setup, where you will have the
 kernel on disk one, and have already put a RAM disk image onto disk #2.
 
 Hence you want to set bits 0 to 13 as 0, meaning that your RAM disk
@@ -97,12 +97,12 @@
 	append = "load_ramdisk=1"
 
 
-4) An Example of Creating a Compressed RAM Disk 
+4) An Example of Creating a Compressed RAM Disk
 ----------------------------------------------
 
 To create a RAM disk image, you will need a spare block device to
 construct it on. This can be the RAM disk device itself, or an
-unused disk partition (such as an unmounted swap partition). For this 
+unused disk partition (such as an unmounted swap partition). For this
 example, we will use the RAM disk device, "/dev/ram0".
 
 Note: This technique should not be done on a machine with less than 8 MB
diff --git a/Documentation/robust-futexes.txt b/Documentation/robust-futexes.txt
index df82d75..76e8064 100644
--- a/Documentation/robust-futexes.txt
+++ b/Documentation/robust-futexes.txt
@@ -95,7 +95,7 @@
 is empty. If the thread/process crashed or terminated in some incorrect
 way then the list might be non-empty: in this case the kernel carefully
 walks the list [not trusting it], and marks all locks that are owned by
-this thread with the FUTEX_OWNER_DEAD bit, and wakes up one waiter (if
+this thread with the FUTEX_OWNER_DIED bit, and wakes up one waiter (if
 any).
 
 The list is guaranteed to be private and per-thread at do_exit() time,
diff --git a/Documentation/rt-mutex-design.txt b/Documentation/rt-mutex-design.txt
new file mode 100644
index 0000000..c472ffa
--- /dev/null
+++ b/Documentation/rt-mutex-design.txt
@@ -0,0 +1,781 @@
+#
+# Copyright (c) 2006 Steven Rostedt
+# Licensed under the GNU Free Documentation License, Version 1.2
+#
+
+RT-mutex implementation design
+------------------------------
+
+This document tries to describe the design of the rtmutex.c implementation.
+It doesn't describe the reasons why rtmutex.c exists. For that please see
+Documentation/rt-mutex.txt.  This document does explain problems that
+happen without this code, but only to give the context needed to understand
+what the code is actually doing.
+
+The goal of this document is to help others understand the priority
+inheritance (PI) algorithm that is used, as well as reasons for the
+decisions that were made to implement PI in the manner that was done.
+
+
+Unbounded Priority Inversion
+----------------------------
+
+Priority inversion is when a lower priority process executes while a higher
+priority process wants to run.  This happens for several reasons, and
+most of the time it can't be helped.  Anytime a high priority process wants
+to use a resource that a lower priority process has (a mutex for example),
+the high priority process must wait until the lower priority process is done
+with the resource.  This is a priority inversion.  What we want to prevent
+is something called unbounded priority inversion.  That is when the high
+priority process is prevented from running by a lower priority process for
+an undetermined amount of time.
+
+The classic example of unbounded priority inversion is where you have three
+processes, let's call them processes A, B, and C, where A is the highest
+priority process, C is the lowest, and B is in between. A tries to grab a lock
+that C owns and must wait and lets C run to release the lock. But in the
+meantime, B executes, and since B is of a higher priority than C, it preempts C,
+but by doing so, it is in fact preempting A which is a higher priority process.
+Now there's no way of knowing how long A will be sleeping waiting for C
+to release the lock, because for all we know, B is a CPU hog and will
+never give C a chance to release the lock.  This is called unbounded priority
+inversion.
+
+Here's a little ASCII art to show the problem.
+
+   grab lock L1 (owned by C)
+     |
+A ---+
+        C preempted by B
+          |
+C    +----+
+
+B         +-------->
+                B now keeps A from running.
+
+
+Priority Inheritance (PI)
+-------------------------
+
+There are several ways to solve this issue, but other ways are out of scope
+for this document.  Here we only discuss PI.
+
+PI is where a process inherits the priority of another process if the other
+process blocks on a lock owned by the current process.  To make this easier
+to understand, let's use the previous example, with processes A, B, and C again.
+
+This time, when A blocks on the lock owned by C, C would inherit the priority
+of A.  So now if B becomes runnable, it would not preempt C, since C now has
+the high priority of A.  As soon as C releases the lock, it loses its
+inherited priority, and A then can continue with the resource that C had.
+
+Terminology
+-----------
+
+Here I explain some terminology that is used in this document to help describe
+the design that is used to implement PI.
+
+PI chain - The PI chain is an ordered series of locks and processes that cause
+           processes to inherit priorities from a previous process that is
+           blocked on one of its locks.  This is described in more detail
+           later in this document.
+
+mutex    - In this document, to differentiate between locks that implement
+           PI and spin locks that are used in the PI code, from now on
+           the PI locks will be called mutexes.
+
+lock     - In this document from now on, I will use the term lock when
+           referring to spin locks that are used to protect parts of the PI
+           algorithm.  These locks disable preemption on UP (when
+           CONFIG_PREEMPT is enabled) and on SMP prevent multiple CPUs from
+           entering critical sections simultaneously.
+
+spin lock - Same as lock above.
+
+waiter   - A waiter is a struct that is stored on the stack of a blocked
+           process.  Since the scope of the waiter is within the code for
+           a process being blocked on the mutex, it is fine to allocate
+           the waiter on the process's stack (local variable).  This
+           structure holds a pointer to the task, as well as the mutex that
+           the task is blocked on.  It also has the plist node structures to
+           place the task in the waiter_list of a mutex as well as the
+           pi_list of a mutex owner task (described below).
+
+           waiter is sometimes used in reference to the task that is waiting
+           on a mutex. This is the same as waiter->task.
+
+waiters  - A list of processes that are blocked on a mutex.
+
+top waiter - The highest priority process waiting on a specific mutex.
+
+top pi waiter - The highest priority process waiting on one of the mutexes
+                that a specific process owns.
+
+Note:  task and process are used interchangeably in this document, mostly to
+       differentiate between two processes that are being described together.
+
+
+PI chain
+--------
+
+The PI chain is a list of processes and mutexes that may cause priority
+inheritance to take place.  Multiple chains may converge, but a chain
+would never diverge, since a process can't be blocked on more than one
+mutex at a time.
+
+Example:
+
+   Process:  A, B, C, D, E
+   Mutexes:  L1, L2, L3, L4
+
+   A owns: L1
+           B blocked on L1
+           B owns L2
+                  C blocked on L2
+                  C owns L3
+                         D blocked on L3
+                         D owns L4
+                                E blocked on L4
+
+The chain would be:
+
+   E->L4->D->L3->C->L2->B->L1->A
+
+To show where two chains merge, we could add another process F and
+another mutex L5 where B owns L5 and F is blocked on mutex L5.
+
+The chain for F would be:
+
+   F->L5->B->L1->A
+
+Since a process may own more than one mutex, but never be blocked on more than
+one, the chains merge.
+
+Here we show both chains:
+
+   E->L4->D->L3->C->L2-+
+                       |
+                       +->B->L1->A
+                       |
+                 F->L5-+
+
+For PI to work, the processes at the right end of these chains (or we may
+also call it the Top of the chain) must be equal to or higher in priority
+than the processes to the left or below in the chain.
+
+Also since a mutex may have more than one process blocked on it, we can
+have multiple chains merge at mutexes.  If we add another process G that is
+blocked on mutex L2:
+
+  G->L2->B->L1->A
+
+And once again, to show how this can grow I will show the merging chains
+again.
+
+   E->L4->D->L3->C-+
+                   +->L2-+
+                   |     |
+                 G-+     +->B->L1->A
+                         |
+                   F->L5-+
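+
+As an illustration only, the chain structure above can be sketched in plain
+user-space C.  The names sketch_task, sketch_mutex and sketch_chain_top are
+hypothetical and are not taken from rtmutex.c; the sketch takes no locks and
+only shows that following "blocked on" and "owner" links always leads to a
+single top of the chain, because a task blocks on at most one mutex.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_mutex;

/* Hypothetical stand-in for a task; not the kernel's task structure. */
struct sketch_task {
	const char *name;
	struct sketch_mutex *blocked_on;  /* at most one, so chains never diverge */
};

/* Hypothetical stand-in for a PI mutex. */
struct sketch_mutex {
	struct sketch_task *owner;        /* NULL when the mutex is not owned */
};

/*
 * Follow task -> mutex -> owner links until a task is found that is not
 * blocked on an owned mutex.  Returns that task (the top of the chain), or
 * NULL if more than max_depth tasks would have to be visited.  On success,
 * the number of tasks visited is stored in *depth when depth is not NULL.
 */
static struct sketch_task *sketch_chain_top(struct sketch_task *task,
					    int max_depth, int *depth)
{
	int n = 1;

	assert(task != NULL);
	while (task->blocked_on != NULL && task->blocked_on->owner != NULL) {
		if (++n > max_depth)
			return NULL;
		task = task->blocked_on->owner;
	}
	if (depth != NULL)
		*depth = n;
	return task;
}
```

+
+Starting from E, F or G in the example above, such a walk would end at A.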
+
+
+Plist
+-----
+
+Before I go further and talk about how the PI chain is stored through lists
+on both mutexes and processes, I'll explain the plist.  This is similar to
+the struct list_head functionality that is already in the kernel.
+The implementation of plist is out of scope for this document, but it is
+very important to understand what it does.
+
+There are a few differences between plist and list, the most important one
+being that plist is a priority sorted linked list.  This means that the
+priorities of the plist are sorted, such that it takes O(1) to retrieve the
+highest priority item in the list.  Obviously this is useful to store processes
+based on their priorities.
+
+Another difference, which is important for implementation, is that, unlike
+list, the head of the list is a different element than the nodes of a list.
+So the head of the list is declared as struct plist_head and nodes that will
+be added to the list are declared as struct plist_node.
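+
+To make the idea concrete, here is a minimal user-space sketch of a priority
+sorted list with a separate head type.  It is not the kernel's plist
+implementation; the names phead, pnode, psketch_add and psketch_first are
+made up for this example, and a lower prio number is assumed to mean a
+higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the real one is struct plist_node. */
struct pnode {
	int prio;              /* lower number = higher priority */
	struct pnode *next;
};

/* Hypothetical head type; the real one is struct plist_head. */
struct phead {
	struct pnode *first;   /* always the highest priority node */
};

/* Insert a node, keeping the list sorted (FIFO among equal priorities). */
static void psketch_add(struct phead *head, struct pnode *node)
{
	struct pnode **link;

	assert(head != NULL && node != NULL);
	link = &head->first;
	while (*link != NULL && (*link)->prio <= node->prio)
		link = &(*link)->next;
	node->next = *link;
	*link = node;
}

/* O(1): the highest priority entry is simply the first node. */
static struct pnode *psketch_first(const struct phead *head)
{
	return head->first;
}
```

+
+Insertion in this sketch is linear in the list length; only the lookup of
+the top entry is constant time, which is the property the text relies on.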
+
+
+Mutex Waiter List
+-----------------
+
+Every mutex keeps track of all the waiters that are blocked on itself. The mutex
+has a plist to store these waiters by priority.  This list is protected by
+a spin lock that is located in the struct of the mutex. This lock is called
+wait_lock.  Since the modification of the waiter list is never done in
+interrupt context, the wait_lock can be taken without disabling interrupts.
+
+
+Task PI List
+------------
+
+To keep track of the PI chains, each process has its own PI list.  This is
+a list of all top waiters of the mutexes that are owned by the process.
+Note that this list only holds the top waiters and not all waiters that are
+blocked on mutexes owned by the process.
+
+The top of the task's PI list is always the highest priority task that
+is waiting on a mutex that is owned by the task.  So if the task has
+inherited a priority, it will always be the priority of the task that is
+at the top of this list.
+
+This list is stored in the task structure of a process as a plist called
+pi_list.  This list is protected by a spin lock also in the task structure,
+called pi_lock.  This lock may also be taken in interrupt context, so when
+locking the pi_lock, interrupts must be disabled.
+
+
+Depth of the PI Chain
+---------------------
+
+The maximum depth of the PI chain is not dynamic, and could actually be
+defined.  But it is very complex to figure out, since it depends on all
+the nesting of mutexes.  Let's look at the example where we have 3 mutexes,
+L1, L2, and L3, and four separate functions func1, func2, func3 and func4.
+The following shows a locking order of L1->L2->L3, but may not actually
+be directly nested that way.
+
+void func1(void)
+{
+	mutex_lock(L1);
+
+	/* do anything */
+
+	mutex_unlock(L1);
+}
+
+void func2(void)
+{
+	mutex_lock(L1);
+	mutex_lock(L2);
+
+	/* do something */
+
+	mutex_unlock(L2);
+	mutex_unlock(L1);
+}
+
+void func3(void)
+{
+	mutex_lock(L2);
+	mutex_lock(L3);
+
+	/* do something else */
+
+	mutex_unlock(L3);
+	mutex_unlock(L2);
+}
+
+void func4(void)
+{
+	mutex_lock(L3);
+
+	/* do something again */
+
+	mutex_unlock(L3);
+}
+
+Now we add 4 processes that run each of these functions separately.
+Processes A, B, C, and D which run functions func1, func2, func3 and func4
+respectively, and such that D runs first and A last.  With D being preempted
+in func4 in the "do something again" area, we have the following locking state:
+
+D owns L3
+       C blocked on L3
+       C owns L2
+              B blocked on L2
+              B owns L1
+                     A blocked on L1
+
+And thus we have the chain A->L1->B->L2->C->L3->D.
+
+This gives us a PI depth of 4 (four processes), but looking at any of the
+functions individually, it seems as though they only have at most a locking
+depth of two.  So, although the locking depth is defined at compile time,
+it is still very difficult to determine what that depth can be.
+
+Now since mutexes can be defined by user-land applications, we don't want a DoS
+type of application that nests a large number of mutexes to create a large
+PI chain, and have the code holding spin locks while looking at a large
+amount of data.  So to prevent this, the implementation not only implements
+a maximum lock depth, but also only holds at most two different locks at a
+time, as it walks the PI chain.  More about this below.
+
+
+Mutex owner and flags
+---------------------
+
+The mutex structure contains a pointer to the owner of the mutex.  If the
+mutex is not owned, this owner is set to NULL.  Since all architectures
+have the task structure on at least a four byte alignment (and if this is
+not true, the rtmutex.c code will be broken!), this allows for the two
+least significant bits to be used as flags.  This part is also described
+in Documentation/rt-mutex.txt, but will also be briefly described here.
+
+Bit 0 is used as the "Pending Owner" flag.  This is described later.
+Bit 1 is used as the "Has Waiters" flag.  This is also described later
+  in more detail, but is set whenever there are waiters on a mutex.
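+
+The pointer-plus-flags encoding can be sketched as follows.  This is only an
+illustration in user-space C: the macro and function names are hypothetical,
+and the sketch assumes, as the text does, that task structures are at least
+four byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the two flag bits described above. */
#define SKETCH_PENDING_OWNER  ((uintptr_t)1)  /* bit 0 */
#define SKETCH_HAS_WAITERS    ((uintptr_t)2)  /* bit 1 */
#define SKETCH_FLAG_MASK      ((uintptr_t)3)

/* Hypothetical stand-in for a task; an int member gives 4 byte alignment
 * on common platforms. */
struct sketch_task {
	int prio;
};

/* Combine an owner pointer and flag bits into one word. */
static uintptr_t sketch_pack_owner(struct sketch_task *task, uintptr_t flags)
{
	/* The low two bits of the pointer must be free for the flags. */
	assert(((uintptr_t)task & SKETCH_FLAG_MASK) == 0);
	return (uintptr_t)task | (flags & SKETCH_FLAG_MASK);
}

/* Strip the flag bits to recover the owner pointer (NULL if not owned). */
static struct sketch_task *sketch_owner(uintptr_t owner)
{
	return (struct sketch_task *)(owner & ~SKETCH_FLAG_MASK);
}

/* Test the "Has Waiters" bit. */
static int sketch_has_waiters(uintptr_t owner)
{
	return (owner & SKETCH_HAS_WAITERS) != 0;
}
```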
+
+
+cmpxchg Tricks
+--------------
+
+Some architectures implement an atomic cmpxchg (Compare and Exchange).  This
+is used (when applicable) to keep the fast path of grabbing and releasing
+mutexes short.
+
+cmpxchg is basically the following function performed atomically:
+
+unsigned long _cmpxchg(unsigned long *A, unsigned long *B, unsigned long *C)
+{
+        unsigned long T = *A;
+        if (*A == *B) {
+                *A = *C;
+        }
+        return T;
+}
+#define cmpxchg(a,b,c) _cmpxchg(&a,&b,&c)
+
+This is really nice to have, since it allows you to only update a variable
+if the variable is what you expect it to be.  You know if it succeeded if
+the return value (the old value of A) is equal to B.
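+
+For readers who want to experiment outside the kernel, the same semantics
+can be approximated with C11 atomics.  This is a sketch, not the kernel's
+cmpxchg; sketch_cmpxchg and sketch_trylock are hypothetical names, and an
+owner value of 0 is assumed to mean "not owned".

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Atomically set *ptr to desired if it currently equals expected.
 * Returns the old value of *ptr, like the cmpxchg described above.
 */
static unsigned long sketch_cmpxchg(_Atomic unsigned long *ptr,
				    unsigned long expected,
				    unsigned long desired)
{
	/*
	 * On failure, 'expected' is overwritten with the value found in
	 * *ptr.  On success it is left unchanged.  Either way it ends up
	 * holding the old value.
	 */
	atomic_compare_exchange_strong(ptr, &expected, desired);
	return expected;
}

/* Fast path try-lock: succeeds only if the owner word was 0 (not owned). */
static int sketch_trylock(_Atomic unsigned long *owner, unsigned long me)
{
	assert(me != 0);
	return sketch_cmpxchg(owner, 0, me) == 0;
}
```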
+
+The macro rt_mutex_cmpxchg is used to try to lock and unlock mutexes. If
+the architecture does not support CMPXCHG, then this macro is simply set
+to fail every time.  But if CMPXCHG is supported, then this helps greatly
+to keep the fast path short.
+
+The use of rt_mutex_cmpxchg with the flags in the owner field helps optimize
+the system for architectures that support it.  This will also be explained
+later in this document.
+
+
+Priority adjustments
+--------------------
+
+The implementation of the PI code in rtmutex.c has several places where a
+process must adjust its priority.  With the help of the pi_list of a
+process, it is rather easy to know what needs to be adjusted.
+
+The functions implementing the task adjustments are rt_mutex_adjust_prio,
+__rt_mutex_adjust_prio (same as the former, but expects the task pi_lock
+to already be taken), rt_mutex_getprio, and rt_mutex_setprio.
+
+rt_mutex_getprio and rt_mutex_setprio are only used in __rt_mutex_adjust_prio.
+
+rt_mutex_getprio returns the priority that the task should have.  Either the
+task's own normal priority, or if a process of a higher priority is waiting on
+a mutex owned by the task, then that higher priority should be returned.
+Since the pi_list of a task holds a priority-ordered list of all the top
+waiters of all the mutexes that the task owns, rt_mutex_getprio simply needs
+to compare the top pi waiter to its own normal priority, and return the higher
+priority back.
+
+(Note:  if looking at the code, you will notice that the lower number of
+        prio is returned.  This is because the prio field in the task structure
+        is an inverse order of the actual priority.  So a "prio" of 5 is
+        of higher priority than a "prio" of 10.)
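+
+The comparison described above can be sketched as follows.  This is not the
+code from rtmutex.c: the structure and its fields are hypothetical
+simplifications in which the top pi waiter is reduced to a single priority
+value, and a lower number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a task for this sketch only. */
struct sketch_task {
	int normal_prio;         /* the task's own priority */
	int has_pi_waiters;      /* non-zero if the PI list is not empty */
	int top_pi_waiter_prio;  /* prio of the top pi waiter, if any */
};

/*
 * Return the priority the task should run at: its own normal priority,
 * or that of its top pi waiter if the waiter's priority is higher
 * (that is, if its prio number is lower).
 */
static int sketch_getprio(const struct sketch_task *task)
{
	assert(task != NULL);
	if (!task->has_pi_waiters)
		return task->normal_prio;
	return task->top_pi_waiter_prio < task->normal_prio ?
	       task->top_pi_waiter_prio : task->normal_prio;
}
```

+
+A task with a normal prio of 10 and a top pi waiter of prio 5 would be
+boosted to 5.  With a top pi waiter of prio 20, it would stay at 10.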
+
+__rt_mutex_adjust_prio examines the result of rt_mutex_getprio, and if the
+result does not equal the task's current priority, then rt_mutex_setprio
+is called to adjust the priority of the task to the new priority.
+Note that rt_mutex_setprio is defined in kernel/sched.c to implement the
+actual change in priority.
+
+It is interesting to note that __rt_mutex_adjust_prio can either increase
+or decrease the priority of the task.  In the case that a higher priority
+process has just blocked on a mutex owned by the task, __rt_mutex_adjust_prio
+would increase/boost the task's priority.  But if a higher priority task
+were for some reason to leave the mutex (timeout or signal), this same function
+would decrease/unboost the priority of the task.  That is because the pi_list
+always contains the highest priority task that is waiting on a mutex owned
+by the task, so we only need to compare the priority of that top pi waiter
+to the normal priority of the given task.
+
+
+High level overview of the PI chain walk
+----------------------------------------
+
+The PI chain walk is implemented by the function rt_mutex_adjust_prio_chain.
+
+The implementation has gone through several iterations, and has ended up
+with what we believe is the best.  It walks the PI chain by only grabbing
+at most two locks at a time, and is very efficient.
+
+The rt_mutex_adjust_prio_chain can be used either to boost or lower process
+priorities.
+
+rt_mutex_adjust_prio_chain is called with a task to be checked for PI
+(de)boosting (the owner of a mutex that a process is blocking on), a flag to
+check for deadlocking, the mutex that the task owns, and a pointer to the
+waiter struct of the process that is blocked on the mutex (although this
+parameter may be NULL for deboosting).
+
+For this explanation, I will not mention deadlock detection. This explanation
+will try to stay at a high level.
+
+When this function is called, there are no locks held.  That also means
+that the state of the owner and lock can change once this function is entered.
+
+Before this function is called, the task has already had rt_mutex_adjust_prio
+performed on it.  This means that the task is set to the priority that it
+should be at, but the plist nodes of the task's waiter have not been updated
+with the new priorities, and that this task may not be in the proper locations
+in the pi_lists and wait_lists that the task is blocked on.  This function
+solves all that.
+
+A loop is entered, where task is the owner to be checked for PI changes that
+was passed by parameter (for the first iteration).  The pi_lock of this task is
+taken to prevent any more changes to the pi_list of the task.  This also
+prevents new tasks from completing the blocking on a mutex that is owned by this
+task.
+
+If the task is not blocked on a mutex then the loop is exited.  We are at
+the top of the PI chain.
+
+A check is now done to see if the original waiter (the process that is blocked
+on the current mutex) is the top pi waiter of the task.  That is, is this
+waiter on the top of the task's pi_list.  If it is not, it either means that
+there is another process higher in priority that is blocked on one of the
+mutexes that the task owns, or that the waiter has just woken up via a signal
+or timeout and has left the PI chain.  In either case, the loop is exited, since
+we don't need to do any more changes to the priority of the current task, or any
+task that owns a mutex that this current task is waiting on.  A priority chain
+walk is only needed when a new top pi waiter is made to a task.
+
+The next check sees if the task's waiter plist node has the priority equal to
+the priority the task is set at.  If they are equal, then we are done with
+the loop.  Remember that the function started with the priority of the
+task adjusted, but the plist nodes that hold the task in other processes<