summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/obsolete/dv13949
-rw-r--r--Documentation/ABI/removed/dv139414
-rw-r--r--Documentation/ABI/removed/raw139415
-rw-r--r--Documentation/ABI/removed/raw1394_legacy_isochronous16
-rw-r--r--Documentation/ABI/removed/video139416
-rw-r--r--Documentation/ABI/testing/sysfs-devices-system-ibm-rtl22
-rw-r--r--Documentation/DocBook/device-drivers.tmpl5
-rw-r--r--Documentation/DocBook/kernel-api.tmpl6
-rw-r--r--Documentation/feature-removal-schedule.txt20
-rw-r--r--Documentation/filesystems/Locking31
-rw-r--r--Documentation/filesystems/nfs/00-INDEX4
-rw-r--r--Documentation/filesystems/nfs/idmapper.txt67
-rw-r--r--Documentation/filesystems/nfs/nfsroot.txt22
-rw-r--r--Documentation/filesystems/nfs/pnfs.txt48
-rw-r--r--Documentation/filesystems/proc.txt14
-rw-r--r--Documentation/filesystems/sharedsubtree.txt4
-rw-r--r--Documentation/hwmon/ltc426163
-rw-r--r--Documentation/kernel-parameters.txt7
-rw-r--r--Documentation/misc-devices/apds990x.txt111
-rw-r--r--Documentation/misc-devices/bh1770glc.txt116
-rw-r--r--Documentation/sound/alsa/ALSA-Configuration.txt82
-rw-r--r--Documentation/sound/alsa/HD-Audio.txt8
-rw-r--r--Documentation/sysrq.txt7
-rw-r--r--Documentation/timers/hpet_example.c27
-rw-r--r--Documentation/trace/postprocess/trace-vmscan-postprocess.pl39
-rw-r--r--Documentation/vm/highmem.txt162
26 files changed, 852 insertions, 83 deletions
diff --git a/Documentation/ABI/obsolete/dv1394 b/Documentation/ABI/obsolete/dv1394
deleted file mode 100644
index 2ee36864ca1..00000000000
--- a/Documentation/ABI/obsolete/dv1394
+++ /dev/null
@@ -1,9 +0,0 @@
-What: dv1394 (a.k.a. "OHCI-DV I/O support" for FireWire)
-Contact: linux1394-devel@lists.sourceforge.net
-Description:
- New application development should use raw1394 + userspace libraries
- instead, notably libiec61883 which is functionally equivalent.
-
-Users:
- ffmpeg/libavformat (used by a variety of media players)
- dvgrab v1.x (replaced by dvgrab2 on top of raw1394 and resp. libraries)
diff --git a/Documentation/ABI/removed/dv1394 b/Documentation/ABI/removed/dv1394
new file mode 100644
index 00000000000..c2310b6676f
--- /dev/null
+++ b/Documentation/ABI/removed/dv1394
@@ -0,0 +1,14 @@
+What: dv1394 (a.k.a. "OHCI-DV I/O support" for FireWire)
+Date: May 2010 (scheduled), finally removed in kernel v2.6.37
+Contact: linux1394-devel@lists.sourceforge.net
+Description:
+ /dev/dv1394/* were character device files, one for each FireWire
+ controller and for NTSC and PAL respectively, from which DV data
+ could be received by read() or transmitted by write(). A few
+ ioctl()s allowed limited control.
+ This special-purpose interface has been superseded by libraw1394 +
+ libiec61883 which are functionally equivalent, support HDV, and
+ transparently work on top of the newer firewire kernel drivers.
+
+Users:
+ ffmpeg/libavformat (if configured for DV1394)
diff --git a/Documentation/ABI/removed/raw1394 b/Documentation/ABI/removed/raw1394
new file mode 100644
index 00000000000..490aa1efc4a
--- /dev/null
+++ b/Documentation/ABI/removed/raw1394
@@ -0,0 +1,15 @@
+What: raw1394 (a.k.a. "Raw IEEE1394 I/O support" for FireWire)
+Date: May 2010 (scheduled), finally removed in kernel v2.6.37
+Contact: linux1394-devel@lists.sourceforge.net
+Description:
+ /dev/raw1394 was a character device file that allowed low-level
+ access to FireWire buses. Its major drawbacks were its inability
+ to implement sensible device security policies, and its low level
+ of abstraction that required userspace clients do duplicate much
+ of the kernel's ieee1394 core functionality.
+ Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of
+ firewire-core.
+
+Users:
+ libraw1394 (works with firewire-cdev too, transparent to library ABI
+ users)
diff --git a/Documentation/ABI/removed/raw1394_legacy_isochronous b/Documentation/ABI/removed/raw1394_legacy_isochronous
deleted file mode 100644
index 1b629622d88..00000000000
--- a/Documentation/ABI/removed/raw1394_legacy_isochronous
+++ /dev/null
@@ -1,16 +0,0 @@
-What: legacy isochronous ABI of raw1394 (1st generation iso ABI)
-Date: June 2007 (scheduled), removed in kernel v2.6.23
-Contact: linux1394-devel@lists.sourceforge.net
-Description:
- The two request types RAW1394_REQ_ISO_SEND, RAW1394_REQ_ISO_LISTEN have
- been deprecated for quite some time. They are very inefficient as they
- come with high interrupt load and several layers of callbacks for each
- packet. Because of these deficiencies, the video1394 and dv1394 drivers
- and the 3rd-generation isochronous ABI in raw1394 (rawiso) were created.
-
-Users:
- libraw1394 users via the long deprecated API raw1394_iso_write,
- raw1394_start_iso_write, raw1394_start_iso_rcv, raw1394_stop_iso_rcv
-
- libdc1394, which optionally uses these old libraw1394 calls
- alternatively to the more efficient video1394 ABI
diff --git a/Documentation/ABI/removed/video1394 b/Documentation/ABI/removed/video1394
new file mode 100644
index 00000000000..c39c25aee77
--- /dev/null
+++ b/Documentation/ABI/removed/video1394
@@ -0,0 +1,16 @@
+What: video1394 (a.k.a. "OHCI-1394 Video support" for FireWire)
+Date: May 2010 (scheduled), finally removed in kernel v2.6.37
+Contact: linux1394-devel@lists.sourceforge.net
+Description:
+ /dev/video1394/* were character device files, one for each FireWire
+ controller, which were used for isochronous I/O. It was added as an
+ alternative to raw1394's isochronous I/O functionality which had
+ performance issues in its first generation. Any video1394 user had
+ to use raw1394 + libraw1394 too because video1394 did not provide
+ asynchronous I/O for device discovery and configuration.
+ Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of
+ firewire-core.
+
+Users:
+ libdc1394 (works with firewire-cdev too, transparent to library ABI
+ users)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-ibm-rtl b/Documentation/ABI/testing/sysfs-devices-system-ibm-rtl
new file mode 100644
index 00000000000..b82deeaec31
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-system-ibm-rtl
@@ -0,0 +1,22 @@
+What: state
+Date: Sep 2010
+KernelVersion: 2.6.37
+Contact: Vernon Mauery <vernux@us.ibm.com>
+Description: The state file allows a means by which to change in and
+ out of Premium Real-Time Mode (PRTM), as well as the
+ ability to query the current state.
+ 0 => PRTM off
+ 1 => PRTM enabled
+Users: The ibm-prtm userspace daemon uses this interface.
+
+
+What: version
+Date: Sep 2010
+KernelVersion: 2.6.37
+Contact: Vernon Mauery <vernux@us.ibm.com>
+Description: The version file provides a means by which to query
+ the RTL table version that lives in the Extended
+ BIOS Data Area (EBDA).
+Users: The ibm-prtm userspace daemon uses this interface.
+
+
diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index feca0758391..22edcbb9dda 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -51,8 +51,13 @@
<sect1><title>Delaying, scheduling, and timer routines</title>
!Iinclude/linux/sched.h
!Ekernel/sched.c
+!Iinclude/linux/completion.h
!Ekernel/timer.c
</sect1>
+ <sect1><title>Wait queues and Wake events</title>
+!Iinclude/linux/wait.h
+!Ekernel/wait.c
+ </sect1>
<sect1><title>High-resolution timers</title>
!Iinclude/linux/ktime.h
!Iinclude/linux/hrtimer.h
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index 6b4e07f28b6..7160652a873 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -93,6 +93,12 @@ X!Ilib/string.c
!Elib/crc32.c
!Elib/crc-ccitt.c
</sect1>
+
+ <sect1 id="idr"><title>idr/ida Functions</title>
+!Pinclude/linux/idr.h idr sync
+!Plib/idr.c IDA description
+!Elib/idr.c
+ </sect1>
</chapter>
<chapter id="mm">
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 9961f1564d2..d2af87ba96e 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -502,16 +502,6 @@ Who: Thomas Gleixner <tglx@linutronix.de>
----------------------------
-What: old ieee1394 subsystem (CONFIG_IEEE1394)
-When: 2.6.37
-Files: drivers/ieee1394/ except init_ohci1394_dma.c
-Why: superseded by drivers/firewire/ (CONFIG_FIREWIRE) which offers more
- features, better performance, and better security, all with smaller
- and more modern code base
-Who: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
-----------------------------
-
What: The acpi_sleep=s4_nonvs command line option
When: 2.6.37
Files: arch/x86/kernel/acpi/sleep.c
@@ -545,3 +535,13 @@ Why: Hareware scan is the prefer method for iwlwifi devices for
Who: Wey-Yi Guy <wey-yi.w.guy@intel.com>
----------------------------
+
+What: access to nfsd auth cache through sys_nfsservctl or '.' files
+ in the 'nfsd' filesystem.
+When: 2.6.40
+Why: This is a legacy interface which have been replaced by a more
+ dynamic cache. Continuing to maintain this interface is an
+ unnecessary burden.
+Who: NeilBrown <neilb@suse.de>
+
+----------------------------
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 2db4283efa8..8a817f656f0 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -349,21 +349,36 @@ call this method upon the IO completion.
--------------------------- block_device_operations -----------------------
prototypes:
- int (*open) (struct inode *, struct file *);
- int (*release) (struct inode *, struct file *);
- int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
+ int (*open) (struct block_device *, fmode_t);
+ int (*release) (struct gendisk *, fmode_t);
+ int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
+ int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
+ int (*direct_access) (struct block_device *, sector_t, void **, unsigned long *);
int (*media_changed) (struct gendisk *);
+ void (*unlock_native_capacity) (struct gendisk *);
int (*revalidate_disk) (struct gendisk *);
+ int (*getgeo)(struct block_device *, struct hd_geometry *);
+ void (*swap_slot_free_notify) (struct block_device *, unsigned long);
locking rules:
- BKL bd_sem
-open: yes yes
-release: yes yes
-ioctl: yes no
+ BKL bd_mutex
+open: no yes
+release: no yes
+ioctl: no no
+compat_ioctl: no no
+direct_access: no no
media_changed: no no
+unlock_native_capacity: no no
revalidate_disk: no no
+getgeo: no no
+swap_slot_free_notify: no no (see below)
+
+media_changed, unlock_native_capacity and revalidate_disk are called only from
+check_disk_change().
+
+swap_slot_free_notify is called with swap_lock and sometimes the page lock
+held.
-The last two are called only from check_disk_change().
--------------------------- file_operations -------------------------------
prototypes:
diff --git a/Documentation/filesystems/nfs/00-INDEX b/Documentation/filesystems/nfs/00-INDEX
index 2f68cd68876..a57e12411d2 100644
--- a/Documentation/filesystems/nfs/00-INDEX
+++ b/Documentation/filesystems/nfs/00-INDEX
@@ -12,5 +12,9 @@ nfs-rdma.txt
- how to install and setup the Linux NFS/RDMA client and server software
nfsroot.txt
- short guide on setting up a diskless box with NFS root filesystem.
+pnfs.txt
+ - short explanation of some of the internals of the pnfs client code
rpc-cache.txt
- introduction to the caching mechanisms in the sunrpc layer.
+idmapper.txt
+ - information for configuring request-keys to be used by idmapper
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt
new file mode 100644
index 00000000000..b9b4192ea8b
--- /dev/null
+++ b/Documentation/filesystems/nfs/idmapper.txt
@@ -0,0 +1,67 @@
+
+=========
+ID Mapper
+=========
+Id mapper is used by NFS to translate user and group ids into names, and to
+translate user and group names into ids. Part of this translation involves
+performing an upcall to userspace to request the information. Id mapper will
+user request-key to perform this upcall and cache the result. The program
+/usr/sbin/nfs.idmap should be called by request-key, and will perform the
+translation and initialize a key with the resulting information.
+
+ NFS_USE_NEW_IDMAPPER must be selected when configuring the kernel to use this
+ feature.
+
+===========
+Configuring
+===========
+The file /etc/request-key.conf will need to be modified so /sbin/request-key can
+direct the upcall. The following line should be added:
+
+#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
+#====== ======= =============== =============== ===============================
+create id_resolver * * /usr/sbin/nfs.idmap %k %d 600
+
+This will direct all id_resolver requests to the program /usr/sbin/nfs.idmap.
+The last parameter, 600, defines how many seconds into the future the key will
+expire. This parameter is optional for /usr/sbin/nfs.idmap. When the timeout
+is not specified, nfs.idmap will default to 600 seconds.
+
+id mapper uses for key descriptions:
+ uid: Find the UID for the given user
+ gid: Find the GID for the given group
+ user: Find the user name for the given UID
+ group: Find the group name for the given GID
+
+You can handle any of these individually, rather than using the generic upcall
+program. If you would like to use your own program for a uid lookup then you
+would edit your request-key.conf so it look similar to this:
+
+#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
+#====== ======= =============== =============== ===============================
+create id_resolver uid:* * /some/other/program %k %d 600
+create id_resolver * * /usr/sbin/nfs.idmap %k %d 600
+
+Notice that the new line was added above the line for the generic program.
+request-key will find the first matching line and corresponding program. In
+this case, /some/other/program will handle all uid lookups and
+/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
+
+See <file:Documentation/keys-request-keys.txt> for more information about the
+request-key function.
+
+
+=========
+nfs.idmap
+=========
+nfs.idmap is designed to be called by request-key, and should not be run "by
+hand". This program takes two arguments, a serialized key and a key
+description. The serialized key is first converted into a key_serial_t, and
+then passed as an argument to keyctl_instantiate (both are part of keyutils.h).
+
+The actual lookups are performed by functions found in nfsidmap.h. nfs.idmap
+determines the correct function to call by looking at the first part of the
+description string. For example, a uid lookup description will appear as
+"uid:user@domain".
+
+nfs.idmap will return 0 if the key was instantiated, and non-zero otherwise.
diff --git a/Documentation/filesystems/nfs/nfsroot.txt b/Documentation/filesystems/nfs/nfsroot.txt
index f2430a7974e..90c71c6f0d0 100644
--- a/Documentation/filesystems/nfs/nfsroot.txt
+++ b/Documentation/filesystems/nfs/nfsroot.txt
@@ -159,6 +159,28 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
Default: any
+nfsrootdebug
+
+ This parameter enables debugging messages to appear in the kernel
+ log at boot time so that administrators can verify that the correct
+ NFS mount options, server address, and root path are passed to the
+ NFS client.
+
+
+rdinit=<executable file>
+
+ To specify which file contains the program that starts system
+ initialization, administrators can use this command line parameter.
+ The default value of this parameter is "/init". If the specified
+ file exists and the kernel can execute it, root filesystem related
+ kernel command line parameters, including `nfsroot=', are ignored.
+
+ A description of the process of mounting the root file system can be
+ found in:
+
+ Documentation/early-userspace/README
+
+
3.) Boot Loader
diff --git a/Documentation/filesystems/nfs/pnfs.txt b/Documentation/filesystems/nfs/pnfs.txt
new file mode 100644
index 00000000000..bc0b9cfe095
--- /dev/null
+++ b/Documentation/filesystems/nfs/pnfs.txt
@@ -0,0 +1,48 @@
+Reference counting in pnfs:
+==========================
+
+The are several inter-related caches. We have layouts which can
+reference multiple devices, each of which can reference multiple data servers.
+Each data server can be referenced by multiple devices. Each device
+can be referenced by multiple layouts. To keep all of this straight,
+we need to reference count.
+
+
+struct pnfs_layout_hdr
+----------------------
+The on-the-wire command LAYOUTGET corresponds to struct
+pnfs_layout_segment, usually referred to by the variable name lseg.
+Each nfs_inode may hold a pointer to a cache of of these layout
+segments in nfsi->layout, of type struct pnfs_layout_hdr.
+
+We reference the header for the inode pointing to it, across each
+outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
+LAYOUTCOMMIT), and for each lseg held within.
+
+Each header is also (when non-empty) put on a list associated with
+struct nfs_client (cl_layouts). Being put on this list does not bump
+the reference count, as the layout is kept around by the lseg that
+keeps it in the list.
+
+deviceid_cache
+--------------
+lsegs reference device ids, which are resolved per nfs_client and
+layout driver type. The device ids are held in a RCU cache (struct
+nfs4_deviceid_cache). The cache itself is referenced across each
+mount. The entries (struct nfs4_deviceid) themselves are held across
+the lifetime of each lseg referencing them.
+
+RCU is used because the deviceid is basically a write once, read many
+data structure. The hlist size of 32 buckets needs better
+justification, but seems reasonable given that we can have multiple
+deviceid's per filesystem, and multiple filesystems per nfs_client.
+
+The hash code is copied from the nfsd code base. A discussion of
+hashing and variations of this algorithm can be found at:
+http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809
+
+data server cache
+-----------------
+file driver devices refer to data servers, which are kept in a module
+level cache. Its reference is held over the lifetime of the deviceid
+pointing to it.
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index a6aca874088..a563b74c7ae 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -374,13 +374,13 @@ Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
-The first of these lines shows the same information as is displayed for the
-mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
-the amount of the mapping that is currently resident in RAM, the "proportional
-set size” (divide each shared page by the number of processes sharing it), the
-number of clean and dirty shared pages in the mapping, and the number of clean
-and dirty private pages in the mapping. The "Referenced" indicates the amount
-of memory currently marked as referenced or accessed.
+The first of these lines shows the same information as is displayed for the
+mapping in /proc/PID/maps. The remaining lines show the size of the mapping
+(size), the amount of the mapping that is currently resident in RAM (RSS), the
+process' proportional share of this mapping (PSS), the number of clean and
+dirty shared pages in the mapping, and the number of clean and dirty private
+pages in the mapping. The "Referenced" indicates the amount of memory
+currently marked as referenced or accessed.
This file is only present if the CONFIG_MMU kernel configuration option is
enabled.
diff --git a/Documentation/filesystems/sharedsubtree.txt b/Documentation/filesystems/sharedsubtree.txt
index fc0e39af43c..4ede421c968 100644
--- a/Documentation/filesystems/sharedsubtree.txt
+++ b/Documentation/filesystems/sharedsubtree.txt
@@ -62,10 +62,10 @@ replicas continue to be exactly same.
# mount /dev/sd0 /tmp/a
#ls /tmp/a
- t1 t2 t2
+ t1 t2 t3
#ls /mnt/a
- t1 t2 t2
+ t1 t2 t3
Note that the mount has propagated to the mount at /mnt as well.
diff --git a/Documentation/hwmon/ltc4261 b/Documentation/hwmon/ltc4261
new file mode 100644
index 00000000000..eba2e2c4b94
--- /dev/null
+++ b/Documentation/hwmon/ltc4261
@@ -0,0 +1,63 @@
+Kernel driver ltc4261
+=====================
+
+Supported chips:
+ * Linear Technology LTC4261
+ Prefix: 'ltc4261'
+ Addresses scanned: -
+ Datasheet:
+ http://cds.linear.com/docs/Datasheet/42612fb.pdf
+
+Author: Guenter Roeck <guenter.roeck@ericsson.com>
+
+
+Description
+-----------
+
+The LTC4261/LTC4261-2 negative voltage Hot Swap controllers allow a board
+to be safely inserted and removed from a live backplane.
+
+
+Usage Notes
+-----------
+
+This driver does not probe for LTC4261 devices, since there is no register
+which can be safely used to identify the chip. You will have to instantiate
+the devices explicitly.
+
+Example: the following will load the driver for an LTC4261 at address 0x10
+on I2C bus #1:
+$ modprobe ltc4261
+$ echo ltc4261 0x10 > /sys/bus/i2c/devices/i2c-1/new_device
+
+
+Sysfs entries
+-------------
+
+Voltage readings provided by this driver are reported as obtained from the ADC
+registers. If a set of voltage divider resistors is installed, calculate the
+real voltage by multiplying the reported value with (R1+R2)/R2, where R1 is the
+value of the divider resistor against the measured voltage and R2 is the value
+of the divider resistor against Ground.
+
+Current reading provided by this driver is reported as obtained from the ADC
+Current Sense register. The reported value assumes that a 1 mOhm sense resistor
+is installed. If a different sense resistor is installed, calculate the real
+current by dividing the reported value by the sense resistor value in mOhm.
+
+The chip has two voltage sensors, but only one set of voltage alarm status bits.
+In many many designs, those alarms are associated with the ADIN2 sensor, due to
+the proximity of the ADIN2 pin to the OV pin. ADIN2 is, however, not available
+on all chip variants. To ensure that the alarm condition is reported to the user,
+report it with both voltage sensors.
+
+in1_input ADIN2 voltage (mV)
+in1_min_alarm ADIN/ADIN2 Undervoltage alarm
+in1_max_alarm ADIN/ADIN2 Overvoltage alarm
+
+in2_input ADIN voltage (mV)
+in2_min_alarm ADIN/ADIN2 Undervoltage alarm
+in2_max_alarm ADIN/ADIN2 Overvoltage alarm
+
+curr1_input SENSE current (mA)
+curr1_alarm SENSE overcurrent alarm
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 0b6815504e6..4bc2f3c3da5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1541,12 +1541,15 @@ and is between 256 and 4096 characters. It is defined in the file
1 to enable accounting
Default value is 0.
- nfsaddrs= [NFS]
+ nfsaddrs= [NFS] Deprecated. Use ip= instead.
See Documentation/filesystems/nfs/nfsroot.txt.
nfsroot= [NFS] nfs root filesystem for disk-less boxes.
See Documentation/filesystems/nfs/nfsroot.txt.
+ nfsrootdebug [NFS] enable nfsroot debugging messages.
+ See Documentation/filesystems/nfs/nfsroot.txt.
+
nfs.callback_tcpport=
[NFS] set the TCP port on which the NFSv4 callback
channel should listen.
@@ -2438,7 +2441,7 @@ and is between 256 and 4096 characters. It is defined in the file
topology informations if the hardware supports these.
The scheduler will make use of these informations and
e.g. base its process migration decisions on it.
- Default is off.
+ Default is on.
tp720= [HW,PS2]
diff --git a/Documentation/misc-devices/apds990x.txt b/Documentation/misc-devices/apds990x.txt
new file mode 100644
index 00000000000..d5408cade32
--- /dev/null
+++ b/Documentation/misc-devices/apds990x.txt
@@ -0,0 +1,111 @@
+Kernel driver apds990x
+======================
+
+Supported chips:
+Avago APDS990X
+
+Data sheet:
+Not freely available
+
+Author:
+Samu Onkalo <samu.p.onkalo@nokia.com>
+
+Description
+-----------
+
+APDS990x is a combined ambient light and proximity sensor. ALS and proximity
+functionality are highly connected. ALS measurement path must be running
+while the proximity functionality is enabled.
+
+ALS produces raw measurement values for two channels: Clear channel
+(infrared + visible light) and IR only. However, threshold comparisons happen
+using clear channel only. Lux value and the threshold level on the HW
+might vary quite much depending the spectrum of the light source.
+
+Driver makes necessary conversions to both directions so that user handles
+only lux values. Lux value is calculated using information from the both
+channels. HW threshold level is calculated from the given lux value to match
+with current type of the lightning. Sometimes inaccuracy of the estimations
+lead to false interrupt, but that doesn't harm.
+
+ALS contains 4 different gain steps. Driver automatically
+selects suitable gain step. After each measurement, reliability of the results
+is estimated and new measurement is trigged if necessary.
+
+Platform data can provide tuned values to the conversion formulas if
+values are known. Otherwise plain sensor default values are used.
+
+Proximity side is little bit simpler. There is no need for complex conversions.
+It produces directly usable values.
+
+Driver controls chip operational state using pm_runtime framework.
+Voltage regulators are controlled based on chip operational state.
+
+SYSFS
+-----
+
+
+chip_id
+ RO - shows detected chip type and version
+
+power_state
+ RW - enable / disable chip. Uses counting logic
+ 1 enables the chip
+ 0 disables the chip
+lux0_input
+ RO - measured lux value
+ sysfs_notify called when threshold interrupt occurs
+
+lux0_sensor_range
+ RO - lux0_input max value. Actually never reaches since sensor tends
+ to saturate much before that. Real max value varies depending
+ on the light spectrum etc.
+
+lux0_rate
+ RW - measurement rate in Hz
+
+lux0_rate_avail
+ RO - supported measurement rates
+
+lux0_calibscale
+ RW - calibration value. Set to neutral value by default.
+ Output results are multiplied with calibscale / calibscale_default
+ value.
+
+lux0_calibscale_default
+ RO - neutral calibration value
+
+lux0_thresh_above_value
+ RW - HI level threshold value. All results above the value
+ trigs an interrupt. 65535 (i.e. sensor_range) disables the above
+ interrupt.
+
+lux0_thresh_below_value
+ RW - LO level threshold value. All results below the value
+ trigs an interrupt. 0 disables the below interrupt.
+
+prox0_raw
+ RO - measured proximity value
+ sysfs_notify called when threshold interrupt occurs
+
+prox0_sensor_range
+ RO - prox0_raw max value (1023)
+
+prox0_raw_en
+ RW - enable / disable proximity - uses counting logic
+ 1 enables the proximity
+ 0 disables the proximity
+
+prox0_reporting_mode
+ RW - trigger / periodic. In "trigger" mode the driver tells two possible
+ values: 0 or prox0_sensor_range value. 0 means no proximity,
+ 1023 means proximity. This causes minimal number of interrupts.
+ In "periodic" mode the driver reports all values above
+ prox0_thresh_above. This causes more interrupts, but it can give
+ _rough_ estimate about the distance.
+
+prox0_reporting_mode_avail
+ RO - accepted values to prox0_reporting_mode (trigger, periodic)
+
+prox0_thresh_above_value
+ RW - threshold level which trigs proximity events.
diff --git a/Documentation/misc-devices/bh1770glc.txt b/Documentation/misc-devices/bh1770glc.txt
new file mode 100644
index 00000000000..7d64c014dc7
--- /dev/null
+++ b/Documentation/misc-devices/bh1770glc.txt
@@ -0,0 +1,116 @@
+Kernel driver bh1770glc
+=======================
+
+Supported chips:
+ROHM BH1770GLC
+OSRAM SFH7770
+
+Data sheet:
+Not freely available
+
+Author:
+Samu Onkalo <samu.p.onkalo@nokia.com>
+
+Description
+-----------
+BH1770GLC and SFH7770 are combined ambient light and proximity sensors.
+ALS and proximity parts operates on their own, but they shares common I2C
+interface and interrupt logic. In principle they can run on their own,
+but ALS side results are used to estimate reliability of the proximity sensor.
+
+ALS produces 16 bit lux values. The chip contains interrupt logic to produce
+low and high threshold interrupts.
+
+Proximity part contains IR-led driver up to 3 IR leds. The chip measures
+amount of reflected IR light and produces proximity result. Resolution is
+8 bit. Driver supports only one channel. Driver uses ALS results to estimate
+reliability of the proximity results. Thus ALS is always running while
+proximity detection is needed.
+
+Driver uses threshold interrupts to avoid need for polling the values.
+Proximity low interrupt doesn't exists in the chip. This is simulated
+by using a delayed work. As long as there is proximity threshold above
+interrupts the delayed work is pushed forward. So, when proximity level goes
+below the threshold value, there is no interrupt and the delayed work will
+finally run. This is handled as no proximity indication.
+
+Chip state is controlled via runtime pm framework when enabled in config.
+
+Calibscale factor is used to hide differences between the chips. By default
+value set to neutral state meaning factor of 1.00. To get proper values,
+calibrated source of light is needed as a reference. Calibscale factor is set
+so that measurement produces about the expected lux value.
+
+SYSFS
+-----
+
+chip_id
+ RO - shows detected chip type and version
+
+power_state
+ RW - enable / disable chip. Uses counting logic
+ 1 enables the chip
+ 0 disables the chip
+
+lux0_input
+ RO - measured lux value
+ sysfs_notify called when threshold interrupt occurs
+
+lux0_sensor_range
+ RO - lux0_input max value
+
+lux0_rate
+ RW - measurement rate in Hz
+
+lux0_rate_avail
+ RO - supported measurement rates
+
+lux0_thresh_above_value
+ RW - HI level threshold value. All results above the value
+ trigs an interrupt. 65535 (i.e. sensor_range) disables the above
+ interrupt.
+
+lux0_thresh_below_value
+ RW - LO level threshold value. All results below the value
+ trigs an interrupt. 0 disables the below interrupt.
+
+lux0_calibscale
+ RW - calibration value. Set to neutral value by default.
+ Output results are multiplied with calibscale / calibscale_default
+ value.
+
+lux0_calibscale_default
+ RO - neutral calibration value
+
+prox0_raw
+ RO - measured proximity value
+ sysfs_notify called when threshold interrupt occurs
+
+prox0_sensor_range
+ RO - prox0_raw max value
+
+prox0_raw_en
+ RW - enable / disable proximity - uses counting logic
+ 1 enables the proximity
+ 0 disables the proximity
+
+prox0_thresh_above_count
+ RW - number of proximity interrupts needed before triggering the event
+
+prox0_rate_above
+ RW - Measurement rate (in Hz) when the level is above threshold
+ i.e. when proximity on has been reported.
+
+prox0_rate_below
+ RW - Measurement rate (in Hz) when the level is below threshold
+ i.e. when proximity off has been reported.
+
+prox0_rate_avail
+ RO - Supported proximity measurement rates in Hz
+
+prox0_thresh_above0_value
+ RW - threshold level which trigs proximity events.
+ Filtered by persistence filter (prox0_thresh_above_count)
+
+prox0_thresh_above1_value
+ RW - threshold level which trigs event immediately
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt
index 7f4dcebda9c..d0eb696d32e 100644
--- a/Documentation/sound/alsa/ALSA-Configuration.txt
+++ b/Documentation/sound/alsa/ALSA-Configuration.txt
@@ -300,6 +300,74 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
control correctly. If you have problems regarding this, try
another ALSA compliant mixer (alsamixer works).
+ Module snd-azt1605
+ ------------------
+
+ Module for Aztech Sound Galaxy soundcards based on the Aztech AZT1605
+ chipset.
+
+ port - port # for BASE (0x220,0x240,0x260,0x280)
+ wss_port - port # for WSS (0x530,0x604,0xe80,0xf40)
+ irq - IRQ # for WSS (7,9,10,11)
+ dma1 - DMA # for WSS playback (0,1,3)
+ dma2 - DMA # for WSS capture (0,1), -1 = disabled (default)
+ mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disabled (default)
+ mpu_irq - IRQ # for MPU-401 UART (3,5,7,9), -1 = disabled (default)
+ fm_port - port # for OPL3 (0x388), -1 = disabled (default)
+
+ This module supports multiple cards. It does not support autoprobe: port,
+ wss_port, irq and dma1 have to be specified. The other values are
+ optional.
+
+ "port" needs to match the BASE ADDRESS jumper on the card (0x220 or 0x240)
+ or the value stored in the card's EEPROM for cards that have an EEPROM and
+ their "CONFIG MODE" jumper set to "EEPROM SETTING". The other values can
+ be choosen freely from the options enumerated above.
+
+ If dma2 is specified and different from dma1, the card will operate in
+ full-duplex mode. When dma1=3, only dma2=0 is valid and the only way to
+ enable capture since only channels 0 and 1 are available for capture.
+
+ Generic settings are "port=0x220 wss_port=0x530 irq=10 dma1=1 dma2=0
+ mpu_port=0x330 mpu_irq=9 fm_port=0x388".
+
+ Whatever IRQ and DMA channels you pick, be sure to reserve them for
+ legacy ISA in your BIOS.
+
+ Module snd-azt2316
+ ------------------
+
+ Module for Aztech Sound Galaxy soundcards based on the Aztech AZT2316
+ chipset.
+
+ port - port # for BASE (0x220,0x240,0x260,0x280)
+ wss_port - port # for WSS (0x530,0x604,0xe80,0xf40)
+ irq - IRQ # for WSS (7,9,10,11)
+ dma1 - DMA # for WSS playback (0,1,3)
+ dma2 - DMA # for WSS capture (0,1), -1 = disabled (default)
+ mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disabled (default)
+ mpu_irq - IRQ # for MPU-401 UART (5,7,9,10), -1 = disabled (default)
+ fm_port - port # for OPL3 (0x388), -1 = disabled (default)
+
+ This module supports multiple cards. It does not support autoprobe: port,
+ wss_port, irq and dma1 have to be specified. The other values are
+ optional.
+
+ "port" needs to match the BASE ADDRESS jumper on the card (0x220 or 0x240)
+ or the value stored in the card's EEPROM for cards that have an EEPROM and
+ their "CONFIG MODE" jumper set to "EEPROM SETTING". The other values can
+ be choosen freely from the options enumerated above.
+
+ If dma2 is specified and different from dma1, the card will operate in
+ full-duplex mode. When dma1=3, only dma2=0 is valid and the only way to
+ enable capture since only channels 0 and 1 are available for capture.
+
+ Generic settings are "port=0x220 wss_port=0x530 irq=10 dma1=1 dma2=0
+ mpu_port=0x330 mpu_irq=9 fm_port=0x388".
+
+ Whatever IRQ and DMA channels you pick, be sure to reserve them for
+ legacy ISA in your BIOS.
+
Module snd-aw2
--------------
@@ -1641,20 +1709,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
This card is also known as Audio Excel DSP 16 or Zoltrix AV302.
- Module snd-sgalaxy
- ------------------
-
- Module for Aztech Sound Galaxy sound card.
-
- sbport - Port # for SB16 interface (0x220,0x240)
- wssport - Port # for WSS interface (0x530,0xe80,0xf40,0x604)
- irq - IRQ # (7,9,10,11)
- dma1 - DMA #
-
- This module supports multiple cards.
-
- The power-management is supported.
-
Module snd-sscape
-----------------
diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt
index 278cc2122ea..c82beb00763 100644
--- a/Documentation/sound/alsa/HD-Audio.txt
+++ b/Documentation/sound/alsa/HD-Audio.txt
@@ -57,9 +57,11 @@ dead. However, this detection isn't perfect on some devices. In such
a case, you can change the default method via `position_fix` option.
`position_fix=1` means to use LPIB method explicitly.
-`position_fix=2` means to use the position-buffer. 0 is the default
-value, the automatic check and fallback to LPIB as described in the
-above. If you get a problem of repeated sounds, this option might
+`position_fix=2` means to use the position-buffer.
+`position_fix=3` means to use a combination of both methods, needed
+for some VIA and ATI controllers. 0 is the default value for all other
+controllers, the automatic check and fallback to LPIB as described in
+the above. If you get a problem of repeated sounds, this option might
help.
In addition to that, every controller is known to be broken regarding
diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt
index 5c17196c8fe..312e3754e8c 100644
--- a/Documentation/sysrq.txt
+++ b/Documentation/sysrq.txt
@@ -75,7 +75,7 @@ On all - write a character to /proc/sysrq-trigger. e.g.:
'f' - Will call oom_kill to kill a memory hog process.
-'g' - Used by kgdb on ppc and sh platforms.
+'g' - Used by kgdb (kernel debugger)
'h' - Will display help (actually any other key than those listed
here will display help. but 'h' is easy to remember :-)
@@ -110,12 +110,15 @@ On all - write a character to /proc/sysrq-trigger. e.g.:
'u' - Will attempt to remount all mounted filesystems read-only.
-'v' - Dumps Voyager SMP processor info to your console.
+'v' - Forcefully restores framebuffer console
+'v' - Causes ETM buffer dump [ARM-specific]
'w' - Dumps tasks that are in uninterruptable (blocked) state.
'x' - Used by xmon interface on ppc/powerpc platforms.
+'y' - Show global CPU Registers [SPARC-64 specific]
+
'z' - Dump the ftrace buffer
'0'-'9' - Sets the console log level, controlling which kernel messages
diff --git a/Documentation/timers/hpet_example.c b/Documentation/timers/hpet_example.c
index 4bfafb7bc4c..9a3e7012c19 100644
--- a/Documentation/timers/hpet_example.c
+++ b/Documentation/timers/hpet_example.c
@@ -97,6 +97,33 @@ hpet_open_close(int argc, const char **argv)
void
hpet_info(int argc, const char **argv)
{
+ struct hpet_info info;
+ int fd;
+
+ if (argc != 1) {
+ fprintf(stderr, "hpet_info: device-name\n");
+ return;
+ }
+
+ fd = open(argv[0], O_RDONLY);
+ if (fd < 0) {
+ fprintf(stderr, "hpet_info: open of %s failed\n", argv[0]);
+ return;
+ }
+
+ if (ioctl(fd, HPET_INFO, &info) < 0) {
+ fprintf(stderr, "hpet_info: failed to get info\n");
+ goto out;
+ }
+
+ fprintf(stderr, "hpet_info: hi_irqfreq 0x%lx hi_flags 0x%lx ",
+ info.hi_ireqfreq, info.hi_flags);
+ fprintf(stderr, "hi_hpet %d hi_timer %d\n",
+ info.hi_hpet, info.hi_timer);
+
+out:
+ close(fd);
+ return;
}
void
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 1b55146d1c8..b3e73ddb156 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -46,7 +46,7 @@ use constant HIGH_KSWAPD_LATENCY => 20;
use constant HIGH_KSWAPD_REWAKEUP => 21;
use constant HIGH_NR_SCANNED => 22;
use constant HIGH_NR_TAKEN => 23;
-use constant HIGH_NR_RECLAIM => 24;
+use constant HIGH_NR_RECLAIMED => 24;
use constant HIGH_NR_CONTIG_DIRTY => 25;
my %perprocesspid;
@@ -58,11 +58,13 @@ my $opt_read_procstat;
my $total_wakeup_kswapd;
my ($total_direct_reclaim, $total_direct_nr_scanned);
my ($total_direct_latency, $total_kswapd_latency);
+my ($total_direct_nr_reclaimed);
my ($total_direct_writepage_file_sync, $total_direct_writepage_file_async);
my ($total_direct_writepage_anon_sync, $total_direct_writepage_anon_async);
my ($total_kswapd_nr_scanned, $total_kswapd_wake);
my ($total_kswapd_writepage_file_sync, $total_kswapd_writepage_file_async);
my ($total_kswapd_writepage_anon_sync, $total_kswapd_writepage_anon_async);
+my ($total_kswapd_nr_reclaimed);
# Catch sigint and exit on request
my $sigint_report = 0;
@@ -104,7 +106,7 @@ my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) contig_taken=([0-9]*) contig_dirty=([0-9]*) contig_failed=([0-9]*)';
-my $regex_lru_shrink_inactive_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*)';
+my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
@@ -203,8 +205,8 @@ $regex_lru_shrink_inactive = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_inactive",
$regex_lru_shrink_inactive_default,
"nid", "zid",
- "lru",
- "nr_scanned", "nr_reclaimed", "priority");
+ "nr_scanned", "nr_reclaimed", "priority",
+ "flags");
$regex_lru_shrink_active = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_active",
$regex_lru_shrink_active_default,
@@ -375,6 +377,16 @@ EVENT_PROCESS:
my $nr_contig_dirty = $7;
$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
$perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty;
+ } elsif ($tracepoint eq "mm_vmscan_lru_shrink_inactive") {
+ $details = $5;
+ if ($details !~ /$regex_lru_shrink_inactive/o) {
+ print "WARNING: Failed to parse mm_vmscan_lru_shrink_inactive as expected\n";
+ print " $details\n";
+ print " $regex_lru_shrink_inactive/o\n";
+ next;
+ }
+ my $nr_reclaimed = $4;
+ $perprocesspid{$process_pid}->{HIGH_NR_RECLAIMED} += $nr_reclaimed;
} elsif ($tracepoint eq "mm_vmscan_writepage") {
$details = $5;
if ($details !~ /$regex_writepage/o) {
@@ -464,8 +476,8 @@ sub dump_stats {
# Print out process activity
printf("\n");
- printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s\n", "Process", "Direct", "Wokeup", "Pages", "Pages", "Pages", "Time");
- printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s\n", "details", "Rclms", "Kswapd", "Scanned", "Sync-IO", "ASync-IO", "Stalled");
+ printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s %8s\n", "Process", "Direct", "Wokeup", "Pages", "Pages", "Pages", "Pages", "Time");
+ printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s %8s\n", "details", "Rclms", "Kswapd", "Scanned", "Rclmed", "Sync-IO", "ASync-IO", "Stalled");
foreach $process_pid (keys %stats) {
if (!$stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN}) {
@@ -475,6 +487,7 @@ sub dump_stats {
$total_direct_reclaim += $stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN};
$total_wakeup_kswapd += $stats{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD};
$total_direct_nr_scanned += $stats{$process_pid}->{HIGH_NR_SCANNED};
+ $total_direct_nr_reclaimed += $stats{$process_pid}->{HIGH_NR_RECLAIMED};
$total_direct_writepage_file_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$total_direct_writepage_anon_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$total_direct_writepage_file_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
@@ -489,11 +502,12 @@ sub dump_stats {
$index++;
}
- printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8u %8.3f",
+ printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8u %8u %8.3f",
$process_pid,
$stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN},
$stats{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD},
$stats{$process_pid}->{HIGH_NR_SCANNED},
+ $stats{$process_pid}->{HIGH_NR_RECLAIMED},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC},
$this_reclaim_delay / 1000);
@@ -529,8 +543,8 @@ sub dump_stats {
# Print out kswapd activity
printf("\n");
- printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Kswapd", "Kswapd", "Order", "Pages", "Pages", "Pages");
- printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Instance", "Wakeups", "Re-wakeup", "Scanned", "Sync-IO", "ASync-IO");
+ printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Kswapd", "Kswapd", "Order", "Pages", "Pages", "Pages", "Pages");
+ printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Instance", "Wakeups", "Re-wakeup", "Scanned", "Rclmed", "Sync-IO", "ASync-IO");
foreach $process_pid (keys %stats) {
if (!$stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE}) {
@@ -539,16 +553,18 @@ sub dump_stats {
$total_kswapd_wake += $stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE};
$total_kswapd_nr_scanned += $stats{$process_pid}->{HIGH_NR_SCANNED};
+ $total_kswapd_nr_reclaimed += $stats{$process_pid}->{HIGH_NR_RECLAIMED};
$total_kswapd_writepage_file_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$total_kswapd_writepage_anon_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$total_kswapd_writepage_file_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
$total_kswapd_writepage_anon_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC};
- printf("%-" . $max_strlen . "s %8d %10d %8u %8i %8u",
+ printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8i %8u",
$process_pid,
$stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE},
$stats{$process_pid}->{HIGH_KSWAPD_REWAKEUP},
$stats{$process_pid}->{HIGH_NR_SCANNED},
+ $stats{$process_pid}->{HIGH_NR_RECLAIMED},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC});
@@ -579,6 +595,7 @@ sub dump_stats {
print "\nSummary\n";
print "Direct reclaims: $total_direct_reclaim\n";
print "Direct reclaim pages scanned: $total_direct_nr_scanned\n";
+ print "Direct reclaim pages reclaimed: $total_direct_nr_reclaimed\n";
print "Direct reclaim write file sync I/O: $total_direct_writepage_file_sync\n";
print "Direct reclaim write anon sync I/O: $total_direct_writepage_anon_sync\n";
print "Direct reclaim write file async I/O: $total_direct_writepage_file_async\n";
@@ -588,6 +605,7 @@ sub dump_stats {
print "\n";
print "Kswapd wakeups: $total_kswapd_wake\n";
print "Kswapd pages scanned: $total_kswapd_nr_scanned\n";
+ print "Kswapd pages reclaimed: $total_kswapd_nr_reclaimed\n";
print "Kswapd reclaim write file sync I/O: $total_kswapd_writepage_file_sync\n";
print "Kswapd reclaim write anon sync I/O: $total_kswapd_writepage_anon_sync\n";
print "Kswapd reclaim write file async I/O: $total_kswapd_writepage_file_async\n";
@@ -612,6 +630,7 @@ sub aggregate_perprocesspid() {
$perprocess{$process}->{MM_VMSCAN_WAKEUP_KSWAPD} += $perprocesspid{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD};
$perprocess{$process}->{HIGH_KSWAPD_REWAKEUP} += $perprocesspid{$process_pid}->{HIGH_KSWAPD_REWAKEUP};
$perprocess{$process}->{HIGH_NR_SCANNED} += $perprocesspid{$process_pid}->{HIGH_NR_SCANNED};
+ $perprocess{$process}->{HIGH_NR_RECLAIMED} += $perprocesspid{$process_pid}->{HIGH_NR_RECLAIMED};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
diff --git a/Documentation/vm/highmem.txt b/Documentation/vm/highmem.txt
new file mode 100644
index 00000000000..4324d24ffac
--- /dev/null
+++ b/Documentation/vm/highmem.txt
@@ -0,0 +1,162 @@
+
+ ====================
+ HIGH MEMORY HANDLING
+ ====================
+
+By: Peter Zijlstra <a.p.zijlstra@chello.nl>
+
+Contents:
+
+ (*) What is high memory?
+
+ (*) Temporary virtual mappings.
+
+ (*) Using kmap_atomic.
+
+ (*) Cost of temporary mappings.
+
+ (*) i386 PAE.
+
+
+====================
+WHAT IS HIGH MEMORY?
+====================
+
+High memory (highmem) is used when the size of physical memory approaches or
+exceeds the maximum size of virtual memory. At that point it becomes
+impossible for the kernel to keep all of the available physical memory mapped
+at all times. This means the kernel needs to start using temporary mappings of
+the pieces of physical memory that it wants to access.
+
+The part of (physical) memory not covered by a permanent mapping is what we
+refer to as 'highmem'. There are various architecture dependent constraints on
+where exactly that border lies.
+
+In the i386 arch, for example, we choose to map the kernel into every process's
+VM space so that we don't have to pay the full TLB invalidation costs for
+kernel entry/exit. This means the available virtual memory space (4GiB on
+i386) has to be divided between user and kernel space.
+
+The traditional split for architectures using this approach is 3:1, 3GiB for
+userspace and the top 1GiB for kernel space:
+
+ +--------+ 0xffffffff
+ | Kernel |
+ +--------+ 0xc0000000
+ | |
+ | User |
+ | |
+ +--------+ 0x00000000
+
+This means that the kernel can at most map 1GiB of physical memory at any one
+time, but because we need virtual address space for other things - including
+temporary maps to access the rest of the physical memory - the actual direct
+map will typically be less (usually around ~896MiB).
+
+Other architectures that have mm context tagged TLBs can have separate kernel
+and user maps. Some hardware (like some ARMs), however, have limited virtual
+space when they use mm context tags.
+
+
+==========================
+TEMPORARY VIRTUAL MAPPINGS
+==========================
+
+The kernel contains several ways of creating temporary mappings:
+
+ (*) vmap(). This can be used to make a long duration mapping of multiple
+ physical pages into a contiguous virtual space. It needs global
+ synchronization to unmap.
+
+ (*) kmap(). This permits a short duration mapping of a single page. It needs
+ global synchronization, but is amortized somewhat. It is also prone to
+ deadlocks when using in a nested fashion, and so it is not recommended for
+ new code.
+
+ (*) kmap_atomic(). This permits a very short duration mapping of a single
+ page. Since the mapping is restricted to the CPU that issued it, it
+ performs well, but the issuing task is therefore required to stay on that
+ CPU until it has finished, lest some other task displace its mappings.
+
+ kmap_atomic() may also be used by interrupt contexts, since it is does not
+ sleep and the caller may not sleep until after kunmap_atomic() is called.
+
+ It may be assumed that k[un]map_atomic() won't fail.
+
+
+=================
+USING KMAP_ATOMIC
+=================
+
+When and where to use kmap_atomic() is straightforward. It is used when code
+wants to access the contents of a page that might be allocated from high memory
+(see __GFP_HIGHMEM), for example a page in the pagecache. The API has two
+functions, and they can be used in a manner similar to the following:
+
+ /* Find the page of interest. */
+ struct page *page = find_get_page(mapping, offset);
+
+ /* Gain access to the contents of that page. */
+ void *vaddr = kmap_atomic(page);
+
+ /* Do something to the contents of that page. */
+ memset(vaddr, 0, PAGE_SIZE);
+
+ /* Unmap that page. */
+ kunmap_atomic(vaddr);
+
+Note that the kunmap_atomic() call takes the result of the kmap_atomic() call
+not the argument.
+
+If you need to map two pages because you want to copy from one page to
+another you need to keep the kmap_atomic calls strictly nested, like:
+
+ vaddr1 = kmap_atomic(page1);
+ vaddr2 = kmap_atomic(page2);
+
+ memcpy(vaddr1, vaddr2, PAGE_SIZE);
+
+ kunmap_atomic(vaddr2);
+ kunmap_atomic(vaddr1);
+
+
+==========================
+COST OF TEMPORARY MAPPINGS
+==========================
+
+The cost of creating temporary mappings can be quite high. The arch has to
+manipulate the kernel's page tables, the data TLB and/or the MMU's registers.
+
+If CONFIG_HIGHMEM is not set, then the kernel will try and create a mapping
+simply with a bit of arithmetic that will convert the page struct address into
+a pointer to the page contents rather than juggling mappings about. In such a
+case, the unmap operation may be a null operation.
+
+If CONFIG_MMU is not set, then there can be no temporary mappings and no
+highmem. In such a case, the arithmetic approach will also be used.
+
+
+========
+i386 PAE
+========
+
+The i386 arch, under some circumstances, will permit you to stick up to 64GiB
+of RAM into your 32-bit machine. This has a number of consequences:
+
+ (*) Linux needs a page-frame structure for each page in the system and the
+ pageframes need to live in the permanent mapping, which means:
+
+ (*) you can have 896M/sizeof(struct page) page-frames at most; with struct
+ page being 32-bytes that would end up being something in the order of 112G
+ worth of pages; the kernel, however, needs to store more than just
+ page-frames in that memory...
+
+ (*) PAE makes your page tables larger - which slows the system down as more
+ data has to be accessed to traverse in TLB fills and the like. One
+ advantage is that PAE has more PTE bits and can provide advanced features
+ like NX and PAT.
+
+The general recommendation is that you don't use more than 8GiB on a 32-bit
+machine - although more might work for you and your workload, you're pretty
+much on your own - don't expect kernel developers to really care much if things
+come apart.