summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorStephen Rothwell <sfr@canb.auug.org.au>2017-02-17 15:53:02 +1100
committerStephen Rothwell <sfr@canb.auug.org.au>2017-02-17 15:53:04 +1100
commit6641ce8e95e9cb5c678cf882f9eb7c7632fa1a2c (patch)
treee36b518353cf1b36c20a8b315de17ea8ee577c39 /Documentation
parent200445685b48a1f4eb8e4abb4ae46f22236b1197 (diff)
parent1224d22e9f1c57618e6db3737c17f8a1e0927d18 (diff)
Merge branch 'akpm-current/current'
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/obsolete/sysfs-block-zram119
-rw-r--r--Documentation/ABI/testing/sysfs-block-zram101
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt8
-rw-r--r--Documentation/blockdev/zram.txt74
-rw-r--r--Documentation/filesystems/autofs4-mount-control.txt1
-rw-r--r--Documentation/filesystems/autofs4.txt39
-rw-r--r--Documentation/sysctl/vm.txt4
-rw-r--r--Documentation/trace/postprocess/trace-vmscan-postprocess.pl26
-rw-r--r--Documentation/vm/ksm.txt18
-rw-r--r--Documentation/vm/transhuge.txt8
10 files changed, 113 insertions, 285 deletions
diff --git a/Documentation/ABI/obsolete/sysfs-block-zram b/Documentation/ABI/obsolete/sysfs-block-zram
deleted file mode 100644
index 720ea92cfb2e..000000000000
--- a/Documentation/ABI/obsolete/sysfs-block-zram
+++ /dev/null
@@ -1,119 +0,0 @@
-What: /sys/block/zram<id>/num_reads
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The num_reads file is read-only and specifies the number of
- reads (failed or successful) done on this device.
- Now accessible via zram<id>/stat node.
-
-What: /sys/block/zram<id>/num_writes
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The num_writes file is read-only and specifies the number of
- writes (failed or successful) done on this device.
- Now accessible via zram<id>/stat node.
-
-What: /sys/block/zram<id>/invalid_io
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The invalid_io file is read-only and specifies the number of
- non-page-size-aligned I/O requests issued to this device.
- Now accessible via zram<id>/io_stat node.
-
-What: /sys/block/zram<id>/failed_reads
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The failed_reads file is read-only and specifies the number of
- failed reads happened on this device.
- Now accessible via zram<id>/io_stat node.
-
-What: /sys/block/zram<id>/failed_writes
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The failed_writes file is read-only and specifies the number of
- failed writes happened on this device.
- Now accessible via zram<id>/io_stat node.
-
-What: /sys/block/zram<id>/notify_free
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The notify_free file is read-only. Depending on device usage
- scenario it may account a) the number of pages freed because
- of swap slot free notifications or b) the number of pages freed
- because of REQ_DISCARD requests sent by bio. The former ones
- are sent to a swap block device when a swap slot is freed, which
- implies that this disk is being used as a swap disk. The latter
- ones are sent by filesystem mounted with discard option,
- whenever some data blocks are getting discarded.
- Now accessible via zram<id>/io_stat node.
-
-What: /sys/block/zram<id>/zero_pages
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The zero_pages file is read-only and specifies number of zero
- filled pages written to this disk. No memory is allocated for
- such pages.
- Now accessible via zram<id>/mm_stat node.
-
-What: /sys/block/zram<id>/orig_data_size
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The orig_data_size file is read-only and specifies uncompressed
- size of data stored in this disk. This excludes zero-filled
- pages (zero_pages) since no memory is allocated for them.
- Unit: bytes
- Now accessible via zram<id>/mm_stat node.
-
-What: /sys/block/zram<id>/compr_data_size
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The compr_data_size file is read-only and specifies compressed
- size of data stored in this disk. So, compression ratio can be
- calculated using orig_data_size and this statistic.
- Unit: bytes
- Now accessible via zram<id>/mm_stat node.
-
-What: /sys/block/zram<id>/mem_used_total
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The mem_used_total file is read-only and specifies the amount
- of memory, including allocator fragmentation and metadata
- overhead, allocated for this disk. So, allocator space
- efficiency can be calculated using compr_data_size and this
- statistic.
- Unit: bytes
- Now accessible via zram<id>/mm_stat node.
-
-What: /sys/block/zram<id>/mem_used_max
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The mem_used_max file is read/write and specifies the amount
- of maximum memory zram have consumed to store compressed data.
- For resetting the value, you should write "0". Otherwise,
- you could see -EINVAL.
- Unit: bytes
- Downgraded to write-only node: so it's possible to set new
- value only; its current value is stored in zram<id>/mm_stat
- node.
-
-What: /sys/block/zram<id>/mem_limit
-Date: August 2015
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The mem_limit file is read/write and specifies the maximum
- amount of memory ZRAM can use to store the compressed data.
- The limit could be changed in run time and "0" means disable
- the limit. No limit is the initial state. Unit: bytes
- Downgraded to write-only node: so it's possible to set new
- value only; its current value is stored in zram<id>/mm_stat
- node.
diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 4518d30b8c2e..451b6d882b2c 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -22,41 +22,6 @@ Description:
device. The reset operation frees all the memory associated
with this device.
-What: /sys/block/zram<id>/num_reads
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The num_reads file is read-only and specifies the number of
- reads (failed or successful) done on this device.
-
-What: /sys/block/zram<id>/num_writes
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The num_writes file is read-only and specifies the number of
- writes (failed or successful) done on this device.
-
-What: /sys/block/zram<id>/invalid_io
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The invalid_io file is read-only and specifies the number of
- non-page-size-aligned I/O requests issued to this device.
-
-What: /sys/block/zram<id>/failed_reads
-Date: February 2014
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The failed_reads file is read-only and specifies the number of
- failed reads happened on this device.
-
-What: /sys/block/zram<id>/failed_writes
-Date: February 2014
-Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
-Description:
- The failed_writes file is read-only and specifies the number of
- failed writes happened on this device.
-
What: /sys/block/zram<id>/max_comp_streams
Date: February 2014
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
@@ -73,74 +38,24 @@ Description:
available and selected compression algorithms, change
compression algorithm selection.
-What: /sys/block/zram<id>/notify_free
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The notify_free file is read-only. Depending on device usage
- scenario it may account a) the number of pages freed because
- of swap slot free notifications or b) the number of pages freed
- because of REQ_DISCARD requests sent by bio. The former ones
- are sent to a swap block device when a swap slot is freed, which
- implies that this disk is being used as a swap disk. The latter
- ones are sent by filesystem mounted with discard option,
- whenever some data blocks are getting discarded.
-
-What: /sys/block/zram<id>/zero_pages
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The zero_pages file is read-only and specifies number of zero
- filled pages written to this disk. No memory is allocated for
- such pages.
-
-What: /sys/block/zram<id>/orig_data_size
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The orig_data_size file is read-only and specifies uncompressed
- size of data stored in this disk. This excludes zero-filled
- pages (zero_pages) since no memory is allocated for them.
- Unit: bytes
-
-What: /sys/block/zram<id>/compr_data_size
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The compr_data_size file is read-only and specifies compressed
- size of data stored in this disk. So, compression ratio can be
- calculated using orig_data_size and this statistic.
- Unit: bytes
-
-What: /sys/block/zram<id>/mem_used_total
-Date: August 2010
-Contact: Nitin Gupta <ngupta@vflare.org>
-Description:
- The mem_used_total file is read-only and specifies the amount
- of memory, including allocator fragmentation and metadata
- overhead, allocated for this disk. So, allocator space
- efficiency can be calculated using compr_data_size and this
- statistic.
- Unit: bytes
-
What: /sys/block/zram<id>/mem_used_max
Date: August 2014
Contact: Minchan Kim <minchan@kernel.org>
Description:
- The mem_used_max file is read/write and specifies the amount
- of maximum memory zram have consumed to store compressed data.
- For resetting the value, you should write "0". Otherwise,
- you could see -EINVAL.
+ The mem_used_max file is write-only and is used to reset
+ the counter of maximum memory zram have consumed to store
+ compressed data. For resetting the value, you should write
+ "0". Otherwise, you could see -EINVAL.
Unit: bytes
What: /sys/block/zram<id>/mem_limit
Date: August 2014
Contact: Minchan Kim <minchan@kernel.org>
Description:
- The mem_limit file is read/write and specifies the maximum
- amount of memory ZRAM can use to store the compressed data. The
- limit could be changed in run time and "0" means disable the
- limit. No limit is the initial state. Unit: bytes
+ The mem_limit file is write-only and specifies the maximum
+ amount of memory ZRAM can use to store the compressed data.
+ The limit could be changed in run time and "0" means disable
+ the limit. No limit is the initial state. Unit: bytes
What: /sys/block/zram<id>/compact
Date: August 2015
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d05533e6120b..986e44387dad 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3694,6 +3694,14 @@
last alloc / free. For more information see
Documentation/vm/slub.txt.
+ slub_memcg_sysfs= [MM, SLUB]
+ Determines whether to enable sysfs directories for
+ memory cgroup sub-caches. 1 to enable, 0 to disable.
+ The default is determined by CONFIG_SLUB_MEMCG_SYSFS_ON.
+ Enabling this can lead to a very high number of debug
+ directories and files being created under
+ /sys/kernel/slub.
+
slub_max_order= [MM, SLUB]
Determines the maximum allowed order for slabs.
A high setting may cause OOMs due to memory
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 0535ae1f73e5..4fced8a21307 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -161,42 +161,14 @@ Name access description
disksize RW show and set the device's disk size
initstate RO shows the initialization state of the device
reset WO trigger device reset
-num_reads RO the number of reads
-failed_reads RO the number of failed reads
-num_write RO the number of writes
-failed_writes RO the number of failed writes
-invalid_io RO the number of non-page-size-aligned I/O requests
+mem_used_max WO reset the `mem_used_max' counter (see later)
+mem_limit WO specifies the maximum amount of memory ZRAM can use
+ to store the compressed data
max_comp_streams RW the number of possible concurrent compress operations
comp_algorithm RW show and change the compression algorithm
-notify_free RO the number of notifications to free pages (either
- slot free notifications or REQ_DISCARD requests)
-zero_pages RO the number of zero filled pages written to this disk
-orig_data_size RO uncompressed size of data stored in this disk
-compr_data_size RO compressed size of data stored in this disk
-mem_used_total RO the amount of memory allocated for this disk
-mem_used_max RW the maximum amount of memory zram have consumed to
- store the data (to reset this counter to the actual
- current value, write 1 to this attribute)
-mem_limit RW the maximum amount of memory ZRAM can use to store
- the compressed data
-pages_compacted RO the number of pages freed during compaction
- (available only via zram<id>/mm_stat node)
compact WO trigger memory compaction
debug_stat RO this file is used for zram debugging purposes
-WARNING
-=======
-per-stat sysfs attributes are considered to be deprecated.
-The basic strategy is:
--- the existing RW nodes will be downgraded to WO nodes (in linux 4.11)
--- deprecated RO sysfs nodes will eventually be removed (in linux 4.11)
-
-The list of deprecated attributes can be found here:
-Documentation/ABI/obsolete/sysfs-block-zram
-
-Basically, every attribute that has its own read accessible sysfs node
-(e.g. num_reads) *AND* is accessible via one of the stat files (zram<id>/stat
-or zram<id>/io_stat or zram<id>/mm_stat) is considered to be deprecated.
User space is advised to use the following files to read the device statistics.
@@ -211,22 +183,40 @@ The stat file represents device's I/O statistics not accounted by block
layer and, thus, not available in zram<id>/stat file. It consists of a
single line of text and contains the following stats separated by
whitespace:
- failed_reads
- failed_writes
- invalid_io
- notify_free
+ failed_reads the number of failed reads
+ failed_writes the number of failed writes
+ invalid_io the number of non-page-size-aligned I/O requests
+ notify_free Depending on device usage scenario it may account
+ a) the number of pages freed because of swap slot free
+ notifications or b) the number of pages freed because of
+ REQ_DISCARD requests sent by bio. The former ones are
+ sent to a swap block device when a swap slot is freed,
+ which implies that this disk is being used as a swap disk.
+ The latter ones are sent by filesystem mounted with
+ discard option, whenever some data blocks are getting
+ discarded.
File /sys/block/zram<id>/mm_stat
The stat file represents device's mm statistics. It consists of a single
line of text and contains the following stats separated by whitespace:
- orig_data_size
- compr_data_size
- mem_used_total
- mem_limit
- mem_used_max
- zero_pages
- num_migrated
+ orig_data_size uncompressed size of data stored in this disk.
+ This excludes same-element-filled pages (same_pages) since
+ no memory is allocated for them.
+ Unit: bytes
+ compr_data_size compressed size of data stored in this disk
+ mem_used_total the amount of memory allocated for this disk. This
+ includes allocator fragmentation and metadata overhead,
+ allocated for this disk. So, allocator space efficiency
+ can be calculated using compr_data_size and this statistic.
+ Unit: bytes
+ mem_limit the maximum amount of memory ZRAM can use to store
+ the compressed data
+ mem_used_max the maximum amount of memory zram have consumed to
+ store the data
+ same_pages the number of same element filled pages written to this disk.
+ No memory is allocated for such pages.
+ pages_compacted the number of pages freed during compaction
9) Deactivate:
swapoff /dev/zram0
diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt
index 50a3e01a36f8..e5177cb31a04 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -179,6 +179,7 @@ struct autofs_dev_ioctl {
* including this struct */
__s32 ioctlfd; /* automount command fd */
+ /* Command parameters */
union {
struct args_protover protover;
struct args_protosubver protosubver;
diff --git a/Documentation/filesystems/autofs4.txt b/Documentation/filesystems/autofs4.txt
index 8fac3fe7b8c9..f10dd590f69f 100644
--- a/Documentation/filesystems/autofs4.txt
+++ b/Documentation/filesystems/autofs4.txt
@@ -65,7 +65,7 @@ directory is a mount trap only if the filesystem is mounted *direct*
and the root is empty.
Directories created in the root directory are mount traps only if the
-filesystem is mounted *indirect* and they are empty.
+filesystem is mounted *indirect* and they are empty.
Directories further down the tree depend on the *maxproto* mount
option and particularly whether it is less than five or not.
@@ -352,7 +352,7 @@ Communicating with autofs: root directory ioctls
------------------------------------------------
The root directory of an autofs filesystem will respond to a number of
-ioctls. The process issuing the ioctl must have the CAP_SYS_ADMIN
+ioctls. The process issuing the ioctl must have the CAP_SYS_ADMIN
capability, or must be the automount daemon.
The available ioctl commands are:
@@ -425,8 +425,20 @@ Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure:
* including this struct */
__s32 ioctlfd; /* automount command fd */
- __u32 arg1; /* Command parameters */
- __u32 arg2;
+ /* Command parameters */
+ union {
+ struct args_protover protover;
+ struct args_protosubver protosubver;
+ struct args_openmount openmount;
+ struct args_ready ready;
+ struct args_fail fail;
+ struct args_setpipefd setpipefd;
+ struct args_timeout timeout;
+ struct args_requester requester;
+ struct args_expire expire;
+ struct args_askumount askumount;
+ struct args_ismountpoint ismountpoint;
+ };
char path[0];
};
@@ -446,25 +458,22 @@ Commands are:
set version numbers.
- **AUTOFS_DEV_IOCTL_OPENMOUNT_CMD**: return an open file descriptor
on the root of an autofs filesystem. The filesystem is identified
- by name and device number, which is stored in `arg1`. Device
- numbers for existing filesystems can be found in
+ by name and device number, which is stored in `openmount.devid`.
+ Device numbers for existing filesystems can be found in
`/proc/self/mountinfo`.
- **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`.
- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
catatonic mode, this can provide the write end of a new pipe
- in `arg1` to re-establish communication with a daemon. The
- process group of the calling process is used to identify the
+ in `setpipefd.pipefd` to re-establish communication with a daemon.
+ The process group of the calling process is used to identify the
daemon.
- **AUTOFS_DEV_IOCTL_REQUESTER_CMD**: `path` should be a
name within the filesystem that has been auto-mounted on.
- arg1 is the dev number of the underlying autofs. On successful
- return, `arg1` and `arg2` will be the UID and GID of the process
- which triggered that mount.
-
+ On successful return, `requester.uid` and `requester.gid` will be
+ the UID and GID of the process which triggered that mount.
- **AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD**: Check if path is a
mountpoint of a particular type - see separate documentation for
details.
-
- **AUTOFS_DEV_IOCTL_PROTOVER_CMD**:
- **AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD**:
- **AUTOFS_DEV_IOCTL_READY_CMD**:
@@ -474,7 +483,7 @@ Commands are:
- **AUTOFS_DEV_IOCTL_EXPIRE_CMD**:
- **AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD**: These all have the same
function as the similarly named **AUTOFS_IOC** ioctls, except
- that **FAIL** can be given an explicit error number in `arg1`
+ that **FAIL** can be given an explicit error number in `fail.status`
instead of assuming `ENOENT`, and this **EXPIRE** command
corresponds to **AUTOFS_IOC_EXPIRE_MULTI**.
@@ -512,7 +521,7 @@ always be mounted "shared". e.g.
> `mount --make-shared /autofs/mount/point`
-The automount daemon is only able to mange a single mount location for
+The automount daemon is only able to manage a single mount location for
an autofs filesystem and if mounts on that are not 'shared', other
locations will not behave as expected. In particular access to those
other locations will likely result in the `ELOOP` error
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 95ccbe6d79ce..b4ad97f10b8e 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -376,8 +376,8 @@ max_map_count:
This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
-malloc, directly by mmap and mprotect, and also when loading shared
-libraries.
+malloc, directly by mmap, mprotect, and madvise, and also when loading
+shared libraries.
While most applications need less than a thousand maps, certain
programs, particularly malloc debuggers, may consume lots of them,
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 8f961ef2b457..ba976805853a 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -112,8 +112,8 @@ my $regex_direct_end_default = 'nr_reclaimed=([0-9]*)';
my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
-my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) file=([0-9]*)';
-my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
+my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) classzone_idx=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_skipped=([0-9]*) nr_taken=([0-9]*) lru=([a-z_]*)';
+my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) nr_dirty=([0-9]*) nr_writeback=([0-9]*) nr_congested=([0-9]*) nr_immediate=([0-9]*) nr_activate=([0-9]*) nr_ref_keep=([0-9]*) nr_unmap_fail=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
@@ -205,15 +205,15 @@ $regex_wakeup_kswapd = generate_traceevent_regex(
$regex_lru_isolate = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_isolate",
$regex_lru_isolate_default,
- "isolate_mode", "order",
- "nr_requested", "nr_scanned", "nr_taken",
- "file");
+ "isolate_mode", "classzone_idx", "order",
+ "nr_requested", "nr_scanned", "nr_skipped", "nr_taken",
+ "lru");
$regex_lru_shrink_inactive = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_inactive",
$regex_lru_shrink_inactive_default,
- "nid", "zid",
- "nr_scanned", "nr_reclaimed", "priority",
- "flags");
+ "nid", "nr_scanned", "nr_reclaimed", "nr_dirty", "nr_writeback",
+ "nr_congested", "nr_immediate", "nr_activate", "nr_ref_keep",
+ "nr_unmap_fail", "priority", "flags");
$regex_lru_shrink_active = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_active",
$regex_lru_shrink_active_default,
@@ -381,8 +381,8 @@ EVENT_PROCESS:
next;
}
my $isolate_mode = $1;
- my $nr_scanned = $4;
- my $file = $6;
+ my $nr_scanned = $5;
+ my $file = $8;
# To closer match vmstat scanning statistics, only count isolate_both
# and isolate_inactive as scanning. isolate_active is rotation
@@ -391,7 +391,7 @@ EVENT_PROCESS:
# isolate_both == 3
if ($isolate_mode != 2) {
$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
- if ($file == 1) {
+ if ($file =~ /_file/) {
$perprocesspid{$process_pid}->{HIGH_NR_FILE_SCANNED} += $nr_scanned;
} else {
$perprocesspid{$process_pid}->{HIGH_NR_ANON_SCANNED} += $nr_scanned;
@@ -406,8 +406,8 @@ EVENT_PROCESS:
next;
}
- my $nr_reclaimed = $4;
- my $flags = $6;
+ my $nr_reclaimed = $3;
+ my $flags = $12;
my $file = 0;
if ($flags =~ /RECLAIM_WB_FILE/) {
$file = 1;
diff --git a/Documentation/vm/ksm.txt b/Documentation/vm/ksm.txt
index f34a8ee6f860..6b0ca7feb135 100644
--- a/Documentation/vm/ksm.txt
+++ b/Documentation/vm/ksm.txt
@@ -38,6 +38,10 @@ the range for whenever the KSM daemon is started; even if the range
cannot contain any pages which KSM could actually merge; even if
MADV_UNMERGEABLE is applied to a range which was never MADV_MERGEABLE.
+If a region of memory must be split into at least one new MADV_MERGEABLE
+or MADV_UNMERGEABLE region, the madvise may return ENOMEM if the process
+will exceed vm.max_map_count (see Documentation/sysctl/vm.txt).
+
Like other madvise calls, they are intended for use on mapped areas of
the user address space: they will report ENOMEM if the specified range
includes unmapped gaps (though working on the intervening mapped areas),
@@ -80,6 +84,20 @@ run - set 0 to stop ksmd from running but keep merged pages,
Default: 0 (must be changed to 1 to activate KSM,
except if CONFIG_SYSFS is disabled)
+use_zero_pages - specifies whether empty pages (i.e. allocated pages
+ that only contain zeroes) should be treated specially.
+ When set to 1, empty pages are merged with the kernel
+ zero page(s) instead of with each other as it would
+ happen normally. This can improve the performance on
+ architectures with coloured zero pages, depending on
+ the workload. Care should be taken when enabling this
+ setting, as it can potentially degrade the performance
+ of KSM for some workloads, for example if the checksums
+ of pages candidate for merging match the checksum of
+ an empty page. This setting can be changed at any time,
+ it is only effective for pages merged after the change.
+ Default: 0 (normal KSM behaviour as in earlier releases)
+
The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/:
pages_shared - how many shared pages are being used
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index f2e739545e74..cd28d5ee5273 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -110,6 +110,7 @@ MADV_HUGEPAGE region.
echo always >/sys/kernel/mm/transparent_hugepage/defrag
echo defer >/sys/kernel/mm/transparent_hugepage/defrag
+echo defer+madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo never >/sys/kernel/mm/transparent_hugepage/defrag
@@ -120,10 +121,15 @@ that benefit heavily from THP use and are willing to delay the VM start
to utilise them.
"defer" means that an application will wake kswapd in the background
-to reclaim pages and wake kcompact to compact memory so that THP is
+to reclaim pages and wake kcompactd to compact memory so that THP is
available in the near future. It's the responsibility of khugepaged
to then install the THP pages later.
+"defer+madvise" will enter direct reclaim and compaction like "always", but
+only for regions that have used madvise(MADV_HUGEPAGE); all other regions
+will wake kswapd in the background to reclaim pages and wake kcompactd to
+compact memory so that THP is available in the near future.
+
"madvise" will enter direct reclaim like "always" but only for regions
that are have used madvise(MADV_HUGEPAGE). This is the default behaviour.