summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2006-06-30[PATCH] knfsd: nfsd: mark rqstp to prevent use of sendfile in privacy caseJ. Bruce Fields
Add a rq_sendfile_ok flag to svc_rqst which will be cleared in the privacy case so that the wrapping code will get copies of the read data instead of real page cache pages. This makes life simpler when we encrypt the response. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: svcrpc: Simplify nfsd rpcsec_gss integrity codeJ. Bruce Fields
Pull out some of the integrity code into its own function, otherwise svcauth_gss_release() is going to become very ungainly after the addition of privacy code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix open flag passingJ. Bruce Fields
Since nfsv4 actually keeps around the file descriptors it gets from open (instead of just using them for a single read or write operation), we need to make sure that we can do RDWR opens and not just RDONLY/WRONLY. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix some open argument testsJ. Bruce Fields
These tests always returned true; clearly that wasn't what was intended. In keeping with kernel style, make them functions instead of macros while we're at it. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: svcrpc: gss: simplify rsc_parse()J. Bruce Fields
Adopt a simpler convention for gss_mech_put(), to simplify rsc_parse(). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd: fix misplaced fh_unlock() in nfsd_link()David M. Richter
In the event that lookup_one_len() fails in nfsd_link(), fh_unlock() is skipped and locks are held overlong. Patch was tested on 2.6.17-rc2 by causing lookup_one_len() to fail and verifying that fh_unlock() gets called appropriately. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: remove superfluous grace period checksJ. Bruce Fields
We're checking nfs_in_grace here a few times when there isn't really any reason to--bad_stateid is probably the more sensible return value anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd: call nfsd_setuser() on fh_compose(), fix nfsd4 ↵J. Bruce Fields
permissions problem In the typical v2/v3 case the only new filehandles used as arguments to operations are filehandles taken directly off the wire, which don't get dentries until fh_verify() is called. But in v4 the filehandles that are arguments to operations were often created by previous operations (putrootfh, lookup, etc.) using fh_compose, which sets the dentry in the filehandle without calling nfsd_setuser(). This also means that, for example, if filesystem B is mounted on filesystem A, and filesystem A is exported without root-squashing, then a client can bypass the rootsquashing on B using a compound that starts at a filehandle in A, crosses into B using lookups, and then does stuff in B. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: nfsd4: fix open_confirm lockingJ. Bruce Fields
Fix an improper unlock in an error path. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: ignore ref_fh when crossing a mountpointNeilBrown
nfsd tries to return to a client the same sort of filehandle as was used by the client. This removes some filehandle aliasing issues and means that a server upgrade followed by a downgrade will not confused clients not restarted during that time. However when crossing a mountpoint, the filehandle used for one filesystem doesn't provide any useful information on what sort of filehandle should be used on the other, and can provide misleading information. So if the reference filehandle is on a different filesystem to the one being generated, ignore it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: remove noise about filehandle being uptodateNeilBrown
There is a perfectly valid situation where fh_update gets called on an already uptodate filehandle - in nfsd_create_v3 where a CREATE_UNCHECKED finds an existing file and wants to just set the size. We could possible optimise out the call in that case, but the only harm involved is that fh_update prints a warning, so it is easier to remove the warning. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: fixing missing 'expkey' support for fsid type 3Frank Filz
Type '3' is used for the fsid in filehandles when the device number of the device holding the filesystem has more than 8 bits in either major or minor. Unfortunately expkey_parse doesn't recognise type 3. Fix this. (Slighty modified from Frank's original) Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] knfsd: improve the test for cross-device-rename in nfsdNeilBrown
Just testing the i_sb isn't really enough, at least the vfsmnt must be the same. Thanks Al. Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] EDAC: maintainers updateDoug Thompson
Removed Dave Peterson as per his request as co-maintainer of EDAC Thanks Dave. Added Mark Gross as maintainer of edac-e752x driver Thanks Mark Signed-off-by: Doug Thompson <norsk5@xmission.com> Signed-off-by: Dave Peterson <dsp@llnl.gov> Signed-off-by: Mark Gross <mark.gross@intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] EDAC: probe1 cleanup 1-of-2Doug Thompson
- Add lower-level functions that handle various parts of the initialization done by the xxx_probe1() functions. Some of the xxx_probe1() functions are much too long and complicated (see "Chapter 5: Functions" in Documentation/CodingStyle). - Cleanup of probe1() functions in EDAC Signed-off-by: Doug Thompson <norsk5@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] EDAC: mc numbers refactor 1-of-2Doug Thompson
Remove add_mc_to_global_list(). In next patch, this function will be reimplemented with different semantics. 1 Reimplement add_mc_to_global_list() with semantics that allow the caller to determine the ID number for a mem_ctl_info structure. Then modify edac_mc_add_mc() so that the caller specifies the ID number for the new mem_ctl_info structure. Platform-specific code should be able to assign the ID numbers in a platform-specific manner. For instance, on Opteron it makes sense to have the ID of the mem_ctl_info structure match the ID of the node that the memory controller belongs to. 2 Modify callers of edac_mc_add_mc() so they use the new semantics. Signed-off-by: Doug Thompson <norsk5@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] EDAC: PCI device to DEVICE cleanupDoug Thompson
Change MC drivers from using CVS revision strings for their version number, Now each driver has its own local string. Remove some PCI dependencies from the core EDAC module. Made the code 'struct device' centric instead of 'struct pci_dev' Most of the code changes here are from a patch by Dave Jiang. It may be best to eventually move the PCI-specific code into a separate source file. Signed-off-by: Doug Thompson <norsk5@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Documentation/IPMI typosMatt LaPlante
Two typos in Documentation/IPMI. Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] rcu: Add lock annotations to RCU locking primitivesJosh Triplett
Add __acquire annotations to rcu_read_lock and rcu_read_lock_bh, and add __release annotations to rcu_read_unlock and rcu_read_unlock_bh. This allows sparse to detect improperly paired calls to these functions. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] drivers/cdrom/cm206.c: cleanupsAdrian Bunk
- make __cm206_init() __init (required since it calls the __init cm206_init()) - make the needlessly global bcdbin() static - remove a comment with an obsolete compile command Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] show Acorn-specific block devices menu only when requiredAdrian Bunk
Don't show a menu that can't be entered due to lack of contents on arm (the options are only available on arm26). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Ian Molton <spyro@f2s.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Correct rtc_wkalrm commentsAndrew Victor
This corrects the comments describing the 'enabled' and 'pending' flags in struct rtc_wkalrm of include/linux/rtc.h. Signed-off-by: Andrew Victor <andrew@sanpeople.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] cond_resched() fixAndrew Morton
Fix a bug identified by Zou Nan hai <nanhai.zou@intel.com>: If the system is in state SYSTEM_BOOTING, and need_resched() is true, cond_resched() returns true even though it didn't reschedule. Consequently need_resched() remains true and JBD locks up. Fix that by teaching cond_resched() to only return true if it really did call schedule(). cond_resched_lock() and cond_resched_softirq() have a problem too. If we're in SYSTEM_BOOTING state and need_resched() is true, these functions will drop the lock and will then try to call schedule(), but the SYSTEM_BOOTING state will prevent schedule() from being called. So on return, need_resched() will still be true, but cond_resched_lock() has to return 1 to tell the caller that the lock was dropped. The caller will probably lock up. Bottom line: if these functions dropped the lock, they _must_ call schedule() to clear need_resched(). Make it so. Also, uninline __cond_resched(). It's largeish, and slowpath. Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: add __raw_writeq definitionJeff Dike
The x86_64 build requires a definition for __raw_writeq. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: fix biarch gcc build on x86_64Jeff Dike
I run an x86_64 kernel with i386 userspace (Ubuntu Dapper) and decided to try out UML today. I found that UML wasn't quite aware of biarch compilers (which Ubuntu i386 ships). A fix similar to what was done for x86_64 should probably be committed (see http://marc.theaimsgroup.com/?l=linux-kernel&m=113425940204010&w=2). Without the FLAGS changes, the build will fail at a number of places and without the LINK change, the final link will fail. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: remove stray fileJeff Dike
Forgot to remove arch/um/kernel/time.c when it was mostly moved to arch/um/os-Linux. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: remove unneeded time definitionsJeff Dike
Remove um_time() and um_stime() syscalls since they are identical to system-wide ones. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: add locking to xtime accessesJeff Dike
do_timer must be called with xtime_lock held. I'm not sure boot_timer_handler needs this, however I don't think it hurts: it simply disables irq and takes a spinlock. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: unregister useless console when it's not neededJeff Dike
-mm in combination with an FC5 init started dying with 'stderr=1' because init didn't like the lack of /dev/console and exited. The problem was that the stderr console, which is intended to dump printk output to the terminal before the regular console is initialized, isn't a tty, and so can't make /dev/console operational. However, since it is registered first, the normal console, when it is registered, doesn't become the preferred console, and isn't attached to /dev/console. Thus, /dev/console is never operational. This patch makes the stderr console unregister itself in an initcall, which is late enough that the normal console is registered. When that happens, the normal console will become the preferred console and will be able to run /dev/console. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: fix off-by-one bug in VM file creationJeff Dike
Fix an off-by-one bug in temp file creation. Seeking to the desired length and writing a byte resulted in the file being one byte longer than expected. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] uml: fix /proc/mounts parsing boundary conditionJeff Dike
When parsing /proc/mounts looking for a tmpfs mount on /dev/shm, if a string that we are looking for if split across reads, then it won't be recognized. Fix this by refilling the buffer whenever we advance the cursor. Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] UML: fix the INIT_ENV_ARG_LIMIT dependenciesAdrian Bunk
Fix the INIT_ENV_ARG_LIMIT dependencies to what seems to have been intended. Spotted by Jean-Luc Leger. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] add smp_setup_processor_id()Andrew Morton
Presently, smp_processor_id() isn't necessarily set up until setup_arch(). But it's used in boot_cpu_init() and printk() and perhaps in other places, prior to setup_arch() being called. So provide a new smp_setup_processor_id() which is called before anything else, wire it up for Voyager (which boots on a CPU other than #0, and broke). Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] SELinux: Add security hook definition for getioprio and insert hooksDavid Quigley
Add a new security hook definition for the sys_ioprio_get operation. At present, the SELinux hook function implementation for this hook is identical to the getscheduler implementation but a separate hook is introduced to allow this check to be specialized in the future if necessary. This patch also creates a helper function get_task_ioprio which handles the access check in addition to retrieving the ioprio value for the task. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] SELinux: update USB code with new kill_proc_info_as_uidDavid Quigley
This patch updates the USB core to save and pass the sending task secid when sending signals upon AIO completion so that proper security checking can be applied by security modules. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] SELinux: add security hook call to kill_proc_info_as_uidDavid Quigley
This patch adds a call to the extended security_task_kill hook introduced by the prior patch to the kill_proc_info_as_uid function so that these signals can be properly mediated by security modules. It also updates the existing hook call in check_kill_permission. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] SELinux: extend task_kill hook to handle signals sent by AIO completionDavid Quigley
This patch extends the security_task_kill hook to handle signals sent by AIO completion. In this case, the secid of the task responsible for the signal needs to be obtained and saved earlier, so a security_task_getsecid() hook is added, and then this saved value is passed subsequently to the extended task_kill hook for use in checking. Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] slab: consolidate code to free slabs from freelistChristoph Lameter
Post and discussion: http://marc.theaimsgroup.com/?t=115074342800003&r=1&w=2 Code in __shrink_node() duplicates code in cache_reap() Add a new function drain_freelist that removes slabs with objects that are already free and use that in various places. This eliminates the __node_shrink() function and provides the interrupt holdoff reduction from slab_free to code that used to call __node_shrink. [akpm@osdl.org: build fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Light weight event countersChristoph Lameter
The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Use Zoned VM Counters for NUMA statisticsChristoph Lameter
The numa statistics are really event counters. But they are per node and so we have had special treatment for these counters through additional fields on the pcp structure. We can now use the per zone nature of the zoned VM counters to realize these. This will shrink the size of the pcp structure on NUMA systems. We will have some room to add additional per zone counters that will all still fit in the same cacheline. Bits Prior pcp size Size after patch We can add ------------------------------------------------------------------ 64 128 bytes (16 words) 80 bytes (10 words) 48 32 76 bytes (19 words) 56 bytes (14 words) 8 (64 byte cacheline) 72 (128 byte) Remove the special statistics for numa and replace them with zoned vm counters. This has the side effect that global sums of these events now show up in /proc/vmstat. Also take the opportunity to move the zone_statistics() function from page_alloc.c into vmstat.c. Discussions: V2 http://marc.theaimsgroup.com/?t=115048227000002&r=1&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned-vm-counters: remove read_page_state()Andrew Morton
No callers. Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: remove useless struct wbsChristoph Lameter
Remove writeback state We can remove some functions now that were needed to calculate the page state for writeback control since these statistics are now directly available. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_bounce to per zone counterChristoph Lameter
Conversion of nr_bounce to a per zone counter nr_bounce is only used for proc output. So it could be left as an event counter. However, the event counters may not be accurate and nr_bounce is categorizing types of pages in a zone. So we really need this to also be a per zone counter. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_unstable to per zone counterChristoph Lameter
Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_writeback to per zone counterChristoph Lameter
Conversion of nr_writeback to per zone counter. This removes the last page_state counter from arch/i386/mm/pgtable.c so we drop the page_state from there. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_dirty to per zone counterChristoph Lameter
This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_pagetables to per zone counterChristoph Lameter
Conversion of nr_page_table_pages to a per zone counter [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: conversion of nr_slab to per zone counterChristoph Lameter
- Allows reclaim to access counter without looping over processor counts. - Allows accurate statistics on how many pages are used in a zone by the slab. This may become useful to balance slab allocations over various zones. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: zone_reclaim: remove ↵Christoph Lameter
/proc/sys/vm/zone_reclaim_interval The zone_reclaim_interval was necessary because we were not able to determine how many unmapped pages exist in a zone. Therefore we had to scan in intervals to figure out if any pages were unmapped. With the zoned counters and NR_ANON_PAGES we now know the number of pagecache pages and the number of mapped pages in a zone. So we can simply skip the reclaim if there is an insufficient number of unmapped pages. We use SWAP_CLUSTER_MAX as the boundary. Drop all support for /proc/sys/vm/zone_reclaim_interval. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPEDChristoph Lameter
The current NR_FILE_MAPPED is used by zone reclaim and the dirty load calculation as the number of mapped pagecache pages. However, that is not true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch separates those and therefore allows an accurate tracking of the anonymous pages per zone. It then becomes possible to determine the number of unmapped pages per zone and we can avoid scanning for unmapped pages if there are none. Also it may now be possible to determine the mapped/unmapped ratio in get_dirty_limit. Isnt the number of anonymous pages irrelevant in that calculation? Note that this will change the meaning of the number of mapped pages reported in /proc/vmstat /proc/meminfo and in the per node statistics. This may affect user space tools that monitor these counters! NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>