summaryrefslogtreecommitdiff
path: root/fs/ntfs/attrib.c
AgeCommit message (Collapse)Author
2014-06-04mm: non-atomically mark page accessed during page cache allocation where ↵Mel Gorman
possible aops->write_begin may allocate a new page and make it visible only to have mark_page_accessed called almost immediately after. Once the page is visible the atomic operations are necessary which is noticable overhead when writing to an in-memory filesystem like tmpfs but should also be noticable with fast storage. The objective of the patch is to initialse the accessed information with non-atomic operations before the page is visible. The bulk of filesystems directly or indirectly use grab_cache_page_write_begin or find_or_create_page for the initial allocation of a page cache page. This patch adds an init_page_accessed() helper which behaves like the first call to mark_page_accessed() but may called before the page is visible and can be done non-atomically. The primary APIs of concern in this care are the following and are used by most filesystems. find_get_page find_lock_page find_or_create_page grab_cache_page_nowait grab_cache_page_write_begin All of them are very similar in detail to the patch creates a core helper pagecache_get_page() which takes a flags parameter that affects its behavior such as whether the page should be marked accessed or not. Then old API is preserved but is basically a thin wrapper around this core function. Each of the filesystems are then updated to avoid calling mark_page_accessed when it is known that the VM interfaces have already done the job. There is a slight snag in that the timing of the mark_page_accessed() has now changed so in rare cases it's possible a page gets to the end of the LRU as PageReferenced where as previously it might have been repromoted. This is expected to be rare but it's worth the filesystem people thinking about it in case they see a problem with the timing change. It is also the case that some filesystems may be marking pages accessed that previously did not but it makes sense that filesystems have consistent behaviour in this regard. The test case used to evaulate this is a simple dd of a large file done multiple times with the file deleted on each iterations. The size of the file is 1/10th physical memory to avoid dirty page balancing. In the async case it will be possible that the workload completes without even hitting the disk and will have variable results but highlight the impact of mark_page_accessed for async IO. The sync results are expected to be more stable. The exception is tmpfs where the normal case is for the "IO" to not hit the disk. The test machine was single socket and UMA to avoid any scheduling or NUMA artifacts. Throughput and wall times are presented for sync IO, only wall times are shown for async as the granularity reported by dd and the variability is unsuitable for comparison. As async results were variable do to writback timings, I'm only reporting the maximum figures. The sync results were stable enough to make the mean and stddev uninteresting. The performance results are reported based on a run with no profiling. Profile data is based on a separate run with oprofile running. async dd 3.15.0-rc3 3.15.0-rc3 vanilla accessed-v2 ext3 Max elapsed 13.9900 ( 0.00%) 11.5900 ( 17.16%) tmpfs Max elapsed 0.5100 ( 0.00%) 0.4900 ( 3.92%) btrfs Max elapsed 12.8100 ( 0.00%) 12.7800 ( 0.23%) ext4 Max elapsed 18.6000 ( 0.00%) 13.3400 ( 28.28%) xfs Max elapsed 12.5600 ( 0.00%) 2.0900 ( 83.36%) The XFS figure is a bit strange as it managed to avoid a worst case by sheer luck but the average figures looked reasonable. samples percentage ext3 86107 0.9783 vmlinux-3.15.0-rc4-vanilla mark_page_accessed ext3 23833 0.2710 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed ext3 5036 0.0573 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed ext4 64566 0.8961 vmlinux-3.15.0-rc4-vanilla mark_page_accessed ext4 5322 0.0713 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed ext4 2869 0.0384 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed xfs 62126 1.7675 vmlinux-3.15.0-rc4-vanilla mark_page_accessed xfs 1904 0.0554 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed xfs 103 0.0030 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed btrfs 10655 0.1338 vmlinux-3.15.0-rc4-vanilla mark_page_accessed btrfs 2020 0.0273 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed btrfs 587 0.0079 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed tmpfs 59562 3.2628 vmlinux-3.15.0-rc4-vanilla mark_page_accessed tmpfs 1210 0.0696 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed tmpfs 94 0.0054 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed [akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer] Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Tested-by: Prabhakar Lad <prabhakar.csengg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-20ntfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-02-22NTFS: Do not dereference pointer before checking for NULL.Anton Altaparmakov
Found by Coverity software (http://scan.coverity.com). Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2007-11-03NTFS: Fix read regression.Anton Altaparmakov
The regression was caused by: commit[a32ea1e1f925399e0d81ca3f7394a44a6dafa12c] Fix read/truncate race This causes ntfs_readpage() to be called for a zero i_size inode, which failed when the file was compressed and non-resident. Thanks a lot to Mike Galbraith for reporting the issue and tracking down the commit that caused the regression. Looking into it I found three bugs which the patch fixes. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12NTFS: Fix a mount time deadlock.Anton Altaparmakov
Big thanks go to Mathias Kolehmainen for reporting the bug, providing debug output and testing the patches I sent him to get it working. The fix was to stop calling ntfs_attr_set() at mount time as that causes balance_dirty_pages_ratelimited() to be called which on systems with little memory actually tries to go and balance the dirty pages which tries to take the s_umount semaphore but because we are still in fill_super() across which the VFS holds s_umount for writing this results in a deadlock. We now do the dirty work by hand by submitting individual buffers. This has the annoying "feature" that mounting can take a few seconds if the journal is large as we have clear it all. One day someone should improve on this by deferring the journal clearing to a helper kernel thread so it can be done in the background but I don't have time for this at the moment and the current solution works fine so I am leaving it like this for now. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07mm: make read_cache_page synchronousNick Piggin
Ensure pages are uptodate after returning from read_cache_page, which allows us to cut out most of the filesystem-internal PageUptodate calls. I didn't have a great look down the call chains, but this appears to fixes 7 possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] NTFS: rename incorrect check of NTFS_DEBUG with just DEBUGRobert P. J. Day
Replace the incorrect debugging check of "#ifdef NTFS_DEBUG" with just "#ifdef DEBUG". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-12-07[PATCH] slab: remove SLAB_NOFSChristoph Lameter
SLAB_NOFS is an alias of GFP_NOFS. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] fs/ntfs: Conversion to generic booleanRichard Knutsson
Conversion of booleans to: generic-boolean.patch (2006-08-23) Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] read_mapping_page for address spacePekka Enberg
Add read_mapping_page() which is used for callers that pass mapping->a_ops->readpage as the filler for read_cache_page. This removes some duplication from filesystem code. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23NTFS: Remove all the make_bad_inode() calls. This should only be calledAnton Altaparmakov
from read inode and new inode code paths. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23NTFS: Add support for sparse files which have a compression unit of 0.Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23NTFS: Fix a buggette in an "should be impossible" case handling where weAnton Altaparmakov
continued the attribute lookup loop instead of aborting it. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-01-09[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_semJes Sorensen
This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different. Modified-by: Ingo Molnar <mingo@elte.hu> (finished the conversion) Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2005-10-24NTFS: Fix compilation warnings with gcc-4.0.2 on SUSE 10.0.Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-19NTFS: $EA attributes can be both resident non-resident.Anton Altaparmakov
Minor tidying. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-11NTFS: In attrib.c::ntfs_attr_set() call balance_dirty_pages_ratelimited()Anton Altaparmakov
and cond_resched() in the main loop as we could be dirtying a lot of pages and this ensures we play nice with the VM and the system as a whole. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Add fs/ntfs/attrib.[hc]::ntfs_attr_extend_allocation(), a function toAnton Altaparmakov
extend the allocation of an attributes. Optionally, the data size, but not the initialized size can be extended, too. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Fix ntfs_attr_make_non_resident() to update the vfs inode i_blocksAnton Altaparmakov
which is zero for a resident attribute but should no longer be zero once the attribute is non-resident as it then has real clusters allocated. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Change ntfs_attr_make_non_resident to take the attribute value sizeAnton Altaparmakov
as an extra parameter. This is needed since we need to know the size before we can map the mft record and our callers always know it. The reason we cannot simply read the size from the vfs inode i_size is that this is not necessarily uptodate. This happens when ntfs_attr_make_non_resident() is called in the ->truncate call path. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: - Change ntfs_cluster_alloc() to take an extra boolean parameterAnton Altaparmakov
specifying whether the cluster are being allocated to extend an attribute or to fill a hole. - Change ntfs_attr_make_non_resident() to call ntfs_cluster_alloc() with @is_extension set to TRUE and remove the runlist terminator fixup code as this is now done by ntfs_cluster_alloc(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Change ntfs_attr_find_vcn_nolock() to also take an optional attributeAnton Altaparmakov
search context as argument. This allows calling it with the mft record mapped. Update all callers. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Change ntfs_map_runlist_nolock() to also take an optional attributeAnton Altaparmakov
search context. This allows calling it with the mft record mapped. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix handling of sparse attributes in ntfs_attr_make_non_resident().Anton Altaparmakov
Also, add BUG() checks to ntfs_attr_make_non_resident() and ntfs_attr_set() to ensure that these functions are never called for compressed or encrypted attributes. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix several bugs in fs/ntfs/attrib.c.Anton Altaparmakov
- Fix a bug in ntfs_map_runlist_nolock() where we forgot to protect access to the allocated size in the ntfs inode with the size lock. - Fix ntfs_attr_vcn_to_lcn_nolock() and ntfs_attr_find_vcn_nolock() to return LCN_ENOENT when there is no runlist and the allocated size is zero. - Fix load_attribute_list() to handle the case of a NULL runlist. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Add fs/ntfs/attrib.[hc]::ntfs_resident_attr_value_resize().Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25NTFS: Prepare for 2.1.23 release: Update documentation and bump version.Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25NTFS: Change ntfs_map_runlist_nolock() to only decompress the mapping pairsAnton Altaparmakov
if the requested vcn is inside it. Otherwise we get into problems when we try to map an out of bounds vcn because we then try to map the already mapped runlist fragment which causes ntfs_mapping_pairs_decompress() to fail and return error. Update ntfs_attr_find_vcn_nolock() accordingly. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25NTFS: Add an extra parameter @last_vcn to ntfs_get_size_for_mapping_pairs()Anton Altaparmakov
and ntfs_mapping_pairs_build() to allow the runlist encoding to be partial which is desirable when filling holes in sparse attributes. Update all callers. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25NTFS: Change the runlist terminator of the newly allocated cluster(s) toAnton Altaparmakov
LCN_ENOENT in ntfs_attr_make_non_resident(). Otherwise the runlist code gets confused. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-27NTFS: Use C99 style structure initialization after memory allocation whereAnton Altaparmakov
possible (fs/ntfs/{attrib.c,index.c,super.c}). Thanks to Al Viro and Pekka Enberg. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Update attribute definition handling.Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Fix compilation when configured read-only.Anton Altaparmakov
- Add ifdef NTFS_RW around write specific code if fs/ntfs/runlist.[hc] and fs/ntfs/attrib.[hc]. - Minor bugfix to fs/ntfs/attrib.c::ntfs_attr_make_non_resident() where the runlist was not freed in all error cases. - Add fs/ntfs/runlist.[hc]::ntfs_rl_find_vcn_nolock(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Include linux/swap.h in fs/ntfs/attrib.c for mark_page_accessed().Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: - Modify ->readpage and ->writepage (fs/ntfs/aops.c) so they detectAnton Altaparmakov
and handle the case where an attribute is converted from resident to non-resident by a concurrent file write. - Reorder some operations when converting an attribute from resident to non-resident (fs/ntfs/attrib.c) so it is safe wrt concurrent ->readpage and ->writepage. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Add fs/ntfs/attrib.[hc]::ntfs_attr_make_non_resident().Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: - Fix bug in fs/ntfs/attrib.c::ntfs_find_vcn_nolock() where afterAnton Altaparmakov
dropping the read lock and taking the write lock we were not checking whether someone else did not already do the work we wanted to do. - Rename ntfs_find_vcn_nolock() to ntfs_attr_find_vcn_nolock(). - Tidy up some comments in fs/ntfs/runlist.c. - Add LCN_ENOMEM and LCN_EIO definitions to fs/ntfs/runlist.h. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Add fs/ntfs/attrib.[hc]::ntfs_attr_vcn_to_lcn_nolock() used by the newAnton Altaparmakov
write code. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: Add AT_EA in addition to AT_DATA to whitelist for being allowed to beAnton Altaparmakov
non-resident in fs/ntfs/attrib.c::ntfs_attr_can_be_non_resident(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-05NTFS: - Split ntfs_map_runlist() into ntfs_map_runlist() and a non-lockingAnton Altaparmakov
helper ntfs_map_runlist_nolock() which is used by ntfs_map_runlist(). This allows us to map runlist fragments with the runlist lock already held without having to drop and reacquire it around the call. Adapt all callers. - Change ntfs_find_vcn() to ntfs_find_vcn_nolock() which takes a locked runlist. This allows us to find runlist elements with the runlist lock already held without having to drop and reacquire it around the call. Adapt all callers. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-05-04NTFS: Use i_size_read() in fs/ntfs/attrib.c::ntfs_attr_set().Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!