summaryrefslogtreecommitdiff
path: root/net/9p/trans_virtio.c
AgeCommit message (Collapse)Author
2018-05-10net/9p: correct some comment errors in 9p file system codeSun Lianwen
There are follow comment errors: 1 The function name is wrong in p9_release_pages() comment. 2 The function name and variable name is wrong in p9_poll_workfn() comment. 3 There is no variable dm_mr and lkey in struct p9_trans_rdma. 4 The function name is wrong in rdma_create_trans() comment. 5 There is no variable initialized in struct virtio_chan. 6 The variable name is wrong in p9_virtio_zc_request() comment. Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-099p/trans_virtio: discard zero-length replyGreg Kurz
When a 9p request is successfully flushed, the server is expected to just mark it as used without sending a 9p reply (ie, without writing data into the buffer). In this case, virtqueue_get_buf() will return len == 0 and we must not report a REQ_STATUS_RCVD status to the client, otherwise the client will erroneously assume the request has not been flushed. Cc: stable@vger.kernel.org Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-23net/9p: Switch to wait_event_killable()Tuomas Tynkkynen
Because userspace gets Very Unhappy when calls like stat() and execve() return -EINTR on 9p filesystem mounts. For instance, when bash is looking in PATH for things to execute and some SIGCHLD interrupts stat(), bash can throw a spurious 'command not found' since it doesn't retry the stat(). In practice, hitting the problem is rare and needs a really slow/bogged down 9p server. Cc: stable@vger.kernel.org Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-08-099p/trans_virtio: use kvfree() for iov_iter_get_pages_alloc()Vegard Nossum
The memory allocated by iov_iter_get_pages_alloc() can be allocated with vmalloc() if kmalloc() failed -- see get_pages_array(). In that case we need to free it with vfree(), so let's use kvfree(). The bug manifests like this: BUG: unable to handle kernel paging request at ffffeb0400072da0 IP: [<ffffffff8139c67b>] kfree+0x4b/0x140 PGD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 675 Comm: trinity-c2 Not tainted 4.7.0-rc7+ #14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800badef2c0 ti: ffff880069208000 task.ti: ffff880069208000 RIP: 0010:[<ffffffff8139c67b>] [<ffffffff8139c67b>] kfree+0x4b/0x140 RSP: 0000:ffff88006920f3f0 EFLAGS: 00010282 RAX: ffffea0000000000 RBX: ffffc90001cb6000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffffc90001cb6000 RBP: ffff88006920f410 R08: 0000000000000000 R09: dffffc0000000000 R10: ffff8800badefa30 R11: 0000056a3d3b0d9f R12: ffff88006920f620 R13: ffffeb0400072d80 R14: ffff8800baa94078 R15: 0000000000000000 FS: 00007fbd2b437700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffeb0400072da0 CR3: 000000006926d000 CR4: 00000000000006e0 Stack: 0000000000000001 ffff88006920f620 ffffed001755280f ffff8800baa94078 ffff88006920f6a8 ffffffff8310442b dffffc0000000000 ffff8800badefa30 ffff8800badefa28 ffff88011af1fba0 1ffff1000d241e98 ffff8800ba892150 Call Trace: [<ffffffff8310442b>] p9_virtio_zc_request+0x72b/0xdb0 [<ffffffff830f2116>] p9_client_zc_rpc.constprop.8+0x246/0xb10 [<ffffffff830f5d79>] p9_client_read+0x4c9/0x750 [<ffffffff8175ceac>] v9fs_fid_readpage+0x14c/0x320 [<ffffffff8175d0b6>] v9fs_vfs_readpage+0x36/0x50 [<ffffffff812c6f13>] filemap_fault+0x9a3/0xe60 [<ffffffff81331878>] __do_fault+0x158/0x300 [<ffffffff81339e01>] handle_mm_fault+0x1cf1/0x3c80 [<ffffffff810c0aaa>] __do_page_fault+0x30a/0x8e0 [<ffffffff810c10df>] do_page_fault+0x2f/0x80 [<ffffffff810b5b07>] do_async_page_fault+0x27/0xa0 [<ffffffff83296c48>] async_page_fault+0x28/0x30 Code: 00 80 41 54 53 49 01 fd 48 0f 42 05 b0 39 67 02 48 89 fb 49 01 c5 48 b8 00 00 00 00 00 ea ff ff 49 c1 ed 0c 49 c1 e5 06 49 01 c5 <49> 8b 45 20 48 8d 50 ff a8 01 4c 0f 45 ea 49 8b 55 20 48 8d 42 RIP [<ffffffff8139c67b>] kfree+0x4b/0x140 RSP <ffff88006920f3f0> CR2: ffffeb0400072da0 ---[ end trace f3d59a04bafec038 ]--- Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-01-24Merge tag 'for-linus-4.5-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: "Sorry for the last minute pull request, there's was a change that didn't get pulled into for-next until two weeks ago and I wanted to give it some bake time. Summary: Rework and error handling fixes, primarily in the fscatch and fd transports" * tag 'for-linus-4.5-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: use fscache mutex rather than spinlock 9p: trans_fd, bail out if recv fcall if missing 9p: trans_fd, read rework to use p9_parse_header net/9p: Add device name details on error
2016-01-04... and a couple in net/9pAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-069p/trans_virtio: don't bother with p9_tag_lookup()Al Viro
Just store the pointer to req instead of that to req->tc as opaque data. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-14net/9p: Add device name details on errorAneesh Kumar K.V
If we use wrong device name 9p mount fails with error "9pnet_virtio: no channels available" Improve the error output as below "9pnet_virtio: no channels available for device /dev/root" Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2015-07-139p/trans_virtio: reset virtio device on removePierre Morel
On device shutdown/removal, virtio drivers need to trigger a reset on the device; if this is neglected, the virtio core will complain about non-zero device status. This patch resets the status when the 9p virtio driver is removed from the system by calling vdev->config->reset on the virtio_device to send a reset to the host virtio device. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-04-18Merge tag 'for-linus-4.1-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9pfs updates from Eric Van Hensbergen: "Some accumulated cleanup patches for kerneldoc and unused variables as well as some lock bug fixes and adding privateport option for RDMA" * tag 'for-linus-4.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: add a privport option for RDMA transport. fs/9p: Initialize status in v9fs_file_do_lock. net/9p: Initialize opts->privport as it should be. net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show() 9p: use unsigned integers for nwqid/count 9p: do not crash on unknown lock status code 9p: fix error handling in v9fs_file_do_lock 9p: remove unused variable in p9_fd_create() 9p: kerneldoc warning fixes
2015-04-11net/9p: switch the guts of p9_client_{read,write}() to iov_iterAl Viro
... and have get_user_pages_fast() mapping fewer pages than requested to generate a short read/write. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-20net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show()Andrey Ryabinin
p9_mount_tag_show() uses '%s' format string to print non-NULL terminated chan->tag string. This leads to out of bounds memory read, because format '%s' implies that string is NULL-terminated. The length of string is know here, so its simpler and safer to use memcpy instead of snprintf(). Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2015-03-139p/trans_virtio: fix hot-unplugMichael S. Tsirkin
On device hot-unplug, 9p/virtio currently will kfree channel while it might still be in use. Of course, it might stay used forever, so it's an extremely ugly hack, but it seems better than use-after-free that we have now. [ Unused variable removed, whitespace cleanup, msg single-lined --RR ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-01-21virtio/9p: verify device has config spaceMichael S. Tsirkin
Some devices might not implement config space access (e.g. remoteproc used not to - before 3.9). virtio/9p needs config space access so make it fail gracefully if not there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-159p/trans_virtio: enable VQs earlyMichael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, but virtio 9p device adds self to channel list within probe, at which point VQ can be used in violation of the spec. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-04-11Merge tag 'for-linus-3.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p changes from Eric Van Hensbergen: "A bunch of updates and cleanup within the transport layer, particularly with a focus on RDMA" * tag 'for-linus-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9pnet_rdma: check token type before int conversion 9pnet: trans_fd : allocate struct p9_trans_fd and struct p9_conn together. 9pnet: p9_client->conn field is unused. Remove it. 9P: Get rid of REQ_STATUS_FLSH 9pnet_rdma: add cancelled() 9pnet_rdma: update request status during send 9P: Add cancelled() to the transport functions. net: Mark function as static in 9p/client.c 9P: Add memory barriers to protect request fields over cb/rpc threads handoff
2014-03-259P: Add memory barriers to protect request fields over cb/rpc threads handoffDominique Martinet
We need barriers to guarantee this pattern works as intended: [w] req->rc, 1 [r] req->status, 1 wmb rmb [w] req->status, 1 [r] req->rc Where the wmb ensures that rc gets written before status, and the rmb ensures that if you observe status == 1, rc is the new value. Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-02-109p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffersRichard Yao
The 9p-virtio transport does zero copy on things larger than 1024 bytes in size. It accomplishes this by returning the physical addresses of pages to the virtio-pci device. At present, the translation is usually a bit shift. That approach produces an invalid page address when we read/write to vmalloc buffers, such as those used for Linux kernel modules. Any attempt to load a Linux kernel module from 9p-virtio produces the following stack. [<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510 [<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0 [<ffffffff814839dd>] p9_client_read+0x15d/0x240 [<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0 [<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20 [<ffffffff811c84e7>] v9fs_file_read+0x37/0x70 [<ffffffff8114e3fb>] vfs_read+0x9b/0x160 [<ffffffff81153571>] kernel_read+0x41/0x60 [<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180 Subsequently, QEMU will die printing: qemu-system-x86_64: virtio: trying to map MMIO memory This patch enables 9p-virtio to correctly handle this case. This not only enables us to load Linux kernel modules off virtfs, but also enables ZFS file-based vdevs on virtfs to be used without killing QEMU. Special thanks to both Avi Kivity and Alexander Graf for their interpretation of QEMU backtraces. Without their guidence, tracking down this bug would have taken much longer. Also, special thanks to Linus Torvalds for his insightful explanation of why this should use is_vmalloc_addr() instead of is_vmalloc_or_module_addr(): https://lkml.org/lkml/2014/2/8/272 Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-23net/9p: remove virtio default hack and set appropriate bits insteadEric Van Hensbergen
A few releases back a patch made virtio the default transport, however it was done in a way which side-stepped the mechanism put in place to allow for this selection. This patch cleans that up while maintaining virtio as the default transport. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-10-17virtio: use size-based config accessors.Rusty Russell
This lets the transport do endian conversion if necessary, and insulates the drivers from the difference. Most drivers can use the simple helpers virtio_cread() and virtio_cwrite(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-08-269p: send uevent after adding/removing mount_tag attributeMichael Marineau
This driver adds an attribute to the existing virtio device so a CHANGE event is required in order udev rules to make use of it. The ADD event happens before this driver is probed and unlike a more typical driver like a block device there isn't a higher level device to watch for. Signed-off-by: Michael Marineau <michael.marineau@coreos.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-05-02Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio & lguest updates from Rusty Russell: "Lots of virtio work which wasn't quite ready for last merge window. Plus I dived into lguest again, reworking the pagetable code so we can move the switcher page: our fixmaps sometimes take more than 2MB now..." Ugh. Annoying conflicts with the tcm_vhost -> vhost_scsi rename. Hopefully correctly resolved. * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (57 commits) caif_virtio: Remove bouncing email addresses lguest: improve code readability in lg_cpu_start. virtio-net: fill only rx queues which are being used lguest: map Switcher below fixmap. lguest: cache last cpu we ran on. lguest: map Switcher text whenever we allocate a new pagetable. lguest: don't share Switcher PTE pages between guests. lguest: expost switcher_pages array (as lg_switcher_pages). lguest: extract shadow PTE walking / allocating. lguest: make check_gpte et. al return bool. lguest: assume Switcher text is a single page. lguest: rename switcher_page to switcher_pages. lguest: remove RESERVE_MEM constant. lguest: check vaddr not pgd for Switcher protection. lguest: prepare to make SWITCHER_ADDR a variable. virtio: console: replace EMFILE with EBUSY for already-open port virtio-scsi: reset virtqueue affinity when doing cpu hotplug virtio-scsi: introduce multiqueue support virtio-scsi: push vq lock/unlock into virtscsi_vq_done virtio-scsi: pass struct virtio_scsi to virtqueue completion function ...
2013-03-209p/trans_virtio.c: use virtio_add_sgs[]Rusty Russell
virtio_add_buf() is going away, replaced with virtio_add_sgs() which takes multiple terminated scatterlists. Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-08Revert parts of "hlist: drop the node parameter from iterators"Arnd Bergmann
Commit b67bfe0d42ca ("hlist: drop the node parameter from iterators") did a lot of nice changes but also contains two small hunks that seem to have slipped in accidentally and have no apparent connection to the intent of the patch. This reverts the two extraneous changes. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Senna Tschudin <peter.senna@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hlist: drop the node parameter from iteratorsSasha Levin
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23net: change type of virtio_chan->p9_max_pagesZhang Yanfei
This member of struct virtio_chan is calculated from nr_free_buffer_pages so change its type to unsigned long in case of overflow. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: David Miller <davem@davemloft.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-22virtio: 9p: correctly pass physical address to userspace for high pagesWill Deacon
When using a virtio transport, the 9p net device may pass the physical address of a kernel buffer to userspace via a scatterlist inside a virtqueue. If the kernel buffer is mapped outside of the linear mapping (e.g. highmem), then virt_to_page will return a bogus value and we will populate the scatterlist with junk. This patch uses kmap_to_page when populating the page array for a kernel buffer. Cc: stable@kernel.org Cc: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-119p: BUG before corrupting memorySasha Levin
The BUG_ON() in pack_sg_list() would get triggered only one time after we've corrupted some memory by sg_set_buf() into an invalid sg buffer. I'm still working on figuring out why I manage to trigger that bug... Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2012-05-229p: disconnect channel when PCI device is removedSasha Levin
When a virtio_9p pci device is being removed, we should close down any active channels and free up resources, we're not supposed to BUG() if there's still an open channel since it's a valid case when removing the PCI device. Otherwise, removing the PCI device with an open channel would cause the following BUG(): [ 1184.671416] ------------[ cut here ]------------ [ 1184.672057] kernel BUG at net/9p/trans_virtio.c:618! [ 1184.672057] invalid opcode: 0000 [#1] PREEMPT SMP [ 1184.672057] CPU 3 [ 1184.672057] Pid: 5, comm: kworker/u:0 Tainted: G W 3.4.0-rc2-next-20120413-sasha-dirty #76 [ 1184.672057] RIP: 0010:[<ffffffff825c9116>] [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90 [ 1184.672057] RSP: 0018:ffff88000d653ac0 EFLAGS: 00010202 [ 1184.672057] RAX: ffffffff836bfb40 RBX: ffff88000c9b2148 RCX: ffff88000d658978 [ 1184.672057] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff880028868000 [ 1184.672057] RBP: ffff88000d653ad0 R08: 0000000000000000 R09: 0000000000000000 [ 1184.672057] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880028868000 [ 1184.672057] R13: ffffffff835aa7c0 R14: ffff880041630000 R15: ffff88000d653da0 [ 1184.672057] FS: 0000000000000000(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000 [ 1184.672057] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1184.672057] CR2: 0000000001181000 CR3: 000000000eba1000 CR4: 00000000000406e0 [ 1184.672057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 x000000000117a190 *[ 1184.672057] DR3: 00000000000000** 00 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1184.672057] Process kworker/u:0 (pid: 5, threadinfo ffff88000d652000, task ffff88000d658000) [ 1184.672057] Stack: [ 1184.672057] ffff880028868000 ffffffff836bfb40 ffff88000d653af0 ffffffff8193661b [ 1184.672057] ffff880028868008 ffffffff836bfb40 ffff88000d653b10 ffffffff81af1c81 [ 1184.672057] ffff880028868068 ffff880028868008 ffff88000d653b30 ffffffff81af257a [ 1184.795301] Call Trace: [ 1184.795301] [<ffffffff8193661b>] virtio_dev_remove+0x1b/0x60 [ 1184.795301] [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0 [ 1184.795301] [<ffffffff81af257a>] device_release_driver+0x2a/0x40 [ 1184.795301] [<ffffffff81af0d48>] bus_remove_device+0x138/0x150 [ 1184.795301] [<ffffffff81aef08d>] device_del+0x14d/0x1b0 [ 1184.795301] [<ffffffff81aef138>] device_unregister+0x48/0x60 [ 1184.795301] [<ffffffff8193694d>] unregister_virtio_device+0xd/0x10 [ 1184.795301] [<ffffffff8265fc74>] virtio_pci_remove+0x2a/0x6c [ 1184.795301] [<ffffffff818a95ad>] pci_device_remove+0x4d/0x110 [ 1184.795301] [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0 [ 1184.795301] [<ffffffff81af257a>] device_release_driver+0x2a/0x40 [ 1184.795301] [<ffffffff81af0d48>] bus_remove_device+0x138/0x150 [ 1184.795301] [<ffffffff81aef08d>] device_del+0x14d/0x1b0 [ 1184.795301] [<ffffffff81aef138>] device_unregister+0x48/0x60 [ 1184.795301] [<ffffffff818a36fa>] pci_stop_bus_device+0x6a/0x90 [ 1184.795301] [<ffffffff818a3791>] pci_stop_and_remove_bus_device+0x11/0x20 [ 1184.795301] [<ffffffff818c21d9>] remove_callback+0x9/0x10 [ 1184.795301] [<ffffffff81252d91>] sysfs_schedule_callback_work+0x21/0x60 [ 1184.795301] [<ffffffff810cb1a1>] process_one_work+0x281/0x430 [ 1184.795301] [<ffffffff810cb140>] ? process_one_work+0x220/0x430 [ 1184.795301] [<ffffffff81252d70>] ? sysfs_read_file+0x1c0/0x1c0 [ 1184.795301] [<ffffffff810cc613>] worker_thread+0x1f3/0x320 [ 1184.795301] [<ffffffff810cc420>] ? manage_workers.clone.13+0x130/0x130 [ 1184.795301] [<ffffffff810d30b2>] kthread+0xb2/0xc0 [ 1184.795301] [<ffffffff826783f4>] kernel_thread_helper+0x4/0x10 [ 1184.795301] [<ffffffff810deb18>] ? finish_task_switch+0x78/0xf0 [ 1184.795301] [<ffffffff82676574>] ? retint_restore_args+0x13/0x13 [ 1184.795301] [<ffffffff810d3000>] ? kthread_flush_work_fn+0x10/0x10 [ 1184.795301] [<ffffffff826783f0>] ? gs_change+0x13/0x13 [ 1184.795301] Code: c1 9e 0a 00 48 83 c4 08 5b c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 9f a8 04 00 00 80 3b 00 74 0a <0f> 0b 0f 1f 84 00 00 00 00 00 48 8b 87 88 04 00 00 ff 50 30 31 [ 1184.795301] RIP [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90 [ 1184.795301] RSP <ffff88000d653ac0> [ 1184.952618] ---[ end trace a307b3ed40206b4c ]--- Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-12virtio: rename virtqueue_add_buf_gfp to virtqueue_add_bufRusty Russell
Remove wrapper functions. This makes the allocation type explicit in all callers; I used GPF_KERNEL where it seemed obvious, left it at GFP_ATOMIC otherwise. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Christoph Hellwig <hch@lst.de>
2012-01-059p: Reduce object size with CONFIG_NET_9P_DEBUGJoe Perches
Reduce object size by deduplicating formats. Use vsprintf extension %pV. Rename P9_DPRINTK uses to p9_debug, align arguments. Add function for _p9_debug and macro to add __func__. Add missing "\n"s to p9_debug uses. Remove embedded function names as p9_debug adds it. Remove P9_EPRINTK macro and convert use to pr_<level>. Add and use pr_fmt and pr_<level>. $ size fs/9p/built-in.o* text data bss dec hex filename 62133 984 16000 79117 1350d fs/9p/built-in.o.new 67342 984 16928 85254 14d06 fs/9p/built-in.o.old $ size net/9p/built-in.o* text data bss dec hex filename 88792 4148 22024 114964 1c114 net/9p/built-in.o.new 94072 4148 23232 121452 1da6c net/9p/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-10-24fs/9p: Update zero-copy implementation in 9pAneesh Kumar K.V
* remove lot of update to different data structure * add a seperate callback for zero copy request. * above makes non zero copy code path simpler * remove conditionalizing TREAD/TREADDIR/TWRITE in the zero copy path * Fix the dotu p9_check_errors with zero copy. Add sufficient doc around * Add support for both in and output buffers in zero copy callback * pin and unpin pages in the same context * use helpers instead of defining page offset and rest of page ourself * Fix mem leak in p9_check_errors * Remove 'E' and 'F' in p9pdu_vwritef Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-09-06net/9p: Fix kernel crash with msize 512KAneesh Kumar K.V
With msize equal to 512K (PAGE_SIZE * VIRTQUEUE_NUM), we hit multiple crashes. This patch fix those. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-07-23VirtIO can transfer VIRTQUEUE_NUM of pages.jvrao
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-07-23Fix the size of receive buffer packing onto VirtIO ring.jvrao
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-04-159p: Fix sparse errorAneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22[net/9p]: Introduce basic flow-control for VirtIO transport.Venkateswararao Jujjuri (JV)
Recent zerocopy work in the 9P VirtIO transport maps and pins user buffers into kernel memory for the server to work on them. Since the user process can initiate this kind of pinning with a simple read/write call, thousands of IO threads initiated by the user process can hog the system resources and could result into denial of service. This patch introduces flow control to avoid that extreme scenario. The ceiling limit to avoid denial of service attacks is set to relatively high (nr_free_pagecache_pages()/4) so that it won't interfere with regular usage, but can step in extreme cases to limit the total system hang. Since we don't have a global structure to accommodate this variable, I choose the virtio_chan as the home for this. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22[net/9p] Don't re-pin pages on retrying virtqueue_add_buf().Venkateswararao Jujjuri (JV)
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22[net/9p] Set the condition just before waking up.Venkateswararao Jujjuri (JV)
Given that the sprious wake-ups are common, we need to move the condition setting right next to the wake_up(). After setting the condition to req->status = REQ_STATUS_RCVD, sprious wakeups may cause the virtqueue back on the free list for someone else to use. This may result in kernel panic while relasing the pinned pages in p9_release_req_pages(). Also rearranged the while loop in req_done() for better redability. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22[net/9p] unconditional wake_up to proc waiting for space on VirtIO ringVenkateswararao Jujjuri (JV)
Process may wait to get space on VirtIO ring to send a transaction to VirtFS server. Current code just does a conditional wake_up() which means only one process will be woken up even if multiple processes are waiting. This fix makes the wake_up unconditional. Hence we won't have any processes waiting for-ever. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Add preferences to transport layer.Venkateswararao Jujjuri (JV)
This patch adds preferences field to the p9_trans_module. Through this, now transport layer can express its preference about the payload. i.e if payload neds to be part of the PDU or it prefers it to be sent sepearetly so that the transport layer can handle it in a better way. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15[net/9p] Add gup/zero_copy support to VirtIO transport layer.Venkateswararao Jujjuri (JV)
Modify p9_virtio_request() and req_done() functions to support additional payload sent down to the transport layer through tc->pubuf and tc->pkbuf. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28net/9p: Add waitq to VirtIO transport.Venkateswararao Jujjuri (JV)
If there is not enough space for the PDU on the VirtIO ring, current code returns -EIO propagating the error to user. This patch introduced a wqit_queue on the channel, and lets the process wait on this queue until VirtIO ring frees up. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28[net/9p]Serialize virtqueue operations to make VirtIO transport SMP safe.Venkateswararao Jujjuri (JV)
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-09-27net/9p: Mount only matching virtio channelsSven Eckelmann
p9_virtio_create will only compare the the channel's tag characters against the device name till the end of the channel's tag but not till the end of the device name. This means that if a user defines channels with the tags foo and foobar then he would mount foo when he requested foonot and may mount foo when he requested foobar. Thus it is necessary to check both string lengths against each other in case of a successful partial string match. Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-19trans_virtio: use virtqueue_xxx wrappersMichael S. Tsirkin
Switch trans_virtio to new virtqueue_xxx wrappers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-13net/9p: Add sysfs mount_tag file for virtio 9P deviceAneesh Kumar K.V
This adds a new file for virtio 9P device. The file contain details of the mount device name that should be used to mount the 9P file system. Ex: /sys/devices/virtio-pci/virtio1/mount_tag file now contian the tag name to be used to mount the 9P file system. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-03-13net/9p: Use the tag name in the config space for identifying mount pointAneesh Kumar K.V
This patch use the tag name in the config space to identify the mount device. The the virtio device name depend on the enumeration order of the device and may not remain the same across multiple boots So we use the tag name which is set via qemu option to uniquely identify the mount device Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>