path: root/ipc
AgeCommit message (Collapse)Author
2017-01-31Revert "Revert "kdbus: add CMD_UPDATE_METADATA ioctl (reinitialize connection metadata)""Hyotaek Shim
This reverts commit 80b9e0d4a216cabc94399cc0f7dbb3064fd350ff. Change-Id: I72edcdc79d93b5b888deb5c210b537cef52fff11
2017-01-12Revert "kdbus: add CMD_UPDATE_METADATA ioctl (reinitialize connection metadata)"Inki Dae
This reverts commit d07e0df0ca20a1df62100485117df3d90a1a2a80. Change-Id: I895dad38c5f4c850b7d986e39d61a70ecdea20bf
2017-01-12kdbus: add CMD_UPDATE_METADATA ioctl (reinitialize connection metadata)Konrad Lipinski
Added to satisfy efl/launchpad developers' request. Tizen code routinely performs the following operation sequence:
1. create kdbus connection
2. update seclabel
3. rely on updated seclabel
CONN_INFO has always returned the seclabel collected at HELLO time (behavior consistent across all kdbus versions and documented in kdbus man). This would break step 3 of the above sequence. The KDBUS_CMD_UPDATE_METADATA ioctl updates a connection's metadata to reflect the current state. Metadata is collected in exactly the same way as during HELLO. The semantics required by efl/launchpad can be obtained by altering the sequence like so:
1. create kdbus connection
2. update seclabel
+2b. ioctl(connection_fd, KDBUS_CMD_UPDATE_METADATA);
3. rely on updated seclabel
Change-Id: I4a4b2aea4256f8bfb3bd1c0d3df5e963d243cb52 Signed-off-by: Konrad Lipinski <konrad.l@samsung.com>
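The difference between the two sequences can be illustrated with a toy userspace model. This is only a sketch and not the kdbus API: `toy_conn`, `toy_hello()`, `toy_update_metadata()`, `toy_conn_info()` and `toy_set_process_label()` are hypothetical names, and the process seclabel is simulated as a plain string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_LABEL_MAX 64

/* Simulated security label of the calling process (hypothetical stand-in). */
static char toy_process_label[TOY_LABEL_MAX] = "User";

/* Toy connection: stores its own copy of the label, i.e. a snapshot. */
struct toy_conn {
    char seclabel[TOY_LABEL_MAX];
};

/* Step 2 of the sequence: the process changes its own label. */
void toy_set_process_label(const char *label)
{
    snprintf(toy_process_label, sizeof(toy_process_label), "%s", label);
}

/* Shared collection routine: copy the current label into the snapshot. */
void toy_collect_metadata(struct toy_conn *conn)
{
    assert(conn != NULL);
    memcpy(conn->seclabel, toy_process_label, sizeof(conn->seclabel));
}

/* Models HELLO: metadata is collected once, at connection time. */
void toy_hello(struct toy_conn *conn)
{
    toy_collect_metadata(conn);
}

/* Models step 2b: re-collect metadata the same way HELLO did. */
void toy_update_metadata(struct toy_conn *conn)
{
    toy_collect_metadata(conn);
}

/* Models CONN_INFO: returns the stored snapshot, never the live label. */
const char *toy_conn_info(const struct toy_conn *conn)
{
    return conn->seclabel;
}
```

In this model, calling `toy_hello()`, then `toy_set_process_label("App")`, leaves `toy_conn_info()` reporting the old label until `toy_update_metadata()` is called, which mirrors why step 2b is needed.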
2016-12-14kdbus: link replies in fifo send orderKonrad Lipinski
As a consequence, timeouts are now issued in fifo order. Change-Id: Iacf4e922a37038fa42f95a9d058eeacd0ebce3ec Signed-off-by: Konrad Lipinski <konrad.l@samsung.com>
2016-12-14kdbus: Remove the unnecessary code that unlinks the reply twice.INSUN PYO
In the sync call case, kdbus_conn_reply() calls kdbus_reply_unlink(reply). Then kdbus_conn_entry_sync_attach() also calls kdbus_reply_unlink(reply_wake), so the reply is unlinked twice. This causes no problems, because kdbus_reply_unlink() doesn't free twice, but the second call is unnecessary. Signed-off-by: INSUN PYO <insun.pyo@samsung.com> Change-Id: Ia2b3552ee2e6a49ff97bb9d8f6e62964fe6d2cbf
2016-12-14kdbus: cmdline emptiness check fixKonrad Lipinski
[ This commit re-applies change Ia925fa43bf4c50ab707be98f275b50808782f063 to new kdbus upstream version. ] Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com> Change-Id: Ia03913a35dba34567c1bd99afbfe138ea7a5d133
2016-12-14kdbus: allowing sending replies even if NO_EXPECT_REPLY is setLukasz Skalski
[ This commit re-applies change I34427f381957dd8d366e9e4d837d5a2a34a39cc1 to new kdbus upstream version. ] Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com> Change-Id: I6155f3599e59cd880267d9c03b0212b47ec9a64c
2016-12-14kdbus: allow unix domain socket fd passingKonrad Lipinski
[ This commit re-applies change Ifafec44da924ec8ed677629606c92a45e7171636 to new kdbus upstream version. ] Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com> Change-Id: Ib293d0d864a3a91cf8422bcccc1f7593285868a1
2016-12-14kdbus: Upgrade driver to newer upstream versionKarol Lewandowski
This commit upgrades kdbus ipc driver from v4 patchset, as posted on lkml for review by Greg Kroah-Hartman on Mar 09 2015 to commit 0c05fbdc82f from new upstream kdbus repository (git://github.com/systemd/kdbus). Summary of major changes: * message importer rewritten - considerably reduces internal message processing overhead, * name registration reworked to follow DBus Specification precisely, * attached metadata now follow /proc access checks * reduced in-kernel stack buffer to 256 bytes for small messages Change-Id: I6d849173b4289e1b684ed1a9b48e6e0b361e5d53
2016-12-14kdbus: cmdline emptiness check fixKonrad Lipinski
Change-Id: Ia925fa43bf4c50ab707be98f275b50808782f063
2016-12-14kdbus: allow senders to receive own broadcastsINSUN PYO
The dbus1 spec does not place a restriction on who can receive broadcasts. As long as the sender has a MATCH-rule on itself, it can as well receive its own broadcasts. As it turns out, user-space currently relies on this feature. So make sure to allow this just like dbus1. If we find some client that does not work with this, we will have to turn it into a HELLO-flag. Until then, just try to adjust the default behavior. Signed-off-by: INSUN PYO <insun.pyo@samsung.com> Change-Id: I83702b59039062967ec2875e268a17d647902a87
2016-12-14kdbus: allowing sending replies even if NO_EXPECT_REPLY is setLukasz Skalski
Change-Id: I34427f381957dd8d366e9e4d837d5a2a34a39cc1
2016-12-14kdbus: allow unix domain socket fd passingKonrad Lipinski
Change-Id: Ifafec44da924ec8ed677629606c92a45e7171636
2016-12-14kdbus: increase KDBUS_CONN_MAX_REQUESTS_PENDING to 1024 from 128INSUN PYO
Signed-off-by: INSUN PYO <insun.pyo@samsung.com> Change-Id: I82e1d634c88f2e86a29e7f5d50ce5226943c2e54
2016-12-14kdbus: Remove kdbus Linux Security Module hooksKarol Lewandowski
This commit removes support for kdbus-LSM hooks as policy decisions are handled solely by userspace (libdbuspolicy library). This commit reverts following: - 802de9506 ("lsm: smack: smack callbacks for kdbus security hooks") - f13b7e7bd ("kdbus: use LSM hooks in kdbus code") - 067afa709 ("lsm: smack: Make ipc/kdbus includes visible so smack callbacks could see them") - 442f047fd ("lsm: make security_file_receive available for external modules") - 3b556db4b ("lsm: kdbus security hooks") Change-Id: Iae90cdb9577a9e706288b28d70bd57574398276e Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com> Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
2016-12-14kdbus: disable internal kdbus policyLukasz Skalski
Possibilities of connections to own, see and talk to well-known names are already restricted by LSM hooks. Change-Id: I62d86a506a85e6c48bdd3e0f8b11f1aa5a918c75 Signed-off-by: Lukasz Skalski <l.skalski@samsung.com>
2016-12-14kdbus: disable all internal policy checksLukasz Skalski
Change-Id: I5ef09ea4e4389ca41a6ef7afda31fe3a8d9bc507 Signed-off-by: Lukasz Skalski <l.skalski@samsung.com>
2016-12-14kdbus: pool: use __vfs_read()Sergei Zviagintsev
After commit 5d5d56897530 ("make new_sync_{read,write}() static") ->read() cannot be called directly. kdbus_pool_slice_copy() leads to oops, which can be reproduced by launching tools/testing/selftests/kdbus/kdbus-test -t message-quota: [ 1167.146793] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1167.147554] IP: [< (null)>] (null) [ 1167.148670] PGD 3a9dd067 PUD 3a841067 PMD 0 [ 1167.149611] Oops: 0010 [#1] SMP [ 1167.150088] Modules linked in: nfsv3 nfs kdbus lockd grace sunrpc [ 1167.150771] CPU: 0 PID: 518 Comm: kdbus-test Not tainted 4.0.0-next-20150420-kdbus #62 [ 1167.150771] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 1167.150771] task: ffff88003daed120 ti: ffff88003a800000 task.ti: ffff88003a800000 [ 1167.150771] RIP: 0010:[<0000000000000000>] [< (null)>] (null) [ 1167.150771] RSP: 0018:ffff88003a803bc0 EFLAGS: 00010286 [ 1167.150771] RAX: ffff8800377fb000 RBX: 00000000000201e8 RCX: ffff88003a803c00 [ 1167.150771] RDX: 0000000000000b40 RSI: ffff8800377fb4c0 RDI: ffff88003d815700 [ 1167.150771] RBP: ffff88003a803c48 R08: ffffffff8139e380 R09: ffff880039d80490 [ 1167.150771] R10: ffff88003a803a90 R11: 00000000000004c0 R12: 00000000002a24c0 [ 1167.150771] R13: 0000000000000b40 R14: ffff88003d815700 R15: ffffffff8139e460 [ 1167.150771] FS: 00007f41dccd4740(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 1167.150771] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1167.150771] CR2: 0000000000000000 CR3: 000000003ccdf000 CR4: 00000000000007b0 [ 1167.150771] Stack: [ 1167.150771] ffffffffa0065497 ffff88003a803c10 00007ffffffff000 ffff88003aaa67c0 [ 1167.150771] 00000000000004c0 ffff88003aaa6870 ffff88003ca83300 ffffffffa006537d [ 1167.150771] 00000000000201e8 ffffea0000ddfec0 ffff88003a803c20 0000000000000018 [ 1167.150771] Call Trace: [ 1167.150771] [<ffffffffa0065497>] ? kdbus_pool_slice_copy+0x127/0x200 [kdbus] [ 1167.150771] [<ffffffffa006537d>] ? 
kdbus_pool_slice_copy+0xd/0x200 [kdbus] [ 1167.150771] [<ffffffffa006670a>] kdbus_queue_entry_move+0xaa/0x180 [kdbus] [ 1167.150771] [<ffffffffa0059e64>] kdbus_conn_move_messages+0x1e4/0x2c0 [kdbus] [ 1167.150771] [<ffffffffa006234e>] kdbus_name_acquire+0x31e/0x390 [kdbus] [ 1167.150771] [<ffffffffa00625c5>] kdbus_cmd_name_acquire+0x125/0x130 [kdbus] [ 1167.150771] [<ffffffffa005db5d>] kdbus_handle_ioctl+0x4ed/0x610 [kdbus] [ 1167.150771] [<ffffffff811040e0>] do_vfs_ioctl+0x2e0/0x4e0 [ 1167.150771] [<ffffffff81389750>] ? preempt_schedule_common+0x1f/0x3f [ 1167.150771] [<ffffffff8110431c>] SyS_ioctl+0x3c/0x80 [ 1167.150771] [<ffffffff8138c36e>] system_call_fastpath+0x12/0x71 [ 1167.150771] Code: Bad RIP value. [ 1167.150771] RIP [< (null)>] (null) [ 1167.150771] RSP <ffff88003a803bc0> [ 1167.150771] CR2: 0000000000000000 [ 1167.168756] ---[ end trace a676bcfa75db5a96 ]--- Use __vfs_read() instead. Signed-off-by: Sergei Zviagintsev <sergei@s15v.net> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-14kdbus: disable internal kdbus policyLukasz Skalski
Possibilities of connections to own, see and talk to well-known names are already restricted by LSM hooks. Signed-off-by: Lukasz Skalski <l.skalski@samsung.com>
2016-12-14kdbus: Eliminate warning caused by lack of uapi/linux/kdbus.h inclusionPaul Osmialowski
metadata.h references struct kdbus_pids, which is defined in uapi/linux/kdbus.h. Normally, kdbus/metadata.h is included as one of many headers that eventually include uapi/linux/kdbus.h at some point. When included alone, it causes a warning. The same warning is shown when kdbus/connection.h is included alone (e.g. in smack_lsm.c), as it includes kdbus/metadata.h. This patch adds the missing inclusion. Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: use LSM hooks in kdbus codePaul Osmialowski
Originates from: https://github.com/lmctl/kdbus.git (branch: kdbus-lsm-v4.for-systemd-v212) commit: aa0885489d19be92fa41c6f0a71df28763228a40 Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: do not append the same connection to the queue twicePaul Osmialowski
As it was discussed on systemd ML [1], the same connection should be queued up only once for a given well-known name. [1] http://lists.freedesktop.org/archives/systemd-devel/2015-April/030494.html This commit fixes following issue: [ 243.364270] ------------[ cut here ]------------ [ 243.364352] WARNING: CPU: 1 PID: 223 at ../ipc/kdbus/names.c:137 kdbus_name_entry_replace_owner+0x88/0x8c() [ 243.364408] Modules linked in: [ 243.364474] CPU: 1 PID: 223 Comm: kdbus-test Not tainted 4.0.0+ #1 [ 243.364526] Hardware name: Foundation-v8A (DT) [ 243.364569] Call trace: [ 243.364639] [<ffff800000089d38>] dump_backtrace+0x0/0x12c [ 243.364718] [<ffff800000089e74>] show_stack+0x10/0x1c [ 243.364798] [<ffff8000006642f4>] dump_stack+0x74/0x98 [ 243.364874] [<ffff8000000b282c>] warn_slowpath_common+0x98/0xd0 [ 243.364951] [<ffff8000000b2928>] warn_slowpath_null+0x14/0x20 [ 243.365026] [<ffff8000003cf7a4>] kdbus_name_entry_replace_owner+0x84/0x8c [ 243.365105] [<ffff8000003cf7e0>] kdbus_name_release_unlocked.isra.5+0x34/0x170 [ 243.365183] [<ffff8000003d0554>] kdbus_cmd_name_release+0x1b8/0x1c8 [ 243.365270] [<ffff8000003cbd28>] kdbus_handle_ioctl+0x5e0/0x690 [ 243.365347] [<ffff8000001b3520>] do_vfs_ioctl+0x31c/0x5c0 [ 243.365423] [<ffff8000001b3844>] SyS_ioctl+0x80/0x98 [ 243.365473] ---[ end trace 5bf3630c98408d38 ]--- Signed-off-by: Lukasz Skalski <l.skalski@samsung.com> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: avoid the use of struct timespecArnd Bergmann
I did a routine check for new users of 'timespec', which we are trying to remove from the kernel in order to survive y2038. kdbus came up and looks particularly trivial to clean up. This changes the three ktime_get_ts() variants used in kdbus to ktime_get_ns(), which aside from removing timespec also simplifies the code and makes it slightly more efficient by avoiding a two-way conversion. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
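The "two-way conversion" mentioned above is the split of a timestamp into seconds and nanoseconds followed by folding it back into one number. The sketch below is a userspace analogue built on POSIX `clock_gettime()`, not the kernel's ktime API; `timespec_to_ns()`, `deadline_ns()` and `monotonic_now_ns()` are illustrative helper names that do not appear in the patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/* Fold a (seconds, nanoseconds) pair into one 64-bit nanosecond count. */
uint64_t timespec_to_ns(const struct timespec *ts)
{
    assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0);
    return (uint64_t)ts->tv_sec * TOY_NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/* With a scalar timestamp, an absolute deadline is a single addition. */
uint64_t deadline_ns(uint64_t now_ns, uint64_t timeout_ns)
{
    return now_ns + timeout_ns;
}

/* Current monotonic time as a scalar nanosecond value. */
uint64_t monotonic_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return timespec_to_ns(&ts);
}
```

Keeping timestamps as a single 64-bit nanosecond value means comparisons and timeout arithmetic need no carry handling between two fields, which is the simplification the commit message refers to.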
2016-12-14kdbus: connection: fix handling of failed fget()Daniel Mack
The patch 5fc8dd5c84fc: "kdbus: add connection, queue handling and message validation code" from Sep 11, 2014, leads to the following static checker warning: ipc/kdbus/connection.c:2000 kdbus_cmd_send() warn: 'cancel_fd' isn't an ERR_PTR Fix this by checking for NULL pointers returned from fget(). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
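The warning comes down to two different failure conventions: some kernel functions return an error code encoded in the pointer, while fget() returns NULL. The sketch below re-creates simplified userspace versions of the kernel's `ERR_PTR()` and `IS_ERR()` helpers (the simplification is mine) and uses a hypothetical `toy_fget()` that, like fget(), returns NULL on failure. It shows why an `IS_ERR()` check never fires for such a function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, as ERR_PTR() does. */
void *ERR_PTR(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)error;
}

/* True only for pointers in the top MAX_ERRNO values of the address space. */
bool IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for a file table; not kernel code. */
struct toy_file {
    int fd;
};

static struct toy_file toy_table[4] = { {0}, {1}, {2}, {3} };

/* Like fget(): returns NULL, not an ERR_PTR, when the lookup fails. */
struct toy_file *toy_fget(int fd)
{
    if (fd < 0 || fd >= 4)
        return NULL;
    return &toy_table[fd];
}

/* Wrong check: IS_ERR(NULL) is false, so the failure goes unnoticed. */
bool lookup_failed_wrong(int fd)
{
    return IS_ERR(toy_fget(fd));
}

/* Right check for a NULL-returning function. */
bool lookup_failed_right(int fd)
{
    return toy_fget(fd) == NULL;
}
```

With an invalid descriptor, `lookup_failed_wrong()` reports success while `lookup_failed_right()` reports the failure, which is the class of bug the static checker flagged.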
2016-12-14kdbus: add Makefile, Kconfig and MAINTAINERS entryDaniel Mack
This patch hooks up the build system to actually compile the files added by previous patches. It also adds an entry to MAINTAINERS to direct people to Greg KH, David Herrmann, Djalal Harouni and me for questions and patches. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add policy database implementationDaniel Mack
This patch adds the policy database implementation. A policy database restricts the possibilities of connections to own, see and talk to well-known names. It can be associated with a bus (through a policy holder connection) or a custom endpoint. By default, buses have an empty policy database that is augmented on demand when a policy holder connection is instantiated. Policies are set through KDBUS_CMD_HELLO (when creating a policy holder connection), KDBUS_CMD_CONN_UPDATE (when updating a policy holder connection), KDBUS_CMD_EP_MAKE (creating a custom endpoint) or KDBUS_CMD_EP_UPDATE (updating a custom endpoint). In all cases, the name and policy access information is stored in items of type KDBUS_ITEM_NAME and KDBUS_ITEM_POLICY_ACCESS. See kdbus.policy(7) for more details. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add name registry implementationDaniel Mack
This patch adds the name registry implementation. Each bus instantiates a name registry to resolve well-known names into unique connection IDs for message delivery. The registry will be queried when a message is sent with kdbus_msg.dst_id set to KDBUS_DST_ID_NAME, or when a registry dump is requested. It's important to have this registry implemented in the kernel to implement lookups and take-overs in a race-free way. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add code for buses, domains and endpointsDaniel Mack
Add the logic to handle the following entities:
Domain: A domain is an unnamed object containing a number of buses. A domain is automatically created when an instance of kdbusfs is mounted, and destroyed when it is unmounted. Every domain offers its own 'control' device node to create buses. Domains are isolated from each other.
Bus: A bus is a named object inside a domain. Clients exchange messages over a bus. Multiple buses themselves have no connection to each other; messages can only be exchanged on the same bus. The default entry point to a bus, to which clients establish their connection, is the "bus" device node /sys/fs/kdbus/<bus name>/bus. Common operating system setups create one "system bus" per system, and one "user bus" for every logged-in user. Applications or services may create their own private named buses.
Endpoint: An endpoint provides the device node to talk to a bus. Opening an endpoint creates a new connection to the bus to which the endpoint belongs. Every bus has a default endpoint called "bus". A bus can optionally offer additional endpoints with custom names to provide restricted access to the same bus. Custom endpoints carry additional policy which can be used to give sandboxed processes only a locked-down, limited, filtered access to the same bus.
See kdbus(7), kdbus.bus(7), kdbus.endpoint(7) and kdbus.fs(7) for more details. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add code for notifications and matchesDaniel Mack
This patch adds code for matches and notifications. Notifications are broadcast messages generated by the kernel, which notify subscribers when connections are created or destroyed, when well-known names have been claimed, released or changed ownership, or when reply messages have timed out. Matches are used to tell the kernel driver which broadcast messages a connection is interested in. Matches can either be specific to one of the kernel-generated notification types, or carry a bloom filter mask to match against a message from userspace. The latter is a way to pre-filter messages from other connections in order to mitigate unnecessary wakeups. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
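A bloom filter mask test can be sketched as follows. This is a minimal illustration, assuming the usual rule that a message passes when every bit set in the match's mask is also set in the message's filter; the function name `bloom_mask_matches()` and the word-array layout are assumptions for the example, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when all bits requested by the mask are present in the
 * filter. False positives are possible (that is inherent to bloom
 * filters); false negatives are not.
 */
bool bloom_mask_matches(const uint64_t *filter, const uint64_t *mask,
                        size_t n_words)
{
    assert(filter != NULL && mask != NULL);

    for (size_t i = 0; i < n_words; i++) {
        if ((filter[i] & mask[i]) != mask[i])
            return false;
    }
    return true;
}
```

Because the test is a handful of AND operations, a receiver that is definitely not interested in a message can be skipped cheaply, without being woken up.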
2016-12-14kdbus: add code to gather metadataDaniel Mack
A connection chooses which metadata it wants to have attached to each message it receives with kdbus_cmd_hello.attach_flags. The metadata will be attached as items to the messages. All metadata refers to information about the sending task at sending time, unless otherwise stated. Also, the metadata is copied, not referenced, so even if the sending task doesn't exist anymore at the time the message is received, the information is still preserved. In traditional D-Bus, userspace tasks like polkit or journald make a live lookup in procfs and sysfs to gain information about a sending task. This is racy, of course, as in a connection-less system like D-Bus, the originating peer can go away immediately after sending the message. As we're moving D-Bus primitives into the kernel, we have to provide the same semantics here, and inform the receiving peer on the live credentials of the sending peer. Metadata is collected at the following times.
* When a bus is created (KDBUS_CMD_MAKE), information about the calling task is collected. This data is returned by the kernel via the KDBUS_CMD_BUS_CREATOR_INFO call.
* When a connection is created (KDBUS_CMD_HELLO), information about the calling task is collected. Alternatively, a privileged connection may provide 'faked' information about credentials, PIDs and security labels which will be stored instead. This data is returned by the kernel as information on a connection (KDBUS_CMD_CONN_INFO). Only metadata that a connection allowed to be sent (by setting its bit in attach_flags_send) will be exported in this way.
* When a message is sent (KDBUS_CMD_SEND), information about the sending task and the sending connection are collected. This metadata will be attached to the message when it arrives in the receiver's pool. If the connection sending the message installed faked credentials (see kdbus.connection(7)), the message will not be augmented by any information about the currently sending task.
Which metadata items are actually delivered depends on the following sets and masks:
(a) the system-wide kmod creds mask (module parameter 'attach_flags_mask')
(b) the per-connection send creds mask, set by the connecting client
(c) the per-connection receive creds mask, set by the connecting client
(d) the per-bus minimal creds mask, set by the bus creator
(e) the per-bus owner creds mask, set by the bus creator
(f) the mask specified when querying creds of a bus peer
(g) the mask specified when querying creds of a bus owner
With the following rules:
[1] The creds attached to messages are determined as a & b & c.
[2] When connecting to a bus (KDBUS_CMD_HELLO), and ~b & d != 0, the call will fail with -1, and errno is set to ECONNREFUSED.
[3] When querying creds of a bus peer, the creds returned are a & b & f.
[4] When querying creds of a bus owner, the creds returned are a & e & g.
See kdbus.metadata(7) and kdbus.item(7) for more details on which metadata can currently be attached to messages. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
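Rules [1] to [4] are plain bit arithmetic and can be written down directly. The sketch below is illustrative only: the struct `creds_masks`, its field names and the four helper functions are hypothetical, and `hello_check()` returns a negative errno value in kernel style rather than setting errno. Note that in C the expression from rule [2] needs parentheses, because `!=` binds tighter than `&`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* The masks (a) to (e) from the list above; names are illustrative. */
struct creds_masks {
    uint64_t kmod;      /* (a) system-wide kmod creds mask */
    uint64_t conn_send; /* (b) per-connection send creds mask */
    uint64_t conn_recv; /* (c) per-connection receive creds mask */
    uint64_t bus_min;   /* (d) per-bus minimal creds mask */
    uint64_t bus_owner; /* (e) per-bus owner creds mask */
};

/* Rule [1]: creds attached to messages are a & b & c. */
uint64_t msg_creds(const struct creds_masks *m)
{
    assert(m != NULL);
    return m->kmod & m->conn_send & m->conn_recv;
}

/* Rule [2]: refuse HELLO when the bus requires a bit the sender withholds. */
int hello_check(const struct creds_masks *m)
{
    assert(m != NULL);
    return ((~m->conn_send & m->bus_min) != 0) ? -ECONNREFUSED : 0;
}

/* Rule [3]: creds of a bus peer are a & b & f, f being the query mask. */
uint64_t peer_creds(const struct creds_masks *m, uint64_t query_mask)
{
    assert(m != NULL);
    return m->kmod & m->conn_send & query_mask;
}

/* Rule [4]: creds of a bus owner are a & e & g, g being the query mask. */
uint64_t owner_creds(const struct creds_masks *m, uint64_t query_mask)
{
    assert(m != NULL);
    return m->kmod & m->bus_owner & query_mask;
}
```

For example, with a = 0xff, b = 0x0f and c = 0x3c, only the bits in 0x0c reach the receiver; raising d to include a bit outside b makes `hello_check()` fail.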
2016-12-14kdbus: add node and filesystem implementationDaniel Mack
kdbusfs is a filesystem that will expose a fresh kdbus domain context each time it is mounted. Per mount point, there will be a 'control' node, which can be used to create buses. fs.c contains the implementation of that pseudo-fs. Exported inodes of 'file' type have their i_fop set to either kdbus_handle_control_ops or kdbus_handle_ep_ops, depending on their type. The actual dispatching of file operations is done from handle.c node.c is an implementation of a kdbus object that has an id and children, organized in an R/B tree. The tree is used by the filesystem code for lookup and iterator functions, and to deactivate children once the parent is deactivated. Every inode exported by kdbusfs is backed by a kdbus_node, hence it is embedded in struct kdbus_ep, struct kdbus_bus and struct kdbus_domain. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add connection, queue handling and message validation codeDaniel Mack
This patch adds code to create and destroy connections, to validate incoming messages and to maintain the queue of messages that are associated with a connection. Note that connection and queue have a 1:1 relation, the code is only split in two parts for cleaner separation and better readability. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add connection pool implementationDaniel Mack
A pool for data received from the kernel is installed for every connection of the bus, and it is used to copy data from the kernel to userspace clients, for messages and other information. It is accessed when one of the following ioctls is issued: * KDBUS_CMD_MSG_RECV, to receive a message * KDBUS_CMD_NAME_LIST, to dump the name registry * KDBUS_CMD_CONN_INFO, to retrieve information on a connection The offsets returned by either one of the aforementioned ioctls describe offsets inside the pool. Internally, the pool is organized in slices, that are dynamically allocated on demand. The overall size of the pool is chosen by the connection when it connects to the bus with KDBUS_CMD_HELLO. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE has to be called on the offset. To access the memory, the caller is expected to mmap() it to its task. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-12-14kdbus: add driver skeleton, ioctl entry points and utility functionsDaniel Mack
Add the basic driver structure. handle.c is the main ioctl command dispatcher that calls into other parts of the driver. main.c contains the code that creates the initial domain at startup, and util.c has utility functions such as item iterators that are shared with other files. limits.h describes limits on things like maximum data structure sizes, number of messages per user and suchlike. Some of the numbers currently picked are rough ideas of what might be sufficient and are probably rather conservative. Signed-off-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Osmialowski <p.osmialowsk@samsung.com>
2016-08-19sysv, ipc: fix security-layer leakingFabian Frederick
[ Upstream commit 9b24fef9f0410fb5364245d6cc2bd044cc064007 ] Commit 53dad6d3a8e5 ("ipc: fix race with LSMs") updated ipc_rcu_putref() to receive rcu freeing function but used generic ipc_rcu_free() instead of msg_rcu_free() which does security cleaning. Running LTP msgsnd06 with kmemleak gives the following: cat /sys/kernel/debug/kmemleak unreferenced object 0xffff88003c0a11f8 (size 8): comm "msgsnd06", pid 1645, jiffies 4294672526 (age 6.549s) hex dump (first 8 bytes): 1b 00 00 00 01 00 00 00 ........ backtrace: kmemleak_alloc+0x23/0x40 kmem_cache_alloc_trace+0xe1/0x180 selinux_msg_queue_alloc_security+0x3f/0xd0 security_msg_queue_alloc+0x2e/0x40 newque+0x4e/0x150 ipcget+0x159/0x1b0 SyS_msgget+0x39/0x40 entry_SYSCALL_64_fastpath+0x13/0x8f Manfred Spraul suggested to fix sem.c as well and Davidlohr Bueso to only use ipc_rcu_free in case of security allocation failure in newary() Fixes: 53dad6d3a8e ("ipc: fix race with LSMs") Link: http://lkml.kernel.org/r/1470083552-22966-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-03-04ipc/shm: handle removed segments gracefully in shm_mmap()Kirill A. Shutemov
[ Upstream commit 15db15e2f10ae12d021c9a2e9edd8a03b9238551 ] commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream. remap_file_pages(2) emulation can reach a file which represents a removed IPC ID as long as a memory segment is mapped. It breaks expectations of the IPC subsystem. Test case (rewritten to be more human readable, originally autogenerated by syzkaller[1]):

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <sys/ipc.h>
    #include <sys/mman.h>
    #include <sys/shm.h>

    #define PAGE_SIZE 4096

    int main()
    {
            int id;
            void *p;

            id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
            p = shmat(id, NULL, 0);
            shmctl(id, IPC_RMID, NULL);
            remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

            return 0;
    }

The patch changes shm_mmap() and code around shm_lock() to propagate the locking error back to the caller of shm_mmap(). [1] http://github.com/google/syzkaller Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-03-04ipc: convert invalid scenarios to use WARN_ONDavidlohr Bueso
[ Upstream commit d0edd8528362c07216498340e928159510595e7b ] Considering Linus' past rants about the (ab)use of BUG in the kernel, I took a look at how we deal with such calls in ipc. Given that any errors or corruption in ipc code are most likely contained within the set of processes participating in the broken mechanisms, there aren't really many strong fatal system failure scenarios that would require a BUG call. Also, if something is seriously wrong, ipc might not be the place for such a BUG either. 1. For example, recently, a customer hit one of these BUG_ONs in shm after failing shm_lock(). A busted ID imho does not merit a BUG_ON, and WARN would have been better. 2. MSG_COPY functionality of posix msgrcv(2) for checkpoint/restore. I don't see how we can hit this anyway -- at least it should be IS_ERR. The 'copy' arg from do_msgrcv is always set by calling prepare_copy() first and foremost. We could also probably drop this check altogether. Either way, it does not merit a BUG_ON. 3. No ->fault() callback for the fs getting the corresponding page -- seems selfish to make the system unusable. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-03-04ipc,shm: move BUG_ON check into shm_lockDavidlohr Bueso
[ Upstream commit c5c8975b2eb4eb7604e8ce4f762987f56d2a96a2 ] Upon every shm_lock call, we BUG_ON if an error was returned, indicating racing either in idr or in shm_destroy. Move this logic into the locking. [akpm@linux-foundation.org: simplify code] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-10-22Initialize msg/shm IPC objects before doing ipc_addid()Linus Torvalds
commit b9a532277938798b53178d5a66af6e2915cb27cf upstream. As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before having initialized the IPC object state. Yes, we initialize the IPC object in a locked state, but with all the lockless RCU lookup work, that IPC object lock no longer means that the state cannot be seen. We already did this for the IPC semaphore code (see commit e8577d1f0329: "ipc/sem.c: fully initialize sem_array before making it visible") but we clearly forgot about msg and shm. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
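The ordering problem described above, making an object visible before its state is initialized, can be sketched in userspace with C11 atomics. This is an analogy, not the ipc code: `toy_obj`, `toy_slot`, `toy_publish()` and `toy_lookup()` are hypothetical names, and a single atomic pointer stands in for the ID table that lockless readers consult.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object with state that readers depend on. */
struct toy_obj {
    int perm;
    long size;
};

/* The published slot; lockless readers look here. Starts out NULL. */
static _Atomic(struct toy_obj *) toy_slot;

/*
 * Initialize every field first, then publish. The release store
 * guarantees that a reader who sees the pointer also sees the fields.
 */
void toy_publish(struct toy_obj *obj, int perm, long size)
{
    assert(obj != NULL);
    obj->perm = perm;
    obj->size = size;
    atomic_store_explicit(&toy_slot, obj, memory_order_release);
}

/* Lockless lookup: the acquire load pairs with the release store above. */
struct toy_obj *toy_lookup(void)
{
    return atomic_load_explicit(&toy_slot, memory_order_acquire);
}
```

If the store to `toy_slot` came before the field assignments, a concurrent `toy_lookup()` could return an object whose fields are still uninitialized, which is the situation the commit avoids by initializing before ipc_addid().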
2015-09-13ipc/sem.c: update/correct memory barriersManfred Spraul
commit 3ed1f8a99d70ea1cd1508910eb107d0edcae5009 upstream. sem_lock() did not properly pair memory barriers: !spin_is_locked() and spin_unlock_wait() are both only control barriers. The code needs an acquire barrier, otherwise the cpu might perform read operations before the lock test. As no primitive exists inside <include/spinlock.h> and since it seems no one wants another primitive, the code creates a local primitive within ipc/sem.c. With regards to -stable: The change of sem_wait_array() is a bugfix, the change to sem_lock() is a nop (just a preprocessor redefinition to improve the readability). The bugfix is necessary for all kernels that use sem_wait_array() (i.e.: starting from 3.10). Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Kirill Tkhai <ktkhai@parallels.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13ipc,sem: fix use after free on IPC_RMID after a task using same semaphore ↵Herton R. Krzesinski
set exits
commit 602b8593d2b4138c10e922eeaafe306f6b51817b upstream.

The current semaphore code allows a potential use after free: in exit_sem we may free the task's sem_undo_list while there is still another task looping through the same semaphore set and cleaning the sem_undo list at freeary function (the task called IPC_RMID for the same semaphore set).

For example, with a test program [1] running which keeps forking a lot of processes (which then do a semop call with SEM_UNDO flag), and with the parent right after removing the semaphore set with IPC_RMID, and a kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and CONFIG_DEBUG_SPINLOCK, you can easily see something like the following in the kernel log:

    Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
    000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
    010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
    Prev obj: start=ffff88003b45c180, len=64
    000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
    010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
    Next obj: start=ffff88003b45c200, len=64
    000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
    010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
    BUG: spinlock wrong CPU on CPU#2, test/18028
    general protection fault: 0000 [#1] SMP
    Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
    CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
    RIP: spin_dump+0x53/0xc0
    Call Trace:
      spin_bug+0x30/0x40
      do_raw_spin_unlock+0x71/0xa0
      _raw_spin_unlock+0xe/0x10
      freeary+0x82/0x2a0
      ? _raw_spin_lock+0xe/0x10
      semctl_down.clone.0+0xce/0x160
      ? __do_page_fault+0x19a/0x430
      ? __audit_syscall_entry+0xa8/0x100
      SyS_semctl+0x236/0x2c0
      ? syscall_trace_leave+0xde/0x130
      entry_SYSCALL_64_fastpath+0x12/0x71
    Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
    RIP [<ffffffff810d6053>] spin_dump+0x53/0xc0
    RSP <ffff88003750fd68>
    ---[ end trace 783ebb76612867a0 ]---
    NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
    Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
    CPU: 3 PID: 18053 Comm: test Tainted: G D 4.2.0-rc5+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
    RIP: native_read_tsc+0x0/0x20
    Call Trace:
      ? delay_tsc+0x40/0x70
      __delay+0xf/0x20
      do_raw_spin_lock+0x96/0x140
      _raw_spin_lock+0xe/0x10
      sem_lock_and_putref+0x11/0x70
      SYSC_semtimedop+0x7bf/0x960
      ? handle_mm_fault+0xbf6/0x1880
      ? dequeue_task_fair+0x79/0x4a0
      ? __do_page_fault+0x19a/0x430
      ? kfree_debugcheck+0x16/0x40
      ? __do_page_fault+0x19a/0x430
      ? __audit_syscall_entry+0xa8/0x100
      ? do_audit_syscall_entry+0x66/0x70
      ? syscall_trace_enter_phase1+0x139/0x160
      SyS_semtimedop+0xe/0x10
      SyS_semop+0x10/0x20
      entry_SYSCALL_64_fastpath+0x12/0x71
    Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
    Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the proper config debugs enabled, however I have softlockup reports on some kernel versions, in the semaphore code, which are similar as above (the scenario is seen on some servers running IBM DB2 which uses semaphore syscalls).
The patch here fixes the race against freeary, by acquiring or waiting on the sem_undo_list lock as necessary (exit_sem can race with freeary, while freeary sets un->semid to -1 and removes the same sem_undo from list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case [1].

[1] Test case used below:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <errno.h>

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
    {
            struct sembuf op;
            int s;
            uid_t pid = getuid();

            s = rand() % NSET;
            op.sem_num = pid % NSEM;
            op.sem_op = 1;
            op.sem_flg = SEM_UNDO;

            semop(sid[s], &op, 1);
            exit(EXIT_SUCCESS);
    }

    void create_set()
    {
            int i, j;
            pid_t p;
            union {
                    int val;
                    struct semid_ds *buf;
                    unsigned short int *array;
                    struct seminfo *__buf;
            } un;

            /* Create and initialize semaphore set */
            for (i = 0; i < NSET; i++) {
                    sid[i] = semget(IPC_PRIVATE, NSEM, 0644 | IPC_CREAT);
                    if (sid[i] < 0) {
                            perror("semget");
                            exit(EXIT_FAILURE);
                    }
            }
            un.val = 0;
            for (i = 0; i < NSET; i++) {
                    for (j = 0; j < NSEM; j++) {
                            if (semctl(sid[i], j, SETVAL, un) < 0)
                                    perror("semctl");
                    }
            }

            /* Launch threads that operate on semaphore set */
            for (i = 0; i < NSEM * NSET * NSET; i++) {
                    p = fork();
                    if (p < 0)
                            perror("fork");
                    if (p == 0)
                            thread();
            }

            /* Free semaphore set */
            for (i = 0; i < NSET; i++) {
                    if (semctl(sid[i], NSEM, IPC_RMID))
                            perror("IPC_RMID");
            }

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                    if (errno == ECHILD)
                            break;
            };
    }

    int main(int argc, char **argv)
    {
            pid_t p;

            srand(time(NULL));

            while (1) {
                    p = fork();
                    if (p < 0) {
                            perror("fork");
                            exit(EXIT_FAILURE);
                    }
                    if (p == 0) {
                            create_set();
                            goto end;
                    }

                    /* Wait for forked processes to exit */
                    while (wait(NULL)) {
                            if (errno == ECHILD)
                                    break;
                    };
            }
    end:
            return 0;
    }

[akpm@linux-foundation.org: use normal comment layout]
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-16ipc: modify message queue accounting to not take kernel data structures into ↵Marcus Gelderie
account commit de54b9ac253787c366bbfb28d901a31954eb3511 upstream. A while back, the message queue implementation in the kernel was improved to use btrees to speed up retrieval of messages, in commit d6629859b36d ("ipc/mqueue: improve performance of send/recv"). That patch introducing the improved kernel handling of message queues (using btrees) has, as a by-product, changed the meaning of the QSIZE field in the pseudo-file created for the queue. Before, this field reflected the size of the user-data in the queue. Since then, it also takes kernel data structures into account. For example, if 13 bytes of user data are in the queue, on my machine the file reports a size of 61 bytes. There was some discussion on this topic before (for example https://lkml.org/lkml/2014/10/1/115). Commenting on lkml, Michael Kerrisk gave the following background (https://lkml.org/lkml/2015/6/16/74): The pseudofiles in the mqueue filesystem (usually mounted at /dev/mqueue) expose fields with metadata describing a message queue. One of these fields, QSIZE, as originally implemented, showed the total number of bytes of user data in all messages in the message queue, and this feature was documented from the beginning in the mq_overview(7) page. In 3.5, some other (useful) work happened to break the user-space API in a couple of places, including the value exposed via QSIZE, which now includes a measure of kernel overhead bytes for the queue, a figure that renders QSIZE useless for its original purpose, since there's no way to deduce the number of overhead bytes consumed by the implementation. (The other user-space breakage was subsequently fixed.) This patch removes the accounting of kernel data structures in the queue. Reporting the size of these data-structures in the QSIZE field was a breaking change (see Michael's comment above). Without the QSIZE field reporting the total size of user-data in the queue, there is no way to deduce this number. 
It should be noted that the resource limit RLIMIT_MSGQUEUE is counted against the worst-case size of the queue (in both the old and the new implementation). Therefore, the kernel overhead accounting in QSIZE is not necessary to help the user understand the limitations RLIMIT imposes on the processes. Signed-off-by: Marcus Gelderie <redmnic@gmail.com> Acked-by: Doug Ledford <dledford@redhat.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: David Howells <dhowells@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: John Duffy <jb_duffy@btinternet.com> Cc: Arto Bendiken <arto@bendiken.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
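To make the QSIZE discussion concrete, here is a minimal sketch of how user space might extract the field from a line read out of a queue's pseudo-file under /dev/mqueue. The helper name `parse_qsize()` is hypothetical, and the sample line layout (`QSIZE:… NOTIFY:… SIGNO:… NOTIFY_PID:…`) is assumed from the format documented in mq_overview(7).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the QSIZE value from one line of a mqueue pseudo-file.
 * Returns 0 on success, -1 if no parsable QSIZE field is found.
 * (Hypothetical helper, for illustration only.)
 */
static int parse_qsize(const char *line, unsigned long *qsize)
{
    const char *field = strstr(line, "QSIZE:");

    if (field == NULL)
        return -1;
    return sscanf(field, "QSIZE:%lu", qsize) == 1 ? 0 : -1;
}
```

With the accounting described in this commit, a queue holding 13 bytes of user data would be expected to yield 13 from such a helper, rather than a larger figure that includes kernel overhead.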
2015-04-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-15ipc: remove use of seq_printf return valueJoe Perches
The seq_printf return value, because it's frequently misused, will eventually be converted to void. See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15VFS: assorted weird filesystems: d_inode() annotationsDavid Howells
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-17ipc,sem: use current->state helpersDavidlohr Bueso
Call __set_current_state() instead of assigning the new state directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP environments, keeping track of who changed the state. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile #2 from Al Viro: "Next pile (and there'll be one or two more). The large piece in this one is getting rid of /proc/*/ns/* weirdness; among other things, it allows to (finally) make nameidata completely opaque outside of fs/namei.c, making for easier further cleanups in there" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coda_venus_readdir(): use file_inode() fs/namei.c: fold link_path_walk() call into path_init() path_init(): don't bother with LOOKUP_PARENT in argument fs/namei.c: new helper (path_cleanup()) path_init(): store the "base" pointer to file in nameidata itself make default ->i_fop have ->open() fail with ENXIO make nameidata completely opaque outside of fs/namei.c kill proc_ns completely take the targets of /proc/*/ns/* symlinks to separate fs bury struct proc_ns in fs/proc copy address of proc_ns_ops into ns_common new helpers: ns_alloc_inum/ns_free_inum make proc_ns_operations work with struct ns_common * instead of void * switch the rest of proc_ns_operations to working with &...->ns netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns common object embedded into various struct ....ns
2014-12-13shmdt: use i_size_read() instead of ->i_sizeDave Hansen
Andrew Morton noted http://lkml.kernel.org/r/20141104142027.a7a0d010772d84560b445f59@linux-foundation.org that the shmdt uses inode->i_size outside of i_mutex being held. There is one more case in shm.c in shm_destroy(). This converts both users over to use i_size_read(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segmentsDave Hansen
This is a highly-contrived scenario. But, a single shmdt() call can be induced into unmapping memory from multiple shm segments. Example code is here: http://www.sr71.net/~dave/intel/shmfun.c The fix is pretty simple: Record the 'struct file' for the first VMA we encounter and then stick to it. Decline to unmap anything not from the same file and thus the same segment. I found this by inspection and the odds of anyone hitting this in practice are pretty darn small. Lightly tested, but it's a pretty small patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
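The "record the first file and stick to it" rule can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel's shmdt() implementation: `struct mapping`, its fields, and `detach_same_file()` are hypothetical, and the real code also checks address ranges, which this model omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified mapping record; not the kernel's vm_area_struct. */
struct mapping {
    int file_id;   /* identifies the backing segment */
    int mapped;    /* 1 while mapped, 0 after detach */
};

/*
 * Detach mappings starting at index `first`, but only those backed by the
 * same file as the first mapped entry encountered.
 * Returns the number of mappings detached.
 */
static size_t detach_same_file(struct mapping *maps, size_t n, size_t first)
{
    int have_file = 0, file_id = 0;
    size_t detached = 0;

    for (size_t i = first; i < n; i++) {
        if (!maps[i].mapped)
            continue;
        if (!have_file) {
            /* Remember the file of the first mapping we see. */
            file_id = maps[i].file_id;
            have_file = 1;
        }
        if (maps[i].file_id != file_id)
            continue;   /* different segment: leave it mapped */
        maps[i].mapped = 0;
        detached++;
    }
    return detached;
}
```

In this model, a neighbouring mapping that belongs to a second segment survives the call, which is the behaviour the fix is meant to guarantee.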
2014-12-13ipc/msg: increase MSGMNI, remove scalingManfred Spraul
SysV can be abused to allocate locked kernel memory. For most systems, a small limit doesn't make sense, see the discussion with regards to SHMMAX. Therefore: increase MSGMNI to the maximum supported. And: If we ignore the risk of locking too much memory, then an automatic scaling of MSGMNI doesn't make sense. Therefore the logic can be removed. The code preserves auto_msgmni to avoid breaking any user space applications that expect that the value exists. Notes: 1) If an administrator must limit the memory allocations, then he can set MSGMNI as necessary. Or he can disable sysv entirely (as e.g. done by Android). 2) MSGMAX and MSGMNB are intentionally not increased, as these values are used to control latency vs. throughput: If MSGMNB is large, then msgsnd() just returns and more messages can be queued before a task switch to a task that calls msgrcv() is forced. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>