summaryrefslogtreecommitdiff
path: root/include/linux/vfio.h
diff options
context:
space:
mode:
authorJason Gunthorpe <jgg@nvidia.com>2022-02-24 16:20:18 +0200
committerLeon Romanovsky <leonro@nvidia.com>2022-03-03 12:57:39 +0200
commit115dcec65f61d53e25e1bed5e380468b30f98b14 (patch)
tree521de2561eb5f663afa742e033713658abb1a32d /include/linux/vfio.h
parent445ad495f0ff71553693e6a2a9d7a3bc6917ca36 (diff)
vfio: Define device migration protocol v2
Replace the existing region based migration protocol with an ioctl based protocol. The two protocols have the same general semantic behaviors, but the way the data is transported is changed. This is the STOP_COPY portion of the new protocol, it defines the 5 states for basic stop and copy migration and the protocol to move the migration data in/out of the kernel. Compared to the clarification of the v1 protocol Alex proposed: https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen This has a few deliberate functional differences: - ERROR arcs allow the device function to remain unchanged. - The protocol is not required to return to the original state on transition failure. Instead userspace can execute an unwind back to the original state, reset, or do something else without needing kernel support. This simplifies the kernel design and should userspace choose a policy like always reset, avoids doing useless work in the kernel on error handling paths. - PRE_COPY is made optional, userspace must discover it before using it. This reflects the fact that the majority of drivers we are aware of right now will not implement PRE_COPY. - segmentation is not part of the data stream protocol, the receiver does not have to reproduce the framing boundaries. The hybrid FSM for the device_state is described as a Mealy machine by documenting each of the arcs the driver is required to implement. Defining the remaining set of old/new device_state transitions as 'combination transitions' which are naturally defined as taking multiple FSM arcs along the shortest path within the FSM's digraph allows a complete matrix of transitions. A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE is defined to replace writing to the device_state field in the region. This allows returning a brand new FD whenever the requested transition opens a data transfer session. The VFIO core code implements the new feature and provides a helper function to the driver. Using the helper the driver only has to implement 6 of the FSM arcs and the other combination transitions are elaborated consistently from those arcs. A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIGRATION is defined to report the capability for migration and indicate which set of states and arcs are supported by the device. The FSM provides a lot of flexibility to make backwards compatible extensions but the VFIO_DEVICE_FEATURE also allows for future breaking extensions for scenarios that cannot support even the basic STOP_COPY requirements. The VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE with the GET option (i.e. VFIO_DEVICE_FEATURE_GET) can be used to read the current migration state of the VFIO device. Data transfer sessions are now carried over a file descriptor, instead of the region. The FD functions for the lifetime of the data transfer session. read() and write() transfer the data with normal Linux stream FD semantics. This design allows future expansion to support poll(), io_uring, and other performance optimizations. The complicated mmap mode for data transfer is discarded as current qemu doesn't take meaningful advantage of it, and the new qemu implementation avoids substantially all the performance penalty of using a read() on the region. Link: https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Diffstat (limited to 'include/linux/vfio.h')
-rw-r--r--include/linux/vfio.h20
1 files changed, 20 insertions, 0 deletions
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 550c28f2ef60..c44e80bbbd3b 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -56,6 +56,16 @@ struct vfio_device {
* match, -errno for abort (ex. match with insufficient or incorrect
* additional args)
* @device_feature: Optional, fill in the VFIO_DEVICE_FEATURE ioctl
+ * @migration_set_state: Optional callback to change the migration state for
+ * devices that support migration. It's mandatory for
+ * VFIO_DEVICE_FEATURE_MIGRATION migration support.
+ * The returned FD is used for data transfer according to the FSM
+ * definition. The driver is responsible to ensure that FD reaches end
+ * of stream or error whenever the migration FSM leaves a data transfer
+ * state or before close_device() returns.
+ * @migration_get_state: Optional callback to get the migration state for
+ * devices that support migration. It's mandatory for
+ * VFIO_DEVICE_FEATURE_MIGRATION migration support.
*/
struct vfio_device_ops {
char *name;
@@ -72,6 +82,11 @@ struct vfio_device_ops {
int (*match)(struct vfio_device *vdev, char *buf);
int (*device_feature)(struct vfio_device *device, u32 flags,
void __user *arg, size_t argsz);
+ struct file *(*migration_set_state)(
+ struct vfio_device *device,
+ enum vfio_device_mig_state new_state);
+ int (*migration_get_state)(struct vfio_device *device,
+ enum vfio_device_mig_state *curr_state);
};
/**
@@ -114,6 +129,11 @@ extern void vfio_device_put(struct vfio_device *device);
int vfio_assign_device_set(struct vfio_device *device, void *set_id);
+int vfio_mig_get_next_state(struct vfio_device *device,
+ enum vfio_device_mig_state cur_fsm,
+ enum vfio_device_mig_state new_fsm,
+ enum vfio_device_mig_state *next_fsm);
+
/*
* External user API
*/