qemu with hax to log dma reads & writes jcs.org/2018/11/12/vfio

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pc: features, fixes, cleanups.

Virtio 1.0 support for virtio-mmio.
Misc fixes, cleanups.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 16 Sep 2019 16:17:30 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
virtio-mmio: implement modern (v2) personality (virtio-1)
virtio pmem: user document
intel_iommu: Remove the caching-mode check during flag change
pc/q35: Disallow vfio-pci hotplug without VT-d caching mode
qdev/machine: Introduce hotplug_allowed hook
intel_iommu: Sanity check vfio-pci config on machine init done
backends/vhost-user.c: prevent using uninitialized vqs
vhost-user-blk: prevent using uninitialized vqs
docs/nvdimm: add example on persistent backend setup
MAINTAINERS: update virtio-rng and virtio-serial maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

+521 -33
+4 -2
MAINTAINERS
···
 F: contrib/vhost-user-input/*

 virtio-serial
-M: Amit Shah <amit@kernel.org>
+M: Laurent Vivier <lvivier@redhat.com>
+R: Amit Shah <amit@kernel.org>
 S: Supported
 F: hw/char/virtio-serial-bus.c
 F: hw/char/virtio-console.c
···
 F: tests/virtio-serial-test.c

 virtio-rng
-M: Amit Shah <amit@kernel.org>
+M: Laurent Vivier <lvivier@redhat.com>
+R: Amit Shah <amit@kernel.org>
 S: Supported
 F: hw/virtio/virtio-rng.c
 F: include/hw/virtio/virtio-rng.h
+1 -1
backends/vhost-user.c
···

     b->vdev = vdev;
     b->dev.nvqs = nvqs;
-    b->dev.vqs = g_new(struct vhost_virtqueue, nvqs);
+    b->dev.vqs = g_new0(struct vhost_virtqueue, nvqs);

     ret = vhost_dev_init(&b->dev, &b->vhost_user, VHOST_BACKEND_TYPE_USER, 0);
     if (ret < 0) {
+31
docs/nvdimm.txt
···
 accept persistent writes. In result, for example, the guest Linux
 NVDIMM driver, marks such vNVDIMM device as read-only.

+Backend File Setup Example
+--------------------------
+
+Here are two examples showing how to set up these persistent backends on
+Linux using the tool ndctl [3].
+
+A. DAX device
+
+Use the following command to set up /dev/dax0.0 so that the entirety of
+namespace0.0 can be exposed as an emulated NVDIMM to the guest:
+
+    ndctl create-namespace -f -e namespace0.0 -m devdax
+
+The /dev/dax0.0 can then be used directly in the "mem-path" option.
+
+B. DAX file
+
+Individual files on a DAX host file system can be exposed as emulated
+NVDIMMs. First an fsdax block device is created, partitioned, and then
+mounted with the "dax" mount option:
+
+    ndctl create-namespace -f -e namespace0.0 -m fsdax
+    (partition /dev/pmem0 with name pmem0p1)
+    mount -o dax /dev/pmem0p1 /mnt
+    (create or copy a disk image file with qemu-img(1), cp(1), or dd(1)
+    in /mnt)
+
+The new file in /mnt can then be used in the "mem-path" option.
+
 NVDIMM Persistence
 ------------------
···
     https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
 [2] Persistent Memory Development Kit (PMDK), formerly known as NVML project, home page:
     http://pmem.io/pmdk/
+[3] ndctl-create-namespace - provision or reconfigure a namespace
+    http://pmem.io/ndctl/ndctl-create-namespace.html
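Either backend from the new section above is consumed by QEMU through a memory-backend-file. As a sketch (not part of the patch itself): the machine options below follow the conventions used elsewhere in docs/nvdimm.txt, and the align=2M value is an assumption that must match the namespace alignment chosen when the DAX device was created.

```shell
# Sketch: expose the host DAX device /dev/dax0.0 as a vNVDIMM.
qemu-system-x86_64 \
  -machine pc,nvdimm=on \
  -m 4G,slots=4,maxmem=32G \
  -object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M \
  -device nvdimm,id=nvdimm1,memdev=mem1
```

For the DAX-file variant, mem-path points at the file created under /mnt instead of the character device.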
+75
docs/virtio-pmem.rst
···
+
+========================
+QEMU virtio pmem
+========================
+
+This document explains the setup and usage of the virtio pmem device,
+which is available since QEMU v4.1.0.
+
+The virtio pmem device is a paravirtualized persistent memory device
+on regular (i.e. non-NVDIMM) storage.
+
+Use case
+--------
+
+Virtio pmem allows the guest to bypass the guest page cache and directly
+use the host page cache. This reduces the guest memory footprint, as the
+host can make efficient memory reclaim decisions under memory pressure.
+
+o How does virtio-pmem compare to the nvdimm emulation supported by QEMU?
+
+  NVDIMM emulation on regular (i.e. non-NVDIMM) host storage does not
+  persist the guest writes as there are no defined semantics in the device
+  specification. The virtio pmem device provides guest write persistence
+  on non-NVDIMM host storage.
+
+virtio pmem usage
+-----------------
+
+A virtio pmem device backed by a memory-backend-file can be created on
+the QEMU command line as in the following example:
+
+  -object memory-backend-file,id=mem1,share,mem-path=./virtio_pmem.img,size=4G
+  -device virtio-pmem-pci,memdev=mem1,id=nv1
+
+where:
+
+- "object memory-backend-file,id=mem1,share,mem-path=<image>,size=<image size>"
+  creates a backend file with the specified size.
+
+- "device virtio-pmem-pci,id=nv1,memdev=mem1" creates a virtio pmem
+  PCI device whose storage is provided by the above memory backend device.
+
+Multiple virtio pmem devices can be created if multiple pairs of "-object"
+and "-device" are provided.
+
+Hotplug
+-------
+
+Virtio pmem devices can be hotplugged via the QEMU monitor. First, the
+memory backing has to be added via 'object_add'; afterwards, the virtio
+pmem device can be added via 'device_add'.
+
+For example, the following commands add another 4GB virtio pmem device to
+the guest:
+
+  (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=virtio_pmem2.img,size=4G
+  (qemu) device_add virtio-pmem-pci,id=virtio_pmem2,memdev=mem2
+
+Guest Data Persistence
+----------------------
+
+Guest data persistence on non-NVDIMM storage requires guest userspace
+applications to perform fsync/msync. This is different from a real nvdimm
+backend, where no additional fsync/msync is required. The fsync/msync is
+needed to persist guest writes in the host backing file, which otherwise
+remains in the host page cache, with the risk of losing the data in case
+of power failure.
+
+With the virtio pmem device, the MAP_SYNC mmap flag is not supported. This
+provides a hint to the application to perform fsync for write persistence.
+
+Limitations
+-----------
+
+- A real nvdimm device backend is not supported.
+- virtio pmem hotunplug is not supported.
+- ACPI NVDIMM features like regions/namespaces are not supported.
+- The ndctl command is not supported.
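The persistence requirement in the new document can be illustrated from inside the guest. A hypothetical sketch, assuming the virtio pmem device appears as /dev/pmem0 in the guest (the device name depends on the guest kernel):

```shell
# Sketch of guest-side usage of a virtio pmem device.
mkfs.ext4 /dev/pmem0
mount /dev/pmem0 /mnt
# Because MAP_SYNC is unavailable, data must be flushed explicitly;
# conv=fsync makes dd call fsync(2) before exiting, which pushes the
# writes through to the host backing file.
dd if=data.img of=/mnt/data.img bs=1M conv=fsync
```

Applications using mmap(2) on such a file would likewise need msync(2) before assuming their stores are durable.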
+1 -1
hw/block/vhost-user-blk.c
···
     }

     s->inflight = g_new0(struct vhost_inflight, 1);
-    s->vqs = g_new(struct vhost_virtqueue, s->num_queues);
+    s->vqs = g_new0(struct vhost_virtqueue, s->num_queues);
     s->watch = 0;
     s->connected = false;

+17
hw/core/qdev.c
···
     return NULL;
 }

+bool qdev_hotplug_allowed(DeviceState *dev, Error **errp)
+{
+    MachineState *machine;
+    MachineClass *mc;
+    Object *m_obj = qdev_get_machine();
+
+    if (object_dynamic_cast(m_obj, TYPE_MACHINE)) {
+        machine = MACHINE(m_obj);
+        mc = MACHINE_GET_CLASS(machine);
+        if (mc->hotplug_allowed) {
+            return mc->hotplug_allowed(machine, dev, errp);
+        }
+    }
+
+    return true;
+}
+
 HotplugHandler *qdev_get_bus_hotplug_handler(DeviceState *dev)
 {
     if (dev->parent_bus) {
+35 -6
hw/i386/intel_iommu.c
···
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
+#include "sysemu/sysemu.h"
 #include "hw/i386/apic_internal.h"
 #include "kvm_i386.h"
 #include "migration/vmstate.h"
···
 static void vtd_address_space_refresh_all(IntelIOMMUState *s);
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
+
+static void vtd_panic_require_caching_mode(void)
+{
+    error_report("We need to set caching-mode=on for intel-iommu to enable "
+                 "device assignment with IOMMU protection.");
+    exit(1);
+}

 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                             uint64_t wmask, uint64_t w1cmask)
···
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;

-    if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) {
-        error_report("We need to set caching-mode=on for intel-iommu to enable "
-                     "device assignment with IOMMU protection.");
-        exit(1);
-    }
-
     /* Update per-address-space notifier flags */
     vtd_as->notifier_flags = new;
···
     return true;
 }

+static int vtd_machine_done_notify_one(Object *child, void *unused)
+{
+    IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default());
+
+    /*
+     * We hard-coded here because vfio-pci is the only special case
+     * here.  Let's be more elegant in the future when we can, but so
+     * far there seems to be no better way.
+     */
+    if (object_dynamic_cast(child, "vfio-pci") && !iommu->caching_mode) {
+        vtd_panic_require_caching_mode();
+    }
+
+    return 0;
+}
+
+static void vtd_machine_done_hook(Notifier *notifier, void *unused)
+{
+    object_child_foreach_recursive(object_get_root(),
+                                   vtd_machine_done_notify_one, NULL);
+}
+
+static Notifier vtd_machine_done_notify = {
+    .notify = vtd_machine_done_hook,
+};
+
 static void vtd_realize(DeviceState *dev, Error **errp)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
···
     pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
     /* Pseudo address space under root PCI bus. */
     pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
+    qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
 }

 static void vtd_class_init(ObjectClass *klass, void *data)
+21
hw/i386/pc.c
···
     }
 }

+
+static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
+{
+    X86IOMMUState *iommu = x86_iommu_get_default();
+    IntelIOMMUState *intel_iommu;
+
+    if (iommu &&
+        object_dynamic_cast((Object *)iommu, TYPE_INTEL_IOMMU_DEVICE) &&
+        object_dynamic_cast((Object *)dev, "vfio-pci")) {
+        intel_iommu = INTEL_IOMMU_DEVICE(iommu);
+        if (!intel_iommu->caching_mode) {
+            error_setg(errp, "Device assignment is not allowed without "
+                       "enabling caching-mode=on for Intel IOMMU.");
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static void pc_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
···
     pcmc->pvh_enabled = true;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = pc_get_hotplug_handler;
+    mc->hotplug_allowed = pc_hotplug_allowed;
     mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
     mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
     mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
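With this hook wired up, a vfio-pci hotplug attempt on a machine whose intel-iommu lacks caching-mode is rejected up front instead of the notifier-flag check in intel_iommu.c killing the guest later. A hypothetical monitor session (the host PCI address is illustrative; the error text is the one set by pc_hotplug_allowed above):

```
(qemu) device_add vfio-pci,host=01:00.0,id=hostdev0
Error: Device assignment is not allowed without enabling caching-mode=on for Intel IOMMU.
```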
+319 -23
hw/virtio/virtio-mmio.c
···
 OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)

 #define VIRT_MAGIC 0x74726976 /* 'virt' */
-#define VIRT_VERSION 1
+#define VIRT_VERSION 2
+#define VIRT_VERSION_LEGACY 1
 #define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
+
+typedef struct VirtIOMMIOQueue {
+    uint16_t num;
+    bool enabled;
+    uint32_t desc[2];
+    uint32_t avail[2];
+    uint32_t used[2];
+} VirtIOMMIOQueue;

 typedef struct {
     /* Generic */
     SysBusDevice parent_obj;
     MemoryRegion iomem;
     qemu_irq irq;
+    bool legacy;
     /* Guest accessible state needing migration and reset */
     uint32_t host_features_sel;
     uint32_t guest_features_sel;
···
     /* virtio-bus */
     VirtioBusState bus;
     bool format_transport_address;
+    /* Fields only used for non-legacy (v2) devices */
+    uint32_t guest_features[2];
+    VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
 } VirtIOMMIOProxy;

 static bool virtio_mmio_ioeventfd_enabled(DeviceState *d)
···
     case VIRTIO_MMIO_MAGIC_VALUE:
         return VIRT_MAGIC;
     case VIRTIO_MMIO_VERSION:
-        return VIRT_VERSION;
+        if (proxy->legacy) {
+            return VIRT_VERSION_LEGACY;
+        } else {
+            return VIRT_VERSION;
+        }
     case VIRTIO_MMIO_VENDOR_ID:
         return VIRT_VENDOR;
     default:
···
     case VIRTIO_MMIO_MAGIC_VALUE:
         return VIRT_MAGIC;
     case VIRTIO_MMIO_VERSION:
-        return VIRT_VERSION;
+        if (proxy->legacy) {
+            return VIRT_VERSION_LEGACY;
+        } else {
+            return VIRT_VERSION;
+        }
     case VIRTIO_MMIO_DEVICE_ID:
         return vdev->device_id;
     case VIRTIO_MMIO_VENDOR_ID:
         return VIRT_VENDOR;
     case VIRTIO_MMIO_DEVICE_FEATURES:
-        if (proxy->host_features_sel) {
-            return 0;
+        if (proxy->legacy) {
+            if (proxy->host_features_sel) {
+                return 0;
+            } else {
+                return vdev->host_features;
+            }
+        } else {
+            VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+            return (vdev->host_features & ~vdc->legacy_features)
+                >> (32 * proxy->host_features_sel);
         }
-        return vdev->host_features;
     case VIRTIO_MMIO_QUEUE_NUM_MAX:
         if (!virtio_queue_get_num(vdev, vdev->queue_sel)) {
             return 0;
         }
         return VIRTQUEUE_MAX_SIZE;
     case VIRTIO_MMIO_QUEUE_PFN:
+        if (!proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: read from legacy register (0x%"
+                          HWADDR_PRIx ") in non-legacy mode\n",
+                          __func__, offset);
+            return 0;
+        }
         return virtio_queue_get_addr(vdev, vdev->queue_sel)
             >> proxy->guest_page_shift;
+    case VIRTIO_MMIO_QUEUE_READY:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: read from non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return 0;
+        }
+        return proxy->vqs[vdev->queue_sel].enabled;
     case VIRTIO_MMIO_INTERRUPT_STATUS:
         return atomic_read(&vdev->isr);
     case VIRTIO_MMIO_STATUS:
         return vdev->status;
+    case VIRTIO_MMIO_CONFIG_GENERATION:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: read from non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return 0;
+        }
+        return vdev->generation;
     case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
     case VIRTIO_MMIO_DRIVER_FEATURES:
     case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
···
     case VIRTIO_MMIO_QUEUE_ALIGN:
     case VIRTIO_MMIO_QUEUE_NOTIFY:
     case VIRTIO_MMIO_INTERRUPT_ACK:
+    case VIRTIO_MMIO_QUEUE_DESC_LOW:
+    case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+    case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+    case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+    case VIRTIO_MMIO_QUEUE_USED_LOW:
+    case VIRTIO_MMIO_QUEUE_USED_HIGH:
         qemu_log_mask(LOG_GUEST_ERROR,
-                      "%s: read of write-only register\n",
-                      __func__);
+                      "%s: read of write-only register (0x%" HWADDR_PRIx ")\n",
+                      __func__, offset);
         return 0;
     default:
-        qemu_log_mask(LOG_GUEST_ERROR, "%s: bad register offset\n", __func__);
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: bad register offset (0x%" HWADDR_PRIx ")\n",
+                      __func__, offset);
         return 0;
     }
     return 0;
···
     }
     switch (offset) {
     case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
-        proxy->host_features_sel = value;
+        if (value) {
+            proxy->host_features_sel = 1;
+        } else {
+            proxy->host_features_sel = 0;
+        }
         break;
     case VIRTIO_MMIO_DRIVER_FEATURES:
-        if (!proxy->guest_features_sel) {
-            virtio_set_features(vdev, value);
+        if (proxy->legacy) {
+            if (proxy->guest_features_sel) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                              "%s: attempt to write guest features with "
+                              "guest_features_sel > 0 in legacy mode\n",
+                              __func__);
+            } else {
+                virtio_set_features(vdev, value);
+            }
+        } else {
+            proxy->guest_features[proxy->guest_features_sel] = value;
         }
         break;
     case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
-        proxy->guest_features_sel = value;
+        if (value) {
+            proxy->guest_features_sel = 1;
+        } else {
+            proxy->guest_features_sel = 0;
+        }
         break;
     case VIRTIO_MMIO_GUEST_PAGE_SIZE:
+        if (!proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to legacy register (0x%"
+                          HWADDR_PRIx ") in non-legacy mode\n",
+                          __func__, offset);
+            return;
+        }
         proxy->guest_page_shift = ctz32(value);
         if (proxy->guest_page_shift > 31) {
             proxy->guest_page_shift = 0;
···
         break;
     case VIRTIO_MMIO_QUEUE_NUM:
         trace_virtio_mmio_queue_write(value, VIRTQUEUE_MAX_SIZE);
-        virtio_queue_set_num(vdev, vdev->queue_sel, value);
-        /* Note: only call this function for legacy devices */
-        virtio_queue_update_rings(vdev, vdev->queue_sel);
+        if (proxy->legacy) {
+            virtio_queue_set_num(vdev, vdev->queue_sel, value);
+            virtio_queue_update_rings(vdev, vdev->queue_sel);
+        } else {
+            proxy->vqs[vdev->queue_sel].num = value;
+        }
         break;
     case VIRTIO_MMIO_QUEUE_ALIGN:
-        /* Note: this is only valid for legacy devices */
+        if (!proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to legacy register (0x%"
+                          HWADDR_PRIx ") in non-legacy mode\n",
+                          __func__, offset);
+            return;
+        }
         virtio_queue_set_align(vdev, vdev->queue_sel, value);
         break;
     case VIRTIO_MMIO_QUEUE_PFN:
+        if (!proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to legacy register (0x%"
+                          HWADDR_PRIx ") in non-legacy mode\n",
+                          __func__, offset);
+            return;
+        }
         if (value == 0) {
             virtio_reset(vdev);
         } else {
···
                                    value << proxy->guest_page_shift);
         }
         break;
+    case VIRTIO_MMIO_QUEUE_READY:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        if (value) {
+            virtio_queue_set_num(vdev, vdev->queue_sel,
+                                 proxy->vqs[vdev->queue_sel].num);
+            virtio_queue_set_rings(vdev, vdev->queue_sel,
+                ((uint64_t)proxy->vqs[vdev->queue_sel].desc[1]) << 32 |
+                proxy->vqs[vdev->queue_sel].desc[0],
+                ((uint64_t)proxy->vqs[vdev->queue_sel].avail[1]) << 32 |
+                proxy->vqs[vdev->queue_sel].avail[0],
+                ((uint64_t)proxy->vqs[vdev->queue_sel].used[1]) << 32 |
+                proxy->vqs[vdev->queue_sel].used[0]);
+            proxy->vqs[vdev->queue_sel].enabled = 1;
+        } else {
+            proxy->vqs[vdev->queue_sel].enabled = 0;
+        }
+        break;
     case VIRTIO_MMIO_QUEUE_NOTIFY:
         if (value < VIRTIO_QUEUE_MAX) {
             virtio_queue_notify(vdev, value);
···
             virtio_mmio_stop_ioeventfd(proxy);
         }

+        if (!proxy->legacy && (value & VIRTIO_CONFIG_S_FEATURES_OK)) {
+            virtio_set_features(vdev,
+                                ((uint64_t)proxy->guest_features[1]) << 32 |
+                                proxy->guest_features[0]);
+        }
+
         virtio_set_status(vdev, value & 0xff);

         if (value & VIRTIO_CONFIG_S_DRIVER_OK) {
···
             virtio_reset(vdev);
         }
         break;
+    case VIRTIO_MMIO_QUEUE_DESC_LOW:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].desc[0] = value;
+        break;
+    case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].desc[1] = value;
+        break;
+    case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].avail[0] = value;
+        break;
+    case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].avail[1] = value;
+        break;
+    case VIRTIO_MMIO_QUEUE_USED_LOW:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].used[0] = value;
+        break;
+    case VIRTIO_MMIO_QUEUE_USED_HIGH:
+        if (proxy->legacy) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: write to non-legacy register (0x%"
+                          HWADDR_PRIx ") in legacy mode\n",
+                          __func__, offset);
+            return;
+        }
+        proxy->vqs[vdev->queue_sel].used[1] = value;
+        break;
     case VIRTIO_MMIO_MAGIC_VALUE:
     case VIRTIO_MMIO_VERSION:
     case VIRTIO_MMIO_DEVICE_ID:
···
     case VIRTIO_MMIO_DEVICE_FEATURES:
     case VIRTIO_MMIO_QUEUE_NUM_MAX:
     case VIRTIO_MMIO_INTERRUPT_STATUS:
+    case VIRTIO_MMIO_CONFIG_GENERATION:
         qemu_log_mask(LOG_GUEST_ERROR,
-                      "%s: write to readonly register\n",
-                      __func__);
+                      "%s: write to read-only register (0x%" HWADDR_PRIx ")\n",
+                      __func__, offset);
         break;

     default:
-        qemu_log_mask(LOG_GUEST_ERROR, "%s: bad register offset\n", __func__);
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: bad register offset (0x%" HWADDR_PRIx ")\n",
+                      __func__, offset);
     }
 }

+static const MemoryRegionOps virtio_legacy_mem_ops = {
+    .read = virtio_mmio_read,
+    .write = virtio_mmio_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
 static const MemoryRegionOps virtio_mem_ops = {
     .read = virtio_mmio_read,
     .write = virtio_mmio_write,
-    .endianness = DEVICE_NATIVE_ENDIAN,
+    .endianness = DEVICE_LITTLE_ENDIAN,
 };

 static void virtio_mmio_update_irq(DeviceState *opaque, uint16_t vector)
···
     qemu_put_be32(f, proxy->guest_page_shift);
 }

+static const VMStateDescription vmstate_virtio_mmio_queue_state = {
+    .name = "virtio_mmio/queue_state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT16(num, VirtIOMMIOQueue),
+        VMSTATE_BOOL(enabled, VirtIOMMIOQueue),
+        VMSTATE_UINT32_ARRAY(desc, VirtIOMMIOQueue, 2),
+        VMSTATE_UINT32_ARRAY(avail, VirtIOMMIOQueue, 2),
+        VMSTATE_UINT32_ARRAY(used, VirtIOMMIOQueue, 2),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static const VMStateDescription vmstate_virtio_mmio_state_sub = {
+    .name = "virtio_mmio/state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(guest_features, VirtIOMMIOProxy, 2),
+        VMSTATE_STRUCT_ARRAY(vqs, VirtIOMMIOProxy, VIRTIO_QUEUE_MAX, 0,
+                             vmstate_virtio_mmio_queue_state,
+                             VirtIOMMIOQueue),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static const VMStateDescription vmstate_virtio_mmio = {
+    .name = "virtio_mmio",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_virtio_mmio_state_sub,
+        NULL
+    }
+};
+
+static void virtio_mmio_save_extra_state(DeviceState *opaque, QEMUFile *f)
+{
+    VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
+
+    vmstate_save_state(f, &vmstate_virtio_mmio, proxy, NULL);
+}
+
+static int virtio_mmio_load_extra_state(DeviceState *opaque, QEMUFile *f)
+{
+    VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
+
+    return vmstate_load_state(f, &vmstate_virtio_mmio, proxy, 1);
+}
+
+static bool virtio_mmio_has_extra_state(DeviceState *opaque)
+{
+    VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
+
+    return !proxy->legacy;
+}
+
 static void virtio_mmio_reset(DeviceState *d)
 {
     VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
+    int i;

     virtio_mmio_stop_ioeventfd(proxy);
     virtio_bus_reset(&proxy->bus);
     proxy->host_features_sel = 0;
     proxy->guest_features_sel = 0;
     proxy->guest_page_shift = 0;
+
+    if (!proxy->legacy) {
+        proxy->guest_features[0] = proxy->guest_features[1] = 0;
+
+        for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+            proxy->vqs[i].enabled = 0;
+            proxy->vqs[i].num = 0;
+            proxy->vqs[i].desc[0] = proxy->vqs[i].desc[1] = 0;
+            proxy->vqs[i].avail[0] = proxy->vqs[i].avail[1] = 0;
+            proxy->vqs[i].used[0] = proxy->vqs[i].used[1] = 0;
+        }
+    }
 }

 static int virtio_mmio_set_guest_notifier(DeviceState *d, int n, bool assign,
···
     return r;
 }

+static void virtio_mmio_pre_plugged(DeviceState *d, Error **errp)
+{
+    VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
+    VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+    if (!proxy->legacy) {
+        virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
+    }
+}
+
 /* virtio-mmio device */

 static Property virtio_mmio_properties[] = {
     DEFINE_PROP_BOOL("format_transport_address", VirtIOMMIOProxy,
                      format_transport_address, true),
+    DEFINE_PROP_BOOL("force-legacy", VirtIOMMIOProxy, legacy, true),
     DEFINE_PROP_END_OF_LIST(),
 };
···
     qbus_create_inplace(&proxy->bus, sizeof(proxy->bus), TYPE_VIRTIO_MMIO_BUS,
                         d, NULL);
     sysbus_init_irq(sbd, &proxy->irq);
-    memory_region_init_io(&proxy->iomem, OBJECT(d), &virtio_mem_ops, proxy,
-                          TYPE_VIRTIO_MMIO, 0x200);
+    if (proxy->legacy) {
+        memory_region_init_io(&proxy->iomem, OBJECT(d),
+                              &virtio_legacy_mem_ops, proxy,
+                              TYPE_VIRTIO_MMIO, 0x200);
+    } else {
+        memory_region_init_io(&proxy->iomem, OBJECT(d),
+                              &virtio_mem_ops, proxy,
+                              TYPE_VIRTIO_MMIO, 0x200);
+    }
     sysbus_init_mmio(sbd, &proxy->iomem);
 }
···
     k->notify = virtio_mmio_update_irq;
     k->save_config = virtio_mmio_save_config;
     k->load_config = virtio_mmio_load_config;
+    k->save_extra_state = virtio_mmio_save_extra_state;
+    k->load_extra_state = virtio_mmio_load_extra_state;
+    k->has_extra_state = virtio_mmio_has_extra_state;
     k->set_guest_notifiers = virtio_mmio_set_guest_notifiers;
     k->ioeventfd_enabled = virtio_mmio_ioeventfd_enabled;
     k->ioeventfd_assign = virtio_mmio_ioeventfd_assign;
+    k->pre_plugged = virtio_mmio_pre_plugged;
     k->has_variable_vring_alignment = true;
     bus_class->max_dev = 1;
     bus_class->get_dev_path = virtio_mmio_bus_get_dev_path;
+9
include/hw/boards.h
···
  * The function pointer to hook different machine specific functions for
  * parsing "smp-opts" from QemuOpts to MachineState::CpuTopology and more
  * machine specific topology fields, such as smp_dies for PCMachine.
+ * @hotplug_allowed:
+ *    If the hook is provided, then it'll be called for each device
+ *    hotplug to check whether the device hotplug is allowed.  Return
+ *    true to grant allowance or false to reject the hotplug.  When
+ *    false is returned, an error must be set to show the reason of
+ *    the rejection.  If the hook is not provided, all hotplug will be
+ *    allowed.
  */
 struct MachineClass {
     /*< private >*/
···

     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
+    bool (*hotplug_allowed)(MachineState *state, DeviceState *dev,
+                            Error **errp);
     CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
                                                          unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
+1
include/hw/qdev-core.h
···
                                       int required_for_version);
 HotplugHandler *qdev_get_bus_hotplug_handler(DeviceState *dev);
 HotplugHandler *qdev_get_machine_hotplug_handler(DeviceState *dev);
+bool qdev_hotplug_allowed(DeviceState *dev, Error **errp);
 /**
  * qdev_get_hotplug_handler: Get handler responsible for device wiring
  *
+7
qdev-monitor.c
···
     /* create device */
     dev = DEVICE(object_new(driver));

+    /* Check whether the hotplug is allowed by the machine */
+    if (qdev_hotplug && !qdev_hotplug_allowed(dev, &err)) {
+        /* Error must be set in the machine hook */
+        assert(err);
+        goto err_del_dev;
+    }
+
     if (bus) {
         qdev_set_parent_bus(dev, bus);
     } else if (qdev_hotplug && !qdev_get_machine_hotplug_handler(dev)) {