
fw-cfg: support writeable blobs

Useful to send guest data back to QEMU.

Changes from Laszlo Ersek <lersek@redhat.com>:
- rebase the patch from Michael Tsirkin's original postings at [1] and [2]
  to the following patches:
  - loader: Allow a custom AddressSpace when loading ROMs
  - loader: Add AddressSpace loading support to uImages
  - loader: fix handling of custom address spaces when adding ROM blobs
- reject such writes immediately that would exceed the end of the array,
  rather than performing a partial write before setting the error bit: see
  the (len != dma.length) condition
- document the write interface

[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg04968.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg02735.html

Cc: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: qemu-arm@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

+75 -30
+25 -7
docs/specs/fw_cfg.txt
@@ -33,6 +33,10 @@
 NOTE: As of QEMU v2.4, writes to the fw_cfg data register are no
 longer supported, and will be ignored (treated as no-ops)!
 
+NOTE: As of QEMU v2.9, writes are reinstated, but only through the DMA
+interface (see below). Furthermore, writeability of any specific item is
+governed independently of Bit14 in the selector key value.
+
 Bit15 of the selector register indicates whether the configuration
 setting is architecture specific. A value of 0 means the item is a
 generic configuration item. A value of 1 means the item is specific
@@ -43,7 +47,7 @@
 
 == Data Register ==
 
-* Read/Write (writes ignored as of QEMU v2.4)
+* Read/Write (writes ignored as of QEMU v2.4, but see the DMA interface)
 * Location: platform dependent (IOport [*] or MMIO)
 * Width: 8-bit (if IOport), 8/16/32/64-bit (if MMIO)
 * Endianness: string-preserving
@@ -134,8 +138,8 @@
 
 === All Other Data Items ===
 
-Please consult the QEMU source for the most up-to-date and authoritative
-list of selector keys and their respective items' purpose and format.
+Please consult the QEMU source for the most up-to-date and authoritative list
+of selector keys and their respective items' purpose, format and writeability.
 
 === Ranges ===
 
@@ -144,9 +148,11 @@
 
 Selector Reg.    Range Usage
 ---------------  -----------
-0x0000 - 0x3fff  Generic (0x0000 - 0x3fff, RO)
+0x0000 - 0x3fff  Generic (0x0000 - 0x3fff, generally RO, possibly RW through
+                 the DMA interface in QEMU v2.9+)
 0x4000 - 0x7fff  Generic (0x0000 - 0x3fff, RW, ignored in QEMU v2.4+)
-0x8000 - 0xbfff  Arch. Specific (0x0000 - 0x3fff, RO)
+0x8000 - 0xbfff  Arch. Specific (0x0000 - 0x3fff, generally RO, possibly RW
+                 through the DMA interface in QEMU v2.9+)
 0xc000 - 0xffff  Arch. Specific (0x0000 - 0x3fff, RW, ignored in v2.4+)
 
 In practice, the number of allowed firmware configuration items is given
@@ -182,6 +188,7 @@
  - Bit 1: Read
  - Bit 2: Skip
  - Bit 3: Select. The upper 16 bits are the selected index.
+ - Bit 4: Write
 
 When an operation is triggered, if the "control" field has bit 3 set, the
 upper 16 bits are interpreted as an index of a firmware configuration item.
@@ -191,8 +198,17 @@
 "length" bytes for the current selector and offset will be copied into the
 physical RAM address specified by the "address" field.
 
-If the "control" field has bit 2 set (and not bit 1), a skip operation will be
-performed. The offset for the current selector will be advanced "length" bytes.
+If the "control" field has bit 4 set (and not bit 1), a write operation will be
+performed. "length" bytes will be copied from the physical RAM address
+specified by the "address" field to the current selector and offset. QEMU
+prevents starting or finishing the write beyond the end of the item associated
+with the current selector (i.e., the item cannot be resized). Truncated writes
+are dropped entirely. Writes to read-only items are also rejected. All of these
+write errors set bit 0 (the error bit) in the "control" field.
+
+If the "control" field has bit 2 set (and neither bit 1 nor bit 4), a skip
+operation will be performed. The offset for the current selector will be
+advanced "length" bytes.
 
 To check the result, read the "control" field:
    error bit set -> something went wrong.
@@ -234,3 +250,5 @@
 
 Use of names not beginning with "opt/" is potentially dangerous and
 entirely unsupported. QEMU will warn if you try.
+
+All externally provided fw_cfg items are read-only to the guest.
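The write operation documented above can be illustrated from the guest's side. What follows is a minimal sketch, not code from this patch: it only fills in a DMA descriptor that selects an item and requests a write in one step. It assumes the descriptor layout given in the DMA section of fw_cfg.txt outside the hunks shown (a 32-bit "control", a 32-bit "length" and a 64-bit "address", all big-endian). The helper names `put_be32`, `put_be64` and `fw_cfg_build_write_desc` are hypothetical, and handing the descriptor to the device through the DMA address register is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Control bits as listed in the documentation hunk above. */
#define FW_CFG_DMA_CTL_ERROR  0x01
#define FW_CFG_DMA_CTL_READ   0x02
#define FW_CFG_DMA_CTL_SKIP   0x04
#define FW_CFG_DMA_CTL_SELECT 0x08
#define FW_CFG_DMA_CTL_WRITE  0x10

/* Hypothetical helper: store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: store a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)v);
}

/*
 * Hypothetical helper: fill a 16-byte descriptor that selects the item
 * `selector` (bit 3, index in the upper 16 bits of "control") and asks for
 * `len` bytes to be copied from guest-physical address `src_addr` into that
 * item (bit 4). Assumed layout: control at offset 0, length at offset 4,
 * address at offset 8.
 */
void fw_cfg_build_write_desc(uint8_t desc[16], uint16_t selector,
                             uint64_t src_addr, uint32_t len)
{
    uint32_t control;

    assert(desc != NULL);

    control = ((uint32_t)selector << 16)
            | FW_CFG_DMA_CTL_SELECT
            | FW_CFG_DMA_CTL_WRITE;

    put_be32(desc, control);
    put_be32(desc + 4, len);
    put_be64(desc + 8, src_addr);
}
```

After the transfer completes, the guest would re-read the "control" field; per the text above, bit 0 set means the write was rejected (read-only item, or a write that would run past the end of the item).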
+1 -1
hw/arm/virt-acpi-build.c
@@ -818,7 +818,7 @@
                                        uint64_t max_size)
 {
     return rom_add_blob(name, blob->data, acpi_data_len(blob), max_size, -1,
-                        name, virt_acpi_build_update, build_state, NULL);
+                        name, virt_acpi_build_update, build_state, NULL, true);
 }
 
 static const VMStateDescription vmstate_virt_acpi_build = {
+11 -7
hw/core/loader.c
@@ -853,7 +853,7 @@
     }
 }
 
-static void *rom_set_mr(Rom *rom, Object *owner, const char *name)
+static void *rom_set_mr(Rom *rom, Object *owner, const char *name, bool ro)
 {
     void *data;
 
@@ -862,7 +862,7 @@
                                       rom->datasize, rom->romsize,
                                       fw_cfg_resized,
                                       &error_fatal);
-    memory_region_set_readonly(rom->mr, true);
+    memory_region_set_readonly(rom->mr, ro);
     vmstate_register_ram_global(rom->mr);
 
     data = memory_region_get_ram_ptr(rom->mr);
@@ -942,7 +942,7 @@
         snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
 
         if ((!option_rom || mc->option_rom_has_mr) && mc->rom_file_has_mr) {
-            data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+            data = rom_set_mr(rom, OBJECT(fw_cfg), devpath, true);
         } else {
             data = rom->data;
         }
@@ -979,7 +979,7 @@
 MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
                            size_t max_len, hwaddr addr, const char *fw_file_name,
                            FWCfgReadCallback fw_callback, void *callback_opaque,
-                           AddressSpace *as)
+                           AddressSpace *as, bool read_only)
 {
     MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
     Rom *rom;
@@ -998,10 +998,14 @@
         char devpath[100];
         void *data;
 
-        snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
+        if (read_only) {
+            snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
+        } else {
+            snprintf(devpath, sizeof(devpath), "/ram@%s", fw_file_name);
+        }
 
         if (mc->rom_file_has_mr) {
-            data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+            data = rom_set_mr(rom, OBJECT(fw_cfg), devpath, read_only);
             mr = rom->mr;
         } else {
             data = rom->data;
@@ -1009,7 +1013,7 @@
 
         fw_cfg_add_file_callback(fw_cfg, fw_file_name,
                                  fw_callback, callback_opaque,
-                                 data, rom->datasize);
+                                 data, rom->datasize, read_only);
     }
     return mr;
 }
+2 -2
hw/i386/acpi-build.c
@@ -2806,7 +2806,7 @@
                                        uint64_t max_size)
 {
     return rom_add_blob(name, blob->data, acpi_data_len(blob), max_size, -1,
-                        name, acpi_build_update, build_state, NULL);
+                        name, acpi_build_update, build_state, NULL, true);
 }
 
 static const VMStateDescription vmstate_acpi_build = {
@@ -2872,7 +2872,7 @@
         build_state->rsdp = g_memdup(tables.rsdp->data, rsdp_size);
         fw_cfg_add_file_callback(pcms->fw_cfg, ACPI_BUILD_RSDP_FILE,
                                  acpi_build_update, build_state,
-                                 build_state->rsdp, rsdp_size);
+                                 build_state->rsdp, rsdp_size, true);
         build_state->rsdp_mr = NULL;
     } else {
         build_state->rsdp = NULL;
+1 -1
hw/lm32/lm32_hwsetup.h
@@ -75,7 +75,7 @@
         hwaddr base)
 {
     rom_add_blob("hwsetup", hw->data, TARGET_PAGE_SIZE,
-                 TARGET_PAGE_SIZE, base, NULL, NULL, NULL, NULL);
+                 TARGET_PAGE_SIZE, base, NULL, NULL, NULL, NULL, true);
 }
 
 static inline void hwsetup_add_u8(HWSetup *hw, uint8_t u)
+29 -8
hw/nvram/fw_cfg.c
@@ -54,11 +54,13 @@
 #define FW_CFG_DMA_CTL_READ   0x02
 #define FW_CFG_DMA_CTL_SKIP   0x04
 #define FW_CFG_DMA_CTL_SELECT 0x08
+#define FW_CFG_DMA_CTL_WRITE  0x10
 
 #define FW_CFG_DMA_SIGNATURE 0x51454d5520434647ULL /* "QEMU CFG" */
 
 typedef struct FWCfgEntry {
     uint32_t len;
+    bool allow_write;
     uint8_t *data;
     void *callback_opaque;
     FWCfgReadCallback read_callback;
@@ -326,7 +328,7 @@
     FWCfgDmaAccess dma;
     int arch;
     FWCfgEntry *e;
-    int read;
+    int read = 0, write = 0;
     dma_addr_t dma_addr;
 
     /* Reset the address before the next access */
@@ -353,8 +355,13 @@
 
     if (dma.control & FW_CFG_DMA_CTL_READ) {
         read = 1;
+        write = 0;
+    } else if (dma.control & FW_CFG_DMA_CTL_WRITE) {
+        read = 0;
+        write = 1;
     } else if (dma.control & FW_CFG_DMA_CTL_SKIP) {
         read = 0;
+        write = 0;
     } else {
         dma.length = 0;
     }
@@ -374,7 +381,9 @@
                     dma.control |= FW_CFG_DMA_CTL_ERROR;
                 }
             }
-
+            if (write) {
+                dma.control |= FW_CFG_DMA_CTL_ERROR;
+            }
         } else {
             if (dma.length <= (e->len - s->cur_offset)) {
                 len = dma.length;
@@ -391,6 +400,14 @@
                     dma.control |= FW_CFG_DMA_CTL_ERROR;
                 }
             }
+            if (write) {
+                if (!e->allow_write ||
+                    len != dma.length ||
+                    dma_memory_read(s->dma_as, dma.address,
+                                    &e->data[s->cur_offset], len)) {
+                    dma.control |= FW_CFG_DMA_CTL_ERROR;
+                }
+            }
 
             s->cur_offset += len;
         }
@@ -586,7 +603,8 @@
 static void fw_cfg_add_bytes_read_callback(FWCfgState *s, uint16_t key,
                                            FWCfgReadCallback callback,
                                            void *callback_opaque,
-                                           void *data, size_t len)
+                                           void *data, size_t len,
+                                           bool read_only)
 {
     int arch = !!(key & FW_CFG_ARCH_LOCAL);
 
@@ -599,6 +617,7 @@
     s->entries[arch][key].len = (uint32_t)len;
     s->entries[arch][key].read_callback = callback;
     s->entries[arch][key].callback_opaque = callback_opaque;
+    s->entries[arch][key].allow_write = !read_only;
 }
 
 static void *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key,
@@ -616,13 +635,14 @@
     s->entries[arch][key].data = data;
     s->entries[arch][key].len = len;
     s->entries[arch][key].callback_opaque = NULL;
+    s->entries[arch][key].allow_write = false;
 
     return ptr;
 }
 
 void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len)
 {
-    fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len);
+    fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len, true);
 }
 
 void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value)
@@ -749,7 +769,7 @@
 
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
                               FWCfgReadCallback callback, void *callback_opaque,
-                              void *data, size_t len)
+                              void *data, size_t len, bool read_only)
 {
     int i, index, count;
     size_t dsize;
@@ -811,7 +831,8 @@
     }
 
     fw_cfg_add_bytes_read_callback(s, FW_CFG_FILE_FIRST + index,
-                                   callback, callback_opaque, data, len);
+                                   callback, callback_opaque, data, len,
+                                   read_only);
 
     s->files->f[index].size = cpu_to_be32(len);
     s->files->f[index].select = cpu_to_be16(FW_CFG_FILE_FIRST + index);
@@ -824,7 +845,7 @@
 void fw_cfg_add_file(FWCfgState *s, const char *filename,
                      void *data, size_t len)
 {
-    fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
+    fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len, true);
 }
 
 void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
@@ -847,7 +868,7 @@
         }
     }
     /* add new one */
-    fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
+    fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len, true);
     return NULL;
 }
 
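The write branch added to the DMA handler above combines three checks: the item must be writable, the request must fit within what remains of the item, and the copy from guest memory must succeed. The standalone model below is a sketch for illustration only, not QEMU code. `ModelEntry` and `model_dma_write` are hypothetical names, a plain `memcpy` stands in for `dma_memory_read` (so the copy itself cannot fail here), and only a single pass of the handler's loop is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FW_CFG_DMA_CTL_ERROR 0x01

/* Hypothetical stand-in for the FWCfgEntry fields the write path uses. */
typedef struct {
    uint32_t len;       /* size of the item in bytes */
    bool allow_write;   /* set to !read_only when the item is registered */
    uint8_t *data;      /* backing storage of the item */
} ModelEntry;

/*
 * Hypothetical model of one write pass. Returns the control bits the handler
 * would OR in: 0 on success, FW_CFG_DMA_CTL_ERROR on rejection.
 */
uint32_t model_dma_write(ModelEntry *e, uint32_t *cur_offset,
                         const uint8_t *src, uint32_t dma_length)
{
    uint32_t control = 0;
    uint32_t remaining, len;

    assert(e != NULL && cur_offset != NULL && src != NULL);

    /* No data, or offset already at the end: every write is an error and
     * the offset is left alone, as in the first branch of the patch. */
    if (e->data == NULL || *cur_offset >= e->len) {
        return FW_CFG_DMA_CTL_ERROR;
    }

    /* Clamp the request to what is left of the item. */
    remaining = e->len - *cur_offset;
    len = dma_length <= remaining ? dma_length : remaining;

    if (!e->allow_write || len != dma_length) {
        /* Read-only item or overlong request: reject without copying. */
        control |= FW_CFG_DMA_CTL_ERROR;
    } else {
        memcpy(&e->data[*cur_offset], src, len);
    }

    /* As in the patch, the offset advances by the clamped length even when
     * the write was rejected. */
    *cur_offset += len;
    return control;
}
```

Because `len != dma.length` is tested before any copy, a request that would run past the end of the item changes no bytes at all, which is the "reject such writes immediately" behaviour called out in the commit message.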
+4 -3
include/hw/loader.h
@@ -180,7 +180,8 @@
                            size_t max_len, hwaddr addr,
                            const char *fw_file_name,
                            FWCfgReadCallback fw_callback,
-                           void *callback_opaque, AddressSpace *as);
+                           void *callback_opaque, AddressSpace *as,
+                           bool read_only);
 int rom_add_elf_program(const char *name, void *data, size_t datasize,
                         size_t romsize, hwaddr addr, AddressSpace *as);
 int rom_check_and_register_reset(void);
@@ -194,7 +195,7 @@
 #define rom_add_file_fixed(_f, _a, _i) \
     rom_add_file(_f, NULL, _a, _i, false, NULL, NULL)
 #define rom_add_blob_fixed(_f, _b, _l, _a) \
-    rom_add_blob(_f, _b, _l, _l, _a, NULL, NULL, NULL, NULL)
+    rom_add_blob(_f, _b, _l, _l, _a, NULL, NULL, NULL, NULL, true)
 #define rom_add_file_mr(_f, _mr, _i) \
     rom_add_file(_f, NULL, 0, _i, false, _mr, NULL)
 #define rom_add_file_as(_f, _as, _i) \
@@ -202,7 +203,7 @@
 #define rom_add_file_fixed_as(_f, _a, _i, _as) \
     rom_add_file(_f, NULL, _a, _i, false, NULL, _as)
 #define rom_add_blob_fixed_as(_f, _b, _l, _a, _as) \
-    rom_add_blob(_f, _b, _l, _l, _a, NULL, NULL, NULL, _as)
+    rom_add_blob(_f, _b, _l, _l, _a, NULL, NULL, NULL, _as, true)
 
 #define PC_ROM_MIN_VGA 0xc0000
 #define PC_ROM_MIN_OPTION 0xc8000
+2 -1
include/hw/nvram/fw_cfg.h
@@ -136,6 +136,7 @@
  * @callback_opaque: argument to be passed into callback function
  * @data: pointer to start of item data
  * @len: size of item data
+ * @read_only: is file read only
  *
  * Add a new NAMED fw_cfg item as a raw "blob" of the given size. The data
  * referenced by the starting pointer is only linked, NOT copied, into the
@@ -151,7 +152,7 @@
  */
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
                               FWCfgReadCallback callback, void *callback_opaque,
-                              void *data, size_t len);
+                              void *data, size_t len, bool read_only);
 
 /**
  * fw_cfg_modify_file: