qcow2: Forbid discard in qcow2 v2 images with backing files

A discard request deallocates the selected clusters so they read back
as zeroes. This is done by clearing the cluster offset field and
setting QCOW_OFLAG_ZERO in the L2 entry.
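
The L2 entry update described above can be sketched roughly as follows.
This is a toy model, not QEMU's code: the two constants are assumed to
match the qcow2 on-disk format (bit 0 as the zero flag, bits 9-55 as the
host cluster offset), and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the qcow2 format: bit 0 of a standard L2 entry marks the
 * cluster as reading back zeroes, bits 9-55 hold the host cluster offset. */
#define QCOW_OFLAG_ZERO (1ULL << 0)
#define L2E_OFFSET_MASK 0x00fffffffffffe00ULL

/* Hypothetical helper: host offset stored in an L2 entry (0 = unallocated). */
static uint64_t toy_l2_offset(uint64_t l2_entry)
{
    return l2_entry & L2E_OFFSET_MASK;
}

/* Hypothetical helper: the L2 entry written by a discard on a v3 image.
 * The old offset is dropped and only the zero flag remains. */
static uint64_t toy_discard_l2_entry(uint64_t old_entry)
{
    (void)old_entry;
    return QCOW_OFLAG_ZERO;
}

/* Usage: a cluster at host offset 0x50000, before and after the discard. */
static void toy_discard_example(void)
{
    uint64_t before = 0x50000ULL;
    uint64_t after = toy_discard_l2_entry(before);

    assert(toy_l2_offset(before) == 0x50000ULL);
    assert(toy_l2_offset(after) == 0);
    assert(after & QCOW_OFLAG_ZERO);
}
```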

This flag is, however, only supported when qcow_version >= 3. In older
images the cluster is simply deallocated, which exposes any stale
data in the backing file.
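
To make the difference concrete, here is a minimal sketch of where a
guest read would be served from after a discard. It is a simplified
model under stated assumptions: the constants are assumed to match the
qcow2 format, and the enum and the toy_* functions are hypothetical
names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed from the qcow2 format (see the L2 entry layout). */
#define QCOW_OFLAG_ZERO (1ULL << 0)
#define L2E_OFFSET_MASK 0x00fffffffffffe00ULL

enum toy_source { TOY_READ_IMAGE, TOY_READ_ZEROES, TOY_READ_BACKING };

/* Hypothetical model: where a read of one cluster is served from. */
static enum toy_source toy_resolve_read(uint64_t l2_entry, bool has_backing)
{
    if (l2_entry & QCOW_OFLAG_ZERO) {
        return TOY_READ_ZEROES;
    }
    if (l2_entry & L2E_OFFSET_MASK) {
        return TOY_READ_IMAGE;
    }
    /* Unallocated: fall through to the backing file if there is one. */
    return has_backing ? TOY_READ_BACKING : TOY_READ_ZEROES;
}

/* Hypothetical: the L2 entry left behind by a discard, per image version. */
static uint64_t toy_entry_after_discard(int qcow_version)
{
    return qcow_version >= 3 ? QCOW_OFLAG_ZERO : 0;
}

static void toy_stale_data_example(void)
{
    /* v3 with a backing file: the zero flag hides the backing data. */
    assert(toy_resolve_read(toy_entry_after_discard(3), true)
           == TOY_READ_ZEROES);
    /* v2 with a backing file: the read falls through to stale data. */
    assert(toy_resolve_read(toy_entry_after_discard(2), true)
           == TOY_READ_BACKING);
    /* v2 without a backing file: unallocated clusters read as zeroes. */
    assert(toy_resolve_read(toy_entry_after_discard(2), false)
           == TOY_READ_ZEROES);
}
```

Only the middle case is a problem, which is why the check in this patch
looks at both the image version and the presence of a backing file.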

Since discard is an advisory operation it's safer to simply forbid it
(by returning -ENOTSUP) when a v2 image has a backing file.

Note that we are adding this check to qcow2_co_pdiscard() and not to
qcow2_cluster_discard() or discard_in_l2_slice() because the latter
two are also used by qcow2_snapshot_create() to discard the clusters
used by the VM state. In this case there's no risk of exposing stale
data to the guest and we really do want the clusters to always be
discarded.
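
The placement argument can be sketched as a guard on the guest-facing
entry point only, leaving the internal helper unconditional. This is a
toy model: struct toy_image and the toy_* functions are hypothetical
stand-ins, not the real QEMU signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the image state relevant to the check. */
struct toy_image {
    int qcow_version;
    bool has_backing;
    int clusters_discarded;
};

/* Internal helper: always discards. In the real code this role is played
 * by qcow2_cluster_discard(), which the snapshot code also relies on. */
static int toy_cluster_discard(struct toy_image *img)
{
    img->clusters_discarded++;
    return 0;
}

/* Guest-facing entry point: refuses the discard when it could expose
 * stale backing file data, mirroring the check added by this patch. */
static int toy_co_pdiscard(struct toy_image *img)
{
    if (img->qcow_version < 3 && img->has_backing) {
        return -ENOTSUP;
    }
    return toy_cluster_discard(img);
}

static void toy_guard_example(void)
{
    struct toy_image v2 = { .qcow_version = 2, .has_backing = true };

    /* A guest discard is rejected and leaves the image untouched. */
    assert(toy_co_pdiscard(&v2) == -ENOTSUP);
    assert(v2.clusters_discarded == 0);

    /* The internal path (e.g. VM state cleanup) still discards. */
    assert(toy_cluster_discard(&v2) == 0);
    assert(v2.clusters_discarded == 1);
}
```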

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20200331114345.29993-1-berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

Authored by Alberto Garcia and committed by Max Reitz
80f5c011 53ef8a92

+187 -19
+6
block/qcow2.c
···
     int ret;
     BDRVQcow2State *s = bs->opaque;
 
+    /* If the image does not support QCOW_OFLAG_ZERO then discarding
+     * clusters could expose stale data from the backing file. */
+    if (s->qcow_version < 3 && bs->backing) {
+        return -ENOTSUP;
+    }
+
     if (!QEMU_IS_ALIGNED(offset | bytes, s->cluster_size)) {
         assert(bytes < s->cluster_size);
         /* Ignore partial clusters, except for the special case of the
+6 -4
tests/qemu-iotests/046
···
 verify_io()
 {
     if ($QEMU_IMG info -U -f "$IMGFMT" "$TEST_IMG" | grep "compat: 0.10" > /dev/null); then
-        # For v2 images, discarded clusters are read from the backing file
-        # Keep the variable empty so that the backing file value can be used as
+        # In v2 images clusters are not discarded when there is a backing file.
+        # Keep the variable empty so that the previous value can be used as
         # the default below
         discarded=
     else
···
     echo read -P 70 0x78000 0x6000
     echo read -P 7 0x7e000 0x2000
 
-    echo read -P ${discarded:-8} 0x80000 0x6000
+    echo read -P ${discarded:-89} 0x80000 0x1000
+    echo read -P ${discarded:-8} 0x81000 0x5000
     echo read -P 80 0x86000 0x2000
     echo read -P ${discarded:-8} 0x88000 0x2000
     echo read -P 81 0x8a000 0xe000
     echo read -P 90 0x98000 0x6000
     echo read -P 9 0x9e000 0x2000
 
-    echo read -P ${discarded:-10} 0xa0000 0x6000
+    echo read -P ${discarded:-109} 0xa0000 0x1000
+    echo read -P ${discarded:-10} 0xa1000 0x5000
     echo read -P 100 0xa6000 0x2000
     echo read -P ${discarded:-10} 0xa8000 0x2000
     echo read -P 101 0xaa000 0xe000
+8 -4
tests/qemu-iotests/046.out
···
 24 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 516096
 8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 24576/24576 bytes at offset 524288
-24 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 524288
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 20480/20480 bytes at offset 528384
+20 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 548864
 8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 557056
···
 24 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 647168
 8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 24576/24576 bytes at offset 655360
-24 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 655360
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 20480/20480 bytes at offset 659456
+20 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 679936
 8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 8192/8192 bytes at offset 688128
+5 -7
tests/qemu-iotests/060
···
 
 $QEMU_IO -c 'write 0k 64k' "$BACKING_IMG" | _filter_qemu_io
 
-# compat=0.10 is required in order to make the following discard actually
-# unallocate the sector rather than make it a zero sector - we want COW, after
-# all.
-_make_test_img -o 'compat=0.10' -b "$BACKING_IMG" 1G
+_make_test_img -b "$BACKING_IMG" 1G
 # Write two clusters, the second one enforces creation of an L2 table after
 # the first data cluster.
 $QEMU_IO -c 'write 0k 64k' -c 'write 512M 64k' "$TEST_IMG" | _filter_qemu_io
-# Discard the first cluster. This cluster will soon enough be reallocated and
+# Free the first cluster. This cluster will soon enough be reallocated and
 # used for COW.
-$QEMU_IO -c 'discard 0k 64k' "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" "$l2_offset" "\x00\x00\x00\x00\x00\x00\x00\x00"
+poke_file "$TEST_IMG" "$(($rb_offset+10))" "\x00\x00"
 # Now, corrupt the image by marking the second L2 table cluster as free.
-poke_file "$TEST_IMG" '131084' "\x00\x00" # 0x2000c
+poke_file "$TEST_IMG" "$(($rb_offset+12))" "\x00\x00"
 # Start a write operation requiring COW on the image stopping it right before
 # doing the read; then, trigger the corruption prevention by writing anything to
 # any unallocated cluster, leading to an attempt to overwrite the second L2
-2
tests/qemu-iotests/060.out
···
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 65536/65536 bytes at offset 536870912
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-discard 65536/65536 bytes at offset 0
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L2 table); further corruption events will be suppressed
 blkdebug: Suspended request '0'
 write failed: Input/output error
+3 -2
tests/qemu-iotests/177
···
 {
     if ($QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" |
         grep "compat: 0.10" > /dev/null); then
-        # For v2 images, discarded clusters are read from the backing file
-        discarded=11
+        # In v2 images clusters are not discarded when there is a backing file
+        # so the previous value is read
+        discarded=22
     else
         # Discarded clusters are zeroed for v3 or later
         discarded=0
+97
tests/qemu-iotests/290
···
+#!/usr/bin/env bash
+#
+# Test how 'qemu-io -c discard' behaves on v2 and v3 qcow2 images
+#
+# Copyright (C) 2020 Igalia, S.L.
+# Author: Alberto Garcia <berto@igalia.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=berto@igalia.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_unsupported_imgopts 'compat=0.10' refcount_bits data_file
+
+echo
+echo "### Test 'qemu-io -c discard' on a QCOW2 image without a backing file"
+echo
+for qcow2_compat in 0.10 1.1; do
+    echo "# Create an image with compat=$qcow2_compat without a backing file"
+    _make_test_img -o "compat=$qcow2_compat" 128k
+
+    echo "# Fill all clusters with data and then discard them"
+    $QEMU_IO -c 'write -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    $QEMU_IO -c 'discard 0 128k' "$TEST_IMG" | _filter_qemu_io
+
+    echo "# Read the data from the discarded clusters"
+    $QEMU_IO -c 'read -P 0x00 0 128k' "$TEST_IMG" | _filter_qemu_io
+
+    echo "# Output of qemu-img map"
+    $QEMU_IMG map "$TEST_IMG" | _filter_testdir
+done
+
+echo
+echo "### Test 'qemu-io -c discard' on a QCOW2 image with a backing file"
+echo
+
+echo "# Create a backing image and fill it with data"
+BACKING_IMG="$TEST_IMG.base"
+TEST_IMG="$BACKING_IMG" _make_test_img 128k
+$QEMU_IO -c 'write -P 0xff 0 128k' "$BACKING_IMG" | _filter_qemu_io
+
+for qcow2_compat in 0.10 1.1; do
+    echo "# Create an image with compat=$qcow2_compat and a backing file"
+    _make_test_img -o "compat=$qcow2_compat" -b "$BACKING_IMG"
+
+    echo "# Fill all clusters with data and then discard them"
+    $QEMU_IO -c 'write -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    $QEMU_IO -c 'discard 0 128k' "$TEST_IMG" | _filter_qemu_io
+
+    echo "# Read the data from the discarded clusters"
+    if [ "$qcow2_compat" = "1.1" ]; then
+        # In qcow2 v3 clusters are zeroed (with QCOW_OFLAG_ZERO)
+        $QEMU_IO -c 'read -P 0x00 0 128k' "$TEST_IMG" | _filter_qemu_io
+    else
+        # In qcow2 v2 if there's a backing image we cannot zero the clusters
+        # without exposing the backing file data so discard does nothing
+        $QEMU_IO -c 'read -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    fi
+
+    echo "# Output of qemu-img map"
+    $QEMU_IMG map "$TEST_IMG" | _filter_testdir
+done
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
+61
tests/qemu-iotests/290.out
···
+QA output created by 290
+
+### Test 'qemu-io -c discard' on a QCOW2 image without a backing file
+
+# Create an image with compat=0.10 without a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Output of qemu-img map
+Offset          Length          Mapped to       File
+# Create an image with compat=1.1 without a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Output of qemu-img map
+Offset          Length          Mapped to       File
+
+### Test 'qemu-io -c discard' on a QCOW2 image with a backing file
+
+# Create a backing image and fill it with data
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=131072
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Create an image with compat=0.10 and a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.base
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Output of qemu-img map
+Offset          Length          Mapped to       File
+0               0x20000         0x50000         TEST_DIR/t.qcow2
+# Create an image with compat=1.1 and a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.base
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Output of qemu-img map
+Offset          Length          Mapped to       File
+*** done
+1
tests/qemu-iotests/group
···
 286 rw quick
 288 quick
 289 rw quick
+290 rw auto quick