qemu with hax to log dma reads & writes jcs.org/2018/11/12/vfio

qcow2: Repair OFLAG_COPIED when fixing leaks

Repairing OFLAG_COPIED is usually safe because it is done after the
refcounts have been repaired. Therefore, if we did not find anyone else
referencing a data or L2 cluster, it makes no sense to not set
OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is
always safe anyway; it may just induce leaks.

Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky)
refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED
will then be wrong. qemu-img check should not produce images that are
more corrupted afterwards than they were before.
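
The scenario in the paragraph above can be walked through with a small
standalone sketch. It is not QEMU code: struct cluster_state and the
three helpers are hypothetical names made up for illustration, and only
the flag bit (bit 63 of an L1/L2 entry) and the "set iff refcount == 1"
rule are taken from the qcow2 format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of an L1/L2 entry, the position of QCOW_OFLAG_COPIED. */
#define OFLAG_COPIED (1ULL << 63)

/* Hypothetical model of one cluster: its table entry and stored refcount. */
struct cluster_state {
    uint64_t entry;
    uint64_t refcount;
};

/* The flag is consistent when it is set exactly if refcount == 1. */
static bool flag_matches(const struct cluster_state *c)
{
    return (c->refcount == 1) == ((c->entry & OFLAG_COPIED) != 0);
}

/* Stand-in for the refcount pass of "-r leaks": lower an overcounted
 * refcount to the number of references actually found. */
static void fix_leaked_refcount(struct cluster_state *c, uint64_t actual_refs)
{
    if (c->refcount > actual_refs) {
        c->refcount = actual_refs;
    }
}

/* Force the flag to agree with the (now trusted) refcount. */
static void repair_flag(struct cluster_state *c)
{
    c->entry = c->refcount == 1 ? c->entry | OFLAG_COPIED
                                : c->entry & ~OFLAG_COPIED;
}
```

Starting from an entry without the flag and a stored refcount of 2 when
only one reference exists, flag_matches() holds before the leak fix,
fails once fix_leaked_refcount() lowers the refcount to 1, and holds
again only after repair_flag() runs.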

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509200059.31125-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

+17 -8
block/qcow2-refcount.c
@@ -1799,6 +1799,19 @@
     int ret;
     uint64_t refcount;
     int i, j;
+    bool repair;
+
+    if (fix & BDRV_FIX_ERRORS) {
+        /* Always repair */
+        repair = true;
+    } else if (fix & BDRV_FIX_LEAKS) {
+        /* Repair only if that seems safe: This function is always
+         * called after the refcounts have been fixed, so the refcount
+         * is accurate if that repair was successful */
+        repair = !res->check_errors && !res->corruptions && !res->leaks;
+    } else {
+        repair = false;
+    }
 
     for (i = 0; i < s->l1_size; i++) {
         uint64_t l1_entry = s->l1_table[i];
@@ -1818,10 +1831,8 @@
         if ((refcount == 1) != ((l1_entry & QCOW_OFLAG_COPIED) != 0)) {
             fprintf(stderr, "%s OFLAG_COPIED L2 cluster: l1_index=%d "
                     "l1_entry=%" PRIx64 " refcount=%" PRIu64 "\n",
-                    fix & BDRV_FIX_ERRORS ? "Repairing" :
-                                            "ERROR",
-                    i, l1_entry, refcount);
-            if (fix & BDRV_FIX_ERRORS) {
+                    repair ? "Repairing" : "ERROR", i, l1_entry, refcount);
+            if (repair) {
                 s->l1_table[i] = refcount == 1
                                ? l1_entry | QCOW_OFLAG_COPIED
                                : l1_entry & ~QCOW_OFLAG_COPIED;
@@ -1862,10 +1873,8 @@
             if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
                 fprintf(stderr, "%s OFLAG_COPIED data cluster: "
                         "l2_entry=%" PRIx64 " refcount=%" PRIu64 "\n",
-                        fix & BDRV_FIX_ERRORS ? "Repairing" :
-                                                "ERROR",
-                        l2_entry, refcount);
-                if (fix & BDRV_FIX_ERRORS) {
+                        repair ? "Repairing" : "ERROR", l2_entry, refcount);
+                if (repair) {
                     l2_table[j] = cpu_to_be64(refcount == 1
                                 ? l2_entry | QCOW_OFLAG_COPIED
                                 : l2_entry & ~QCOW_OFLAG_COPIED);
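
The decision the patch stores in "repair" can be isolated as a pure
function, which makes the three cases easy to check by hand. This is a
minimal sketch, not the QEMU implementation: enum fix_mode, struct
check_result, and should_repair_oflag_copied() are local stand-ins
assumed for illustration, mirroring BDRV_FIX_LEAKS / BDRV_FIX_ERRORS and
the three BdrvCheckResult counters the diff reads.

```c
#include <stdbool.h>

/* Local stand-ins for the BDRV_FIX_* bits (values are illustrative). */
enum fix_mode {
    FIX_LEAKS  = 1,
    FIX_ERRORS = 2,
};

/* Only the counters the decision depends on. */
struct check_result {
    int check_errors;  /* internal errors hit while checking */
    int corruptions;   /* corruptions still outstanding */
    int leaks;         /* leaks still outstanding */
};

/* Decide whether OFLAG_COPIED mismatches may be rewritten. */
static bool should_repair_oflag_copied(int fix, const struct check_result *res)
{
    if (fix & FIX_ERRORS) {
        /* Fixing errors was requested: always repair. */
        return true;
    }
    if (fix & FIX_LEAKS) {
        /* Leaks only: trust the refcounts, and hence repair, only when
         * the preceding refcount pass left nothing outstanding. */
        return !res->check_errors && !res->corruptions && !res->leaks;
    }
    /* Check-only run: report, never modify. */
    return false;
}
```

With a leaks-only mode the function returns true for a result whose
three counters are all zero and false as soon as any of them is
non-zero; with the errors bit set it returns true regardless.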