block: Add driver-specific options for backing files
Options starting in "backing." are passed to the backing file now. If you don't need to specify the filename for the backing file, you can add it on the command line instead of in the image file:
$ qemu-nbd -t /tmp/test.img...
qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so...
qcow: allow sub-cluster compressed write to last cluster
Compression in qcow requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so...
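The padding arithmetic behind the two entries above can be sketched as follows. This is a minimal illustration, not the code from either patch: the function name is hypothetical, and it only computes how many zero bytes would round an image length up to a cluster boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a QEMU function): number of zero bytes needed
 * to pad an image of `len` bytes up to the next multiple of
 * `cluster_size`. Returns 0 when the length is already aligned. */
static uint64_t final_cluster_padding(uint64_t len, uint64_t cluster_size)
{
    uint64_t tail;

    assert(cluster_size > 0);
    tail = len % cluster_size;          /* bytes used in the last cluster */
    return tail ? cluster_size - tail : 0;
}
```

For a 100000-byte image with 64 KiB clusters, the last cluster holds 34464 bytes of data, so 31072 bytes of padding would be needed.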
ssh: Remove unnecessary use of strlen function.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block/ssh: Add missing gcc format attributes
Now gcc will check whether format string and variable arguments match.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
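As an illustration of what such a format attribute does, here is a minimal sketch using the raw gcc attribute on a hypothetical helper. The function name and signature are assumptions for this example and are not taken from block/ssh.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not the function touched by the patch.
 * The attribute tells gcc that argument 3 is a printf-style format
 * string and that the variable arguments start at argument 4. */
static int format_message(char *buf, size_t size, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

static int format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);  /* length the full output needs */
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as format_message(buf, sizeof(buf), "%s", 22) should draw a -Wformat warning at compile time instead of misbehaving at run time.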
block: Add support for Secure Shell (ssh) block device.
qemu-system-x86_64 -drive file=ssh://hostname/some/image
QEMU will ssh into 'hostname' and open '/some/image' which is made available as a standard block device.
You can specify a username (ssh://user@host/...) and/or a port number...
block: ssh: Use libssh2_sftp_fsync (if supported by libssh2) to flush to disk.
libssh2_sftp_fsync is an extension to libssh2 to support fsync(2) over sftp, which is itself an extension of OpenSSH.
If both libssh2 and the ssh daemon support it, this will allow...
rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(), which is synchronous and causes the main qemu thread to block until it is complete. This results in unresponsiveness and extra latency for the guest.
Fix this by using an asynchronous version of flush. This was added to...
block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.
aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused by other subsystems.
Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>...
hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification. Right now there are many catch-all headers in include/hw/ARCH depending on cpu.h, and this makes it necessary to compile these files per-target. However, fixing this does not belong in these patches....
qcow2: Return real error in qcow2_update_snapshot_refcount
This fixes the error message triggered by the following script:
cat > /tmp/blkdebug.cfg <<EOF
[inject-error]
event = "cluster_free"
errno = "28"
immediately = "off"
EOF
$qemu_img create -f qcow2 test.qcow2 10G...
qcow2: Fix L1 write error handling in qcow2_update_snapshot_refcount
It ignored the error code, and at least the 'goto fail' is obvious nonsense as it creates an endless loop (if the next attempt doesn't magically succeed) and leaves the in-memory L1 table in big-endian...
oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets. Rename to qemu_set_nonblock() just like qemu_set_cloexec().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>...
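To illustrate why the flag is not socket-specific, the sketch below sets O_NONBLOCK on a pipe. The helper is a hypothetical stand-in, not the QEMU implementation; unlike the bare call quoted above, it reads the current flags first so that other status flags are preserved.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the renamed helper; QEMU's own code may
 * differ. Works on any file descriptor, not only sockets. */
static int set_nonblock(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Returns 1 if reading from an empty non-blocking pipe fails with
 * EAGAIN/EWOULDBLOCK instead of blocking, 0 if it does not, and -1 if
 * the pipe could not be set up. */
static int empty_pipe_read_would_block(void)
{
    int fds[2];
    char byte;
    ssize_t n;
    int result;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (set_nonblock(fds[0]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = read(fds[0], &byte, 1);
    result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    close(fds[0]);
    close(fds[1]);
    return result;
}
```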
qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
This gets rid of the nb_clusters and keep_clusters and the associated complicated calculations. Just advance the number of bytes that have been processed and everything is fine.
This patch advances the variables even after the last operation even...
qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case)....
qcow2: Move cluster gathering to a non-looping loop
This patch is mainly to separate the indentation change from the semantic changes. All that really changes here is that everything moves into a while loop, all 'goto done' become 'break' and at the end of the...
qcow2: Gather clusters in a looping loop
Instead of just checking once in exactly this order if there are dependencies, non-COW clusters and new allocation, this starts looping around these. This way we can, for example, gather non-COW clusters after new allocations as long as the host cluster offsets stay contiguous....
qcow2: Improve check for overlapping allocations
The old code detected an overlapping allocation even when the allocations didn't actually overlap, but were only adjacent.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>...
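The distinction between overlapping and merely adjacent ranges can be sketched with half-open byte intervals. This is a generic illustration with hypothetical names, not the check used in qcow2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: treats each allocation as the half-open range
 * [off, off + len). Two ranges that merely touch, where one ends at the
 * offset at which the other starts, do not overlap. */
static bool ranges_overlap(uint64_t a_off, uint64_t a_len,
                           uint64_t b_off, uint64_t b_len)
{
    assert(a_len > 0 && b_len > 0);
    return a_off < b_off + b_len && b_off < a_off + a_len;
}
```

Using strict comparisons on both sides is what keeps adjacent allocations from being reported as overlapping; a test written with "<=" would flag them.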
qcow2: Change handle_dependency to byte granularity
This is a more precise description of what really constitutes a dependency. The behaviour doesn't change at this point because the COW area of the old request is still aligned to cluster boundaries and therefore an overlap is detected whenever the requests touch any part of...
qcow2: Decouple cluster allocation from cluster reuse code
This moves some code that prepares the allocation of new clusters to where the actual allocation happens. This is the minimum required to be able to move it to a separate function in the next patch....
qcow2: Factor out handle_alloc()
qcow2: handle_alloc(): Get rid of nb_clusters parameter
We already communicate the same information in *bytes.
qcow2: handle_alloc(): Get rid of keep_clusters parameter
handle_alloc() is now called with the offset at which the actual new allocation starts instead of the offset at which the whole write request starts, part of which may already be processed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
qcow2: Finalise interface of handle_alloc()
The interface works completely on a byte granularity now and duplicated parameters are removed.
qcow2: Clean up handle_alloc()
Things can be simplified a bit now. No semantic changes.
qcow2: Factor out handle_copied()
qcow2: handle_copied(): Get rid of nb_clusters parameter
handle_copied() uses its bytes parameter now to determine how many clusters it should try to find.
qcow2: handle_copied(): Get rid of keep_clusters parameter
Now *bytes is used to return the length of the area that can be written to without performing an allocation or COW.
qcow2: handle_copied(): Implement non-zero host_offset
Look only for clusters that start at a given physical offset.
qcow2: Prepare handle_alloc/copied() for byte granularity
This makes handle_alloc() and handle_copied() return byte-granularity host offsets instead of returning always the cluster start. This is required so that qcow2_alloc_cluster_offset() can stop aligning...
qcow2: Fix "total clusters" number in bdrv_check
This should be based on the virtual disk size, not on the size of the image.
Interesting observation: With some VM state stored in the image file, percentages higher than 100% are possible, even though snapshots...
qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running coroutine will lock it again before it yields, so this doesn't make a lot of sense.
qcow2: Handle dependencies earlier
Handling overlapping allocations isn't just a detail of cluster allocation. It is rather one of three ways to get the host cluster offset for a write request:
1. If a request overlaps an in-flight allocation, the cluster offset...
block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
The new parameter is unused yet.
This part was missing in commit 787e4a8500020695eb391e2f1cc4767ee071d441.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>...
rbd: fix compile error
Commit 787e4a85 [block: Add options QDict to bdrv_file_open() prototypes] didn't update rbd.c accordingly.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>...
nbd: Accept -drive options for the network connection
The existing parsers for the file name now parse everything into the bdrv_open() options QDict. Instead of using these parsers, you can now directly specify the options on the command line, like this:...
block: Introduce .bdrv_parse_filename callback
If a driver needs structured data and not just a string, it can provide a .bdrv_parse_filename callback now that parses the command line string into separate options. Keeping this separate from .bdrv_open_filename...
block: Make find_image_format safe with NULL filename
In order to achieve this, the .bdrv_probe callbacks of all drivers must cope with this. The DMG driver is the only one that bases its decision on the filename and it needs to be changed.
nbd: Use default port if only host is specified
The URL method already takes care to apply the default port when none is specified. Directly specifying driver-specific options required the port number until now. Allow leaving it out and apply the default....
nbd: Check against invalid option combinations
A file name may only be specified if no host or socket path is specified. The latter two may not appear at the same time either.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
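The two rules stated above can be sketched as a small validation function. The function and parameter names are hypothetical and the options are modelled as plain strings rather than a QDict, so this shows only the logic, not the NBD driver's code. What happens when none of the three is given is not covered by the text above and is left unchecked here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical check (not the NBD driver's code). A NULL pointer means
 * the option was not given. Returns 0 if the combination is allowed and
 * -EINVAL otherwise. */
static int check_connection_options(const char *filename,
                                    const char *host,
                                    const char *socket_path)
{
    /* A file name excludes both host and socket path. */
    if (filename != NULL && (host != NULL || socket_path != NULL)) {
        return -EINVAL;
    }
    /* Host and socket path exclude each other. */
    if (host != NULL && socket_path != NULL) {
        return -EINVAL;
    }
    return 0;
}
```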
block: Add options QDict to bdrv_file_open() prototypes
nbd: Keep hostname and port separate
The NBD block supports a URL syntax, for which a URL parser returns separate hostname and port fields. It also supports the traditional qemu syntax encoded in a filename. Until now, after parsing the URL to get each piece of information, a new string is built to be fed to socket...
sheepdog: show error message for halt status
Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when the number of alive nodes is less than the number of copies specified by users. This returns 0x19 to the QEMU client, which currently doesn't recognize it....
qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault on the migration target with qcow2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
threadpool: drop global thread pool
Now that each AioContext has a ThreadPool and the main loop AioContext can be fetched with bdrv_get_aio_context(), we can eliminate the concept of a global thread pool from thread-pool.c.
The submit functions must take a ThreadPool* argument....
qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the underlying file. Use bdrv_flush(bs) instead of bdrv_flush(bs->file).
Also add the error return path when bdrv_flush() fails and move the...
qcow2: set L2 cache dependency in qcow2_alloc_bytes()
Compressed writes use qcow2_alloc_bytes() to allocate space with byte granularity. The affected clusters' refcounts will be incremented but we do not need to flush yet.
Set a L2 cache dependency on the refcount block cache, so that the...
qcow2: flush in qcow2_update_snapshot_refcount()
Users of qcow2_update_snapshot_refcount() do not flush consistently. qcow2_snapshot_create() flushes but qcow2_snapshot_goto() and qcow2_snapshot_delete() do not.
Solve this by moving the bdrv_flush() into...
qcow2: drop flush in update_cluster_refcount()
The update_cluster_refcount() function increments/decrements a cluster's refcount and then returns the new refcount value.
There is no need to flush since both update_cluster_refcount() callers already take care of this:...
qcow2: drop unnecessary flush in qcow2_update_snapshot_refcount()
We already flush when the function completes. There is no need to flush after every compressed cluster.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed.
QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this...
sheepdog: use non-blocking fd in coroutine context
Using a blocking socket in the coroutine context reduces the chance of switching to other work. This patch makes the sheepdog driver use a non-blocking fd always.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
sheepdog: set io_flush handler in do_co_req
If an io_flush handler is not set, qemu_aio_wait doesn't invoke callbacks.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the BlockDriverState.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>...
qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts option. Use it like this:
-drive file=test.qcow2,lazy_refcounts=on
If the option is specified on the command line, it overrides the default...
qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache, it does not write to disk. Therefore bdrv_flush(bs->file) does nothing. We need to flush the refcount cache in order to write out the refcount updates!...
iscsi: add iscsi_truncate support
This patch adds iscsi_truncate, which effectively allows for online resizing of iscsi volumes. For this to work you have to resize the volume on your storage and then call the block_resize command in qemu, which will issue a readcapacity16 to update the capacity....
iscsi: retry read, write, flush and unmap on unit attention check conditions
The storage might return a check condition status for various reasons (e.g. bus reset, capacity change, thin-provisioning info etc.).
Currently all these informative status responses lead to an I/O error...
move socket_set_nodelay to osdep.c
sheepdog: accept URIs
The URI syntax is consistent with the NBD and Gluster syntax. The syntax is
sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag]
sheepdog: use inet_connect to simplify connect code
This uses the form "<host>:<port>" for the representation of the sheepdog server to use inet_connect.
sheepdog: add support for connecting to unix domain socket
This patch adds support for a unix domain socket for a connection between qemu and local sheepdog server. You can use the unix domain socket with the following syntax:
$ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid]...
qcow2: introduce check_refcounts_l1/l2() flags
The check_refcounts_l1/l2() functions have a check_copied argument to check that the QCOW_O_COPIED flag is consistent with refcount == 1. This should be a bool, not an int.
However, the next patch introduces qcow2 fragmentation statistics and...
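The consistency rule that check_copied enforces can be sketched as below. This assumes the "copied" flag is bit 63 of an L1/L2 table entry, as in the published qcow2 format description; the macro and function names are local to this example and are not the identifiers used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "copied" flag is bit 63 of a table entry. The macro
 * name is local to this sketch. */
#define OFLAG_COPIED (UINT64_C(1) << 63)

/* Hypothetical check: the flag must be set exactly when the cluster's
 * refcount is 1. Returns true if entry and refcount agree. */
static bool copied_flag_consistent(uint64_t table_entry, uint64_t refcount)
{
    bool flag_set = (table_entry & OFLAG_COPIED) != 0;

    return flag_set == (refcount == 1);
}
```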
qcow2: record fragmentation statistics during check
The qemu-img check command can display fragmentation statistics:
 * Total number of clusters in virtual disk
 * Number of allocated clusters
 * Number of fragmented clusters
This patch adds fragmentation statistics support to qcow2....
qcow2: support compressed clusters in BlockFragInfo
qemu-img: find the image end offset during check
This patch adds the support for reporting the image end offset (in bytes). This is particularly useful after a conversion (or a rebase) where the destination is a block device in order to find the first unused byte at the end of the image....
block/curl: only restrict protocols with libcurl>=7.19.4
The curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, ...) interface was introduced in libcurl 7.19.4. Therefore we cannot protect against CVE-2013-0249 when linking against an older libcurl.
This fixes the build failure introduced by...
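A version gate of this kind typically compares against libcurl's numeric version, which packs major, minor and patch into one byte each (0xXXYYZZ), so 7.19.4 is 0x071304. The sketch below shows only that arithmetic with hypothetical helper names; it does not call libcurl. The actual patch presumably performs the comparison at compile time against LIBCURL_VERSION_NUM, which is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: builds a version number in libcurl's 0xXXYYZZ
 * layout (one byte each for major, minor, patch). */
static uint32_t curl_version_num(unsigned major, unsigned minor,
                                 unsigned patch)
{
    assert(major <= 0xff && minor <= 0xff && patch <= 0xff);
    return ((uint32_t)major << 16) | ((uint32_t)minor << 8) | patch;
}

/* Returns non-zero if the given version is at least 7.19.4, the release
 * that the text above names as introducing CURLOPT_PROTOCOLS. */
static int can_restrict_protocols(uint32_t version_num)
{
    return version_num >= curl_version_num(7, 19, 4);
}
```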
Revert "block/vpc: Fix size calculation"
This reverts commit f880defbb06708d30a38ce9f2667067626acdd38.
Jeff Cody's testing revealed that the interpretation of size differs even between VirtualPC and HyperV. Revert this so there is time to consider the impact of any backwards incompatible behavior this change...
block/raw-posix: detect readonly Linux block devices using BLKROGET
Linux block devices can be set read-only with "blockdev --setro <device>". The same thing can be done for LVM volumes using "lvchange --permission r <volume>". This read-only setting is independent of...
block/vpc: Fix size calculation
The size calculated from the CHS values is not the real image (disk) size, but usually a smaller value. This is caused by rounding effects.
Only older operating systems use CHS. Such guests won't be able to use the whole disk. All modern operating systems use the real size....
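The rounding effect can be illustrated with a deliberately simplified geometry. The sketch assumes a fixed layout of 16 heads and 63 sectors per track, which is an assumption for this example only; the geometry calculation used for VPC images is more involved.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Hypothetical helper with an assumed fixed geometry (16 heads, 63
 * sectors per track). The cylinder count is rounded down, so the size
 * derived from CHS is never larger than the real size. */
static uint64_t chs_size_bytes(uint64_t real_size_bytes)
{
    const uint64_t heads = 16;
    const uint64_t secs_per_track = 63;
    uint64_t total_sectors = real_size_bytes / SECTOR_SIZE;
    uint64_t cylinders = total_sectors / (heads * secs_per_track);

    return cylinders * heads * secs_per_track * SECTOR_SIZE;
}
```

For a 1 GiB disk (1073741824 bytes) this yields 2080 cylinders and a CHS size of 1073479680 bytes, which is 262144 bytes short of the real size.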
error: Strip trailing '\n' from error string arguments (again)
Commits 6daf194d and be62a2eb got rid of a bunch, but they keep coming back. Tracked down with this Coccinelle semantic patch:
@r@
expression err, eno, cls, fmt;
position p;
@@
(...
r
block/curl: disable extra protocols to prevent CVE-2013-0249
There is a buffer overflow in libcurl POP3/SMTP/IMAP. The workaround is simple: disable extra protocols so that they cannot be exploited. Full details here:
http://curl.haxx.se/docs/adv_20130206.html...
block/raw-posix: Build fix for O_ASYNC
Commit eeb6b45d48800e96f67ef2a5c80332557fd45ddb (block: raw-posix image file reopen) broke the build on OpenIndiana.
illumos has no O_ASYNC. Exclude it from flags to be compared and instead assert that it is not set where defined....
dmg: Fix bdrv_open() error handling
Return -errno instead of -1 on errors and add error checks in some places that didn't have one. Passing things by reference requires more correct typing, so a few off_t variables were replaced; with a 32-bit off_t this is even a fix for truncation bugs....
dmg: Use g_free instead of free
The buffers are allocated with g_(re)alloc, so use g_free to free them.
parallels: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. Hey, no memory leak to fix here while we're touching it!
vmdk: Allow space in file name
The previous scanf() format string stopped parsing the file name on the first white space, which seems to be allowed at least by VMware Workstation.
Change the format string to collect everything between the first and...
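The difference between a whitespace-delimited conversion and a scanset can be sketched as follows. The line layout and the use of double quotes as delimiters are assumptions for this example, since the description above is truncated; the format string is illustrative and is not the one from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical parser for a descriptor line such as
 *     RW 4192256 SPARSE "my disk.vmdk"
 * A "%s" conversion would stop at the first white space and return
 * only "\"my". The scanset %63[^"] instead collects everything up to
 * the closing quote, spaces included. `name` must hold 64 bytes.
 * Returns 0 on success and -1 if the line does not match. */
static int parse_extent_file_name(const char *line, char *name)
{
    char access[16];
    char type[16];
    long long sectors;
    int matched;

    assert(line != NULL && name != NULL);
    matched = sscanf(line, "%15s %lld %15s \"%63[^\"]\"",
                     access, &sectors, type, name);
    return matched == 4 ? 0 : -1;
}
```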
vmdk: Allow selecting SCSI adapter in image creation
Introduce a new option "adapter_type" when converting to vmdk images. It can be one of the following: ide (default), buslogic, lsilogic or legacyESX (according to the vmdk spec from vmware).
In case of a non-ide adapter, heads is set to 255 instead of the 16....
sheepdog: pass vdi_id to sheep daemon for sd_close()
Sheep daemon needs vdi_id to identify which vdi is closed to release resources such as object cache.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>...
bochs: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. While touching the code, fix a memory leak.
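The pattern behind this and the neighbouring bdrv_open() fixes can be sketched with a hypothetical open routine. It is not taken from any of the drivers; it only shows returning a negative errno value and releasing the buffer on every exit path.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical open routine (not a QEMU driver function): reads a
 * fixed-size header and reports failures as negative errno values
 * instead of a bare -1. A single exit path frees the buffer, so no
 * error branch can leak it. */
static int read_header_checked(const char *path, size_t header_size)
{
    unsigned char *header = NULL;
    ssize_t n;
    int fd;
    int ret = 0;

    assert(path != NULL && header_size > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -errno;              /* e.g. -ENOENT, not just -1 */
    }
    header = malloc(header_size);
    if (header == NULL) {
        ret = -ENOMEM;
        goto out;
    }
    n = read(fd, header, header_size);
    if (n < 0) {
        ret = -errno;               /* saved before close() can change it */
        goto out;
    }
    if ((size_t)n != header_size) {
        ret = -EIO;                 /* short read: header is incomplete */
    }
out:
    free(header);                   /* free(NULL) is a no-op */
    close(fd);
    return ret;
}
```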
cloop: Fix bdrv_open() error handling
vpc: Fix bdrv_open() error handling
g_malloc(0) and g_malloc0(0) return NULL; simplify
Once upon a time, it was decided that qemu_malloc(0) should abort. Switching to glib retired that bright idea. Some code that was added to cope with it (e.g. in commits 702ef63, b76b6e9) is still around....
mirror: support more than one in-flight AIO operation
With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks,...
mirror: support arbitrarily-sized iterations
Yet another optimization is to extend the mirroring iteration to include more adjacent dirty blocks. This limits the number of I/O operations and makes mirroring efficient even with a small granularity. Most of the infrastructure...
block: Use error code EMEDIUMTYPE for wrong format in some block drivers
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>...
block/vdi: Improve debug output for signature
The signature is a 32 bit value and needs up to 8 hex digits for printing.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
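Printing a 32-bit value with a fixed width of eight hex digits can be sketched like this. The helper name is hypothetical and the format string is an illustration, not necessarily the one used in block/vdi.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: formats a 32-bit signature as exactly eight
 * lower-case hex digits, zero-padded on the left. The buffer needs
 * room for the digits plus the terminating NUL. */
static int format_signature(char *buf, size_t size, uint32_t signature)
{
    assert(buf != NULL && size >= 9);
    return snprintf(buf, size, "%08" PRIx32, signature);
}
```

With a narrower width, small values would print with fewer digits and make debug output harder to compare line by line.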
block/vdi: Improved return values from vdi_open
vdi_open returned -1 in case of any error, but it should return an error code (negative value of errno or -EMEDIUMTYPE).
block/vdi: Check for bad signature
vdi_open did not check for a bad signature. This check was only in vdi_probe.
mirror: do nothing on zero-sized disk
On a zero-sized disk we need to break out of the job successfully before bdrv_dirty_iter_init is called, otherwise you will get an assertion failure with the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
block: allow customizing the granularity of the dirty bitmap
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
mirror: allow customizing the granularity
The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress....
mirror: switch mirror_iteration to AIO
There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch)...
mirror: add buf-size argument to drive-mirror
This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: implement dirty bitmap using HBitmap
This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>...
mirror: perform COW if the cluster size is bigger than the granularity
When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens...
block: return count of dirty sectors, not chunks
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
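The difference between counting chunks and counting sectors can be sketched with a toy flat bitmap. This is not QEMU's HBitmap and the names are hypothetical: one bit stands for one chunk, and every chunk is assumed to cover the same number of sectors, which ignores a possibly shorter last chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy flat bitmap, one bit per chunk. Counts the set bits by clearing
 * the lowest set bit of each word until the word is zero. */
static uint64_t count_dirty_chunks(const uint64_t *bitmap, size_t nwords)
{
    uint64_t chunks = 0;

    for (size_t i = 0; i < nwords; i++) {
        for (uint64_t w = bitmap[i]; w != 0; w &= w - 1) {
            chunks++;
        }
    }
    return chunks;
}

/* Converts the chunk count into sectors, which is the unit a caller
 * needs when estimating how much data is left to copy. */
static uint64_t count_dirty_sectors(const uint64_t *bitmap, size_t nwords,
                                    uint64_t sectors_per_chunk)
{
    assert(sectors_per_chunk > 0);
    return count_dirty_chunks(bitmap, nwords) * sectors_per_chunk;
}
```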