threadpool: drop global thread pool
Now that each AioContext has a ThreadPool and the main loop AioContext can be fetched with bdrv_get_aio_context(), we can eliminate the concept of a global thread pool from thread-pool.c.
The submit functions must take a ThreadPool* argument....
qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the underlying file. Use bdrv_flush(bs) instead of bdrv_flush(bs->file).
Also add the error return path when bdrv_flush() fails and move the...
qcow2: set L2 cache dependency in qcow2_alloc_bytes()
Compressed writes use qcow2_alloc_bytes() to allocate space with byte granularity. The affected clusters' refcounts will be incremented but we do not need to flush yet.
Set an L2 cache dependency on the refcount block cache, so that the...
qcow2: flush in qcow2_update_snapshot_refcount()
Users of qcow2_update_snapshot_refcount() do not flush consistently. qcow2_snapshot_create() flushes but qcow2_snapshot_goto() and qcow2_snapshot_delete() do not.
Solve this by moving the bdrv_flush() into...
qcow2: drop flush in update_cluster_refcount()
The update_cluster_refcount() function increments/decrements a cluster's refcount and then returns the new refcount value.
There is no need to flush since both update_cluster_refcount() callers already take care of this:...
qcow2: drop unnecessary flush in qcow2_update_snapshot_refcount()
We already flush when the function completes. There is no need to flush after every compressed cluster.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed.
QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this...
sheepdog: use non-blocking fd in coroutine context
Using a blocking socket in the coroutine context reduces the chance of switching to other work. This patch makes the sheepdog driver use a non-blocking fd always.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
sheepdog: set io_flush handler in do_co_req
If an io_flush handler is not set, qemu_aio_wait doesn't invoke callbacks.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the BlockDriverState.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>...
qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts option. Use it like this:
-drive file=test.qcow2,lazy_refcounts=on
If the option is specified on the command line, it overrides the default...
qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache; it does not write to disk. Therefore bdrv_flush(bs->file) does nothing. We need to flush the refcount cache in order to write out the refcount updates!...
iscsi: add iscsi_truncate support
This patch adds iscsi_truncate, which effectively allows online resizing of iSCSI volumes. For this to work you have to resize the volume on your storage and then call the block_resize command in qemu, which will issue a readcapacity16 to update the capacity....
iscsi: retry read, write, flush and unmap on unit attention check conditions
The storage might return a check condition status for various reasons (e.g. bus reset, capacity change, thin-provisioning info etc.).
Currently all these informative status responses lead to an I/O error...
move socket_set_nodelay to osdep.c
sheepdog: accept URIs
The URI syntax is consistent with the NBD and Gluster syntax. The syntax is
sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag]
sheepdog: use inet_connect to simplify connect code
This uses the form "<host>:<port>" for the representation of the sheepdog server to use inet_connect.
sheepdog: add support for connecting to unix domain socket
This patch adds support for a unix domain socket for a connection between qemu and local sheepdog server. You can use the unix domain socket with the following syntax:
$ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid]...
qcow2: introduce check_refcounts_l1/l2() flags
The check_refcounts_l1/l2() functions have a check_copied argument to check that the QCOW_O_COPIED flag is consistent with refcount == 1. This should be a bool, not an int.
However, the next patch introduces qcow2 fragmentation statistics and...
qcow2: record fragmentation statistics during check
The qemu-img check command can display fragmentation statistics:
 * Total number of clusters in virtual disk
 * Number of allocated clusters
 * Number of fragmented clusters
This patch adds fragmentation statistics support to qcow2....
qcow2: support compressed clusters in BlockFragInfo
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu-img: find the image end offset during check
This patch adds support for reporting the image end offset (in bytes). This is particularly useful after a conversion (or a rebase) where the destination is a block device in order to find the first unused byte at the end of the image....
block/curl: only restrict protocols with libcurl>=7.19.4
The curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, ...) interface was introduced in libcurl 7.19.4. Therefore we cannot protect against CVE-2013-0249 when linking against an older libcurl.
This fixes the build failure introduced by...
Revert "block/vpc: Fix size calculation"
This reverts commit f880defbb06708d30a38ce9f2667067626acdd38.
Jeff Cody's testing revealed that the interpretation of size differs even between VirtualPC and HyperV. Revert this so there is time to consider the impact of any backwards incompatible behavior this change...
block/raw-posix: detect readonly Linux block devices using BLKROGET
Linux block devices can be set read-only with "blockdev --setro <device>". The same thing can be done for LVM volumes using "lvchange --permission r <volume>". This read-only setting is independent of...
block/vpc: Fix size calculation
The size calculated from the CHS values is not the real image (disk) size, but usually a smaller value. This is caused by rounding effects.
Only older operating systems use CHS. Such guests won't be able to use the whole disk. All modern operating systems use the real size....
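The rounding effect described above can be illustrated with a small sketch. It assumes a fixed geometry of 16 heads and 63 sectors per track, chosen only for illustration; it is not the geometry calculation used by block/vpc.c, and the name chs_addressable_sectors() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative fixed geometry (an assumption, not taken from block/vpc.c). */
#define SKETCH_HEADS             16u
#define SKETCH_SECTORS_PER_TRACK 63u

/*
 * Hypothetical helper: number of sectors a CHS-only guest can address on a
 * disk of total_sectors, given the fixed geometry above.
 */
static uint64_t chs_addressable_sectors(uint64_t total_sectors)
{
    uint64_t per_cylinder = (uint64_t)SKETCH_HEADS * SKETCH_SECTORS_PER_TRACK;
    uint64_t cylinders;

    assert(per_cylinder != 0);
    /* Integer division rounds down, so a partial last cylinder is lost. */
    cylinders = total_sectors / per_cylinder;
    return cylinders * per_cylinder;
}
```

With this geometry, a 1 GiB disk (2097152 sectors of 512 bytes) yields 2080 full cylinders, so a CHS-only guest would see 2096640 sectors, 512 fewer than the real size.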
error: Strip trailing '\n' from error string arguments (again)
Commit 6daf194d and be62a2eb got rid of a bunch, but they keep coming back. Tracked down with this Coccinelle semantic patch:
r expression err, eno, cls, fmt; position p; @@ (...
r
block/curl: disable extra protocols to prevent CVE-2013-0249
There is a buffer overflow in libcurl POP3/SMTP/IMAP. The workaround is simple: disable extra protocols so that they cannot be exploited. Full details here:
http://curl.haxx.se/docs/adv_20130206.html...
block/raw-posix: Build fix for O_ASYNC
Commit eeb6b45d48800e96f67ef2a5c80332557fd45ddb (block: raw-posix image file reopen) broke the build on OpenIndiana.
illumos has no O_ASYNC. Exclude it from flags to be compared and instead assert that it is not set where defined....
dmg: Fix bdrv_open() error handling
Return -errno instead of -1 on errors and add error checks in some places that didn't have one. Passing things by reference requires more correct typing, so a few off_ts were replaced; with a 32-bit off_t this is even a fix for truncation bugs....
dmg: Use g_free instead of free
The buffers are allocated with g_(re)alloc, so use g_free to free them.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
parallels: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. Hey, no memory leak to fix here while we're touching it!
vmdk: Allow space in file name
The previous scanf() format string stopped parsing the file name on the first white space, which seems to be allowed at least by VMware Workstation.
Change the format string to collect everything between the first and...
vmdk: Allow selecting SCSI adapter in image creation
Introduce a new option "adapter_type" when converting to vmdk images. It can be one of the following: ide (default), buslogic, lsilogic or legacyESX (according to the vmdk spec from vmware).
In case of a non-ide adapter, heads is set to 255 instead of the 16....
sheepdog: pass vdi_id to sheep daemon for sd_close()
Sheep daemon needs vdi_id to identify which vdi is closed to release resources such as object cache.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>...
bochs: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. While touching the code, fix a memory leak.
cloop: Fix bdrv_open() error handling
vpc: Fix bdrv_open() error handling
g_malloc(0) and g_malloc0(0) return NULL; simplify
Once upon a time, it was decided that qemu_malloc(0) should abort. Switching to glib retired that bright idea. Some code that was added to cope with it (e.g. in commits 702ef63, b76b6e9) is still around....
mirror: support more than one in-flight AIO operation
With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks,...
mirror: support arbitrarily-sized iterations
Yet another optimization is to extend the mirroring iteration to include more adjacent dirty blocks. This limits the number of I/O operations and makes mirroring efficient even with a small granularity. Most of the infrastructure...
block: Use error code EMEDIUMTYPE for wrong format in some block drivers
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>...
block/vdi: Improve debug output for signature
The signature is a 32 bit value and needs up to 8 hex digits for printing.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
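As a minimal sketch of the point about digit count, the snippet below prints a 32-bit value with a fixed width of 8 hex digits. The helper name format_signature() and the sample value are hypothetical and are not taken from block/vdi.c.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 32-bit signature as exactly 8 lower-case
 * hex digits, zero-padded on the left. Returns the snprintf() result.
 */
static int format_signature(char *buf, size_t len, uint32_t signature)
{
    /* 8 digits plus the terminating NUL must fit. */
    assert(buf != NULL && len >= 9);
    return snprintf(buf, len, "%08" PRIx32, signature);
}
```

Using PRIx32 keeps the format specifier matched to uint32_t regardless of the platform's int width, and the 08 width keeps leading zeros visible in debug output.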
block/vdi: Improved return values from vdi_open
vdi_open returned -1 in case of any error, but it should return an error code (negative value of errno or -EMEDIUMTYPE).
block/vdi: Check for bad signature
vdi_open did not check for a bad signature. This check was only in vdi_probe.
mirror: do nothing on zero-sized disk
On a zero-sized disk we need to break out of the job successfully before bdrv_dirty_iter_init is called, otherwise you will get an assertion failure with the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
block: allow customizing the granularity of the dirty bitmap
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
mirror: allow customizing the granularity
The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress....
mirror: switch mirror_iteration to AIO
There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch)...
mirror: add buf-size argument to drive-mirror
This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: implement dirty bitmap using HBitmap
This actually uses the dirty bitmap in the block layer, and convertsmirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>...
mirror: perform COW if the cluster size is bigger than the granularity
When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens...
block: return count of dirty sectors, not chunks
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iscsi: do not leak acb->buf when commands are aborted
acb->buf is freed in the WRITE callback, but this may not get called at all when commands are aborted. Add another free in the ABORT TASK callback, which requires setting acb->buf to NULL everywhere....
iscsi: add support for iovectors
This patch adds support for directly passing the iovec array from QEMUIOVector if libiscsi supports it (1.8.0 or newer).
Signed-off-by: Peter Lieven <pl@kamp.de>
[Preserve the improvements from commit 4cc841b, iscsi: partly...
Merge remote-tracking branch 'bonzini/scsi-next' into staging
iscsi: add iscsi_create support
This patch adds support for bdrv_create. This allows e.g. to use qemu-img to convert from any supported device to an iscsi backed storage as destination.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iscsi: partly avoid iovec linearization in iscsi_aio_writev
libiscsi expects all write16 data in a linear buffer. If the iovec only contains one buffer we can skip the linearization step as well as the additional malloc/free and pass the buffer directly....
iscsi: add support for iSCSI NOPs [v2]
This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target. If a number of consecutive NOP-In replies fail, a reconnect is initiated. iSCSI NOPs help to ensure that the connection to the target is still operational....
Merge remote-tracking branch 'stefanha/block' into staging
block/raw-posix: Make hdev_aio_discard() available outside Linux
Fixes the build on OpenBSD among others.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
win32-aio: use iov utility functions instead of open-coding them
We have iov_from_buf() and iov_to_buf(), use them instead of open-coding these in block/win32-aio.c
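For readers unfamiliar with what such helpers do, here is a simplified, self-contained sketch of scatter/gather copying in both directions. The sketch_ functions are hypothetical stand-ins with a reduced signature; they are not QEMU's iov_from_buf() and iov_to_buf() and should not be read as their API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Scatter: copy up to `bytes` from a linear buffer into the iovec elements. */
static size_t sketch_iov_from_buf(const struct iovec *iov, unsigned int iov_cnt,
                                  const void *buf, size_t bytes)
{
    const unsigned char *src = buf;
    size_t done = 0;
    unsigned int i;

    assert(iov != NULL && buf != NULL);
    for (i = 0; i < iov_cnt && done < bytes; i++) {
        size_t n = iov[i].iov_len;
        if (n > bytes - done) {
            n = bytes - done;
        }
        memcpy(iov[i].iov_base, src + done, n);
        done += n;
    }
    return done; /* number of bytes actually copied */
}

/* Gather: copy up to `bytes` from the iovec elements into a linear buffer. */
static size_t sketch_iov_to_buf(const struct iovec *iov, unsigned int iov_cnt,
                                void *buf, size_t bytes)
{
    unsigned char *dst = buf;
    size_t done = 0;
    unsigned int i;

    assert(iov != NULL && buf != NULL);
    for (i = 0; i < iov_cnt && done < bytes; i++) {
        size_t n = iov[i].iov_len;
        if (n > bytes - done) {
            n = bytes - done;
        }
        memcpy(dst + done, iov[i].iov_base, n);
        done += n;
    }
    return done;
}
```

The two functions differ only in the direction of the memcpy(), which is exactly the kind of detail that is easy to get wrong when the loop is open-coded at each call site.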
win32-aio: Fix memory leak
The buffer is allocated for both reads and writes, and obviously it should be freed even if an error occurs.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32-aio: Fix vectored reads
Copying data in the right direction really helps a lot!
block: fix null-pointer bug on error case in block commit
This is a bug that was caught by a coverity run by Markus. In the error case when we errored out to exit_restore_open early in the function, 'overlay_bs' was still NULL at that point, although it is...
block: Fix how mirror_run() frees its buffer
It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32-aio: Fix how win32_aio_process_completion() frees buffer
win32_aio_submit() allocates it with qemu_blockalign(), therefore it must be freed with qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>...
sheepdog: clean up sd_aio_setup()
The last two parameters of sd_aio_setup() are never used, so remove them.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>...
sheepdog: multiplex the rw FD to flush cache
This will reduce sockfds connected to the sheep server to one, which simplifies future hacks.
raw-posix: support discard on more filesystems
Linux 2.6.38 introduced the filesystem independent interface to deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2, tmpfs and xfs support it.
Even though the system calls here are in practice issued on Linux,...
raw-posix: remember whether discard failed
Avoid issuing system calls repeatedly if they are known to fail. This does not apply to XFS: if the filesystem-specific ioctl fails, something weird is happening.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
raw: support discard on block devices
Block devices use an ioctl instead of fallocate, so add a separate implementation.
block: make discard asynchronous
This is easy with the thread pool, because we can use s->is_xfs and s->has_discard from the worker function.
QEMU has a widespread assumption that each I/O operation writes less than 2^32 bytes. This patch doesn't fix it throughout of course,...
qcow2: Fix segfault on zero-length write
One of the recent refactoring patches (commit f50f88b9) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
Merge remote-tracking branch 'kwolf/for-anthony' into staging
sheepdog: implement direct write semantics
Sheepdog supports both writeback and writethrough writes but does not yet support DIRECTIO semantics, which bypass the cache completely even if the Sheepdog daemon is set up with cache enabled.
Suppose cache is enabled on the Sheepdog daemon side, the new cache control is...
raw-posix: fix bdrv_aio_ioctl
When the raw-posix aio=thread code was moved from posix-aio-compat.c to block/raw-posix.c, there was an unintended change to the ioctl code. The code used to return the ioctl command, which posix_aio_read() would later morph into a zero. This hack is not necessary anymore,...
block: make qiov_is_aligned() public
The qiov_is_aligned() function checks whether a QEMUIOVector meets a BlockDriverState's alignment requirements. This is needed by virtio-blk-data-plane so:
1. Move the function from block/raw-posix.c to block/block.c....
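As a rough illustration of what an alignment check over an I/O vector involves, the sketch below tests whether every element's base address is a multiple of a required alignment. It works on plain struct iovec rather than QEMUIOVector, the name sketch_iov_is_aligned() is hypothetical, and whether the real function also checks element lengths is not stated in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: return true if the base address of every iovec
 * element is a multiple of `alignment` (assumed to be non-zero).
 */
static bool sketch_iov_is_aligned(const struct iovec *iov, unsigned int iov_cnt,
                                  size_t alignment)
{
    unsigned int i;

    assert(iov != NULL && alignment != 0);
    for (i = 0; i < iov_cnt; i++) {
        if ((uintptr_t)iov[i].iov_base % alignment != 0) {
            return false; /* one misaligned element is enough to fail */
        }
    }
    return true;
}
```

A check of this kind matters for O_DIRECT style I/O, where the kernel typically rejects buffers that are not aligned to the device's requirements.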
qemu-option: move standard option definitions out of qemu-config.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace remaining gmtime, localtime by gmtime_r, localtime_r
This allows removing MinGW specific code and improves reentrancy for POSIX hosts.
[Removed unused ret variable in qemu_get_timedate() to fix warning:
vl.c: In function ‘qemu_get_timedate’:
vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable]...
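The reentrancy benefit comes from the caller supplying the result buffer. A minimal sketch, assuming a POSIX host and using the hypothetical helper name sketch_utc_year():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: return the UTC calendar year for a timestamp.
 * gmtime_r() writes into the caller-provided struct tm, so concurrent
 * callers do not share a hidden static buffer as they would with gmtime().
 */
static int sketch_utc_year(time_t t)
{
    struct tm tm;
    struct tm *res = gmtime_r(&t, &tm);

    assert(res != NULL);
    return tm.tm_year + 1900; /* tm_year counts years since 1900 */
}
```

localtime_r() follows the same pattern for local time.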
sheepdog: pass oid directly to send_pending_req()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: don't update inode when create_and_write fails
For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object....
block/raw-win32: Fix compiler warnings (wrong format specifiers)
Commit fbcad04d6bfdff937536eb23088a01a280a1a3af added fprintf statements with wrong format specifiers.
GetLastError() returns a DWORD which is unsigned long, so %lu must be used.
Signed-off-by: Stefan Weil <sw@weilnetz.de>...
raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the...
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
migration: move include files to include/migration/
qapi: move include files to include/qobject/
block: move include files to include/block/
janitor: do not include qemu-char everywhere
Touching char/char.h basically causes the whole of QEMU to be rebuilt. Avoid this, it is usually unnecessary.
janitor: do not rely on indirect inclusions of or from qemu-char.h
Various header files rely on qemu-char.h including qemu-config.h or main-loop.h, but they really do not need qemu-char.h at all (particularly interesting is the case of the block layer!). Clean this up, and also...
build: move rules from Makefile to */Makefile.objs
qcow2: Introduce Qcow2COWRegion
This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Allocate l2meta dynamically
As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack.
qcow2: Drop l2meta.cluster_offset
There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way.
qcow2: Allocate l2meta only for cluster allocations
Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request...
qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2
This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop.
qcow2: Execute run_dependent_requests() without lock
There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held.
Also, before this patch, run_dependent_requests() not only does what its...
qcow2: Factor out handle_dependencies()