block: allow customizing the granularity of the dirty bitmap
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: implement dirty bitmap using HBitmap
This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>...
mirror: perform COW if the cluster size is bigger than the granularity
When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens...
block: return count of dirty sectors, not chunks
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iscsi: do not leak acb->buf when commands are aborted
acb->buf is freed in the WRITE callback, but this may not get called at all when commands are aborted. Add another free in the ABORT TASK callback, which requires setting acb->buf to NULL everywhere....
iscsi: add support for iovectors
This patch adds support for directly passing the iovec array from QEMUIOVector if libiscsi supports it (1.8.0 or newer).
Signed-off-by: Peter Lieven <pl@kamp.de>
[Preserve the improvements from commit 4cc841b, iscsi: partly...
Merge remote-tracking branch 'bonzini/scsi-next' into staging
iscsi: add iscsi_create support
This patch adds support for bdrv_create. This allows e.g. to use qemu-img to convert from any supported device to an iscsi backed storage as destination.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iscsi: partly avoid iovec linearization in iscsi_aio_writev
libiscsi expects all write16 data in a linear buffer. If the iovec only contains one buffer we can skip the linearization step as well as the additional malloc/free and pass the buffer directly....
iscsi: add support for iSCSI NOPs [v2]
This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target. If a number of consecutive NOP-In replies fail, a reconnect is initiated. iSCSI NOPs help to ensure that the connection to the target is still operational....
Merge remote-tracking branch 'stefanha/block' into staging
block/raw-posix: Make hdev_aio_discard() available outside Linux
Fixes the build on OpenBSD among others.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
win32-aio: use iov utility functions instead of open-coding them
We have iov_from_buf() and iov_to_buf(), use them instead of open-coding these in block/win32-aio.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32-aio: Fix memory leak
The buffer is allocated for both reads and writes, and obviously it should be freed even if an error occurs.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32-aio: Fix vectored reads
Copying data in the right direction really helps a lot!
block: fix null-pointer bug on error case in block commit
This is a bug that was caught by a coverity run by Markus. In the error case when we errored out to exit_restore_open early in the function, 'overlay_bs' was still NULL at that point, although it is...
block: Fix how mirror_run() frees its buffer
It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32-aio: Fix how win32_aio_process_completion() frees buffer
win32_aio_submit() allocates it with qemu_blockalign(), therefore it must be freed with qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>...
sheepdog: clean up sd_aio_setup()
The last two parameters of sd_aio_setup() are never used, so remove them.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>...
sheepdog: multiplex the rw FD to flush cache
This will reduce sockfds connected to the sheep server to one, which simplifies future hacks.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>...
raw-posix: support discard on more filesystems
Linux 2.6.38 introduced the filesystem independent interface to deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2, tmpfs and xfs support it.
Even though the system calls here are in practice issued on Linux,...
raw-posix: remember whether discard failed
Avoid sending system calls repeatedly if they shall fail. This does not apply to XFS: if the filesystem-specific ioctl fails, something weird is happening.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
raw: support discard on block devices
Block devices use an ioctl instead of fallocate, so add a separate implementation.
block: make discard asynchronous
This is easy with the thread pool, because we can use s->is_xfs and s->has_discard from the worker function.
QEMU has a widespread assumption that each I/O operation writes less than 2^32 bytes. This patch doesn't fix it throughout of course,...
qcow2: Fix segfault on zero-length write
One of the recent refactoring patches (commit f50f88b9) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
Merge remote-tracking branch 'kwolf/for-anthony' into staging
sheepdog: implement direct write semantics
Sheepdog supports both writeback and writethrough writes but does not yet support DIRECTIO semantics, which bypass the cache completely even if the Sheepdog daemon is set up with cache enabled.
Suppose cache is enabled on the Sheepdog daemon side, the new cache control is...
raw-posix: fix bdrv_aio_ioctl
When the raw-posix aio=thread code was moved from posix-aio-compat.c to block/raw-posix.c, there was an unintended change to the ioctl code. The code used to return the ioctl command, which posix_aio_read() would later morph into a zero. This hack is not necessary anymore,...
block: make qiov_is_aligned() public
The qiov_is_aligned() function checks whether a QEMUIOVector meets a BlockDriverState's alignment requirements. This is needed by virtio-blk-data-plane so:
1. Move the function from block/raw-posix.c to block/block.c....
qemu-option: move standard option definitions out of qemu-config.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace remaining gmtime, localtime by gmtime_r, localtime_r
This allows removal of MinGW-specific code and improves reentrancy for POSIX hosts.
[Removed unused ret variable in qemu_get_timedate() to fix warning:
vl.c: In function ‘qemu_get_timedate’:
vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable]...
sheepdog: pass oid directly to send_pending_req()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: don't update inode when create_and_write fails
For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object....
block/raw-win32: Fix compiler warnings (wrong format specifiers)
Commit fbcad04d6bfdff937536eb23088a01a280a1a3af added fprintf statements with wrong format specifiers.
GetLastError() returns a DWORD which is unsigned long, so %lu must be used.
Signed-off-by: Stefan Weil <sw@weilnetz.de>...
raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the...
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
migration: move include files to include/migration/
qapi: move include files to include/qobject/
block: move include files to include/block/
janitor: do not include qemu-char everywhere
Touching char/char.h basically causes the whole of QEMU to be rebuilt. Avoid this, it is usually unnecessary.
janitor: do not rely on indirect inclusions of or from qemu-char.h
Various header files rely on qemu-char.h including qemu-config.h or main-loop.h, but they really do not need qemu-char.h at all (particularly interesting is the case of the block layer!). Clean this up, and also...
build: move rules from Makefile to */Makefile.objs
qcow2: Round QCowL2Meta.offset down to cluster boundary
The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Introduce Qcow2COWRegion
This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial.
qcow2: Allocate l2meta dynamically
As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack.
qcow2: Drop l2meta.cluster_offset
There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way.
qcow2: Allocate l2meta only for cluster allocations
Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request...
qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2
This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop.
qcow2: Execute run_dependent_requests() without lock
There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held.
Also, before this patch, run_dependent_requests() not only does what its...
qcow2: Factor out handle_dependencies()
blkdebug: Allow usage without config file
As soon as new rules can be set during runtime, as introduced by the next patch, blkdebug makes sense even without a config file.
blkdebug: Factor out remove_rule()
The cleanup work to remove a rule depends on the type of the rule. It's easy for the existing rules as there is no data that must be cleaned up and is specific to a type yet, but the next patch will change this.
blkdebug: Implement suspend/resume of AIO requests
This allows more systematic AIO testing. The patch adds three new operations to blkdebug:
qcow2: Move BLKDBG_EVENT out of the lock
We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose.
Fix error code checking for SetFilePointer() call
An error has occurred if the return value is INVALID_SET_FILE_POINTER and GetLastError() doesn't return NO_ERROR.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
rbd: Fix race between aio completition and aio cancel
This one fixes a race which qemu also had in the iscsi block driver between cancellation and io completion.
qemu_rbd_aio_cancel was not synchronously waiting for the end of the command.
To achieve this it introduces a new status flag which uses...
block: vpc support for ~2 TB disks
The VHD specification allows for up to a 2 TB disk size. The current implementation in qemu emulates EIDE and ATA-2 hardware which only allows for up to 127 GB. This disk size limitation can be overridden by allowing up to 255 heads instead of the normal 4 bit limitation of 16. Doing so...
raw-posix: inline paio_ioctl into hdev_aio_ioctl
clang now warns about an unused function:
CC block/raw-posix.o
block/raw-posix.c:707:26: warning: unused function paio_ioctl [-Wunused-function]
static BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,...
aio: Get rid of qemu_aio_flush()
There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place.
block: vpc initialize the uuid footer field
Initialize the uuid field in the footer with a generated uuid.
Signed-off-by: Charles Arnold <carnold@suse.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
iscsi: do not assume device is zero initialized
Without any complex checks we can't assume that an iscsi target is initialized to zero.
iscsi: fix deadlock during login
If the connection is interrupted before the first login is successfully completed qemu-kvm is waiting forever in qemu_aio_wait().
This is fixed by performing a sync login to the target. If the connection breaks after the first successful login errors are...
iscsi: fix segfault in url parsing
If an invalid URL is specified iscsi_get_error(iscsi) is called with iscsi == NULL.
use int64_t for return values from rbd instead of int
rbd / rados quite often returns the length of writes or discarded blocks. These values might be bigger than int.
The steps to reproduce are:
mkfs.xfs -f a whole device bigger than int in bytes. mkfs.xfs sends...
block: add bdrv_reopen() support for raw hdev, floppy, and cdrom
For hdev, floppy, and cdrom, the reopen() handlers are the same as for the file reopen handler. For floppy and cdrom types, however, we keep O_NONBLOCK, as in the _open function.
Signed-off-by: Jeff Cody <jcody@redhat.com>...
vdi: don't override libuuid symbols
It's poor symbol hygiene to provide global symbols that collide with a common library like libuuid. If QEMU links against a shared library that depends on uuid_generate() it can end up calling our stub version of the function....
vmdk: Fix data corruption bug in WRITE and READ handling
Fixed a MAJOR BUG in VMDK files on file boundaries on reads and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!!
Triggered for example with the following VMDK file (partly listed):
RW 4193792 FLAT "XP-W1-f001.vmdk" 0...
qcow2: Fix refcount table size calculation
A missing factor for the refcount table entry size in the calculation could mean that too little memory was allocated for the in-memory representation of the table, resulting in a buffer overflow.
block: Workaround for older versions of MinGW gcc
Versions before gcc-4.6 don't support unnamed fields in initializers (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676).
Offset and OffsetHigh belong to an unnamed struct which is part of an unnamed union. Therefore the original code does not work with older...
aio: rename AIOPool to AIOCBInfo
Now that AIOPool no longer keeps a freelist, it isn't really a "pool" anymore. Rename it to AIOCBInfo and make it const since it no longer needs to be modified.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd: accept URIs
The URI syntax is consistent with the Gluster syntax. Export names are specified in the path, preceded by one or more (otherwise unused) slashes.
nbd: accept relative path to Unix socket
Adding the "is_unix" member now will simplify the parsing of NBD URIs.
Merge remote-tracking branch 'origin/master' into threadpool
raw-win32: add emulated AIO support
raw-posix: move linux-aio.c to block/
raw-win32: implement native asynchronous I/O
With the new support for EventNotifiers in the AIO event loop, we can hook a completion port to every opened file and use asynchronous I/O on them.
Wine's support is extremely inefficient, also because it really does...
block: switch posix-aio-compat to threadpool
This is not meant for portability, but to remove code duplication.
raw: merge posix-aio-compat.c into block/raw-posix.c
Making the qemu_paiocb specific to raw devices will let us access members of the BDRVRawState arbitrarily.
raw-posix: rename raw-posix-aio.h, hide unavailable prototypes
aio: add Win32 implementation
The Win32 implementation will only accept EventNotifiers, thus a few drivers are disabled under Windows. EventNotifiers are a good match for the GSource implementation, too, because the Win32 port of glib allows placing their HANDLEs in a GPollFD....
mirror: implement completion
Switching to the target of the migration is done mostly asynchronously, and reported to management via the BLOCK_JOB_COMPLETED event; the only synchronous phase is opening the backing files. bdrv_open_backing_file can always be done, even for migration of the full image (aka sync:...
mirror: add support for on-source-error/on-target-error
Error management is important for mirroring; otherwise, an error on the target (even something as "innocent" as ENOSPC) requires starting again with a full copy. Similar to on_read_error/on_write_error, two separate...
block: in commit, determine base image from the top image
This simplifies some code and error checking, and also fixes a bug.
bdrv_find_backing_image() should only be passed absolute filenames, or filenames relative to the chain. In the QMP message handler for...
block: rename block_job_complete to block_job_completed
The imperative will be used for the QMP command.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
mirror: introduce mirror job
This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several...
sheepdog: use bool for boolean variables
This improves readability.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Merge branch 'trivial-patches' of git://github.com/stefanha/qemu
cleanup useless return sentence
This patch cleans up return statements at the end of void functions.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
qcow2: mark this file's sole strncpy use as justified
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
sheepdog: avoid a few buffer overruns
vmdk: relative_path: use pstrcpy in place of strncpy
Avoid strncpy+manual-NUL-terminate. Use pstrcpy instead.
stream: add on-error argument
This patch adds support for error management to streaming.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blkdebug: process all set_state rules in the old state
Currently it is impossible to write a blkdebug script that ping-pongs between two states, because the second set-state rule will use the state that is set in the first. If you have
[set-state] event = "..."...
iostatus: move BlockdevOnError declaration to QAPI
This will let block-stream reuse the enum. Places that used the enums are renamed accordingly.
block: move job APIs to separate files
block: add live block commit functionality
This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself.
The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down...
block: Support GlusterFS as a QEMU block backend.
This patch adds gluster as the new block backend in QEMU. This gives QEMU the ability to boot VM images from gluster volumes. It's already possible to boot from VM images on gluster volumes using FUSE mount, but...
block: vpc image file reopen
There is currently nothing that needs to be done for VPC image file reopen.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>