vmdk: store fields of VmdkMetaData in cpu endian
Previously VmdkMetaData.offset was stored little endian while the other fields were cpu endian. This changes offset to cpu endian and converts it before writing to the image.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
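The pattern described in this commit (keep every in-memory field in cpu byte order and convert only at the point where bytes go to the image) can be sketched as follows. This is an illustration under stated assumptions, not the vmdk.c code: `ExampleMetaData` and both helper functions are hypothetical names, and the conversion is spelled out by hand instead of using QEMU's own byte-order helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory metadata: every field is kept in cpu byte order. */
typedef struct ExampleMetaData {
    uint32_t offset;
    uint32_t l1_index;
} ExampleMetaData;

/* Serialize a cpu-endian value as 4 little-endian bytes on any host. */
void example_store_le32(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v & 0xff);
    buf[1] = (uint8_t)((v >> 8) & 0xff);
    buf[2] = (uint8_t)((v >> 16) & 0xff);
    buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read 4 little-endian bytes back into a cpu-endian value. */
uint32_t example_load_le32(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[0] |
           ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}
```

With this split, arithmetic on `offset` never has to care about byte order; only the write path calls the store helper.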
vmdk: change magic number to macro
Two hard coded flag bits are changed to macros.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vmdk: Add option to create zeroed-grain image
Add image create option "zeroed-grain" to enable the zeroed-grain GTE feature of vmdk sparse extents. When this option is on, the header version of a newly created extent will be 2 and the VMDK4_FLAG_ZERO_GRAIN flag bit will be set....
vmdk: add support for “zeroed‐grain” GTE
Introduced support for zeroed-grain GTE, as specified in Virtual Disk Format 5.01.
Recent VMware hosted platform products support a new “zeroed‐grain” grain table entry (GTE). The zeroed‐grain GTE returns all zeros on...
vmdk: named return code.
Internal routines in vmdk.c previously returned -1 on error and 0 on success. More return values are useful for future changes such as zeroed-grain GTE. Change all the magic `return 0` and `return -1` to macro names:
block: vhdx header for the QEMU support of VHDX images
This is based on Microsoft's VHDX specification: "VHDX Format Specification v0.95", published 4/12/2012 https://www.microsoft.com/en-us/download/details.aspx?id=29681
These structures define the various header, metadata, and other...
block: initial VHDX driver support framework - supports open and probe
This is the initial block driver framework for VHDX image support (i.e. Hyper-V image file formats), that supports opening VHDX files, and parsing the headers.
This commit does not yet enable:...
block: add read-only support to VHDX image format.
This adds in read-only support to the VHDX image format. This supports reads for fixed-size and dynamic-sized VHDX images.
Differencing files are still unsupported.
The image must be opened without BDRV_O_RDWR set, because we do not...
sheepdog: fix loadvm operation
Currently the 'loadvm' operation works as follows:
1. switch to the snapshot
2. mark current working VDI as a snapshot
3. rely on sd_create_branch to create a new working VDI based on the snapshot
This does not work the same way as other formats such as QCOW2. For e.g,...
sheepdog: use BDRV_SECTOR_SIZE
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: implement .bdrv_co_is_allocated()
rbd: Fix use after free in rbd_open()
Commit a9ccedc3 frees the QemuOpts for the driver-specific options immediately, even though it still needs the filename string that is contained there. This doesn't work. Move the deletion of the QemuOpts to the end of the function where its content isn't needed any more....
sheepdog: cleanup find_vdi_name
This makes 'filename' and 'tag' constant variables, and renames 'for_snapshot' to 'lock' to clarify how it works.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: add SD_RES_READONLY result code
Sheepdog returns SD_RES_READONLY when qemu sends write requests to the snapshot vdi. This adds the result code and makes sd_strerror() print its error reason.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
sheepdog: add helper function to reload inode
This adds a helper function to update the current inode state with the specified vdi object.
sheepdog: resend write requests when SD_RES_READONLY is received
When a snapshot is taken from outside of qemu (e.g. qemu-img snapshot), write requests to the current vdi return SD_RES_READONLY. In this case, the sheepdog block driver needs to update the current...
sheepdog: add discard/trim support for sheepdog
The 'TRIM' command from the VM, which releases underlying data storage for better thin provisioning, is already supported by Sheepdog.
This patch adds TRIM support on the QEMU side.
For older Sheepdog that doesn't support it, we return 0 (success) to the upper layer....
Merge remote-tracking branch 'kwolf/for-anthony' into staging
Merge remote-tracking branch 'bonzini/nbd-next' into staging
Message-id: 1366381830-11267-1-git-send-email-pbonzini@redhat.com...
block: Remove filename parameter from .bdrv_file_open()
It is unused now in all block drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
block: Add driver-specific options for backing files
Options starting in "backing." are passed to the backing file now. If you don't need to specify the filename for the backing file, you can add it on the command line instead of in the image file:
$ qemu-nbd -t /tmp/test.img...
raw-posix: Use bdrv_open options instead of filename
raw-win32: Use bdrv_open options instead of filename
blkdebug: Use bdrv_open options instead of filename
blkverify: Use bdrv_open options instead of filename
curl: Use bdrv_open options instead of filename
As a bonus, going through the QemuOpts QEMU_OPT_SIZE parser for the readahead option gives us proper error reporting that the previous use of atoi() lacked.
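To illustrate why atoi() gives no usable error reporting, here is a minimal sketch of a checked parser. It is not the QemuOpts size parser mentioned above: it uses plain strtol() as a stand-in, and the function name `example_parse_size` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a non-negative decimal number.
 * Returns 0 and stores the value in *out on success, -EINVAL for empty,
 * malformed or negative input, and -ERANGE on overflow. atoi() cannot
 * report any of these cases; atoi("abc"), for example, simply yields 0.
 */
int example_parse_size(const char *str, long *out)
{
    char *end;
    long val;

    assert(str != NULL && out != NULL);

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0') {
        return -EINVAL;     /* no digits at all, or trailing garbage */
    }
    if (errno == ERANGE) {
        return -ERANGE;     /* value does not fit in a long */
    }
    if (val < 0) {
        return -EINVAL;     /* a size cannot be negative */
    }
    *out = val;
    return 0;
}
```

A caller can turn the negative return value into a precise message instead of silently continuing with 0.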
gluster: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing the "filename" to be parsed, but converting to actual fine grained options is left for another day, as it doesn't look trivial.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
iscsi: Use bdrv_open options instead of filename
rbd: Use bdrv_open options instead of filename
sheepdog: Use bdrv_open options instead of filename
vvfat: Use bdrv_open options instead of filename
qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so...
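The zero-padding step can be pictured with a small sketch. This is an assumption-laden illustration rather than the qcow2 code: `example_pad_last_cluster` is a hypothetical helper that prepares a full-size buffer from a short final cluster before it would be handed to a compressor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the `len` valid bytes of a short final cluster
 * into `cluster_buf` and zero-fill the remainder, so that a full cluster of
 * `cluster_size` bytes is available. Returns the number of padding bytes.
 */
size_t example_pad_last_cluster(uint8_t *cluster_buf, size_t cluster_size,
                                const uint8_t *data, size_t len)
{
    assert(cluster_buf != NULL && data != NULL);
    assert(len <= cluster_size);

    memcpy(cluster_buf, data, len);
    memset(cluster_buf + len, 0, cluster_size - len);
    return cluster_size - len;
}
```

Only the buffer passed on for compression grows to a full cluster; the virtual disk size itself is left untouched, as the commit message notes.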
qcow: allow sub-cluster compressed write to last cluster
Compression in qcow requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so...
ssh: Remove unnecessary use of strlen function.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block/ssh: Add missing gcc format attributes
Now gcc will check whether format string and variable arguments match.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
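For readers unfamiliar with the mechanism, the following sketch shows what a gcc format attribute looks like on a printf-style function. The function `example_format_error` is hypothetical and is not taken from block/ssh.c; the attribute spelling is the plain GCC/Clang one rather than any QEMU wrapper macro.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * format(printf, 3, 4): argument 3 is the format string and the variable
 * arguments start at position 4. With this declaration, a call such as
 * example_format_error(buf, n, "%d", "text") produces a compiler warning.
 */
int example_format_error(char *buf, size_t size, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int example_format_error(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;   /* number of characters the full message needs */
}
```

Without the attribute, a mismatch between the format string and its arguments would only show up at run time, if at all.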
nbd: set TCP_NODELAY
Disable the Nagle algorithm to reduce latency. Note this means we must also use TCP_CORK when sending header followed by payload to avoid fragmenting lots of little packets. The previous patch took care of that.
Suggested-by: Nick Thomas <nick@bytemark.co.uk>...
nbd: use TCP_CORK in nbd_co_send_request()
Use TCP_CORK to defer packet transmission until both the header and the payload have been written.
Suggested-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
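The interplay of TCP_NODELAY and TCP_CORK described in these two commits can be sketched as below. This is a simplified, Linux-only illustration on a blocking socket, not the code of nbd_co_send_request(): all `example_*` names are hypothetical, and the real driver runs in coroutine context with its own send helpers.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Disable the Nagle algorithm on a TCP socket. Returns 0 or -1. */
int example_set_nodelay(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Set (on = 1) or clear (on = 0) TCP_CORK. Returns 0 or -1. */
int example_set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Write the whole buffer, retrying after short writes. Returns 0 or -1. */
static int example_write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Cork, send header and payload, then uncork so both go out together. */
int example_send_request(int fd, const void *hdr, size_t hdr_len,
                         const void *payload, size_t payload_len)
{
    int ret;

    assert(hdr != NULL);

    if (example_set_cork(fd, 1) < 0) {
        return -1;
    }
    ret = example_write_all(fd, hdr, hdr_len);
    if (ret == 0 && payload_len > 0) {
        ret = example_write_all(fd, payload, payload_len);
    }
    /* Always uncork, even on error, so queued data is not held back. */
    if (example_set_cork(fd, 0) < 0) {
        ret = -1;
    }
    return ret;
}
```

With Nagle disabled, each write could otherwise leave the host as its own small packet; corking around the pair of writes restores batching for exactly that sequence.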
nbd: unlock mutex in nbd_co_send_request() error path
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
block: Add support for Secure Shell (ssh) block device.
qemu-system-x86_64 -drive file=ssh://hostname/some/image
QEMU will ssh into 'hostname' and open '/some/image' which is made available as a standard block device.
You can specify a username (ssh://user@host/...) and/or a port number...
block: ssh: Use libssh2_sftp_fsync (if supported by libssh2) to flush to disk.
libssh2_sftp_fsync is an extension to libssh2 to support fsync(2) over sftp, which is itself an extension of OpenSSH.
If both libssh2 and the ssh daemon support it, this will allow...
rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(), which is synchronous and causes the main qemu thread to block until it is complete. This results in unresponsiveness and extra latency for the guest.
Fix this by using an asynchronous version of flush. This was added to...
block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.
aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused by other subsystems.
Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>...
hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification. Right now there are many catch-all headers in include/hw/ARCH depending on cpu.h, and this makes it necessary to compile these files per-target. However, fixing this does not belong in these patches....
qcow2: Return real error in qcow2_update_snapshot_refcount
This fixes the error message triggered by the following script:
cat > /tmp/blkdebug.cfg <<EOF
[inject-error]
event = "cluster_free"
errno = "28"
immediately = "off"
EOF
$qemu_img create -f qcow2 test.qcow2 10G...
qcow2: Fix L1 write error handling in qcow2_update_snapshot_refcount
It ignored the error code, and at least the 'goto fail' is obvious nonsense as it creates an endless loop (if the next attempt doesn't magically succeed) and leaves the in-memory L1 table in big-endian...
oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets. Rename to qemu_set_nonblock() just like qemu_set_cloexec().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>...
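The point that O_NONBLOCK is not socket-specific can be shown with a short sketch. This is not the oslib-posix.c implementation: `example_set_nonblock` is a hypothetical stand-in, and unlike the bare F_SETFL call quoted above it reads the existing flags first so that they are preserved.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch any file descriptor (pipe, socket, ...) to
 * non-blocking mode while keeping its other status flags.
 * Returns 0 on success, -1 on error.
 */
int example_set_nonblock(int fd)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFL);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

The same call works unchanged on the read end of a pipe, which is why a socket_ prefix would be misleading.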
qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
This gets rid of the nb_clusters and keep_clusters and the associated complicated calculations. Just advance the number of bytes that have been processed and everything is fine.
This patch advances the variables even after the last operation even...
qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case)....
qcow2: Move cluster gathering to a non-looping loop
This patch is mainly to separate the indentation change from the semantic changes. All that really changes here is that everything moves into a while loop, all 'goto done' become 'break' and at the end of the...
qcow2: Gather clusters in a looping loop
Instead of just checking once in exactly this order if there are dependencies, non-COW clusters and new allocation, this starts looping around these. This way we can, for example, gather non-COW clusters after new allocations as long as the host cluster offsets stay contiguous....
qcow2: Improve check for overlapping allocations
The old code detected an overlapping allocation even when the allocations didn't actually overlap, but were only adjacent.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>...
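The difference between overlapping and merely adjacent ranges comes down to strict versus non-strict comparisons. The sketch below uses half-open byte ranges and a hypothetical function name; it illustrates the general check, not the exact expression used in qcow2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Half-open ranges [start, start + len). Two ranges overlap only if each
 * one starts strictly before the other ends, so ranges that merely touch
 * (one ends exactly where the other begins) are reported as not overlapping.
 */
bool example_ranges_overlap(uint64_t a_start, uint64_t a_len,
                            uint64_t b_start, uint64_t b_len)
{
    uint64_t a_end = a_start + a_len;
    uint64_t b_end = b_start + b_len;

    assert(a_end >= a_start && b_end >= b_start);   /* no wrap-around */

    return a_start < b_end && b_start < a_end;
}
```

Using `<=` in either comparison would reproduce the kind of false positive the commit describes for adjacent allocations.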
qcow2: Change handle_dependency to byte granularity
This is a more precise description of what really constitutes a dependency. The behaviour doesn't change at this point because the COW area of the old request is still aligned to cluster boundaries and therefore an overlap is detected whenever the requests touch any part of...
qcow2: Decouple cluster allocation from cluster reuse code
This moves some code that prepares the allocation of new clusters to where the actual allocation happens. This is the minimum required to be able to move it to a separate function in the next patch....
qcow2: Factor out handle_alloc()
qcow2: handle_alloc(): Get rid of nb_clusters parameter
We already communicate the same information in *bytes.
qcow2: handle_alloc(): Get rid of keep_clusters parameter
handle_alloc() is now called with the offset at which the actual new allocation starts instead of the offset at which the whole write request starts, part of which may already be processed.
qcow2: Finalise interface of handle_alloc()
The interface works completely on a byte granularity now and duplicated parameters are removed.
qcow2: Clean up handle_alloc()
Things can be simplified a bit now. No semantic changes.
qcow2: Factor out handle_copied()
qcow2: handle_copied(): Get rid of nb_clusters parameter
handle_copied() uses its bytes parameter now to determine how many clusters it should try to find.
qcow2: handle_copied(): Get rid of keep_clusters parameter
Now *bytes is used to return the length of the area that can be written to without performing an allocation or COW.
qcow2: handle_copied(): Implement non-zero host_offset
Look only for clusters that start at a given physical offset.
qcow2: Prepare handle_alloc/copied() for byte granularity
This makes handle_alloc() and handle_copied() return byte-granularity host offsets instead of returning always the cluster start. This is required so that qcow2_alloc_cluster_offset() can stop aligning...
qcow2: Fix "total clusters" number in bdrv_check
This should be based on the virtual disk size, not on the size of the image.
Interesting observation: With some VM state stored in the image file, percentages higher than 100% are possible, even though snapshots...
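A small sketch makes the arithmetic concrete. It assumes that the cluster count is the virtual size divided by the cluster size, rounded up; both function names are hypothetical and the rounding behaviour is an assumption of this example, not something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Number of clusters needed to cover `virtual_size` bytes (rounded up). */
uint64_t example_total_clusters(uint64_t virtual_size, unsigned cluster_bits)
{
    uint64_t cluster_size;

    assert(cluster_bits < 63);

    cluster_size = UINT64_C(1) << cluster_bits;
    return virtual_size / cluster_size + (virtual_size % cluster_size != 0);
}

/*
 * Allocated clusters as a whole-number percentage of the total. The result
 * exceeds 100 when more clusters are allocated than the virtual disk has,
 * which is the situation described for images that also hold VM state.
 */
uint64_t example_allocated_percent(uint64_t allocated, uint64_t total)
{
    assert(total > 0);
    return allocated * 100 / total;
}
```

Basing the total on the image file size instead would make the percentage depend on how much metadata and snapshot data the file happens to contain.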
qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running coroutine will lock it again before it yields, so this doesn't make a lot of sense.
qcow2: Handle dependencies earlier
Handling overlapping allocations isn't just a detail of cluster allocation. It is rather one of three ways to get the host cluster offset for a write request:
1. If a request overlaps an in-flight allocation, the cluster offset...
block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
The new parameter is unused yet.
This part was missing in commit 787e4a8500020695eb391e2f1cc4767ee071d441.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>...
rbd: fix compile error
Commit 787e4a85 [block: Add options QDict to bdrv_file_open() prototypes] didn't update rbd.c accordingly.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>...
nbd: Accept -drive options for the network connection
The existing parsers for the file name now parse everything into the bdrv_open() options QDict. Instead of using these parsers, you can now directly specify the options on the command line, like this:...
block: Introduce .bdrv_parse_filename callback
If a driver needs structured data and not just a string, it can provide a .bdrv_parse_filename callback now that parses the command line string into separate options. Keeping this separate from .bdrv_open_filename...
block: Make find_image_format safe with NULL filename
In order to achieve this, the .bdrv_probe callbacks of all drivers must cope with this. The DMG driver is the only one that bases its decision on the filename and it needs to be changed.
nbd: Use default port if only host is specified
The URL method already takes care to apply the default port when none is specified. Directly specifying driver-specific options required the port number until now. Allow leaving it out and apply the default....
nbd: Check against invalid option combinations
A file name may only be specified if no host or socket path is specified. The latter two may not appear at the same time either.
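The two rules stated above translate into a compact validity check. The sketch below is an illustration only: `ExampleNbdOptions` and `example_check_nbd_options` are hypothetical names, the real driver works on an options QDict rather than a struct, and the case where nothing at all is specified is deliberately left out because the commit message does not describe it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified view of the connection options; NULL = unset. */
typedef struct ExampleNbdOptions {
    const char *filename;   /* legacy file name / URL form */
    const char *host;       /* TCP host */
    const char *path;       /* unix socket path */
} ExampleNbdOptions;

/* Returns 0 if the combination is allowed, -EINVAL otherwise. */
int example_check_nbd_options(const ExampleNbdOptions *o)
{
    assert(o != NULL);

    /* Rule 1: a file name excludes both host and socket path. */
    if (o->filename && (o->host || o->path)) {
        return -EINVAL;
    }
    /* Rule 2: host and socket path exclude each other. */
    if (o->host && o->path) {
        return -EINVAL;
    }
    return 0;
}
```

Checking the combinations up front keeps the later connection code free of ambiguous cases such as a URL plus an explicit host.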
block: Add options QDict to bdrv_file_open() prototypes
nbd: Keep hostname and port separate
The NBD block driver supports a URL syntax, for which a URL parser returns separate hostname and port fields. It also supports the traditional qemu syntax encoded in a filename. Until now, after parsing the URL to get each piece of information, a new string is built to be fed to socket...
sheepdog: show error message for halt status
Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when the number of alive nodes is less than the number of copies specified by users. This will return 0x19 to the QEMU client, which currently doesn't recognize it....
qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault on the migration target with qcow2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
threadpool: drop global thread pool
Now that each AioContext has a ThreadPool and the main loop AioContext can be fetched with bdrv_get_aio_context(), we can eliminate the concept of a global thread pool from thread-pool.c.
The submit functions must take a ThreadPool* argument....
qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the underlying file. Use bdrv_flush(bs) instead of bdrv_flush(bs->file).
Also add the error return path when bdrv_flush() fails and move the...
qcow2: set L2 cache dependency in qcow2_alloc_bytes()
Compressed writes use qcow2_alloc_bytes() to allocate space with byte granularity. The affected clusters' refcounts will be incremented but we do not need to flush yet.
Set a L2 cache dependency on the refcount block cache, so that the...
qcow2: flush in qcow2_update_snapshot_refcount()
Users of qcow2_update_snapshot_refcount() do not flush consistently. qcow2_snapshot_create() flushes but qcow2_snapshot_goto() and qcow2_snapshot_delete() do not.
Solve this by moving the bdrv_flush() into...
qcow2: drop flush in update_cluster_refcount()
The update_cluster_refcount() function increments/decrements a cluster's refcount and then returns the new refcount value.
There is no need to flush since both update_cluster_refcount() callers already take care of this:...
qcow2: drop unnecessary flush in qcow2_update_snapshot_refcount()
We already flush when the function completes. There is no need to flush after every compressed cluster.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed.
QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this...
sheepdog: use non-blocking fd in coroutine context
Using a blocking socket in the coroutine context reduces the chance of switching to other work. This patch makes the sheepdog driver use a non-blocking fd always.
sheepdog: set io_flush handler in do_co_req
If an io_flush handler is not set, qemu_aio_wait doesn't invoke callbacks.
block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the BlockDriverState.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>...
qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts option. Use it like this:
-drive file=test.qcow2,lazy_refcounts=on
If the option is specified on the command line, it overrides the default...
qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache; it does not write to disk. Therefore bdrv_flush(bs->file) does nothing. We need to flush the refcount cache in order to write out the refcount updates!...
iscsi: add iscsi_truncate support
This patch adds iscsi_truncate which effectively allows for online resizing of iscsi volumes. For this to work you have to resize the volume on your storage and then call the block_resize command in qemu which will issue a readcapacity16 to update the capacity....
iscsi: retry read, write, flush and unmap on unit attention check conditions
The storage might return a check condition status for various reasons (e.g. bus reset, capacity change, thin-provisioning info etc.).
Currently all these informative status responses lead to an I/O error...
move socket_set_nodelay to osdep.c
sheepdog: accept URIs
The URI syntax is consistent with the NBD and Gluster syntax. The syntax is
sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag]
sheepdog: use inet_connect to simplify connect code
This uses the form "<host>:<port>" for the representation of the sheepdog server to use inet_connect.
sheepdog: add support for connecting to unix domain socket
This patch adds support for a unix domain socket for a connection between qemu and local sheepdog server. You can use the unix domain socket with the following syntax:
$ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid]...
qcow2: record fragmentation statistics during check
The qemu-img check command can display fragmentation statistics:
 * Total number of clusters in virtual disk
 * Number of allocated clusters
 * Number of fragmented clusters
This patch adds fragmentation statistics support to qcow2....
qcow2: support compressed clusters in BlockFragInfo