Split nbd block client code
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
nbd: don't change socket block during negotiate
The caller might handle non-blocking using coroutine. Leave the choice to the caller to use a blocking or non-blocking negotiate.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>...
nbd: pass export name as init argument
There is no need to keep the export name around, and it seems a better fit as an argument in the init() call.
nbd: make nbd_client_session_close() idempotent
nbd: finish any pending coroutine
Make sure all pending coroutines are finished when closing the session.
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}
this converts read, write and flush functions from aio to coroutines, eliminating almost 200 lines of code.
The requirement for libiscsi is bumped to version 1.4.0, which was released in May 2012....
qcow2: use start_of_cluster() and offset_into_cluster() everywhere
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
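The two helper names in the commit title suggest simple mask arithmetic. A minimal sketch of that idea follows, assuming a power-of-two cluster size; the standalone signatures (cluster size passed as a plain argument) are an assumption made for illustration and are not taken from the qcow2 source.

```c
#include <stdint.h>

/*
 * Hypothetical standalone helpers. cluster_size is assumed to be a power of
 * two, which is what makes the mask arithmetic valid.
 */
static inline uint64_t start_of_cluster(uint64_t cluster_size, uint64_t offset)
{
    /* Clear the low bits to round down to the cluster boundary. */
    return offset & ~(cluster_size - 1);
}

static inline uint64_t offset_into_cluster(uint64_t cluster_size, uint64_t offset)
{
    /* Keep only the low bits: the byte position inside the cluster. */
    return offset & (cluster_size - 1);
}
```

For any offset, the two results add back up to the original value, which is a quick sanity check when replacing open-coded masks with helpers like these.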
block/iscsi: set bdi->cluster_size
this patch aims to set bdi->cluster_size to the internal page size of the iscsi target so that enabled callers can align requests properly.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>...
block/iscsi: set bs->bl.opt_transfer_length
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
snapshot: distinguish id and name in load_tmp
This function will be used later, so improve it. The only caller of it now is qemu-img, and it is not impacted by the introduction of the function bdrv_snapshot_load_tmp_by_id_or_name(), which calls bdrv_snapshot_load_tmp()...
qemu-nbd: support internal snapshot export
Now it is possible to directly export an internal snapshot, which can be used to probe the snapshot's contents without qemu-img convert.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qcow2: Zero-initialise first cluster for new images
Strictly speaking, this is only required for has_zero_init() == false, but it's easy enough to just do a cluster-aligned write that is padded with zeros after the header.
This fixes that after 'qemu-img create' header extensions are attempted...
block: handle ENOTSUP from discard in generic code
Similar to write_zeroes, let the generic code receive an ENOTSUP for discard operations. Since bdrv_discard has advisory semantics, we can just swallow the error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
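The "swallow the error" behaviour described above can be sketched as a thin generic wrapper around a driver callback. The callback type, the wrapper name and the two sample drivers are hypothetical and only illustrate the control flow; they are not the QEMU signatures.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback type (not the QEMU signature). */
typedef int (*discard_fn)(int64_t sector_num, int nb_sectors);

/* Sample drivers used only to exercise the wrapper. */
static int discard_unsupported(int64_t sector_num, int nb_sectors)
{
    (void)sector_num;
    (void)nb_sectors;
    return -ENOTSUP;
}

static int discard_io_error(int64_t sector_num, int nb_sectors)
{
    (void)sector_num;
    (void)nb_sectors;
    return -EIO;
}

/*
 * Generic layer: discard is advisory, so a driver that cannot discard is
 * treated as success. Any other error is still reported to the caller.
 */
static int generic_discard(discard_fn driver_discard,
                           int64_t sector_num, int nb_sectors)
{
    int ret;

    if (driver_discard == NULL) {
        return 0; /* no callback at all: nothing to do */
    }
    ret = driver_discard(sector_num, nb_sectors);
    if (ret == -ENOTSUP) {
        return 0; /* swallow "not supported" */
    }
    return ret;
}
```

Only -ENOTSUP is filtered here; a real I/O failure such as -EIO still reaches the caller.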
vpc, vhdx: add get_info
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block drivers: add discard/write_zeroes properties to bdrv_get_info implementation
block drivers: expose requirement for write same alignment from formats
This will let misaligned but large requests use zero clusters. This is important because the cluster size is not guest visible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>...
block/iscsi: remove .bdrv_has_zero_init
since commit 3ac21627 the default value changed to 0.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block/iscsi: updated copyright
added myself to reflect recent work on the iscsi block driver.
block/iscsi: check WRITE SAME support differently depending on MAY_UNMAP
The current check is right for MAY_UNMAP=1. For MAY_UNMAP=0, just try and fall back to regular writes as soon as a WRITE SAME command fails.
raw-posix: implement write_zeroes with MAY_UNMAP for files
Writing zeroes to a file can be done by punching a hole if MAY_UNMAP is set.
Note that in this case ENOTSUP is not ignored, but makes the block layer fall back to the generic implementation.
raw-posix: implement write_zeroes with MAY_UNMAP for block devices
See the next commit for the description of the Linux kernel problem that is worked around in raw_open_common.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
raw-posix: add support for write_zeroes on XFS and block devices
The code is similar to the implementation of discard and write_zeroes with UNMAP. However, failure must be propagated up to block.c.
The stale page cache problem can be reproduced as follows:...
vmdk: Fix creating big description file
The buffer for the description file was 4096 bytes, which only covers a few hundred extents. This changes the buffer to be dynamically allocated with g_strdup_printf in order to support bigger cases.
Signed-off-by: Fam Zheng <famz@redhat.com>...
vmdk: Allow read only open of VMDK version 3
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: Use BDRV_O_NO_BACKING where appropriate
If you open an image temporarily just because you want to check its size or get it flushed, there's no real reason to open the whole backing file chain.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>...
sheepdog: refactor do_sd_create()
We can actually use BDRVSheepdogState *s to pass most of the parameters.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: support user-defined redundancy option
Sheepdog supports two kinds of redundancy: full replication and erasure coding.
blkdebug: add "remove_break" command
This adds "remove_break" command which is the reverse of blkdebug command "break": it removes all breakpoints with given tag and resumes all the requests.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qapi: Change BlockDirtyInfo to list
We have multiple dirty bitmaps in BDS now, so switch QAPI to allow querying them (BlockInfo.dirty_bitmaps), and also drop the old BlockInfo.dirty.
COW: Speed up writes
Process a whole sector's worth of COW bits by reading a sector, setting the bits after skipping any already set bits, then writing it out again. Make sure we only flush once before writing metadata, and only if we need to write metadata....
COW: Extend checking allocated bits to beyond one sector
cow_co_is_allocated() only checks one sector's worth of allocated bits before returning. This is allowed but (slightly) inefficient, so extend it to check all of the file's metadata sectors.
Signed-off-by: Charlie Shepherd <charlie@ctshepherd.com>...
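The idea of scanning past a single sector's worth of bits can be illustrated on an in-memory bitmap. This is only a sketch: it assumes the whole bitmap is already loaded, uses one bit per sector with the least significant bit first, and the function names are hypothetical rather than taken from the COW driver.

```c
#include <stdint.h>

/* Return the allocation bit for sector n (LSB-first within each byte). */
static int bit_is_set(const uint8_t *bitmap, uint64_t n)
{
    return (bitmap[n / 8] >> (n % 8)) & 1;
}

/*
 * Count how many consecutive sectors, starting at 'start', share the
 * allocation state of the first one. The scan is bounded only by
 * nb_sectors, not by the amount of bitmap that fits in one sector.
 */
static uint64_t count_same_state(const uint8_t *bitmap, uint64_t start,
                                 uint64_t nb_sectors)
{
    uint64_t n;
    int first;

    if (nb_sectors == 0) {
        return 0;
    }
    first = bit_is_set(bitmap, start);
    for (n = 1; n < nb_sectors; n++) {
        if (bit_is_set(bitmap, start + n) != first) {
            break;
        }
    }
    return n;
}
```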
block: per caller dirty bitmap
Previously a BlockDriverState had only one dirty bitmap, so only one caller (e.g. a block job) could keep track of writing. This changes the dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller; the lifecycle is managed with these new functions:...
block/stream: Don't stream unbacked devices
If a block device is unbacked, a streaming blockjob should immediately finish instead of beginning to try to stream, then noticing the backing file does not contain even the first sector (since it does not exist)...
iscsi: set limits in BlockDriverState
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
iscsi: simplify iscsi_co_discard
now that bdrv_co_discard can handle limits we do not need the request split logic here anymore.
iscsi: add bdrv_co_write_zeroes
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sheepdog: implement .bdrv_get_allocated_file_size
With this patch, qemu-img info sheepdog:image will show disk size for sheepdog images.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
block: add flags to bdrv_*_write_zeroes
block: introduce BDRV_REQ_MAY_UNMAP request flag
block/iscsi: add .bdrv_get_info
block/raw: copy BlockLimits on raw_open
qcow2: fix possible corruption when reading multiple clusters
if multiple sectors spanning multiple clusters are read, the function count_contiguous_clusters must ensure that the cluster type does not change between the clusters.
Especially the for-loop should break when we have one...
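The requirement that a contiguous run must not cross a change of cluster type can be sketched as follows. The simplified table entry, the enum and the function name are all hypothetical; the real qcow2 L2 entry encoding is different, so this only shows where the loop has to break.

```c
#include <stdint.h>

/* Hypothetical simplified cluster classification. */
enum cluster_type { CLUSTER_UNALLOCATED, CLUSTER_NORMAL, CLUSTER_ZERO };

/* Hypothetical decoded table entry (not the on-disk qcow2 format). */
struct l2_entry {
    enum cluster_type type;
    uint64_t host_offset;
};

/*
 * Count how many entries, starting at table[0], form one contiguous run.
 * The run ends as soon as the type changes or the host offsets stop being
 * consecutive.
 */
static int count_contiguous(const struct l2_entry *table, int nb_clusters,
                            uint64_t cluster_size)
{
    int i;

    if (nb_clusters <= 0) {
        return 0;
    }
    for (i = 1; i < nb_clusters; i++) {
        if (table[i].type != table[0].type) {
            break; /* a different cluster type ends the run */
        }
        if (table[i].host_offset !=
            table[0].host_offset + (uint64_t)i * cluster_size) {
            break; /* not physically adjacent */
        }
    }
    return i;
}
```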
block: Print its file name if backing file opening failed
If backing file doesn't exist, the error message is confusing and misleading:
$ qemu /tmp/a.qcow2
qemu: could not open disk image /tmp/a.qcow2: Could not open file: No such file or directory...
block: vhdx - add region overlap detection for image files
Regions in the image file cannot overlap - the log, region tables, and metadata must all be unique and non-overlapping.
This adds region checking by means of a QLIST; there can be a variable number of regions and metadata (there may be metadata or region tables...
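A minimal sketch of the overlap check, using a fixed-size array instead of a QLIST to stay self-contained. The struct and function names and the capacity limit are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 16 /* arbitrary limit for this sketch */

struct region {
    uint64_t start;
    uint64_t end; /* exclusive */
};

struct region_list {
    struct region items[MAX_REGIONS];
    size_t count;
};

/*
 * Register [start, start + length) if it does not overlap any region that
 * was registered earlier. Returns 0 on success, -1 on overlap, on an empty
 * or wrapping range, or when the list is full.
 */
static int region_register(struct region_list *list, uint64_t start,
                           uint64_t length)
{
    uint64_t end;
    size_t i;

    if (length == 0 || start > UINT64_MAX - length) {
        return -1;
    }
    end = start + length;
    for (i = 0; i < list->count; i++) {
        /* Two half-open ranges overlap iff each starts before the other ends. */
        if (start < list->items[i].end && list->items[i].start < end) {
            return -1;
        }
    }
    if (list->count == MAX_REGIONS) {
        return -1;
    }
    list->items[list->count].start = start;
    list->items[list->count].end = end;
    list->count++;
    return 0;
}
```

Adjacent regions (one ending exactly where the next begins) are accepted, since the ranges are half-open.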
block: vhdx - add log write support
This adds support for writing to the VHDX log.
For spec details, see VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750
There are a few limitations to this log support:
1.) There is no caching yet...
block: vhdx write support
This adds support for writing to VHDX image files, using coroutines. Writes into the BAT table go through the VHDX log. Currently, BAT table writes occur when expanding a dynamic VHDX file, and allocating a new BAT entry.
Signed-off-by: Jeff Cody <jcody@redhat.com>...
block: vhdx - remove BAT file offset bit shifting
Bit shifting can be fun, but in this case it was unnecessary. The upper 44 bits of the 64-bit BAT entry specify the File Offset, so we shifted the bits to get access to the value.
However, per the spec the value is in MB. So we dutifully shifted back...
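Since the upper 44 bits hold the offset in MB, shifting right by 20 and then multiplying by 1 MiB gives the same result as simply masking off the low 20 bits. A small sketch of both variants follows; the macro and function names are hypothetical and chosen for this example.

```c
#include <stdint.h>

/* Upper 44 bits of a 64-bit BAT entry (names are illustrative only). */
#define BAT_FILE_OFFSET_MASK 0xFFFFFFFFFFF00000ULL

/* Shift down to get the value in MB, then shift back up to get bytes. */
static uint64_t bat_offset_by_shifting(uint64_t bat_entry)
{
    uint64_t offset_mb = bat_entry >> 20;
    return offset_mb << 20;
}

/* Equivalent result with a single mask and no shifting. */
static uint64_t bat_offset_by_masking(uint64_t bat_entry)
{
    return bat_entry & BAT_FILE_OFFSET_MASK;
}
```

Both functions return a byte offset that is a multiple of 1 MiB; the low 20 bits of the entry are ignored either way.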
block: vhdx - move more endian translations to vhdx-endian.c
In preparation for vhdx_create(), move more endian translation functions out to vhdx-endian.c.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: vhdx - break out code operations to functions
This is preparation for vhdx_create(). The ability to write headers, and calculate the number of BAT entries will be needed within the create() functions, so move this relevant code into helper functions....
block: vhdx - fix comment typos in header, fix incorrect struct fields
VHDXPage83Data and VHDXParentLocatorHeader both incorrectly had their MSGUID fields set as arrays of 16. This is incorrect (it stems from an early version where those fields were uint_8 arrays). Those fields...
block: vhdx - add .bdrv_create() support
This adds support for VHDX image creation, for images of type "Fixed" and "Dynamic". "Differencing" types (i.e., VHDX images with backing files) are currently not supported.
Options for image creation include: * log size:...
block/vpc: fix virtual size for images created with disk2vhd
block: vhdx - minor comments and typo correction.
Just a couple of minor comments to help note where allocated buffers are freed, and a typo fix.
block: vhdx - add header update capability.
This adds the ability to update the headers in a VHDX image, including generating a new MS-compatible GUID.
As VHDX depends on uuid.h, VHDX is now a configurable build option. If VHDX support is enabled, that will also enable uuid. The...
block: vhdx code movement - VHDXMetadataEntries and BDRVVHDXState to header.
In preparation for VHDX log support, move these structures to the header.
block: vhdx - log support struct and defines
This adds some magic number defines, and internal structure definitions for VHDX log replay support. The struct VHDXLogEntries does not reflect an on-disk data structure, and thus does not need to be packed....
block: vhdx - break endian translation functions out
This moves the endian translation functions out from the vhdx.c source, into a separate source file. In addition to the previously defined endian functions, new endian translation functions for log support are...
block: vhdx - update log guid in header, and first write tracker
Allow tracking of first file write in the VHDX image, as well as the ability to update the GUID in the header. This is in preparation for log support.
block: vhdx code movement - move vhdx_close() above vhdx_open()
block: vhdx - log parsing, replay, and flush support
This adds support for VHDX v0 logs, as specified in Microsoft's VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750
The following support is added:
block/raw-posix: fix FreeBSD compilation
The below patch is needed to compile qemu trunk on FreeBSD with gcc48, clang will fail.... ;). Host x86_64-freebsd.
Signed-off-by: Andreas Tobler <andreast@FreeBSD.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bswap.h: Remove cpu_to_be64wu()
Replace the legacy cpu_to_be64wu() with stq_be_p().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-9-git-send-email-peter.maydell@linaro.org...
Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging
Block patches for 1.7.0-rc0 (v2)
vmdk: Implement bdrv_get_specific_info
Implement .bdrv_get_specific_info to return the extent information.
sheepdog: check simultaneous create in resend_aioreq
After reconnection happens, all the inflight requests are moved to the failed request list. As a result, sd_co_rw_vector() can send another create request before resend_aioreq() resends a create request from...
sheepdog: cancel aio requests if possible
This patch tries to cancel aio requests in pending queue and failed queue. When the sheepdog driver cannot cancel the requests, it waits for them to be completed.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
sheepdog: make add_aio_request and send_aioreq void functions
These functions no longer return errors. We can make them void functions and simplify the code.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>...
sheepdog: try to reconnect to sheepdog after network error
This introduces a failed request queue and links all the inflight requests to the list after network error happens. After QEMU reconnects to the sheepdog server successfully, the sheepdog block driver will retry all the requests in the failed queue....
sheepdog: reload inode outside of resend_aioreq
This prepares for using resend_aioreq() after reconnecting to the sheepdog server.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>...
sheepdog: handle vdi objects in resend_aio_req
The current resend_aio_req() doesn't work when the request is against vdi objects. This fixes the problem.
sheepdog: check return values of qemu_co_recv/send correctly
If qemu_co_recv/send doesn't return the specified length, it means that an error happened.
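The rule stated above (anything other than the full requested length is an error) can be sketched as a small check on the return value. The helper name and the choice of -EIO for a short transfer are assumptions made for this example; they are not taken from the sheepdog driver.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: 'ret' is the value returned by a send/recv style
 * call, 'expected' is the length that was requested.
 * Returns 0 if the whole buffer was transferred, a negative errno otherwise.
 */
static int check_full_transfer(ssize_t ret, size_t expected)
{
    if (ret < 0) {
        return (int)ret; /* pass a negative error code through unchanged */
    }
    if ((size_t)ret != expected) {
        return -EIO; /* short transfer counts as an error (assumed code) */
    }
    return 0;
}
```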
block: Avoid unnecessary drv->bdrv_getlength() calls
The block layer generally keeps the size of an image cached in bs->total_sectors so that it doesn't have to perform expensive operations to get the size whenever it needs it.
This doesn't work however when using a backend that can change its size...
sheepdog: pass copy_policy in the request
Currently copy_policy isn't used. Recent sheepdog supports erasure coding, which makes use of copy_policy internally, but requires the client to explicitly pass copy_policy from the base inode to the newly created inode for snapshot related...
sheepdog: explicitly set copies as type uint8_t
'copies' is actually uint8_t since day one, but request headers and some helper functions parameterize it as uint32_t for unknown reasons and effectively reserve 24 bits for possible future use. This patch explicitly sets the correct...
qcow2: Flush image after creation
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during the image creation. This means that the image has not yet been flushed to disk when qemu-img create exits. This flush is delayed until the next operation on the image involving opening it without BDRV_O_NO_FLUSH and...
misc: New spelling fixes in comments
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
block/vpc: check that the image has not been truncated
this adds a check that a dynamic VHD file has not been accidentally truncated (e.g. during transfer or upload).
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Unset zero_beyond_eof in save_vmstate
Saving the VM state is done using bdrv_pwrite. This function may perform a read-modify-write, which in this case results in data being read from beyond the end of the virtual disk. Since we are actually trying to...
qcow2: Restore total_sectors value in save_vmstate
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of a growable block device on writes after the current end. This leads to the virtual disk apparently growing in qcow2_save_vmstate, which in turn...
vmdk: fix VMFS extent parsing
The VMFS extent line in the description file doesn't have a start offset as FLAT lines do, and it should be defaulted to 0. The flat_offset variable is initialized to -1, so we need to set it in this case.
vmdk: Only read cid from image file when opening
Previously the cid of the parent was parsed from the image file for every IO request. We already have the L1/L2 cache and do not assume that the parent image can be updated behind us, so remove this to get more efficiency....
block/raw-win32: Always use -errno in hdev_open
On one occasion, hdev_open() returned -1 in case of an unknown error instead of a proper -errno value. Adjust this to match the behavior of raw_open() (in raw-win32), which is to return -EINVAL in this case....
vmdk: Fix vmdk_parse_extents
An extra 'p++' after while loop when *p == '\n' will move p to unknown data position, risking parsing junk data or memory access violation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
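The class of bug described above can be avoided by consuming the newline exactly once and never stepping past the terminating NUL. The helper below is a sketch with a hypothetical name; it is not the code from vmdk_parse_extents.

```c
/*
 * Advance p to the start of the next line. If there is no further newline,
 * p ends up on the terminating NUL and is never moved beyond it.
 */
static const char *next_line(const char *p)
{
    while (*p != '\0' && *p != '\n') {
        p++;
    }
    if (*p == '\n') {
        p++; /* skip the newline exactly once */
    }
    return p;
}
```

Calling the helper again on a pointer that already sits on the NUL returns the same pointer, so a parsing loop can use `*p != '\0'` as its only termination condition.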
vmdk: convert error code to use errp
Convert "fprintf(stderr,..." and standardize error messages:
Remove a few local_error's and use errp.
Remove "VMDK:" or "Vmdk:" prefixes in error message and fix to uppercase.
vmdk: refuse enabling zeroed grain with flat images
This is a header flag and we need sparse for the header.
qcow2: Fix snapshot restoration in snapshot_create
If the new snapshot table could not be written in qcow2_snapshot_create, the old snapshot table has to be restored in memory and the new one released. This should include restoration of the old snapshot count as...
qcow2: Use better type for numerical snapshot ID
When trying to find a new snapshot ID, the existing ones are converted to integers using strtoul. This function returns an unsigned long, therefore its result should be saved in an unsigned long as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>...
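The type point can be shown with a short sketch that keeps every intermediate value as unsigned long, matching the return type of strtoul. The function name and the array-of-strings input are assumptions for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Return the largest numeric ID among the given strings. Both the parsed
 * value and the running maximum are unsigned long, so nothing returned by
 * strtoul() is truncated or changes sign.
 */
static unsigned long max_numeric_id(const char *const *ids, size_t count)
{
    unsigned long max = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned long id = strtoul(ids[i], NULL, 10);
        if (id > max) {
            max = id;
        }
    }
    return max;
}
```

A new ID would then be derived from the returned maximum; storing the result in an int instead could overflow on platforms where unsigned long is wider than int.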
qcow2: Use negated overflow check mask
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check, change the parameter signifying the checks to perform from its current positive form to a negative one, i.e., it will no longer explicitly specify every check to perform but rather a mask of checks not to...
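The switch from a positive to a negated mask can be sketched with plain bit operations. The flag names below are hypothetical placeholders, not the QCOW2_OL_* constants.

```c
/* Hypothetical check bits for illustration only. */
enum {
    OL_CHECK_HEADER   = 1 << 0,
    OL_CHECK_L1       = 1 << 1,
    OL_CHECK_REFTABLE = 1 << 2,
    OL_CHECK_ALL      = OL_CHECK_HEADER | OL_CHECK_L1 | OL_CHECK_REFTABLE
};

/*
 * 'enabled' is the set of checks that are configured; 'ignore' is the
 * negated parameter passed by the caller: the checks to skip this time.
 */
static int checks_to_perform(int enabled, int ignore)
{
    return enabled & ~ignore;
}
```

With this form a caller that wants everything passes 0, and a caller that must skip one check names only that check.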
qcow2: Make overlap check mask variable
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in BDRVQcowState.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Add overlap-check options
Add runtime options to tune the overlap checks to be performed before write accesses.
qcow2: Array assigning options to OL check bits
Add an array which assigns the option string to its corresponding overlap check bit.
qcow2: Add more overlap check bitmask macros
Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to the already existing QCOW2_OL_CACHED, signifying all metadata overlap checks that can be performed in constant time (regardless of image size...
qcow2: Evaluate overlap check options
Evaluate the runtime overlap check options and set BDRVQcowState.overlap_check appropriately.
block/raw_bsd: Employ error parameter
Propagate errors in raw_create rather than directly reporting and afterwards discarding them.
block/raw-win32: Employ error parameter
Make use of the error parameter in the opening and creating functions in block/raw-win32.c.
blkdebug: Employ error parameter
Make use of the error parameter in blkdebug_open.
blkverify: Employ error parameter
Make use of the error parameter in blkverify_open.
block/raw-posix: Employ error parameter
Make use of the error parameter in the opening and creating functions in block/raw-posix.c.
qcow2: Add missing space in error message
The error message in qcow2_downgrade about an unsupported refcount order is missing a space. This patch adds it.
qcow2: Remove wrong metadata overlap check
In qcow2_write_compressed, if the compression fails, a normal cluster is written to disk. This is done through bdrv_write on the qcow2 BDS itself (using the guest offset), thus it is wrong to do a metadata overlap check before....