block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.c
The aligned_buf pointer and aligned_buf size are no longer used in raw-posix.c, so remove all references to them.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: move aio initialization into a helper function
Move AIO initialization for raw-posix block driver into a helper function.
In addition to just code motion, the aio_ctx pointer is checked for NULL, prior to calling laio_init(), to make sure laio_init() is only run once....
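A minimal sketch of the NULL-guard pattern described above, not the actual raw-posix code. DemoState, demo_aio_init() (a stand-in for laio_init()) and demo_set_aio() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; the real structure differs. */
typedef struct DemoState {
    void *aio_ctx;          /* NULL until AIO has been initialized */
} DemoState;

int demo_init_calls;        /* counts how often the initializer ran */
int demo_ctx_storage;       /* dummy object the context pointer refers to */

/* Stand-in for laio_init(): returns a non-NULL context on success. */
void *demo_aio_init(void)
{
    demo_init_calls++;
    return &demo_ctx_storage;
}

/* Helper: initialize AIO only if it has not been set up yet. */
int demo_set_aio(DemoState *s)
{
    assert(s != NULL);
    if (s->aio_ctx == NULL) {
        s->aio_ctx = demo_aio_init();
        if (s->aio_ctx == NULL) {
            return -1;
        }
    }
    return 0;
}
```

Calling demo_set_aio() twice on the same state runs the initializer only once, which is the property the NULL check is meant to guarantee.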
block: move open flag parsing in raw block drivers to helper functions
Code motion, to move parsing of open flags into a helper function.
block: do not parse BDRV_O_CACHE_WB in block drivers
Block drivers should ignore BDRV_O_CACHE_WB in .bdrv_open flags, and in the bs->open_flags.
This patch removes the code, leaving the behaviour behind as if BDRV_O_CACHE_WB was set.
Signed-off-by: Jeff Cody <jcody@redhat.com>...
block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c
Rather than check for a non-NULL aligned_buf to determine if raw_aio_submit needs to check for alignment, check for the presence of BDRV_O_NOCACHE in the bs->open_flags.
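As a rough illustration of the check described above, the sketch below decides whether a request needs alignment handling from the open flags alone. DEMO_O_NOCACHE, DEMO_SECTOR_ALIGN and demo_needs_alignment_check() are hypothetical; the flag value and the 512-byte alignment are assumptions, not QEMU's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; QEMU defines the real BDRV_O_NOCACHE elsewhere. */
#define DEMO_O_NOCACHE    0x0020
#define DEMO_SECTOR_ALIGN 512

/*
 * Returns true when the image was opened without host caching AND the
 * buffer address or length is not a multiple of the assumed alignment.
 * With caching enabled, no alignment check is performed at all.
 */
bool demo_needs_alignment_check(int open_flags, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (!(open_flags & DEMO_O_NOCACHE)) {
        return false;
    }
    return ((uintptr_t)buf % DEMO_SECTOR_ALIGN) != 0 ||
           (len % DEMO_SECTOR_ALIGN) != 0;
}
```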
sheepdog: fix savevm and loadvm
This patch sets data to be sent to Sheepdog correctly and fixes savevm and loadvm operations on a Sheepdog image.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block/curl: Fix wrong free statement
Report from smatch:
block/curl.c:546 curl_close(21) info: redundant null check on s->url calling free()
The check was redundant, and free was also wrong because the memory was allocated using g_strdup.
Signed-off-by: Stefan Weil <sw@weilnetz.de>...
vdi: Fix warning from clang
ccc-analyzer reports these warnings:
block/vdi.c:704:13: warning: Dereference of null pointer
    bmap[i] = VDI_UNALLOCATED;
    ^
block/vdi.c:702:13: warning: Dereference of null pointer
    bmap[i] = i;
...
Merge remote-tracking branch 'kwolf/for-anthony' into staging
qed: refuse unaligned zero writes with a backing file
Zero writes have cluster granularity in QED. Therefore they can only be used to zero entire clusters.
If the zero write request leaves sectors untouched, zeroing the entire cluster would obscure the backing file. Instead return -ENOTSUP, which...
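The rule above can be sketched as a small predicate. This is an illustrative helper under stated assumptions, not the QED driver code: demo_check_zero_write() and its parameters are hypothetical, and sizes are expressed in sectors.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns 0 if a zero write covering [sector_num, sector_num + nb_sectors)
 * can be expressed as whole zero clusters, or if there is no backing file
 * whose data could be hidden. Returns -ENOTSUP for an unaligned request
 * on an image with a backing file.
 */
int demo_check_zero_write(uint64_t sector_num, uint64_t nb_sectors,
                          uint64_t cluster_sectors, bool has_backing_file)
{
    assert(cluster_sectors > 0);
    bool aligned = (sector_num % cluster_sectors) == 0 &&
                   (nb_sectors % cluster_sectors) == 0;

    if (!aligned && has_backing_file) {
        return -ENOTSUP;
    }
    return 0;
}
```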
stream: complete early if end of backing file is reached
It is possible to create an image that is larger than its backing file. Reading beyond the end of the backing file produces zeroes if no writes have been made to those sectors in the image file.
This patch finishes streaming early when the end of the backing file is...
iscsi: Set number of blocks to 0 for blank CDROM devices
The number of blocks of the device is used to compute the device size in bdrv_getlength()/iscsi_getlength(). For MMC devices, the ReturnedLogicalBlockAddress in the READCAPACITY10 has a special meaning when it is 0....
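A hedged sketch of the size computation, assuming (from the commit title) that a returned LBA of 0 on an MMC device is treated as a blank medium with zero blocks, and that otherwise the block count is the last LBA plus one. The function names are hypothetical and this is not the libiscsi or QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Block count derived from the last LBA reported by READCAPACITY10. */
uint64_t demo_num_blocks(uint32_t returned_lba, bool is_mmc)
{
    if (is_mmc && returned_lba == 0) {
        return 0;                       /* assumed: blank CDROM medium */
    }
    return (uint64_t)returned_lba + 1;  /* last LBA is inclusive */
}

/* Device length in bytes, as a getlength-style callback would report it. */
uint64_t demo_getlength(uint32_t returned_lba, uint32_t block_size, bool is_mmc)
{
    assert(block_size > 0);
    return demo_num_blocks(returned_lba, is_mmc) * block_size;
}
```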
Merge remote-tracking branch 'bonzini/scsi-next' into staging
sheepdog: don't leak socket file descriptor upon connection failure
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
iscsi: move iscsi_schedule_bh and iscsi_readv_writev_bh_cb
Put these functions at the beginning, to avoid forward references in the next patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iscsi: simplify iscsi_schedule_bh
It is always used with the same callback, so remove the argument. Its return value is never used, so assume allocation succeeds.
iscsi: fix races between task completion and abort
This patch fixes two main issues with block/iscsi.c:
1) iscsi_task_mgmt_abort_task_async calls iscsi_scsi_task_cancel which was also directly called in iscsi_aio_cancel
2) a race between task completion and task abortion could happen cause...
Revert "iscsi: Fix NULL dereferences / races between task completion and abort"
This reverts commit 64e69e80920d82df3fa679bc41b13770d2f99360. The commit returned immediately from iscsi_aio_cancel, risking corruption in case the following happens:
guest qemu target...
vmdk: Read footer for streamOptimized images
The footer takes precedence over the header when it exists. It contains the real grain directory offset that is missing in the header. Without this patch, streamOptimized images with a footer cannot be read.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
vmdk: Fix header structure
Commit bb45ded9 swapped gd_offset and rgd_offset. This is wrong.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iscsi: Fix NULL dereferences / races between task completion and abort
Signed-off-by: Stefan Priebe <s.priebe@profihost.ag>
Acked-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: Prevent detection of /dev/fdset/ as floppy
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: Convert open calls to qemu_open
This patch converts all block layer open calls to qemu_open.
Note that this adds the O_CLOEXEC flag to the changed open paths when the O_CLOEXEC macro is defined.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>...
block: Convert close calls to qemu_close
This patch converts all block layer close calls that correspond to qemu_open calls to qemu_close.
qed: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qed_open(). Move this into qed_check() so that all callers benefit from this behavior when fix=true.
This is necessary so qemu-img check can call .bdrv_check() and mark the...
qcow2: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled.
block: add BLOCK_O_CHECK for qemu-img check
Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c....
iscsi: Pick default initiator-name based on the name of the VM
This patch updates the iscsi layer to automatically pick a 'unique' initiator-name based on the name of the vm in case the user has not set an explicit iqn-name to use.
Create a new function qemu_get_vm_name() that returns the name of the VM,...
iscsi: do not leak initiator_name
The argument of iscsi_create_context is never freed by libiscsi, which in fact calls strdup on it. Avoid a leak.
iscsi: reorganize code for parse_initiator_name
Merge the occurrences of the "iqn.2008-11.org.linux-kvm" string to avoid duplication.
qcow2: introduce dirty bit
This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened, a consistency check and repair is performed.
Update qemu-iotests 031 and 036 since the extension header size changes...
qcow2: implement lazy refcounts
Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened....
vvfat: Fix partition table
Unless parameter ":floppy:" is given, vvfat creates a virtual image with DOS MBR defining a single partition which holds the FAT filesystem. The size of the virtual image depends on the width of the FAT: 32 MiB (CHS 64, 16, 63) for 12 bit FAT, 504 MiB (CHS 1024, 16,...
vvfat: Do not clobber the user's geometry
vvfat creates a virtual VFAT filesystem with a certain logical geometry that depends on its options. It sets the "geometry hint" to this geometry. It is the only block driver to do this.
The geometry hint is about physical geometry, and used only by...
sheepdog: always use coroutine-based network functions
This reduces some code duplication.
sheepdog: do not blindly memset all read buffers
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Merge remote-tracking branch 'mjt/mjt-iov2' into staging
sheepdog: split outstanding list into inflight and pending
outstanding_list_head is used for both pending and inflight requests. This patch splits it and improves readability.
sheepdog: traverse pending_list from the first for each time
The pending list can be modified in another coroutine context (sd_co_rw_vector), so we need to traverse the list from the start again after we send the pending request.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
blkdebug: remove sync i/o events
These are unused, except (by mistake more or less) in QED.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blkdebug: tiny cleanup
blkdebug: pass getlength to underlying file
This is required when using blkdebug with raw format. Unlike qcow2/QED, raw asks blkdebug for the length of the file; it doesn't get it from a header.
blkdebug: store list of active rules
This prepares for the next patch, where some active rules may actually not trigger depending on input to readv/writev. Store the active rules in a SIMPLEQ (so that it can be emptied easily with QSIMPLEQ_INIT), and fetch the errno/once/immediately arguments from there....
blkdebug: optionally tie errors to a specific sector
This makes blkdebug scripts more powerful, and independent of the exact sequence of operations performed by streaming.
raw: hook into blkdebug
qcow2: fix #ifdef'd qcow2_check_refcounts() callers
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks throughout the code. This makes it easier to spot corruptions that are introduced during development. Since consistency check is an expensive...
qcow2: preserve free_byte_offset when qcow2_alloc_bytes() fails
When qcow2_alloc_clusters() error handling code was introduced in commit 5d757b563d59142ca81e1073a8e8396750a0ad1a, the value of free_byte_offset was clobbered in the error case. This patch keeps free_byte_offset at 0...
sheepdog: fix dprintf format strings
This fixes warnings about dprintf format in debug mode.
sheepdog: restart I/O when socket becomes ready in do_co_req()
Currently, no one reenters the yielded coroutine. This fixes it.
sheepdog: use coroutine based socket functions in coroutine context
This removes blocking network I/Os in coroutine context.
sheepdog: make sure we don't free aiocb before sending all requests
This patch increments the pending counter before sending requests, and makes sure that the aiocb is not freed while sending them.
ISCSI: Add SCSI passthrough via scsi-generic to libiscsi
Update iscsi to allow passthrough of SG_IO scsi commands when the iscsi device is forced to be scsi-generic.
Implement both bdrv_ioctl() and bdrv_aio_ioctl() in the iscsi backend, emulate the SG_IO ioctl and pass the SCSI commands across to the...
ISCSI: force use of sg for SMC and SSC devices
If the device we open is an SMC or SSC device, then force the use of sg. We don't have any medium changer or tape emulation, so only passthrough via real sg or scsi-generic via iscsi would work anyway.
Forcing sg also makes qemu skip trying to read from the device to guess...
raw-posix: Fix build without is_allocated support
Move the declaration of s into the #ifdef sections that actually make use of it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
qcow2: always operate caches in writeback mode
Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
qcow2: Simplify calculation for COW area at the end
copy_sectors() always uses the sum (cluster_offset + n_start) or (start_sect + n_start), so if some value is added to both cluster_offset and start_sect, and subtracted from n_start, it's cancelled out anyway....
qcow2: Fix avail_sectors in cluster allocation code
avail_sectors should really be the number of sectors from the start of the allocation, not from the start of the write request.
We're lucky enough that this mistake didn't cause any real bug. avail_sectors is only used in the initialiser of QCowL2Meta:...
qcow2: fix autoclear image header update
The autoclear feature bits can be used for qcow2 file format features that are safe to "drop" by old programs that do not understand the feature. Upon opening the image file, unknown autoclear feature bits are cleared and the image file header is rewritten, but this was happening...
qcow2: remove a line of unnecessary code
Commit 3948d1d4 removed the pointer argument we filled in with l2_offset but forgot to remove the unnecessary l2_offset assignment.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>...
qcow2: fix endianness conversion
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: implement is_allocated for raw
Either FIEMAP or SEEK_DATA+SEEK_HOLE can be used to implement the is_allocated callback for raw files. On Linux, ext4, btrfs and XFS all support it.
stream: tweak usage of bdrv_co_is_allocated
is_allocated_base has complex semantics that are not really usable outside streaming. Split the check in two parts, where the allocated state for the top bs is moved to the caller. The resulting function is more generally useful....
stream: move is_allocated_above to block.c
stream: move rate limiting to a separate header file
Make the code reusable.
qemu-img check -r for repairing images
The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open()....
qemu-img check: Print fixed clusters and recheck
When any inconsistencies have been fixed, print the statistics and run another check to make sure everything is correct now.
qcow2: Support for fixing refcount inconsistencies
rbd: hook up cache options
Writeback caching was added in Ceph 0.46, and writethrough will be in 0.47. These are controlled by general config options, so there's no need to check for librbd version.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sheepdog: add coroutine_fn markers to coroutine functions
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Silence false warning
Some gcc versions seem not to be able to figure out that the switch statement covers all possible values and that c is therefore always initialised. Add a default branch for them.
Reported-by: malc <av1474@comtv.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
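A small illustration of the technique, using a made-up enum rather than the qcow2 code in question: every valid case assigns c, and the default branch exists only so the compiler can see that no path leaves c uninitialised. Using abort() in the default branch is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical enum; the real switch operates on different values. */
enum demo_kind { DEMO_A, DEMO_B, DEMO_C };

int demo_cost(enum demo_kind k)
{
    int c;

    switch (k) {
    case DEMO_A:
        c = 1;
        break;
    case DEMO_B:
        c = 2;
        break;
    case DEMO_C:
        c = 3;
        break;
    default:
        /* Unreachable for valid values; silences "may be used
         * uninitialized" warnings from compilers that cannot prove it. */
        abort();
    }
    return c;
}
```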
allow qemu_iovec_from_buffer() to specify offset from which to start copying
Similar to
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes);
the new prototype is:
 qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,...
consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used.
Change skip argument of qemu_iovec_copy() from...
change qemu_iovec_to_buf() to match other to,from_buf functions
It now allows specifying the offset within qiov to start from and the number of bytes to copy. The actual implementation is just a call to iov_to_buf().
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
cleanup qemu_co_sendv(), qemu_co_recvv() and friends
The same as for non-coroutine versions in previous patches: rename arguments to be more obvious, change type of arguments from int to size_t where appropriate, and use common code for send and receive paths (with...
consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
This patch combines two functions into one, and replaces the implementation with already existing iov_memset() from iov.c.
The new prototype of qemu_iovec_memset():...
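To make the offset/bytes semantics concrete, here is a self-contained sketch of a memset over a scatter/gather vector that starts at a byte offset. demo_iov_memset() is a hypothetical function over plain struct iovec; it is not QEMU's iov_memset() or qemu_iovec_memset(), whose exact behaviour may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Fill `bytes` bytes with `c`, starting `offset` bytes into the vector.
 * Returns the number of bytes actually set, which is smaller than
 * `bytes` when the vector ends first.
 */
size_t demo_iov_memset(const struct iovec *iov, unsigned int iov_cnt,
                       size_t offset, int c, size_t bytes)
{
    size_t done = 0;
    unsigned int i;

    assert(iov != NULL || iov_cnt == 0);
    for (i = 0; i < iov_cnt && done < bytes; i++) {
        if (offset >= iov[i].iov_len) {
            /* Still skipping: this element lies entirely before offset. */
            offset -= iov[i].iov_len;
            continue;
        }
        size_t len = iov[i].iov_len - offset;
        if (len > bytes - done) {
            len = bytes - done;
        }
        memset((char *)iov[i].iov_base + offset, c, len);
        done += len;
        offset = 0;   /* later elements are filled from their start */
    }
    return done;
}
```

A single function with an explicit offset covers both the "from the start" and the "skip N bytes" cases, which is the motivation for merging the two variants.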
build: move block/ objects to nested Makefile.objs
block: prevent snapshot mode $TMPDIR symlink attack
In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void),...
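The three failure modes named above (overlong $TMPDIR, mkstemp failure, close failure) can be handled as in the following sketch. demo_get_tmp_filename() is a hypothetical POSIX-only helper, and the "demo.XXXXXX" template and the specific error codes are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create an empty temporary file and store its name in `filename`.
 * Returns 0 on success or a negative errno value on failure.
 */
int demo_get_tmp_filename(char *filename, int size)
{
    const char *tmpdir = getenv("TMPDIR");
    int fd;

    assert(filename != NULL && size > 0);
    if (tmpdir == NULL) {
        tmpdir = "/tmp";
    }
    /* snprintf returns the length it needed; reject truncated names. */
    if (snprintf(filename, size, "%s/demo.XXXXXX", tmpdir) >= size) {
        return -EOVERFLOW;
    }
    fd = mkstemp(filename);
    if (fd < 0) {
        return -errno;
    }
    if (close(fd) != 0) {
        int err = errno;    /* save before unlink() can overwrite it */
        unlink(filename);
        return -err;
    }
    return 0;
}
```

Returning int instead of void lets the caller propagate the failure rather than continuing with an unusable file name.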
sheepdog: fix return value of do_load_save_vm_state
bdrv_save_vmstate and bdrv_load_vmstate should return the vmstate size on success, and -errno on error.
ISCSI: Switch to using READ16/WRITE16 for I/O to the LUN
This allows using LUNs bigger than 2TB. Keep using READ10 for other device types such as MMC.
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
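The 2TB figure follows from READ10 carrying a 32-bit logical block address: with 512-byte blocks (an assumption of this sketch) that addresses 2^32 * 512 bytes. The helper names and the device-type dispatch below are illustrative; the opcode and peripheral device type values are the standard SCSI ones.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TYPE_DISK 0x00   /* SCSI peripheral type: direct access (SBC) */
#define DEMO_TYPE_ROM  0x05   /* SCSI peripheral type: CD/DVD (MMC) */
#define DEMO_READ_10   0x28   /* READ(10) opcode, 32-bit LBA */
#define DEMO_READ_16   0x88   /* READ(16) opcode, 64-bit LBA */

/* Largest capacity in bytes addressable through a 32-bit LBA. */
uint64_t demo_read10_max_bytes(uint32_t block_size)
{
    assert(block_size > 0);
    return ((uint64_t)UINT32_MAX + 1) * block_size;
}

/* Use the 16-byte CDB for disks, keep the 10-byte CDB for other types. */
uint8_t demo_read_opcode(int device_type)
{
    return device_type == DEMO_TYPE_DISK ? DEMO_READ_16 : DEMO_READ_10;
}
```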
ISCSI: Only call READCAPACITY16 for SBC devices, use READCAPACITY10 for MMC
ISCSI: change num_blocks to 64-bit
ISCSI: get device type at connection time
This is needed to avoid READ CAPACITY for MMC devices.
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ISCSI: redo how we set up the events
Call qemu_notify_event() after updating events. Otherwise, if we add an event for -is-writeable but the socket is already writeable, there may be a delay before the event callback is actually triggered.
Those delays would in particular hurt performance during BIOS boot and...
qcow2: don't leak buffer for unexpected qcow_version in header
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sheepdog: mark image as snapshot when tag is specified
When a snapshot tag is specified in the filename, the opened image is a snapshot.
sheepdog: return -errno on error
On error, BlockDriver APIs should return -errno instead of -1.
sheepdog: use heap instead of stack for BDRVSheepdogState
bdrv_create() is called in coroutine context now, so we cannot use more stack than 1 MB in the function if we use ucontext coroutine. This patch allocates BDRVSheepdogState, whose size is 4 MB, on the...
qcow2: Check qcow2_alloc_clusters_at() return value
When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code.
qcow2: Don't ignore failure to clear autoclear flags
block: fix warning introduced in efcc7a23
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
stream: pass new base image format to bdrv_change_backing_file
When an image is modified to point to the new backing file, the backing file format is set to NULL, which means auto-probe. This is wrong; in fact it is a small security problem.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>...
stream: fix ratelimiting corner case
This fixes inability to make progress in streaming if the quota is set to less than the amount of data that an I/O operation has to write.
In this case, limit->dispatched + n will always be above the quota and, due to the "goto retry" to recheck cancellation and allocation, streaming...
stream: do not copy unallocated sectors from the base
Unallocated sectors should really never be accessed by the guest, so there's no need to copy them during the streaming process. If they are read by the guest during streaming, guest-initiated copy-on-read will copy them (we're in the base == NULL case, which...
block: add block_job_sleep_ns
This function abstracts the pretty complex semantics of the "busy" member of BlockJob.
block: wait for job callback in block_job_cancel_sync
The limitation on not having I/O after cancellation cannot really be kept. Even streaming has a very small race window where you could cancel a job and have it report completion. If this window is hit,...