iscsi: Fix NULL dereferences / races between task completion and abort
Signed-off-by: Stefan Priebe <s.priebe@profihost.ag>
Acked-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: Prevent detection of /dev/fdset/ as floppy
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: Convert open calls to qemu_open
This patch converts all block layer open calls to qemu_open.
Note that this adds the O_CLOEXEC flag to the changed open paths when the O_CLOEXEC macro is defined.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>...
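The effect described in the entry above can be sketched as follows. This is a minimal illustration, not QEMU's qemu_open(): the name sketch_open_cloexec is hypothetical, and the fallback branch for hosts without O_CLOEXEC is an assumption that the commit message does not state.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper for illustration; this is not QEMU's qemu_open(). */
static int sketch_open_cloexec(const char *path, int flags, mode_t mode)
{
    int fd;

    assert(path != NULL);
#ifdef O_CLOEXEC
    /* Atomic: the descriptor never exists without close-on-exec set. */
    fd = open(path, flags | O_CLOEXEC, mode);
#else
    /* Assumed fallback (not from the commit message): set the flag later. */
    fd = open(path, flags, mode);
    if (fd >= 0) {
        fcntl(fd, F_SETFD, FD_CLOEXEC);
    }
#endif
    return fd;
}
```

Setting the flag inside open() avoids a window in which another thread could fork and exec while the descriptor is still inheritable.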
block: Convert close calls to qemu_close
This patch converts all block layer close calls that correspond to qemu_open calls to qemu_close.
Merge remote-tracking branch 'kwolf/for-anthony' into staging
Merge remote-tracking branch 'bonzini/scsi-next' into staging
qed: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qed_open(). Move this into qed_check() so that all callers benefit from this behavior when fix=true.
This is necessary so qemu-img check can call .bdrv_check() and mark the...
qcow2: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled.
block: add BLOCK_O_CHECK for qemu-img check
Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c....
iscsi: Pick default initiator-name based on the name of the VM
This patch updates the iscsi layer to automatically pick a 'unique' initiator-name based on the name of the vm in case the user has not set an explicit iqn-name to use.
Create a new function qemu_get_vm_name() that returns the name of the VM,...
iscsi: do not leak initiator_name
The argument of iscsi_create_context is never freed by libiscsi, which in fact calls strdup on it. Avoid a leak.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iscsi: reorganize code for parse_initiator_name
Merge the occurrences of the "iqn.2008-11.org.linux-kvm" string to avoid duplication.
qcow2: introduce dirty bit
This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened a consistency check and repair is performed.
Update qemu-iotests 031 and 036 since the extension header size changes...
qcow2: implement lazy refcounts
Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened....
vvfat: Fix partition table
Unless parameter ":floppy:" is given, vvfat creates a virtual image with DOS MBR defining a single partition which holds the FAT filesystem. The size of the virtual image depends on the width of the FAT: 32 MiB (CHS 64, 16, 63) for 12 bit FAT, 504 MiB (CHS 1024, 16,...
vvfat: Do not clobber the user's geometry
vvfat creates a virtual VFAT filesystem with a certain logical geometry that depends on its options. It sets the "geometry hint" to this geometry. It is the only block driver to do this.
The geometry hint is about physical geometry, and used only by...
sheepdog: always use coroutine-based network functions
This reduces some code duplication.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sheepdog: do not blindly memset all read buffers
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Merge remote-tracking branch 'mjt/mjt-iov2' into staging
sheepdog: split outstanding list into inflight and pending
outstanding_list_head is used for both pending and inflight requests. This patch splits it and improves readability.
sheepdog: traverse pending_list from the first for each time
The pending list can be modified in another coroutine context, sd_co_rw_vector, so we need to traverse the list from the start again after we send the pending request.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>...
blkdebug: remove sync i/o events
These are unused, except (by mistake more or less) in QED.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blkdebug: tiny cleanup
blkdebug: pass getlength to underlying file
This is required when using blkdebug with raw format. Unlike qcow2/QED, raw asks blkdebug for the length of the file; it doesn't get it from a header.
blkdebug: store list of active rules
This prepares for the next patch, where some active rules may actually not trigger depending on input to readv/writev. Store the active rules in a SIMPLEQ (so that it can be emptied easily with QSIMPLEQ_INIT), and fetch the errno/once/immediately arguments from there....
blkdebug: optionally tie errors to a specific sector
This makes blkdebug scripts more powerful, and independent of the exact sequence of operations performed by streaming.
raw: hook into blkdebug
qcow2: fix #ifdef'd qcow2_check_refcounts() callers
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks throughout the code. This makes it easier to spot corruptions that are introduced during development. Since consistency check is an expensive...
qcow2: preserve free_byte_offset when qcow2_alloc_bytes() fails
When qcow2_alloc_clusters() error handling code was introduced in commit 5d757b563d59142ca81e1073a8e8396750a0ad1a, the value of free_byte_offset was clobbered in the error case. This patch keeps free_byte_offset at 0...
sheepdog: fix dprintf format strings
This fixes warnings about dprintf format in debug mode.
sheepdog: restart I/O when socket becomes ready in do_co_req()
Currently, no one reenters the yielded coroutine. This fixes it.
sheepdog: use coroutine based socket functions in coroutine context
This removes blocking network I/Os in coroutine context.
sheepdog: make sure we don't free aiocb before sending all requests
This patch increments the pending counter before sending requests, and makes sure that aiocb is not freed while sending them.
ISCSI: Add SCSI passthrough via scsi-generic to libiscsi
Update iscsi to allow passthrough of SG_IO scsi commands when the iscsi device is forced to be scsi-generic.
Implement both bdrv_ioctl() and bdrv_aio_ioctl() in the iscsi backend,emulate the SG_IO ioctl and pass the SCSI commands across to the...
ISCSI: force use of sg for SMC and SSC devices
If the device we open is a SMC or SSC device, then force the use of sg. We don't have any medium changer or tape emulation so only passthrough via real sg or scsi-generic via iscsi would work anyway.
Forcing sg also makes qemu skip trying to read from the device to guess...
raw-posix: Fix build without is_allocated support
Move the declaration of s into the #ifdef sections that actually make use of it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
qcow2: always operate caches in writeback mode
Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>...
qcow2: Simplify calculation for COW area at the end
copy_sectors() always uses the sum (cluster_offset + n_start) or (start_sect + n_start), so if some value is added to both cluster_offset and start_sect, and subtracted from n_start, it's cancelled out anyway....
qcow2: Fix avail_sectors in cluster allocation code
avail_sectors should really be the number of sectors from the start of the allocation, not from the start of the write request.
We're lucky enough that this mistake didn't cause any real bug. avail_sectors is only used in the initialiser of QCowL2Meta:...
qcow2: fix autoclear image header update
The autoclear feature bits can be used for qcow2 file format features that are safe to "drop" by old programs that do not understand the feature. Upon opening the image file unknown autoclear feature bits are cleared and the image file header is rewritten, but this was happening...
qcow2: remove a line of unnecessary code
Commit 3948d1d4 removed the pointer argument we filled in with l2_offset but forgot to remove the unnecessary l2_offset assignment.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>...
qcow2: fix endianness conversion
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: implement is_allocated for raw
Either FIEMAP, or SEEK_DATA+SEEK_HOLE can be used to implement the is_allocated callback for raw files. On Linux ext4, btrfs and XFS all support it.
stream: tweak usage of bdrv_co_is_allocated
is_allocated_base has complex semantics that are not really usable outside streaming. Split the check in two parts, where the allocated state for the top bs is moved to the caller. The resulting function is more generally useful....
stream: move is_allocated_above to block.c
stream: move rate limiting to a separate header file
Make the code reusable.
qemu-img check -r for repairing images
The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open()....
qemu-img check: Print fixed clusters and recheck
When any inconsistencies have been fixed, print the statistics and run another check to make sure everything is correct now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Support for fixing refcount inconsistencies
rbd: hook up cache options
Writeback caching was added in Ceph 0.46, and writethrough will be in 0.47. These are controlled by general config options, so there's no need to check for librbd version.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sheepdog: add coroutine_fn markers to coroutine functions
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Silence false warning
Some gcc versions seem not to be able to figure out that the switch statement covers all possible values and that c is therefore always initialised. Add a default branch for them.
Reported-by: malc <av1474@comtv.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>...
allow qemu_iovec_from_buffer() to specify offset from which to start copying
Similar to
  qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes);
the new prototype is:
  qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,...
consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used.
Change skip argument of qemu_iovec_copy() from...
change qemu_iovec_to_buf() to match other to,from_buf functions
It now allows specifying offset within qiov to start from and amount of bytes to copy. Actual implementation is just a call to iov_to_buf().
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
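The offset-and-length semantics described in the entry above can be sketched on a plain struct iovec array. This is an illustrative stand-in under stated assumptions, not QEMU's iov_to_buf(): the name sketch_iov_to_buf and its exact signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy up to `bytes` bytes out of a scatter/gather
 * list into a flat buffer, starting `offset` bytes into the list.
 * Returns the number of bytes actually copied.
 */
static size_t sketch_iov_to_buf(const struct iovec *iov, unsigned int iov_cnt,
                                size_t offset, void *buf, size_t bytes)
{
    size_t done = 0;
    unsigned int i;

    assert(iov != NULL && buf != NULL);
    for (i = 0; i < iov_cnt && done < bytes; i++) {
        size_t len = iov[i].iov_len;

        if (offset >= len) {
            offset -= len;          /* element lies entirely before the start */
            continue;
        }
        len -= offset;              /* usable bytes in this element */
        if (len > bytes - done) {
            len = bytes - done;     /* do not copy more than requested */
        }
        memcpy((char *)buf + done, (const char *)iov[i].iov_base + offset, len);
        done += len;
        offset = 0;                 /* later elements are read from their start */
    }
    return done;
}
```

With two four-byte elements holding "abcd" and "efgh", an offset of 2 and a length of 4 would yield "cdef", crossing the element boundary.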
cleanup qemu_co_sendv(), qemu_co_recvv() and friends
The same as for non-coroutine versions in previous patches: rename arguments to be more obvious, change type of arguments from int to size_t where appropriate, and use common code for send and receive paths (with...
consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
This patch combines two functions into one, and replaces the implementation with already existing iov_memset() from iov.c.
The new prototype of qemu_iovec_memset():...
build: move block/ objects to nested Makefile.objs
block: prevent snapshot mode $TMPDIR symlink attack
In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void),...
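The three failure cases named above (an overlong $TMPDIR, mkstemp failure, close failure) can be sketched as follows. This is a hedged illustration and not the patched get_tmp_filename(): the function name, the "sketch.XXXXXX" template, and the choice of -EOVERFLOW for truncation are assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an empty temporary file and store its name
 * in `filename`. Returns 0 on success and a negative errno value on error.
 */
static int sketch_make_tmp_file(char *filename, size_t size)
{
    const char *tmpdir = getenv("TMPDIR");
    int fd, n;

    assert(filename != NULL);
    if (!tmpdir) {
        tmpdir = "/tmp";
    }
    /* snprintf never writes past `size`; detect truncation explicitly. */
    n = snprintf(filename, size, "%s/sketch.XXXXXX", tmpdir);
    if (n < 0 || (size_t)n >= size) {
        return -EOVERFLOW;
    }
    fd = mkstemp(filename);
    if (fd < 0) {
        return -errno;
    }
    if (close(fd) != 0) {
        int err = errno;        /* save before unlink() can change errno */
        unlink(filename);
        return -err;
    }
    return 0;
}
```

Returning int lets the caller distinguish these cases instead of continuing with a file name that was never created.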
sheepdog: fix return value of do_load_save_vm_state
bdrv_save_vmstate and bdrv_load_vmstate should return the vmstate size on success, and -errno on error.
ISCSI: Switch to using READ16/WRITE16 for I/O to the LUN
This allows using LUNs bigger than 2TB. Keep using READ10 for other device types such as MMC.
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
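The 2TB figure follows from the READ(10) command layout, which carries a 32-bit logical block address and a 16-bit transfer length, while READ(16) carries a 64-bit address. A small sketch of that arithmetic, assuming 512-byte blocks; both helper names are hypothetical and are not part of the iscsi driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: capacity addressable with a 32-bit block address. */
static uint64_t sketch_read10_max_bytes(uint32_t block_size)
{
    assert(block_size > 0);
    return (UINT64_C(1) << 32) * block_size;
}

/* Hypothetical helper: does this request fit into a READ(10) command? */
static bool sketch_fits_read10(uint64_t lba, uint32_t num_blocks)
{
    if (lba > UINT32_MAX || num_blocks > UINT16_MAX) {
        return false;
    }
    /* The last block of the request must also be 32-bit addressable. */
    return lba + num_blocks <= (UINT64_C(1) << 32);
}
```

With 512-byte blocks, sketch_read10_max_bytes(512) evaluates to 2^41 bytes, that is 2 TiB, so any block beyond that point needs the 16-byte command.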
ISCSI: Only call READCAPACITY16 for SBC devices, use READCAPACITY10 for MMC
ISCSI: change num_blocks to 64-bit
ISCSI: get device type at connection time
This is needed to avoid READ CAPACITY for MMC devices.
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ISCSI: redo how we set up the events
Call qemu_notify_event() after updating events. Otherwise, if we add an event for -is-writeable but the socket is already writeable there may be a delay before the event callback is actually triggered.
Those delays would in particular hurt performance during BIOS boot and...
qcow2: don't leak buffer for unexpected qcow_version in header
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sheepdog: mark image as snapshot when tag is specified
When a snapshot tag is specified in the filename, the opened image is a snapshot.
sheepdog: return -errno on error
On error, BlockDriver APIs should return -errno instead of -1.
sheepdog: use heap instead of stack for BDRVSheepdogState
bdrv_create() is called in coroutine context now, so we cannot use more stack than 1 MB in the function if we use ucontext coroutine. This patch allocates BDRVSheepdogState, whose size is 4 MB, on the...
qcow2: Check qcow2_alloc_clusters_at() return value
When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code.
qcow2: Don't ignore failure to clear autoclear flags
block: fix warning introduced in efcc7a23
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
stream: pass new base image format to bdrv_change_backing_file
When an image is modified to point to the new backing file, the backing file format is set to NULL, which means auto-probe. This is wrong; in fact it is a small security problem.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>...
stream: fix ratelimiting corner case
This fixes inability to make progress in streaming if the quota is set to less than the amount of data that an I/O operation has to write.
In this case, limit->dispatched + n will always be above the quota and, due to the "goto retry" to recheck cancellation and allocation, streaming...
stream: do not copy unallocated sectors from the base
Unallocated sectors should really never be accessed by the guest, so there's no need to copy them during the streaming process. If they are read by the guest during streaming, guest-initiated copy-on-read will copy them (we're in the base == NULL case, which...
block: fix snapshot on QED
QED's opaque data includes a pointer back to the BlockDriverState. This breaks when bdrv_append shuffles data between bs_new and bs_top. To avoid this, add a "rebind" function that tells the driver about the new relationship between the BlockDriverState and its opaque....
block: add block_job_sleep_ns
This function abstracts the pretty complex semantics of the "busy" member of BlockJob.
block: wait for job callback in block_job_cancel_sync
The limitation on not having I/O after cancellation cannot really be kept. Even streaming has a very small race window where you could cancel a job and have it report completion. If this window is hit,...
block: push bdrv_change_backing_file error checking up from drivers
This check applies to all drivers, but QED lacks it.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block: update in-memory backing file and format
These are needed to print "info block" output correctly. QCOW2 does this because it needs it to write the header, but QED does not, and common code is the right place to do it.
sheepdog: switch to writethrough mode if cluster doesn't support flush
This is necessary for qemu to work with the older version of Sheepdog which doesn't support SD_OP_FLUSH_VDI.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Limit COW to where it's needed
This fixes a regression introduced in commit 250196f1. The bug leads to data corruption, found during an Autotest run with a Fedora 8 guest.
Consider a write request whose first part is covered by an already allocated cluster, but additional clusters need to be newly allocated....
qcow2: lock on prealloc
preallocate() will be locked. This is required because qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it can drop while COW is being performed.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNs
Update the configure test for libiscsi support to detect version 1.3 or later. Version 1.3 of libiscsi provides both READCAPACITY16 as well as UNMAP commands.
Update the iscsi block layer to use READCAPACITY16 to detect the size of...
rbd: add discard support
Change the write flag to an operation type in RBDAIOCB, and make the buffer optional since discard doesn't use it.
Discard is first included in librbd 0.1.2 (which is in Ceph 0.46). If librbd is too old, leave out qemu_rbd_aio_discard entirely,...
block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2: Remove unused parameter in do_alloc_cluster_offset
qcow2: Don't hold cache references across yield
If cache references are held while the coroutine has yielded, the cache may get used up and abort() when it can't find a free entry.
qcow2: fix the return value -ENOENT -> -EEXIST
raw-posix: Do not use CONFIG_COCOA macro
Use __APPLE__ and __MACH__ macros instead of CONFIG_COCOA to detect MacOS X host. The patch is based on Ben Leslie's patch: http://patchwork.ozlabs.org/patch/97859/
Signed-off-by: Ben Leslie <benno@benno.id.au>
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>...
Merge remote-tracking branch 'qmp/queue/qmp' into staging
block: use Error mechanism instead of -errno for block_job_create()
The block job API uses -errno return values internally and we convert these to Error in the QMP functions. This is ugly because the Error should be created at the point where we still have all the relevant...
block: use Error mechanism instead of -errno for block_job_set_speed()
There are at least two different errors that can occur in block_job_set_speed(): the job might not support setting speeds or the value might be invalid.
Use the Error mechanism to report the error where it occurs....
block: change block-job-set-speed argument from 'value' to 'speed'
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
block: add 'speed' optional parameter to block-stream
Allow streaming operations to be started with an initial speed limit. This eliminates the window of time between starting streaming and issuing block-job-set-speed. Users should use the new optional 'speed'...
nbd: Fix uninitialised use of s->sock
s->sock is assigned only afterwards, so we're really registering an aio_fd_handler for file descriptor 0 here. Not exactly what we intended.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>