« Previous | Next » 

Revision f789aa7b

IDf789aa7baff33e74c549a249aba3ae7a364d7642

Added by Michael Roth almost 12 years ago

qemu-ga: persist tracking of fsfreeze state via filesystem

Currently, qemu-ga may die/get killed/go away for whatever reason after
guest-fsfreeze-freeze has been issued, and before guest-fsfreeze-thaw
has been issued. This means the only way to unfreeze the guest is via
VNC/network/console access, but obtaining that access after-the-fact can
often be very difficult when filesystems are frozen. Logins will almost
always hang, for instance. In many cases the only recourse would be to
reboot the guest without any quiescing of volatile state, which makes
this a corner-case worth giving some attention to.

A likely failsafe for this situation would be to use a watchdog to
restart qemu-ga if it goes away. There are some precautions qemu-ga
needs to take in order to avoid immediately hanging itself on I/O,
however, namely, we must disable logging and defer to processing/creation
of user-specific logfiles, along with creation of the pid file if we're
running as a daemon. We also need to disable non-fsfreeze-safe commands,
as we normally would when processing the guest-fsfreeze-freeze command.

To track when we need to do this in a way that persists between multiple
invocations of qemu-ga, we create a file on the guest filesystem before
issuing the fsfreeze, and delete it when doing the thaw. On qemu-ga
startup, we check for the existance of this file to determine
the need to take the above precautions.

We're forced to do it this way since a more traditional approach such as
reading/writing state to a dedicated state file will cause
access/modification time updates, respectively, both of which will hang
if the file resides on a frozen filesystem. Both can occur even if
relatime is enabled. Checking for file existence will not update the
access time, however, so it's a safe way to check for fsfreeze state.

An actual watchdog-based restart of qemu-ga can itself cause an access
time update that would thus hang the invocation of qemu-ga, but the
logic to workaround that can be handled via the watchdog, so we don't
address that here (for relatime we'd periodically touch the qemu-ga
binary if the file $qga_statedir/qga.state.isfrozen is not present, this
avoids qemu-ga updates or the 1 day relatime threshold causing an
access-time update if we try to respawn qemu-ga shortly after it goes
away)

Signed-off-by: Michael Roth <>

Files

  • added
  • modified
  • copied
  • renamed
  • deleted

View differences