Revision 3f78eef2

ID 3f78eef21e5c4f401db51376664f68ed16a67e90
Parent 79f87a76
Child 830da270

Added by Iustin Pop almost 15 years ago

Implement device to instance mapping cache

Currently, troubleshooting DRBD problems involves a manual process of going
backwards from the DRBD device to the instance that owns it.

This patch adds a weak (i.e. not guaranteed to be correct or up-to-date)
cache mapping devices to instances. In normal operation the cache should
hold correct information, because device paths change only when devices
are started or stopped, and the code in backend.py updates the cache in
exactly these operations.

The only drawback of this implementation is that we don't fully update
the cache on renames of devices (we clean the old entries but we don't
add new ones). Since the rename changes the path only for LVs (and not
drbd and md), this is less of a problem as the target of this code is
debugging DRBD and MD issues.

The patch writes files named bdev_drbd<N> (or bdev_md<N>,
bdev_xenvg_...) in /var/run/ganeti (more exactly, LOCALSTATEDIR/ganeti).
The files start with 'bdev_' and continue with the path of the device
under /dev/ (this prefix stripped), and contain the following values,
space separated:
- instance name
- primary or secondary (depending on whether the device is on the
  primary or secondary node)
- instance-visible name: sda, sdb, or not_visible; the latter is used
  when the device is not the top-level device (e.g. remote_raid1
  templates will have sd[ab] for the md, but not_visible for drbd and
  logical volumes)
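
The naming and content rules above can be sketched roughly as follows. This
is an illustration only, not the code from the patch: the function names, the
CACHE_DIR constant, the use of "_" to flatten nested paths (inferred from the
bdev_xenvg_... example) and the trailing newline are all assumptions.

```python
import os

# Assumption: the message gives /var/run/ganeti as the usual value of
# LOCALSTATEDIR/ganeti; the real code may derive it differently.
CACHE_DIR = "/var/run/ganeti"
DEV_PREFIX = "/dev/"


def cache_file_path(dev_path, cache_dir=CACHE_DIR):
    """Return the cache file path for a device such as /dev/drbd0.

    Hypothetical helper: strips the /dev/ prefix and prepends 'bdev_'.
    """
    name = dev_path
    if name.startswith(DEV_PREFIX):
        name = name[len(DEV_PREFIX):]
    # Assumption: nested paths (e.g. xenvg/<lv>) are flattened with '_'.
    name = name.replace("/", "_")
    return os.path.join(cache_dir, "bdev_" + name)


def format_cache_entry(instance, on_primary, visible_name=None):
    """Build the space-separated file content described above.

    Hypothetical helper: a missing visible name becomes 'not_visible'.
    """
    role = "primary" if on_primary else "secondary"
    return "%s %s %s\n" % (instance, role, visible_name or "not_visible")
```

With these assumptions, /dev/drbd0 owned by a hypothetical instance
inst1.example.com on its primary node, visible as sda, would map to the file
bdev_drbd0 containing "inst1.example.com primary sda".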

The cache is designed not to raise any errors; if an I/O error occurs,
it is only logged in the node daemon log file. This reduces the possible
impact of the cache on the block device activation and shutdown code.
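
A minimal sketch of this "log, never raise" behaviour, assuming plain file
writes and Python's standard logging module. The function names and log
messages are hypothetical and not taken from the patch.

```python
import logging
import os


def update_cache_entry(path, data):
    """Write a cache entry; log I/O errors instead of raising them."""
    try:
        with open(path, "w") as fobj:
            fobj.write(data)
    except OSError as err:
        # The cache is best-effort: device activation must not fail here.
        logging.error("Can't update bdev cache file %s: %s", path, err)


def remove_cache_entry(path):
    """Remove a cache entry; log I/O errors instead of raising them."""
    try:
        os.remove(path)
    except OSError as err:
        # A missing or unremovable file must not break device shutdown.
        logging.error("Can't remove bdev cache file %s: %s", path, err)
```

Because both helpers swallow OSError, a caller in the activation or shutdown
path needs no error handling of its own around cache updates.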

Reviewed-by: imsnah
