Revision 81198f6e
ID | 81198f6eb6f9f3070cf946ffbd35bf4964b49049 |
node daemon: allow working with broken queue dir
In case the queue dir cannot be create/initialized, currently
ganeti-noded exits. This means that a read-only filesystem or a
permission error breaks all node daemon functionality, including
powercycle. This is not good for the usual failure case for nodes.
To workaround this, we don't require successful initialization at node
daemon startup; if we can't init the queue dir/lock, we retry at every
RPC call requiring a job queue lock, and if we still can't acquire the
lock, we raise an exception (which is catched in HandleRequest and
transformed into an RPC failure).
This allows the node daemon to start in face of queue issues, and the
master node to power-cycle it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Files
- added
- modified
- copied
- renamed
- deleted