burnin: Implement retryable operations
Some burnin steps are idempotent: e.g. reinstalling an instance (from
burning p.o.v.) can be done multiple times without any side-effects that
would affect later burnin steps. As such, failing the whole burnin
process due a reinstall failure is undesirable.
This patch modifies burnin by marking each opcode (in case of individual
execution) and job set retryable or not. Retryable actions will be
retried up to a number of times, after which we give up and return
failure.
One side-effect is that in case of full-failure in retryable job sets we
lose the original exception (but we do log its string format), so we
have a little bit less information in this case.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>