Revert "utils.log: Write error messages to stderr"
This reverts commit 34aa8b7c4bb6f5e2e788108e024c9cd70bdb3431. Writing error messages to stderr would also include backtraces, something we tried to avoid in the past.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Fix adding nodes after commit 64c7b3831dc
Commit 64c7b3831dc changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
LUClusterVerifyGroup: Spread SSH checks over more nodes
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to...
Optimise cli.JobExecutor with many pending jobs
In the case we submit many pending jobs (> 100) to the masterd, the JobExecutor 'spams' the master daemon with status requests for the status of all the jobs, even though in the end it will only choose a single job for polling....
listrunner: Don't pass arguments if there are none
If no arguments were specified the “exec_args” variable was “None”, leading to the command being run as “… ./… None”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>...
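A minimal sketch of the failure mode and the guard described above. This is not the listrunner code itself: `build_command` and the "./%s" command layout are hypothetical and chosen only for illustration.

```python
def build_command(executable, exec_args=None):
    """Return a command line, appending arguments only when some were given.

    Hypothetical helper; the real listrunner command layout is not shown
    in the commit message.
    """
    command = "./%s" % executable
    if exec_args:
        # Only append when there is something to append; formatting None
        # directly would yield the literal string "None".
        command = "%s %s" % (command, exec_args)
    return command


# The unguarded formatting reproduces the reported symptom.
buggy = "%s %s" % ("./script.sh", None)
```

With the guard in place, `build_command("script.sh")` yields the bare command, while the unguarded formatting in `buggy` ends in the word "None".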
ssh: Quote strings in error message
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
utils.log: Write error messages to stderr
When “gnt-cluster copyfile” failed it would only print “Copy of file … to node … failed”. A detailed message is written using logging.error. Writing error messages to stderr can be helpful in figuring out what went wrong (the messages also go to the log file, but not everyone might...
Add signal handling doc to hbal man page
Also remove a bug note, since hbal has long been able to execute jobs directly.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix handling of cluster verify hooks
The change to enforce boolean results for cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true....
Redistribute the RAPI certificate
This reverts to the old behaviour in Ganeti 2.4 and before.
QA: Add tests for instance start/stop via RAPI
This would have detected the issue fixed in the previous patch.
RAPI: Fix wrong check on instance shutdown
Commit 7fa310f6d84 (April 1st, 2011) converted the RAPI resource for shutting down an instance to FillOpCode. Unfortunately it missed the fact that the shutdown resource gets its parameters as query arguments.
baserlib: Accept empty body in FillOpcode
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit c6e1a3eef05674d637570c39f25a799cec7ba187)
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Version bump for 2.5.0~beta3
Makefile: Use $(LN_S) instead of “ln -s”
Some platforms apparently don't support “ln -s”, otherwise Autoconf wouldn't have AC_PROG_LN_S.
Fixes to errors/warnings raised by pylint 0.24
Running pylint 0.24.0 revealed 2 errors and 1 warning. Here is how I fixed them:
PEP8 for QA
- Makefile.am: added QA directory to the paths checked by pep8
- qa/: fixed the reported errors
- Makefile.am: also, added qa_group.py to qa_scripts
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
listrunner: Allow passing of arguments to executable
This wasn't possible until now.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
DeprecationWarning fixes for pylint
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to a lot of DeprecationWarning being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst)....
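For illustration, the unified directive looks roughly like this in source code. The function and the choice of W0613 (unused argument) are made up for this example and are not taken from the Ganeti tree; the older spelling of the same directive was `disable-msg`.

```python
def handle_signal(signum, frame):  # pylint: disable=W0613
    """Signal handler that deliberately ignores its arguments.

    Before the unification this would have been written as
    "# pylint: disable-msg=W0613", which newer pylint versions
    flag as deprecated.
    """
    return None
```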
listrunner: Replace str.split with library functions
- str.split("/").pop() should be os.path.basename
- str.split("\n") should be str.splitlines()
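A short comparison of the two replacements, using made-up sample values. Note that `splitlines()` does not produce a trailing empty element when the text ends with a newline.

```python
import os.path

# Sample values are hypothetical.
path = "/usr/local/bin/tool"
output = "line one\nline two\n"

# Both expressions extract the last path component here.
via_split = path.split("/").pop()
via_basename = os.path.basename(path)

print(output.split("\n"))    # ['line one', 'line two', '']
print(output.splitlines())   # ['line one', 'line two']
```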
Minor updates and fixes to CPU pinning design doc
Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Merge branch 'devel-2.4' into devel-2.5
Conflicts:
  NEWS (trivial)
  configure.ac (trivial)
  daemons/ensure-dirs.in (deleted)
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
utils: Fix UnescapeAndSplit parsing bug
If a value passed to UnescapeAndSplit ended with a backslash an exception would be raised:
$ gnt-instance modify -H mem=x\\ inst1.example.com
[…]
  e2 = slist.pop(0)
IndexError: pop from empty list
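A minimal sketch of an escape-aware split that tolerates a trailing backslash. This is not Ganeti's UnescapeAndSplit implementation; `unescape_and_split` is a hypothetical stand-in that assumes a single-character separator and treats a backslash as escaping the next character.

```python
def unescape_and_split(text, sep=","):
    """Split text on unescaped separators (hypothetical sketch).

    A backslash makes the following character literal. A backslash at the
    very end of the input has nothing to escape and is kept as-is rather
    than raising an exception.
    """
    parts = []
    current = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Consume the escaped character; fall back to a literal
            # backslash when the input ends here.
            current.append(next(chars, "\\"))
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

With this sketch, a value ending in a backslash such as `"mem=x\\"` comes back as a single element instead of failing.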
Delete master IPs from mergee master nodes
Added a step in cluster-merge that removes the cluster IP from the master node of the mergee clusters.
Use pep8 utility in “make lint”
This utility checks whether the code conforms to PEP8. Some checks had to be disabled for Ganeti.
Two more PEP8 fixes
cmdlib: Avoid wrapping using backslash
gnt_group: Avoid * magic using keyword arguments (the “pep8” tool doesn't like the inline comment in this case and will complain about spaces around the “*” operator)
check-python-code: Give location(s) of lines longer than 80 chars
Until now it would only say that there was a line longer than 80 characters, but not where.
PEP8 style fixes
Identified using the “pep8” utility.
Wrap a few long lines
Had to break it as well, today! ;)
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
listrunner: Avoid exception if machine is rebooted
Handle exceptions gracefully when trying to read the command's output.
Remove wrong type declaration from option
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
Fix wrong method name in cluster-merge
Fixed a wrong method name in the last patch.
Version bump 2.4.4
Fix --skip-stop-instances help message
cluster-merge: Add the --skip-stop-instances opt
This option allows checking for running instances on the mergee clusters instead of stopping them.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Update NEWS file
Documentation fix for importing with --src-dir option
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit b7d7876bd0e9844fab8be28bfa1fd5d563ec7412)
Conflicts:
lib/cmdlib.py (easily fixed)
Adding missing test data for commit 7a380ddfc
Fix a parsing issue with DRBD 8.3.11 in the Linux Kernel
In the Linux kernel commit 4b0715f096 introduced a display bug into /proc/drbd which broke our regex.
The bug was first introduced into Linux 2.6.39-rc1. This bug is still unfixed as of today.
This patch adapts the regular expression to work around this bug for the...
watcher: Wait for child processes by default
This patch retains the behaviour of ganeti-watcher in previous Ganeti versions.
Update release date in NEWS for 2.5.0~beta2
Try 3 times before giving up on per-node commands
When contacting lots of nodes some may fail. Give it a couple more chances before giving up on them.
Possible future TODO: continue, but just mark them as offline.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
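The retry idea can be sketched as follows. This is an illustration only, not the cluster-merge code: `run_with_retries` and its `max_attempts` parameter are hypothetical names, and catching every `Exception` is a simplification.

```python
def run_with_retries(fn, max_attempts=3):
    """Call fn() up to max_attempts times and return its first result.

    Hypothetical helper: if every attempt raises, the last exception is
    re-raised so the caller can decide how to report the failure.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn()
        except Exception as err:  # simplification for the sketch
            last_error = err
    raise last_error
```

A command that fails twice and then succeeds returns normally on the third call; one that keeps failing surfaces its last error to the caller.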
Allow retrying commands in cluster-merge
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add a TODO on the VerifyCluster option
Transform node readd exceptions into just errors
We are after the point of no return, no point in failing everything because a node failed to readd. Better to just report it and move on.
Offline node when adding it to a merged cluster
Bump version to 2.5.0~beta2
Also update NEWS file.
sphinx_ext: workaround epydoc warning
Similar to commit c29e35f, this works around epydoc breakage by aliasing the module. Makes 'apidoc' pass again on my machine.
check-news: Show per-file line number
… not the global line number.
Unify some file headers
Remove unnecessary commas, add empty lines where necessary to make them consistent.
I'm working on a script to check this, but it's not yet ready.
Makefile: Add design-ovf-support to list of doc files
ensure-dirs: Fix epydoc error
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
ensure-dirs: Check mode and owner before changing
This avoids many calls to chmod(2) and chown(2), and thereby ctime updates.
Since I had to update the unittests anyway I untangled the code a bit, split it into more separate functions and added some more tests....
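A minimal sketch of the check-before-change idea for the permission bits. It is not the ensure-dirs code: `ensure_mode` is a hypothetical helper, and the owner check (which would compare `st_uid`/`st_gid` before calling `os.chown`) is left out because changing ownership needs privileges.

```python
import os
import stat


def ensure_mode(path, mode):
    """Set permission bits only if they differ; return True if changed.

    Hypothetical helper. Skipping chmod(2) when nothing needs to change
    also avoids touching the file's ctime.
    """
    current = stat.S_IMODE(os.lstat(path).st_mode)
    if current == mode:
        return False
    os.chmod(path, mode)
    return True
```

Calling it twice with the same mode changes the file on the first call at most; the second call only performs the lstat(2).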
ensure-dirs: Refine error handling on stat(2)
The “_stat_fn” function is renamed to “_lstat_fn” to reflect its function. The try/except block just wraps calling lstat(2) and nothing else.
ensure-dirs: Change wording of some messages
ensure-dirs: Implement debug logging
There was no logging at all.
ensure-dirs: Set permissions on job files in queue
This was a regression from 2.4.
ensure-dirs: Fix a bug with queue/archive permissions
While it sets the permission on all files in queue/archive accordingly it doesn't do so for the created archive directories. This patch fixes this problem.
Signed-off-by: René Nussbaumer <rn@google.com>...
ensure-dirs: Set permissions on queue lock file
ensure-dirs: Set correct permissions on ssconf files
The files should be 0444, not 0400. This was a regression from 2.4.
cfgupgrade: Add confirmation message
A message will be given instead of just dropping the user back to the prompt in case of a successful upgrade.
[…]documentation formats). Continue with upgrading configuration?
y/[n]/?: y
Configuration successfully upgraded for version 2.5.0~beta1....
Handle network interfaces without IPs
If the user specified a network interface with no IPs, he would receive an unhelpful "list index out of range" error. Fixed that.
Fixed potential unreferenced variable usage
I noticed a path in the code that would use spice_ip_version even if it was not initialized. This patch fixes it.
Added documentation for gnt-instance remove --force in the man page
Added documentation for SPICE options in the gnt-instance man page
Added basic support for SPICE
Implemented the following parameters:
- spice_bind
- spice_ip_version
NEWS: Add release date for 2.5.1~beta1
Fix exit code of “gnt-cluster verify”
With commit fcad7225e3fc4 LU-generated jobs are used, but the exit code must still be backwards-compatible.
Update NEWS for 2.5
Small improvements for cluster verify
- Check if BGL is actually owned
- Show group name as feedback
watcher: Use locks when querying for resource information
Allow locking to be used via OpQuery
The original design for query2 specifically excluded locking, but now it's turned out that it would be a good thing to have in watcher. This patch adds a new parameter to OpQuery and enables its use in LUQuery. A missing function is added to LUGroupQuery, a comment clarified in...
Document job results for RAPI where possible
Some opcodes aren't documented yet.
opcodes: Add more result checks, add some comments
Some of these will be used by the RAPI documentation.
sphinx_ext: Allow documenting opcode results
Will be used by RAPI documentation.
ht: Allow adding comment to type descriptions
This will be used to add some more details to type descriptions, e.g. on opcode parameters or result values. The implementation is very similar to “WithDesc”.
I chose to use “[…]” after finding “/*…*/” hard to read and spot. At...
Clarify job ID-related type checks, add unittests
Instead of a rather complicated expression only “JobId” is output. Job ID lists (such as those generated by “SubmitManyJobs”) are limited to two-item lists. Unittests are added.
Change OpClusterVerifyConfig's result, verify results
This patch removes the list of node groups (not used anymore since commit fcad7225e3fc) from OpClusterVerifyConfig's result and adds result verification to all OpClusterVerify* opcodes.
Fixed error in Makefile.am, replacing spaces with tabs
Added the test data files for netutils to Makefile.am
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Use LU-generated jobs for verifying cluster
This patch moves the logic for verifying the various node groups in a cluster into the master daemon. Job dependencies are used to ensure the configuration, which requires the BGL, is verified first.
With this change it will be possible to expose whole-cluster...
opcodes: Use variables for verification parameters
Just some cleanup before the 2.5 release.
mcpu: Specify actual received type on opcode issue
This helped me debug an issue with opcodes.
Use resource kind as OpQuery*'s description
This gives a hint as to what's queried. “QUERY” or “QUERY” are way better than just “QUERY”.
Added helper functions in netutils and related constants
Added the following functions to netutils:
- IsValidInterface
- GetInterfaceIpAddresses
- _GetIpAddressesFromIpOutput
Added the following static methods to netutils.IPAddress:
- GetAddressFamilyFromVersion...
Fix epydoc error in rlib2.py
I blindly assumed epydoc would use normal reST, but turns out it uses its own “epytext” in our configuration. Since the latter doesn't support blockquotes, I just make the paragraph a literal block.
Fix typo in rlib2's docstring
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Benjamin Lipton <benlipton@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Documentation fixes and clarification
- In README, refer to “install.rst”, not “install.html”
- In rapi.rst, wrap line longer than 72 characters
- In rlib2.py, update and clarify description of POST vs. PUT
gnt-instance: Rename SHUTDOWN* to EXPAND*
Once upon a time these constants were only used for stopping instances, but pretty soon they became more useful. Let's rename them.
List returned fields in RAPI documentation
Also replace console types with constants.
rlib2: Exclude oplog/opresult from bulk job list
These fields can get rather large. Excluding them from the big bulk list reduces the amount of data. They are still available via per-job requests.
rlib: Expose node group tags
Commit 1ffd26739d3 added support for tagging node groups. Also add a check for exposed fields.
rapi: Bulk support for jobs
This was requested in issue 181.
Fixed an error in the documentation of _GetKVMVersion
Fixed an epydoc compilation error that I introduced with last commit.
Mention globbing filters in ganeti(7) manpage
Removed code duplication for calls to _GetKVMVersion
Fix epydoc breakage caused by f8638e288c7a
Changed NET_PORT_CHECK to REQ_NET_PORT_CHECK, to improve consistency
I originally made this change because I needed OPT_NET_PORT_CHECK, and I am committing it even though I no longer need OPT_NET_PORT_CHECK because IMO it improves the consistency of the name of the wrappers....
Added check for the ip command at configure time
Also, corrected a few places where the ip command was hardcoded.