ganeti-local
Iustin Pop [Wed, 7 Jan 2009 17:01:46 +0000 (17:01 +0000)]
Catch BlockDeviceError when starting instance

This is a forward-port of commit 1149 on the 1.2 branch:
  _GatherAndLinkBlockDevs used to raise the errors.BlockDeviceError
  exception when it failed to create a block device, and with this patch
  set it does so also when it fails to create a symlink to it.

  With this patch we move the call to this function into a pre-existing
  try-except block in the code, and catch the BlockDeviceError exception,
  logging a message and returning a failure state if it happens.

  Reviewed-by: iustinp

The changes are related to the new hypervisor and logging syntax.

Original-Author: ultrotter

Iustin Pop [Wed, 7 Jan 2009 17:01:36 +0000 (17:01 +0000)]
Create symlinks to instances' block devices

This is a forward-port of commit 1148 on the 1.2 branch:
  Change the _GatherBlockDevs private function, called only one time by
  StartInstance, to _GatherAndLinkBlockDevs, and make it transform the
  device returned even more by calling the new _SimlinkBlockDev auxiliary
  function.

  This makes sure that every time an instance is started symlinks to its
  block devices are created, and the instance is started off them, rather
  than the underlying block devices.

  Reviewed-by: iustinp

The changes we make to the patch are related to newer function signatures
in 2.0, and to the fact that iv_name is deprecated and we instead use
disk%d based on the disk index.

Original-Author: ultrotter
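
Editorial sketch of the symlink step described in the commit above. The
function name, the link directory and the "<instance>:<index>" link name are
assumptions for illustration, not the actual Ganeti code.

```python
import errno
import os


def symlink_block_dev(link_dir, instance_name, device_path, idx):
    """Create (or refresh) a symlink to device_path and return its path.

    The "<instance>:<index>" naming scheme is hypothetical.
    """
    link_name = os.path.join(link_dir, "%s:%d" % (instance_name, idx))
    try:
        os.symlink(device_path, link_name)
    except OSError as err:
        if err.errno != errno.EEXIST:
            raise
        # A link from a previous start exists; replace it if it is stale.
        if not os.path.islink(link_name) or os.readlink(link_name) != device_path:
            os.remove(link_name)
            os.symlink(device_path, link_name)
    return link_name
```

The instance would then be started from the returned link path rather than
from the underlying device.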

Iustin Pop [Wed, 7 Jan 2009 17:01:25 +0000 (17:01 +0000)]
Simplify hypervisor block_devices structure

This is a partial forward-port of commit 1136 on the 1.2 branch:

  The hypervisor doesn't need to be passed the whole block device
  structure, so we'll just give it the block device name on the local
  node, and the name as seen by the instance. This will make it easier to
  manipulate it later without messing with the block devices (eg. by
  changing the system name to a symlink to the name itself).

  Since the HVM hypervisor changes the "virtual" name a note is added
  calling for a redesign that doesn't need this change, as different
  hypervisors and emulation types will anyway have different names for
  exported devices.

  Reviewed-by: iustinp

The changes in this patch compared to the original are:
  - we keep passing the original disk object, not for its iv_name, but
    for its physical_id, which is needed by the file driver (this could
    maybe be fixed)
  - we don't use the iv_name anymore, since in 2.0 we already use the
    index of the device

Original-Author: ultrotter

Iustin Pop [Wed, 7 Jan 2009 14:38:29 +0000 (14:38 +0000)]
_AssembleInstanceDisks: fix rpcresult handling

Commit 2117 changed _AssembleInstanceDisks to correctly parse the
failure status of the new RpcResult structure, but it didn't fix the
storing of only the result payload. Since RpcResult is not JSON
serializable, LUActivateInstanceDisks is failing.

Reviewed-by: ultrotter
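
Editorial sketch of why the payload has to be stored instead of the result
wrapper. The RpcResult class below is a simplified, hypothetical stand-in
(attribute names "data" and "failed" are assumptions).

```python
import json


class RpcResult(object):
    """Hypothetical, simplified stand-in for the RPC result wrapper."""

    def __init__(self, data=None, failed=False):
        self.data = data
        self.failed = failed


def collect_device_info(results):
    """Keep only the payload of each successful (node, result) pair."""
    device_info = []
    for node, result in results:
        if result.failed:
            raise RuntimeError("RPC call to %s failed" % node)
        # Store result.data, not the wrapper, so the list stays serializable.
        device_info.append((node, result.data))
    return device_info
```

Passing the wrapper object itself to json.dumps raises TypeError, which is
the kind of failure the commit describes.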

Iustin Pop [Tue, 6 Jan 2009 09:57:53 +0000 (09:57 +0000)]
Fix some pylint-detected issues

Two bad indentation cases and a missing variable.

Reviewed-by: imsnah

Michael Hanselmann [Fri, 19 Dec 2008 19:31:17 +0000 (19:31 +0000)]
ganeti.bootstrap: Set permissions on newly uploaded files

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 19:31:04 +0000 (19:31 +0000)]
ganeti.cmdlib: Check remote API certificate on "gnt-cluster verify"

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 19:30:46 +0000 (19:30 +0000)]
ganeti.bootstrap: Upload remote API certificate to new nodes

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 19:30:31 +0000 (19:30 +0000)]
ganeti.bootstrap: Prepare for remote API certificate

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 19:30:17 +0000 (19:30 +0000)]
ganeti.bootstrap: Write SSL key to temporary file and set permissions

Previously, we set the permissions only after writing the key. This
gave other users on the system a small window during which they could
read the key.

Reviewed-by: amishchenko
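
Editorial sketch of closing the window described above: the file is created
with restrictive permissions before any key material is written. The helper
name is hypothetical, and using tempfile.mkstemp is an assumption about how
one might do this, not necessarily what the patch does.

```python
import os
import tempfile


def write_private_file(path, data):
    """Write data (bytes) to path without ever exposing it to other users."""
    directory = os.path.dirname(path) or "."
    # mkstemp creates the file readable and writable only by the owner.
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Move the fully written file into place.
        os.rename(tmp_name, path)
    except Exception:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```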

Michael Hanselmann [Fri, 19 Dec 2008 19:30:05 +0000 (19:30 +0000)]
ganeti.bootstrap: Generate SSL certificate for remote API

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 19:29:50 +0000 (19:29 +0000)]
ganeti.bootstrap: Move SSL certificate generation into separate function

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 12:58:27 +0000 (12:58 +0000)]
ganeti-rapi: Implement HTTP authentication

Passwords are stored in "$localstatedir/lib/ganeti/rapi_users". User
options specify the access permissions of a user (see docstring for
ganeti.http.ReadPasswordFile), for which only "write" is supported
to grant write access. Every other user has read-only access.

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 12:58:10 +0000 (12:58 +0000)]
ganeti-rapi: Introduce per-request context

This will be used to evaluate access permissions to resources.

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 12:57:58 +0000 (12:57 +0000)]
ganeti.http: Function to read password file

Lines in the password file are of the following format:

  <username> <password> [options]

Fields are separated by whitespace. Username and password are
mandatory, options are optional and separated by comma (",").
Empty lines and comments ("#") are ignored.

Reviewed-by: amishchenko
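
Editorial sketch of a parser for the line format described above. The
function name and return type are hypothetical and may differ from the real
ganeti.http.ReadPasswordFile; treating only whole-line "#" comments and
skipping malformed lines are assumptions.

```python
def parse_password_file(contents):
    """Parse "<username> <password> [options]" lines.

    Returns a dict mapping username to (password, list_of_options).
    """
    users = {}
    for line in contents.splitlines():
        line = line.strip()
        # Empty lines and comments are ignored.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 2)
        if len(parts) < 2:
            # Username and password are mandatory; skip malformed lines.
            continue
        options = []
        if len(parts) == 3:
            options = [opt.strip() for opt in parts[2].split(",") if opt.strip()]
        users[parts[0]] = (parts[1], options)
    return users
```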

Michael Hanselmann [Fri, 19 Dec 2008 12:57:38 +0000 (12:57 +0000)]
ganeti.http: Add support for private data in HTTP requests

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 19 Dec 2008 12:57:22 +0000 (12:57 +0000)]
ganeti.http: Add support for basic HTTP authentication

As per RFC2617.

Reviewed-by: amishchenko
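
Editorial sketch of decoding a Basic credential as defined in RFC 2617
(Base64 of "username:password"). The function name is hypothetical and the
UTF-8 decoding is an assumption; this is not the ganeti.http implementation.

```python
import base64


def parse_basic_auth(header_value):
    """Return (username, password) from an Authorization header value.

    Returns None if the value is not a well-formed Basic credential.
    """
    parts = header_value.split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(parts[1].strip()).decode("utf-8")
    except ValueError:
        # Covers both invalid Base64 and invalid UTF-8.
        return None
    if ":" not in decoded:
        return None
    # The password may itself contain colons, so split only once.
    username, password = decoded.split(":", 1)
    return username, password
```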

Michael Hanselmann [Fri, 19 Dec 2008 12:57:07 +0000 (12:57 +0000)]
ganeti.http: Prepare authentication for HTTP server

The authentication class will override PreHandleRequest.

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 18 Dec 2008 16:39:04 +0000 (16:39 +0000)]
Job queue: Allow more than one file rename per RPC call

Reviewed-by: ultrotter

Michael Hanselmann [Thu, 18 Dec 2008 16:38:47 +0000 (16:38 +0000)]
ganeti.jqueue: Group job archivals to reduce number of RPC calls

Reducing the actual number of RPC calls will come in another patch.

Reviewed-by: ultrotter

Michael Hanselmann [Thu, 18 Dec 2008 16:38:32 +0000 (16:38 +0000)]
Prevent RPC timeout on auto-archiving jobs

With a large job queue, auto-archiving jobs can take a very long time,
causing timeouts on the luxi RPC layer. With this change, auto-
archive returns after half of the RPC timeout has passed. The user
will see how many jobs are left unchecked.

Reviewed-by: ultrotter
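
Editorial sketch of the deadline logic described above: stop once half of
the RPC timeout has elapsed and report how many jobs were left unchecked.
All names and the injectable clock are hypothetical.

```python
import time


def auto_archive(jobs, should_archive, rpc_timeout, clock=time.time):
    """Archive eligible jobs until half of rpc_timeout has passed.

    Returns (archived_count, unchecked_count).
    """
    deadline = clock() + rpc_timeout / 2.0
    archived = 0
    checked = 0
    for job in jobs:
        if clock() > deadline:
            # Out of time; the remaining jobs are reported as unchecked.
            break
        checked += 1
        if should_archive(job):
            archived += 1
    return archived, len(jobs) - checked
```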

Michael Hanselmann [Thu, 18 Dec 2008 16:38:09 +0000 (16:38 +0000)]
jqueue: When auto-archiving jobs, calculate job status only once

This is done by passing the job object to _ArchiveJobUnlocked instead
of only the job ID. Also return whether job was actually archived.

Reviewed-by: ultrotter

Michael Hanselmann [Thu, 18 Dec 2008 16:23:26 +0000 (16:23 +0000)]
Use subdirectories for job queue archive

As it turned out, having many files in a single directory can be
very painful. With this patch, only 10'000 files are stored in a
directory for the job queue archive. With 10'000 directories, this
allows for up to 100 million jobs to be archived without having large
numbers of files in a single directory. Not that it is realistic,
anyway.

Reviewed-by: ultrotter
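
Editorial sketch of the directory layout described above, grouping 10'000
jobs per subdirectory by integer division of the job ID. The function name
and the "job-<id>" file name are assumptions for illustration.

```python
import os

# Number of archived jobs per directory, as stated in the commit message.
JOBS_PER_ARCHIVE_DIRECTORY = 10000


def archived_job_path(archive_dir, job_id):
    """Return the archive path for job_id inside its numbered subdirectory."""
    subdir = str(job_id // JOBS_PER_ARCHIVE_DIRECTORY)
    return os.path.join(archive_dir, subdir, "job-%d" % job_id)
```

With 10'000 such subdirectories, 10'000 x 10'000 = 100 million jobs fit
without any directory exceeding 10'000 files.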

Michael Hanselmann [Thu, 18 Dec 2008 16:23:05 +0000 (16:23 +0000)]
Add rename function automatically creating directories if needed

Unfortunately, os.makedirs in Python 2.4 is not safe against multiple
processes creating the same directory tree at the same time. This is
only fixed in Python 2.5 and up. Adding more checks in our code doesn't
make it any better.

Reviewed-by: iustinp
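
Editorial sketch of a rename helper that creates missing parent directories.
The name, signature and default mode are assumptions. It is written for
Python 3, where os.makedirs accepts exist_ok=True and therefore tolerates a
directory that already exists; this differs from the Python 2.4 situation
the commit describes.

```python
import errno
import os


def rename_file(old, new, mkdir=False, mkdir_mode=0o750):
    """Rename old to new, creating new's parent directories if requested."""
    try:
        os.rename(old, new)
    except OSError as err:
        # Only a missing target directory is handled here.
        if not (mkdir and err.errno == errno.ENOENT):
            raise
        os.makedirs(os.path.dirname(new), mode=mkdir_mode, exist_ok=True)
        os.rename(old, new)
```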

Michael Hanselmann [Thu, 18 Dec 2008 16:21:13 +0000 (16:21 +0000)]
ganeti.http: Don't pass poller object around

They're cheap to instantiate and this change makes the code
a bit simpler.

Reviewed-by: ultrotter

Michael Hanselmann [Thu, 18 Dec 2008 13:45:41 +0000 (13:45 +0000)]
Rename http.HttpInternalError to HttpInternalServerError

All other exceptions are named after the error name in RFC2616 (HTTP/1.1).

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 18 Dec 2008 13:45:24 +0000 (13:45 +0000)]
ganeti.http: Add more constants and errors

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 18 Dec 2008 13:45:10 +0000 (13:45 +0000)]
ganeti.http: Ignore ENOTCONN when shutting down the connection

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 18 Dec 2008 13:44:53 +0000 (13:44 +0000)]
Implement support for additional headers with HTTP errors

Reviewed-by: amishchenko

Michael Hanselmann [Wed, 17 Dec 2008 14:30:58 +0000 (14:30 +0000)]
Add simple unittests for ganeti.http

More complex unittests will need some refactoring in the HTTP code.

Reviewed-by: amishchenko

Michael Hanselmann [Wed, 17 Dec 2008 14:09:39 +0000 (14:09 +0000)]
ganeti.bootstrap: Whitespace fix

Reviewed-by: iustinp

Michael Hanselmann [Wed, 17 Dec 2008 13:18:35 +0000 (13:18 +0000)]
Add job queue size limit

A job queue with too many jobs can increase memory usage and/or make
the master daemon slow. The current limit is just an arbitrary number.
A "soft" limit for automatic job archival is prepared.

Reviewed-by: iustinp

Michael Hanselmann [Wed, 17 Dec 2008 11:24:12 +0000 (11:24 +0000)]
utils.KillProcess: Use waitpid() to wait for child processes

Sometimes the proc filesystem doesn't reflect the current status of
a process. By calling waitpid(), we make sure to get the current
information, at least for child processes. The timeout is still
kept for child processes to make sure the proc filesystem is updated.

Reviewed-by: iustinp

Guido Trotter [Tue, 16 Dec 2008 16:24:46 +0000 (16:24 +0000)]
Release ganeti 2.0~alpha1

Reviewed-by: iustinp

Guido Trotter [Tue, 16 Dec 2008 16:24:35 +0000 (16:24 +0000)]
LUConnectConsole: fix primary_node online check

The primary node is part of the instance, not of the opcode.

Reviewed-by: iustinp

Guido Trotter [Tue, 16 Dec 2008 16:24:22 +0000 (16:24 +0000)]
_RunCmdPipe: handle EINTR in poller.poll()

poll() can be interrupted. Rather than failing, we retry until it
returns.

Reviewed-by: iustinp
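
Editorial sketch of the retry pattern described above, as a generic wrapper.
The helper name is hypothetical. Note that Python 3.5 and later already
retry interrupted system calls such as poll() automatically (PEP 475), so
this mostly illustrates what older code had to do by hand.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func repeatedly until it finishes without being interrupted."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as err:
            # Any error other than an interrupted call is a real failure.
            if err.errno != errno.EINTR:
                raise
```

A caller would wrap the blocking call, for example
retry_on_eintr(poller.poll, timeout).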

Guido Trotter [Tue, 16 Dec 2008 16:24:08 +0000 (16:24 +0000)]
KVM: improve socat interface

Call socat with a full path specified at configure time, rather than
just by its name, and check for the binary to exist at hypervisor
verify.

Reviewed-by: iustinp

Guido Trotter [Tue, 16 Dec 2008 16:23:53 +0000 (16:23 +0000)]
KVM: use a different default kernel path

It makes sense for the default kvm kernel not to be called "xenU".

Reviewed-by: iustinp

Michael Hanselmann [Mon, 15 Dec 2008 10:06:24 +0000 (10:06 +0000)]
ganeti.http: Add three TODOs for improvements

Reviewed-by: iustinp

Michael Hanselmann [Mon, 15 Dec 2008 09:48:25 +0000 (09:48 +0000)]
ganeti.http: Explicitly initiate handshake

Otherwise it would be done on the first read/write operation, making
error handling more difficult (such as EOF during handshake).

Reviewed-by: iustinp

Michael Hanselmann [Mon, 15 Dec 2008 09:40:57 +0000 (09:40 +0000)]
ganeti.http: Implement handshake socket operation

Reviewed-by: iustinp

Michael Hanselmann [Mon, 15 Dec 2008 09:40:42 +0000 (09:40 +0000)]
ganeti.http: Handle SSL_ERROR_ZERO_RETURN

Also add a comment next to the place where the SSL connection is shut
down.

Reviewed-by: iustinp

Iustin Pop [Sun, 14 Dec 2008 12:05:30 +0000 (12:05 +0000)]
cleanup: ConfigWriter, initialize all attributes

We should initialize the _last_cluster_serial in the constructor too (just to
be consistent).

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:05:22 +0000 (12:05 +0000)]
cleanup: rapi v2 instance tags wrong attribute

This was changed in the past, but it seems this class was forgotten.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:05:12 +0000 (12:05 +0000)]
cleanup: http server, line too long

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:05:03 +0000 (12:05 +0000)]
cleanup: http client, line too long

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:54 +0000 (12:04 +0000)]
cleanup: xen hypervisor

Fix wrong indentation and make one method signature uniform.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:45 +0000 (12:04 +0000)]
cleanup: kvm code likes to redefine names

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:36 +0000 (12:04 +0000)]
lib/ssh.py: import the logging module

Without this import, most of our error paths in this module were not
working (and were generating exceptions instead).

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:29 +0000 (12:04 +0000)]
SshRunner: add docstring for _BuildSshOptions

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:20 +0000 (12:04 +0000)]
Improve _autoconf.py comments

This adds a docstring to the _autoconf.py file detailing how it's
generated (the other comment is not visible in pydoc/epydoc).

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:13 +0000 (12:04 +0000)]
cleanup: use _ for unused loop counter

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:04:05 +0000 (12:04 +0000)]
cleanup: WorkerPool, wrong variable name

Quoting Michael: "why is this even working?"

Reviewed-by: imsnah,amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:56 +0000 (12:03 +0000)]
cleanup: TcpPing, wrong variable name

The default value of 'False' wasn't initialized properly. It doesn't
require initialization, but it's cleaner this way.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:47 +0000 (12:03 +0000)]
cleanup: SetEtcHostsEntry unused var

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:38 +0000 (12:03 +0000)]
cleanup: fix IAllocator hypervisor usage

Two problems: the iallocator.hypervisor wasn't initialized to None in
the constructor, so pylint doesn't realize it's initialized later with
setattr.

Second, 'hypervisor' is a module, so we shouldn't use it as a variable.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:30 +0000 (12:03 +0000)]
cleanup: LUReplaceDisks unused vars

And a small whitespace fix.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:20 +0000 (12:03 +0000)]
cleanup: do not hide upper-scope name

hypervisor is a module, so we shouldn't use it as an argument.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:11 +0000 (12:03 +0000)]
cleanup: fix use of _CheckNodeOnline

A few cases of wrong variable name.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:03:03 +0000 (12:03 +0000)]
cleanup: LUAddNode, LUSetNodeParams unused variable

This is a leftover from the abstraction of AdjustCandidatePool, and it
also requires the config lock, so it's better to remove it.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:53 +0000 (12:02 +0000)]
cleanup: LURenameCluster wrong variable name

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:45 +0000 (12:02 +0000)]
cleanup: fix export NIC count the same way as disk

For safety, we use the same algorithm as in disk count.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:36 +0000 (12:02 +0000)]
cleanup: fix backend._RecursiveFindBD

_RecursiveFindBD takes a parameter that isn't used; moreover, nowhere in
the SVN history can I find a case where it has been used.

As such, remove this parameter and fix its callers.

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:27 +0000 (12:02 +0000)]
cleanup: more unused vars

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:18 +0000 (12:02 +0000)]
cleanup: sanitize a default parameter

Instead of relying on the usage of the parameter being OK with mutable
default parameters, let's just make it safer.

Reviewed-by: amishchenko
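
Editorial sketch of the pitfall behind this cleanup: a mutable default value
is created once and shared between calls. The function names are
hypothetical; the commit does not say which function was changed.

```python
def add_tag_unsafe(tag, tags=[]):
    # The same list object is reused on every call that omits "tags".
    tags.append(tag)
    return tags


def add_tag_safe(tag, tags=None):
    # Use None as the default and create a fresh list per call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

Calling add_tag_unsafe twice without a second argument accumulates both tags
in one list, while add_tag_safe returns an independent list each time.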

Iustin Pop [Sun, 14 Dec 2008 12:02:09 +0000 (12:02 +0000)]
cleanup: exceptions should derive from Exception

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:02:01 +0000 (12:02 +0000)]
cleanup: fix GatherMasterVotes

Remove unused vars

Reviewed-by: amishchenko

Iustin Pop [Sun, 14 Dec 2008 12:01:52 +0000 (12:01 +0000)]
cleanup: _InitSSHSetup doesn't need its argument

Reviewed-by: imsnah

Iustin Pop [Sun, 14 Dec 2008 12:01:41 +0000 (12:01 +0000)]
cleanup: fix 'variable unused' warning

In the iteration we don't care about the node names, so we change the
for loop to be over the values (and not itervalues).

Reviewed-by: amishchenko

Michael Hanselmann [Fri, 12 Dec 2008 16:50:41 +0000 (16:50 +0000)]
ganeti.http: Rename HttpBase._using_ssl to HttpBase.using_ssl

It'll be queried from other classes.

Reviewed-by: iustinp

Michael Hanselmann [Fri, 12 Dec 2008 16:50:27 +0000 (16:50 +0000)]
ganeti.http: Rename HttpSocketBase to HttpBase

It's more appropriate.

Reviewed-by: iustinp

Iustin Pop [Thu, 11 Dec 2008 17:13:30 +0000 (17:13 +0000)]
Fix epydoc format warnings

This patch should fix all outstanding epydoc parsing errors; as such, we
switch epydoc into verbose mode so that any new errors will be visible.

Reviewed-by: imsnah

Iustin Pop [Thu, 11 Dec 2008 14:58:01 +0000 (14:58 +0000)]
Switch epydoc to parse only

epydoc seems to be mightily confused by decorators and how they change
functions (it starts mixing the parameters of the decorated function
into the decorator itself); so we want it to parse only and not look at
the objects themselves.

Reviewed-by: ultrotter

Michael Hanselmann [Wed, 10 Dec 2008 12:11:47 +0000 (12:11 +0000)]
ganeti.backend: Improve compression check

Reviewed-by: iustinp

Michael Hanselmann [Wed, 10 Dec 2008 12:06:35 +0000 (12:06 +0000)]
ganeti.http: Docstring updates

Reviewed-by: iustinp

Michael Hanselmann [Tue, 9 Dec 2008 18:42:53 +0000 (18:42 +0000)]
ganeti.http: Remove _HttpClientError

This is a leftover from old code.

Reviewed-by: iustinp

Michael Hanselmann [Tue, 9 Dec 2008 17:35:53 +0000 (17:35 +0000)]
ganeti.http.server: Increase connection backlog to 1024

This solves a problem with many concurrent requests. By default, 1024
is the maximum backlog on Linux kernels. We limit the number of clients
through MAX_CHILDREN, too. The idea of just increasing the backlog is
taken from lighttpd.

Reviewed-by: amishchenko

Michael Hanselmann [Tue, 9 Dec 2008 13:24:26 +0000 (13:24 +0000)]
RPC: Compress file upload data

Adding compression to larger amounts of data is more efficient than
transferring it (len(nodes) - 1) times over the network without
compression. We were able to compress an 800 KB config file to about
30 KB, which is about 40 KB with Base64 encoding (required due to
the way SimpleJson handles strings).

Reviewed-by: ultrotter
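
Editorial sketch of compressing a payload and Base64-encoding it for
transport in a JSON string. The function names are hypothetical and zlib is
an assumption; the commit message does not name the compression algorithm.

```python
import base64
import zlib


def compress_for_rpc(data):
    """Compress bytes and return them as Base64-encoded ASCII text."""
    return base64.b64encode(zlib.compress(data)).decode("ascii")


def decompress_from_rpc(text):
    """Reverse compress_for_rpc and return the original bytes."""
    return zlib.decompress(base64.b64decode(text))
```

Base64 emits 4 output characters for every 3 input bytes, which is why about
30 KB of compressed data becomes about 40 KB on the wire.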

Iustin Pop [Tue, 9 Dec 2008 09:33:32 +0000 (09:33 +0000)]
Warn for instances living on offline nodes

The patch also changes the result to error for non-reachable secondary nodes
(as for primary nodes).

Reviewed-by: ultrotter

Iustin Pop [Mon, 8 Dec 2008 17:45:56 +0000 (17:45 +0000)]
Fix _AdjustCandidatePool

Currently the ConfigWriter.MaintainCandidatePool returns node names, and
_AdjustCandidatePool uses them as such, but then it passes these to
context.ReaddNode which in turn passes them to jqueue.JobQueue.AddNode which
uses them as objects.Node instances.

Since this is currently the only usage, we change the return type of
ConfigWriter.MaintainCandidatePool to be objects and adjust the logging of
their names, so that the auto-adjustment works.

Reviewed-by: ultrotter

Iustin Pop [Mon, 8 Dec 2008 11:46:51 +0000 (11:46 +0000)]
gnt-node modify: add the offline attribute

This patch changes gnt-node modify and the associated opcode/lu to allow
modification of the node offline attribute.

Setting a node into offline mode automatically demotes it from the
master role.

Reviewed-by: ultrotter

Iustin Pop [Mon, 8 Dec 2008 09:10:48 +0000 (09:10 +0000)]
RPC: do not make calls to offline nodes

This patch changes the _MultNodeCall and _SingleNodeCall helpers to not
actually make calls to offline nodes, but instead generate fake
responses which have a parameter called 'offline' set so that callers
can check for this value if they want (otherwise, it's just a failed RPC
call).

Reviewed-by: ultrotter
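
Editorial sketch of the behaviour described above: offline nodes are never
contacted, and the caller receives a failed result flagged as offline. The
class and function names are hypothetical and much simpler than the real
RPC helpers.

```python
class FakeRpcResult(object):
    """Hypothetical, simplified result object with an 'offline' flag."""

    def __init__(self, node, data=None, failed=False, offline=False):
        self.node = node
        self.data = data
        self.failed = failed
        self.offline = offline


def multi_node_call(nodes, offline_nodes, do_call):
    """Run do_call(node) on online nodes; fake a failure for offline ones."""
    results = {}
    for node in nodes:
        if node in offline_nodes:
            # No network traffic: the result is a failure marked as offline.
            results[node] = FakeRpcResult(node, failed=True, offline=True)
        else:
            results[node] = FakeRpcResult(node, data=do_call(node))
    return results
```

Callers that do not look at the offline flag simply see a failed call.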

Guido Trotter [Sun, 7 Dec 2008 11:01:55 +0000 (11:01 +0000)]
chmod ganeti.initd before uploading it

When an upload is done to a node which doesn't have any version of
ganeti installed, this prevents a non-executable-initd error later in
the upload.

Reviewed-by: imsnah

Iustin Pop [Fri, 5 Dec 2008 11:41:33 +0000 (11:41 +0000)]
Make cluster verify understand offline nodes

This patch changes cluster verify to not alert on offline nodes, but
instead just show a note at the end with the number of such nodes.

It also removes warnings in verify-disks and hooks about failures to
make rpc calls to such nodes.

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 11:32:02 +0000 (11:32 +0000)]
cmdlib: check node stats in prereqs

This patch adds checks for offline nodes in most instance LUs so that we
can work with offline secondaries, but not with offline primaries. Some
cases (like grow disk, which needs both sides up) are not allowing
offline nodes at all.

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 11:20:20 +0000 (11:20 +0000)]
Add two utility functions to cmdlib

These will be used for parameter checking and node status checking.

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 11:14:19 +0000 (11:14 +0000)]
Add function to compute the master candidates

Since some nodes can be offline, we can't just take the length of the
node list as the maximum possible number of master candidates.

The patch adds a utility function to correctly compute this value and
replaces hardcoded computations with the use of this function. It then
adds utility functions to automate the maintenance of the node lists.

Reviewed-by: ultrotter
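
Editorial sketch of the computation described above: the maximum number of
master candidates is bounded by the number of nodes that are not offline.
The function name and the shape of both parameters are assumptions.

```python
def max_master_candidates(nodes, candidate_pool_size):
    """Return the highest achievable number of master candidates.

    nodes is an iterable of (name, offline) pairs; candidate_pool_size is
    the desired pool size.
    """
    # Offline nodes cannot be candidates, so count only the online ones.
    online = sum(1 for _name, offline in nodes if not offline)
    return min(online, candidate_pool_size)
```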

Iustin Pop [Fri, 5 Dec 2008 10:12:58 +0000 (10:12 +0000)]
http: use slicing instead of string modification

The combination of the current buffer splitting method and (4KB) buffer
size is very inefficient when writing big amounts of data. Just walking
over a 16 megabyte string using a 4K buffer takes (on a random computer)
1m06s, whereas using slices will decrease this to 0.080s, and slicing
with 32 KB size decreases this to 0.073s.

This means that uploading a big config file (it nears 1MB for big
clusters) will take more and more time per the number of nodes, since it
needs lots of slicing.

I happened upon this by accidentally setting all nodes as master
candidates, at which point just uploading the config file to all nodes
took 40s. Applying the patch decreases this to 15s (this probably can
still be optimized).

The patch also removes a duplicate constant (the one actually used is in
http/client.py), and changes the receive buffer size to use the same
constant.

Reviewed-by: imsnah
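
Editorial sketch contrasting the two ways of walking over a buffer that the
commit above compares. Function names are hypothetical. The trimming variant
copies the whole remaining buffer after every chunk, so its cost grows
quadratically with the data size; the slicing variant copies each chunk once.

```python
def iter_chunks_by_slicing(data, chunk_size=4096):
    """Yield successive chunks by offset, leaving the buffer untouched."""
    pos = 0
    while pos < len(data):
        yield data[pos:pos + chunk_size]
        pos += chunk_size


def iter_chunks_by_trimming(data, chunk_size=4096):
    """Slow variant: re-creates the remaining buffer after every chunk."""
    while data:
        yield data[:chunk_size]
        data = data[chunk_size:]
```

Both produce the same chunks; only the amount of copying differs.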

Iustin Pop [Fri, 5 Dec 2008 10:12:45 +0000 (10:12 +0000)]
Add the offline node list to ssconf

The patch also changes the various node list generation to be more
consistent.

Reviewed-by: imsnah

Iustin Pop [Fri, 5 Dec 2008 03:01:21 +0000 (03:01 +0000)]
Cleanup the config file on demotion from candidate

This patch adds a simple rpc which makes a backup of the config file and
then removes it. This is done so that cluster verify doesn't complain
immediately after demoting a node.

Reviewed-by: imsnah

Iustin Pop [Fri, 5 Dec 2008 02:58:40 +0000 (02:58 +0000)]
watcher: handle offline nodes better

This patch changes the LUQueryInstances to show a different state for
offline nodes and also modifies the watcher to understand the offline
state in its checks.

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 02:53:33 +0000 (02:53 +0000)]
node list: add the offline field

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 02:53:21 +0000 (02:53 +0000)]
Add a new node parameter 'offline'

This patch adds a new node parameter called offline that will be used to
mark nodes which should not be touched by commands.

We also add this flag at cluster init, node add, and export it to
iallocator scripts.

Reviewed-by: ultrotter

Iustin Pop [Fri, 5 Dec 2008 02:42:18 +0000 (02:42 +0000)]
ssconf: empty files should not add a newline

Currently we add a newline in the ssconf writeout process, even if the
file is empty. We change this case so that lists of values (e.g. offline
nodes) are correct (not a list of one empty element).

Reviewed-by: imsnah
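
Editorial sketch of the problem described above: a file containing only a
newline reads back as a list with one empty element, while a truly empty
file reads back as an empty list. Both function names are hypothetical.

```python
def format_ssconf_value(values):
    """Join values one per line; write nothing at all for an empty list."""
    text = "\n".join(values)
    if text:
        # Only non-empty content gets a trailing newline.
        text += "\n"
    return text


def parse_ssconf_list(text):
    """Split file contents back into a list of values."""
    return text.splitlines()
```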

Michael Hanselmann [Thu, 4 Dec 2008 15:25:26 +0000 (15:25 +0000)]
ganeti.http: Add constant for DELETE

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 4 Dec 2008 15:25:12 +0000 (15:25 +0000)]
Remove old HTTP code

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 4 Dec 2008 15:24:52 +0000 (15:24 +0000)]
ganeti.rpc: Convert to new HTTP server

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 4 Dec 2008 15:24:14 +0000 (15:24 +0000)]
ganeti-rapi: Convert to new HTTP server

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 4 Dec 2008 15:23:50 +0000 (15:23 +0000)]
ganeti-noded: Migrate to new HTTP server

Reviewed-by: amishchenko

Michael Hanselmann [Thu, 4 Dec 2008 15:23:38 +0000 (15:23 +0000)]
ganeti.http: Split HTTP server and client into separate files

This includes a large rewrite of the HTTP server code. The handling of
OpenSSL errors had some problems that were hard to fix with its
structure. When preparing all of this, I realized that actually HTTP
is a message protocol and that the same code can be used on both the
server and client side to parse requests/responses, with only a few
differences. There are still a few TODOs in the code, but none should
be a show stopper. Many pylint warnings have been fixed, too.

The old code will be removed once all users have been migrated.

Reviewed-by: amishchenko