
Revision 6e0a3771

ID 6e0a37719df114882b142f141ceb01be77d249d8

Added by Georgios D. Tsoukalas about 12 years ago

Introduce execution domains in code and building

Only fixed to compile. Untested.

Lots of refactoring, cleanup, and fixing in the process.
This log has three parts:
1. What are execution domains and why we need them
2. What are the API incompatibilities introduced
3. Issues that were discovered in the process,
and how they were fixed, or how they need fixing.

Note that there are many changes not documented in this log.

1. Execution Domains.

The execution domains are simply the different environments where
the xseg library, drivers and peers are compiled and linked in.
Currently there are two domains, 'user' for userspace processes,
and 'kernel' for kernel modules. The rule of thumb is that if you
need to compile it differently, then you need a new execution domain.

New domains could be introduced, for example,
so that xseg runs inside a hypervisor, another OS/arch, or even
to have two different implementations of xseg run together.
Until now, everything for the kernel domain (domain support code,
drivers, and peers) was stuffed together in sys/.

The new repository layout defines domains nicely:

sys/${DOMAIN}/ -> domain specific support code
drivers/${DOMAIN}/ -> drivers available to the domain
peers/${DOMAIN}/ -> the peers that execute in the domain
lib/${DOMAIN}/ -> built libraries, modules for the domain

The kernel domain has been split into four parts:
a. the xseg.ko library kernel module in sys/kernel.
(this includes the domain support code in xsegmod.c)
b. the segdev.ko (renamed from xsegdev) character device in sys/kernel
(this could be part of the driver, but is generic enough
to be included into the domain to be generally available)
c. the xseg_segdev.ko segdev segment and peer driver in drivers/kernel
d. the xsegbd.ko segdev peer in peers/kernel

The layout for the user domain is similar.
The domain support code is in sys/user/xseg_user.c

The domain support code finds out what it must implement in
various "domain.h" files in the tree, currently:
xq/domain.h
xseg/domain.h
sys/domain.h
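
As a rough illustration of the idea (not the actual contents of any domain.h),
a domain header can declare a few functions and leave each domain to supply
its own implementation. All demo_* names below are hypothetical; only
kmalloc()/kfree() and malloc()/free() are real calls.

```c
#ifdef __KERNEL__
#include <linux/types.h>
#include <linux/slab.h>
#else
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#endif

/* What a hypothetical domain.h might declare: the contract every
 * execution domain has to implement. */
void *demo_domain_malloc(size_t size);
void demo_domain_free(void *ptr);
const char *demo_domain_name(void);

#ifdef __KERNEL__
/* Kernel domain: the same contract, backed by kernel allocators. */
void *demo_domain_malloc(size_t size) { return kmalloc(size, GFP_KERNEL); }
void demo_domain_free(void *ptr) { kfree(ptr); }
const char *demo_domain_name(void) { return "kernel"; }
#else
/* User domain: the same contract, backed by libc. */
void *demo_domain_malloc(size_t size)
{
    assert(size > 0);   /* sanity check for the example only */
    return malloc(size);
}
void demo_domain_free(void *ptr) { free(ptr); }
const char *demo_domain_name(void) { return "user"; }
#endif
```

Code that is shared between domains would call only the declared functions,
so adding a domain means adding one more implementation block (or file).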

2. API changes
==============

i. xseg_initialize() no longer takes an argument.
Instead, the type of the peer is decided at xseg_join().
This way a program may implement more than one type of peer
(in the same, or different segments)
xseg_initialize() is only for library initialization.

ii. xseg_join() takes two more arguments:
a. the name of the peer type to be initialized
b. a callback function for fully asynchronous peers,
such as xsegbd, which can never sleep in wait_for_signal()
This callback is just registered by the library;
if the driver does not call it, it will never be called.

iii. xseg_wait_signal() no longer takes a portno argument.
Taking one was a mistake.
The sleep is intended to be as generic as possible.
Ideally, the library (through the domain support code)
should offer a wakeup method in all possible ways
(e.g. from other peers as with xseg_signal(),
from the peer context e.g. with select() in user,
or waitqueues in kernel, signals, timers, ...)

Right now, the peer has to schedule its wakeup
by using xseg_signal on itself from another context,
or by exploiting the same mechanism that the peer driver does
(e.g. SIGIO)

Deeper in the library, the driver interface for signaling was changed
so that functions take the full segment descriptor plus the portno,
rather than just a pointer to the port.
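
The join-time callback and the new signaling interface can be sketched as
follows. This is a minimal mock under stated assumptions: the demo_* types
and functions are hypothetical stand-ins, not the real xseg signatures, and
the peer type strings are only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real xseg types or signatures. */
struct demo_seg;
typedef void (*demo_wakeup_cb)(struct demo_seg *seg, uint32_t portno);

struct demo_seg {
    char peer_type[32];       /* peer type chosen at join time */
    demo_wakeup_cb wakeup;    /* registered at join, NULL for sleeping peers */
    uint32_t last_signalled;  /* bookkeeping for this example only */
    unsigned int nr_calls;    /* bookkeeping for this example only */
};

/* Library side: join only records the peer type and the callback. */
static int demo_join(struct demo_seg *seg, const char *peer_type,
                     demo_wakeup_cb cb)
{
    assert(seg != NULL && peer_type != NULL);
    if (strlen(peer_type) >= sizeof(seg->peer_type))
        return -1;
    strcpy(seg->peer_type, peer_type);
    seg->wakeup = cb;
    seg->last_signalled = 0;
    seg->nr_calls = 0;
    return 0;
}

/* Driver side: receives the descriptor plus the port number, so it can
 * reach the registered callback. Returns 1 if the callback was run. */
static int demo_driver_signal(struct demo_seg *seg, uint32_t portno)
{
    if (seg->wakeup == NULL)
        return 0;  /* a real driver would wake a sleeping peer here */
    seg->wakeup(seg, portno);
    return 1;
}

/* Peer side: a callback for a peer that can never sleep. */
static void demo_peer_callback(struct demo_seg *seg, uint32_t portno)
{
    seg->last_signalled = portno;
    seg->nr_calls++;
}
```

In this mock, joining with demo_peer_callback and then calling
demo_driver_signal(&seg, 3) runs the callback once with port 3. Joining with
a NULL callback leaves the wakeup entirely to the driver, and a driver that
never invokes the callback means it never runs.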

3. Issues
=========

  • Kill security.
    Any peer signaling a segdev peer can send SIGIO to any process in the system.
    There is no infrastructure for signaling security.
    The segment could be limited to a specific uid, so that
    all peers must be run as this user,
    and all kills are made as (with the permissions of) this user.
  • Killing threads.
    The posix driver for the kernel and user domains kills specific threads with SIGIO,
    and not processes. Before this commit, the kernel driver killed threads,
    while the user driver killed processes.
    It is trivial to change the behavior to one or the other, if we need to.

    Killing processes has the advantage that a multithreaded user peer
    need not have a master peer that will accept and dispatch work.
    The kernel will send a waking signal to one of the sleeping threads.

    However, killing threads provides more flexibility,
    while it can emulate the other way:
    When sleeping, all threads register the id of the master thread,
    so at all times, only the master thread receives signals.
    Then, the master thread can kill the process and emulate
    a process-killing signal.

    Currently, the posix driver registers the id of the thread
    that initializes the library as the master thread.
    Fine-tuning will probably be needed for this to work correctly.

  • Callback design & code cleanup.
    xsegbd cannot sleep. Other peers may have the same constraint in the future,
    or we might decide to implement some otherwise sleepable ones that way.
    prepare/cancel_wait() and wait_signal() are not used by such peers.
    Instead, a callback is registered at join,
    which takes a segment descriptor and the signalled port as arguments.
    The library only registers the callback in the segment descriptor.
    It is up to the peer drivers to schedule its call upon signaling.

    In the case of xsegbd, the peer driver (xseg_segdev) utilizes
    the kernel domain support character device (segdev).
    The userspace segdev driver calls the kernel segdev driver through
    /dev/segdev, and the kernel segdev calls the xsegbd-registered callback
    through the xseg descriptor.

    Currently there is support for only one segdev character device,
    and only one kernel segdev peer, but the design is easily extensible.

    xsegdev was renamed to segdev.
    Notice that in the whole segdev code there is no reference to xseg
    (thus the rename). Segdev is considered an independent utility
    that supports the kernel domain.

  • Make will pass V=1 to kbuild, so that verbosity can be controlled
    from the top-level make.
  • ${XSEG_HOME}/envsetup is an environment-setting tool that will either
    - initialize variables if sourced
    - initialize variables and start a shell if executed with no arguments
    - print out variables if executed with 'show'

    Its intended use is from the command line, to set up a dev shell,
    and it is used by the makefiles to set up/discover the environment.

  • The build system was improved.
    (fewer includes, more central configuration, more automation, etc)

    The top-level base.mk is at the heart of it,
    discovering and setting most of the variables,
    while defining global rules.

    It uses envsetup for basic variables and
    tools/xseg-domain-targets to auto-detect what targets are available
    in the tree.

  • LOGMSG was renamed to XSEGLOG and promoted to domain-specific.
    The xsegbd-specific XSEGLOG was discarded.
  • xsegbd hacks.
    There were some ugly hacks and bad stuff in how xsegbd handled
    its signalling via callbacks, and specifically how it blocked to
    wait for device size. A pointer was overlaid in a private field
    along with an integer, and circulated through the userspace,
    before being intrusively extracted from port->waitcue and
    auto-detected as pointer.

    Now there are proper structures (struct pending)
    for distinguishing among asynchronous or completion-blocked requests,
    and the callbacks all run through well defined interfaces.

  • xsegbd hazards.
    The initialization of xsegbd, both in sysfs and in disk,
    leaks memory: it does not free what it allocated on failure.

    The loop receiving requests within the callback
    may receive replies for more than one device,
    and therefore needs to call request_fn's for more than one device.
    This was fixed (not tested).
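
The multi-device fix in the last item can be pictured with a small sketch.
It is not the xsegbd code: demo_reply, demo_drain_replies() and the kick
callback (standing in for a per-device request_fn) are hypothetical, and the
device table is a fixed-size array for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEVS 8

/* Hypothetical reply record: only the owning device matters here. */
struct demo_reply {
    int dev_id;
};

/* Example-only kick target that counts calls per device. */
static int demo_kick_count[DEMO_MAX_DEVS];
static void demo_count_kick(int dev_id)
{
    demo_kick_count[dev_id]++;
}

/* Drain a batch of replies that may belong to several devices, then
 * kick each touched device exactly once. Returns the number of
 * distinct devices kicked. */
static int demo_drain_replies(const struct demo_reply *replies, size_t n,
                              void (*kick)(int dev_id))
{
    int touched[DEMO_MAX_DEVS] = {0};
    int kicked = 0;
    size_t i;
    int d;

    assert(replies != NULL || n == 0);

    /* First pass: remember which devices received a reply. */
    for (i = 0; i < n; i++) {
        d = replies[i].dev_id;
        if (d >= 0 && d < DEMO_MAX_DEVS)
            touched[d] = 1;
    }

    /* Second pass: one kick per device, not one per reply. */
    for (d = 0; d < DEMO_MAX_DEVS; d++) {
        if (touched[d]) {
            if (kick != NULL)
                kick(d);
            kicked++;
        }
    }
    return kicked;
}
```

With replies for devices 1, 3, 1, 3, 1 the function kicks devices 1 and 3
once each, instead of assuming that every reply belongs to a single device.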
