OVERVIEW
The EVS library is delivered with the openais project. This library is
used to create distributed applications that operate properly during
partitions, merges, and faults.
The library provides a mechanism to:
* handle abstraction for multiple instances of an EVS library in one
  application
* deliver messages
* deliver configuration changes
* join one or more groups
* leave one or more groups
* send messages to one or more groups
* send messages to currently joined groups
The EVS library implements a messaging model known as Extended Virtual
Synchrony. This model allows one sender to transmit to many receivers
using standard UDP/IP. UDP/IP is unreliable and unordered, so the EVS
library applies ordering and reliability to messages. Hardware
multicast is used to avoid duplicated packets with two or more
receivers. Erroneous messages are corrected automatically by the
library.
Certain guarantees are provided by the EVS library. These guarantees
are related to message delivery and configuration change delivery.
DEFINITIONS
multicast
A multicast occurs when a network interface card sends a UDP
packet to multiple receivers simultaneously.
processor
A processor is the entity that executes the extended virtual
synchrony algorithms.
configuration
A configuration is the current description of the processors
executing the extended virtual synchrony algorithm.
configuration change
A configuration change occurs when a new configuration is
delivered.
partition
A partition occurs when a configuration splits into two or more
configurations, or a processor fails or is stopped and leaves
the configuration.
merge
A merge occurs when two or more configurations join into a
larger new configuration. When a new processor starts up, it is
treated as a configuration with only one processor and a merge
occurs.
fifo ordering
A message is FIFO ordered when one sender and one receiver agree
on the order of the messages sent.
agreed ordering
A message is AGREED ordered when all processors agree on the
order of the messages sent.
safe ordering
A message is SAFE ordered when all processors agree on the order
of messages sent and those messages are not delivered until all
processors have a copy of the message to deliver.
virtual synchrony
Virtual synchrony is obtained when all processors agree on the
order of messages sent and configuration changes sent for each
new configuration.
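The ordering definitions above can be checked mechanically against
delivery logs. The following is a minimal sketch in Python, not part of
the EVS API: the function names and the (sender, number) message tuples
are assumptions made for illustration, and agreed ordering is simplified
to "every processor in the configuration has the same log".

```python
def is_fifo_ordered(log):
    """True if each sender's messages appear in increasing order in the log."""
    last_seen = {}
    for sender, number in log:
        if number <= last_seen.get(sender, -1):
            return False
        last_seen[sender] = number
    return True


def is_agreed_ordered(logs):
    """True if every processor delivered the same messages in the same order."""
    first = logs[0]
    return all(log == first for log in logs[1:])


# Hypothetical delivery logs of two processors; each entry is
# (sender, per-sender message number).
log_p1 = [("A", 0), ("B", 0), ("A", 1)]
log_p2 = [("B", 0), ("A", 0), ("A", 1)]

# Both logs keep each sender's own order, so both are FIFO ordered.
print(is_fifo_ordered(log_p1), is_fifo_ordered(log_p2))
# The interleaving of A and B differs, so the logs are not AGREED ordered.
print(is_agreed_ordered([log_p1, log_p2]))
```

SAFE ordering adds a condition on when delivery may happen (every
processor must already hold a copy), which a delivery log alone cannot
show.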
USING VIRTUAL SYNCHRONY
The virtual synchrony messaging model has many benefits for developing
distributed applications. Applications designed using replication have
the most benefits. Applications that must be able to partition and
merge also benefit from the virtual synchrony messaging model.
All applications receive a copy of transmitted messages even if there
are errors on the transmission media. This allows optimizations when
every processor must receive a copy of the message for replication.
All messages are ordered according to agreed ordering, which helps
avoid race conditions. Consider a lock service implemented over several
processors. Two requests, A and B, occur at the same time on two
separate processors. The requests are delivered to every processor in
the same order. If that order places A first, all processors get
request A before request B and can reject request B. Any type of
creation or deletion of a shared data structure can benefit from this
mechanism.
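The lock service example can be sketched as follows. This is a small
Python model, not the EVS API: the LockReplica class and the
agreed_order list are hypothetical stand-ins for a replicated
application and for the message order that the library would deliver.

```python
class LockReplica:
    """One processor's copy of the lock state (hypothetical model)."""

    def __init__(self):
        self.holder = None
        self.rejected = []

    def deliver(self, requester):
        # Every replica applies the same rule to the same message order,
        # so every replica reaches the same decision without extra
        # coordination.
        if self.holder is None:
            self.holder = requester
        else:
            self.rejected.append(requester)


# Requests from processors "A" and "B" are sent at the same time. Agreed
# ordering fixes one order for every processor (assumed here: A, then B).
agreed_order = ["A", "B"]

replicas = [LockReplica() for _ in range(3)]
for replica in replicas:
    for requester in agreed_order:
        replica.deliver(requester)

# All replicas name the same holder and reject the same request.
holders = {replica.holder for replica in replicas}
print(holders)
```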
Self delivery ensures that messages that are sent by a processor are
also delivered back to that processor. This allows the processor
sending the message to execute logic when the message is self delivered
according to agreed ordering and the virtual synchrony rules. It also
permits all logic to be placed in one message handler instead of two
separate places.
Virtual Synchrony allows the current configuration to be used to make
decisions in partitions and merges. Since the configuration is sent in
the stream of messages to the application, the application can alter
its behavior based upon the configuration changes.
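One way an application might react to configuration changes is
sketched below in Python. The majority rule, the TOTAL_PROCESSORS
constant, and the event tuples are assumptions chosen for illustration;
the EVS library itself does not prescribe any such policy.

```python
TOTAL_PROCESSORS = 5  # assumed cluster size for this example


def run(events):
    """Process an ordered stream of configuration changes and messages."""
    members = set()
    applied, refused = [], []
    for kind, value in events:
        if kind == "config":
            members = set(value)      # a new configuration was delivered
        elif len(members) * 2 > TOTAL_PROCESSORS:
            applied.append(value)     # majority configuration: accept
        else:
            refused.append(value)     # minority configuration: refuse
    return applied, refused


# Hypothetical stream: configuration changes arrive in line with messages.
events = [
    ("config", [1, 2, 3, 4, 5]),
    ("msg", "update-1"),
    ("config", [1, 2]),               # partition: two processors remain
    ("msg", "update-2"),
    ("config", [1, 2, 3, 4]),         # merge
    ("msg", "update-3"),
]
applied, refused = run(events)
print(applied, refused)
```

Because the configuration change is ordered relative to the messages,
every member of a configuration switches policy at the same point in
the stream.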
ARCHITECTURE AND ALGORITHM
The EVS library is a thin IPC interface to the openais executive. The
openais executive provides services for the SA Forum AIS libraries as
well as the EVS library.
The openais executive uses a ring protocol and membership protocol to
send messages according to the semantics required by extended virtual
synchrony. The ring protocol creates a virtual ring of processors. A
token is rotated around the ring of processors. When the token is
possessed by a processor, that processor may multicast messages to
other processors in the system.
The token is called the ORF token (for ordering, reliability, flow
control). The ORF token orders all messages by increasing a sequence
number every time a message is multicast. In this way, an ordering
is placed on all messages that all processors agree to. The token also
contains a retransmission list. If a token is received by a processor
that has not yet received a message it should have, a message sequence
number is added to the retransmission list. A processor that has a
copy of the message then retransmits the message. The ORF token
provides configuration-wide flow control by tracking the number of
messages sent and limiting the number of messages that may be sent by
one processor on each possession of the token.
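The three roles of the token (ordering, retransmission, flow control)
can be modelled in a few lines of Python. This is a simplified sketch
under stated assumptions, not the openais implementation: the Token and
Processor classes, the max_per_possession limit, and the drop parameter
used to simulate packet loss are all hypothetical.

```python
class Token:
    def __init__(self):
        self.seq = 0              # highest sequence number assigned so far
        self.retransmit = set()   # sequence numbers some processor is missing


class Processor:
    def __init__(self, name, outbox=()):
        self.name = name
        self.outbox = list(outbox)   # messages waiting to be sent
        self.received = {}           # sequence number -> payload

    def on_token(self, token, ring, max_per_possession, drop=None):
        # 1. Reliability: retransmit requested messages that we hold.
        for seq in [s for s in sorted(token.retransmit) if s in self.received]:
            for peer in ring:
                peer.received.setdefault(seq, self.received[seq])
            token.retransmit.discard(seq)
        # 2. Reliability: request anything we should have but do not.
        for seq in range(1, token.seq + 1):
            if seq not in self.received:
                token.retransmit.add(seq)
        # 3. Ordering and flow control: send a bounded number of new messages.
        for _ in range(min(max_per_possession, len(self.outbox))):
            token.seq += 1
            payload = self.outbox.pop(0)
            for peer in ring:
                if (peer.name, token.seq) != drop:   # simulated packet loss
                    peer.received[token.seq] = payload


p1, p2, p3 = Processor("p1", ["a", "b", "c"]), Processor("p2"), Processor("p3")
ring = [p1, p2, p3]
token = Token()

# First rotation: p3 loses the message with sequence number 2.
for p in ring:
    p.on_token(token, ring, max_per_possession=2, drop=("p3", 2))
missing_after_first = set(token.retransmit)
held_back_after_first = list(p1.outbox)

# Second rotation: p1 sees the request, retransmits, then sends "c".
for p in ring:
    p.on_token(token, ring, max_per_possession=2)
print(p3.received)
```

In the first rotation the limit of two messages per possession holds
"c" back, and p3 adds the lost sequence number to the retransmission
list. In the second rotation p1 retransmits it, so all three processors
end with the same messages under the same sequence numbers.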
The membership protocol is responsible for ring formation and detecting
when a processor within a ring has failed. If the token fails to make
a rotation within a timeout period known as the token rotation timeout,
the membership protocol will form a new ring. If a new processor
starts, it will also form a new ring. Two or more configurations may
be used to form a new ring, allowing many partitions to merge together
into one new configuration.
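The outcome of the membership protocol can be illustrated with a short
Python sketch. It shows only the result of ring formation, not the
agreement among processors that the real protocol needs; the function
names and parameters are hypothetical.

```python
def token_lost(last_token_time, now, token_rotation_timeout):
    """True if the token has not been seen within the rotation timeout."""
    return now - last_token_time > token_rotation_timeout


def form_ring(configurations, failed=()):
    """Combine the surviving processors of one or more configurations."""
    members = set().union(*configurations) - set(failed)
    return sorted(members)


# Processor 3 fails: the token stops rotating and a new ring is formed.
if token_lost(last_token_time=10.0, now=12.5, token_rotation_timeout=1.0):
    after_failure = form_ring([[1, 2, 3]], failed=[3])

# Two partitions and a newly started processor merge into one ring.
after_merge = form_ring([[1, 2], [4, 5], [6]])
print(after_failure, after_merge)
```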
PERFORMANCE
The EVS library obtains 8.5 MB/sec throughput on 100 Mbit network links
with many processors. Larger messages obtain better throughput results
because the time to access Ethernet is about the same for a small
message as it is for a larger message. Smaller messages obtain a higher
rate of messages per second, because a small message still takes
somewhat less time to send than a large one.
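The figures above can be put in perspective with some arithmetic,
sketched here in Python. The unit conventions (1 MB = 10^6 bytes,
1 Mbit = 10^6 bits) and the fixed per-message cost are assumptions
chosen for illustration, not measurements of openais.

```python
LINK_BITS_PER_SEC = 100 * 10**6         # 100 Mbit link
THROUGHPUT_BYTES_PER_SEC = 8.5 * 10**6  # 8.5 MB/sec from the text above

# Fraction of the raw link rate carried as application data.
utilization = THROUGHPUT_BYTES_PER_SEC * 8 / LINK_BITS_PER_SEC

PER_MESSAGE_COST_SEC = 100e-6  # assumed fixed cost to access the network


def send_time(message_bytes):
    """Fixed access cost plus time on the wire (simplified model)."""
    return PER_MESSAGE_COST_SEC + message_bytes * 8 / LINK_BITS_PER_SEC


def bytes_per_second(message_bytes):
    return message_bytes / send_time(message_bytes)


def messages_per_second(message_bytes):
    return 1 / send_time(message_bytes)


small, large = 100, 1400  # example message sizes in bytes
print(utilization)
print(bytes_per_second(small) < bytes_per_second(large))
print(messages_per_second(small) > messages_per_second(large))
```

Under this model the fixed cost dominates small messages, so large
messages move more bytes per second while small messages complete more
sends per second.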
80% of CPU utilization occurs because of encryption and authentication.
The openais executive can be built without encryption and
authentication for those with no security requirements and low CPU
utilization requirements. Even without encryption or authentication,
under heavy load, processor utilization can reach 25% on 1.5 GHz
processors.
The current openais executive supports 16 processors. Support for more
processors is possible by changing defines in the openais executive,
but this is untested.
SECURITY
The EVS library encrypts all messages sent over the network using the
SOBER-128 stream cipher. The EVS library uses HMAC and SHA1 to
authenticate all messages. The EVS library uses SOBER-128 as a pseudo
random number generator. The EVS library feeds the PRNG using the
/dev/random Linux device.
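Message authentication with HMAC and SHA1 can be sketched with the
Python standard library. The packet layout (tag followed by payload),
the key handling, and the function names are assumptions; the actual
openais wire format is not described here, and SOBER-128 encryption is
omitted because the standard library does not provide it.

```python
import hashlib
import hmac
import os

TAG_LEN = 20  # size of a SHA1 digest in bytes


def seal(key, payload):
    """Prefix the payload with its HMAC-SHA1 tag (assumed layout)."""
    tag = hmac.new(key, payload, hashlib.sha1).digest()
    return tag + payload


def open_sealed(key, packet):
    """Verify the tag and return the payload, or raise ValueError."""
    tag, payload = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha1).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload


key = os.urandom(32)  # hypothetical shared key
packet = seal(key, b"hello")
print(open_sealed(key, packet))
```

A receiver that shares the key accepts only packets whose tag matches,
so a packet altered in transit is rejected rather than delivered.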
BUGS
This software is not yet production ready, so there may still be some
bugs. Very few are expected, however, since no outstanding bugs have
been reported at this point.
SEE ALSO
evs_initialize(3), evs_finalize(3), evs_fd_get(3), evs_dispatch(3),
evs_join(3), evs_leave(3), evs_mcast_joined(3), evs_mcast_groups(3),
evs_membership_get(3)