

       The EVS library is delivered with the openais project.  This library is
       used to create distributed applications that  operate  properly  during
       partitions, merges, and faults.

       The library provides a mechanism to:

       * handle abstraction for multiple instances of an EVS library in
         one application
       * deliver messages
       * deliver configuration changes
       * join one or more groups
       * leave one or more groups
       * send messages to one or more groups
       * send messages to currently joined groups

       The  EVS library implements a messaging model known as Extended Virtual
       Synchrony.  This model allows one sender to transmit to many  receivers
       using  standard UDP/IP.  UDP/IP is unreliable and unordered, so the EVS
       library  applies  ordering  and  reliability  to  messages.    Hardware
       multicast  is  used  to  avoid  duplicated  packets  with  two  or more
       receivers.  Erroneous  messages  are  corrected  automatically  by  the

       Certain guarantees are provided by the EVS library.  These guarantees
       are related to message delivery and configuration change delivery.


       multicast
              A multicast occurs when a network interface card sends a UDP
              packet to multiple receivers simultaneously.

       processor
              A processor is the entity that executes the extended virtual
              synchrony algorithms.

       configuration
              A configuration is the current description of the processors
              executing the extended virtual synchrony algorithm.

       configuration change
              A  configuration  change  occurs  when  a  new  configuration is

       partition
              A partition occurs when a configuration splits into two or more
              configurations, or a processor fails or is stopped and leaves
              the configuration.

       merge  A merge occurs when two  or  more  configurations  join  into  a
              larger new configuration.  When a new processor starts up, it is
              treated as a configuration with only one processor and  a  merge

       fifo ordering
              A message is FIFO ordered when one sender and one receiver agree
              on the order of the messages sent.

       agreed ordering
              A message is AGREED ordered when all  processors  agree  on  the
              order of the messages sent.

       safe ordering
              A message is SAFE ordered when all processors agree on the order
              of messages sent and those messages are not delivered until  all
              processors have a copy of the message to deliver.

       virtual synchrony
              Virtual synchrony is obtained when all processors agree on the
              order of messages sent and configuration changes sent for each
              new configuration.
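
       The difference between agreed and safe ordering can be illustrated
       with a small model.  The sketch below is not the EVS API; the class
       SafeDeliveryBuffer and its methods are hypothetical names used only to
       show the rule that a SAFE message is held back until every processor
       in the configuration has a copy.

```python
# Hypothetical model of SAFE delivery; none of these names exist in the EVS library.
class SafeDeliveryBuffer:
    """Holds ordered messages until every processor has a copy."""

    def __init__(self, processors):
        self.processors = set(processors)
        self.pending = {}    # seq -> (message, set of processors holding a copy)
        self.next_seq = 1    # lowest sequence number not yet delivered
        self.delivered = []

    def receive(self, seq, message):
        """Store a message under its agreed sequence number."""
        self.pending.setdefault(seq, (message, set()))

    def acknowledge(self, seq, processor):
        """Record that `processor` holds a copy, then deliver what is now safe."""
        self.pending[seq][1].add(processor)
        self._deliver_ready()

    def _deliver_ready(self):
        # Deliver strictly in sequence order, and only when all processors
        # are known to hold the message.
        while self.next_seq in self.pending:
            message, holders = self.pending[self.next_seq]
            if holders != self.processors:
                break
            self.delivered.append(message)
            del self.pending[self.next_seq]
            self.next_seq += 1
```

       In this model a later message that is already held everywhere still
       waits behind an earlier one that is not, so delivery order matches the
       agreed order.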


       The  virtual synchrony messaging model has many benefits for developing
       distributed applications.  Applications designed using replication have
       the  most  benefits.   Applications  that must be able to partition and
       merge also benefit from the virtual synchrony messaging model.

       All applications receive a copy of transmitted messages even if there
       are errors on the transmission media.  This allows optimizations when
       every processor must receive a copy of the message for replication.

       All messages are ordered according to agreed ordering, which avoids
       race conditions.  Consider a lock service implemented over several
       processors.  Two requests, A and B, occur at the same time on two
       separate processors.  The requests are delivered to every processor
       in the same order.  If A is ordered first, all processors get request
       A before request B, so all of them grant the lock to A and reject
       request B.  Any type of creation or deletion of a shared data
       structure can benefit from this mechanism.
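
       The lock service scenario can be sketched as follows.  This is a
       minimal model, assuming the transport has already put the two requests
       into one agreed order; LockReplica is a hypothetical name and nothing
       here calls the EVS library.

```python
# Hypothetical replica of a lock service; agreed ordering is assumed to be
# provided by the transport and is modelled here as a plain list.
class LockReplica:
    def __init__(self):
        self.owner = None

    def deliver(self, requester):
        """Handle one lock request; return True if the lock was granted."""
        if self.owner is None:
            self.owner = requester
            return True
        return False


# Requests A and B were issued concurrently on different processors, but
# every replica receives them in the same agreed order.
agreed_order = ["A", "B"]
replicas = [LockReplica() for _ in range(3)]
results = [[replica.deliver(req) for req in agreed_order] for replica in replicas]
```

       Because every replica sees the same sequence, each one reaches the
       same decision without any extra coordination round.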

       Self delivery ensures that messages that are sent by a processor are
       also delivered back to that processor.  This allows the processor
       sending the message to execute logic when the message is self delivered
       according to agreed ordering and the virtual synchrony rules.  It also
       permits all logic to be placed in one message handler instead of two
       separate places.
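
       A small model shows why self delivery keeps the logic in one place.
       OrderedBus and Counter are hypothetical names; the bus stands in for
       an agreed-order transport that delivers to every member, including the
       sender.

```python
# Hypothetical stand-in for an agreed-order transport with self delivery.
class OrderedBus:
    def __init__(self):
        self.members = []

    def mcast(self, sender, payload):
        # Every member receives the message, the sender included.
        for member in self.members:
            member.on_deliver(sender, payload)


class Counter:
    """Replicated counter whose state changes only in the delivery handler."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.value = 0
        bus.members.append(self)

    def increment(self, amount):
        # No local state change here; the update is applied on self delivery.
        self.bus.mcast(self.name, amount)

    def on_deliver(self, sender, amount):
        # Single handler for both local and remote updates.
        self.value += amount
```

       Since increment() never touches the counter directly, the sender and
       the receivers apply each update at the same point in the message
       order.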

       Virtual Synchrony allows the current configuration to be used  to  make
       decisions in partitions and merges.  Since the configuration is sent in
       the stream of messages to the application, the  application  can  alter
       its behavior based upon the configuration changes.
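
       As an illustration, an application might accept writes only while its
       configuration holds a majority of the cluster.  The majority policy,
       the event tuples, and the function name below are assumptions made for
       this sketch; the library itself only delivers the configuration
       changes in order with the messages.

```python
# Hypothetical application logic driven by an ordered stream of events.
# Each event is ("config", [member names]) or ("msg", payload).
def process_stream(events, cluster_size):
    """Accept messages only while the current configuration has a majority."""
    members = set()
    accepted = []
    rejected = []
    for kind, value in events:
        if kind == "config":
            # A configuration change arrives in the same stream as messages.
            members = set(value)
        elif kind == "msg":
            has_majority = len(members) > cluster_size // 2
            (accepted if has_majority else rejected).append(value)
    return accepted, rejected


events = [
    ("config", ["p1", "p2", "p3"]),
    ("msg", "w1"),
    ("config", ["p1"]),            # partition: p1 is alone
    ("msg", "w2"),
    ("config", ["p1", "p2"]),      # merge: p2 rejoins
    ("msg", "w3"),
]
accepted, rejected = process_stream(events, cluster_size=3)
```

       Because every processor in a configuration sees the change at the same
       position in the stream, all of them switch behavior at the same
       message boundary.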


       The  EVS library is a thin IPC interface to the openais executive.  The
       openais executive provides services for the SA Forum AIS  libraries  as
       well as the EVS library.

       The  openais  executive uses a ring protocol and membership protocol to
       send messages according to the semantics required by  extended  virtual
       synchrony.   The ring protocol creates a virtual ring of processors.  A
       token is rotated around the ring of  processors.   When  the  token  is
       possessed  by  a  processor,  that  processor may multicast messages to
       other processors in the system.

       The token is called the ORF token (for ordering, reliability, flow
       control).  The ORF token orders all messages by increasing a sequence
       number every time a message is multicast.  In this way, an ordering
       is placed on all messages that all processors agree to.  The token also
       contains a retransmission list.  If a token is received by a processor
       that has not yet received a message it should have, the sequence
       number of the missing message is added to the retransmission list.  A
       processor that has a copy of the message then retransmits the message.
       The ORF token provides configuration-wide flow control by tracking the
       number of messages sent and limiting the number of messages that may
       be sent by one processor on each possession of the token.
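
       The three roles of the token can be modelled in a few lines.  This is
       a simplified sketch, not the openais implementation: the names Token,
       RingProcessor, token_visit, and max_per_visit are hypothetical,
       retransmissions are assumed to always arrive, and the flow control
       shown is only a fixed per-visit limit.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    seq: int = 0                                   # highest sequence number assigned
    retransmit: set = field(default_factory=set)   # sequence numbers someone is missing


class RingProcessor:
    def __init__(self, name, max_per_visit=2):
        self.name = name
        self.max_per_visit = max_per_visit   # flow control limit per token possession
        self.outbox = []                     # payloads waiting to be multicast
        self.received = {}                   # seq -> payload


def token_visit(proc, token, ring, lost=frozenset()):
    """One token possession. `lost` holds (receiver name, seq) pairs dropped in transit."""
    # Reliability: retransmit requested messages that this processor holds.
    for seq in sorted(token.retransmit):
        if seq in proc.received:
            for peer in ring:
                peer.received.setdefault(seq, proc.received[seq])
            token.retransmit.discard(seq)
    # Reliability: request any message this processor should have but lacks.
    for seq in range(1, token.seq + 1):
        if seq not in proc.received:
            token.retransmit.add(seq)
    # Ordering and flow control: stamp and multicast a limited number of messages.
    for _ in range(min(proc.max_per_visit, len(proc.outbox))):
        token.seq += 1
        payload = proc.outbox.pop(0)
        for peer in ring:
            if (peer.name, token.seq) not in lost:
                peer.received[token.seq] = payload
```

       Passing the same token to each processor in turn models one rotation;
       a processor that missed a message adds its sequence number on one
       visit, and a holder of that message repairs the gap on a later visit.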

       The membership protocol is responsible for ring formation and detecting
       when  a processor within a ring has failed.  If the token fails to make
       a rotation within a timeout period known as the token rotation timeout,
       the  membership  protocol  will  form  a  new ring.  If a new processor
       starts, it will also form a new ring.  Two or more  configurations  may
       be  used to form a new ring, allowing many partitions to merge together
       into one new configuration.
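
       The timeout rule and the merge case can be sketched as two small
       functions.  Both names and the way survivors are identified (a set of
       processors that still respond) are assumptions for illustration; the
       real membership protocol is considerably more involved.

```python
# Hypothetical helpers modelling ring reformation; not the openais algorithm.
def next_ring(ring, last_token_time, now, token_rotation_timeout, responding):
    """Return the ring to use after checking the token rotation timeout."""
    if now - last_token_time <= token_rotation_timeout:
        return list(ring)                      # token is still circulating
    # Timeout expired: form a new ring from the processors that still respond.
    return [p for p in ring if p in responding]


def merge_rings(*rings):
    """Form one new ring from two or more configurations."""
    merged = []
    for ring in rings:
        for processor in ring:
            if processor not in merged:
                merged.append(processor)
    return merged
```

       A newly started processor can be treated as a ring of one, so startup
       is handled by the same merge path as rejoining partitions.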


       The EVS library obtains 8.5 MB/sec throughput on 100 Mbit network
       links with many processors.  Larger messages obtain better throughput
       because the time to access Ethernet is about the same for a small
       message as it is for a larger message.  Smaller messages obtain a
       higher message rate (messages per second), because a small message
       still takes somewhat less time to send than a large one.
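
       To put the throughput figure in context, the arithmetic below compares
       it with the raw capacity of the link.  It assumes that MB means 10**6
       bytes and that "100 Mbit" means 10**8 bits per second; the 1000-byte
       message size is a hypothetical value chosen only for the example.

```python
LINK_BITS_PER_SEC = 100_000_000        # assumed meaning of "100 Mbit"
THROUGHPUT_BYTES_PER_SEC = 8_500_000   # assumed meaning of "8.5 MB/sec"

link_bytes_per_sec = LINK_BITS_PER_SEC / 8              # raw link capacity in bytes
utilization = THROUGHPUT_BYTES_PER_SEC / link_bytes_per_sec


def messages_per_second(message_bytes, throughput=THROUGHPUT_BYTES_PER_SEC):
    """Message rate if the given throughput were sustained at this message size."""
    return throughput / message_bytes


rate_for_1000_byte_messages = messages_per_second(1000)
```

       Under these assumptions the quoted figure is about two thirds of the
       raw link capacity.  The message rate is an upper bound only, since
       the text notes that small messages do not reach the same throughput
       as large ones.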

       80% of CPU utilization occurs because of encryption and authentication.
       openais can be built without encryption and authentication for those
       with no security requirements and low CPU utilization requirements.
       Even without encryption or authentication, under heavy load, processor
       utilization can reach 25% on 1.5 GHz processors.

       The current openais executive supports 16 processors.  Support for
       more processors is possible by changing defines in the openais
       executive, but this is untested.


       The  EVS  library encrypts all messages sent over the network using the
       SOBER-128 stream cipher.   The  EVS  library  uses  HMAC  and  SHA1  to
       authenticate  all messages.  The EVS library uses SOBER-128 as a pseudo
       random number generator.  The EVS library  feeds  the  PRNG  using  the
       /dev/random Linux device.
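
       The authentication step can be illustrated with HMAC and SHA1 from the
       Python standard library.  This sketch covers authentication only and
       does not perform SOBER-128 encryption.  The packet layout (digest
       followed by message), the key, and the function names are assumptions
       and do not describe the openais wire format.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha1().digest_size   # 20 bytes for SHA1


def sign(key, message):
    """Prepend an HMAC-SHA1 digest to the message (hypothetical layout)."""
    return hmac.new(key, message, hashlib.sha1).digest() + message


def verify(key, packet):
    """Return the message if its digest matches, otherwise raise ValueError."""
    digest, message = packet[:DIGEST_LEN], packet[DIGEST_LEN:]
    expected = hmac.new(key, message, hashlib.sha1).digest()
    # compare_digest avoids leaking the position of the first mismatch.
    if not hmac.compare_digest(digest, expected):
        raise ValueError("authentication failed")
    return message
```

       A receiver that shares the key accepts a packet only when the digest
       matches, so a message altered in transit or sent without the key is
       rejected.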


       This software is not yet production quality, so there may still be
       some bugs.  There appear to be very few, however, since no new bugs
       are being reported at this point.


       evs_initialize(3),   evs_finalize(3),  evs_fd_get(3),  evs_dispatch(3),
       evs_join(3),  evs_leave(3),  evs_mcast_joined(3),  evs_mcast_groups(3),