ragator.conf - ragator flow model definitions.

NAME

       ragator.conf - ragator flow model definitions.

COPYRIGHT

       Copyright (c) 2000-2002 QoSient. All rights reserved.

SYNOPSIS

       ragator.conf

DESCRIPTION

       Programs  that  perform  flexible  aggregation  of  argus data, such as
       ragator(1)  and  radium(8),  can  be  configured  to  aggregate   using
       arbitrary  flow  models.  This configuration file provides a syntax for
       flow matching and aggregation model assignments on a  per  flow  basis,
       allowing  for  highly flexible aggregation strategies on a single argus
       stream.

       The configuration file is structured as a set of initialization
       variables, followed by a collection of flow descriptors and model
       definitions.  The concept is that one identifies specific Argus Flow
       Activity Records through an Argus flow descriptor matching statement,
       and that statement names the aggregation model to apply to them.

OPTIONS

       The aggregation clients have a small number of options for  controlling
       specific aspects of aggregation function and output.

RAGATOR_MODEL_NAME

       Ragator configurations can be named.  This is important for ra*
       aggregation programs that support multiple concurrent models, so
       that you can tell them apart.  Naming is completely optional.

RAGATOR_REPORT_AGGREGATION

       Ragator, when it merges argus records together, adds a new aggregation
       metric to the resulting record.  The metric reports the number of
       records that were merged and provides additional statistical values
       giving record arrival rates and mean record durations.  By setting
       this option to "no", you can have ragator(1) omit this metric.  This
       is useful when creating full-duplex records from half-duplex merging
       operations.

       RAGATOR_REPORT_AGGREGATION=yes

RAGATOR_PRESERVE_FIELDS

       All aggregation clients have the ability to detect when a flow
       descriptor would not be modified during the aggregation process.  This
       is valuable information when attempting to discover trends.  However,
       some applications may want the resulting output to conform completely
       to the new flow definitions.  To force ragator(1)-like programs to
       convert flow descriptions to the new flow model descriptors, set this
       option to "no".

       RAGATOR_PRESERVE_FIELDS=yes

RAGATOR_AUTOCORRECTION

       When aggregating Argus records together, all aggregation clients have
       the ability to autocorrect the assignment of flow initiator and
       receiver.  This is important for processing Argus records derived from
       Cisco Netflow style flow monitors, and for Argus records that were
       generated by long lived flows that have extended idle periods.  Because
       it is possible for ra* aggregation clients to receive half-duplex flow
       records, or multiple flow records for the same long lived flow,
       autocorrecting the argus records allows the client aggregation process
       to match A -> B and B -> A records that belong to the same flow.

       With certain flow aggregation models, however, the autocorrection logic
       can cause aggregation errors.   As  a  result,  when  providing  custom
       aggregation models, autocorrection is disabled by default.

       If you would like to re-enable the autocorrection function, set this
       variable to "yes".  The default is:

       RAGATOR_AUTOCORRECTION=no

AGGREGATION CONFIGURATION

       Argus record flow descriptors  are  compared  to  the  flow  descriptor
       matching  statements in sequential, or "fall through", order, much like
       existing Access Control List definitions supported by routers, switches
       and firewalls.

       Each matching statement references a flow model that is used to modify
       the flow description of each Argus record it matches.  Records are
       aggregated based on the modified flow descriptor that results from
       applying the flow model referenced in the matching statement.

       Each flow descriptor matching statement contains a TimeOut period,
       which is how long the aggregator will hold the flow in its cache
       before reporting it, and an IdleTimeOut period, which is how long the
       aggregation process will hold the flow in its cache when there is no
       activity.
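
       The two timers can be pictured with a small Python sketch.  It is an
       illustration only: CacheEntry, due_for_report and is_idle are
       hypothetical names, and the sketch assumes that TimeOut is measured
       from the creation of the aggregate and IdleTimeOut from the last
       merged record, which this page does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    created: float    # time the aggregate was started, in seconds
    last_seen: float  # time the last record was merged in, in seconds


def due_for_report(entry, now, timeout):
    """True once the aggregate has been held for the TimeOut period."""
    return now - entry.created >= timeout


def is_idle(entry, now, idle_timeout):
    """True once no record has arrived for the IdleTimeOut period."""
    return now - entry.last_seen >= idle_timeout


# Hypothetical entry: created at t=0, last updated at t=100.
entry = CacheEntry(created=0.0, last_seen=100.0)
print(due_for_report(entry, now=21600.0, timeout=21600))
print(is_idle(entry, now=43299.0, idle_timeout=43200))
```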

       If a record doesn’t match any statement in the configuration,  then  it
       is aggregated based on its unmodified flow descriptor.  This aggregates
       flow reports from the same long lived flow.
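
       The "fall through" matching described above can be sketched in Python
       as follows.  This is a simplified illustration, not ragator's
       implementation: the dictionary layout and the first_match name are
       assumptions, and address matching is left out for brevity.

```python
def first_match(record, flows):
    """Return the first flow statement whose fields all match the record.

    A field set to "*" matches any value.  Returns None when no statement
    matches, in which case the record keeps its unmodified flow descriptor.
    """
    for flow in flows:
        if all(want == "*" or record.get(key) == want
               for key, want in flow["match"].items()):
            return flow
    return None


# Hypothetical, reduced statements; order in the list is the match order.
flows = [
    {"id": 100, "match": {"proto": "icmp", "dport": "*"}, "model": 300},
    {"id": 104, "match": {"proto": "tcp", "dport": 80}, "model": 210},
    {"id": 105, "match": {"proto": "*", "dport": "*"}, "model": 241},
]

hit = first_match({"proto": "tcp", "dport": 80}, flows)
print(hit["id"], hit["model"])  # 104 210
```

       Because the first matching statement wins, a catch-all statement with
       every field wildcarded only makes sense as the last entry.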

ARGUS FLOW DESCRIPTOR MATCHING STATEMENT

       An Argus flow matching statement specifies values for the network
       protocol, the network src and dst addresses, the transport protocol,
       and, for TCP and UDP, the src and dst port numbers.

       The supported network protocol is "ip", which represents IPv4.  This
       field specifies the type of the other fields in the flow descriptor.
       Support for arp, dhcp and ipv6 is expected soon.

       The address fields can be names, dot ’.’ notation IPv4 addresses, or
       CIDR addresses, which are partial dot ’.’ notation addresses with a
       significant field indicator, using either the ’:’ or ’/’ separator.

       The Proto field can be any valid IP protocol number, or any of the
       keywords found in the /etc/protocols file.  For systems that do not
       support /etc/protocols, ragator(1) understands the ’tcp’, ’udp’,
       ’icmp’, and ’igmp’ tokens on its own.

       Port values for ’tcp’ and ’udp’ flows can be any valid keyword in the
       /etc/services file, or actual port numbers, which are 16 bit values
       between 0 and 65535 (0xFFFF).

       When the protocol is ’icmp’, the values after the Proto field are valid
       ICMP type and code values.  Valid icmp types are:

              echo
              unreach
              srcquench
              redirect
              timexed
              timestamp
              info
              address

       Numbers can be specified in decimal or as hex with the 0x prefix.
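
       As an illustration of the numeric forms accepted for ports, the
       following Python sketch parses a decimal or 0x-prefixed token and
       checks it against the 16 bit port range.  The parse_number name and
       its error handling are assumptions made for this example; they do not
       describe ragator's own parser.

```python
def parse_number(token, maximum=0xFFFF):
    """Parse a decimal or 0x-prefixed hex token and range-check it."""
    text = token.strip()
    if text.lower().startswith("0x"):
        value = int(text[2:], 16)   # hex form, e.g. "0x50"
    else:
        value = int(text, 10)       # decimal form, e.g. "80"
    if not 0 <= value <= maximum:
        raise ValueError(f"{token!r} is outside 0..{maximum}")
    return value


print(parse_number("80"), parse_number("0x50"))  # 80 80
```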

ARGUS AGGREGATION MODEL SPECIFIERS

       Argus flow matching statements reference a specific  aggregation  model
       specifier,  which  describes  how  the flow descriptor will be modified
       prior to aggregation.  This entry  in  the  aggregation  configuration,
       specifies what values will be preserved in the flow descriptor, and how
       they should be modified.

       When dealing with IP flows, the source and destination  address  fields
       can be modified using mask descriptors.  Protocol values and source and
       destination ports, however, are simply retained, by  specifying  "yes",
       or discarded, by specifying "no".

       There can be any number of aggregation model specifiers, but each must
       have a unique Model id number.
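
       The effect of a model specifier on a flow descriptor can be sketched
       in Python.  The sketch assumes that a mask descriptor is applied as a
       bitwise AND and that a discarded field simply drops out of the
       aggregation key; apply_mask, apply_model and the dictionary keys are
       hypothetical names used only for illustration.

```python
import ipaddress


def apply_mask(addr, mask):
    """AND a dotted-quad IPv4 address with a dotted-quad mask."""
    a = int(ipaddress.IPv4Address(addr))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(a & m))


def apply_model(record, model):
    """Return the aggregation key for a record under a model."""
    return (
        apply_mask(record["saddr"], model["saddr_mask"]),
        apply_mask(record["daddr"], model["daddr_mask"]),
        record["proto"] if model["proto"] else None,  # "yes" keeps the field
        record["sport"] if model["sport"] else None,  # "no" discards it
        record["dport"] if model["dport"] else None,
    )


# Full address masks, source port discarded.
model = {"saddr_mask": "255.255.255.255", "daddr_mask": "255.255.255.255",
         "proto": True, "sport": False, "dport": True}

a = {"saddr": "10.0.0.5", "daddr": "10.0.0.9",
     "proto": "tcp", "sport": 40001, "dport": 80}
b = dict(a, sport=40002)  # same hosts and service, different source port

print(apply_model(a, model) == apply_model(b, model))  # True
```

       Two records that produce the same key are merged into one aggregate,
       which is why discarding the source port lets separate connections to
       the same service collapse together.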

EXAMPLE

       Here is a configuration that aggregates and reports individual
       transactions every 6 hours (21600 seconds), but "forgets" each
       transaction if it has been idle for a full 12 hours (43200 seconds).

 #label id    SAddr DAddr Proto  SPort  DPort Model  Duration  Idle
 Flow   100 ip  *     *     *      *      *    200     21600   43200

 #label  id      SAddrMask         DAddrMask      Proto  SPort  DPort
 Model  200 ip 255.255.255.255  255.255.255.255    yes    yes    yes

       The Flow descriptor matching statement 100 matches all Argus records,
       because all the flow descriptor fields are wildcarded, using ’*’.  Each
       record will be modified using the Model 200 definition, which preserves
       all fields, and the resulting aggregate will be held for 21600 seconds,
       at which time it will be reported.

       While this type of configuration is not likely to aggregate many
       records, it will be very good at aggregating long lived single flows,
       such as persistent ping sessions between hosts, which can generate a
       lot of activity data.  Since this may not be what you are really after,
       we’ll present a more complex example.

 #label id     SAddr DAddr Proto  SPort  DPort Model  Duration  Idle
 Flow   100 ip   *     *    icmp   echo    *    300     21600   43200
 Flow   102 ip 10:24 10:24   tcp    *      80   201       300   300
 Flow   103 ip 10:24   *     tcp    *      80   230       300   300
 Flow   104 ip   *     *     tcp    *      80   210       300   300
 Flow   101 ip   *     *     udp    *    domain 201      3600   300
 Flow   105 ip   *     *      *     *      *    241       120   300

 #TCP and UDP Flow Model Definitions
 #label  id        SAddrMask        DAddrMask      Proto  SPort  DPort
 Model  201 ip  255.255.255.255  255.255.255.255    yes     no    yes
 Model  210 ip  255.255.255.255  255.255.255.252    yes     no    yes
 Model  230 ip  255.0.0.0        255.255.255.255    yes     no    yes
 Model  241 ip  0.0.0.0          0.0.0.0            yes     no    yes

 # ICMP Flow Model Definitions
 #label  id        SAddrMask        DAddrMask      Proto  Type   Code
 Model  300 ip  255.255.255.255  255.255.255.255    yes    yes    yes

       Argus records are matched in the order in which the statements appear,
       so each Argus record is tested against flow 100, then 102, 103, 104
       and 101, and finally 105.  Flow Id numbers are used to report syntax
       errors in the configuration; they do not determine the matching order,
       and they do not have to be unique.

       This configuration is designed to track pings, the clients of tcp
       services and the servers of udp based DNS services.  All other traffic
       is accounted for either by protocol or lumped together.  Although not a
       particularly useful configuration, it is an example of how to
       structure your aggregation.

       Flow 100 matches all icmp echo (ping) transactions, and indicates that
       ragator should use Model 300 to aggregate the ping transactions.  The
       aggregate should be held for 21600 seconds (6 hours) and then
       reported.

       Model 300 is designed to aggregate ICMP transactions without
       modification.  The result will be that ragator(1) will aggregate only
       echo transactions between the same machines.  This is very useful for
       tracking generic connectivity failure between two machines that are
       pinging one another.

       Flow 102 matches all destination port 80 tcp connections where the
       servers and clients are both in the 10 network, and aggregates them
       based on Model 201, holding the aggregate for 5 minutes and then
       reporting it.  This is an example of an aggregation scheme that will
       report on HTTP sessions (clumps of TCP connections that occur in a
       short time range) between individual clients and their servers.

       Flow 103 then matches all destination port 80 tcp connections where the
       clients are from the 10 network.  Model 230  will  track  the  class  A
       address of the clients (net 10) and keep track of the individual remote
       servers.

       Flow 104 then matches all of the rest of the port 80 tcp connections,
       which should be connections into net 10 from non-net 10 clients.
       Model 210 is designed to track the traffic to a set of 4 load balanced
       HTTP servers.  Because it tracks the clients of services, the src
       address goes unmodified (255.255.255.255), but the server (dst)
       address is masked with 255.255.255.252, which clears the last 2 bits
       of the address so that 4 adjacent addresses map to one value.  The
       protocol value and the dst port (in this case the service port) are
       preserved, but the src port is removed, so that the individual TCP
       connections can be merged.
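
       The arithmetic behind the 255.255.255.252 mask can be checked with a
       short Python sketch.  The server addresses are made-up documentation
       addresses, and the bitwise AND is an assumption about how the mask is
       applied.

```python
import ipaddress

MASK = int(ipaddress.IPv4Address("255.255.255.252"))


def masked(addr):
    """Clear the last 2 bits of an IPv4 address."""
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(addr)) & MASK))


# Hypothetical servers: the first four share one masked value.
servers = ["192.0.2.8", "192.0.2.9", "192.0.2.10", "192.0.2.11", "192.0.2.12"]
for server in servers:
    print(server, "->", masked(server))
# 192.0.2.8 -> 192.0.2.8
# 192.0.2.9 -> 192.0.2.8
# 192.0.2.10 -> 192.0.2.8
# 192.0.2.11 -> 192.0.2.8
# 192.0.2.12 -> 192.0.2.12
```

       The grouping only lines up with a set of 4 servers when their
       addresses start on a multiple of 4 in the last octet.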

       Flow 101 tracks udp based DNS transactions, aggregating them based on
       Model 201 and holding the aggregate for an hour (3600 secs).  This
       strategy reports the aggregate DNS transactions between each client and
       server pair.  To do this, the model preserves everything except the
       source port, which changes on each DNS request.

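       All other traffic matches Flow 105, which aggregates it based on
       Model 241 and, with the Duration of 120 shown above, reports it every
       2 minutes.  Model 241 masks out both addresses (0.0.0.0) and discards
       the source port, so it generates Argus records that carry byte and
       packet counts broken down by protocol (and, with DPort set to "yes" as
       shown, by destination port) but that do not report addresses.  This
       can be useful for a coarse accounting of the remaining traffic.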