NAME

       pmlogextract  -  reduce, extract, concatenate and merge Performance Co-
       Pilot archives

SYNOPSIS

       $PCP_BINADM_DIR/pmlogextract [-dfwz] [-c configfile] [-n pmnsfile]  [-S
       starttime]  [-s  samples]  [-T  endtime]  [-v volsamples] [-Z timezone]
       input [...] output

DESCRIPTION

       pmlogextract reads one or more Performance Co-Pilot (PCP) archive  logs
       identified  by input and creates a temporally merged and/or reduced PCP
       archive log in output.  The nature of  merging  is  controlled  by  the
       number  of  input  archive  logs, while the nature of data reduction is
       controlled by the command line arguments.  The  input(s)  must  be  PCP
       archive  logs  created  by  pmlogger(1) with performance data collected
       from the same  host,  but  usually  over  different  time  periods  and
       possibly  (although  not  usually)  with  different performance metrics
       being logged.

       If only one input is specified, then the default behavior simply copies
       the input PCP archive log into the output PCP archive log.  When two
       or more PCP archive logs are specified as input, the  logs  are  merged
       (or concatenated) and written to output.
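
       As a rough illustration of how such invocations are laid out, the
       sketch below builds an argument list following the SYNOPSIS: options
       first, then the input archives, then the output archive.  It does not
       run anything.  The function name, its parameters and the archive names
       used in the test call are hypothetical; only the option letters come
       from this page.

```python
import os


def pmlogextract_argv(inputs, output, start=None, end=None,
                      daily_window=False, samples=None, configfile=None,
                      binadm_dir=None):
    """Build an argument list laid out as in the SYNOPSIS (nothing is run)."""
    if not inputs:
        raise ValueError("at least one input archive is required")
    # Use $PCP_BINADM_DIR when it is set; otherwise fall back to the bare
    # command name.  This fallback is an assumption of the sketch.
    if binadm_dir is None:
        binadm_dir = os.environ.get("PCP_BINADM_DIR", "")
    argv = [os.path.join(binadm_dir, "pmlogextract")]
    if daily_window:
        argv.append("-w")
    if configfile is not None:
        argv += ["-c", configfile]
    if start is not None:
        argv += ["-S", start]
    if samples is not None:
        argv += ["-s", str(samples)]
    if end is not None:
        argv += ["-T", end]
    # Inputs come after the options; the output archive is always last.
    return argv + list(inputs) + [output]
```

       With one entry in inputs the resulting command copies that archive;
       with two or more it merges them into output.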

       In  the output archive log a ‘‘mark’’ record will be inserted at a time
       just past the end of each of the  input  archive  logs  to  indicate  a
       possible  temporal  discontinuity  between the end of one input archive
       log and the start of the next input archive log.  See the MARK  RECORDS
       section  below for more information.  There is no ‘‘mark’’ record after
       the end of the last (in temporal order) of the input archive logs.
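
       The placement of ``mark'' records described above can be sketched as
       follows.  This is a simplified model, not the pmlogextract
       implementation: records are plain (timestamp, payload) tuples with
       integer millisecond timestamps, each input is assumed to be sorted
       already, and the one millisecond offset of the mark is an assumption
       taken from the example in the MARK RECORDS section.

```python
import heapq

MARK = "<mark record>"


def merge_archives(archives, mark_gap_ms=1):
    """Merge sorted lists of (timestamp_ms, payload) records in time order.

    A mark is placed mark_gap_ms past the end of every input archive except
    the one that ends last.
    """
    archives = [a for a in archives if a]
    if not archives:
        return []
    last_end = max(a[-1][0] for a in archives)
    streams = []
    for records in archives:
        stream = list(records)
        end = records[-1][0]
        if end < last_end:
            # Not the temporally last archive: flag a possible discontinuity.
            stream.append((end + mark_gap_ms, MARK))
        streams.append(stream)
    # heapq.merge performs a lazy k-way merge of the sorted streams.
    return list(heapq.merge(*streams, key=lambda rec: rec[0]))
```

       Only the relative ordering matters here; the payloads stand in for
       whatever performance data each archive record carries.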

COMMAND LINE OPTIONS

       The command line options for pmlogextract are as follows:

       -c configfile
              Extract only the metrics specified in configfile from the  input
              PCP   archive   log(s).    The  configfile  syntax  accepted  by
              pmlogextract is explained in more detail  in  the  Configuration
              File Syntax section.

       -d     Desperate  mode.  Normally if a fatal error occurs, all trace of
              the partially written PCP archive output is removed.   With  the
              -d option, the output archive log is not removed.

       -f     For  most  common  uses, all of the input archive logs will have
              been collected in the same timezone.  But if  this  is  not  the
              case,  then  pmlogextract  must choose one of the timezones from
              the input archive logs to be used as the timezone for the output
              archive  log.   The default is to use the timezone from the last
              input archive log.  The -f option forces the timezone  from  the
              first input archive log to be used.

       -n pmnsfile
              Normally pmlogextract operates on the Performance Metrics Name
              Space (PMNS) from input; however, if the -n option is specified,
              an alternative local PMNS is loaded from the file pmnsfile.

       -S starttime
              Define  the  start  of  a  time  window  to restrict the samples
              retrieved or specify  a  ‘‘natural’’  alignment  of  the  output
              sample times; refer to PCPIntro(1).  See also the -w option.

       -s samples
              The argument samples defines the number of samples to be written
              to output.  If samples is 0 or -s is not specified, pmlogextract
              will sample until the end of the PCP archive log, or the end of
              the time window as specified by -T, whichever comes first.  If
              the -s limit is reached before endtime, it overrides the -T
              option.

       -T endtime
              Define  the termination of a time window to restrict the samples
              retrieved or specify  a  ‘‘natural’’  alignment  of  the  output
              sample times; refer to PCPIntro(1).  See also the -w option.

       -v volsamples
              The output archive log is potentially a multi-volume data set,
              and the -v option causes pmlogextract to start a new volume
              after volsamples log records have been written to the archive
              log.

       -w     Where  -S  and -T specify a time window within the same day, the
              -w flag will cause  the  data  within  the  time  window  to  be
              extracted,  for  every day in the archive log.  For example, the
              options -w -S @11:00 -T @15:00 specify that pmlogextract  should
              include  archive  log  records only for the periods from 11am to
              3pm on each day.  When -w is used, the output archive  log  will
              contain  ‘‘mark’’ records to indicate the temporal discontinuity
              between the end of one time window and the start of the next.

       -Z timezone
              Use timezone when displaying the date and time.  Timezone is  in
              the  format  of  the  environment  variable  TZ  as described in
              environ(5).

       -z     Use the local timezone of the host from the input archive  logs.
              The  default is to initially use the timezone of the local host.
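
       The effect of -w together with -S and -T can be approximated by the
       sketch below, which keeps only records whose time of day falls inside
       the window and inserts a ``mark'' between the windows of different
       days.  It is a model under stated assumptions, not the tool's own
       logic: records are (datetime, payload) tuples in time order, the
       window is inclusive at both ends, and the mark is stamped one
       millisecond after the last record kept from the previous day.

```python
from datetime import datetime, time, timedelta

MARK = "<mark>"


def extract_daily_windows(records, start, end):
    """Keep records with start <= time of day <= end, marking day changes."""
    kept = []
    prev_ts = None
    for ts, payload in records:
        if not (start <= ts.time() <= end):
            continue  # outside the daily window
        if prev_ts is not None and ts.date() != prev_ts.date():
            # A new day's window begins: record the temporal discontinuity.
            kept.append((prev_ts + timedelta(milliseconds=1), MARK))
        kept.append((ts, payload))
        prev_ts = ts
    return kept
```

       Calling it with time(11, 0) and time(15, 0) mirrors the
       -w -S @11:00 -T @15:00 example given for the -w option.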

CONFIGURATION FILE SYNTAX

       The configfile contains metrics  of  interest,  listed  one  per  line.
       Instances may also be specified, but they are optional.  The format for
       each metric name is

               metric [[instance[,instance...]]]

       where metric may be a leaf  or  a  non-leaf  node  in  the  Performance
       Metrics  Namespace  (PMNS,  see pmns(4)).  If a metric refers to a non-
       leaf node in the PMNS, pmlogextract will recursively descend  the  PMNS
       and  include  all  metrics  corresponding  to  descendent  leaf  nodes.
       Instances are optional, and may be specified as a list of one or more
       space (or comma) separated names, numbers or strings.  Elements in the
       list that are numbers are assumed to be internal instance identifiers;
       other elements are assumed to be external instance names - see
       pmGetInDom(3) for more information.  If no instances are given,
       then the logging specification is  applied  to  all  instances  of  the
       associated metric(s).

CONFIGURATION FILE EXAMPLE

       This is an example of a valid configfile:

               #
               # config file for pmlogextract
               #

               kernel.all.cpu
               kernel.percpu.cpu.sys ["cpu0","cpu1"]
               disk.dev ["dks0d1"]
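
       For scripts that need to inspect a configfile, the syntax above can be
       read with a small parser such as the sketch below.  It is an
       approximation written for this page, not the grammar pmlogextract
       itself uses: it assumes one metric per line, treats lines starting
       with # as comments, accepts only double-quoted strings, and returns
       numeric elements as integers so that a caller can tell instance
       identifiers from names.  The function names are hypothetical.

```python
import re

# metric name, optionally followed by a bracketed instance list
_LINE = re.compile(r'([^\s\[\]]+)\s*(?:\[(.*)\])?')
# either a double-quoted string or a bare name/number
_INSTANCE = re.compile(r'"([^"]*)"|([^\s,"]+)')


def parse_config_line(line):
    """Return (metric, instances), or None for a blank or comment line."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    match = _LINE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse configfile line: {line!r}")
    metric, inst_text = match.groups()
    instances = []
    for inst in _INSTANCE.finditer(inst_text or ""):
        quoted, bare = inst.groups()
        if quoted is not None:
            instances.append(quoted)
        elif bare.isdigit():
            instances.append(int(bare))  # numeric instance identifier
        else:
            instances.append(bare)
    # An empty list means "all instances of the metric(s)".
    return metric, instances


def parse_config(text):
    """Parse a whole configfile, skipping blank lines and comments."""
    parsed = (parse_config_line(line) for line in text.splitlines())
    return [entry for entry in parsed if entry is not None]
```

       Applied to the example configfile, this yields three entries, the
       first of which has an empty instance list.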

MARK RECORDS

       When  more  than  one input archive log contributes performance data to
       the output archive log, then ‘‘mark’’ records are inserted to  indicate
       a possible discontinuity in the performance data.

       A  ‘‘mark’’  record contains a timestamp and no performance data and is
       used to indicate that there is a time period in  the  PCP  archive  log
       where  we  do  not  know the values of any performance metrics, because
       there was  no  pmlogger(1)  collecting  performance  data  during  this
       period.  Since these periods are often associated with the restart of a
       service or pmcd(1) or a system, there may be considerable doubt  as  to
       the continuity of performance data across this time period.

       The  rationale  behind  ‘‘mark’’  records  may  be demonstrated with an
       example.  Consider one input archive log that starts at 00:10 and  ends
       at  09:15 on the same day, and another input archive log that starts at
       09:20 on the same day and ends at 00:10 the following morning.  This
       would be a very common case for archives managed and rotated by
       pmlogger_check(1) and pmlogger_daily(1).

       The output archive log would contain:
       00:10.000   first record from first input archive log
       ...
       09:15.000   last record from first input archive log
       09:15.001   <mark record>
       09:20.000   first record from second input archive log
       ...
       00:10.000   last record from second input archive log

       The time period where the performance data is missing starts just after
       09:15  and  ends  just  before  09:20.   When the output archive log is
       processed with any of the PCP reporting tools, the ‘‘mark’’  record  is
       used to indicate a period of missing data.  For example, in the archive
       above, if one were reporting the average I/O rate at 30 minute
       intervals, aligned on the hour, then there would be data for the
       intervals ending at 09:00 and 10:00 but no data reported for the
       interval ending at 09:30, as this spans a ‘‘mark’’ record.
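
       The interval example above can be checked with a short sketch.  It is
       a deliberately simplified model of how a reporting tool might treat
       marks, not the behaviour of any particular PCP tool: times are integer
       milliseconds since midnight, and an interval is considered to have
       data only if no mark falls after its start and at or before its end.
       The helper names are hypothetical.

```python
def hm(hours, minutes):
    """Convert a time of day to milliseconds since midnight."""
    return (hours * 60 + minutes) * 60 * 1000


def report_intervals(first_end, last_end, step, marks):
    """Return (interval_end, has_data) for each interval of length step."""
    results = []
    interval_end = first_end
    while interval_end <= last_end:
        interval_start = interval_end - step
        # An interval that spans a mark cannot be reported.
        spans_mark = any(interval_start < m <= interval_end for m in marks)
        results.append((interval_end, not spans_mark))
        interval_end += step
    return results
```

       With a mark just after 09:15 and 30 minute intervals ending at 09:00,
       09:30 and 10:00, only the 09:30 interval is flagged as having no data.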

       The  presence  of  ‘‘mark’’  records  in  a  PCP  archive  log  can  be
       established using pmdumplog(1) where a  timestamp  and  the  annotation
       <mark> is used to indicate a ‘‘mark’’ record.

FILES

       For  each  of the input and output archive logs, several physical files
       are used.
       archive.meta
                 metadata (metric descriptions, instance  domains,  etc.)  for
                 the archive log
       archive.0 initial  volume  of  metrics  values (subsequent volumes have
                 suffixes 1, 2, ...)
       archive.index
                 temporal index to support rapid random access  to  the  other
                 files in the archive log.
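
       A script that handles archives may want to confirm that all of these
       files are present before calling pmlogextract.  The sketch below sorts
       a list of file names into the three kinds listed above for a given
       archive base name.  The function name and the shape of its return
       value are hypothetical, and it considers only the plain suffixes
       described in this section.

```python
import re


def classify_archive_files(base, names):
    """Report which physical files of archive `base` appear in names."""
    names = set(names)
    volume_re = re.compile(re.escape(base) + r"\.(\d+)")
    volumes = []
    for name in names:
        match = volume_re.fullmatch(name)
        if match:
            volumes.append(int(match.group(1)))  # data volume number
    return {
        "meta": f"{base}.meta" in names,
        "index": f"{base}.index" in names,
        "volumes": sorted(volumes),
    }
```

       The names would typically come from a directory listing; files that
       belong to other archives in the same directory are ignored.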

PCP ENVIRONMENT

       Environment variables with the prefix PCP_ are used to parameterize the
       file and directory names used by PCP.  On each installation,  the  file
       /etc/pcp.conf  contains  the  local  values  for  these variables.  The
       $PCP_CONF variable may be used to specify an alternative  configuration
       file, as described in pcp.conf(4).

SEE ALSO

       PCPIntro(1),   pmdumplog(1),   pmlc(1),   pmlogger(1),  pmlogreduce(1),
       pcp.conf(4) and pcp.env(4).

DIAGNOSTICS

       All error conditions detected by pmlogextract are reported on stderr
       with a textual (if sometimes terse) explanation.

       Should  one  of the input archive logs be corrupted (this can happen if
       the pmlogger instance writing the log suddenly dies), then pmlogextract
       will  detect and report the position of the corruption in the file, and
       any subsequent information from that archive log will not be processed.

       If  any  error  is  detected,  pmlogextract  will  exit with a non-zero
       status.

CAVEATS

       The preamble metrics  (pmcd.pmlogger.archive,  pmcd.pmlogger.host,  and
       pmcd.pmlogger.port),  which  are  automatically recorded by pmlogger at
       the start of the archive, may not be present in the archive  output  by
       pmlogextract.   These  metrics  are  only relevant while the archive is
       being created, and have no significance once recording has finished.