pmcd - performance metrics collector daemon

NAME

       pmcd - performance metrics collector daemon

SYNOPSIS

       pmcd  [-f]  [-i  ipaddress]  [-l  logfile] [-L bytes] [-n pmnsfile] [-p
       port[,port ...]] [-q timeout] [-T traceflag] [-t timeout] [-x file]

DESCRIPTION

       pmcd  is  the  collector  used  by  the   Performance   Co-Pilot   (see
       PCPIntro(1))  to  gather  performance  metrics on a system.  As a rule,
       there must be  an  instance  of  pmcd  running  on  a  system  for  any
       performance metrics to be available to the PCP.

       pmcd accepts connections from client applications running either on the
       same machine or remotely and  provides  them  with  metrics  and  other
       related  information  from the machine that pmcd is executing on.  pmcd
       delegates most of this request servicing to a collection of Performance
       Metrics Domain Agents (or just agents), where each agent is responsible
       for a particular group of metrics, known as the domain  of  the  agent.
       For  example the environ agent is responsible for reporting information
       relating to the environment of a Challenge system, such as the  cabinet
       temperature and voltage levels of the power supply.

       The  agents  may be processes started by pmcd, independent processes or
       Dynamic Shared Objects (DSOs, see dso(5)) attached  to  pmcd’s  address
       space.   The  configuration  section below describes how connections to
       agents are specified.

       The options to pmcd are as follows.

       -f     By default pmcd is started as a daemon.  The -f option indicates
              that  it should run in the foreground.  This is most useful when
              trying to diagnose problems with misbehaving agents.

       -i ipaddress
              This option is usually only used on hosts  with  more  than  one
              network  interface.  If no -i options are specified pmcd accepts
              connections made to any of its  host’s  IP  (Internet  Protocol)
              addresses.   The  -i  option is used to specify explicitly an IP
              address that  connections  should  be  accepted  on.   ipaddress
              should  be  in the standard dotted form (e.g. 100.23.45.6).  The
              -i option may be used multiple times to  define  a  list  of  IP
              addresses.   Connections made to any other IP addresses the host
              has will be refused.  This can be used to limit  connections  to
              one  network  interface if the host is a network gateway.  It is
              also useful if the host takes over the  IP  address  of  another
              host  that has failed.  In such a situation only the standard IP
              addresses of the host should be given (not  the  ones  inherited
              from   the  failed  host).   This  allows  PCP  applications  to
              determine that a host has failed, rather than connecting to  the
              host that has assumed the identity of the failed host.

       -l logfile
              By default a log file named pmcd.log is written in the directory
              $PCP_LOG_DIR/pmcd.  The -l option causes  the  log  file  to  be
              written  to  logfile  instead  of  the default.  If the log file
              cannot be created or is not writable, output is written  to  the
              standard error instead.

       -L bytes
              PDUs  received by pmcd from monitoring clients are restricted to
              a maximum size of 65536  bytes  by  default  to  defend  against
              Denial  of Service attacks.  The -L option may be used to change
              the maximum incoming PDU size.

       -n pmnsfile
              Normally pmcd loads the default Performance Metrics  Name  Space
              (PMNS)  from  $PCP_VAR_DIR/pmns/root.   If  the  -n  option   is
              specified, an alternative namespace is loaded from the file
              pmnsfile instead.

       -q timeout
              The  pmcd  to  agent version exchange protocol (new in PCP 2.0 -
              introduced to provide backward compatibility) uses this  timeout
              to  specify  how  long  pmcd should wait before assuming that no
              version response is coming from an agent.  If  this  timeout  is
              reached,  the  agent  is  assumed  to be an agent which does not
              understand the PCP 2.0 protocol.  The default  timeout  interval
              is five seconds, but the -q option allows an alternative timeout
              interval (which must be greater than zero) to be specified.  The
              unit of time is seconds.

       -t timeout
              To   prevent   misbehaving   agents   from  hanging  the  entire
              Performance Metrics Collection System (PMCS), pmcd uses timeouts
              on  PDU  exchanges with agents running as processes.  By default
              the timeout interval is five seconds.  The -t option  allows  an
              alternative  timeout  interval  in  seconds to be specified.  If
              timeout  is  zero,  timeouts  are  turned  off.   It  is  almost
              impossible  to use the debugger interactively on an agent unless
              timeouts have been turned off for its "parent" pmcd.

              Once pmcd is running, the timeout may be dynamically modified by
              storing  an  integer  value  (the  timeout  in seconds) into the
              metric pmcd.control.timeout via pmstore(1).

       -T traceflag
              To assist with error diagnosis for agents and/or clients of pmcd
              that  are  not  behaving  correctly,  an  internal event tracing
              mechanism is supported within pmcd.  The value of  traceflag  is
              interpreted as a bit field with the following control functions:

              1   enable client connection tracing
              2   enable PDU tracing
              256 unbuffered event tracing

              By default, event tracing is buffered using  a  circular  buffer
              that  is  overwritten  as  new events are recorded.  The default
              buffer size holds the last 20 events, although this  number  may
              be   overridden   by  using  pmstore(1)  to  modify  the  metric
              pmcd.control.tracebufs.

              Similarly once pmcd is running, the event tracing control may be
              dynamically  modified  by storing 1 (enable) or 0 (disable) into
              the metrics  pmcd.control.traceconn,  pmcd.control.tracepdu  and
              pmcd.control.tracenobuf.   These  metrics  map to the bit fields
              associated with the traceflag argument for the -T option.

              When operating in buffered mode, the event trace buffer will  be
              dumped  whenever  an  agent connection is terminated by pmcd, or
              when any value is stored into the metric  pmcd.control.dumptrace
              via pmstore(1).

              In unbuffered mode, every event will be reported when it occurs.

       -x file
              Before the pmcd logfile can be  opened,  pmcd  may  encounter  a
              fatal  error  which  prevents it from starting.  By default, the
              output describing this error is sent to /dev/tty, but it may  be
              redirected to file.
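
       The relationship between the -T bit field and the pmcd.control trace
       metrics described above can be sketched in shell.  This is only an
       illustration: the helper name trace_cmds is hypothetical, and it
       merely prints the pmstore(1) commands that would correspond to a
       given traceflag value; it does not run them.

```shell
# trace_cmds is a hypothetical helper, not part of PCP.
# It decodes a -T traceflag value into the equivalent pmstore(1) commands,
# using the bit values 1, 2 and 256 listed for the -T option.
trace_cmds() {
    traceflag=$1
    for pair in 1:traceconn 2:tracepdu 256:tracenobuf; do
        bit=${pair%%:*}
        metric=pmcd.control.${pair#*:}
        if [ $((traceflag & bit)) -ne 0 ]; then value=1; else value=0; fi
        echo "pmstore $metric $value"
    done
}

# 1 (client connections) + 256 (unbuffered) = 257, as in "pmcd -T 257"
trace_cmds $((1 | 256))
```

       Combining flags with a bitwise OR keeps each tracing function
       independent, which mirrors the way the three metrics can be set
       separately once pmcd is running.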

       If  a  PDU exchange with an agent times out, the agent has violated the
       requirement that it delivers metrics with little or no delay.  This  is
       deemed a protocol failure and the agent is disconnected from pmcd.  Any
       subsequent requests for information from the agent  will  fail  with  a
       status indicating that there is no agent to provide it.

       It  is  possible  to  specify  host-level access control to pmcd.  This
       allows one to prevent users  from  certain  hosts  from  accessing  the
       metrics provided by pmcd and is described in  more  detail  in  the
       ACCESS CONTROL CONFIGURATION section below.

CONFIGURATION

       On   startup   pmcd   looks   for   a    configuration    file    named
       $PCP_PMCDCONF_PATH.   This  file  specifies  which  agents  cover which
       performance metrics domains and how pmcd should make contact  with  the
       agents.   An optional section specifying host-based access controls may
       follow the agent configuration data.

       Warning: pmcd is usually started as part of the boot sequence and  runs
       as  root.   The configuration file may contain shell commands to create
       agents, which will be executed by root.  To prevent  security  breaches
       the  configuration  file  should  be writable only by root.  The use of
       absolute path names is also recommended.

       The  case  of  the  reserved  words  in  the  configuration   file   is
       unimportant, but elsewhere, the case is preserved.

       Blank  lines  and  comments  are  permitted  (even  encouraged)  in the
       configuration file.  A  comment  begins  with  a  ‘‘#’’  character  and
       finishes  at  the end of the line.  A line may be continued by ensuring
       that the last character on the line is a ‘‘\’’ (backslash).  A  comment
       on  a continued line ends at the end of the continued line.  Spaces may
       be included in lexical elements by  enclosing  the  entire  element  in
       double  quotes  (there  must be whitespace before the opening and after
       the closing quote).  A double quote preceded by a backslash is always a
       literal  double  quote.   A  ‘‘#’’  in  double  quotes or preceded by a
       backslash is treated literally rather  than  as  a  comment  delimiter.
       Lexical  elements and separators are described further in the following
       sections.

AGENT CONFIGURATION

       Each line of the agent configuration section of the configuration  file
       contains  details  of  how  to  connect  pmcd  to one of its agents and
       specifies which metrics domain the agent deals with.  An agent  may  be
       attached as a DSO, or via a socket, or a pair of pipes.

       Each  line of the agent configuration section of the configuration file
       must be either an agent specification, a  comment,  or  a  blank  line.
       Lexical  elements  are  separated  by  whitespace characters, however a
       single agent specification may not be broken across lines  unless  a  \
       (backslash) is used to continue the line.

       Each  agent  specification  must  start  with  a textual label (string)
       followed by an integer in the range 1 to 510.  The label is a tag  used
       to  refer  to  the agent and the integer specifies the domain for which
       the agent supplies data.  This domain  identifier  corresponds  to  the
       domain portion of the PMIDs handled by the agent.  Each agent must have
       a unique label and domain identifier.

       For DSO agents a line of the form:

              label domain-no dso entry-point path

       should appear.  Where,

       label         is a string identifying the agent
       domain-no     is an unsigned integer specifying the agent’s  domain  in
                     the range 1 to 510
       entry-point   is  the  name of an initialization function which will be
                     called when the DSO is loaded
       path          designates the location of the DSO.  This field is treated
                     differently on IRIX and on Linux.  On Linux, path must be
                     an absolute pathname; on IRIX, pmcd uses  the  following
                     heuristics to find the agent.  If path begins with a / it
                     is taken as an absolute path specifying the DSO.  If path
                     is relative,
                     pmcd  will  expect  to  find the agent in a file with the
                     name mips_simabi.path, where simabi is either o32, n32 or
                     64.   pmcd  is only able to load DSO agents that have the
                     same simabi (Subprogram Interface Model ABI,  or  calling
                     conventions)  as  it  does  (i.e.  only one of the simabi
                     versions will be applicable).  The simabi  version  of  a
                     running  pmcd  may be determined by fetching pmcd.simabi.
                     Alternatively,  the  file(1)  command  may  be  used   to
                     determine the simabi version from the pmcd executable.

                     For  a  relative  path the environment variable PMCD_PATH
                     defines a colon (:)  separated  list  of  directories  to
                     search  when trying to locate the agent DSO.  The default
                     search path is $PCP_SHARE_DIR/lib:/usr/pcp/lib.

       For agents providing socket connections, a line of the form

              label domain-no socket addr-family address [ command ]

       should appear.  Where,

       label         is a string identifying the agent
       domain-no     is an unsigned integer specifying the agent’s  domain  in
                     the range 1 to 510
       addr-family   designates  whether  the  socket  is  in  the  AF_INET or
                     AF_UNIX domain, and the  corresponding  values  for  this
                     parameter are inet and unix respectively.
       address       specifies the address of the socket within the previously
                     specified addr-family.  For  unix  sockets,  the  address
                     should be the name of an agent’s socket on the local host
                     (a valid address for the UNIX domain).  For inet sockets,
                     the  address  may  be either a port number or a port name
                     which may be used to connect to an  agent  on  the  local
                     host.   There  is  no syntax for specifying an agent on a
                     remote host, as pmcd deals only with agents on  the  same
                     machine.
       command       is  an  optional parameter used to specify a command line
                     to start the agent when pmcd initializes.  If command  is
                     not  present,  pmcd  assumes that the specified agent has
                     already been created.  The command is considered to start
                     from  the  first  non-white  character  after  the socket
                     address  and  finish  at  the  next  newline  that  isn’t
                     preceded  by a backslash.  After a fork(2) the command is
                     passed unmodified to execve(2) to instantiate the  agent.

       For agents interacting with pmcd via stdin/stdout, a line of the form:

              label domain-no pipe protocol command

       should appear.  Where,

       label         is a string identifying the agent
       domain-no     is an unsigned integer specifying the agent’s  domain  in
                     the range 1 to 510
       protocol      The value for this parameter should be binary.

                     Additionally,  the  protocol  can  include  the  notready
                     keyword  to indicate that the agent must be marked as not
                     being ready to process requests from pmcd.  The agent will
                     explicitly notify pmcd when it  is  ready  to  process
                     requests by sending a PM_ERR_PMDAREADY PDU.

       command       specifies a command line to start  the  agent  when  pmcd
                     initializes.   Note  that  command is mandatory for pipe-
                     based agents.  The command is considered  to  start  from
                     the   first   non-white   character  after  the  protocol
                     parameter and finish  at  the  next  newline  that  isn’t
                     preceded  by a backslash.  After a fork(2) the command is
                     passed unmodified to execve(2) to instantiate the  agent.
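
       The three agent specification forms above can be illustrated with a
       small shell sketch that writes a sample agent section and checks it
       against the rules stated earlier (domain numbers in the range 1 to
       510, unique labels and domain numbers).  Every label, domain number,
       entry point and path in the sample is made up, and check_agents is a
       hypothetical helper rather than anything shipped with PCP.

```shell
# Write a sample agent section.  Every label, domain number, entry point
# and path below is made up for illustration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# label     domain-no  type    details
sampledso   250        dso     sample_init /path/to/pmda_sample.so
samplesock  251        socket  inet 2001
samplepipe  252        pipe    binary \
                               /path/to/pmdasample
EOF

# check_agents is a hypothetical helper, not part of PCP.  It checks that
# each agent line has a domain number in the range 1 to 510 and that labels
# and domain numbers are unique.  Quoted labels containing spaces are not
# handled.
check_agents() {
    awk '
        /\\$/ { sub(/\\$/, ""); buf = buf $0; next }   # join continued lines
        { $0 = buf $0; buf = "" }
        tolower($0) ~ /^[ \t]*\[[ \t]*access[ \t]*\]/ { exit }
        /^[ \t]*#/ || NF == 0 { next }
        $2 !~ /^[0-9]+$/ || $2 + 0 < 1 || $2 + 0 > 510 {
            print "bad domain number: " $0; bad = 1
        }
        seen_label[$1]++      { print "duplicate label: " $1; bad = 1 }
        seen_domain[$2 + 0]++ { print "duplicate domain: " $2; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

check_agents "$conf" && echo "agent section looks consistent"
```

       The check stops at the [access] line, so access control statements
       are never mistaken for agent specifications.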

ACCESS CONTROL CONFIGURATION

The access control section of the configuration file is optional, but
if present it must follow the agent configuration data. The case of
reserved words is ignored, but elsewhere case is preserved. Lexical
elements in the access control section are separated by whitespace or
the special delimiter characters: square brackets (‘‘[’’ and ‘‘]’’),
braces (‘‘{’’ and ‘‘}’’), colon (‘‘:’’), semicolon (‘‘;’’) and comma
(‘‘,’’). The special characters are not treated as special in the
agent configuration section.

The access control section of the file must start with a line of the
form:

[access]

Leading and trailing whitespace may appear around and within the
brackets and the case of the access keyword is ignored. No other text
may appear on the line except a trailing comment.

Following this line, the remainder of the configuration file should
contain lines that allow or disallow operations from particular hosts
or groups of hosts.

There are two kinds of operations that occur via pmcd:

fetch     allows retrieval of information from pmcd.  This may be
          information about a metric (e.g. its description, instance
          domain or help text) or a value for a metric.

store     allows pmcd to be used to store metric values in agents
          that permit store operations.

Access to pmcd is granted at the host level, i.e. all users on a host
are granted the same level of access. Permission to perform the store
operation should not be given indiscriminately; it has the potential to
be abused by malicious users.

Hosts may be identified by name, IP address or a wildcarded IP address
with the single wildcard character ‘‘*’’ as the last-given component of
the IP address. Host names may not be wildcarded. The following are
all valid host identifiers:

       boing
       localhost
       giggle.melbourne.sgi.com
       129.127.112.2
       129.127.114.*
       129.*
       *

The following are not valid host identifiers:

       *.melbourne
       129.127.*.*
       129.*.114.9
       129.127*

The first example is not allowed because only (numeric) IP addresses
may contain a wildcard. The second example is not valid because there
is more than one wildcard character. The third contains an embedded
wildcard, and in the fourth the wildcard is not a component on its own
(the last component is 127*).
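
The wildcard rules above can be approximated with shell pattern
matching. The function below is a rough sketch, not the check pmcd
itself performs: the name valid_hostid is hypothetical, and it does not
verify that a host name resolves or that an IP address has a sensible
number of components.

```shell
# valid_hostid is a hypothetical helper, not part of PCP.
# It returns 0 if its argument looks like a valid host identifier under
# the wildcard rules described above, and 1 otherwise.
valid_hostid() {
    case $1 in
        '') return 1 ;;
        '*') return 0 ;;                 # matches every host
        *'*'*'*'*) return 1 ;;           # more than one wildcard
        *'.*')                           # wildcard is the last component
            prefix=${1%.\*}
            case $prefix in
                *[!0-9.]*|''|.*|*.|*..*) return 1 ;;   # prefix must be numeric
                *) return 0 ;;
            esac ;;
        *'*'*) return 1 ;;               # wildcard anywhere else
        *) return 0 ;;                   # host name or full IP address
    esac
}

for id in boing 129.127.112.2 '129.127.114.*' '*.melbourne' '129.127*'; do
    if valid_hostid "$id"; then echo "$id: valid"; else echo "$id: not valid"; fi
done
```

Quoting each identifier matters in the shell, because an unquoted ‘‘*’’
would be expanded to file names before the function ever saw it.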

The name localhost is given special treatment to make the behavior of
host wildcarding consistent. Rather than being 127.0.0.1, it is mapped
to the primary IP address associated with the name of the host on which
pmcd is running. Beware of this when running pmcd on multi-homed
hosts.

Access for hosts is allowed or disallowed by specifying statements of
the form:

       allow hostlist : operations ;
       disallow hostlist : operations ;

hostlist    is a comma separated list of host identifiers.

operations  is a comma separated list of the operation types
            described above, all (which allows/disallows all
            operations), or all except operations (which
            allows/disallows all operations except those listed).

Where no specific allow or disallow statement applies to an operation
for some host, the default is to allow the operation from that host.
In the trivial case when there is no access control section in the
configuration file, all operations from all hosts are permitted.

If a new connection to pmcd is attempted from a host that is not
permitted to perform any operations, the connection will be closed
immediately after an error response PM_ERR_PERMISSION has been sent to
the client attempting the connection.

Statements with the same level of wildcarding specifying identical
hosts may not contradict each other. For example if a host named clank
had an IP address of 129.127.112.2, specifying the following two rules
would be erroneous:

       allow clank : fetch, store;
       disallow 129.127.112.2 : all except fetch;

because they both refer to the same host, but disagree as to whether
the store operation is permitted from that host.

Statements containing more specific host specifications override less
specific ones according to the level of wildcarding. For example a
rule of the form

       allow clank : all;

overrides

       disallow 129.127.112.* : all except fetch;

because the former contains a specific host name (equivalent to a fully
specified IP address), whereas the latter has a wildcard. In turn, the
latter would override

       disallow * : all;

It is possible to limit the number of connections from a host to pmcd.
This may be done by adding a clause of the form

       maximum n connections

to the operations list of an allow statement. Such a clause may not be
used in a disallow statement. Here, n is the maximum number of
connections that will be accepted from hosts matching the host
identifier(s) used in the statement.

An access control statement with a list of host identifiers is
equivalent to a group of access control statements, with each
specifying one of the host identifiers in the list and all with the
same access controls (both permissions and connection limits). A
wildcard should be used if you want hosts to contribute to a shared
connection limit.

When a new client requests a connection, and pmcd has determined that
the client has permission to connect, it searches the matching list of
access control statements for the most specific match containing a
connection limit. For brevity, this will be called the limiting
statement. If there is no limiting statement, the client is granted a
connection. If there is a limiting statement and the number of pmcd
clients with IP addresses that match the host identifier in the
limiting statement is less than the connection limit in the statement,
the connection is allowed. Otherwise the connection limit has been
reached and the client is refused a connection.

The wildcarding in host identifiers means that once pmcd actually
accepts a connection from a client, the connection may contribute to
the current connection count of more than one access control statement
(the client’s host may match more than one access control statement).
This may be significant for subsequent connection requests.

Note that because most specific match semantics are used when checking
the connection limit, priority is given to clients with more specific
host identifiers. It is also possible to exceed connection limits in
some situations. Consider the following:

       allow clank : all, maximum 5 connections;
       allow * : all except store, maximum 2 connections;

This says that only 2 client connections at a time are permitted for
all hosts other than "clank", which is permitted 5. If a client from
host "boing" is the first to connect to pmcd, its connection is checked
against the second statement (that is the most specific match with a
connection limit). As there are no other clients, the connection is
accepted and contributes towards the limit for only the second
statement above. If the next client connects from "clank", its
connection is checked against the limit for the first statement. There
are no other connections from "clank", so the connection is accepted.
Once this connection is accepted, it counts towards both statements’
limits because "clank" matches the host identifier in both statements.
Remember that the decision to accept a new connection is made using
only the most specific matching access control statement with a
connection limit. Now, the connection limit for the second statement
has been reached. Any connections from hosts other than "clank" will
be refused.

If instead, pmcd with no clients saw three successive connections
arrive from "boing", the first two would be accepted and the third
refused. After that, if a connection was requested from "clank" it
would be accepted. It matches the first statement, which is more
specific than the second, so the connection limit in the first is used
to determine that the client has the right to connect. Now there are 3
connections contributing to the second statement’s connection limit.
Even though the connection limit for the second statement has been
exceeded, the earlier connections from "boing" are maintained. The
connection limit is only checked at the time a client attempts a
connection rather than being re-evaluated every time a new client
connects to pmcd.
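
The second scenario can be replayed with a small shell model. This is a
sketch that hard-codes the two example statements and assumes only the
most-specific-match rule described above; it is not how pmcd is
implemented, and all variable and function names are hypothetical.

```shell
# Model of the two statements above:
#   allow clank : all, maximum 5 connections;
#   allow * : all except store, maximum 2 connections;
# The variable and function names are hypothetical.
limit_clank=5 count_clank=0     # first statement
limit_any=2   count_any=0       # second statement

try_connect() {
    if [ "$1" = clank ]; then
        # The most specific statement with a limit is the first one.
        if [ "$count_clank" -ge "$limit_clank" ]; then
            echo "$1: refused"; return 1
        fi
        count_clank=$((count_clank + 1))
    else
        # Only the wildcard statement matches.
        if [ "$count_any" -ge "$limit_any" ]; then
            echo "$1: refused"; return 1
        fi
    fi
    # Every accepted client also matches "*", so it counts there too.
    count_any=$((count_any + 1))
    echo "$1: accepted"
}

for host in boing boing boing clank; do
    try_connect "$host" || true
done
echo "connections counted against the second statement: $count_any"
```

The limit is consulted only when a connection is attempted, which is
why the wildcard count can end up above its own limit without any
existing client being dropped.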

This gentle scheme is designed to allow reasonable limits to be imposed
on a first-come, first-served basis, with specific exceptions.

As illustrated by the example above, a client’s connection is honored
once it has been accepted. However, pmcd reconfiguration (see the next
section) re-evaluates all the connection counts and will cause client
connections to be dropped where connection limits have been exceeded.

RECONFIGURING PMCD

       If  the  configuration  file  has  been  changed  or if an agent is not
       responding because it has terminated or the PMNS has been changed, pmcd
       may be reconfigured by sending it a SIGHUP, as in

            # pmsignal -a -s HUP pmcd

       When  pmcd  receives  a  SIGHUP,  it  checks the configuration file for
       changes.  If the file  has  been  modified,  it  is  reparsed  and  the
       contents  become  the  new  configuration.   If there are errors in the
       configuration file, the existing  configuration  is  retained  and  the
       contents  of the file are ignored.  Errors are reported in the pmcd log
       file.

       It also checks the PMNS file for changes. If the  PMNS  file  has  been
       modified,  then  it  is  reloaded.   Use  of tail(1) on the log file is
       recommended while reconfiguring pmcd.

       If the configuration for an agent has changed (any parameter except the
       agent’s  label  is  different),  the  agent is restarted.  Agents whose
       configurations do not change are not restarted.   Any  existing  agents
       not  present  in  the  new  configuration are terminated.  Any deceased
       agents that are still listed are restarted.

       Sometimes it is necessary to restart an agent that  is  still  running,
       but  malfunctioning.   Simply  stop  the agent (e.g. using SIGTERM from
       pmsignal(1)), then send pmcd a SIGHUP, which will cause the agent to be
       restarted.

STARTING AND STOPPING PMCD

       Normally,  pmcd  is started automatically at boot time and stopped when
       the system is being brought down  (see  rc2(1M)  and  rc0(1M)).   Under
       certain  circumstances  it is necessary to start or stop pmcd manually.
       To do this one must become superuser and type

            # $PCP_RC_DIR/pcp start

       to start pmcd, or

            # $PCP_RC_DIR/pcp stop

       to stop pmcd.  Starting pmcd when it is already running is the same  as
       stopping it and then starting it again.

       Sometimes  it  may be necessary to restart pmcd during another phase of
       the boot process.  Time-consuming parts of the boot process  are  often
       put  into the background to allow the system to become available sooner
       (e.g. mounting huge databases).  If an agent run by pmcd requires  such
       a  task  to  complete  before  it  can run properly, it is necessary to
       restart or reconfigure pmcd after the task  completes.   Consider,  for
       example,  the  case  of  mounting  a  database  in the background while
       booting.  If the PMDA which provides the  metrics  about  the  database
       cannot function until the database is mounted and available but pmcd is
       started before the database is ready, the PMDA will fail (however  pmcd
       will  still  service  requests for metrics from other domains).  If the
       database is initialized by running a shell script, adding a line to the
       end  of  the  script  to reconfigure pmcd (by sending it a SIGHUP) will
       restart the PMDA (if it exited  because  it  couldn’t  connect  to  the
       database).   If  the  PMDA  didn’t exit in such a situation it would be
       necessary to restart pmcd because if the PMDA was  still  running  pmcd
       would not restart it.

       Normally  pmcd listens for client connections on one or more well-known
       TCP/IP port numbers (historically 4321 and more recently the officially
       registered  port  44321;  in  the current release, pmcd listens on only
       port 44321 by default).  Either the environment variable  PMCD_PORT  or
       the  -p  command  line  option  may be used to specify alternative port
       number(s) when pmcd is started; in each case, the  specification  is  a
       comma-separated  list  of  one  or more numerical port numbers.  Should
       both methods be used or multiple -p options appear on the command line,
       pmcd  will listen on the union of the set of ports specified via all -p
       options and the PMCD_PORT environment variable.  If  non-default  ports
       are  used  with  pmcd  care should be taken to ensure that PMCD_PORT is
       also set in the environment of any client application that will connect
       to pmcd.
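
       The way the port lists combine can be sketched in shell.  The helper
       below is hypothetical (it is not part of PCP and does not start
       pmcd); it only computes the union of the ports that a given
       PMCD_PORT value and set of -p arguments would name.

```shell
# port_union is a hypothetical helper, not part of PCP.  The first argument
# stands for the value of PMCD_PORT; the rest stand for the arguments of
# each -p option.  It prints the union of all the listed ports.
port_union() {
    printf '%s\n' "$@" | tr ',' '\n' | grep -v '^$' | sort -n -u
}

# Equivalent of: PMCD_PORT=44321,4321 pmcd -p 4321 -p 9000
port_union 44321,4321 4321 9000
```

       A port named more than once, such as 4321 here, appears only once in
       the result.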

FILES

       $PCP_PMCDCONF_PATH
                 default configuration file
       $PCP_PMCDOPTIONS_PATH
                 command   line   options   to   pmcd   when   launched   from
                 $PCP_RC_DIR/pcp.  All the command line  option  lines  should
                 start  with  a  hyphen as the first character.  This file can
                 also  contain  environment  variable  settings  of  the  form
                 "VARIABLE=value".
       ./pmcd.log
                 (or $PCP_LOG_DIR/pmcd/pmcd.log when started automatically)
                 All messages and diagnostics are directed here
       $PCP_RUN_DIR/pmcd.pid
                 contains an ASCII decimal representation of the process ID of
                 pmcd, when it’s running.

ENVIRONMENT

       In addition to the PCP  environment  variables  described  in  the  PCP
       ENVIRONMENT section below, the PMCD_PORT variable is also recognised as
       the TCP/IP port for incoming connections (default 44321).

PCP ENVIRONMENT

       Environment variables with the prefix PCP_ are used to parameterize the
       file  and  directory names used by PCP.  On each installation, the file
       /etc/pcp.conf contains the  local  values  for  these  variables.   The
       $PCP_CONF  variable may be used to specify an alternative configuration
       file, as described in pcp.conf(4).

DIAGNOSTICS

       If  pmcd is already running the message "Error: OpenRequestSocket bind:
       Address may already be in use" will appear.  This may  also  appear  if
       pmcd  was shut down with an outstanding request from a client.  In this
       case, a request socket has been left in the TIME_WAIT state  and  until
       the  system  closes  it down (after some timeout period) it will not be
       possible to run pmcd.

       In addition to the standard PCP debugging  flags,  see  pmdbg(1),  pmcd
       currently  uses  DBG_TRACE_APPL0  for  tracing  I/O  and termination of
       agents, DBG_TRACE_APPL1 for tracing host access control (see above) and
       DBG_TRACE_APPL2  for tracing the configuration file scanner and parser.

CAVEATS

       pmcd does not explicitly terminate  its  children  (agents);  it  only
       closes  their pipes.  If an agent never checks for a closed pipe it may
       not terminate.

       The configuration file parser will only read lines of  less  than  1200
       characters.  This is intended to prevent accidents with binary files.

       The  timeouts controlled by the -t option apply to IPC between pmcd and
       the  PMDAs  it  spawns.   This  is  independent  of  settings  of   the
       environment  variables  PMCD_CONNECT_TIMEOUT  and  PMCD_REQUEST_TIMEOUT
       (see PCPIntro(1)) which may be used respectively  to  control  timeouts
       for client applications trying to connect to pmcd and trying to receive
       information from pmcd.