NAME
wulflogger - A logging utility/client for xmlsysd
SYNOPSIS
wulflogger [-h] [-v] [-t display_type] [-d delay] [-c count]
[-f /path/to/wulfhosts] [-l]
WULFLOGGER OPTIONS
-h shows help (command synopsis).
-v makes execution verbose for debugging or the bored.
-t display_type selects display type from list below
-d delay (in seconds) selects update loop delay
-c count causes it to output count pages (only) and exit
-f /path/to/wulfhosts to use a particular wulfhosts file
-l show localhost only (use no wulfhosts file from any location)
DESCRIPTION
wulflogger is a simple yet powerful tty-based cluster monitoring tool.
It requires xmlsysd (running on each system to be monitored) to
efficiently provide it with system and proc-derived information, which
it processes and presents to the user in one of several user-selectable
display formats. With it a user can monitor, across an entire beowulf,
cluster, or workstation LAN, system descriptors such as load average,
memory consumption, swap, page, and interrupt activity, and network
loads. It can also retrieve and display such mundane information as CPU
make and base clock, system time, uptime, or other potentially useful
but slowly varying system descriptors. The information presented is
updated regularly after a user-selectable delay. This tool prints
cluster results to stdout, from which they can be redirected into a log
file or piped into a tool (for example, a graphing utility or web
application).
WULFHOST
To run wulflogger as anything but a monitor of the local host one
REQUIRES a wulfhost file. wulflogger run with no viable wulfhost file
defaults to a localhost connection. A localhost connection can also be
forced (overriding the search for a wulfhost file) with the -l command
line argument.
The wulfhost file tells wulflogger which xmlsysd daemons to connect to.
It consists of any mix of the following xml descriptors:
<?xml version="1.0"?>
<wulfstat>
<root/>
<user>rgb</user>
<task>On_spin3d</task>
<host>
<name>ganesh</name>
</host>
<host>
<ip>192.168.1.132</ip>
<port>7887</port>
</host>
<host>
<name>lucifer</name>
<ip>192.168.1.131</ip>
<port>7887</port>
</host>
<hostrange>
<hostfmt>g%02d</hostfmt>
<imin>1</imin>
<imax>15</imax>
<port>7887</port>
</hostrange>
<iprange>
<ipmin>152.3.182.193</ipmin>
<ipmax>152.3.182.200</ipmax>
<port>7887</port>
</iprange>
</wulfstat>
From this example, one sees that the <host></host> tag defines a host
to connect to. Within this tag, the host can be specified by the
<name></name> tag (which can contain any name resolvable by
gethostbyname()) or the <ip></ip> tag, most commonly used for hosts in
a cluster that haven’t been named. In addition, for each host one can
specify a <port></port> if one for any reason is running the xmlsysd on
a different port than its installation default.
This information can easily be overspecified. In most cases, for
example, it is better to just use the default port (7887) and let the
local hostname lookup determine the interface IP number. Note that xml
doesn’t care how the tags are laid out as long as they are nested
correctly, and that there can be more than one <host>, <hostrange>, or
<iprange> tagset in a wulfhosts file to specify the simultaneous
monitoring of any mix of hosts, clusters, and LANs.
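As a rough illustration of how the <host> entries described above might
be read, here is a minimal Python sketch. It is not wulflogger's own
parser. The sample text, the read_hosts() helper, and the DEFAULT_PORT
name are hypothetical and exist only for this example. The sketch
assumes that a missing <port> falls back to the documented default of
7887.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, modelled on the wulfhosts layout shown above.
SAMPLE = """<?xml version="1.0"?>
<wulfstat>
  <host><name>ganesh</name></host>
  <host><ip>192.168.1.132</ip><port>7887</port></host>
  <hostrange>
    <hostfmt>g%02d</hostfmt><imin>1</imin><imax>15</imax>
  </hostrange>
</wulfstat>
"""

DEFAULT_PORT = 7887  # xmlsysd installation default, per this page


def read_hosts(xml_text):
    """Return (name, ip, port) for each <host> element.

    A missing <name> or <ip> is reported as None; a missing <port>
    falls back to DEFAULT_PORT. <hostrange> and <iprange> are ignored.
    """
    root = ET.fromstring(xml_text)
    hosts = []
    for host in root.findall("host"):
        port = host.findtext("port")
        hosts.append((host.findtext("name"),
                      host.findtext("ip"),
                      int(port) if port is not None else DEFAULT_PORT))
    return hosts


for entry in read_hosts(SAMPLE):
    print(entry)
```

Only direct <host> children of the root element are listed here, so the
<hostrange> block in the sample is skipped on purpose.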
Note also that xml DOES preserve whitespace, so
<host><name>b0 </name></host>
is NOT the same as
<host><name>b0</name></host>
and would likely not work correctly. If you do enter port, name, and
ip explicitly and incorrectly or inconsistently, be prepared for odd
behavior.
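The whitespace point can be checked with any xml parser. The sketch
below uses Python's standard ElementTree module, which is an assumption
made for illustration only; it says nothing about which parser
wulflogger itself uses.

```python
import xml.etree.ElementTree as ET

# The trailing blank inside <name> is part of the element text.
padded = ET.fromstring("<host><name>b0 </name></host>").findtext("name")
clean = ET.fromstring("<host><name>b0</name></host>").findtext("name")

print(repr(padded))             # 'b0 '
print(repr(clean))              # 'b0'
print(padded == clean)          # False
print(padded.strip() == clean)  # True
```

A name carrying a stray blank is a different string from the intended
one, so a lookup on it would not be expected to find the same host.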
The <hostrange> is hopefully self explanatory. It can be used to
rapidly define an entire cluster on the basis of a systematic ordering
of hostname. The contents of the <hostfmt> tag should be a SIMPLE
printf-format string for a presumed integer that will be iterated from
<imin> to <imax> in steps of one. In this way a single xml tag can
define an entire cluster e.g. g01-g15.
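The expansion described above can be sketched in a few lines of Python,
whose % operator accepts the same simple printf-style integer formats.
The expand_hostrange() name is hypothetical, and this shows only the
idea, not wulflogger's actual code.

```python
def expand_hostrange(hostfmt, imin, imax):
    """Format every integer from imin to imax (inclusive) with hostfmt."""
    return [hostfmt % i for i in range(imin, imax + 1)]


names = expand_hostrange("g%02d", 1, 15)
print(names[0], names[-1], len(names))  # g01 g15 15
```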
The <iprange> is similar, except that it uses the ip number directly in
<ipmin> and <ipmax>. Use caution -- in almost all cases the first
three octets of the ip number should be the SAME in <ipmin> and
<ipmax>. This option is provided in case the hosts don’t have a
well-defined and published hostname and are accessible only by, e.g.,
a dhcp-assigned ip number.
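To see why differing leading octets call for caution, the sketch below
enumerates a range in Python. It assumes the range is simply every
address from <ipmin> through <ipmax>, treated as integers; this page
does not say how wulflogger iterates internally. Both function names
are hypothetical.

```python
import ipaddress


def expand_iprange(ipmin, ipmax):
    """List every IPv4 address from ipmin to ipmax, inclusive."""
    lo = ipaddress.IPv4Address(ipmin)
    hi = ipaddress.IPv4Address(ipmax)
    if lo > hi:
        raise ValueError("ipmin must not be greater than ipmax")
    return [str(ipaddress.IPv4Address(n))
            for n in range(int(lo), int(hi) + 1)]


def same_first_three_octets(ipmin, ipmax):
    """True if both addresses share their first three octets."""
    return ipmin.split(".")[:3] == ipmax.split(".")[:3]


addrs = expand_iprange("152.3.182.193", "152.3.182.200")
print(len(addrs), addrs[0], addrs[-1])  # 8 152.3.182.193 152.3.182.200
print(same_first_three_octets("152.3.182.193", "152.3.182.200"))  # True
```

Under this assumption a range whose third octet differs by one can
already span hundreds of addresses, most of which are unlikely to be
cluster nodes.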
All forms of defining a host or list of hosts permit an optional <port>
to be assigned to override xmlsysd’s installation default of 7887.
wulflogger will connect to these hosts as fast as it can in a parallel
thread, and then will periodically attempt to REconnect to any hosts
that might be down or that might go down while wulflogger is running.
wulflogger itself is thus moderately robust against cluster node state
changes.
Note that any hosts that do not resolve are displayed but marked
unknown. Any hosts that resolve but that cannot accept a connection
(which could mean that no daemon is installed or running, the daemon
has more connections than the number permitted in e.g.
/etc/xinetd.d/xmlsysd, or that the host is down) are marked down.
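The distinction between unknown and down hosts can be sketched as
follows in Python. The host_status() function and the "up" label are
hypothetical, chosen only for this example. The resolve and connect
parameters exist so the logic can be exercised without a network. The
real tool may decide differently in detail.

```python
import socket


def host_status(name, port=7887, timeout=2.0,
                resolve=socket.gethostbyname,
                connect=socket.create_connection):
    """Classify a host as 'unknown', 'down', or 'up' (hypothetical labels).

    unknown: the name does not resolve.
    down:    the name resolves but the connection attempt fails.
    up:      a TCP connection to the port succeeds.
    """
    try:
        address = resolve(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "unknown"
    try:
        conn = connect((address, port), timeout)
    except OSError:  # refused, timed out, unreachable, ...
        return "down"
    conn.close()
    return "up"
```

A call such as host_status("ganesh") would use the real resolver and a
real TCP connection to the default port.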
DISPLAY TYPES
The following display types are supported by wulflogger:
0 - load and status only (default), a very useful display for cluster
users
1 - stat -- information and rates primarily derived from /proc/stat
2 - memory only (similar to running "free" on each host)
3 - network rates
4 - time displays system clocks, uptime, cpu type and clock
5 - pids interface for monitoring running distributed tasks.
6 - pids interface for monitoring running distributed tasks with
full command line displayed.
The pids interface is a bit quirky. It will generally ignore root-owned
tasks, for example, presuming that the tool is intended to monitor
userspace applications. There exist wulfhosts controls for these
properties; eventually they will likely be controllable at the command
line as well.
CRON USAGE
wulflogger can be used in a cron script in a variety of ways. The -c
count flag was introduced to facilitate this usage. For example, one
could put wulflogger into the following sort of pipe:
#!/bin/sh
DOWN=`/usr/bin/wulflogger -f /etc/wulfhosts.cluster1 -t 1 -c 1 \
| grep down | cut -f 1 -d ' '`
# now do something about the down hosts...
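For a cron job written in Python rather than sh, the same filtering
step might look like the sketch below. It mirrors the pipe above by
keeping lines that contain "down" and taking the first space-separated
field. The down_hosts() name and the sample lines are hypothetical; the
real column layout of wulflogger output is not described here, so the
sample is only a stand-in.

```python
def down_hosts(lines):
    """Mimic `grep down | cut -f 1 -d ' '` on an iterable of lines."""
    return [line.rstrip("\n").split(" ")[0]
            for line in lines if "down" in line]


# Hypothetical stand-in for captured wulflogger output.
sample = ["g01 down\n", "g02 up\n", "g03 down\n"]
print(down_hosts(sample))  # ['g01', 'g03']
```

Like the grep in the shell version, this matches "down" anywhere in the
line, so a hostname that contains that string would also be selected.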
DEBUGGING
To help debug wulflogger (or problems you might have with wulfhosts),
note the table of verbose/debugging values that is printed as part of
its Usage (-h flag). This yields anything from a simple trace of a
particular subsystem such as connect_hosts() to everything the program
does. To limit the output, one can also use the -c count flag to only
display a single cycle. It is a good idea to pipe stderr into a
logfile separately so that the display output is unaltered. The
logfile can be examined later or mailed back to me for analysis.
An example of this might be:
wulflogger -l -c 1 -v 10 2>connect_hosts.log
to trace what wulflogger does connecting to localhost.
SEE ALSO
libwulf(3), wulfstat(1)
PUBLICATION RULES
wulflogger can be modified and used at will by any user, provided that:
a) The original copyright notices are maintained and that the source,
including all modifications, is made publicly available at the time
of any derived publication. This is open source software according to
the precepts and spirit of the Gnu Public License. See the
accompanying file COPYING, which also must accompany any
redistribution.
b) The author of the code (Robert G. Brown) is appropriately
acknowledged and referenced in any derived use or publication.
c) Full responsibility for the accuracy, suitability, and
effectiveness of the program rests with the users and/or modifiers. As
is clearly stated in the accompanying copyright.h:
THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
ACKNOWLEDGEMENTS
None to speak of now, but a good place to comply with b) above later,
if you hack this code. Well, I should probably acknowledge the
essential help of Icon (Konstantin Raibetsev) and Seth Vidal, the
entire beowulf list, and various books on xml, network programming
(e.g. Stevens) and a cast of thousands. So let’s assume that I just
did;-)
COPYRIGHT
GPL 2b; see the file COPYING that accompanies the source of this
program. This is the "standard Gnu General Public License version 2 or
any later version", with the one minor (humorous) "Beverage"
modification listed below. Note that this modification is probably not
legally defensible and can be followed really pretty much according to
the honor rule.
As to my personal preferences in beverages, red wine is great, beer is
delightful, and Coca Cola or coffee or tea or even milk acceptable to
those who for religious or personal reasons wish to avoid stressing my
liver.
The Beverage Modification to the GPL:
Any satisfied user of this software shall, upon meeting the primary
author(s) of this software for the first time under the appropriate
circumstances, offer to buy him or her or them a beverage. This
beverage may or may not be alcoholic, depending on the personal ethical
and moral views of the offerer. The beverage cost need not exceed one
U.S. dollar (although it certainly may at the whim of the offerer:-)
and may be accepted or declined with no further obligation on the part
of the offerer. It is not necessary to repeat the offer after the
first meeting, but it can’t hurt...