NAME
fenced - the I/O Fencing daemon
SYNOPSIS
fenced [OPTION]...
DESCRIPTION
The fencing daemon, fenced, fences cluster nodes that have failed.
Fencing a node generally means rebooting it or otherwise preventing it
from writing to storage, e.g. disabling its port on a SAN switch.
Fencing involves interacting with a hardware device, e.g. network power
switch, SAN switch, storage array. Different "fencing agents" are run
by fenced to interact with various hardware devices.
Software related to sharing storage among nodes in a cluster, e.g. GFS,
usually requires fencing to be configured to prevent corruption of the
storage in the presence of node failure and recovery. GFS will not
allow a node to mount a GFS file system unless the node is running
fenced. Fencing happens in the context of a cman/openais cluster. A
node must be a cluster member before it can run fenced.
Once started, fenced waits for the 'fence_tool join' command to be run,
telling it to join the fence domain: a group of nodes managed by the
openais/cpg/groupd cluster infrastructure. In most cases, all nodes
will join the fence domain after joining the cluster.
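For example, once a node has joined the cluster:
fence_tool join
The matching 'fence_tool leave' command removes the node from the fence
domain again; see fence_tool(8).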
Fence domain members are aware of the membership of the group, and are
notified when nodes join or leave. If a fence domain member fails, one
of the remaining members will fence it. If the cluster has lost
quorum, fencing won't occur until quorum has been regained. If a
failed node is reset and rejoins the cluster before the remaining
domain members have fenced it, the fencing will be bypassed.
Node failure
When a domain member fails, fenced runs an agent to fence it. The
specific agent to run and the parameters the agent requires are all
read from the cluster.conf file (using libccs) at the time of fencing.
The fencing operation against a failed node is not considered complete
until the exec'ed agent exits. The exit value of the agent indicates
the success or failure of the operation. If the operation failed,
fenced will retry (possibly with a different agent, depending on the
configuration) until fencing succeeds. Other systems such as DLM and
GFS will not begin their own recovery for a failed node until fenced
has successfully completed fencing it. So, a delay or problem in
fencing will result in other systems like DLM/GFS being blocked.
Information about fencing operations will also appear in syslog.
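As an illustration of this contract, here is a toy agent sketch. It
assumes the name=value-on-stdin argument convention used by the
standard agents (see the URL under Hardware-specific settings below);
the parameter names are examples only, not a fixed set:
#!/bin/sh
# toy fencing agent: fenced writes the parameters from cluster.conf
# to the agent's stdin as name=value lines
while read line; do
  case "$line" in
    ipaddr=*) ipaddr=${line#ipaddr=} ;;
    port=*)   port=${line#port=} ;;
  esac
done
# ... contact the device at $ipaddr and disable $port here ...
exit 0   # zero exit reports success; non-zero makes fenced retry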
When a domain member fails, the actual fencing operation can be delayed
by a configurable number of seconds (cluster.conf:post_fail_delay or
-f). Within this time, the failed node could be reset and rejoin the
cluster to avoid being fenced. This delay is 0 by default to minimize
the time that other systems are blocked (see above).
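For example, to give a failed node 30 seconds to reset and rejoin
before it is fenced:
<fence_daemon post_fail_delay="30"/>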
Domain startup
When the domain is first created in the cluster (by the first node to
join it) and subsequently enabled (by the cluster gaining quorum), any
nodes listed in cluster.conf that are not presently members of the cman
cluster are fenced. The status of these nodes is unknown, and to be on
the safe side they are assumed to be in need of fencing. This
startup fencing can be disabled, but it's only truly safe to do so if
an operator is present to verify that no cluster nodes are in need of
fencing.
This example illustrates why startup fencing is important. Take a
three-node cluster with nodes A, B and C; all three have a GFS fs
mounted. All three nodes experience a low-level kernel hang at about
the same time. A watchdog triggers a reboot on nodes A and B, but not
C. A and B boot back up, form the cluster again, gain quorum, join the
fence domain, *don't* fence node C (which is still hung and
unresponsive), and mount the GFS fs again. If C were to come back to
life, it could corrupt the fs. So, A and B need to fence C when they
reform the fence domain since they don't know the state of C. If C
*had* been reset by a watchdog like A and B, but was just slow in
rebooting, then A and B might be fencing C unnecessarily when they do
startup fencing.
The first way to avoid fencing nodes unnecessarily on startup is to
ensure that all nodes have joined the cluster before any of the nodes
start the fence daemon. This method is difficult to automate.
A second way to avoid fencing nodes unnecessarily on startup is to use
the cluster.conf:post_join_delay setting (or -j option). This is the
number of seconds fenced will delay before actually fencing any victims
after nodes join the domain. This delay gives nodes that have been
tagged for fencing a chance to join the cluster and avoid being fenced.
A delay of -1 here will cause the daemon to wait indefinitely for all
nodes to join the cluster and no nodes will actually be fenced on
startup.
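For example, to make the daemon wait indefinitely for all nodes to
join before fencing any startup victims:
<fence_daemon post_join_delay="-1"/>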
To disable fencing at domain-creation time entirely, the -c option can
be used to declare that all nodes are in a clean or safe state to
start. The clean_start cluster.conf option can also be set to do this,
but automatically disabling startup fencing in cluster.conf can risk
file system corruption.
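With that caveat, the setting equivalent to the -c option is:
<fence_daemon clean_start="1"/>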
Avoiding unnecessary fencing at startup is primarily a concern when
nodes are fenced by power cycling. If nodes are fenced by disabling
their SAN access, then unnecessarily fencing a node is usually less
disruptive.
Fencing override
If a fencing device fails, the agent may repeatedly return errors as
fenced tries to fence a failed node. In this case, the admin can
manually reset the failed node, and then use fence_ack_manual to tell
fenced to continue without fencing the node.
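For example, after verifying that node2 has really been reset (the
node name and the -n option syntax here are assumptions; check
fence_ack_manual(8) on your system):
fence_ack_manual -n node2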
CONFIGURATION FILE
Fencing daemon behavior can be controlled by setting options in the
cluster.conf file under the section <fence_daemon> </fence_daemon>.
See above for complete descriptions of these values. The delay values
are in seconds; a value of -1 means an unlimited delay. The values
shown are
the defaults.
Post-join delay is the number of seconds the daemon will wait before
fencing any victims after a node joins the domain.
<fence_daemon post_join_delay="6"/>
Post-fail delay is the number of seconds the daemon will wait before
fencing any victims after a domain member fails.
<fence_daemon post_fail_delay="0"/>
Clean-start is used to prevent any startup fencing the daemon might do.
It indicates that the daemon should assume all nodes are in a clean
state to start.
<fence_daemon clean_start="0"/>
Override-path is the location of a FIFO used for communication between
fenced and fence_ack_manual.
<fence_daemon override_path="/var/run/cluster/fenced_override"/>
Override-time is the amount of time to wait for administrator
intervention after fencing has failed. The default is 5 seconds.
<fence_daemon override_time="5"/>
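These settings are all attributes of the same <fence_daemon> element,
so they may be combined, e.g.:
<fence_daemon post_join_delay="20" post_fail_delay="5" clean_start="0"/>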
Per-node fencing settings
The per-node fencing configuration can become complex and is largely
specific to the hardware being used. The general framework begins like
this:
<clusternodes>
  <clusternode name="node1" nodeid="1">
    <fence>
    </fence>
  </clusternode>
  <clusternode name="node2" nodeid="2">
    <fence>
    </fence>
  </clusternode>
  ...
</clusternodes>
The simple fragment above is a valid configuration: there is no way to
fence these nodes. If one of these nodes is in the fence domain and
fails, fenced will repeatedly fail in its attempts to fence it. The
admin will need to manually reset the failed node and then use
fence_ack_manual to tell fenced to continue on without fencing it (see
override above).
There is typically a single method used to fence each node (the name
given to the method is not significant). A method refers to a specific
device listed in the separate <fencedevices> section, and then lists
any node-specific parameters related to using the device.
<clusternodes>
  <clusternode name="node1" nodeid="1">
    <fence>
      <method name="single">
        <device name="myswitch" hw-specific-param="x"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2" nodeid="2">
    <fence>
      <method name="single">
        <device name="myswitch" hw-specific-param="y"/>
      </method>
    </fence>
  </clusternode>
  ...
</clusternodes>
Fence device settings
This section defines properties of the devices used to fence nodes.
There may be one or more devices listed. The per-node fencing sections
above reference one of these fence devices by name.
<fencedevices>
  <fencedevice name="myswitch" ipaddr="1.2.3.4" .../>
</fencedevices>
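The attributes required depend on the device. As a sketch, an entry
for a network power switch driven by the fence_apc agent might look
like this (the agent attribute names the agent program that fenced
runs; login and passwd are examples of agent-specific parameters):
<fencedevices>
  <fencedevice name="myswitch" agent="fence_apc" ipaddr="1.2.3.4"
               login="admin" passwd="..."/>
</fencedevices>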
Multiple methods for a node
In more advanced configurations, multiple fencing methods can be
defined for a node. If fencing fails using the first method, fenced
will try the next method, and continue to cycle through methods until
one succeeds.
<clusternode name="node1" nodeid="1">
  <fence>
    <method name="first">
      <device name="powerswitch" hw-specific-param="x"/>
    </method>
    <method name="second">
      <device name="storageswitch" hw-specific-param="1"/>
    </method>
  </fence>
</clusternode>
Dual path, redundant power
Sometimes fencing a node requires disabling two power ports or two I/O
paths. This is done by specifying two or more devices within a method.
<clusternode name="node1" nodeid="1">
  <fence>
    <method name="single">
      <device name="sanswitch1" hw-specific-param="x"/>
      <device name="sanswitch2" hw-specific-param="x"/>
    </method>
  </fence>
</clusternode>
When using power switches to fence nodes with dual power supplies, the
agents must be told to turn off both power ports before restoring power
to either port. The default off-on behavior of the agent could result
in power never being fully removed from the node.
<clusternode name="node1" nodeid="1">
  <fence>
    <method name="single">
      <device name="nps1" hw-param="x" action="off"/>
      <device name="nps2" hw-param="x" action="off"/>
      <device name="nps1" hw-param="x" action="on"/>
      <device name="nps2" hw-param="x" action="on"/>
    </method>
  </fence>
</clusternode>
Hardware-specific settings
Find documentation for configuring specific devices at
http://sources.redhat.com/cluster/
OPTIONS
Command-line options override the corresponding values in cluster.conf.
-j secs
Post-join fencing delay.
-f secs
Post-fail fencing delay.
-c All nodes are in a clean state to start.
-O path
Path of the override FIFO.
-T secs
Amount of time to wait for administrator intervention after
fencing has failed.
-D Enable debugging code and don't fork into the background.
-V Print the version information and exit.
-h Print out a help message describing available options, then
exit.
DEBUGGING
The fenced daemon keeps a circular buffer of debug messages that can be
dumped with the 'fence_tool dump' command.
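For example, to capture the buffer to a file for later inspection (the
destination path is arbitrary):
fence_tool dump > /tmp/fenced-debug.txt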
SEE ALSO
fence_tool(8), cman(8), groupd(8), group_tool(8)