NAME
nfs - fstab format and options for the nfs and nfs4 file systems
SYNOPSIS
/etc/fstab
DESCRIPTION
NFS is an Internet Standard protocol created by Sun Microsystems in
1984. NFS was developed to allow file sharing between systems residing
on a local area network. The Linux NFS client supports three versions
of the NFS protocol: NFS version 2 [RFC1094], NFS version 3 [RFC1813],
and NFS version 4 [RFC3530].
The mount(8) command attaches a file system to the system’s name space
hierarchy at a given mount point. The /etc/fstab file describes how
mount(8) should assemble a system’s file name hierarchy from various
independent file systems (including file systems exported by NFS
servers). Each line in the /etc/fstab file describes a single file
system, its mount point, and a set of default mount options for that
mount point.
For NFS file system mounts, a line in the /etc/fstab file specifies the
server name, the path name of the exported server directory to mount,
the local directory that is the mount point, the type of file system
that is being mounted, and a list of mount options that control the way
the filesystem is mounted and how the NFS client behaves when accessing
files on this mount point. The fifth and sixth fields on each line are
not used by NFS, thus conventionally each contain the digit zero. For
example:
server:path /mountpoint fstype option,option,... 0 0
The server’s hostname and export pathname are separated by a colon,
while the mount options are separated by commas. The remaining fields
are separated by blanks or tabs.
The server’s hostname can be an unqualified hostname, a fully qualified
domain name, a dotted quad IPv4 address, or an IPv6 address enclosed in
square brackets. Link-local and site-local IPv6 addresses must be
accompanied by an interface identifier. See ipv6(7) for details on
specifying raw IPv6 addresses.
The fstype field contains either "nfs" (for version 2 or version 3 NFS
mounts) or "nfs4" (for NFS version 4 mounts). The nfs and nfs4 file
system types share similar mount options, which are described below.
MOUNT OPTIONS
Refer to mount(8) for a description of generic mount options available
for all file systems. If you do not need to specify any mount options,
use the generic option defaults in /etc/fstab.
Valid options for either the nfs or nfs4 file system type
These options are valid to use when mounting either nfs or nfs4 file
system types. They imply the same behavior and have the same default
for both file system types.
soft / hard Determines the recovery behavior of the NFS client after
an NFS request times out. If neither option is
specified (or if the hard option is specified), NFS
requests are retried indefinitely. If the soft option
is specified, then the NFS client fails an NFS request
after retrans retransmissions have been sent, causing
the NFS client to return an error to the calling
application.
NB: A so-called "soft" timeout can cause silent data
corruption in certain cases. As such, use the soft
option only when client responsiveness is more important
than data integrity. Using NFS over TCP or increasing
the value of the retrans option may mitigate some of the
risks of using the soft option.
timeo=n The time (in tenths of a second) the NFS client waits
for a response before it retries an NFS request. If this
option is not specified, requests are retried every 60
seconds for NFS over TCP. The NFS client does not
perform any kind of timeout backoff for NFS over TCP.
However, for NFS over UDP, the client uses an adaptive
algorithm to estimate an appropriate timeout value for
frequently used request types (such as READ and WRITE
requests), but uses the timeo setting for infrequently
used request types (such as FSINFO requests). If the
timeo option is not specified, infrequently used request
types are retried after 1.1 seconds. After each
retransmission, the NFS client doubles the timeout for
that request, up to a maximum timeout length of 60
seconds.
retrans=n The number of times the NFS client retries a request
before it attempts further recovery action. If the
retrans option is not specified, the NFS client tries
each request three times.
The NFS client generates a "server not responding"
message after retrans retries, then attempts further
recovery (depending on whether the hard mount option is
in effect).
rsize=n The maximum number of bytes in each network READ request
that the NFS client can receive when reading data from a
file on an NFS server. The actual data payload size of
each NFS READ request is equal to or smaller than the
rsize setting. The largest read payload supported by the
Linux NFS client is 1,048,576 bytes (one megabyte).
The rsize value is a positive integral multiple of 1024.
Specified rsize values lower than 1024 are replaced with
4096; values larger than 1048576 are replaced with
1048576. If a specified value is within the supported
range but not a multiple of 1024, it is rounded down to
the nearest multiple of 1024.
If an rsize value is not specified, or if the specified
rsize value is larger than the maximum that either
client or server can support, the client and server
negotiate the largest rsize value that they can both
support.
The rsize mount option as specified on the mount(8)
command line appears in the /etc/mtab file. However, the
effective rsize value negotiated by the client and
server is reported in the /proc/mounts file.
wsize=n The maximum number of bytes per network WRITE request
that the NFS client can send when writing data to a file
on an NFS server. The actual data payload size of each
NFS WRITE request is equal to or smaller than the wsize
setting. The largest write payload supported by the
Linux NFS client is 1,048,576 bytes (one megabyte).
Similar to rsize , the wsize value is a positive
integral multiple of 1024. Specified wsize values lower
than 1024 are replaced with 4096; values larger than
1048576 are replaced with 1048576. If a specified value
is within the supported range but not a multiple of
1024, it is rounded down to the nearest multiple of
1024.
If a wsize value is not specified, or if the specified
wsize value is larger than the maximum that either
client or server can support, the client and server
negotiate the largest wsize value that they can both
support.
The wsize mount option as specified on the mount(8)
command line appears in the /etc/mtab file. However, the
effective wsize value negotiated by the client and
server is reported in the /proc/mounts file.
ac / noac Selects whether the client may cache file attributes. If
neither option is specified (or if ac is specified), the
client caches file attributes.
To improve performance, NFS clients cache file
attributes. Every few seconds, an NFS client checks the
server’s version of each file’s attributes for updates.
Changes that occur on the server in those small
intervals remain undetected until the client checks the
server again. The noac option prevents clients from
caching file attributes so that applications can more
quickly detect file changes on the server.
In addition to preventing the client from caching file
attributes, the noac option forces application writes to
become synchronous so that local changes to a file
become visible on the server immediately. That way,
other clients can quickly detect recent writes when they
check the file’s attributes.
Using the noac option provides greater cache coherence
among NFS clients accessing the same files, but it
extracts a significant performance penalty. As such,
judicious use of file locking is encouraged instead.
The DATA AND METADATA COHERENCE section contains a
detailed discussion of these trade-offs.
acregmin=n The minimum time (in seconds) that the NFS client caches
attributes of a regular file before it requests fresh
attribute information from a server. If this option is
not specified, the NFS client uses a 3-second minimum.
acregmax=n The maximum time (in seconds) that the NFS client caches
attributes of a regular file before it requests fresh
attribute information from a server. If this option is
not specified, the NFS client uses a 60-second maximum.
acdirmin=n The minimum time (in seconds) that the NFS client caches
attributes of a directory before it requests fresh
attribute information from a server. If this option is
not specified, the NFS client uses a 30-second minimum.
acdirmax=n The maximum time (in seconds) that the NFS client caches
attributes of a directory before it requests fresh
attribute information from a server. If this option is
not specified, the NFS client uses a 60-second maximum.
actimeo=n Using actimeo sets all of acregmin, acregmax, acdirmin,
and acdirmax to the same value. If this option is not
specified, the NFS client uses the defaults for each of
these options listed above.
bg / fg Determines how the mount(8) command behaves if an
attempt to mount an export fails. The fg option causes
mount(8) to exit with an error status if any part of the
mount request times out or fails outright. This is
called a "foreground" mount, and is the default behavior
if neither the fg nor bg mount option is specified.
If the bg option is specified, a timeout or failure
causes the mount(8) command to fork a child which
continues to attempt to mount the export. The parent
immediately returns with a zero exit code. This is
known as a "background" mount.
If the local mount point directory is missing, the
mount(8) command acts as if the mount request timed out.
This permits nested NFS mounts specified in /etc/fstab
to proceed in any order during system initialization,
even if some NFS servers are not yet available.
Alternatively these issues can be addressed using an
automounter (refer to automount(8) for details).
retry=n The number of minutes that the mount(8) command retries
an NFS mount operation in the foreground or background
before giving up. If this option is not specified, the
default value for foreground mounts is 2 minutes, and
the default value for background mounts is 10000 minutes
(80 minutes shy of one week). If a value of zero is
specified, the mount(8) command exits immediately after
the first failure.
sec=mode The RPCGSS security flavor to use for accessing files on
this mount point. If the sec option is not specified,
or if sec=sys is specified, the NFS client uses the
AUTH_SYS security flavor for all NFS requests on this
mount point. Valid security flavors are none, sys,
krb5, krb5i, krb5p, lkey, lkeyi, lkeyp, spkm, spkmi, and
spkmp. Refer to the SECURITY CONSIDERATIONS section for
details.
sharecache / nosharecache
Determines how the client’s data cache and attribute
cache are shared when mounting the same export more than
once concurrently. Using the same cache reduces memory
requirements on the client and presents identical file
contents to applications when the same remote file is
accessed via different mount points.
If neither option is specified, or if the sharecache
option is specified, then a single cache is used for all
mount points that access the same export. If the
nosharecache option is specified, then that mount point
gets a unique cache. Note that when data and attribute
caches are shared, the mount options from the first
mount point take effect for subsequent concurrent mounts
of the same export.
As of kernel 2.6.18, the behavior specified by
nosharecache is legacy caching behavior. This is
considered a data risk since multiple cached copies of
the same file on the same client can become out of sync
following a local update of one of the copies.
resvport / noresvport
Specifies whether the NFS client should use a privileged
source port when communicating with an NFS server for
this mount point. If this option is not specified, or
the resvport option is specified, the NFS client uses a
privileged source port. If the noresvport option is
specified, the NFS client uses a non-privileged source
port. This option is supported in kernels 2.6.28 and
later.
Using non-privileged source ports helps increase the
maximum number of NFS mount points allowed on a client,
but NFS servers must be configured to allow clients to
connect via non-privileged source ports.
Refer to the SECURITY CONSIDERATIONS section for
important details.
lookupcache=mode
Specifies how the kernel manages its cache of directory
entries for a given mount point. mode can be one of
all, none, pos, or positive. This option is supported
in kernels 2.6.28 and later.
The Linux NFS client caches the result of all NFS LOOKUP
requests. If the requested directory entry exists on
the server, the result is referred to as positive. If
the requested directory entry does not exist on the
server, the result is referred to as negative.
If this option is not specified, or if all is specified,
the client assumes both types of directory cache entries
are valid until their parent directory’s cached
attributes expire.
If pos or positive is specified, the client assumes
positive entries are valid until their parent
directory’s cached attributes expire, but always
revalidates negative entires before an application can
use them.
If none is specified, the client revalidates both types
of directory cache entries before an application can use
them. This permits quick detection of files that were
created or removed by other clients, but can impact
application and server performance.
The DATA AND METADATA COHERENCE section contains a
detailed discussion of these trade-offs.
Valid options for the nfs file system type
Use these options, along with the options in the above subsection, for
mounting the nfs file system type.
proto=netid The transport protocol name and protocol family the NFS
client uses to transmit requests to the NFS server for
this mount point. If an NFS server has both an IPv4 and
an IPv6 address, using a specific netid will force the
use of IPv4 or IPv6 networking to communicate with that
server.
If support for TI-RPC is built into the mount.nfs
command, netid is a valid netid listed in
/etc/netconfig. Otherwise, netid is one of "tcp,"
"udp," or "rdma," and only IPv4 may be used.
Each transport protocol uses different default retrans
and timeo settings. Refer to the description of these
two mount options for details.
In addition to controlling how the NFS client transmits
requests to the server, this mount option also controls
how the mount(8) command communicates with the server’s
rpcbind and mountd services. Specifying a netid that
uses TCP forces all traffic from the mount(8) command
and the NFS client to use TCP. Specifying a netid that
uses UDP forces all traffic types to use UDP.
If the proto mount option is not specified, the mount(8)
command discovers which protocols the server supports
and chooses an appropriate transport for each service.
Refer to the TRANSPORT METHODS section for more details.
udp The udp option is an alternative to specifying
proto=udp. It is included for compatibility with other
operating systems.
tcp The tcp option is an alternative to specifying
proto=tcp. It is included for compatibility with other
operating systems.
port=n The numeric value of the server’s NFS service port. If
the server’s NFS service is not available on the
specified port, the mount request fails.
If this option is not specified, or if the specified
port value is 0, then the NFS client uses the NFS
service port number advertised by the server’s rpcbind
service. The mount request fails if the server’s
rpcbind service is not available, the server’s NFS
service is not registered with its rpcbind service, or
the server’s NFS service is not available on the
advertised port.
mountport=n The numeric value of the server’s mountd port. If the
server’s mountd service is not available on the
specified port, the mount request fails.
If this option is not specified, or if the specified
port value is 0, then the mount(8) command uses the
mountd service port number advertised by the server’s
rpcbind service. The mount request fails if the
server’s rpcbind service is not available, the server’s
mountd service is not registered with its rpcbind
service, or the server’s mountd service is not available
on the advertised port.
This option can be used when mounting an NFS server
through a firewall that blocks the rpcbind protocol.
mountproto=netid
The transport protocol name and protocol family the NFS
client uses to transmit requests to the NFS server’s
mountd service when performing this mount request, and
when later unmounting this mount point.
If support for TI-RPC is built into the mount.nfs
command, netid is a valid netid listed in
/etc/netconfig. Otherwise, netid is one of "tcp" or
"udp," and only IPv4 may be used.
This option can be used when mounting an NFS server
through a firewall that blocks a particular transport.
When used in combination with the proto option,
different transports for mountd requests and NFS
requests can be specified. If the server’s mountd
service is not available via the specified transport,
the mount request fails.
Refer to the TRANSPORT METHODS section for more on how
the mountproto mount option interacts with the proto
mount option.
mounthost=name The hostname of the host running mountd. If this option
is not specified, the mount(8) command assumes that the
mountd service runs on the same host as the NFS service.
mountvers=n The RPC version number used to contact the server’s
mountd. If this option is not specified, the client
uses a version number appropriate to the requested NFS
version. This option is useful when multiple NFS
services are running on the same remote server host.
namlen=n The maximum length of a pathname component on this
mount. If this option is not specified, the maximum
length is negotiated with the server. In most cases,
this maximum length is 255 characters.
Some early versions of NFS did not support this
negotiation. Using this option ensures that pathconf(3)
reports the proper maximum component length to
applications in such cases.
nfsvers=n The NFS protocol version number used to contact the
server’s NFS service. The Linux client supports version
2 and version 3 of the NFS protocol when using the file
system type nfs. If the server does not support the
requested version, the mount request fails. If this
option is not specified, the client attempts to use
version 3, but negotiates the NFS version with the
server if version 3 support is not available.
vers=n This option is an alternative to the nfsvers option. It
is included for compatibility with other operating
systems.
lock / nolock Selects whether to use the NLM sideband protocol to lock
files on the server. If neither option is specified (or
if lock is specified), NLM locking is used for this
mount point. When using the nolock option, applications
can lock files, but such locks provide exclusion only
against other applications running on the same client.
Remote applications are not affected by these locks.
NLM locking must be disabled with the nolock option when
using NFS to mount /var because /var contains files used
by the NLM implementation on Linux. Using the nolock
option is also required when mounting exports on NFS
servers that do not support the NLM protocol.
intr / nointr Selects whether to allow signals to interrupt file
operations on this mount point. If neither option is
specified (or if nointr is specified), signals do not
interrupt NFS file operations. If intr is specified,
system calls return EINTR if an in-progress NFS
operation is interrupted by a signal.
Using the intr option is preferred to using the soft
option because it is significantly less likely to result
in data corruption.
The intr / nointr mount option is deprecated after
kernel 2.6.25. Only SIGKILL can interrupt a pending NFS
operation on these kernels, and if specified, this mount
option is ignored to provide backwards compatibility
with older kernels.
cto / nocto Selects whether to use close-to-open cache coherence
semantics. If neither option is specified (or if cto is
specified), the client uses close-to-open cache
coherence semantics. If the nocto option is specified,
the client uses a non-standard heuristic to determine
when files on the server have changed.
Using the nocto option may improve performance for read-
only mounts, but should be used only if the data on the
server changes only occasionally. The DATA AND METADATA
COHERENCE section discusses the behavior of this option
in more detail.
acl / noacl Selects whether to use the NFSACL sideband protocol on
this mount point. The NFSACL sideband protocol is a
proprietary protocol implemented in Solaris that manages
Access Control Lists. NFSACL was never made a standard
part of the NFS protocol specification.
If neither acl nor noacl option is specified, the NFS
client negotiates with the server to see if the NFSACL
protocol is supported, and uses it if the server
supports it. Disabling the NFSACL sideband protocol may
be necessary if the negotiation causes problems on the
client or server. Refer to the SECURITY CONSIDERATIONS
section for more details.
rdirplus / nordirplus
Selects whether to use NFS version 3 READDIRPLUS
requests. If this option is not specified, the NFS
client uses READDIRPLUS requests on NFS version 3 mounts
to read small directories. Some applications perform
better if the client uses only READDIR requests for all
directories.
Valid options for the nfs4 file system type
Use these options, along with the options in the first subsection
above, for mounting the nfs4 file system type.
proto=netid The transport protocol name and protocol family the NFS
client uses to transmit requests to the NFS server for
this mount point. If an NFS server has both an IPv4 and
an IPv6 address, using a specific netid will force the
use of IPv4 or IPv6 networking to communicate with that
server.
If support for TI-RPC is built into the mount.nfs
command, netid is a valid netid listed in
/etc/netconfig. Otherwise, netid is one of "tcp" or
"udp," and only IPv4 may be used.
All NFS version 4 servers are required to support TCP,
so if this mount option is not specified, the NFS
version 4 client uses the TCP protocol. Refer to the
TRANSPORT METHODS section for more details.
port=n The numeric value of the server’s NFS service port. If
the server’s NFS service is not available on the
specified port, the mount request fails.
If this mount option is not specified, the NFS client
uses the standard NFS port number of 2049 without first
checking the server’s rpcbind service. This allows an
NFS version 4 client to contact an NFS version 4 server
through a firewall that may block rpcbind requests.
If the specified port value is 0, then the NFS client
uses the NFS service port number advertised by the
server’s rpcbind service. The mount request fails if
the server’s rpcbind service is not available, the
server’s NFS service is not registered with its rpcbind
service, or the server’s NFS service is not available on
the advertised port.
intr / nointr Selects whether to allow signals to interrupt file
operations on this mount point. If neither option is
specified (or if intr is specified), system calls return
EINTR if an in-progress NFS operation is interrupted by
a signal. If nointr is specified, signals do not
interrupt NFS operations.
Using the intr option is preferred to using the soft
option because it is significantly less likely to result
in data corruption.
The intr / nointr mount option is deprecated after
kernel 2.6.25. Only SIGKILL can interrupt a pending NFS
operation on these kernels, and if specified, this mount
option is ignored to provide backwards compatibility
with older kernels.
cto / nocto Selects whether to use close-to-open cache coherence
semantics for NFS directories on this mount point. If
neither cto nor nocto is specified, the default is to
use close-to-open cache coherence semantics for
directories.
File data caching behavior is not affected by this
option. The DATA AND METADATA COHERENCE section
discusses the behavior of this option in more detail.
clientaddr=n.n.n.n
Specifies a single IPv4 address (in dotted-quad form),
or a non-link-local IPv6 address, that the NFS client
advertises to allow servers to perform NFS version 4
callback requests against files on this mount point. If
the server is unable to establish callback connections
to clients, performance may degrade, or accesses to
files may temporarily hang.
If this option is not specified, the mount(8) command
attempts to discover an appropriate callback address
automatically. The automatic discovery process is not
perfect, however. In the presence of multiple client
network interfaces, special routing policies, or
atypical network topologies, the exact address to use
for callbacks may be nontrivial to determine.
MOUNT CONFIGURATION FILE
If the mount command is configured to do so, all of the mount options
described in the previous section can also be configured in the
/etc/nfsmount.conf file. See nfsmount.conf(5) for details.
EXAMPLES
To mount an export using NFS version 2, use the nfs file system type
and specify the nfsvers=2 mount option. To mount using NFS version 3,
use the nfs file system type and specify the nfsvers=3 mount option.
To mount using NFS version 4, use the nfs4 file system type. The
nfsvers mount option is not supported for the nfs4 file system type.
The following example from an /etc/fstab file causes the mount command
to negotiate reasonable defaults for NFS behavior.
server:/export /mnt nfs defaults 0 0
Here is an example from an /etc/fstab file for an NFS version 2 mount
over UDP.
server:/export /mnt nfs nfsvers=2,proto=udp 0 0
Try this example to mount using NFS version 4 over TCP with Kerberos 5
mutual authentication.
server:/export /mnt nfs4 sec=krb5 0 0
This example can be used to mount /usr over NFS.
server:/export /usr nfs ro,nolock,nocto,actimeo=3600 0 0
This example shows how to mount an NFS server using a raw IPv6 link-
local address.
[fe80::215:c5ff:fb3e:e2b1%eth0]:/export /mnt nfs defaults 0 0
TRANSPORT METHODS
NFS clients send requests to NFS servers via Remote Procedure Calls, or
RPCs. The RPC client discovers remote service endpoints automatically,
handles per-request authentication, adjusts request parameters for
different byte endianness on client and server, and retransmits
requests that may have been lost by the network or server. RPC
requests and replies flow over a network transport.
In most cases, the mount(8) command, NFS client, and NFS server can
automatically negotiate proper transport and data transfer size
settings for a mount point. In some cases, however, it pays to specify
these settings explicitly using mount options.
Traditionally, NFS clients used the UDP transport exclusively for
transmitting requests to servers. Though its implementation is simple,
NFS over UDP has many limitations that prevent smooth operation and
good performance in some common deployment environments. Even an
insignificant packet loss rate results in the loss of whole NFS
requests; as such, retransmit timeouts are usually in the subsecond
range to allow clients to recover quickly from dropped requests, but
this can result in extraneous network traffic and server load.
However, UDP can be quite effective in specialized settings where the
networks MTU is large relative to NFSs data transfer size (such as
network environments that enable jumbo Ethernet frames). In such
environments, trimming the rsize and wsize settings so that each NFS
read or write request fits in just a few network frames (or even in a
single frame) is advised. This reduces the probability that the loss
of a single MTU-sized network frame results in the loss of an entire
large read or write request.
TCP is the default transport protocol used for all modern NFS
implementations. It performs well in almost every conceivable network
environment and provides excellent guarantees against data corruption
caused by network unreliability. TCP is often a requirement for
mounting a server through a network firewall.
Under normal circumstances, networks drop packets much more frequently
than NFS servers drop requests. As such, an aggressive retransmit
timeout setting for NFS over TCP is unnecessary. Typical timeout
settings for NFS over TCP are between one and ten minutes. After the
client exhausts its retransmits (the value of the retrans mount
option), it assumes a network partition has occurred, and attempts to
reconnect to the server on a fresh socket. Since TCP itself makes
network data transfer reliable, rsize and wsize can safely be allowed
to default to the largest values supported by both client and server,
independent of the network’s MTU size.
Using the mountproto mount option
This section applies only to NFS version 2 and version 3 mounts since
NFS version 4 does not use a separate protocol for mount requests.
The Linux NFS client can use a different transport for contacting an
NFS server’s rpcbind service, its mountd service, its Network Lock
Manager (NLM) service, and its NFS service. The exact transports
employed by the Linux NFS client for each mount point depends on the
settings of the transport mount options, which include proto,
mountproto, udp, and tcp.
The client sends Network Status Manager (NSM) notifications via UDP no
matter what transport options are specified, but listens for server NSM
notifications on both UDP and TCP. The NFS Access Control List
(NFSACL) protocol shares the same transport as the main NFS service.
If no transport options are specified, the Linux NFS client uses UDP to
contact the server’s mountd service, and TCP to contact its NLM and NFS
services by default.
If the server does not support these transports for these services, the
mount(8) command attempts to discover what the server supports, and
then retries the mount request once using the discovered transports.
If the server does not advertise any transport supported by the client
or is misconfigured, the mount request fails. If the bg option is in
effect, the mount command backgrounds itself and continues to attempt
the specified mount request.
When the proto option, the udp option, or the tcp option is specified
but the mountproto option is not, the specified transport is used to
contact both the server’s mountd service and for the NLM and NFS
services.
If the mountproto option is specified but none of the proto, udp or tcp
options are specified, then the specified transport is used for the
initial mountd request, but the mount command attempts to discover what
the server supports for the NFS protocol, preferring TCP if both
transports are supported.
If both the mountproto and proto (or udp or tcp) options are specified,
then the transport specified by the mountproto option is used for the
initial mountd request, and the transport specified by the proto option
(or the udp or tcp options) is used for NFS, no matter what order these
options appear. No automatic service discovery is performed if these
options are specified.
If any of the proto, udp, tcp, or mountproto options are specified more
than once on the same mount command line, then the value of the
rightmost instance of each of these options takes effect.
DATA AND METADATA COHERENCE
Some modern cluster file systems provide perfect cache coherence among
their clients. Perfect cache coherence among disparate NFS clients is
expensive to achieve, especially on wide area networks. As such, NFS
settles for weaker cache coherence that satisfies the requirements of
most file sharing types. Normally, file sharing is completely
sequential: first client A opens a file, writes something to it, then
closes it; then client B opens the same file, and reads the changes.
Close-to-open cache consistency
When an application opens a file stored on an NFS server, the NFS
client checks that it still exists on the server and is permitted to
the opener by sending a GETATTR or ACCESS request. When the
application closes the file, the NFS client writes back any pending
changes to the file so that the next opener can view the changes. This
also gives the NFS client an opportunity to report any server write
errors to the application via the return code from close(2). The
behavior of checking at open time and flushing at close time is
referred to as close-to-open cache consistency.
Weak cache consistency
There are still opportunities for a client’s data cache to contain
stale data. The NFS version 3 protocol introduced "weak cache
consistency" (also known as WCC) which provides a way of efficiently
checking a file’s attributes before and after a single request. This
allows a client to help identify changes that could have been made by
other clients.
When a client is using many concurrent operations that update the same
file at the same time (for example, during asynchronous write behind),
it is still difficult to tell whether it was that client’s updates or
some other client’s updates that altered the file.
Attribute caching
Use the noac mount option to achieve attribute cache coherence among
multiple clients. Almost every file system operation checks file
attribute information. The client keeps this information cached for a
period of time to reduce network and server load. When noac is in
effect, a client’s file attribute cache is disabled, so each operation
that needs to check a file’s attributes is forced to go back to the
server. This permits a client to see changes to a file very quickly,
at the cost of many extra network operations.
Be careful not to confuse the noac option with "no data caching." The
noac mount option prevents the client from caching file metadata, but
there are still races that may result in data cache incoherence between
client and server.
The NFS protocol is not designed to support true cluster file system
cache coherence without some type of application serialization. If
absolute cache coherence among clients is required, applications should
use file locking. Alternatively, applications can also open their files
with the O_DIRECT flag to disable data caching entirely.
Directory entry caching
The Linux NFS client caches the result of all NFS LOOKUP requests. If
the requested directory entry exists on the server, the result is
referred to as a positive lookup result. If the requested directory
entry does not exist on the server (that is, the server returned
ENOENT), the result is referred to as negative lookup result.
To detect when directory entries have been added or removed on the
server, the Linux NFS client watches a directory’s mtime. If the
client detects a change in a directory’s mtime, the client drops all
cached LOOKUP results for that directory. Since the directory’s mtime
is a cached attribute, it may take some time before a client notices it
has changed. See the descriptions of the acdirmin, acdirmax, and noac
mount options for more information about how long a directory’s mtime
is cached.
Caching directory entries improves the performance of applications that
do not share files with applications on other clients. Using cached
information about directories can interfere with applications that run
concurrently on multiple clients and need to detect the creation or
removal of files quickly, however. The lookupcache mount option allows
some tuning of directory entry caching behavior.
Before kernel release 2.6.28, the Linux NFS client tracked only
positive lookup results. This permitted applications to detect new
directory entries created by other clients quickly while still
providing some of the performance benefits of caching. If an
application depends on the previous lookup caching behavior of the
Linux NFS client, you can use lookupcache=positive.
If the client ignores its cache and validates every application lookup
request with the server, that client can immediately detect when a new
directory entry has been either created or removed by another client.
You can specify this behavior using lookupcache=none. The extra NFS
requests needed if the client does not cache directory entries can
exact a performance penalty. Disabling lookup caching should result in
less of a performance penalty than using noac, and has no effect on how
the NFS client caches the attributes of files.
The sync mount option
The NFS client treats the sync mount option differently than some other
file systems (refer to mount(8) for a description of the generic sync
and async mount options). If neither sync nor async is specified (or
if the async option is specified), the NFS client delays sending
application writes to the server until any of these events occur:
Memory pressure forces reclamation of system memory resources.
An application flushes file data explicitly with sync(2),
msync(2), or fsync(3).
An application closes a file with close(2).
The file is locked/unlocked via fcntl(2).
In other words, under normal circumstances, data written by an
application may not immediately appear on the server that hosts the
file.
If the sync option is specified on a mount point, any system call that
writes data to files on that mount point causes that data to be flushed
to the server before the system call returns control to user space.
This provides greater data cache coherence among clients, but at a
significant performance cost.
Applications can use the O_SYNC open flag to force application writes
to individual files to go to the server immediately without the use of
the sync mount option.
Using file locks with NFS
The Network Lock Manager protocol is a separate sideband protocol used
to manage file locks in NFS version 2 and version 3. To support lock
recovery after a client or server reboot, a second sideband protocol --
known as the Network Status Manager protocol -- is also required. In
NFS version 4, file locking is supported directly in the main NFS
protocol, and the NLM and NSM sideband protocols are not used.
In most cases, NLM and NSM services are started automatically, and no
extra configuration is required. Configure all NFS clients with fully-
qualified domain names to ensure that NFS servers can find clients to
notify them of server reboots.
NLM supports advisory file locks only. To lock NFS files, use fcntl(2)
with the F_GETLK and F_SETLK commands. The NFS client converts file
locks obtained via flock(2) to advisory locks.
When mounting servers that do not support the NLM protocol, or when
mounting an NFS server through a firewall that blocks the NLM service
port, specify the nolock mount option. NLM locking must be disabled
with the nolock option when using NFS to mount /var because /var
contains files used by the NLM implementation on Linux.
Specifying the nolock option may also be advised to improve the
performance of a proprietary application which runs on a single client
and uses file locks extensively.
NFS version 4 caching features
The data and metadata caching behavior of NFS version 4 clients is
similar to that of earlier versions. However, NFS version 4 adds two
features that improve cache behavior: change attributes and file
delegation.
The change attribute is a new part of NFS file and directory metadata
which tracks data changes. It replaces the use of a file’s
modification and change time stamps as a way for clients to validate
the content of their caches. Change attributes are independent of the
time stamp resolution on either the server or client, however.
A file delegation is a contract between an NFS version 4 client and
server that allows the client to treat a file temporarily as if no
other client is accessing it. The server promises to notify the client
(via a callback request) if another client attempts to access that
file. Once a file has been delegated to a client, the client can cache
that file’s data and metadata aggressively without contacting the
server.
File delegations come in two flavors: read and write. A read
delegation means that the server notifies the client about any other
clients that want to write to the file. A write delegation means that
the client gets notified about either read or write accessors.
Servers grant file delegations when a file is opened, and can recall
delegations at any time when another client wants access to the file
that conflicts with any delegations already granted. Delegations on
directories are not supported.
In order to support delegation callback, the server checks the network
return path to the client during the client’s initial contact with the
server. If contact with the client cannot be established, the server
simply does not grant any delegations to that client.
SECURITY CONSIDERATIONS
NFS servers control access to file data, but they depend on their RPC
implementation to provide authentication of NFS requests. Traditional
NFS access control mimics the standard mode bit access control provided
in local file systems. Traditional RPC authentication uses a number to
represent each user (usually the user’s own uid), a number to represent
the user’s group (the user’s gid), and a set of up to 16 auxiliary
group numbers to represent other groups of which the user may be a
member.
Typically, file data and user ID values appear unencrypted (i.e. "in
the clear") on the network. Moreover, NFS versions 2 and 3 use
separate sideband protocols for mounting, locking and unlocking files,
and reporting system status of clients and servers. These auxiliary
protocols use no authentication.
In addition to combining these sideband protocols with the main NFS
protocol, NFS version 4 introduces more advanced forms of access
control, authentication, and in-transit data protection. The NFS
version 4 specification mandates NFSv4 ACLs, RPCGSS authentication, and
RPCGSS security flavors that provide per-RPC integrity checking and
encryption. Because NFS version 4 combines the function of the
sideband protocols into the main NFS protocol, the new security
features apply to all NFS version 4 operations including mounting, file
locking, and so on. RPCGSS authentication can also be used with NFS
versions 2 and 3, but does not protect their sideband protocols.
The sec mount option specifies the RPCGSS security mode that is in
effect on a given NFS mount point. Specifying sec=krb5 provides
cryptographic proof of a user’s identity in each RPC request. This
provides strong verification of the identity of users accessing data on
the server. Note that additional configuration besides adding this
mount option is required in order to enable Kerberos security. Refer
to the rpc.gssd(8) man page for details.
Two additional flavors of Kerberos security are supported: krb5i and
krb5p. The krb5i security flavor provides a cryptographically strong
guarantee that the data in each RPC request has not been tampered with.
The krb5p security flavor encrypts every RPC request to prevent data
exposure during network transit; however, expect some performance
impact when using integrity checking or encryption. Similar support
for other forms of cryptographic security (such as lipkey and SPKM3) is
also available.
The NFS version 4 protocol allows clients and servers to negotiate
among multiple security flavors during mount processing. However,
Linux does not yet implement such negotiation. The Linux client
specifies a single security flavor at mount time which remains in
effect for the lifetime of the mount. If the server does not support
this flavor, the initial mount request is rejected by the server.
Using non-privileged source ports
NFS clients usually communicate with NFS servers via network sockets.
Each end of a socket is assigned a port value, which is simply a number
between 1 and 65535 that distinguishes socket endpoints at the same IP
address. A socket is uniquely defined by a tuple that includes the
transport protocol (TCP or UDP) and the port values and IP addresses of
both endpoints.
The NFS client can choose any source port value for its sockets, but
usually chooses a privileged port. A privileged port is a port value
less than 1024. Only a process with root privileges may create a
socket with a privileged source port.
The exact range of privileged source ports that can be chosen is set by
a pair of sysctls to avoid choosing a well-known port, such as the port
used by ssh. This means the number of source ports available for the
NFS client, and therefore the number of socket connections that can be
used at the same time, is practically limited to only a few hundred.
As described above, the traditional default NFS authentication scheme,
known as AUTH_SYS, relies on sending local UID and GID numbers to
identify users making NFS requests. An NFS server assumes that if a
connection comes from a privileged port, the UID and GID numbers in the
NFS requests on this connection have been verified by the client’s
kernel or some other local authority. This is an easy system to spoof,
but on a trusted physical network between trusted hosts, it is entirely
adequate.
Roughly speaking, one socket is used for each NFS mount point. If a
client could use non-privileged source ports as well, the number of
sockets allowed, and thus the maximum number of concurrent mount
points, would be much larger.
Using non-privileged source ports may compromise server security
somewhat, since any user on AUTH_SYS mount points can now pretend to be
any other when making NFS requests. Thus NFS servers do not support
this by default. They explicitly allow it usually via an export
option.
To retain good security while allowing as many mount points as
possible, it is best to allow non-privileged client connections only if
the server and client both require strong authentication, such as
Kerberos.
Mounting through a firewall
A firewall may reside between an NFS client and server, or the client
or server may block some of its own ports via IP filter rules. It is
still possible to mount an NFS server through a firewall, though some
of the mount(8) command’s automatic service endpoint discovery
mechanisms may not work; this requires you to provide specific endpoint
details via NFS mount options.
NFS servers normally run a portmapper or rpcbind daemon to advertise
their service endpoints to clients. Clients use the rpcbind daemon to
determine:
What network port each RPC-based service is using
What transport protocols each RPC-based service supports
The rpcbind daemon uses a well-known port number (111) to help clients
find a service endpoint. Although NFS often uses a standard port
number (2049), auxiliary services such as the NLM service can choose
any unused port number at random.
Common firewall configurations block the well-known rpcbind port. In
the absense of an rpcbind service, the server administrator fixes the
port number of NFS-related services so that the firewall can allow
access to specific NFS service ports. Client administrators then
specify the port number for the mountd service via the mount(8)
command’s mountport option. It may also be necessary to enforce the
use of TCP or UDP if the firewall blocks one of those transports.
NFS Access Control Lists
Solaris allows NFS version 3 clients direct access to POSIX Access
Control Lists stored in its local file systems. This proprietary
sideband protocol, known as NFSACL, provides richer access control than
mode bits. Linux implements this protocol for compatibility with the
Solaris NFS implementation. The NFSACL protocol never became a
standard part of the NFS version 3 specification, however.
The NFS version 4 specification mandates a new version of Access
Control Lists that are semantically richer than POSIX ACLs. NFS
version 4 ACLs are not fully compatible with POSIX ACLs; as such, some
translation between the two is required in an environment that mixes
POSIX ACLs and NFS version 4.
FILES
/etc/fstab file system table
BUGS
The generic remount option is not fully supported. Generic options,
such as rw and ro can be modified using the remount option, but NFS-
specific options are not all supported. The underlying transport or
NFS version cannot be changed by a remount, for example. Performing a
remount on an NFS file system mounted with the noac option may have
unintended consequences. The noac option is a mixture of a generic
option, sync, and an NFS-specific option actimeo=0.
Before 2.4.7, the Linux NFS client did not support NFS over TCP.
Before 2.4.20, the Linux NFS client used a heuristic to determine
whether cached file data was still valid rather than using the standard
close-to-open cache coherency method described above.
Starting with 2.4.22, the Linux NFS client employs a Van Jacobsen-based
RTT estimator to determine retransmit timeout values when using NFS
over UDP.
Before 2.6.0, the Linux NFS client did not support NFS version 4.
Before 2.6.8, the Linux NFS client used only synchronous reads and
writes when the rsize and wsize settings were smaller than the system’s
page size.
The Linux NFS client does not yet support certain optional features of
the NFS version 4 protocol, such as security negotiation, server
referrals, and named attributes.
SEE ALSO
fstab(5), mount(8), umount(8), mount.nfs(5), umount.nfs(5), exports(5),
netconfig(5), ipv6(7), nfsd(8), sm-notify(8), rpc.statd(8),
rpc.idmapd(8), rpc.gssd(8), rpc.svcgssd(8), kerberos(1)
RFC 768 for the UDP specification.
RFC 793 for the TCP specification.
RFC 1094 for the NFS version 2 specification.
RFC 1813 for the NFS version 3 specification.
RFC 1832 for the XDR specification.
RFC 1833 for the RPC bind specification.
RFC 2203 for the RPCSEC GSS API protocol specification.
RFC 3530 for the NFS version 4 specification.
2 November 2007