NAME
tcpslice - extract pieces of and/or merge together tcpdump files
SYNOPSIS
tcpslice [ -DdlRrt ] [ -w file ]
[ start-time [ end-time ] ] file ...
DESCRIPTION
Tcpslice is a program for extracting portions of packet-trace files
generated using tcpdump(1)’s -w flag. It can also be used to merge
together several such files, as discussed below.
The basic operation of tcpslice is to copy to stdout all packets from
its input file(s) whose timestamps fall within a given range. The
starting and ending times of the range may be specified on the command
line. All ranges are inclusive. The starting time defaults to the
earliest time of the first packet in any of the input files; we call
this the first time. The ending time defaults to ten years after the
starting time. Thus, the command tcpslice trace-file simply copies
trace-file to stdout (assuming the file does not include more than ten
years’ worth of data).
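The selection rule described above can be modeled in a few lines of
Python. This is only an illustrative sketch, not tcpslice's
implementation: packets are represented as hypothetical (timestamp,
payload) tuples, select_range is a made-up name, and "ten years" is
approximated as 3650 days.

```python
TEN_YEARS = 10 * 365 * 24 * 3600  # rough stand-in for the default span, in seconds

def select_range(packets, start=None, end=None):
    """Return the packets whose timestamps lie in [start, end], inclusive."""
    if not packets:
        return []
    if start is None:
        # Default start: the earliest timestamp present (the "first time").
        start = min(ts for ts, _ in packets)
    if end is None:
        # Default end: ten years after the starting time.
        end = start + TEN_YEARS
    return [(ts, data) for ts, data in packets if start <= ts <= end]

packets = [(100.0, b"a"), (200.0, b"b"), (300.0, b"c")]
whole = select_range(packets)                 # no range given: everything is copied
tail = select_range(packets, 200.0, 300.0)    # both endpoints are included
```

With no range given, every packet is returned, which mirrors the
behavior of tcpslice trace-file described above.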
There are a number of ways to specify times. The first is using Unix
timestamps of the form sssssssss.uuuuuu (this is the format specified
by tcpdump’s -tt flag). For example, 654321098.7654 specifies 38
seconds and 765,400 microseconds after 8:51PM PDT, Sept. 25, 1990.
All examples in this manual are given for PDT times, but when
displaying times and interpreting times symbolically as discussed
below, tcpslice uses the local timezone, regardless of the timezone in
which the tcpdump file was generated. The daylight-savings setting
used is that which is appropriate for the local timezone at the date in
question. For example, times associated with summer months will
usually include daylight-savings effects, and those with winter months
will not.
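The example timestamp can be checked with Python's standard datetime
module. The sketch below assumes a fixed UTC-7 offset standing in for
PDT; tcpslice itself uses the local timezone, so this is only a way to
verify the arithmetic, and to_pdt is a name invented for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset used only for this example; not how tcpslice picks a timezone.
PDT = timezone(timedelta(hours=-7), "PDT")

def to_pdt(seconds, microseconds):
    """Convert a sssssssss.uuuuuu timestamp, split into its parts, to PDT."""
    return datetime.fromtimestamp(seconds, tz=PDT) + timedelta(microseconds=microseconds)

# 654321098.7654 means 654321098 seconds and 765400 microseconds.
stamp = to_pdt(654321098, 765400)
print(stamp.isoformat())  # 1990-09-25T20:51:38.765400-07:00
```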
Times may also be specified relative to either the first time (when
specifying a starting time) or the starting time (when specifying an
ending time) by preceding a numeric value in seconds with a ‘+’. For
example, a starting time of +200 indicates 200 seconds after the first
time, and the two arguments +200 +300 indicate from 200 seconds after
the first time through 500 seconds after the first time.
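The rule for ‘+’ arguments can be written out as a small worked
example. The function names below are hypothetical, and the sketch
covers only plain numeric values in seconds.

```python
def resolve(spec, base):
    """'+N' means N seconds after base; anything else is an absolute timestamp."""
    if spec.startswith("+"):
        return base + float(spec[1:])
    return float(spec)

def resolve_range(start_spec, end_spec, first_time):
    start = resolve(start_spec, first_time)  # start is relative to the first time
    end = resolve(end_spec, start)           # end is relative to the start
    return start, end

# With a first time of 1000, "+200 +300" covers 1200 through 1500.
start, end = resolve_range("+200", "+300", 1000.0)
```

Because the ending time is measured from the starting time, +300 lands
500 seconds after the first time, as stated above.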
Times may also be specified in terms of years (y), months (m), days
(d), hours (h), minutes (m), seconds (s), and microseconds (u). For
example, the Unix timestamp 654321098.7654 discussed above could also
be expressed as 1990y9m25d20h51m38s765400u. Two- or four-digit years
may be used; two-digit years cover 1970 through 2069.
When specifying times using this style, fields that are omitted default
as follows. If the omitted field is a unit greater than that of the
first specified field, then its value defaults to the corresponding
value taken from either the first time (if the starting time is being
specified) or the starting time (if the ending time is being
specified). If the omitted field is a unit less than that of the first
specified field, then it defaults to zero. For example, suppose that
the input file has a first time of the Unix timestamp mentioned above,
i.e., 38 seconds and 765,400 microseconds after 8:51PM PDT, Sept. 25,
1990. To specify 9:36PM PDT (exactly) on the same date we could use
21h36m. To specify a range from 9:36PM PDT through 1:54AM PDT the next
day we could use 21h36m 26d1h54m.
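The defaulting rule can be sketched as follows. This is an
illustration under stated assumptions, not tcpslice's code: the spec
is assumed to be already split into a dictionary of named fields,
fill_defaults is a hypothetical helper, and omitted month or day
fields fall back to 1 rather than 0 because Python's datetime has no
month or day zero.

```python
from datetime import datetime

UNITS = ["year", "month", "day", "hour", "minute", "second", "microsecond"]

def fill_defaults(fields, base):
    """Complete a partial time from base, following the rule described above."""
    first = min(UNITS.index(name) for name in fields)  # largest unit given
    values = {}
    for i, name in enumerate(UNITS):
        if name in fields:
            values[name] = fields[name]
        elif i < first:
            values[name] = getattr(base, name)  # larger unit: inherit from base
        else:
            # Smaller unit: zero (1 for month/day, an assumption of this sketch).
            values[name] = 1 if name in ("month", "day") else 0
    return datetime(**values)

first_time = datetime(1990, 9, 25, 20, 51, 38, 765400)
start = fill_defaults({"hour": 21, "minute": 36}, first_time)       # 21h36m
end = fill_defaults({"day": 26, "hour": 1, "minute": 54}, start)    # 26d1h54m
```

Here start is 9:36PM on Sept. 25 and end is 1:54AM on Sept. 26,
matching the range given in the text.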
Relative times can also be specified when using the ymdhmsu format.
Omitted fields then default to 0 if the unit of the field is greater
than that of the first specified field, and to the corresponding value
taken from either the first time or the starting time if the omitted
field’s unit is less than that of the first specified field. Given a
first time of the Unix timestamp mentioned above, 22h +1h10m specifies
a range from 10:00PM PDT on that date through 11:10PM PDT, and +1h
+1h10m specifies a range from 38.7654 seconds after 9:51PM PDT through
38.7654 seconds after 11:01PM PDT. The first hour of the file could be
extracted using +0 +1h.
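One way to read the relative rule is as adding a duration to the base
time. The sketch below reproduces the two examples above using
Python's timedelta. It is an assumption-laden illustration: only days
and smaller units are handled, and add_relative is an invented name.

```python
from datetime import datetime, timedelta

def add_relative(base, days=0, hours=0, minutes=0, seconds=0, microseconds=0):
    """Return base plus the given offset; unspecified units contribute nothing."""
    return base + timedelta(days=days, hours=hours, minutes=minutes,
                            seconds=seconds, microseconds=microseconds)

first_time = datetime(1990, 9, 25, 20, 51, 38, 765400)

# "+1h +1h10m": start is measured from the first time, end from the start.
start_a = add_relative(first_time, hours=1)
end_a = add_relative(start_a, hours=1, minutes=10)

# "22h +1h10m": an absolute start of 10:00PM, then a relative end.
start_b = datetime(1990, 9, 25, 22, 0, 0)
end_b = add_relative(start_b, hours=1, minutes=10)
```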
Note that with the ymdhmsu format there is an ambiguity between using m
for ‘month’ or for ‘minute’. The ambiguity is resolved as follows: if
an m field is followed by a d field then it is interpreted as
specifying months; otherwise it specifies minutes.
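The disambiguation rule for m is easy to express in code. The parser
below is a minimal sketch, not tcpslice's parser: parse_fields and the
field names are hypothetical, and neither a leading ‘+’ nor two-digit
years are handled.

```python
import re

FIELD = re.compile(r"(\d+)([ymdhsu])")
NAMES = {"y": "year", "d": "day", "h": "hour", "s": "second", "u": "microsecond"}

def parse_fields(spec):
    """Split a ymdhmsu string into named fields, resolving the two meanings of m."""
    tokens = FIELD.findall(spec)
    if not tokens or "".join(n + u for n, u in tokens) != spec:
        raise ValueError(f"not a ymdhmsu time: {spec!r}")
    fields = {}
    for i, (number, unit) in enumerate(tokens):
        if unit == "m":
            # An m directly followed by a d field is a month; otherwise minutes.
            followed_by_d = i + 1 < len(tokens) and tokens[i + 1][1] == "d"
            name = "month" if followed_by_d else "minute"
        else:
            name = NAMES[unit]
        fields[name] = int(number)
    return fields

full = parse_fields("1990y9m25d20h51m38s765400u")
short = parse_fields("21h36m")
```

In the full example, 9m is followed by 25d and so becomes the month,
while 51m is followed by 38s and becomes the minute.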
If more than one input file is specified then tcpslice merges the
packets from the various input files into the single output file.
Normally, this merge is done based on the value of the timestamps in
the packets in the individual files. (Tcpslice assumes that within
each input file, packets are in timestamp order.) If the -l option is
used, the value used for ordering is the timestamp of a given packet
minus the timestamp of the first packet in the input file in which the
given packet occurs.
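The two merge orders can be illustrated with Python's heapq.merge.
This is a sketch only: traces are modeled as hypothetical in-memory
lists of (timestamp, payload) tuples, each already in timestamp order,
and in relative mode the output timestamps are rebased onto the first
time as the -l entry under OPTIONS describes.

```python
import heapq

def merge_traces(traces, relative=False):
    """Merge sorted traces by absolute time, or by time since each trace began."""
    traces = [t for t in traces if t]
    if not traces:
        return []
    first_time = min(t[0][0] for t in traces)

    def rebased(trace):
        # In relative mode, shift each trace so that it begins at first_time.
        shift = first_time - trace[0][0] if relative else 0
        for ts, payload in trace:
            yield (ts + shift, payload)

    return list(heapq.merge(*(rebased(t) for t in traces), key=lambda rec: rec[0]))

a = [(100, "a1"), (105, "a2")]
b = [(200, "b1"), (203, "b2")]
absolute = merge_traces([a, b])
relative = merge_traces([a, b], relative=True)
```

With absolute ordering the second trace simply follows the first; with
relative ordering the two traces are interleaved as if they had
started at the same moment.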
When merging files, by default tcpslice will discard any duplicate
packet it finds in more than one file. A duplicate is a packet that
has an identical timestamp (either relative or absolute) and identical
packet contents (for as much as was captured) as another packet
previously seen in a different file. Note that it is possible for the
network to generate true replicates of packets, and for systems that
can return the same timestamp for multiple packets, these can be
mistaken for duplicates and discarded. Accordingly, tcpslice will not
discard duplicates in the same trace file. In addition, you can use
the -D option to suppress any discarding of duplicates.
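The duplicate rule, including the exception for repeats within a single
file, can be sketched as below. The record layout (timestamp, payload,
file index) and the function name are assumptions made for the
example; how tcpslice tracks previously seen packets is not described
here.

```python
def drop_cross_file_duplicates(merged):
    """Drop packets already seen, with the same timestamp and contents, in another file."""
    seen = {}   # (timestamp, payload) -> index of the file that first supplied it
    kept = []
    for ts, payload, file_index in merged:
        key = (ts, payload)
        if key in seen and seen[key] != file_index:
            continue  # duplicate of a packet from a different file
        seen.setdefault(key, file_index)
        kept.append((ts, payload, file_index))
    return kept

merged = [
    (1.0, b"x", 0),
    (1.0, b"x", 0),  # repeat within the same file: kept
    (1.0, b"x", 1),  # same packet from another file: dropped
    (2.0, b"y", 1),
]
kept = drop_cross_file_duplicates(merged)
```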
A different issue arises if a file contains timestamps that skip
backwards. tcpslice will include these in the output, even if they
precede the minimum time requested. There should probably be an option
to suppress these.
Another problem relating to backwards timestamps is that tcpslice uses
random access to seek through a file looking for packets corresponding
to the desired range of time. While doing so leads to a major
performance benefit for very large trace files, it also means that in
the presence of backwards timestamps tcpslice can fail to find the true
earliest occurrence of a packet matching the time interval criteria.
There should probably be an option to specify not to use random access
but just read the file linearly.
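The effect of backwards timestamps on a seeking search can be
demonstrated with a small example. Here a binary search from Python's
bisect module stands in for tcpslice's random-access search, which is
an assumption made purely for illustration; the function names are
invented.

```python
from bisect import bisect_left

def first_match_by_seek(timestamps, start):
    """Binary search; only reliable when timestamps never decrease."""
    i = bisect_left(timestamps, start)
    return i if i < len(timestamps) else None

def first_match_by_scan(timestamps, start):
    """Linear read; finds the true earliest match regardless of ordering."""
    for i, ts in enumerate(timestamps):
        if ts >= start:
            return i
    return None

timestamps = [10, 50, 20, 30, 40, 60, 70]  # the file skips backwards after 50
by_seek = first_match_by_seek(timestamps, 45)  # index 5 (timestamp 60)
by_scan = first_match_by_scan(timestamps, 45)  # index 1 (timestamp 50)
```

The seeking search skips the packet stamped 50 because it probes only
a few positions and assumes everything before them is earlier in time.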
OPTIONS
If any of -R, -r or -t are specified then tcpslice reports the
timestamps of the first and last packets in each input file and exits.
Only one of these three options may be specified.
-D Do not discard duplicate packets seen when merging multiple
trace files.
-d Dump the start and end times specified by the given range and
exit. This option is useful for checking that the given range
actually specifies the times you think it does. If one of -R,
-r or -t has been specified then the times are dumped in the
corresponding format; otherwise, raw format (-R) is used.
-l When merging more than one file, merge on the basis of relative
time, rather than absolute time. Normally, when files are merged,
packets are ordered by their absolute timestamps. With
-l packets are merged based on the relative time between the
start of the file in which the packet is found and the timestamp
of the packet itself. The timestamp of packets in the output
file is calculated as the relative time for the packet within
its file plus first time.
-R Dump the timestamps of the first and last packets in each input
file as raw timestamps (i.e., in the form sssssssss.uuuuuu).
-r Same as -R except the timestamps are dumped in human-readable
format, similar to that used by date(1).
-t Same as -R except the timestamps are dumped in tcpslice format,
i.e., in the ymdhmsu format discussed above.
-w Direct the output to file rather than stdout.
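The raw and ymdhmsu notations can be generated from the same
timestamp, as the following sketch shows. It is not tcpslice's output
code: the function names are hypothetical, a fixed UTC-7 offset stands
in for the local timezone, and the absence of zero padding in the
ymdhmsu form is inferred only from the single example given in the
DESCRIPTION. The date(1)-like format of -r is left out because its
exact layout is not specified here.

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")  # assumed timezone for the example

def raw_format(seconds, microseconds):
    """sssssssss.uuuuuu, with the microseconds padded to six digits."""
    return f"{seconds}.{microseconds:06d}"

def ymdhmsu_format(seconds, microseconds, tz=PDT):
    """The ymdhmsu notation described in the DESCRIPTION."""
    t = datetime.fromtimestamp(seconds, tz=tz)
    return (f"{t.year}y{t.month}m{t.day}d"
            f"{t.hour}h{t.minute}m{t.second}s{microseconds}u")

raw = raw_format(654321098, 765400)
sym = ymdhmsu_format(654321098, 765400)
```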
SEE ALSO
tcpdump(1)
AUTHOR
Vern Paxson, of Lawrence Berkeley Laboratory, University of California,
Berkeley, CA.
The current version is available via anonymous ftp:
ftp://ftp.ee.lbl.gov/tcpslice.tar.Z
BUGS
Please send bug reports to tcpslice@ee.lbl.gov.
An input filename that begins with a digit or a ‘+’ can be confused
with a start/end time. Such filenames can be specified with a leading
‘./’; for example, specify the file ‘04Jul76.trace’ as
‘./04Jul76.trace’.
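A wrapper script could apply this workaround automatically. The helper
below is a hypothetical sketch of that idea and is not part of
tcpslice.

```python
def protect_filename(name):
    """Prefix './' when a filename could be mistaken for a start or end time."""
    if name[:1].isdigit() or name.startswith("+"):
        return "./" + name
    return name

safe = protect_filename("04Jul76.trace")   # './04Jul76.trace'
plain = protect_filename("trace.pcap")     # unchanged
```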
tcpslice cannot read its input from stdin, since it uses random-access
to rummage through its input files.
tcpslice refuses to write to its output if it is a terminal (as
indicated by isatty(3)). This is not a bug but a feature, to prevent
it from spraying binary data to the user’s terminal. Note that this
means you must either redirect stdout or specify an output file via -w.
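The same kind of guard is simple to reproduce in a script that emits
binary data. The sketch below uses the isatty() method of Python
stream objects; the function name and the error text are invented for
the example and are not tcpslice's.

```python
import io

def check_output(stream):
    """Refuse to hand back a stream that is connected to a terminal."""
    if stream.isatty():
        # Hypothetical message; tcpslice's own wording may differ.
        raise SystemExit("refusing to write binary data to a terminal")
    return stream

# An in-memory buffer is not a terminal, so it passes the check.
out = check_output(io.BytesIO())
out.write(b"\x00\x01")
```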
tcpslice will not work properly on tcpdump files that span more than
one year, that contain portions of packets whose original length was
more than 65,535 bytes, or that contain fewer than two packets. Such
files result in the error message: ‘couldn’t find final
packet in file’. These problems are due to the interpolation scheme
used by tcpslice to greatly speed up its processing when dealing with
large trace files. Note that tcpslice can efficiently extract slices
from the middle of trace files of any size, and can also work with
truncated trace files (i.e., the final packet in the file is only
partially present, typically due to tcpdump being ungracefully killed).
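The text says only that an interpolation scheme is used; the details
are not given. As a rough illustration of the general idea, the sketch
below estimates where in a file a target time might lie from the
timestamps and offsets of two known packets. All names and the linear
model are assumptions, not tcpslice's actual algorithm.

```python
def estimate_offset(target, first_ts, last_ts, first_off, last_off):
    """Linearly interpolate a file offset for target between two known packets."""
    if last_ts <= first_ts:
        # With fewer than two distinct packets there is nothing to interpolate.
        raise ValueError("need two packets with increasing timestamps")
    fraction = (target - first_ts) / (last_ts - first_ts)
    fraction = min(1.0, max(0.0, fraction))  # clamp targets outside the file
    return first_off + int(fraction * (last_off - first_off))

# A target halfway between the two timestamps maps to the halfway offset.
guess = estimate_offset(150, 100, 200, 24, 1024)
```

In this simplified model the estimate depends on having both a first
and a final packet, which is consistent with the requirement above
that a file contain at least two packets.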
Adding -l has broken some compatibility with older versions, since
tcpslice now merges its input files, rather than (approximately)
concatenating them together as it did previously.
It would sometimes be convenient if you could specify a clock offset to
use with the -l option.
It would be nice if tcpslice supported more general editing of trace
files.
24 February 2000