NAME
pmccabe - calculate McCabe cyclomatic complexity or non-commented line
counts for C and C++ programs
SYNOPSIS
pmccabe [-bcCdfFntTvV?] [file(s)]
DESCRIPTION
pmccabe processes the named files, or standard input if none are named.
In default mode it calculates statistics including McCabe cyclomatic
complexity for each function. The files are expected to be either C
(ANSI or K&R) or C++.
-? Print an informative usage message.
-v Print column headers.
-V Print pmccabe version number.
De-commenting mode
-d Intended to help count non-commented source lines via something
like:
pmccabe -d *.c | grep -v '^[<blank><tab>]*$' | wc -l
Comments are removed, cpp directives are replaced by cpp, string
literals are replaced by STRINGLITERAL, and character constants are
replaced by CHARLITERAL. The resulting source code is much
easier to parse. This is the first step performed by pmccabe so
that its parser can be simpler.
None of the other options work sensibly with -d.
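The counting half of the pipeline above can be sketched in Python. This is a minimal illustration, not part of pmccabe: the function name count_source_lines and the sample text are hypothetical, and the sample only imitates what de-commented output might look like.

```python
import re

# Matches lines made up only of spaces and tabs (including empty lines),
# mirroring the pattern handed to "grep -v" in the pipeline above.
BLANK_LINE = re.compile(r"^[ \t]*$")


def count_source_lines(decommented_text):
    """Count lines that contain something other than spaces and tabs."""
    return sum(
        1
        for line in decommented_text.splitlines()
        if not BLANK_LINE.match(line)
    )


# Hypothetical fragment imitating "pmccabe -d" output, invented for
# illustration only.
sample = "cpp\n\nint main()\n{\n\t\n    puts(STRINGLITERAL);\n}\n"
print(count_source_lines(sample))  # prints 5
```

Lines holding only the cpp placeholder are counted here, just as they would be by the grep pipeline.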
Line-counting mode
-n Counts non-commented source lines. The output format is
identical to that of the anac program except that column headers
and totals must be requested if desired. If you want column
headers add -v. If you want totals add -t. If all you want is
totals add -T.
Complexity mode (default)
-C Custom output format - don't use it.
-c Report non-commented, non-blank lines per function (and file)
instead of the raw number of lines. Note that pre-processor
directives are NOT counted.
-b Output format compatible with compiler error browsing tools
which understand "classic" compiler errors. Numerical sorting
on this format is possible using:
sort -t% -n -k 2
-t Print column totals. Note the total number of lines is *NOT*
the number of non-commented source lines - it's the same as
would be reported by "wc -l".
-T Print column totals *ONLY*.
-f Include per-file totals along with the per-function totals.
-F Print per-file totals but NOT per-function totals.
Parsing
pmccabe ignores all cpp directives, so it measures the complexity of
the code as written rather than the complexity of the code after the
preprocessor has expanded it. This is especially important because
even simple things like getchar(3) may expand into macros which would
increase the measured complexity.
Output Format
A line is written to standard output for each function found of the
form:
Modified McCabe Cyclomatic Complexity
|   Traditional McCabe Cyclomatic Complexity
|   |   # Statements in function
|   |   |   First line of function
|   |   |   |   # lines in function
|   |   |   |   |   filename(definition line number):function
|   |   |   |   |   |
5   6   11  34  27  gettoken.c(35): matchparen
Column 1 contains cyclomatic complexity calculated by adding 1 (for the
function) to the occurrences of for, if, while, switch, &&, ||, and ?.
Unlike "normal" McCabe cyclomatic complexity, each case in a switch
statement is not counted as additional complexity. This treatment of
switch statements and complexity may be more useful than the "normal"
measure for judging maintenance effort and code difficulty.
Column 2 is the cyclomatic complexity calculated in the "usual" way
with regard to switch statements. Specifically it is calculated as in
column 1 but counting each case rather than the switch and may be more
useful than column 1 for judging testing effort.
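The two counting rules can be approximated with a few lines of Python. This is a rough sketch under stated assumptions, not pmccabe's actual parser: it applies regular expressions to a function body that is assumed to be already free of comments, string literals, and character constants, and the names complexity and sample are hypothetical.

```python
import re

# Tokens that each add 1 to column 1 (the "modified" complexity).
MODIFIED_TOKENS = re.compile(r"\b(?:for|if|while|switch)\b|&&|\|\||\?")
# Column 2 counts each "case" instead of the "switch" itself.
TRADITIONAL_TOKENS = re.compile(r"\b(?:for|if|while|case)\b|&&|\|\||\?")


def complexity(body):
    """Return (modified, traditional) complexity for one function body."""
    modified = 1 + len(MODIFIED_TOKENS.findall(body))
    traditional = 1 + len(TRADITIONAL_TOKENS.findall(body))
    return modified, traditional


# Hypothetical C function used only to exercise the counters.
sample = """
int bucket(int n)
{
    if (n < 0 || n > 9)
        return -1;
    switch (n) {
    case 0:
    case 1:
        return 0;
    case 2:
        return 1;
    }
    return 2;
}
"""
print(complexity(sample))  # prints (4, 6)
```

The switch adds 1 to the first number, while its three case labels add 3 to the second, which is where the two columns diverge.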
Column 3 contains a statement count. It is calculated by adding each
occurrence of for, if, while, switch, ?, and semicolon within the
function. One possible surprise is that for statements have a minimum
statement count of 3. This is realistic since for(A; B; C){...} is
really shorthand for A; while (B) { ... C;}. The number of statements
within a file is the sum of the number of statements for each function
implemented within that file, plus one for each of those functions
(because functions are statements too), plus one for each other file-
scoped statement (usually declarations).
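The per-function statement count can be sketched the same way. Again this is only an approximation under the assumption that the input has already been de-commented; statement_count is a hypothetical name, not something pmccabe provides.

```python
import re

# Each of these tokens contributes one statement to column 3.
STATEMENT_TOKENS = re.compile(r"\b(?:for|if|while|switch)\b|\?|;")


def statement_count(body):
    """Count statement tokens in a function body (without the function itself)."""
    return len(STATEMENT_TOKENS.findall(body))


# A for statement: the keyword plus its two semicolons.
print(statement_count("for (i = 0; i < n; i++) { }"))    # prints 3
# The equivalent while form: two semicolons plus the keyword.
print(statement_count("i = 0; while (i < n) { i++; }"))  # prints 3
```

Both spellings of the loop come out at 3, which matches the reasoning given for the minimum count of a for statement.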
Column 4 contains the first line number in the function. This is not
necessarily the same line on which the function name appears.
Column 5 is the number of lines of the function, from the number in
column 4 through the line containing the closing curly brace.
The final column contains the file name, line number on which the
function name occurs, and the name of the function.
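Tools that consume this output need to split each line into its columns. The following Python sketch shows one way to do it; the FunctionStats type, its field names, and parse_line are hypothetical helpers invented for this example, and the pattern assumes the columns are separated by whitespace.

```python
import re
from typing import NamedTuple


class FunctionStats(NamedTuple):
    modified_complexity: int     # column 1
    traditional_complexity: int  # column 2
    statements: int              # column 3
    first_line: int              # column 4
    line_count: int              # column 5
    filename: str                # final column, before the parenthesis
    definition_line: int         # final column, inside the parentheses
    function: str                # final column, after the colon


LINE_PATTERN = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.+?)\((\d+)\):\s*(.+?)\s*$"
)


def parse_line(line):
    """Split one line of default-mode output into named fields."""
    match = LINE_PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognized output line: %r" % line)
    c1, c2, c3, c4, c5, filename, def_line, function = match.groups()
    return FunctionStats(
        int(c1), int(c2), int(c3), int(c4), int(c5),
        filename, int(def_line), function,
    )


stats = parse_line("5 6 11 34 27 gettoken.c(35): matchparen")
print(stats.function, stats.first_line, stats.line_count)
# prints: matchparen 34 27
```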
APPLICATIONS
The obvious application of pmccabe is illustrated by the following
which gives a list of the "top ten" most complex functions:
pmccabe *.c | sort -nr | head -10
Many files contain more than one C function and sometimes it would be
useful to extract each function separately. matchparen() (see example
output above) can be extracted from gettoken.c by extracting 27 lines
starting with line 34. This can form the basis of tools which operate
on functions instead of files (e.g., use as a front-end for diff(1)).
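The extraction step amounts to slicing the file by columns 4 and 5. A minimal Python sketch follows; extract_function is a hypothetical helper, and a generated list of strings stands in for the real contents of gettoken.c.

```python
def extract_function(source_lines, first_line, line_count):
    """Return the lines of one function, given columns 4 and 5."""
    start = first_line - 1  # column 4 is 1-based, list indexes are 0-based
    return source_lines[start:start + line_count]


# Stand-in for a source file: line N simply holds the text "line N".
fake_file = ["line %d" % n for n in range(1, 101)]

# matchparen: 27 lines starting with line 34.
body = extract_function(fake_file, 34, 27)
print(body[0], "...", body[-1])  # prints: line 34 ... line 60
```

With a real file, source_lines would come from reading the file named in the final column, for example with readlines().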
DIAGNOSTICS
pmccabe returns a nonzero exit status if files could not be opened and
upon encountering some parsing errors.
Error messages are written to standard error and mimic classic C
compiler error messages. They usually explain that the parser is
confused about something.
WARNINGS
pmccabe is confused by unmatched curly braces or parentheses which
sometimes occur with hasty use of cpp directives. In these cases a
diagnostic is printed and the complexity results for the files named
may be unreliable. In most cases the "#ifdef" directives can be modified
so that the curly braces match. Note that if pmccabe is confused by
a cpp directive, most pretty printers will be too. In some cases,
preprocessing with unifdef(1) may be appropriate.
Statement counting could arguably be improved by counting occurrences
of the comma operator, multiple assignments, assignments within
conditional tests, and logical conjunction. However, since there is no
crisp statement definition from the language or from people I've
queried, statement counting will probably not be improved. If you have
a crisp definition I'll be happy to consider it.
Templates cause pmccabe's scanner to exit.
It's a shame that ctags output isn't provided.
AUTHOR
Paul Bame
SEE ALSO
codechanges(1), decomment(1), vifn(1), sort(1), diff(1), wc(1),
grep(1), unifdef(1), head(1), anac(1)
http://parisc-linux.org/~bame/pmccabe/