NAME
p2c - Pascal to C translator, version 1.21alpha-07.Dec.93
SYNOPSIS
p2c [ options ] [ file [ module ] ]
DESCRIPTION
P2c is a tool for translating Pascal programs into C. The input
consists of a set of source files in any of the following Pascal
dialects: HP Pascal, Turbo/UCSD Pascal, DEC VAX Pascal, Oregon Software
Pascal/2, Macintosh Programmer’s Workshop Pascal, Sun/Berkeley Pascal,
Texas Instruments Pascal, Apollo Domain Pascal. Modula-2 syntax is
also supported. Output is a set of .c and .h files that comprise an
equivalent program in any of several dialects of C. Output code may be
kept machine- and dialect-independent, or it may be targeted to a
specific machine and compiler. Most reasonable Pascal programs are
converted into fully functional C which will compile and run with no
further modifications, although p2c sometimes chooses to generate
readable code at the expense of absolute generality. P2c endeavors to
insert notes and warning messages into the output code to point out
areas which may require human intervention. Output code is arranged to
be readable and efficient, and to make use of C idioms wherever
possible. The main goal of the translation is to produce C files which
are pleasant and "natural" enough to be acceptable as the new source
files for a program. In a pinch, p2c will also serve as an ad hoc
Pascal compiler. The p2cc(1) script makes it easy to use p2c as a
compiler.
Code generated by p2c normally does not assume characters are signed or
unsigned. Also, it assumes int is the same as either short or long but
does not depend on which. However, if int is not the same as long it
is best to use a modern C compiler which supports prototypes.
Generated code does not require an ANSI-compatible compiler (unless
ANSI-style code is requested), but it does use various ANSI-standard
library routines.
All generated code includes the file <p2c/p2c.h> which in turn includes
<stdio.h> and various other common resources. Also, many translated
programs will need to be linked with the run-time library, typically
-lp2c.
Given a file name, p2c reads from the specified file and outputs to a
file with a .c suffix added or substituted. For example,
p2c myfile.pas
reads from myfile.pas to produce the file myfile.c. The input file may
contain a Pascal main program or a single Pascal module (or "unit" in
Turbo and UCSD Pascal nomenclature), or it may just contain a number of
procedures and declarations. P2c is designed to work for correct input
programs. It will accept partial programs, but it may occasionally
core dump if the input refers to undefined symbols.
If the input is a module, the translator will also produce a file
module.h containing a translation of the module’s interface section.
The implementation section may be omitted in which case only the .h
file will be interesting. If the program or module has include files,
these may cause additional .c files to be generated depending on the
value of the ExpandIncludes option (see below).
If no file name is given, p2c reads Pascal from the standard input and
writes the resulting C to standard output (though a .h file may still
be produced). If a file name and module name are given, the file may
include several modules (or units). The specified module is
translated; any others are skipped. The output files will be named
module.c and module.h. P2c never translates more than one module per
run.
Before starting, p2c reads the file --HOMEDIR--/p2crc for a number of
configuration parameters. (The actual path used on your system may
vary. The -i option is a handy way to examine this file.) If the
P2CRC environment variable is set, it gives the name of a file to read
instead of the system file; this file can start with Include %H/p2crc
to include the system file. Next, p2c attempts to read the file p2crc
in your directory for further configuration. If this file does not
exist, p2c looks for .p2crc instead.
OPTIONS
-o cfile
Use cfile in place of file.c or module.c as the primary output
file. A single dash (‘-o -’) says to write the C code to the
standard output.
-h hfile
Use hfile in place of module.h as the output file for interface
text. This only has effect if the input is an HP Pascal module
or a Turbo Pascal unit.
-s sfile
Read interface text from sfile before beginning the translation.
This file typically contains one or more modules, often with
implementation sections omitted for speed, which the program or
module being translated will use. (Typically the ImportFrom and
ImportDir parameters in p2crc are set up to allow p2c to locate
interface text without needing any -s options.) If there are
several -s options in the command, the sfiles are read from left
to right.
-pn Display progress of translation in the form of a line
number/file name display. This is refreshed every n lines, 25
by default.
-c rcfile
Read local configuration commands from rcfile instead of p2crc
or .p2crc. A dash (‘-c -’) in place of rcfile causes no local
configuration file to be used.
-v ("Vanilla.") Do not read from the system configuration file
--HOMEDIR--/p2crc. Since some of the parameters in this file
are required, your local configuration file must include those
parameters instead. This also suppresses the file named by the
P2CRC environment variable.
-H homedir
Use homedir instead of --HOMEDIR-- as the p2c home directory.
The system p2crc file will be searched for in this directory.
-Ipattern
Add pattern to the ImportDir search list of places to find
modules which are imported. The pattern should include a %s to
represent the module name, and should evaluate to a potential
file name for that module’s source code. For example, ../%s.pas
looks for modulename.pas in the parent of the current directory.
-i This special option (which must be the only argument on the
command line if used) simply copies the system configuration
file --HOMEDIR--/p2crc to the standard output in its entirety.
(It may be used with -H, but -i is most useful precisely when
you don’t know the location of the home directory.)
-q Quiet mode. Suppresses output of status messages during
translation.
-En Abort translation after n errors. If n is omitted it defaults
to zero, which means unlimited errors are allowed. Use -E1 to
make p2c halt after the first error.
-e Echo the Pascal source into the output file, surrounded by
#ifdefs. This is the same as the CopySource parameter in the
p2crc file.
-a Produce modern ANSI C. This is a convenient override for the
AnsiC parameter in the p2crc file.
-L language
Select input language name, such as VAX or TURBO. This is a
convenient override for the Language parameter.
-V Verbose mode. This causes p2c to generate an additional ".log"
file with further details of the translation, such as a list of
warnings and notes including those which are suppressed in the
regular output.
-comp Compiler mode. This switch tells p2c to use various
configuration defaults that are more suitable for use as a
Pascal compiler rather than a translator. It is the same as
specifying the following options in your p2crc file:
ElimDeadCode 0
AnalyzeFlow 0
MaxLineBreakTies 0
FoldConstants 1
FoldStrConstants 1
OffsetForLoops 0
StaticLinks 1
BitwiseMod 0
BitwiseDiv 0
AssumeBits 0
AssumeSigns 0
FormatStrings 1
StructFiles 1
FullStrWrite 1
The p2cc script specifies this option when it runs p2c to
compile a Pascal program.
-local Local settings. This switch uses various configuration defaults
that are appropriate if the code generated by p2c is going to be
compiled and run on the same machine that ran p2c itself.
-check Enable all error checking. Normally, some error checks are off
by default, as described in the comments in the system p2crc
file.
-M0 Disable memory conservation. This prevents p2c from freeing
various data structures after translating each function, in case
this new conservation feature causes unforeseen problems.
-R Regression testing mode. Formats notes and warning messages in
a way that makes it easier to run diff(1) on the output of p2c.
P2c also understands a few debugging options which may occasionally be
useful when tracking down translation problems. The -dn option sets
the "debug level" to n, a small integer which is normally zero.
Debugging output is written into the regular output file along with the
C code; the higher your n, the more "wallpaper" you get. Also, -t
prints debugging information at every Pascal token, -Bn enables line-
breaker debugging, -Cn enables comment placement debugging, and -Fn
enables flow-analysis debugging.
CHOICE OF SOURCE LANGUAGE
The Language configuration parameter or -L command-line option tells
p2c which Pascal dialect to expect in the input file. Any language
features which do not overlap between dialects are supported all of the
time. The Language parameter is consulted when a syntax or usage is
detected that has different meanings in two different dialects, and
also to determine default values for various other translation
parameters as described below.
The following language words are supported by p2c. Names are case-
insensitive.
HP HP Pascal. This is the default language. All features of HP
Standard Pascal, the Pascal Workstation version, are supported
except as noted in BUGS below. Some features of MODCAL, HP’s
extended Pascal, are also supported. This is a superset of ISO
standard Pascal, including conformant arrays and procedural
parameters.
HP-UX HP Pascal, HP-UX version. Almost identical to the "HP"
dialect.
Turbo Turbo Pascal 5.0 for the IBM PC. Few conflicts with HP Pascal,
so the Language parameter is not often needed for Turbo. (Most
important is that the Turbo and HP dialects use 16 and 32 bit
integers, respectively.)
UCSD UCSD Pascal. Similar to Turbo in many ways.
MPW Macintosh Programmer’s Workshop Pascal 2.0. Should also do a
pretty good job for Lightspeed Pascal. Object Pascal features
are not supported, nor is the fact that char variables are
sometimes stored in 16 bits.
VAX VAX/VMS Pascal version 3.5. Most but not all language features
supported. This has not yet been tested on large programs.
Oregon Oregon Software Pascal/2. All features implemented.
Berk Berkeley Pascal with Sun extensions.
TIP Texas Instruments Pascal.
Apollo Apollo Domain Pascal.
Modula Modula-2. Based on Wirth’s Programming in Modula-2, 3rd
edition. Proper setting of the Language parameter is not
optional. Translation will be incomplete in most cases, but
should be good enough to work with. Structure of local sub-
modules is essentially ignored; like-named identifiers may be
confused. Type WORD is translated as an integer, but type
ADDRESS is translated as char * or void *; this may cause
inconsistencies in the output code.
Modula-2 modules have two parts in separate files. Suppose
these are called foo.def (definition part) and foo.mod
(implementation part) for module foo. Then a pattern like
%s.def must be included in the ImportDir list, and LibraryFile
must be changed to refer to system.m2 instead of system.imp.
The command
p2c foo.def
translates the definition part into files foo.h and foo.c;
the latter will usually be empty. The command
p2c -s foo.def foo.mod
will translate the implementation part into file foo.c.
Even if all language features are supported for a dialect, some
predefined functions may be omitted. In these cases, the function call
will be translated literally into C with a warning. Some hand
modification may be required.
CONFIGURATION PARAMETERS
P2c is highly configurable. The defaults are suitable for most
applications, but customizing these parameters will help you get the
best possible translation. Since the output of p2c is intended to be
used as human-maintainable source code, there are many parameters for
describing the coding style and conventions you prefer. Others give
hints about your program that help p2c to generate more correct,
efficient, or readable code.
The p2crc files contain a list of parameters, one per line. The system
configuration file, which may be viewed using the -i option to p2c,
serves as an example of the proper format. Parameter names are case-
insensitive. If a parameter name occurs exactly once in the system
p2crc, this indicates that it must have a unique value and the last
value given to it by the configuration files is used. Other parameters
are written several times in a row; these are lists to which each
configuration line adds an entry.
Many p2crc options take a numeric value of 0 or 1, roughly
corresponding to "no" or "yes." Sometimes a blank value or the value
"def" corresponds to an intermediate "maybe" state. For example, the
stylistic option ExtraParens switches between copious or minimal
parentheses in expressions, with the default being a nice compromise
intended to be best for readers with an average knowledge of C operator
precedences.
Configuration options may also be embedded in the source file in the
form of Pascal comments:
{ShortOpt=0} {AvoidName=fred}
{FuncMacro slope(x,y)=atan2(y,x)*RadDeg}
disables automatic short-circuiting of and and or expressions, adds
"fred" to the list of names to avoid using in generated C code, and
defines a special translation for the Pascal program’s slope function
using the standard C atan2 function and a constant RadDeg presumably
defined in the program. Whitespace is generally not allowed in
embedded parameters. The ‘=’ sign is required for embedded parameters,
though it is optional in p2crc files. Comments within embedded
parameters are delimited by ‘##’. Numeric parameters may replace ‘=’
with ‘+’ or ‘-’ to increase or decrease the parameter; list-based
parameters may use ‘-’ to remove a name from a list rather than adding
it. Also, the parameter name by itself in comment braces means to
restore the parameter’s value that was current before the last change:
{VarFiles=0 ## Pass FILE *’s params by value even if VAR}
some declarations
{VarFiles ## Back to original FILE * passing}
causes the parameter VarFiles to have the value 0 for those few
declarations, without affecting the parameter’s value elsewhere in the
file.
If an embedded parameter appears in an include file or in interface
text for a module, the effect of the assignment normally carries over
to any programs that included that file. If the parameter name is
preceded by a ‘*’, then the assignment is automatically undone after
the source file that contains it ends:
{IncludeFrom strings=<p2c/strings.h>}
{*ExportSymbol=pascal_%s}
module strings;
will record the location of the strings module’s include file for the
rest of the translation, but the assignment of ExportSymbol pertains
only to the module itself.
For the complete list of p2crc parameters, run p2c with the -i option.
Here are some additional comments on selected parameters:
ImportAll Because Turbo Pascal only allows one unit per source
file, p2c normally stops reading past the word
implementation in a file being scanned for interface
text. But HP Pascal allows several modules per file and
so this would not be safe to do. The ImportAll option
lets you override the default behavior for your Pascal
dialect.
AnsiC This parameter selects which dialect of C to use. If 1,
all conventions of ANSI C such as prototypes, void *
pointers, etc. are used. If 0, only strict K&R (first
edition) C is used. The default is to use "traditional
UNIX C," which includes enum and void but not void * or
prototypes. Once again there are a number of other
parameters which may be used to control the individual
features if just setting AnsiC is not enough.
C++ This tells p2c to use a number of language extensions
present in C++: Specifically, it enables the "//"
format for comments, use of "anonymous unions" for
variant records, use of declarations within the function
body, use of references for VAR parameters, and use of
"new" and "delete" instead of "malloc" and "free". P2c
will check for collisions with C++ reserved words unless
you explicitly set the C++ option to zero.
TurboObjects P2c recognizes two major dialects of object-oriented
Pascal. Turbo Pascal 6.0 object types translate fairly
directly into C++ classes. In Apple’s Object Pascal,
the object type has similar syntax but represents a
handle (a double pointer) to an object rather than an
object itself. The TurboObjects option (whose default
is determined by the Language setting) says whether
objects should be direct or indirect through pointers.
(P2c uses pointers instead of handles; p2c is most often
used to make programs more portable, and few systems
except the Mac use handles in this way.)
UseVExtern Many non-UNIX linkers prohibit variables from being
defined (not declared) by more than one source file.
One module must declare, e.g., "int foo;", and all
others must declare "extern int foo;". P2c accomplishes
this by declaring public variables "vextern" in header
files, and arranging for the macro vextern to expand to
extern or to nothing when appropriate. If you set
UseVExtern=0 p2c will instead declare variables in a
simpler way that works only on UNIX-style linkers.
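As a rough sketch of the vextern arrangement, the fragment below shows both expansions in one C file. Only the name vextern comes from the text above; the #define and #undef lines, and the helper bump_foo, are illustrative assumptions and not the text p2c actually emits.

```c
/* Sketch only: in real use the "header" lines live in a .h file. */

/* What every importing file sees: vextern expands to extern. */
#define vextern extern
vextern int foo;            /* declaration only: extern int foo; */
#undef vextern

/* What the one owning file sees: vextern expands to nothing. */
#define vextern
vextern int foo;            /* definition: int foo; */
#undef vextern

/* Hypothetical user of the shared variable. */
int bump_foo(void)
{
    return ++foo;           /* foo starts at 0 like any file-scope int */
}
```

With this arrangement exactly one object file defines foo, which is what the stricter linkers require.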
UseAnyptrMacros
Certain C reserved words have meanings which may vary
from one C implementation to another. P2c uses special
capitalized names for these words; these names are
defined as macros in the file p2c.h which all translated
programs include. You can set UseAnyptrMacros=0 to
disable the use of these macros. Note that the
functions of many of these macros can also be had
directly using other parameters; for example, UseConsts
allows you to specify whether your target language
recognizes the word const in constant declarations. The
default is to use the Const macro instead, so that your
code will be portable to either kind of implementation.
Signed expands to the reserved word signed if that word
is available, otherwise it is given a null definition.
Similarly, Const expands to const if that feature is
available. The words Volatile and Register are also
defined in p2c.h, although p2c does not use them at
present. The word Char expands to char by default, but
might need to be redefined to signed char or unsigned
char in a particular implementation. This is used for
the Pascal character type; lowercase char is used when
the desired meaning is "byte," not "character."
The word Static expands to static by default.
This is used in situations where a function or variable
is declared static to make it local to the source file;
lowercase static is used for static local variables.
Thus you can redefine Static to be null if you want to
force private names to be public for purposes of
debugging.
The word Void expands to void in all cases; it is used
when declaring a function with no return value. The
word Anyptr is a typedef for void * or char * as
necessary; it represents a generic pointer.
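The following sketch suggests what such macro definitions might look like and how translated code could use them. The definitions are assumptions written for illustration (the real ones are in p2c.h and may differ), and text_length and buffer_length are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the capitalized names; the real
   definitions are in p2c.h and may differ. */
#ifdef __STDC__
# define Signed signed
# define Const  const
typedef void *Anyptr;       /* generic pointer where void * exists */
#else
# define Signed             /* null definition: word not available */
# define Const
typedef char *Anyptr;
#endif
#define Char   char         /* Pascal character type */
#define Static static       /* file-local functions and variables */
#define Void   void

/* File-local helper: counts characters up to the terminating NUL. */
Static size_t text_length(Const Char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Uses Anyptr as a generic pointer to a character buffer. */
size_t buffer_length(Anyptr p)
{
    return text_length((Const Char *)p);
}
```

Redefining Static as empty would make text_length visible to other source files, which is the debugging trick described above.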
UsePPMacros The p2c.h header also declares two macros for function
prototyping, PP(x) and PV(). These macros are used as
follows:
Void foo PP( (int x, int y, Char *z) );
Char *bar PV( );
If prototypes are available, these macros will expand to
Void foo (int x, int y, Char *z);
Char *bar (void);
but if only old-style declarations are supported, you
instead get
Void foo ();
Char *bar ();
By default, p2c uses these macros for all function
declarations, but function definitions are written in
old-style C. The UsePPMacros parameter can be set to 0
to disable all use of PP and PV, or it can be set to 1
to use the macros even when defining a function. (This
is accomplished by preceding each old-style definition
with a PP-style declaration.) If you know your code
will always be compiled on systems that support
prototyping, it is prettier to set Prototypes=1 or
simply AnsiC=1 to get true function prototypes.
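A minimal sketch of how PP and PV could be defined and used follows. The macro bodies are assumptions (p2c.h supplies the real ones), and scale and answer are hypothetical functions.

```c
/* Assumed definitions; p2c.h supplies the real ones. */
#ifdef __STDC__
# define PP(x) x            /* keep the parameter list */
# define PV()  (void)       /* "no parameters" prototype */
#else
# define PP(x) ()           /* old-style: drop the parameter list */
# define PV()  ()
#endif

/* Declarations written the way the text shows. */
int scale PP( (int x, int factor) );
int answer PV( );

/* Definitions, written here in prototype style for brevity. */
int scale(int x, int factor)
{
    return x * factor;
}

int answer(void)
{
    return scale(6, 7);
}
```

The doubled parentheses in PP( (...) ) let the whole parameter list travel as a single macro argument, commas included.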
EatNotes Notes and warning messages containing any of these
strings as sub-strings are not emitted. Each type of
message includes an identifier like [145]; you can add
this identifier to the EatNotes list to suppress that
message. Another useful form is to use a variable name
or other identifier to suppress warnings about that
variable. The strings are a space-separated list, and
thus may not contain embedded spaces. To suppress notes
around a section of code, use, e.g., {EatNotes+[145]}
and {EatNotes-[145]}. Most notes are generated during
parsing, but to suppress those generated during output
the string may need to remain in the list far beyond the
point where it appears to be generated. Use the string
"1" or "0" to disable or enable all notes, respectively.
ExpandIncludes The default action is to expand Pascal include files in-
line. This may not be desirable if include files are
being used to simulate modules. With ExpandIncludes=0,
p2c attempts to convert include files containing only
whole procedures and global declarations into analogous
C include files. This may not always work, though; if
you get error messages, don’t use this option. By
combining this option with StaticFunctions=0, then doing
some fairly minor editing on the result, you can convert
a pseudo-modular Pascal program into a truly modular
collection of C source files.
ElimDeadCode Some transformations that p2c does on the program may
result in unreachable or "dead" code. By default p2c
removes such code, but sometimes it removes more than it
should. If you have "if false" segments which you wish
to retain in C, you may have to set ElimDeadCode=0.
AnalyzeFlow By default p2c does some basic dataflow analysis on the
program in an attempt to locate code that can be
simplified due to knowledge about the possible values of
certain variables. For example, a Pascal rewrite
statement must translate to an if that either calls
fopen on a formerly closed file variable, or freopen on
an already-open file. If flow analysis can prove that
the file was open or closed upon entry to the statement,
a much cleaner translation is possible.
It is possible that flow analysis will make
simplifications that are undesirable or buggy. If this
occurs, you can set AnalyzeFlow to 0 to disable this
feature.
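To illustrate the rewrite example, here is a sketch of the general form and of the simpler form that becomes possible when the file is known to be closed. The helper names are hypothetical and the code p2c really emits (including its error handling) may differ.

```c
#include <stdio.h>

/* Hypothetical helper standing in for the code emitted for
   Pascal's rewrite(f, name); returns NULL on failure. */
FILE *rewrite_file(FILE *f, const char *name)
{
    if (f != NULL)
        f = freopen(name, "w", f);   /* file variable already open */
    else
        f = fopen(name, "w");        /* file variable formerly closed */
    return f;
}

/* What remains when analysis proves f is closed on entry. */
FILE *rewrite_closed_file(const char *name)
{
    return fopen(name, "w");
}
```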
SkipIndices Normally Pascal arrays not based at zero are "shifted"
down for C, preserving the total size of the array. A
Pascal array a[2..10] is translated to a C array a[9]
with references like "a[i]" changed to "a[i-2]"
everywhere. If SkipIndices is set to a value of 2 or
higher, this array would instead be translated to a[11]
with the first two elements never used. This
arrangement may generate incorrect code, though, for
tricky source programs.
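The two layouts for the a[2..10] example can be sketched in C as follows. The array and function names are hypothetical; only the sizes and the index arithmetic come from the text above.

```c
/* Pascal: var a : array [2..10] of integer; */

/* Default translation: nine elements, index shifted by the low bound. */
static int a_shifted[9];

/* Translation with SkipIndices >= 2: eleven elements, a[0] and a[1]
   are simply never used. */
static int a_skipped[11];

/* Pascal "a[i] := v" under each scheme. */
void store_shifted(int i, int v) { a_shifted[i - 2] = v; }
void store_skipped(int i, int v) { a_skipped[i] = v; }

/* Pascal "a[i]" under each scheme. */
int load_shifted(int i) { return a_shifted[i - 2]; }
int load_skipped(int i) { return a_skipped[i]; }
```

The skipped form trades two wasted elements for index expressions that match the Pascal source.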
FoldConstants Pascal non-structured constants generally translate to
#define’s in C. Set this to 1 to have constants
instantiated directly into the code. This may be turned
on or off around specific constant declarations. Set
this to 0 to force p2c to make absolutely no assumptions
about the constant’s value in generated code, so that
you can change the constant later in the C code without
invalidating the translation. The default is to allow
p2c to take advantage of its knowledge of a constant’s
value, such as by generating code that assumes the
constant is positive.
CharConsts This governs whether single-character string literals in
Pascal const declarations should be interpreted as
characters or strings. In other words, const a='x';
will translate to #define a 'x' if CharConsts=1 (the
default), or to #define a "x" if CharConsts=0. Note that
if p2c guesses wrong, the generated code will not be
wrong, just uglier. For example, if a is written as a
character constant but it turns out to be used as a
string, p2c will have to write char-to-string conversion
code each time the constant is used.
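The cost of a wrong guess can be sketched in C. The macro and function names below are hypothetical, and the conversion code p2c itself writes may look different; the point is only that a character constant needs an extra step wherever a string is expected.

```c
#include <string.h>

#define AS_CHAR 'x'     /* const a = 'x' read as a character */
#define AS_STR  "x"     /* const a = 'x' read as a string    */

/* Using the character form where a string is needed takes an extra
   conversion step (here, building a one-character string). */
void append_char_const(char *dst)
{
    char tmp[2];
    tmp[0] = AS_CHAR;
    tmp[1] = '\0';
    strcat(dst, tmp);
}

/* The string form can be passed straight through. */
void append_str_const(char *dst)
{
    strcat(dst, AS_STR);
}
```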
PreserveTypes P2c makes an attempt to retain the original names used
for data types. For example,
type foo = integer; bar = integer;
establishes two synonyms for the standard integer type;
p2c does its best to preserve the particular synonym
that was used to declare each integer variable. Because
the Pascal language treats these types as
indistinguishable, there will be cases in the
translation where p2c must fall back on the "true" type,
int. PreserveTypes and a few related options control
whether various kinds of type names are preserved. The
default settings preserve all type names except for
pointer types, which use "*" notation throughout the
program. This reflects the fact that Pascal forces
pointer types to be named when traditionally they are
not separately named in C.
VarStrings In HP Pascal, a parameter of the form "var s : string"
will match a string variable of any size; a hidden size
parameter is passed which may be accessed by the Pascal
strmax function. You can prevent p2c from creating a
hidden size parameter by setting VarStrings=0. (Note
that each function uses the value of VarStrings as of
the first declaration of the function that is parsed,
which is often in the interface section of a module.)
Prototypes Control whether ANSI C function prototypes are used.
Default is according to AnsiC or C++. This also
controls whether to include parameter names or just
their types in situations where names are optional. The
FullPrototyping parameter allows prototypes to be
generated for declarations but not for definitions
(older versions of Lightspeed C required this). If you
use a mixture of prototypes and old-style definitions,
types like short and float will be promoted to int and
double as required by the ANSI standard, unless
PromoteArgs is used to override this. The CastArgs
parameter controls whether type-casts are used in
function arguments; by default they are used only if
prototypes are not available.
StaticLinks HP Pascal and Turbo Pascal each include the concept of
procedure or function pointers, though with somewhat
different syntaxes. P2c recognizes both notational
styles. Another difference is that HP’s procedure
pointers can point to nested procedures, while Turbo’s
can point only to global procedures. In HP Pascal a
procedure pointer must be stored as a struct containing
both a pure C function pointer and a "static link," a
pointer to the parent procedure’s locals. (The static
link is NULL for global procedures.) This notation can
be forced by setting StaticLinks=1. In Turbo, the
default (StaticLinks=0) is to use plain C function
pointers with no static links. A third option
(StaticLinks=2) uses structures with static links, but
assumes the links are always NULL when calling through a
pointer (if you need compatibility with the HP format
but know your procedures are global).
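A procedure value with a static link might be laid out and called roughly as in the sketch below. The type and function names are assumptions made for illustration and are not taken from p2c.h or from p2c output.

```c
#include <stddef.h>

typedef void (*GenericProc)(void);      /* assumed generic code pointer */

/* Assumed layout for StaticLinks=1: code pointer plus static link. */
typedef struct {
    GenericProc proc;
    void *link;                         /* parent's locals, or NULL */
} ProcValue;

/* Locals of a hypothetical parent procedure, gathered in a struct. */
struct parent_locals { int total; };

/* Nested procedure: receives the static link as an extra argument. */
static void add_nested(int x, void *link)
{
    ((struct parent_locals *)link)->total += x;
}

static int global_total;

/* Global procedure: no static link needed. */
static void add_global(int x)
{
    global_total += x;
}

/* Call through a procedure value, passing the link only if present. */
static void call_proc(ProcValue p, int x)
{
    if (p.link != NULL)
        ((void (*)(int, void *))p.proc)(x, p.link);
    else
        ((void (*)(int))p.proc)(x);
}

/* Returns the parent's total after two calls to its nested procedure. */
int demo_nested(void)
{
    struct parent_locals locals = { 0 };
    ProcValue p;
    p.proc = (GenericProc)add_nested;
    p.link = &locals;
    call_proc(p, 3);
    call_proc(p, 4);
    return locals.total;
}

/* Returns the global total after one call through a link-free value. */
int demo_global(void)
{
    ProcValue p;
    p.proc = (GenericProc)add_global;
    p.link = NULL;
    global_total = 0;
    call_proc(p, 5);
    return global_total;
}
```

Under StaticLinks=2 the same struct would be kept, but call_proc could drop its test and always take the link-free branch.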
SmallSetConst Pascal sets are translated into one of two formats,
depending on the size of the set. If all elements have
ordinal values in the range 0..31, the set is translated
as a single integer variable using bit operations. (The
SetBits parameter may be used to change the upper limit
of 31.) The SmallSetConst parameter controls whether
these small-sets are used, and, if so, how constant sets
should be represented in C. For larger sets, an array
of long is used. The s[0] element contains the number
of succeeding array elements which are in use. Set
elements in the range 0..31 are stored in the s[1] array
element, and so on. Sets are normalized so that s[s[0]]
is nonzero for any nonempty set. The standard run-time
library includes all the necessary procedures for
operating on sets.
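Both set formats can be sketched in a few lines of C. The function names are hypothetical stand-ins rather than the run-time library's own routines, and the sketch assumes 32 bits are used per word as in the description above.

```c
#define SETBITS 32          /* bits used per word, as in the text */

/* Small set: elements 0..31 as bits of one integer. */
typedef unsigned long SmallSet;

SmallSet small_add(SmallSet s, unsigned e) { return s | (1UL << e); }
int small_in(unsigned e, SmallSet s) { return (int)((s >> e) & 1UL); }

/* Large set: s[0] counts the words in use; s[1] holds elements
   0..31, s[2] holds 32..63, and so on.  The caller must supply an
   array long enough for the largest element. */
void large_add(long *s, unsigned e)
{
    long word = (long)(e / SETBITS) + 1;
    unsigned long w;

    while (s[0] < word) {   /* extend with empty words as needed */
        s[0]++;
        s[s[0]] = 0;
    }
    w = (unsigned long)s[word] | (1UL << (e % SETBITS));
    s[word] = (long)w;
}

int large_in(unsigned e, const long *s)
{
    long word = (long)(e / SETBITS) + 1;

    if (word > s[0])
        return 0;           /* beyond the words in use */
    return (int)(((unsigned long)s[word] >> (e % SETBITS)) & 1UL);
}
```

Because large_add only grows the set up to the word it then writes into, the last word in use is always nonzero, matching the normalization rule stated above.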
ReturnValueName
This is one of many "naming conventions" parameters.
Most of these take the form of a printf-like string
containing a %s where the relevant information should
go. In the case of ReturnValueName, the %s refers to a
function name and the resulting string gives the name of
the variable to use to hold the function’s return value.
Such a variable will be made if a function contains
assignments to its return value buried within the body,
so that return statements cannot conveniently be used.
Some parameters (ReturnValueName included) do not
require the %s to be present in the format string; for
example, the standard p2crc file stores every function’s
return value in a variable called Result.
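As an illustration, a function whose result is assigned deep inside a loop might come out roughly as below. FirstNeg is a hypothetical function, and the code is a hand-written sketch of the idea rather than actual p2c output.

```c
/* Hypothetical Pascal function FirstNeg assigns "FirstNeg := i"
   inside a loop, so a plain return statement does not fit there. */
int FirstNeg(const int *a, int n)
{
    int Result = -1;        /* holds the pending return value */
    int i;

    for (i = n - 1; i >= 0; i--) {
        if (a[i] < 0)
            Result = i;     /* Pascal: FirstNeg := i */
    }
    return Result;
}
```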
AlternateName P2c normally translates Pascal names into C names
verbatim, but occasionally this is not possible. A
Pascal name may be a C reserved word or traditional C
name like putc, or there may be several like-named
things that are hidden from each other by Pascal’s
scoping rules but must be global in C. In these
situations p2c uses the parameter AlternateName1 to
generate an alternative name for the symbol. The
default is to add an underscore to the name. There is
also an AlternateName2 parameter for a second alternate
name, and an AlternateName parameter for the nth
alternate name. (The value for this parameter should
include both a %s and a %d, in either order.) If these
latter parameters are not defined, p2c applies
AlternateName1 many times over.
ExportSymbol Symbols in the interface section for a Pascal module are
formatted according to the value of ExportSymbol, if
any. It is not uncommon to use modulename_%s for this
symbol; the default is %s, i.e., no special treatment
for exported symbols. If you also define the
Export_Symbol parameter, that format is used instead for
exported symbols which contain an underscore character.
If %S (with a capital "S") appears in the format string
it stands for the current module name.
Alias If the value of this parameter contains a %s, it is a
format string applied to the names of external functions
or variables. If the value does not contain a %s, it
becomes the name of the next external symbol which is
declared (after which the parameter is cleared).
Synonym This creates a synonym for another Pascal symbol or
keyword. The format is
Synonym old-name = new-name
All occurrences of old-name in the input text are
treated as if they were new-name by the parser. If new-
name is a keyword, old-name will be an equivalent
keyword. If new-name is the name of a predefined
function, old-name will behave in the same way as that
function, and so on. If new-name is omitted, then
occurrences of old-name are entirely ignored in the
input file. Synonyms allow you to skip over a keyword
in your dialect of Pascal that is not understood by p2c,
or to simulate a keyword or predefined identifier of
your dialect with a similar one that p2c recognizes.
Note that all predefined functions are available at all
times; if you have a library routine that behaves like,
e.g., Turbo Pascal’s getmem procedure, you can make your
routine a synonym for getmem even if you are not
translating in Turbo mode.
NameOf This defines the name to use in C for a specific symbol.
It must appear before the symbol is declared in the
Pascal code; it is usually placed in the local p2crc
file for the project. The format is
NameOf pascal-name = C-name
By default, Pascal names map directly onto C names with
no change (except for the various kinds of formatting
outlined above). If the pascal-name is of the form
module.name or procedure.name then the command applies
only to the instance of the Pascal name that is global
to that module, or local to that procedure. Otherwise,
it applies to all usages of the name.
VarMacro This is analogous to NameOf, but specifically for use
with Pascal variables. The righthand side can be almost
any C expression; all references to the variable are
expanded into that C expression. Names used in the C
expression are taken verbatim. There is also a
ConstMacro parameter for translating constants as
arbitrary expressions. Note that the variable on the
lefthand side must actually be declared in the program
or in a module that it uses. The declaration for the
variable will be omitted from the generated code unless
the Pascal-name appears in the expression: If you ask
to replace i with i+1, the variable i will still be
declared but its value will be shifted accordingly.
Note that if i appears on the lefthand side of an
assignment, p2c will use algebra to "solve" for i.
In all cases where p2c parses C expressions, all C
operators are recognized except compound assignments
like ‘+=’. (Increment and decrement operators are
allowed.) All variable and function names are assumed
to have integer type, even if they are names that occur
in the actual program. A type-specification operator
‘::’ has been introduced; it has the same precedence as
‘.’ or ‘->’ but the righthand side must be a Pascal type
identifier (built-in, or defined by your program before
the macro definition is parsed), or
an arbitrary Pascal type expression in parentheses. The
lefthand argument is then considered to have the
specified type. This may be necessary if your macro is
used in situations where the exact type of the
expression must be known (say, as the argument to a
writeln).
FieldMacro Here the lefthand side must have the form record.field,
where record is the Pascal type or variable name for a
record, and field is a field in that record. The
righthand side must be a C expression generally
including the name record. All instances of that name
are replaced by the actual record being "dotted." For
example,
FieldMacro Rect.topLeft = topLeft(Rect)
translates a[i].topLeft into topLeft(a[i]), where a is
an array of Rect.
FuncMacro The lefthand side must be any Pascal function or
procedure name plus a parameter list. The number of
parameters must match the number in the function’s uses
and declaration. Calls to the function are replaced by
the C expression on the righthand side. For example,
FuncMacro PtInRect(p,r) = PtInRect(p,&r)
causes the second argument of PtInRect to be passed by
reference, even though the declaration says it’s not.
If the function in question is actually defined in the
program or module being translated, the FuncMacro will
not affect the definition but it will affect all calls
to the function elsewhere in the module. FuncMacros can
also be applied to predefined or never-defined
functions.
ReplaceBefore This option specifies a string replacement to be done on
every Pascal source line. For example:
ReplaceBefore "{$ifdef" "{EMBED #ifdef"
ReplaceBefore "{$endif}" "{EMBED #endif}"
These lines rewrite Turbo Pascal compile-time
conditionals into comments beginning with the special
word EMBED. This word instructs p2c to format the rest
of the comment without "/* */" delimiters, i.e., the
rest of the comment is embedded directly in the output C
program. There is also a ReplaceAfter option, which
specifies replacements to be done on the output of p2c.
Currently, this feature makes only literal string
replacements, not pattern-based matches. Some users of
p2c have found it useful to feed their Pascal programs
through a more powerful editor like sed or perl before
giving them to p2c. Quite often this is all that is
necessary to get an acceptable translation in the face
of unrecognized Pascal dialects or language features.
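The literal replacement described above can be pictured with a short C sketch. This is not p2c's own code; the function name replace_literal and its buffer-size convention are assumptions made for illustration. It replaces every exact occurrence of one string with another and does no pattern matching.

```c
#include <assert.h>
#include <string.h>

/* Replace every literal occurrence of `from` in `line` with `to`,
   writing the result to `out` (capacity `outsize` bytes).  Returns 0
   on success, or -1 if the result would not fit.  Hypothetical helper,
   not part of p2c. */
int replace_literal(const char *line, const char *from, const char *to,
                    char *out, size_t outsize)
{
    size_t fromlen = strlen(from), tolen = strlen(to), used = 0;

    assert(fromlen > 0);
    if (outsize == 0)
        return -1;
    while (*line != '\0') {
        if (strncmp(line, from, fromlen) == 0) {
            if (used + tolen >= outsize)
                return -1;              /* no room for text + terminator */
            memcpy(out + used, to, tolen);
            used += tolen;
            line += fromlen;
        } else {
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *line++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Applied to the line "{$ifdef DEBUG}" with the first ReplaceBefore pair shown above, the sketch yields "{EMBED #ifdef DEBUG}".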
IncludeFrom This specifies that a given module’s header should be
included from a given place. The second argument may be
surrounded by " " or < > as necessary; if the second
argument is omitted, no include directive will be
generated for the module.
ImportFrom This specifies that a given module’s Pascal interface
text can be found in the given file. The named file
should be either the source file for the module, or a
specially prepared file with the implementation section
removed for speed. If no ImportFrom entry is found for
a module, the path defined by the ImportDir list is
searched. Each entry in the path may contain a %s,
which expands to the name of the module. The default
path looks for %s.pas and %s.text in the current
directory, then for --HOMEDIR--/%s.imp (where
--HOMEDIR-- is the p2c home directory).
StructFunction This parameter is a list of functions which follow the
p2c semantics for structure-valued functions (functions
returning arrays, sets, and strings, and structs in
primitive C dialects). For these functions, a pointer
to a return-value area is passed to the function as a
special first parameter. The function stores the result
in this area, then returns a copy of the pointer. (The
standard C function strcpy is an example of this
concept. Sprintf also behaves this way in some
dialects; it always appears on the StructFunction list
regardless of the type of implementation.) The system
configuration file includes a list of common structured
functions so that p2c’s optimizer will know how to
manipulate them.
StrlapFunction Functions on this list are structured functions as
above, but with the ability to work in-place; that is,
the same pointer may be passed as both the return value
area and a regular parameter.
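As a rough illustration of the calling convention for structure-valued functions, the following C sketch takes the return-value area as a special first parameter and returns a copy of that pointer, much as strcpy does. The name upper_copy is hypothetical and does not come from the p2c library. Because each character is read before the same position is written, this particular function would also qualify for the in-place behavior described under StrlapFunction.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical structure-valued function: the caller supplies the
   return-value area as the first parameter, and the function returns
   a copy of that pointer. */
char *upper_copy(char *result, const char *s)
{
    size_t i;

    assert(result != NULL && s != NULL);
    for (i = 0; s[i] != '\0'; i++)
        result[i] = (char)toupper((unsigned char)s[i]);
    result[i] = '\0';
    return result;      /* same pointer that was passed in */
}
```

A call such as upper_copy(buf, buf) passes the same pointer as both the return-value area and the regular parameter, which is safe here only because of the read-then-write order noted above.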
Deterministic Functions on this list have no side effects or side
dependencies. An example is the sin function in the
standard math library; two calls with the same parameter
values produce the same result, and have no effects
other than returning a value. P2c can make use of this
knowledge when optimizing code for efficiency or
readability. Functions on this list are also assumed to
be relatively fast, so that it is acceptable to
duplicate a call to the function.
LeaveAlone Functions on this list are not subjected to the normal
built-in translation rules that p2c would otherwise use.
For example, adding writeln to this list would cause
writeln statements to be translated blindly into calls
to a C writeln() function, rather than into equivalent
printf calls. The built-in translation is also
suppressed if the function has a FuncMacro.
BufferedFile P2c normally assumes binary files will use read/write,
not get/put/^ notation. A file buffer variable will
only be created for a file if buffer notation is used
for it. For global file variables this may be detected
too late (a declaration without buffers may already have
been written). Such files can be listed in BufferedFile
to force p2c to allocate buffers for them; do this if
you get a warning message that says it is necessary.
Set BufferedFile=1 to buffer all files, in which case
UnBufferedFile allows you to force certain files not to
have buffers.
StructFiles If p2c still can’t translate your file operations
correctly, you can set StructFiles=1 to cause Pascal
files to translate into structs which include the usual
C FILE pointer, as well as file buffer and file name
fields. While the resulting code doesn’t look as much
like native C, the file structs will allow p2c to do a
correct translation in many more cases.
CheckFileEOF Normally only file-open operations are checked for
errors. Additional error checking, such as read-past-
end-of-file, can be enabled with parameters like
CheckFileEOF. These checks can make the code very ugly!
If I/O checking is enabled by the program ($iocheck on$
in HP Pascal; {$I+} in Turbo; this is always the default
state), these checks will generate fatal errors unless
enclosed in an HP Pascal try-recover construct. If I/O
checking is disabled, these will cause the global
variable P_ioresult to be set to zero or nonzero according
to the outcome. The default for most of these options
is to check only when I/O checking is enabled.
ISSUES
Integer size. P2c normally generates code to work with either 16- or
32-bit ints. If you know your C integers will be 16 or 32 bits, set
IntSize appropriately. In particular setting IntSize=32 will generate
much cleaner code: p2c no longer must carefully cast function arguments
between int and long. These casts also will be unnecessary if ANSI
prototypes are available. To disable int/long casting because you know
at least one of these cases will hold, set CastLongArgs=0. (The
CastArgs parameter similarly controls other types of casts, such as
between ints and doubles.) The Integer16 parameter controls whether
Pascal integers are interpreted as 16 or 32 bits, or translated as
native C integers. The default value depends on the Language selected.
Signed/unsigned chars. Pascal characters are normally "weakly"
interpreted as unsigned; this is controlled by UnsignedChar. The
default is "either," so that C’s native char type may be used even if
its signed-ness is unknown. Code that uses characters outside of the
range 0-127 may need a different setting. Alternatively, you can use
the types {SIGNED} char and {UNSIGNED} char in the few cases where it
really matters. These comments are controlled by the SignedComment and
UnsignedComment parameters. (The type {UNSIGNED} integer is also
recognized.) The SignedChar parameter tells whether C characters are
signed or unsigned (default is "unknown"). The HasSignedChar parameter
tells whether the phrase "signed char" is legal in the output. If it
is not, p2c may have to translate Pascal signed bytes into C shorts.
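The reason characters outside the range 0-127 matter can be seen in a small C sketch. The function names are hypothetical, and the sketch assumes 8-bit chars. Widening a plain char directly gives a result that depends on whether the compiler treats char as signed, while going through unsigned char gives the same answer either way.

```c
#include <assert.h>

/* Portable ord(): yields 0..255 whether plain char is signed or not. */
int ord_sketch(char c)
{
    return (int)(unsigned char)c;
}

/* Naive widening: for codes above 127 the result depends on whether
   the compiler's plain char is signed. */
int ord_naive(char c)
{
    return (int)c;
}
```

For ordinary 7-bit characters the two functions agree, which is why the "either" default is usually safe.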
Special types. P2c understands the following predefined Pascal type
names: integer, signed integers depending on Integer16; longint, signed
32-bit integers; unsigned, unsigned 32-bit integers; sword, signed
16-bit integers; word, unsigned 16-bit integers; c_int, signed native C
integers; c_uint, unsigned native C integers; sbyte, signed 8-bit
integers; byte, unsigned 8-bit integers; real, floating-point numbers
depending on DoubleReals; single, single-precision floats; longreal,
double, and extended, double-precision floats; pointer and anyptr,
generic pointers (assignment-compatible with any pointer type); string,
generic string of length StringDefault (normally 255); also, the usual
Pascal types char, boolean, and text. (If your Pascal uses different
names for these concepts, the Synonym option will come in handy.)
Embedded code. It is possible to write a Pascal comment containing C
code to be embedded into the output. See the descriptions of
EmbedComment and its relatives in the system p2crc file. These
techniques are helpful if you plan to do repeated translations of code
that is still being maintained in Pascal. See the description of
ReplaceBefore for an example use of embedded code.
Comments and blank lines. P2c collects the comments in a procedure
into a list. All comments and statements are stamped with serial
numbers which are used to reattach comments to statements even after
code has been added, removed, or rearranged during translation.
"Orphan" comments attached to statements that have been lost are
attached to nearby statements or emitted at the end of the procedure.
Blank lines are treated as a kind of comment, so p2c will also
reproduce your usage of blank lines. If the comment mechanism goes
awry, you can disable comments with EatComments or disable their being
attached to code with SpitComments.
Indentation. P2c has a number of parameters to govern indentation of
code. The default values produce the GNU Emacs standard indentation
style, although p2c can do a better job since it knows more about the
code it is indenting. Indentation works by applying "indentation
deltas," which are either absolute numbers (which override the previous
indentation), or signed relative numbers (which augment the previous
indentation). A delta of "+0" specifies no change in indentation. All
of the indentation options are described in the standard p2crc file.
Line breaking. P2c uses an algorithm similar to the TeX typesetter’s
paragraph formatter for breaking long statements into multiple lines.
A "penalty" is assigned to various undesirable aspects of all possible
line breaks; the "badness" of a set of line breaks is approximately the
sum of all the penalties. Chief among these are serious penalties for
overrunning the desired maximum line length (default 78 columns), an
infinite penalty for overrunning the absolute maximum line length
(default 90), and progressively greater penalties for breaking at
operators deeply nested in expressions. Parameters such as
OpBreakPenalty control the relative weights of various choices.
BreakArith and its neighbors control whether the operator at a line
break should be placed at the end of the previous line or at the
beginning of the next. If you don’t want any oversize lines, define
MaxLineWidth=78.
Unlike TeX, p2c’s line breaker must actually try all possible sets of
break points. To avoid excessive computation, the total penalty
contributed at each decision point must sum to a nonnegative value;
negative values are clipped up to zero. This allows p2c to prune away
obviously undesirable alternatives in advance. The MaxLineBreakTries
parameter (default 5000) controls how many alternatives to try before
giving up and using the best so far.
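The search described above can be sketched in C as follows. This is only an illustration of the idea, not p2c's actual line breaker: the penalty weights (100 per column of overrun, 10 per level of nesting), the token model, and all identifiers are assumptions. It tries every set of break points, rejects lines beyond the absolute maximum, clips negative penalties to zero, and prunes any partial layout whose running total already matches or exceeds the best complete one.

```c
#include <assert.h>
#include <limits.h>

/* State for one exhaustive search.  All names and weights here are
   hypothetical, chosen only for illustration. */
typedef struct {
    const int *width;   /* printed width of each token */
    const int *depth;   /* nesting depth of the break before each token */
    int ntokens;
    int desired;        /* desired maximum line width */
    int absolute;       /* absolute maximum line width */
    long best;          /* lowest total penalty found so far */
    long tries;         /* search nodes visited so far */
    long max_tries;     /* give up after this many nodes */
} breaker;

static void search(breaker *b, int start, long sofar)
{
    int end, linewidth = 0;

    /* Penalties are never negative, so a partial total that already
       matches or exceeds the best complete total cannot win. */
    if (sofar >= b->best || b->tries >= b->max_tries)
        return;
    b->tries++;
    if (start == b->ntokens) {
        b->best = sofar;              /* complete layout, new best */
        return;
    }
    for (end = start; end < b->ntokens; end++) {
        long penalty = 0;

        linewidth += b->width[end];
        if (linewidth > b->absolute)
            break;                    /* "infinite" penalty: reject */
        if (linewidth > b->desired)
            penalty += 100L * (linewidth - b->desired);
        if (end + 1 < b->ntokens)     /* breaking before token end+1 */
            penalty += 10L * (1 + b->depth[end + 1]);
        if (penalty < 0)
            penalty = 0;              /* clip negative values up to zero */
        search(b, end + 1, sofar + penalty);
    }
}

/* Returns the lowest total penalty, or -1 if no layout fits. */
long best_break_penalty(const int *width, const int *depth, int ntokens,
                        int desired, int absolute, long max_tries)
{
    breaker b;

    assert(ntokens > 0 && desired <= absolute);
    b.width = width;
    b.depth = depth;
    b.ntokens = ntokens;
    b.desired = desired;
    b.absolute = absolute;
    b.best = LONG_MAX;
    b.tries = 0;
    b.max_tries = max_tries;
    search(&b, 0, 0);
    return b.best == LONG_MAX ? -1 : b.best;
}
```

The clipping step is what makes the pruning test safe: if a decision point could lower the running total, a partial layout that looks worse than the current best might still end up better.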
PASCAL_MAIN. P2c generates a call to this function at the front of the
main program. In the (unmodified) run-time library all this does is
save argc and argv away because in both HP and Turbo these are accessed
as global variables. If you do not wish to use this feature, define
ArgCName to be argc, ArgVName to be argv, and MainName (normally
"PASCAL_MAIN") to be blank. This will work if argc and argv are never
accessed outside of your main program.
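A minimal C sketch of what such a start-up hook does is shown here. The identifiers saved_argc, saved_argv, pascal_main_sketch, and param_count_sketch are hypothetical stand-ins, not the names used by the run-time library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical globals standing in for the run-time library's saved
   copies of the command-line arguments. */
int    saved_argc;
char **saved_argv;

/* Sketch of the start-up hook: save argc and argv away so that they
   can later be reached as global variables. */
void pascal_main_sketch(int argc, char **argv)
{
    assert(argv != NULL);
    saved_argc = argc;
    saved_argv = argv;
}

/* Analogue of a Pascal routine that reads the parameter count from
   anywhere in the program, not only from the main program. */
int param_count_sketch(void)
{
    return saved_argc - 1;      /* exclude the program name */
}
```

If nothing outside the main program ever needs the arguments, the globals serve no purpose, which is the situation the ArgCName, ArgVName, and MainName settings address.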
BUGS
P2c was designed with the idea that clean, readable output in most
cases is worth more than guaranteed correct output in extreme cases.
P2c is not a compiler! However, ideally the "extreme" cases would
include only those which never arise in real life. Thus if p2c
actually generates incorrect code I will consider it a bug, but I will
not apologize for it. :-) Below are the major remaining cases where
this is known to occur.
Certain kinds of conformant array parameters (including multi-
dimensional conformant arrays) produce code that declares variable-
length arrays in C. Only a few C compilers, such as the GNU C
compiler, support this language extension. Otherwise some hand re-
coding will be required.
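For reference, one form that a variable-length array parameter can take in C is sketched below; the function name is hypothetical and the code p2c actually emits may look different. C99 later standardized this style of parameter, but C11 made it an optional feature, so compiler support still cannot be taken for granted.

```c
#include <assert.h>

/* The array bounds are passed as ordinary parameters and then used in
   the type of the array parameter itself. */
long sum_conformant(int rows, int cols, int a[rows][cols])
{
    long total = 0;
    int i, j;

    assert(rows > 0 && cols > 0);
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            total += a[i][j];
    return total;
}
```

With a compiler that lacks this feature, a typical hand re-coding passes a pointer to the first element and computes the index i * cols + j explicitly.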
HP Pascal try-recover structures are translated into calls to TRY and
RECOVER macros, which are defined to simulate the construct using
setjmp and longjmp. If this emulation does not work, define the symbol
FAKE_TRY to cause these macros to become "inert." (In cases where the
error is detected by code physically within the body of the try
statement, a C goto to the recover section is always generated.) Also,
local file variables in scopes which are destroyed by an escape are not
closed.
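The general shape of a setjmp/longjmp emulation of try-recover is sketched below. This is not the definition of the TRY and RECOVER macros; the identifiers handler, escapecode, escape_sketch, and the two divide functions are assumptions introduced for illustration, and the construct is written out by hand rather than hidden behind macros.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

jmp_buf *handler = NULL;    /* innermost active try block, or NULL */
int escapecode = 0;         /* code passed to the most recent escape */

/* Raise an error: jump to the innermost recover section if there is
   one, otherwise treat the error as fatal. */
void escape_sketch(int code)
{
    escapecode = code;
    if (handler != NULL)
        longjmp(*handler, 1);
    abort();
}

int checked_divide(int a, int b)
{
    if (b == 0)
        escape_sketch(-5);
    return a / b;
}

/* Hand-expanded equivalent of:  try r := a div b recover r := dflt */
int divide_or_default(int a, int b, int dflt)
{
    jmp_buf env;
    jmp_buf *outer = handler;   /* set before setjmp, never changed after */
    int result;

    handler = &env;
    if (setjmp(env) == 0) {
        result = checked_divide(a, b);  /* try section */
        handler = outer;
    } else {
        handler = outer;                /* recover section */
        result = dflt;
    }
    return result;
}
```

Note that longjmp simply abandons the intervening stack frames, which is consistent with the remark above that local file variables in those scopes are not closed.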
Non-local GOTO’s and try-recover statements are each implemented, but
may conflict if both are used at once. Non-local GOTO’s are fairly
careful about closing files that go out of scope but may fail to do so
in the presence of recursion.
Arrays containing files are not initialized to NULL as other files are.
In some cases, such as file variables allocated by NEW, the file is
initialized but not automatically closed by DISPOSE.
LINK variables allowing sub-procedures access to their parents’
variables are occasionally omitted by mistake, if the access is too
indirect for p2c to notice. If this happens, you can add an explicit
reference to a parent variable in the sub-procedure. A statement of
the form "a:=a" will count as a reference but then be optimized away by
p2c.
Many aspects of Modula-2 are translated only superficially. For
example, the type-compatibility properties of the WORD and ARRAY OF
WORD types are only roughly modelled, as are the scope rules concerning
modules.
Parts of VAX Pascal are still untreated. In particular, the [UNSAFE]
attribute and a few others are not fully supported, nor are the
semantics of the OPEN procedure.
Turbo and VAX Pascal’s double, quadruple, and extended real types all
translate to the C double type. Turbo’s computational type is not
supported at all.
Because Pascal strings (with length bytes) are translated into C
strings (with null terminators), certain Pascal string tricks will not
work in the translated code. For example the assignment s[0]:=chr(x)
is translated to s[x]=0 on the assumption that the string is being
shortened. If x is actually greater than the current length, but not
of a recognizable form like ord(s[0])+n, then the generated code will
not work. In VAX Pascal this corresponds to performing arithmetic on
the LENGTH field of a varying-length string.
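The difference between shortening and lengthening can be demonstrated with a small C sketch. The helper name set_length_sketch is hypothetical; its body is the s[x]=0 translation quoted above, followed by a report of the length a C program would then observe.

```c
#include <assert.h>
#include <string.h>

/* Apply the translated form of  s[0] := chr(x)  and return the length
   that C code will see afterwards.  The buffer must hold at least
   x + 1 bytes. */
size_t set_length_sketch(char *s, size_t x)
{
    s[x] = '\0';
    return strlen(s);
}
```

Starting from "hello", a requested length of 3 gives "hel" as intended. Starting from "hi", a requested length of 5 leaves the observed length at 2, because the original terminator at index 2 is still in place; this is the case in which the generated code does not work.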
Turbo Pascal’s automatic clipping of strings is not supported. In
Turbo, if a ten character string is assigned to a string[8] variable,
the last two characters are silently removed. The code produced by p2c
generally will overrun the target string instead! The StringTruncLimit
parameter (80 by default if Language=Turbo) specifies a string size
which should be considered "short"; assignments of potentially-long
strings to short string variables will cause a warning but will not
automatically truncate. The cure is to use copy in the Pascal source
to truncate the strings explicitly.
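In C terms, the clipping that Turbo performs automatically amounts to something like the following sketch. The function name assign_truncated is hypothetical, and this is not the code that p2c emits for copy; it only shows the bounded assignment that the translated program would otherwise lack.

```c
#include <assert.h>
#include <string.h>

/* C analogue of assigning copy(src, 1, maxlen) to a string[maxlen]
   variable: at most maxlen characters are stored, plus the terminator.
   dest must have room for maxlen + 1 bytes. */
char *assign_truncated(char *dest, const char *src, size_t maxlen)
{
    size_t n;

    assert(dest != NULL && src != NULL);
    n = strlen(src);
    if (n > maxlen)
        n = maxlen;             /* silently drop the excess characters */
    memcpy(dest, src, n);
    dest[n] = '\0';
    return dest;
}
```

For the example in the text, a ten-character source assigned with maxlen 8 keeps the first eight characters and discards the last two.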
FILES
file.xxx                   Pascal source files
file.c                     resulting C source file
module.h                   resulting C header file
p2crc                      local configuration file
.p2crc                     alternate local configuration file
--HOMEDIR--/p2crc          system-wide configuration file
--HOMEDIR--/system.imp     declarations for predefined functions
--HOMEDIR--/system.m2      analogous declarations for Modula-2
--HOMEDIR--/*.imp          interface text for standard modules
--INCDIR--/p2c.h           header file for translated programs
--LIBDIR--/libp2c.a        run-time library
AUTHOR
Dave Gillespie, daveg@synaptics.com.
Many thanks to William Bader, Steven Levi, Rick Koshi, Eric Raymond,
Magne Haveraaen, Dirk Grunwald, David Barto, Paul Fisher, Tom
Schneider, Dick Heijne, Guenther Sawitzki, and many others whose
suggestions and bug reports have helped improve p2c in countless ways.