Man Linux: Main Page and Category List

NAME

       ncgen  - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
       C program

SYNOPSIS

       ncgen [-b] [-c] [-f] [-k file format] [-l  output  language]  [-n]  [-o
              netcdf_filename] [-x] input_file

DESCRIPTION

       ncgen  generates  either  a  netCDF-3 (i.e. classic) binary .nc file, a
       netCDF-4 (i.e. enhanced) binary .nc file  or  a  file  in  some  source
       language that when executed will construct the corresponding binary .nc
       file.  The input to ncgen is a description of a netCDF file in a  small
       language  known  as  CDL (network Common Data form Language), described
       below.  If no options are specified in invoking ncgen, it merely checks
       the  syntax  of  the  input  CDL file, producing error messages for any
       violations of CDL syntax.  Other options can be used, for  example,  to
       create  the  corresponding netCDF file, or to generate a C program that
       uses the netCDF C interface to create the netCDF file.

       Note that this version of ncgen  was  originally  called  ncgen4.   The
       older ncgen program has been renamed to ncgen3.

       ncgen  may  be  used  with the companion program ncdump to perform some
       simple operations on netCDF files.  For example, to rename a  dimension
       in  a  netCDF file, use ncdump to get a CDL version of the netCDF file,
       edit the CDL file to change the name of the dimensions, and  use  ncgen
       to generate the corresponding netCDF file from the edited CDL file.

OPTIONS

       -b     Create  a  (binary)  netCDF file.  If the -o option is absent, a
              default file name will  be  constructed  from  the  netCDF  name
              (specified  after  the netcdf keyword in the input) by appending
              the  ‘.nc’  extension.   If  a  file  already  exists  with  the
              specified name, it will be overwritten.

       -c     Generate  C  source code that will create a netCDF file matching
              the netCDF specification.  The  C  source  code  is  written  to
              standard output; equivalent to -lc.

       -f     Generate  FORTRAN  77 source code that will create a netCDF file
              matching the netCDF specification.  The source code  is  written
              to standard output; equivalent to -lf77.

       -o netcdf_file
              Name  for  the  binary  netCDF  file created.  If this option is
              specified,  it  implies  the  "-b"  option.   (This  option   is
              necessary  because  netCDF  files  cannot be written directly to
              standard output, since standard output is not seekable.)

       -k file_format
              The -k flag specifies the format of the file to be created  and,
              by  inference,  the  data model accepted by ncgen (i.e. netcdf-3
              (classic) versus  netcdf-4).   The  possible  arguments  are  as
              follows.

                     ’1’,  ’classic’  =>  netcdf classic file format, netcdf-3
                     type model.

                     ’2’,  ’64-bit-offset’,  ’64-bit  offset’ => netcdf 64 bit
                     classic file format, netcdf-3 type model.

                     ’3’,  ’hdf5’,  ’netCDF-4’,  ’enhanced’  =>  netcdf-4 file
                     format, netcdf-4 type model.

                     ’4’, ’hdf5-nc3’, ’netCDF-4 classic model’, ’enhanced-nc3’
                     => netcdf-4 file format, netcdf-3 type model.
       If  no  -k  is  specified then it defaults to -k1 (i.e. classic).  Note
       also that -v is accepted to mean the same  thing  as  -k  for  backward
       compatibility,  but  -k is preferred, to match the corresponding ncdump
       option.

       -x     Don’t initialize data with  fill  values.   This  can  speed  up
              creation  of  large  netCDF files greatly, but later attempts to
              read unwritten data from the generated file will not  be  easily
              detectable.

       -l output_language
              The -l flag specifies the output language to use when generating
              source code that will create or define a  netCDF  file  matching
              the  netCDF  specification.   The  output is written to standard
              output.  The currently supported languages  have  the  following
              flags.

                     c|C’ => C language output.

                     f77|fortran77’ => FORTRAN 77 language output
                             ;  note  that currently only the classic model is
                             supported.

                     j|java’ => (experimental) Java language output
                             ; targets the existing  Unidata  Java  interface,
                             which  means  that  only  the  classic  model  is
                             supported.

EXAMPLES

       Check the syntax of the CDL file ‘foo.cdl’:

              ncgen foo.cdl

       From the CDL file ‘foo.cdl’, generate an equivalent binary netCDF  file
       named ‘x.nc’:

              ncgen -o x.nc foo.cdl

       From the CDL file ‘foo.cdl’, generate a C program containing the netCDF
       function invocations necessary to create an  equivalent  binary  netCDF
       file named ‘x.nc’:

              ncgen -c -o x.nc foo.cdl

USAGE

   CDL Syntax Overview
       Below  is  an  example  of  CDL  syntax,  describing a netCDF file with
       several named dimensions (lat, lon, and time), variables (Z, t, p,  rh,
       lat,  lon,  time),  variable attributes (units, long_name, valid_range,
       _FillValue), and some data.   CDL  keywords  are  in  boldface.   (This
       example  is  intended  to  illustrate the syntax; a real CDL file would
       have a more complete set of attributes so that the data would  be  more
       completely self-describing.)
              netcdf foo {  // an example netCDF specification in CDL

              types:
                  ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
                  opaque(11) opaque_t;
                  int(*) vlen_t;

              dimensions:
                   lat = 10, lon = 5, time = unlimited ;

              variables:
                   long    lat(lat), lon(lon), time(time);
                   float   Z(time,lat,lon), t(time,lat,lon);
                   double  p(time,lat,lon);
                   long    rh(time,lat,lon);

                   string  country(time,lat,lon);
                   ubyte   tag;

                   // variable attributes
                   lat:long_name = "latitude";
                   lat:units = "degrees_north";
                   lon:long_name = "longitude";
                   lon:units = "degrees_east";
                   time:units = "seconds since 1992-1-1 00:00:00";

                   // typed variable attributes
                   string Z:units = "geopotential meters";
                   float Z:valid_range = 0., 5000.;
                   double p:_FillValue = -9999.;
                   long rh:_FillValue = -1;
                   vlen_t :globalatt = {17, 18, 19};
              data:
                   lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
                   lon   = -140, -118, -96, -84, -52;
              group g {
              types:
                  compound cmpd_t { vlen_t f1; enum_t f2;};
              } // group g
              group h {
              variables:
                   /g/cmpd_t  compoundvar;
              data:
                      compoundvar = { {3,4,5}, Stratus } ;
              } // group h
              }

       All  CDL  statements  are terminated by a semicolon.  Spaces, tabs, and
       newlines can be used freely for readability.  Comments may  follow  the
       characters ‘//’ on any line.

       A  CDL  description consists of five optional parts: types, dimensions,
       variables,  data,  beginning  with  the  keyword  types:,  dimensions:,
       variables:,  and  data,  respectively.   The  variable part may contain
       variable declarations and  attribute  assignments.   All  sections  may
       contain global attribute assignments.

       In  addition,  after the data: section, the user may define a series of
       groups (see the example above).  Groups themselves can  contain  types,
       dimensions, variables, data, and other (nested) groups.

       The  netCDF type section declares the user defined types.  These may be
       constructed using any of the following types: enum,  vlen,  opaque,  or
       compound.

       A  netCDF  dimension  is used to define the shape of one or more of the
       multidimensional variables contained in  the  netCDF  file.   A  netCDF
       dimension  has  a  name and a size.  A dimension can have the unlimited
       size, which means a variable using  this  dimension  can  grow  to  any
       length in that dimension.

       A  variable  represents  a multidimensional array of values of the same
       type.  A variable has a name, a data type, and a shape described by its
       list  of dimensions.  Each variable may also have associated attributes
       (see below) as well as data values.  The name, data type, and shape  of
       a  variable are specified by its declaration in the variable section of
       a CDL description.  A variable may have the same name as  a  dimension;
       by   convention   such  a  variable  is  one-dimensional  and  contains
       coordinates of the  dimension  it  names.   Dimensions  need  not  have
       corresponding variables.

       A  netCDF  attribute  contains  information  about a netCDF variable or
       about the whole netCDF dataset.  Attributes are used  to  specify  such
       properties  as units, special values, maximum and minimum valid values,
       scaling factors, offsets, and  parameters.   Attribute  information  is
       represented by single values or arrays of values.  For example, "units"
       is an attribute represented by a character array such as "celsius".  An
       attribute  has  an  associated variable, a name, a data type, a length,
       and a value.  In contrast to variables  that  are  intended  for  data,
       attributes  are  intended  for  metadata  (data  about  data).   Unlike
       netCDF-3, attribute types can be any user defined type as well  as  the
       usual built-in types.

       In  CDL, an attribute is designated by a a type, a variable, a ’:’, and
       then an attribute name.  The type is optional and if missing,  it  will
       be  inferred from the values assigned to the attribute.  It is possible
       to assign global attributes not associated with  any  variable  to  the
       netCDF  as  a  whole  by  omitting  the  variable name in the attribute
       declaration.   Notice  that  there  is  a  potential  ambiguity  in   a
       specification such as
       x : a = ...
       In  this situation, x could be either a type for a global attribute, or
       the variable name for an attribute. Since there could both  be  a  type
       named  x  and  a  variable named x, there is an ambiguity.  The rule is
       that in this situation, x will be interpreted as a  type  if  possible,
       and otherwise as a variable.

       If  not specified, the data type of an attribute in CDL is derived from
       the type of the value(s) assigned to it.  The length of an attribute is
       the  number  of data values assigned to it, or the number of characters
       in the character string assigned to it.  Multiple values  are  assigned
       to  non-character attributes by separating the values with commas.  All
       values assigned to an attribute must be of the same type.

       The names for CDL dimensions, variables, and attributes must begin with
       an  alphabetic  character  or  ‘_’,  and  subsequent  characters may be
       alphanumeric or ‘_’ or ‘-’.

       The optional data section  of  a  CDL  specification  is  where  netCDF
       variables  may  be  initialized.   The  syntax  of an initialization is
       simple: a variable name, an equals sign, and a comma-delimited list  of
       constants  (possibly separated by spaces, tabs and newlines) terminated
       with a semicolon.  For multi-dimensional  arrays,  the  last  dimension
       varies  fastest.   Thus  row-order rather than column order is used for
       matrices.  If fewer values are supplied  than  are  needed  to  fill  a
       variable,  it is extended with a type-dependent ‘fill value’, which can
       be overridden  by  supplying  a  value  for  a  distinguished  variable
       attribute  named  ‘_FillValue’.   The types of constants need not match
       the type declared  for  a  variable;  coercions  are  done  to  convert
       integers  to floating point, for example.  The constant ‘_’ can be used
       to designate the fill value for a variable.

   Primitive Data Types
              char characters
              byte 8-bit data
              short     16-bit signed integers
              int  32-bit signed integers
              long (synonymous with int)
              int64     64-bit signed integers
              float     IEEE single precision floating point (32 bits)
              real (synonymous with float)
              double    IEEE double precision floating point (64 bits)
              ubyte     unsigned 8-bit data
              ushort    16-bit unsigned integers
              uint 32-bit unsigned integers
              uint64    64-bit unsigned integers
              string    arbitrary length strings

       CDL supports a superset of the primitive data types of  C.   The  names
       for the primitive data types are reserved words in CDL, so the names of
       variables, dimensions, and attributes must not be primitive type names.
       In  declarations,  type names may be specified in either upper or lower
       case.

       Bytes differ from characters in that they are intended to hold  a  full
       eight  bits  of data, and the zero byte has no special significance, as
       it does for character data.  ncgen converts byte declarations  to  char
       declarations  in  the  output  C  code  and  to  the  nonstandard  BYTE
       declaration in output Fortran code.

       Shorts can hold values between -32768 and 32767.  ncgen converts  short
       declarations  to  short  declarations  in  the output C code and to the
       nonstandard INTEGER*2 declaration in output Fortran code.

       Ints  can  hold  values  between  -2147483648  and  2147483647.   ncgen
       converts  int declarations to int declarations in the output C code and
       to INTEGER declarations in output Fortran code.  long is accepted as  a
       synonym  for int in CDL declarations, but is deprecated since there are
       now platforms with 64-bit representations for C longs.

       Int64   can    hold    values    between    -9223372036854775808    and
       9223372036854775807.   ncgen  converts  int64  declarations to longlong
       declarations in the output C code.

       Floats can  hold  values  between  about  -3.4+38  and  3.4+38.   Their
       external  representation  is as 32-bit IEEE normalized single-precision
       floating point numbers.  ncgen converts  float  declarations  to  float
       declarations  in  the  output C code and to REAL declarations in output
       Fortran code.   real  is  accepted  as  a  synonym  for  float  in  CDL
       declarations.

       Doubles  can  hold  values  between  about -1.7+308 and 1.7+308.  Their
       external representation is as 64-bit IEEE standard  normalized  double-
       precision  floating  point numbers.  ncgen converts double declarations
       to double declarations in the output C code  and  to  DOUBLE  PRECISION
       declarations in output Fortran code.

       The  unsigned counterparts of the above integer types are mapped to the
       corresponding unsigned C types.  Their ranges are suitably modified  to
       start at zero.

   CDL Constants
       Constants  assigned  to  attributes  or  variables may be of any of the
       basic netCDF types.  The syntax for constants is similar to  C  syntax,
       except  that  type  suffixes  must  be appended to shorts and floats to
       distinguish them from longs and doubles.

       A byte constant is  represented  by  a  single  character  or  multiple
       character escape sequence enclosed in single quotes.  For example,
               ’a’      // ASCII ‘a’
               ’\0’          // a zero byte
               ’\n’          // ASCII newline character
               ’\33’         // ASCII escape character (33 octal)
               ’\x2b’   // ASCII plus (2b hex)
               ’\377’   // 377 octal = 255 decimal, non-ASCII

       Character  constants  are enclosed in double quotes.  A character array
       may be represented as a string enclosed in double quotes.  The usual  C
       string escape conventions are honored.  For example
              "a"       // ASCII ‘a’
              "Two\nlines\n" // a 10-character string with two embedded newlines
              "a bell:\007"  // a string containing an ASCII bell
       Note  that  the  netCDF  character array "a" would fit in a one-element
       variable, since no terminating NULL character is assumed.   However,  a
       zero  byte  in  a  character  array  is  interpreted  as the end of the
       significant  characters  by  the  ncdump  program,  following   the   C
       convention.   Therefore,  a  NULL  byte  should  not  be  embedded in a
       character string unless at the end: use the byte data type instead  for
       byte arrays that contain the zero byte.

       short  integer  constants  are  intended for representing 16-bit signed
       quantities.  The form of a short constant is an integer  constant  with
       an  ‘s’  or  ‘S’  appended.  If a short constant begins with ‘0’, it is
       interpreted as octal, except  that  if  it  begins  with  ‘0x’,  it  is
       interpreted as a hexadecimal constant.  For example:
              -2s  // a short -2
              0123s     // octal
              0x7ffs  //hexadecimal

       int  integer  constants  are  intended  for  representing 32-bit signed
       quantities.  The form  of  an  int  constant  is  an  ordinary  integer
       constant,  although  it  is acceptable to append an optional ‘l’ or ‘L’
       (again, deprecated).  If  an  int  constant  begins  with  ‘0’,  it  is
       interpreted  as  octal,  except  that  if  it  begins  with ‘0x’, it is
       interpreted as a hexadecimal constant (but see opaque constants below).
       Examples of valid int constants include:
              -2
              1234567890L
              0123      // octal
              0x7ff          // hexadecimal

       int64  integer  constants  are  intended for representing 64-bit signed
       quantities.  The form of an int64 constant is an integer constant  with
       an  ‘ll’ or ‘LL’ appended.  If an int64 constant begins with ‘0’, it is
       interpreted as octal, except  that  if  it  begins  with  ‘0x’,  it  is
       interpreted as a hexadecimal constant.  For example:
              -2ll // an unsigned -2
              0123LL    // octal
              0x7ffLL  //hexadecimal

       Floating point constants of type float are appropriate for representing
       floating point data with about seven significant digits  of  precision.
       The form of a float constant is the same as a C floating point constant
       with an ‘f’ or  ‘F’  appended.   For  example  the  following  are  all
       acceptable float constants:
              -2.0f
              3.14159265358979f   // will be truncated to less precision
              1.f

       Floating   point   constants   of   type  double  are  appropriate  for
       representing floating point data with about sixteen significant  digits
       of  precision.   The  form  of  a  double  constant  is the same as a C
       floating point constant.  An optional ‘d’ or ‘D’ may be appended.   For
       example the following are all acceptable double constants:
              -2.0
              3.141592653589793
              1.0e-20
              1.d

       Unsigned  integer  constants  can be created by appending the character
       ’U’ or ’u’ between the constant and any trailing size specifier.   Thus
       one could say 10U, 100us, 100000ul, or 1000000ull, for example.

       String  constants  are,  like  character  constants,  represented using
       double quotes. This represents a potential  ambiguity  since  a  multi-
       character  string  may  also  indicate  a  dimensioned character value.
       Disambiguation usually occurs by context, but care should be  taken  to
       specify thestring type to ensure the proper choice.

       Opaque  constants  are  represented  as sequences of hexadecimal digits
       preceded by 0X or 0x: 0xaa34ffff, for  example.   These  constants  can
       still  be  used  as  integer  constants and will be either truncated or
       extended as necessary.

   Compound Constant Expressions
       In order to assign values to variables (or attributes)  whose  type  is
       user-defined  type,  the constant notation has been extended to include
       sequences of constants enclosed in  curly  brackets  (e.g.  "{"..."}").
       Such  a  constant is called a compound constant, and compound constants
       can be nested.

       Given a type "T(*) vlen_t", where T is some other arbitrary base  type,
       constants for this should be specified as follows.
           vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};
       The values tij, are assumed to be constants of type T.

       Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
       other arbitrary base types, constants for this should be  specified  as
       follows.
           cmpd_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2n};
       The  values tij, are assumed to be constants of type Ti.  If the fields
       are missing, then they will be set using any specified or default  fill
       value for the field’s base type.

       The general set of rules for using braces are defined in the Specifying
       Datalists section below.

   Scoping Rules
       With the addition of groups, the name space for defined objects  is  no
       longer flat. References (names) of any type, dimension, or variable may
       be prefixed with the absolute path specifying a  specific  declaration.
       Thus one might say
           variables:
               /g1/g2/t1 v1;
       The  type  being  referenced  (t1) is the one within group g2, which in
       turn is nested in group g1.  The similarity of this  notation  to  Unix
       file  paths  is  deliberate,  and  one can consider groups as a form of
       directory structure.

       1. When name is not prefixed, then scope rules are  applied  to  locate
              the specified declaration. Currently, there are three rules: one
              for dimensions, one for types and enumeration constants, and one
              for all others.

       2.  When  an  unprefixed  name of a dimension is used (as in a variable
              declaration), ncgen first looks  in  the  immediately  enclosing
              group  for  the  dimension.   If  it is not found there, then it
              looks in the group enclosing this group.  This continues up  the
              group  hierarchy  until  the dimension is found, or there are no
              more groups to search.

       3. For all  other  names,  only  the  immediately  enclosing  group  is
              searched.

       When  an  unprefixed name of a type or an enumeration constant is used,
       ncgen searches the group tree using  a  pre-order  depth-first  search.
       This  essentially means that it will find the matching declaration that
       precedes the reference textually in the cdl file and that is  "highest"
       in the group hierarchy.

       One  final  note.  Forward references are not allowed.  This means that
       specifying, for example, /g1/g2/t1 will fail if this  reference  occurs
       before g1 and/or g2 are defined.

   Special Attributes
       Special,  virtual,  attributes can be specified to provide performance-
       related  information  about  the  file  format   and   about   variable
       properties.  The file must be a netCDF-4 file for these to take effect.

       These special virtual attributes are not actually  part  of  the  file,
       they are merely a convenient way to set miscellaneous properties of the
       data in CDL

       The special attributes currently supported are as  follows:  ‘_Format’,
       ‘_Fletcher32,     ‘_ChunkSizes’,     ‘_Endianness’,    ‘_DeflateLevel’,
       ‘_Shuffle’, and ‘_Storage’.

       ‘_Format’ is a global attribute specifying the netCDF  format  variant.
       Its  value  must  be a single string matching one of ‘classic’, ‘64-bit
       offset’, ‘netCDF-4’, or ‘netCDF-4 classic model’.

       The rest  of  the  special  attributes  are  all  variable  attributes.
       Essentially  all  of  then  map  to some corresponding ‘nc_def_var_XXX’
       function as  defined  in  the  netCDF-4  API.   ‘_Fletcher32  sets  the
       ‘fletcher32’ property for a variable.  ‘_Endianness’ is either ‘little’
       or ‘big’, depending on how the variable is stored when  first  written.
       ‘_DeflateLevel’  is an integer between 0 and 9 inclusive if compression
       has been specified for the variable.  ‘_Shuffle’ is 1  if  use  of  the
       shuffle   filter   is   specified  for  the  variable.   ‘_Storage’  is
       ‘contiguous’ or ‘chunked’.  ‘_ChunkSizes’ is a list of chunk sizes  for
       each dimension of the variable

   Specifying Datalists
       Specifying  datalists  for  variables  in  the  ‘data:‘  section can be
       somewhat complicated. There are some rules that  must  be  followed  to
       ensure that datalists are parsed correctly by ncgen.

       1. The top level is automatically assumed to be a list of items,
                 so it should not be inside {...}.

       2. Instances of UNLIMITED dimensions (other than the first dimension)
                 must be surrounded by {...} in order to specify the size.

       3. Instances of vlens must be surrounded by {...} in order to
                 specify the size.

       4. Compound instances must be embedded in {...}

       5. Non-scalar fields of compound instances must be embedded in {...}.

       6.  Datalists associated with attributes are implicitly a vector (i.e.,
              a list) of values of the type of the  attribute  and  the  above
              rules must apply with that in mind.

       7. No other use of braces is allowed.

       Note  that  one  consequence  of  these  rules is that arrays of values
       cannot have subarrays within braces.  Thus,  given,  for  example,  int
       var(d1)(d2)...(dn),  a datalist for this variable must be a single list
       of integers, where the number of integers is no more than D=d1*d2*...dn
       values;  note  that  the  list  can  be less than D, in which case fill
       values will be used to pad the list.

       Rule 6 about attribute datalist has the following consequence.  If  the
       type  of  the attribute is a compound (or vlen) type, and if the number
       of entries in the list is one, then  the  compound  instances  must  be
       enclosed in braces.

BUGS

       The   programs   generated   by  ncgen  when  using  the  -c  flag  use
       initialization statements to store data in variables, and will fail  to
       produce  compilable programs if you try to use them for large datasets,
       since the resulting statements may exceed the line length or number  of
       continuation statements permitted by the compiler.

       The  CDL  syntax  makes  it  easy to assign what looks like an array of
       variable-length strings to a  netCDF  variable,  but  the  strings  may
       simply be concatenated into a single array of characters.  Specific use
       of the string type specifier may solve the problem