Man Linux: Main Page and Category List

NAME

       libunwind-dynamic -- libunwind-support for runtime-generated code

INTRODUCTION

       For  libunwind  to  do  its job, it needs to be able to reconstruct the
       frame state of each frame in a call-chain. The  frame  state  describes
       the  subset  of  the machine-state that consists of the frame registers
       (typically the  instruction-pointer  and  the  stack-pointer)  and  all
       callee-saved   registers   (preserved   registers).   The  frame  state
       describes each register either by  providing  its  current  value  (for
       frame  registers)  or  by  providing  the location at which the current
       value is stored (callee-saved registers).

       For statically generated code, the  compiler  normally  takes  care  of
       emitting  unwind-info  which provides the minimum amount of information
       needed to  reconstruct  the  frame-state  for  each  instruction  in  a
       procedure.  For  dynamically generated code, the runtime code generator
       must use the dynamic unwind-info interface  provided  by  libunwind  to
       supply  the  equivalent  information.  This  manual  page describes the
       format of this information in detail.

       For the purpose of this discussion, a procedure is  defined  to  be  an
       arbitrary  piece  of contiguous code. Normally, each procedure directly
       corresponds to a function  in  the  source-language  but  this  is  not
       strictly   required.   For  example,  a  runtime  code-generator  could
       translate  a  given  function   into   two   separate   (discontiguous)
       procedures:   one  for  frequently-executed  (hot)  code  and  one  for
       rarely-executed  (cold)   code.   Similarly,   simple   source-language
       functions  (usually  leaf  functions)  may get translated into code for
       which the default unwind-conventions apply and for such code, it is not
       strictly necessary to register dynamic unwind-info.

       A  procedure  logically consists of a sequence of regions.  Regions are
       nested in the sense that the frame state at the end of one  region  is,
       by  default,  assumed  to  be the frame state for the next region. Each
       region is thought of as being divided into a prologue, a body,  and  an
       epilogue.   Each  of them can be empty. If non-empty, the prologue sets
       up the frame state for the body. For example, the prologue may need  to
       allocate  some  space  on  the  stack  and  save  certain  callee-saved
       registers. The body performs the actual work of the procedure but  does
       not  change  the  frame  state  in  any way. If non-empty, the epilogue
       restores the previous frame state and as such it undoes or cancels  the
       effect  of the prologue. In fact, a single epilogue may undo the effect
       of the prologues of several (nested) regions.

       We should point out that even though the prologue, body,  and  epilogue
       are   logically  separate  entities,  optimizing  code-generators  will
       generally interleave instructions from all  three  entities.  For  this
       reason,  the  dynamic  unwind-info  interface  of  libunwind  makes  no
       distinction whatsoever between prologue and body. Similarly, the  exact
       set  of  instructions that make up an epilogue is also irrelevant.  The
       only point in the epilogue that needs to be described explicitly by the
       dynamic  unwind-info  is  the  point  at  which  the stack-pointer gets
       restored. The reason this point needs to be described is that once  the
       stack-pointer  is restored, all values saved in the deallocated portion
       of the stack frame become invalid and hence  libunwind  needs  to  know
       about  it.  The  portion  of  the frame state not saved on the stack is
       assume to remain valid through the end of the region. For this  reason,
       there  is  usually  no  need to describe instructions which restore the
       contents of callee-saved registers.

       Within a region, each instruction that affects the frame state in  some
       fashion  needs  to  be described with an operation descriptor. For this
       purpose, each instruction in the region is  assigned  a  unique  index.
       Exactly  how  this  index  is  derived depends on the architecture. For
       example, on RISC and EPIC-style architecture, instructions have a fixed
       size  so  it's possible to simply number the instructions. In contrast,
       most CISC use variable-length instruction encodings, so it  is  usually
       necessary  to  use  a  byte-offset  as the index. Given the instruction
       index, the operation descriptor specifies the effect of the instruction
       in  an  abstract  manner.  For  example,  it  might  express  that  the
       instruction stores calle-saved register r1 at offset 16  in  the  stack
       frame.

PROCEDURES

       A  runtime  code-generator  registers  the  dynamic  unwind-info  of  a
       procedure by setting up a structure of type unw_dyn_info_t and  calling
       _U_dyn_register(),  passing  the  address  of the structure as the sole
       argument. The members of the  unw_dyn_info_t  structure  are  described
       below:

       void *next
               Private to libunwind.  Must not be used by the application.

       void *prev
               Private to libunwind.  Must not be used by the application.

       unw_word_t start_ip
               The   start-address   of  the  instructions  of  the  procedure
              (remember: procedure are defined  to  be  contiguous  pieces  of
              code, so a single code-range is sufficient).

       unw_word_t end_ip
               The   end-address   of   the   instructions  of  the  procedure
              (non-inclusive, that is, end_ip-start_ip  is  the  size  of  the
              procedure in bytes).

       unw_word_t gp
               The  global-pointer  value in use for this procedure. The exact
              meaing of the global-pointer  is  architecture-specific  and  on
              some architecture, it is not used at all.

       int32_t format
               The  format  of  the  unwind-info.   This  member can be one of
              UNW_INFO_FORMAT_DYNAMIC,        UNW_INFO_FORMAT_TABLE,        or
              UNW_INFO_FORMAT_REMOTE_TABLE.

       union u
               This union contains one sub-member structure for every possible
              unwind-info format:

              unw_dyn_proc_info_t pi
                      This member is used for format  UNW_INFO_FORMAT_DYNAMIC.

              unw_dyn_table_info_t ti
                      This member is used for format UNW_INFO_FORMAT_TABLE.

              unw_dyn_remote_table_info_t rti
                      This       member      is      used      for      format
                     UNW_INFO_FORMAT_REMOTE_TABLE.

              The format of these sub-members is described in detail below.

   PROC-INFO FORMAT
       This is the preferred dynamic unwind-info format and  it  is  generally
       the one used by full-blown runtime code-generators. In this format, the
       details  of  a  procedure  are  described  by  a  structure   of   type
       unw_dyn_proc_info_t.  This structure contains the following members:

       unw_word_t name_ptr
               The address of a (human-readable) name of the procedure or 0 if
              no such name is available. If non-zero,  The  string  stored  at
              this  address must be ASCII NUL terminated. For source languages
              that use name-mangling (such as C++ or Java) the  string  stored
              at this address should be the demangled version of the name.

       unw_word_t handler
               The  address  of  the  personality-routine  for this procedure.
              Personality-routines are  used  in  conjunction  with  exception
              handling.        See        the        C++       ABI       draft
              (http://www.codesourcery.com/cxx-abi/) for  an  overview  and  a
              description  of the personality routine. If the procedure has no
              personality routine, handler must be set to 0.

       uint32_t flags
               A bitmask of flags. At the moment, no flags have  been  defined
              and this member must be set to 0.

       unw_dyn_region_info_t *regions
               A   NULL-terminated  linked  list  of  region-descriptors.  See
              section ``Region descriptors'' below for more details.

   TABLE-INFO FORMAT
       This format is generally used when the dynamically generated  code  was
       derived  from  static  code and the unwind-info for the dynamic and the
       static versions is identical. For example, this format  can  be  useful
       when  loading  statically-generated  code  into  an  address-space in a
       non-standard fashion (i.e., through some means  other  than  dlopen()).
       In  this format, the details of a group of procedures is described by a
       structure of type  unw_dyn_table_info.   This  structure  contains  the
       following members:

       unw_word_t name_ptr
               The address of a (human-readable) name of the procedure or 0 if
              no such name is available. If non-zero,  The  string  stored  at
              this  address must be ASCII NUL terminated. For source languages
              that use name-mangling (such as C++ or Java) the  string  stored
              at this address should be the demangled version of the name.

       unw_word_t segbase
               The   segment-base   value  that  needs  to  be  added  to  the
              segment-relative values stored in  the  unwind-info.  The  exact
              meaning of this value is architecture-specific.

       unw_word_t table_len
               The  length of the unwind-info (table_data) counted in units of
              words (unw_word_t).

       unw_word_t table_data
               A pointer to the actual  data  encoding  the  unwind-info.  The
              exact format is architecture-specific (see architecture-specific
              sections below).

   REMOTE TABLE-INFO FORMAT
       The remote table-info format has the same basic purpose as the  regular
       table-info  format. The only difference is that when libunwind uses the
       unwind-info, it will keep the table data in  the  target  address-space
       (which  may be remote). Consequently, the type of the table_data member
       is unw_word_t rather than a pointer.  This implies that libunwind  will
       have  to  access  the  table-data  via the address-space's access_mem()
       call-back, rather than through a direct memory reference.

       From the  point  of  view  of  a  runtime-code  generator,  the  remote
       table-info  format  offers  no  advantage  and it is expected that such
       generators will describe their procedures  either  with  the  proc-info
       format or the normal table-info format. The main reason that the remote
       table-info  format  exists  is  to  enable  the  address-space-specific
       find_proc_info()  callback  (see  unw_create_addr_space(3))  to  return
       unwind tables whose data remains in remote memory. This  can  speed  up
       unwinding  (e.g., for a debugger) because it reduces the amount of data
       that needs to be loaded from remote memory.

REGIONS DESCRIPTORS

       A region descriptor is a variable length structure that  describes  how
       each instruction in the region affects the frame state. Of course, most
       instructions in a region usualy do not change the frame state  and  for
       those,  nothing needs to be recorded in the region descriptor. A region
       descriptor is a structure of type  unw_dyn_region_info_t  and  has  the
       following members:

       unw_dyn_region_info_t *next
               A  pointer to the next region. If this is the last region, next
              is NULL.

       int32_t insn_count
               The length of the region in instructions. Each  instruction  is
              assumed to have a fixed size (see architecture-specific sections
              for details). The value of insn_count may  be  negative  in  the
              last  region  of  a  procedure (i.e., it may be negative only if
              next is NULL).  A  negative  value  indicates  that  the  region
              covers  the last N instructions of the procedure, where N is the
              absolute value of insn_count.

       uint32_t op_count
               The (allocated) length of the op_count array.

       unw_dyn_op_t op
               An array of dynamic unwind directives.  See  Section  ``Dynamic
              unwind directives'' for a description of the directives.

       A  region  descriptor with an insn_count of zero is an empty region and
       such regions are perfectly legal. In fact, empty regions can be  useful
       to  establish  a  particular  frame  state  before the start of another
       region.

       A single region list can be shared across multiple procedures  provided
       those procedures share a common prologue and epilogue (their bodies may
       differ, of course). Normally,  such  procedures  consist  of  a  canned
       prologue,  the  body, and a canned epilogue. This could be described by
       two regions: one covering the prologue and one covering  the  epilogue.
       Since  the  body  length  is  variable, the latter region would need to
       specify a negative value in insn_count such that libunwind  knows  that
       the region covers the end of the procedure (up to the address specified
       by end_ip).

       The region descriptor  is  a  variable  length  structure  to  make  it
       possible   to   allocate   all  the  necessary  memory  with  a  single
       memory-allocation request. To facilitate the  allocation  of  a  region
       descriptors  libunwind  provides  a  helper  routine with the following
       synopsis:

       size_t _U_dyn_region_size(int op_count);

       This routine returns the number  of  bytes  needed  to  hold  a  region
       descriptor  with  space  for  op_count unwind directives. Note that the
       length of the op array does not have to match exactly with  the  number
       of  directives  in  a region. Instead, it is sufficient if the op array
       contains at least as many entries as there are  directives,  since  the
       end  of  the  directives  can always be indicated with the UNW_DYN_STOP
       directive.

DYNAMIC UNWIND DIRECTIVES

       A dynamic unwind directive describes how the frame state changes  at  a
       particular  point  within a region. The description is in the form of a
       structure of type  unw_dyn_op_t.   This  structure  has  the  following
       members:

       int8_t tag
               The  operation  tag.  Must  be  one  of the unw_dyn_operation_t
              values described below.

       int8_t qp
               The qualifying predicate that  controls  whether  or  not  this
              directive  is active. This is useful for predicated architecturs
              such  as  IA-64  or  ARM,  where   the   contents   of   another
              (callee-saved) register determines whether or not an instruction
              is executed (takes effect). If the directive is  always  active,
              this  member  should  be set to the manifest constant _U_QP_TRUE
              (this constant is defined for all architectures,  predicated  or
              not).

       int16_t reg
               The number of the register affected by the instruction.

       int32_t when
               The  region-relative  number  of  the instruction to which this
              directive applies. For example, a value  of  0  means  that  the
              effect  described  by  this  directive  has taken place once the
              first instruction in the region has executed.

       unw_word_t val
               The value to be applied by the operation tag. The exact meaning
              of  this  value  varies  by  tag. See Section ``Operation tags''
              below.

       It  is  perfectly  legitimate  to  specify  multiple   dynamic   unwind
       directives  with the same when value, if a particular instruction has a
       complex effect on the frame state.

       Empty regions by definition contain no actual instructions and as  such
       the directives are not tied to a particular instruction. By convention,
       the when member should be set to 0, however.

       There is no need for the dynamic unwind directives to appear  in  order
       of  increasing  when  values.  If the directives happen to be sorted in
       that order, it may result in slightly faster execution, but  a  runtime
       code-generator  should  not go to extra lengths just to ensure that the
       directives are sorted.

       IMPLEMENTATION  NOTE:  should  libunwind  implementations  for  certain
       architectures  prefer the list of unwind directives to be sorted, it is
       recommended that such implementations  first  check  whether  the  list
       happens  to  be  sorted  already  and,  if  not,  sort  the  directives
       explicitly before the first use. With this approach,  the  overhead  of
       explicit  sorting  is only paid when there is a real benefit and if the
       runtime code-generator happens to generated sorted lists naturally, the
       performance penalty is limited to a simple O(N) check.

   OPERATIONS TAGS
       The   possible   operation   tags   are  defined  by  enumeration  type
       unw_dyn_operation_t which defines the following values:

       UNW_DYN_STOP
               Marks the  end  of  the  dynamic  unwind  directive  list.  All
              remaining  entries  in the op array of the region-descriptor are
              ignored. This tag is guaranteed to have a value of 0.

       UNW_DYN_SAVE_REG
               Marks an instruction which saves register reg to register  val.

       UNW_DYN_SPILL_FP_REL
               Marks   an   instruction   which   spills  register  reg  to  a
              frame-pointer-relative  location.   The   frame-pointer-relative
              offset  is  given  by  the  value stored in member val.  See the
              architecture-specific sections for a description  of  the  stack
              frame layout.

       UNW_DYN_SPILL_SP_REL
               Marks   an   instruction   which   spills  register  reg  to  a
              stack-pointer-relative  location.   The   stack-pointer-relative
              offset  is  given  by  the  value stored in member val.  See the
              architecture-specific sections for a description  of  the  stack
              frame layout.

       UNW_DYN_ADD
               Marks  an  instruction  which  adds  the  constant value val to
              register reg.  To add  subtract  a  constant  value,  store  the
              two's-complement of the value in val.  The set of registers that
              can  be  specified  for   this   tag   is   described   in   the
              architecture-specific sections below.

       UNW_DYN_POP_FRAMES
               .PP

       UNW_DYN_LABEL_STATE
               .PP

       UNW_DYN_COPY_STATE
               .PP

       UNW_DYN_ALIAS
               .PP unw_dyn_op_t

       _U_dyn_op_save_reg();                         _U_dyn_op_spill_fp_rel();
       _U_dyn_op_spill_sp_rel();   _U_dyn_op_add();    _U_dyn_op_pop_frames();
       _U_dyn_op_label_state();   _U_dyn_op_copy_state();   _U_dyn_op_alias();
       _U_dyn_op_stop();

IA-64 SPECIFICS

       - meaning of segbase member in  table-info/table-remote-info  format  -
       format   of   table_data   in   table-info/table-remote-info  format  -
       instruction size: each bundle is counted as 3 instructions,  regardless
       of  template  (MLX)  -  describe  stack-frame  layout,  especially with
       regards to sp-relative and fp-relative  addressing  -  UNW_DYN_ADD  can
       only add to ``sp'' (always a negative value); use POP_FRAMES otherwise

SEE ALSO

       libunwind(3), _U_dyn_register(3), _U_dyn_cancel(3)

AUTHOR

       David Mosberger-Tang
       Email: dmosberger@gmail.com
       WWW: http://www.nongnu.org/libunwind/.