The Abstract Format redux

The Extraction phase of Erlex works with the Elixir version of an Erlang abstract syntax tree (AST). This data structure, known as the abstract format, defines one or more Erlang modules.

The official documentation for this structure is The Abstract Format, so Erlex development will lean heavily on that document. Unfortunately, although the material is detailed and definitive, I found it somewhat impenetrable.

So, I decided to create my own "expanded" version. The initial changes mostly involved formatting and copy editing: increasing white space and font usage, rewording text for clarity, etc. I then renamed a few variables and added numerous annotations. As I continue to work with the revised document, I expect to add assorted comments and questions.

It's quite possible that my understanding and explanations are flawed, so the original document should be regarded as definitive, by default. (For ease of comparison, I retained the original organization and headings.) Please feel free, in any case, to send me feedback!

6.0 The Abstract Format

This document describes the standard representation of parse trees for Erlang programs. This representation, which uses Erlang terms, is known as the abstract format. Functions dealing with such parse trees are compile:forms/[1,2] and functions in the modules epp, erl_eval, erl_lint, erl_pp, erl_parse, and io. Parse trees are also used as input and output for parse transforms (see the compile module).

We use the function Rep to denote the mapping from an Erlang source construct C to its abstract format representation R, and write R = Rep(C).

The word LINE below represents an integer, and denotes the number of the line in the source file where the construction occurred. Several instances of LINE in the same construction may denote different lines.

Operators are not terms in their own right. So, when operators are mentioned below, the representation should be taken to be the atom with a printname consisting of the same characters as the operator.
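
To make Rep and LINE concrete, here is a minimal sketch that uses erl_scan and erl_parse from the standard library to obtain the representation of a single expression. The module name abs_intro and the helper rep/1 are hypothetical names chosen for illustration, and the LINE values shown assume that scanning starts at line 1 (the default for erl_scan:string/1).

```erlang
-module(abs_intro).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): scan and parse one expression,
%% terminated by a period, and return its abstract format.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    %% The operator + is represented by the atom '+'.
    {op, 1, '+', {integer, 1, 1}, {integer, 1, 2}} = rep("1 + 2."),
    %% Several LINE values in one construct may differ:
    %% here the second operand sits on line 2.
    {op, 1, '+', {integer, 1, 1}, {integer, 2, 2}} = rep("1 +\n2."),
    ok.
```

Calling abs_intro:demo() returns ok if every pattern match succeeds; a mismatch raises a badmatch error that shows the actual term.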

6.1 Module declarations and forms

Erlang code is divided into modules. A module declaration consists of a sequence of forms that are either function declarations or attributes. Each of these is terminated by a period (.).

  • If D is a module declaration with forms F_i, then:
      F_1, ..., F_k yields Rep(D):
      [ Rep(F_1), ..., Rep(F_k) ]

  • If F is a module attribute, then:
      -module( Mod ) yields Rep(F):
      { attribute, LINE, module, Mod }

  • If F is an export attribute with functions Fun_i and arities A_i, then:
      -export( [ Fun_1/A_1, ..., Fun_k/A_k ] ) yields Rep(F):
      { attribute, LINE, export,
        [ { Fun_1, A_1 }, ..., { Fun_k, A_k } ] }

  • If F is an import attribute with functions Fun_i and arities A_i, then:
      -import( Mod, [ Fun_1/A_1, ..., Fun_k/A_k ] ) yields Rep(F):
      { attribute, LINE, import,
        { Mod, [ { Fun_1, A_1 }, ..., { Fun_k, A_k } ] } }

  • If F is a compile attribute with options Options (a single option or a list of options), then:
      -compile( Options ) yields Rep(F):
      { attribute, LINE, compile, Options }

  • If F is a file attribute that specifies the current File and Line, then:
      -file( File, Line ) yields Rep(F):
      { attribute, LINE, file,
        { File, Line } }

  • If F is a record declaration that specifies V_i (field/value expressions), then:
      -record( Name, { V_1, ..., V_k } ) yields Rep(F):
      { attribute, LINE, record,
        { Name,
          [ Rep(V_1), ..., Rep(V_k) ] } }

    For Rep(V_i), see below.

  • If F is a type attribute (i.e., opaque or type) where each A_i is a variable, then:
      -Attr Name( A_1, ..., A_k ) :: T yields Rep(F):
      { attribute, LINE, Attr,
        { Name, Rep(T),
          [ Rep(A_1), ..., Rep(A_k) ] } }

    For Rep(T), see below.

  • If F is a type spec (i.e., callback or spec) for function Fun, where each Tc_i
    is a fun type clause with an argument sequence of the same length (i.e., Arity), then:
      -Attr Fun Tc_1; ...; Tc_k yields Rep(F):
      { attribute, LINE, Attr,
        { { Fun, Arity },
          [ Rep(Tc_1), ..., Rep(Tc_k) ] } }

    For Rep(Tc_i), see below.

  • If F is a type spec (i.e., callback or spec) for module Mod and function Fun,
    where each Tc_i is a fun type clause with an argument sequence of the same
    length (i.e., Arity), then:
      -Attr Mod:Fun Tc_1; ...; Tc_k yields Rep(F):
      { attribute, LINE, Attr,
        { { Mod, Fun, Arity },
          [ Rep(Tc_1), ..., Rep(Tc_k) ] } }

    For Rep(Tc_i), see below.

  • If F is a wild attribute specified by a name A and a term T, then:
      -A(T) yields Rep(F):
      { attribute, LINE, A, T }

  • If F is a function declaration where each Fc_i is a function clause
    with a pattern sequence of the same length (i.e., Arity), then:
      Name Fc_1 ; ... ; Name Fc_k yields Rep(F):
      { function, LINE, Name, Arity,
        [ Rep(Fc_1), ..., Rep(Fc_k) ] }
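
The following sketch parses a few forms one at a time and checks them against the templates above. The module name abs_forms and the helper form/1 are hypothetical; every LINE is 1 because each form is scanned as its own one-line string.

```erlang
-module(abs_forms).
-export([form/1, demo/0]).

%% form/1 (hypothetical helper): scan and parse a single form.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

demo() ->
    {attribute, 1, module, m} = form("-module(m)."),
    {attribute, 1, export, [{f, 1}, {g, 2}]} = form("-export([f/1, g/2])."),
    {attribute, 1, import, {lists, [{map, 2}]}} =
        form("-import(lists, [map/2])."),
    %% A wild attribute keeps its argument as a plain term.
    {attribute, 1, author, me} = form("-author(me)."),
    %% A function declaration with one clause of arity 1.
    {function, 1, f, 1,
     [{clause, 1, [{var, 1, 'X'}], [], [{var, 1, 'X'}]}]} = form("f(X) -> X."),
    ok.
```

A whole module is then simply the list of such forms, in source order.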

Type clauses

  • If T is a fun type clause where A_i and Ret are types, then:
      ( A_1, ..., A_n ) -> Ret yields Rep(T):
      { type, LINE, 'fun',
        [ { type, LINE, product,
            [ Rep(A_1), ..., Rep(A_n) ] },
          Rep(Ret) ] }

  • If T is a bounded fun type clause, where Tc is an unbounded fun type clause
    and Tg is a type guard sequence, then:
      Tc when Tg yields Rep(T):
      { type, LINE, bounded_fun,
        [ Rep(Tc), Rep(Tg) ] }

Type guards

Legal guards in Erlang are boolean functions placed after the key word, "when" and before the arrow, "->". Guards may appear as part of a function definition or in "receive", 'if', "case", and "try/catch" expressions.

-- Erlang Programming/guards

  • If G is a type guard based on a constraint,
    where A is an atom and each T_i is a type, then:
      A( T_1, ..., T_k ) yields Rep(G):
      { type, LINE, constraint,
        [ Rep(A),
          [ Rep(T_1), ..., Rep(T_k) ] ] }

  • If G is a type guard based on a type definition,
    where Name is a variable and Type is a type, then:
      Name :: Type yields Rep(G):
      { type, LINE, constraint,
        [ { atom, LINE, is_subtype },
          [ Rep(Name), Rep(Type) ] ] }
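
As a sketch of how type clauses and type guards nest, the code below parses a bounded spec and takes it apart. The module name abs_bounded and the helper spec_clauses/1 are hypothetical. The helper assumes that erl_parse wraps a spec in an attribute tuple, which is what I observe with the standard parser.

```erlang
-module(abs_bounded).
-export([spec_clauses/1, demo/0]).

%% spec_clauses/1 (hypothetical helper): parse one -spec form and
%% return the list of fun type clause representations.
spec_clauses(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, {attribute, _Line, spec, {_NameArity, Clauses}}} =
        erl_parse:parse_form(Tokens),
    Clauses.

demo() ->
    [{type, 1, bounded_fun, [Tc, Tg]}] =
        spec_clauses("-spec f(X) -> X when X :: integer()."),
    %% Tc is the unbounded fun type clause.
    {type, 1, 'fun',
     [{type, 1, product, [{var, 1, 'X'}]}, {var, 1, 'X'}]} = Tc,
    %% Tg is a list of constraints; Name :: Type becomes is_subtype.
    [{type, 1, constraint,
      [{atom, 1, is_subtype},
       [{var, 1, 'X'}, {type, 1, integer, []}]]}] = Tg,
    ok.
```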

Types

Types describe sets of Erlang terms. Types consist of, and are built from, a set of predefined types, for example, integer(), atom(), and pid(). Predefined types represent a typically infinite set of Erlang terms that belong to this type. For example, the type atom() stands for the set of all Erlang atoms.

-- Types and Function Specifications

  • If T is an annotated type where Name is a variable and Type is a type, then:
      Name :: Type yields Rep(T):
      { ann_type, LINE,
        [ Rep(Name), Rep(Type) ] }

  • If T is a type union where each A_i is a type, then:
      A_1 | ... | A_k yields Rep(T):
      { type, LINE, union,
        [ Rep(A_1), ..., Rep(A_k) ] }

  • If T is a type range where L and R are types, then:
      L .. R yields Rep(T):
      { type, LINE, range,
        [ Rep(L), Rep(R) ] }

  • If T is a binary operation where Op is an arithmetic or bitwise binary operator
    and L and R are types, then:
      L Op R yields Rep(T):
      { op, LINE, Op, Rep(L), Rep(R) }

  • If T is a unary operation where Op is an arithmetic or bitwise unary operator
    and A is a type, then:
      Op A yields Rep(T):
      { op, LINE, Op, Rep(A) }

  • If T is a fun type, then:
      fun() yields Rep(T):
      { type, LINE, 'fun', [] }

  • If T is a variable V, where A is an atom with a printname
    consisting of the same characters as V, then:
      V yields Rep(T):
      { var, LINE, A }

  • If T is an atomic literal L which is not a string literal, then:
      L yields Rep(T):
      Rep(L)

  • If T is a tuple type or map type (i.e., tuple or map), then:
      F() yields Rep(T):
      { type, LINE, F, any }

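  • If T is a user-defined type named F where each A_i is a type, then:
      F( A_1, ..., A_k ) yields Rep(T):
      { user_type, LINE, F,
        [ Rep(A_1), ..., Rep(A_k) ] }

    A predefined type such as integer() has the same shape, but with the tag type
    in place of user_type.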

  • If T is a remote type named Name in module Mod,
    where each A_i is a type, then:
      Mod:Name( A_1, ..., A_k ) yields Rep(T):
      { remote_type, LINE,
        [ Rep(Mod), Rep(Name),
          [ Rep(A_1), ..., Rep(A_k) ] ] }

  • If T is the nil type, then:
      [] yields Rep(T):
      { type, LINE, nil, [] }

  • If T is a list type where A is a type, then:
      [ A ] yields Rep(T):
      { type, LINE, list,
        [ Rep(A) ] }

  • If T is a non-empty list type where A is a type, then:
      [ A, ... ] yields Rep(T):
      { type, LINE, nonempty_list,
        [ Rep(A) ] }

  • If T is a map type where each P_i is a map pair type, then:
      #{ P_1, ..., P_k } yields Rep(T):
      { type, LINE, map,
        [ Rep(P_1), ..., Rep(P_k) ] }

  • If T is a map pair type where K and V are types, then:
      K => V yields Rep(T):
      { type, LINE, map_field_assoc,
        [ Rep(K), Rep(V) ] }

  • If T is a tuple type where each A_i is a type, then:
      { A_1, ..., A_k } yields Rep(T):
      { type, LINE, tuple,
        [ Rep(A_1), ..., Rep(A_k) ] }

  • If T is a record type where Name is an atom, then:
      #Name{} yields Rep(T):
      { type, LINE, record,
        [ Rep(Name) ] }

  • If T is a record type where Name is an atom and F_i are record field types, then:
      #Name{ F_1, ..., F_k } yields Rep(T):
      { type, LINE, record,
        [ Rep(Name),
          Rep(F_1), ..., Rep(F_k) ] }

  • If T is a record field type where Name is an atom and Type is a type, then:
      Name :: Type yields Rep(T):
      { type, LINE, field_type,
        [ Rep(Name), Rep(Type) ] }

  • If T is the empty binary type, then:
      <<>> yields Rep(T):
      { type, LINE, binary,
        [ { integer, LINE, 0 },
          { integer, LINE, 0 } ] }

  • If T is a binary type where B is a type, then:
      << _ : B >> yields Rep(T):
      { type, LINE, binary,
        [ Rep(B),
          { integer, LINE, 0 } ] }

  • If T is a binary type where U is a type, then:
      << _ : _ * U >> yields Rep(T):
      { type, LINE, binary,
        [ { integer, LINE, 0 },
          Rep(U) ] }

  • If T is a binary type where B and U are types, then:
      << _ : B , _ : _ * U >> yields Rep(T):
      { type, LINE, binary,
        [ Rep(B), Rep(U) ] }

  • If T is a fun type with any number of arguments and Ret is a return type, then:
      fun( (...) -> Ret ) yields Rep(T):
      { type, LINE, 'fun',
        [ { type, LINE, any },
          Rep(Ret) ] }

  • If T is a fun type where Tc is an unbounded fun type clause, then:
      fun(Tc) yields Rep(T):
      Rep(Tc)
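
Here is a small sketch that checks a type declaration and a spec against the type rules. The module name abs_types and the helper form/1 are hypothetical, and t() stands for some user-defined type that is not declared here (the parser does not need the declaration). The user_type tag is what I see from the parser in OTP 18 and later.

```erlang
-module(abs_types).
-export([form/1, demo/0]).

%% form/1 (hypothetical helper): scan and parse a single form.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

demo() ->
    %% A type attribute: name, then Rep(T), then the type variables.
    {attribute, 1, type,
     {pair,
      {type, 1, tuple, [{var, 1, 'A'}, {type, 1, integer, []}]},
      [{var, 1, 'A'}]}} = form("-type pair(A) :: {A, integer()}."),
    %% A spec with a union argument and a list of a user-defined type.
    {attribute, 1, spec,
     {{f, 1},
      [{type, 1, 'fun',
        [{type, 1, product,
          [{type, 1, union, [{atom, 1, a}, {atom, 1, b}]}]},
         {type, 1, list, [{user_type, 1, t, []}]}]}]}} =
        form("-spec f(a | b) -> [t()]."),
    ok.
```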

Record fields

A record is a data structure for storing a fixed number of elements. It has named fields and is similar to a struct in C. Record expressions are translated to tuple expressions during compilation.

-- Records (Erlang Reference Manual User's Guide)

Each field in a record declaration may have an optional explicit default initializer expression.

  • In V, if A is an atom, then:
      A yields Rep(V):
      { record_field, LINE, Rep(A) }

  • In V, if A is an atom and E is an expression, then:
      A = E yields Rep(V):
      { record_field, LINE, Rep(A), Rep(E) }

  • In V, if A is an atom and T is a type that does not syntactically contain undefined, then:
      A :: T yields Rep(V):
      { typed_record_field,
        { record_field, LINE, Rep(A) },
        Rep(undefined | T) }

    Note that if T is an annotated type, it will be wrapped in parentheses.

  • In V, if A is an atom and T is any other type, then:
      A :: T yields Rep(V):
      { typed_record_field,
        { record_field, LINE, Rep(A) },
        Rep(T) }

  • In V, if A is an atom, E is an expression, and T is a type, then:
      A = E :: T yields Rep(V):
      { typed_record_field,
        { record_field, LINE, Rep(A), Rep(E) },
        Rep(T) }
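
The sketch below parses a record declaration with a plain field, a field with a default initializer, and a typed field. The module name abs_records and the helper form/1 are hypothetical. The type of the typed field is deliberately not matched, because whether undefined is added to it depends on the OTP release.

```erlang
-module(abs_records).
-export([form/1, demo/0]).

%% form/1 (hypothetical helper): scan and parse a single form.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

demo() ->
    {attribute, 1, record, {r, [A, B, C]}} =
        form("-record(r, {a, b = 1, c :: integer()})."),
    %% Field without an initializer.
    {record_field, 1, {atom, 1, a}} = A,
    %% Field with a default initializer expression.
    {record_field, 1, {atom, 1, b}, {integer, 1, 1}} = B,
    %% Typed field: the untyped field is wrapped; note that the
    %% typed_record_field tuple itself carries no LINE.
    {typed_record_field, {record_field, 1, {atom, 1, c}}, _Type} = C,
    ok.
```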

Representation of parse errors and end of file

In addition to the representations of forms, the list that represents a module declaration (as returned by functions in erl_parse and epp) may contain tuples that indicate exceptional situations.

  • Syntactically incorrect forms are denoted by { error, E }.

  • Warnings are denoted by { warning, W }.

  • An end of file (i.e., stream) encountered during the parsing of a form is denoted by { eof, LINE }.

6.2 Atomic literals

There are five kinds of atomic literals: atom, character, float, integer, and string. They are represented in the same way in patterns, expressions, and guards.

  • If L is an integer or character literal, then:
      L yields Rep(L):
      { integer, LINE, L }

  • If L is a float literal, then:
      L yields Rep(L):
      { float, LINE, L }

  • If L is a string literal consisting of the characters C_1, ..., C_k, then:
      L yields Rep(L):
      { string, LINE,
        [ C_1, ..., C_k ] }

  • If L is an atom literal, then:
      L yields Rep(L):
      { atom, LINE, L }

Note that negative integer and float literals do not occur as such; they are parsed as an application of the unary negation operator.
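
For experimenting with literals, erl_parse:abstract/1 converts an Erlang term to its abstract form and erl_parse:normalise/1 converts back. The sketch below assumes that abstract/1 uses 0 as LINE, which is what I observe; the module name abs_literals and the helper rep/1 are hypothetical.

```erlang
-module(abs_literals).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    %% Term to abstract form (LINE is 0 here, not a source line).
    {integer, 0, 42} = erl_parse:abstract(42),
    {float, 0, 1.5} = erl_parse:abstract(1.5),
    {atom, 0, foo} = erl_parse:abstract(foo),
    {string, 0, "hi"} = erl_parse:abstract("hi"),
    %% Abstract form back to a term.
    42 = erl_parse:normalise({integer, 0, 42}),
    %% A negative literal is the unary operator - applied to 1.
    {op, 1, '-', {integer, 1, 1}} = rep("-1."),
    ok.
```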

6.3 Patterns

  • If Ps is a sequence of patterns P_i, then:
      P_1, ..., P_k yields Rep(Ps):
      [ Rep(P_1), ..., Rep(P_k) ]

    Such sequences occur as the list of arguments to a function or fun.

Individual patterns are represented as follows:

  • If P is an atomic literal L, then:
      L yields Rep(P):
      Rep(L)

  • If P is a compound pattern composed of P_1 and P_2, then:
      P_1 = P_2 yields Rep(P):
      { match, LINE, Rep(P_1), Rep(P_2) }

  • If P is a variable pattern V, where A is an atom with a printname
    consisting of the same characters as V, then:
      V yields Rep(P):
      { var, LINE, A }

  • If P is a universal pattern, then:
      _ yields Rep(P):
      { var, LINE, '_' }

  • If P is a tuple pattern containing P_i, then:
      { P_1, ..., P_k } yields Rep(P):
      { tuple, LINE,
        [ Rep(P_1), ..., Rep(P_k) ] }

  • If P is a nil pattern, then:
      [] yields Rep(P):
      { nil, LINE }

  • If P is a cons pattern composed of patterns P_h (head) and P_t (tail), then:
      [ P_h | P_t ] yields Rep(P):
      { cons, LINE, Rep(P_h), Rep(P_t) }

  • If P is a binary pattern composed of patterns P_i, sizes Size_i,
    and type specifier lists TSL_i, then:
      <<P_1:Size_1/TSL_1, ..., P_k:Size_k/TSL_k>> yields Rep(P):
      { bin, LINE,
        [ { bin_element, LINE, Rep(P_1), Rep(Size_1), Rep(TSL_1) }, ...,
          { bin_element, LINE, Rep(P_k), Rep(Size_k), Rep(TSL_k) } ] }

    For Rep(TSL), see below. An omitted Size is represented by the atom default.
    An omitted TSL (type specifier list) is also represented by the atom default.

  • In P, if Op is a binary operator and P_i are patterns, then:
      P_1 Op P_2 yields Rep(P):
      { op, LINE, Op, Rep(P_1), Rep(P_2) }

    This is either an occurrence of ++ applied to a literal string or character list,
    or an occurrence of an expression that can be evaluated to a number at compile time.

  • In P, if Op is a unary operator and P_0 is a pattern, then:
      Op P_0 yields Rep(P):
      { op, LINE, Op, Rep(P_0) }

    This is an occurrence of an expression that can be evaluated to a number at compile time.

  • If P is a record pattern where Field_i are fields and P_i are patterns, then:
      #Name{Field_1=P_1, ..., Field_k=P_k} yields Rep(P):
      { record, LINE, Name,
        [ { record_field, LINE, Rep(Field_1), Rep(P_1) }, ...,
          { record_field, LINE, Rep(Field_k), Rep(P_k) } ] }

  • If P is a record index pattern where Name is a record name and Field is a field name, then:
      #Name.Field yields Rep(P):
      { record_index, LINE, Name, Rep(Field) }

  • If P is a parenthesized pattern with P_0 as its body, then:
      ( P_0 ) yields Rep(P):
      Rep(P_0)

    That is, parenthesized patterns cannot be distinguished from their bodies.
    Note that every pattern has the same source form as some expression
    and is represented the same way as the corresponding expression.
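
Since a pattern looks like an expression, a quick way to inspect one is to parse a match and look at its left-hand side. This is only a sketch; the module name abs_patterns and the helper rep/1 are hypothetical.

```erlang
-module(abs_patterns).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    {match, 1, Pattern, {var, 1, 'V'}} = rep("{ok, [H | _]} = V."),
    %% Tuple pattern holding an atom and a cons pattern whose
    %% tail is the universal pattern.
    {tuple, 1,
     [{atom, 1, ok},
      {cons, 1, {var, 1, 'H'}, {var, 1, '_'}}]} = Pattern,
    %% Parentheses leave no trace in the representation.
    {match, 1, {var, 1, 'X'}, {integer, 1, 1}} = rep("(X) = 1."),
    ok.
```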

6.4 Expressions

  • If a body B is a sequence of expressions E_i, then:
      E_1, ..., E_k yields Rep(B):
      [ Rep(E_1), ..., Rep(E_k) ]

An expression E is one of the following alternatives:

  • If E is an atomic literal L, then:
      Rep(E) = Rep(L)

  • In E, if E_0 is an expression and P is a pattern, then:
      P = E_0 yields Rep(E):
      { match, LINE, Rep(P), Rep(E_0) }

  • In E, if V is a variable and A is an atom with a printname
    consisting of the same characters as V, then:
      Rep(E) = { var, LINE, A }

  • If E is a tuple skeleton containing expressions E_i, then:
      { E_1, ..., E_k } yields Rep(E):
      { tuple, LINE,
        [ Rep(E_1), ..., Rep(E_k) ] }

  • If E is an empty list, then:
      [] yields Rep(E):
      { nil, LINE }

  • If E is a cons skeleton composed of expressions E_h (head) and E_t (tail), then:
      [ E_h | E_t ] yields Rep(E):
      { cons, LINE, Rep(E_h), Rep(E_t) }

  • If E is a binary constructor, then:
      <<V_1:Size_1/TSL_1, ..., V_k:Size_k/TSL_k>> yields Rep(E):
      { bin, LINE,
        [ { bin_element, LINE, Rep(V_1), Rep(Size_1), Rep(TSL_1) }, ...,
          { bin_element, LINE, Rep(V_k), Rep(Size_k), Rep(TSL_k) } ] }

    For Rep(TSL), see below. An omitted Size is represented by the atom default.
    An omitted TSL (type specifier list) is also represented by the atom default.

  • In E, if Op is a binary operator, then:
      E_1 Op E_2 yields Rep(E):
      { op, LINE, Op, Rep(E_1), Rep(E_2) }

  • In E, if Op is a unary operator, then:
      Op E_0 yields Rep(E):
      { op, LINE, Op, Rep(E_0) }

  • In E, if Name is a record name, E_i are expressions, and Field_i are field names, then:
      #Name{Field_1=E_1, ..., Field_k=E_k} yields Rep(E):
      { record, LINE, Name,
        [ { record_field, LINE, Rep(Field_1), Rep(E_1) }, ...,
          { record_field, LINE, Rep(Field_k), Rep(E_k) } ] }

  • In E, if E_0 is an expression which evaluates to a Name record,
    the other E_i are expressions, and Field_i are field names, then:
      E_0#Name{Field_1=E_1, ..., Field_k=E_k} yields Rep(E):
      { record, LINE, Rep(E_0), Name,
        [ { record_field, LINE, Rep(Field_1), Rep(E_1)}, ...,
          { record_field, LINE, Rep(Field_k), Rep(E_k) } ] }

  • In E, if Name is a record name and Field is a field name, then:
      #Name.Field yields Rep(E):
      { record_index, LINE, Name, Rep(Field) }

  • In E, if E_0 is an expression which evaluates to a Name record
    and Field is a field name, then:
      E_0#Name.Field yields Rep(E):
      { record_field, LINE, Rep(E_0), Name, Rep(Field) }

  • In E, if each W_i is a map assoc or exact field, then:
      #{ W_1, ..., W_k } yields Rep(E):
      { map, LINE,
        [ Rep(W_1), ..., Rep(W_k) ] }

    For Rep(W), see below.

  • In E, if each W_i is a map assoc or exact field, then:
      E_0#{ W_1, ..., W_k } yields Rep(E):
      { map, LINE, Rep(E_0),
        [ Rep(W_1), ..., Rep(W_k) ] }

    For Rep(W), see below.

  • If E is a catch expression where E_0 is an expression, then:
      catch E_0 yields Rep(E):
      { 'catch', LINE, Rep(E_0) }

  • In E, if E_0 is a function name and E_i are arguments, then:
      E_0( E_1, ..., E_k ) yields Rep(E):
      { call, LINE, Rep(E_0),
        [ Rep(E_1), ..., Rep(E_k) ] }

  • In E, if E_m is a module name, E_0 is a function name,
    and the remaining E_i are arguments, then:
      E_m:E_0( E_1, ..., E_k ) yields Rep(E):
      { call, LINE,
        { remote, LINE, Rep(E_m), Rep(E_0) },
        [ Rep(E_1), ..., Rep(E_k) ] }

  • If E is a list comprehension where E_0 is the expression that produces each element
    and each W_i is a generator or a filter, then:
      [ E_0 || W_1, ..., W_k ] yields Rep(E):
      { lc, LINE, Rep(E_0),
        [ Rep(W_1), ..., Rep(W_k) ] }

    For Rep(W), see below.

  • If E is a binary comprehension where E_0 is the expression that produces each segment
    and each W_i is a generator or a filter, then:
      <<E_0 || W_1, ..., W_k>> yields Rep(E):
      { bc, LINE, Rep(E_0),
        [ Rep(W_1), ..., Rep(W_k) ] }

    For Rep(W), see below.

  • In E, if B is a body, then:
      begin B end yields Rep(E):
      { block, LINE, Rep(B) }

  • In E, if each Ic_i is an if clause, then:
      if Ic_1 ; ... ; Ic_k end yields Rep(E):
      { 'if', LINE,
        [ Rep(Ic_1), ..., Rep(Ic_k) ] }

  • In E, if E_0 is an expression and each Cc_i is a case clause, then:
      case E_0 of Cc_1 ; ... ; Cc_k end yields Rep(E):
      { 'case', LINE, Rep(E_0),
        [ Rep(Cc_1), ..., Rep(Cc_k) ] }

  • In E, if B is a body and each Tc_i is a catch clause, then:
      try B catch Tc_1 ; ... ; Tc_k end yields Rep(E):
      { 'try', LINE, Rep(B), [],
        [ Rep(Tc_1), ..., Rep(Tc_k) ],
        [] }

  • In E, if B is a body, each Cc_i is a case clause, and each Tc_j is a catch clause, then:
      try B of Cc_1 ; ... ; Cc_k catch Tc_1 ; ... ; Tc_n end yields Rep(E):
      { 'try', LINE, Rep(B),
        [ Rep(Cc_1), ..., Rep(Cc_k) ],
        [ Rep(Tc_1), ..., Rep(Tc_n) ], [] }

  • In E, if B and A are bodies, then:
      try B after A end yields Rep(E):
      { 'try', LINE, Rep(B), [], [], Rep(A) }

  • In E, if B and A are bodies and each Cc_i is a case clause, then:
      try B of Cc_1 ; ... ; Cc_k after A end yields Rep(E):
      { 'try', LINE, Rep(B),
        [ Rep(Cc_1), ..., Rep(Cc_k) ],
        [], Rep(A) }

  • In E, if B and A are bodies and each Tc_i is a catch clause, then:
      try B catch Tc_1 ; ... ; Tc_k after A end yields Rep(E):
      { 'try', LINE, Rep(B), [],
        [ Rep(Tc_1), ..., Rep(Tc_k) ],
        Rep(A) }

  • In E, if B and A are bodies, each Cc_i is a case clause, and each Tc_j is a catch clause, then:
      try B of Cc_1 ; ... ; Cc_k catch Tc_1 ; ... ; Tc_n after A end yields Rep(E):
      { 'try', LINE, Rep(B),
        [ Rep(Cc_1), ..., Rep(Cc_k) ],
        [ Rep(Tc_1), ..., Rep(Tc_n) ],
        Rep(A) }

  • In E, if each Cc_i is a case clause, then:
      receive Cc_1 ; ... ; Cc_k end yields Rep(E):
      { 'receive', LINE,
        [ Rep(Cc_1), ..., Rep(Cc_k) ] }

  • In E, if each Cc_i is a case clause, E_0 is an expression, and B_t is a body, then:
      receive Cc_1 ; ... ; Cc_k after E_0 -> B_t end yields Rep(E):
      { 'receive', LINE,
        [ Rep(Cc_1), ..., Rep(Cc_k) ],
        Rep(E_0), Rep(B_t) }

  • In E, if Name is a function name and Arity is the arity, then:
      fun Name / Arity yields Rep(E):
      { 'fun', LINE,
        { function, Name, Arity } }

  • In E, if Module is a module name, Name is a function name, and Arity is the arity, then:
      fun Module:Name/Arity yields Rep(E):
      { 'fun', LINE,
        { function, Rep(Module), Rep(Name), Rep(Arity) } }

    Before the R15 release, this was:
      Rep(E) = {'fun', LINE,
        { function, Module, Name, Arity } }

  • In E, if each Fc_i is a function clause, then:
      fun Fc_1 ; ... ; Fc_k end yields Rep(E):
      { 'fun', LINE,
        { clauses,
          [ Rep(Fc_1), ..., Rep(Fc_k) ] } }

  • In E, if Name is a variable and each Fc_i is a function clause, then:
      fun Name Fc_1 ; ... ; Name Fc_k end yields Rep(E):
      { named_fun, LINE, Name,
        [ Rep(Fc_1), ..., Rep(Fc_k) ] }

  • If E is a query expression where each W_i is a generator or a filter, then:
      query [ E_0 || W_1, ..., W_k ] end yields Rep(E):
      { 'query', LINE,
        { lc, LINE, Rep(E_0),
          [ Rep(W_1), ..., Rep(W_k) ] } }

    For Rep(W), see below.

  • If E is a Mnesia record access inside a query where E_0 is an expression and Field is a field name, then:
      E_0.Field yields Rep(E):
      { record_field, LINE, Rep(E_0), Rep(Field) }

  • In E, if E_0 is an expression, then:
      ( E_0 ) yields Rep(E):
      Rep(E_0)

    That is, parenthesized expressions cannot be distinguished from their bodies.
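
As a sketch of working with expression trees, the code below parses a remote call and a case expression, then hands the call to erl_eval to show that the abstract format can be evaluated directly. The module name abs_exprs and the helper rep/1 are hypothetical.

```erlang
-module(abs_exprs).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    Call = rep("lists:reverse([1, 2])."),
    %% A remote call: the callee is a remote tuple, and the list
    %% argument is a chain of cons cells ending in nil.
    {call, 1,
     {remote, 1, {atom, 1, lists}, {atom, 1, reverse}},
     [{cons, 1, {integer, 1, 1},
       {cons, 1, {integer, 1, 2}, {nil, 1}}}]} = Call,
    %% The same tree can be evaluated without compiling anything.
    {value, [2, 1], _Bindings} =
        erl_eval:exprs([Call], erl_eval:new_bindings()),
    %% A case expression with two clauses.
    {'case', 1, {var, 1, 'X'},
     [{clause, 1, [{atom, 1, a}], [], [{integer, 1, 1}]},
      {clause, 1, [{var, 1, '_'}], [], [{integer, 1, 2}]}]} =
        rep("case X of a -> 1; _ -> 2 end."),
    ok.
```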

Generators and filters

When W is a generator or a filter (in the body of a list or binary comprehension), then:

  • If W is a generator, where P is a pattern and E is an expression, then:
      P <- E yields Rep(W):
      { generate, LINE, Rep(P), Rep(E) }

  • If W is a bitstring generator, where P is a pattern and E is an expression, then:
      P <= E yields Rep(W):
      { b_generate, LINE, Rep(P), Rep(E) }

  • If W is a filter E, which is an expression, then:
      Rep(W) = Rep(E)
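
The following sketch shows a list comprehension with one generator and one filter, plus a binary comprehension with a bitstring generator. The module name abs_comprehensions and the helper rep/1 are hypothetical; erl_eval is used only to confirm that the tree means what the source says.

```erlang
-module(abs_comprehensions).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    LC = rep("[X * 2 || X <- L, X > 1]."),
    %% The filter appears as a bare expression next to the generator.
    {lc, 1,
     {op, 1, '*', {var, 1, 'X'}, {integer, 1, 2}},
     [{generate, 1, {var, 1, 'X'}, {var, 1, 'L'}},
      {op, 1, '>', {var, 1, 'X'}, {integer, 1, 1}}]} = LC,
    %% Evaluate with L bound to [1, 2, 3].
    Bindings = erl_eval:add_binding('L', [1, 2, 3], erl_eval:new_bindings()),
    {value, [4, 6], _} = erl_eval:expr(LC, Bindings),
    %% A binary comprehension uses bc and b_generate.
    {bc, 1, _Template, [{b_generate, 1, _Pattern, {var, 1, 'B'}}]} =
        rep("<< <<X>> || <<X>> <= B >>."),
    ok.
```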

Binary element type specifiers

  • If a type specifier list TSL for a binary element is a sequence of type specifiers, then:
      TS_1 - ... - TS_k yields Rep(TSL):
      [ Rep(TS_1), ..., Rep(TS_k) ]

When TS is a type specifier for a binary element, then:

  • If TS is an atom A, then:
      A yields Rep(TS):
      A

  • If TS is a couple, where A is an atom and Value is an integer, then:
      A:Value yields Rep(TS):
      { A, Value }
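
Here is a sketch of how sizes and type specifier lists show up in a bin tuple, including the default placeholders for omitted parts. The module name abs_binaries and the helper rep/1 are hypothetical.

```erlang
-module(abs_binaries).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    %% Explicit size and a two-element type specifier list; the
    %% second element omits the size but gives a specifier.
    {bin, 1,
     [{bin_element, 1, {var, 1, 'X'}, {integer, 1, 8},
       [integer, {unit, 1}]},
      {bin_element, 1, {var, 1, 'Y'}, default, [binary]}]} =
        rep("<<X:8/integer-unit:1, Y/binary>>."),
    %% Both size and type specifier list omitted.
    {bin, 1, [{bin_element, 1, {var, 1, 'X'}, default, default}]} =
        rep("<<X>>."),
    ok.
```

Note that the type specifiers are plain atoms and tuples, not abstract-format nodes, so they carry no LINE.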

Map assoc and exact fields

When W is an assoc field or exact field (in the body of a map), then:

  • If W is an assoc field, where K and V are both expressions, then:
      K => V yields Rep(W):
      { map_field_assoc, LINE, Rep(K), Rep(V) }

  • If W is an exact field, where K and V are both expressions, then:
      K := V yields Rep(W):
      { map_field_exact, LINE, Rep(K), Rep(V) }
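
A minimal sketch of map creation and map update follows; the update form carries the representation of the map being updated as an extra element. The module name abs_maps and the helper rep/1 are hypothetical.

```erlang
-module(abs_maps).
-export([rep/1, demo/0]).

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    %% Map creation with an assoc field.
    {map, 1,
     [{map_field_assoc, 1, {atom, 1, a}, {integer, 1, 1}}]} =
        rep("#{a => 1}."),
    %% Map update with an exact field.
    {map, 1, {var, 1, 'M'},
     [{map_field_exact, 1, {atom, 1, a}, {integer, 1, 2}}]} =
        rep("M#{a := 2}."),
    ok.
```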

6.5 Clauses

There are function clauses, if clauses, case clauses, and catch clauses.

A clause C is one of the following alternatives:

  • If C is a function clause where Ps is a pattern sequence and B is a body, then:
      ( Ps ) -> B yields Rep(C):
      { clause, LINE, Rep(Ps), [], Rep(B) }

  • If C is a function clause where Ps is a pattern sequence,
    Gs is a guard sequence, and B is a body, then:
      ( Ps ) when Gs -> B yields Rep(C):
      { clause, LINE, Rep(Ps), Rep(Gs), Rep(B) }

  • If C is an if clause where Gs is a guard sequence and B is a body, then:
      Gs -> B yields Rep(C):
      { clause, LINE, [], Rep(Gs), Rep(B) }

  • If C is a case clause where P is a pattern and B is a body, then:
      P -> B yields Rep(C):
      { clause, LINE,
        [ Rep(P) ],
        [], Rep(B) }

  • If C is a case clause where P is a pattern, Gs is a guard sequence, and B is a body, then:
      P when Gs -> B yields Rep(C):
      { clause, LINE,
        [ Rep(P) ],
        Rep(Gs), Rep(B) }

  • If C is a catch clause where P is a pattern and B is a body, then:
      P -> B yields Rep(C):
      { clause, LINE,
        [ Rep( { throw, P, _ } ) ],
        [], Rep(B) }

  • If C is a catch clause where X is an atomic literal or a variable pattern,
    P is a pattern, and B is a body, then:
      X : P -> B yields Rep(C):
      { clause, LINE,
        [ Rep( { X, P, _ } ) ],
        [], Rep(B) }

  • If C is a catch clause where P is a pattern, Gs is a guard sequence, and B is a body, then:
      P when Gs -> B yields Rep(C):
      { clause, LINE,
        [ Rep( { throw, P, _ } ) ],
        Rep(Gs), Rep(B) }

  • If C is a catch clause where X is an atomic literal or a variable pattern,
    P is a pattern, Gs is a guard sequence, and B is a body, then:
      X : P when Gs -> B yields Rep(C):
      { clause, LINE,
        [ Rep( { X, P, _ } ) ],
        Rep(Gs), Rep(B) }
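
The sketch below compares a function clause and an if clause: both use the clause tag, but the if clause has an empty pattern list. The module name abs_clauses and the helpers form/1 and rep/1 are hypothetical.

```erlang
-module(abs_clauses).
-export([form/1, rep/1, demo/0]).

%% form/1 (hypothetical helper): scan and parse a single form.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

%% rep/1 (hypothetical helper): abstract format of one expression.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

demo() ->
    %% Function clause: patterns, guard sequence, body.
    {function, 1, f, 1,
     [{clause, 1,
       [{var, 1, 'X'}],
       [[{op, 1, '>', {var, 1, 'X'}, {integer, 1, 0}}]],
       [{var, 1, 'X'}]}]} = form("f(X) when X > 0 -> X."),
    %% If clauses: no patterns, only a guard sequence and a body.
    {'if', 1,
     [{clause, 1, [],
       [[{op, 1, '>', {var, 1, 'X'}, {integer, 1, 0}}]],
       [{atom, 1, pos}]},
      {clause, 1, [], [[{atom, 1, true}]], [{atom, 1, neg}]}]} =
        rep("if X > 0 -> pos; true -> neg end."),
    ok.
```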

6.6 Guards

  • If a guard sequence Gs is a sequence of guards G_i, then:
      G_1; ...; G_k yields Rep(Gs):
      [ Rep(G_1), ..., Rep(G_k) ]

  • If the guard sequence is empty, Rep(Gs) = [].

  • If a guard G is a nonempty sequence of guard tests, then:
      Gt_1, ..., Gt_k yields Rep(G):
      [ Rep(Gt_1), ..., Rep(Gt_k) ]

A guard test Gt is one of the following alternatives:

  • If Gt is an atomic literal L, then Rep(Gt) = Rep(L).

  • If Gt is a variable pattern V, where A is an atom with a printname
    consisting of the same characters as V, then:
      V yields Rep(Gt):
      { var, LINE, A }

  • If Gt is a tuple skeleton, then:
      { Gt_1, ..., Gt_k } yields Rep(Gt):
      { tuple, LINE,
        [ Rep(Gt_1), ..., Rep(Gt_k) ] }

  • If Gt is [], then:
      Rep(Gt) = { nil, LINE }

  • If Gt is a cons skeleton, then:
      [ Gt_h | Gt_t ] yields Rep(Gt):
      { cons, LINE, Rep(Gt_h), Rep(Gt_t) }

  • If Gt is a binary constructor, then:
      <<Gt_1:Size_1/TSL_1, ..., Gt_k:Size_k/TSL_k>> yields Rep(Gt):
      { bin, LINE,
         [ { bin_element, LINE,
             Rep(Gt_1), Rep(Size_1), Rep(TSL_1) }, ...,
           { bin_element, LINE,
             Rep(Gt_k), Rep(Size_k), Rep(TSL_k) } ] }

    For Rep(TSL), see above. An omitted Size is represented by the atom default.
    An omitted TSL (type specifier list) is also represented by the atom default.

  • If Op is a binary operator, then:
      Gt_1 Op Gt_2 yields Rep(Gt):
      { op, LINE, Op, Rep(Gt_1), Rep(Gt_2) }

  • If Op is a unary operator, then:
      Op Gt_0 yields Rep(Gt):
      { op, LINE, Op, Rep(Gt_0) }

  • In Gt, if Name is a record name, Field_i are field names, and Gt_i are guard tests, then:
      #Name{ Field_1=Gt_1, ..., Field_k=Gt_k } yields Rep(Gt):
      { record, LINE, Name,
        [ { record_field, LINE, Rep(Field_1), Rep(Gt_1) }, ...,
          { record_field, LINE, Rep(Field_k), Rep(Gt_k) } ] }

  • In Gt, if Name is a record name and Field is a field name, then:
      #Name.Field yields Rep(Gt):
      { record_index, LINE, Name, Rep(Field) }

  • In Gt, if Gt_0 is a guard test, Name is a record name, and Field is a field name, then:
      Gt_0#Name.Field yields Rep(Gt):
      { record_field, LINE, Rep(Gt_0), Name, Rep(Field) }

  • In Gt, if A is an atom and Gt_i are guard tests, then:
      A(Gt_1, ..., Gt_k) yields Rep(Gt):
      { call, LINE, Rep(A),
        [ Rep(Gt_1), ..., Rep(Gt_k) ] }

  • In Gt, if A_m is the atom erlang and A is an atom or an operator, then:
      A_m:A(Gt_1, ..., Gt_k) yields Rep(Gt):
      { call, LINE,
         { remote, LINE, Rep(A_m), Rep(A) },
         [ Rep(Gt_1), ..., Rep(Gt_k) ] }

  • In Gt, if A_m is the atom erlang and A is an atom or an operator, then:
      {A_m, A}(Gt_1, ..., Gt_k) yields Rep(Gt):
      { call, LINE, Rep({A_m,A}),
         [ Rep(Gt_1), ..., Rep(Gt_k) ] }

  • If Gt is ( Gt_0 ), then Rep(Gt) = Rep(Gt_0)

    That is, parenthesized guard tests cannot be distinguished from their bodies. Note that every guard test has the same source form as some expression, and is represented the same way as the corresponding expression.
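
To see how a guard sequence nests, consider a sketch with two guards separated by a semicolon, the first of which holds two guard tests separated by a comma. The result is a list of lists. The module name abs_guards and the helper guards/1 are hypothetical.

```erlang
-module(abs_guards).
-export([guards/1, demo/0]).

%% guards/1 (hypothetical helper): parse a one-clause function
%% declaration and return the guard sequence of its clause.
guards(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, {function, _, _, _, [{clause, _, _Ps, Gs, _B}]}} =
        erl_parse:parse_form(Tokens),
    Gs.

demo() ->
    [G1, G2] = guards("f(X) when is_integer(X), X > 0; X =:= a -> ok."),
    %% First guard: two guard tests.
    [{call, 1, {atom, 1, is_integer}, [{var, 1, 'X'}]},
     {op, 1, '>', {var, 1, 'X'}, {integer, 1, 0}}] = G1,
    %% Second guard: a single guard test.
    [{op, 1, '=:=', {var, 1, 'X'}, {atom, 1, a}}] = G2,
    %% No guard at all gives the empty guard sequence.
    [] = guards("f(X) -> X."),
    ok.
```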

6.7 The abstract format after preprocessing

After preprocessing has completed, the abstract format may be handled in assorted ways.

  • If the compilation option debug_info is given to the compiler, the abstract code
    will be stored in the abstract_code chunk in the BEAM file (for debugging purposes).

  • In OTP R9C and later, the abstract_code chunk will contain the tuple
    { raw_abstract_v1, AbstractCode }, where AbstractCode is the abstract code
    as described in this document.

  • In releases of OTP prior to R9C, the abstract code (after more processing) was stored
    in the BEAM file. The first element of the tuple would be either abstract_v1 (R7B)
    or abstract_v2 (R8B).
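
As a sketch of the round trip described above, the code below compiles a tiny module from forms with debug_info and reads the abstract code back with beam_lib. The module names abs_chunk and abs_chunk_m and the helper form/1 are hypothetical. I only check the shape of the result, since the exact forms that are stored may vary between releases.

```erlang
-module(abs_chunk).
-export([form/1, demo/0]).

%% form/1 (hypothetical helper): scan and parse a single form.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

demo() ->
    Forms = [form("-module(abs_chunk_m)."),
             form("-export([f/0])."),
             form("f() -> ok.")],
    %% Compile in memory; debug_info asks for the abstract code
    %% to be kept in the resulting BEAM binary.
    {ok, abs_chunk_m, Beam} = compile:forms(Forms, [debug_info]),
    %% beam_lib accepts the binary directly.
    {ok, {abs_chunk_m, [{abstract_code, {raw_abstract_v1, Abs}}]}} =
        beam_lib:chunks(Beam, [abstract_code]),
    true = is_list(Abs),
    ok.
```

The same beam_lib:chunks/2 call also accepts the path of a .beam file on disk.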

This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r22 - 04 Apr 2016, RichMorin