L10) Attribute Grammars

When Trees Carry Properties

A syntax tree is not just a structure. It carries meaning: values ascend from the leaves, constraints descend from the root. Attribute grammars formalize these two flows of information — and reveal why the flags of I2 (Bol Processor 3, algorithmic composition software based on formal grammars) are more powerful than they seem.

Where does this article fit in?

In syntax trees (L4), we discovered the AST: a tree-like representation of source code that preserves logical structure by eliminating syntactic noise. But a bare AST is like a skeleton without muscles — it shows the form, not the function.

This article extends L4 by showing how to decorate a syntax tree with computed properties. The idea, due to Donald Knuth (1968), is both simple and profound: each node in the tree carries attributes whose values are determined by semantic rules attached to the grammar’s productions.

Links in the series:

L1 — Chomsky Hierarchy — the classificatory framework for grammars
L4 — Syntax Trees — the structure that attributes enrich
L8 — Axiomatic Semantics — another way to add meaning beyond syntax

Why is this important?

Every compiler you use relies, in one way or another, on attribute grammars. When the C compiler checks that you’re not adding an integer and a pointer, it propagates types bottom-up in the tree. When it allocates registers, it transmits constraints top-down. These two flows — ascending and descending — are exactly what attribute grammars formalize.

But there’s a second, less expected reason. In BP3, flags (those conditional counters that control the application of grammar rules) strangely resemble attributes. Understanding attribute grammars will give us the vocabulary to place flags within a spectrum of expressiveness, and understand why their global scope has profound theoretical consequences.

The idea in one sentence

An attribute grammar is a context-free grammar enriched with computable properties attached to the nodes of the syntax tree, some ascending from leaves to root (synthesized), others descending from root to leaves (inherited).

Let’s explain step by step

The tree that computes

Let’s take a simple arithmetic expression: 3 + 4 * 2. Its AST looks like this:

So far, it’s a silent tree. It shows the structure, not the result. But if we add a val attribute to each node, the tree starts to compute:

        (+)  val = 11
       /   \
val=3 3    (*)  val = 8
          /   \
   val=4 4     2  val=2

What happened? The leaves know their own value (it’s immediate: a literal is worth what it’s worth). Then, the val attribute ascends: the (*) node calculates 4 * 2 = 8 from its children, then the (+) node calculates 3 + 8 = 11. Information flows bottom-up, from leaves to root.

This upward flow has a name: it’s a synthesized attribute. The parent node synthesizes its value from those of its children, like a manager summarizing reports from their team.

Deciphering: The val attribute is synthesized because each node calculates it from its children, never from its parent. The rule for the production E -> E1 + E2 is: E.val = E1.val + E2.val. This is a local equation: only immediate neighbors are needed.

Synthesized attributes: bottom-up

Let’s formalize. A synthesized attribute is an attribute whose value, for a given node, is calculated exclusively from the attributes of its children in the tree.

Let’s take the example of type checking. Here is a simplified grammar with semantic rules:

Production               Semantic Rule
─────────────────────────────────────────────
E  ->  E1 + E2          E.type = if E1.type == E2.type
                                  then E1.type
                                  else error

E  ->  E1 * E2          E.type = if E1.type == E2.type
                                  then E1.type
                                  else error

E  ->  num               E.type = int
E  ->  real              E.type = float

Deciphering: Production: E -> E1 + E2. Semantic rule: E.type = .... The type of the parent node E is synthesized from the types of its children E1 and E2. If both children have the same type, the parent inherits that type. Otherwise, it’s an error. No need to look higher in the tree.

This is how type checking works in almost all compilers: types ascend from the leaves (literals, variables whose type is known) to the root (the complete expression), combining at each node according to the language’s compatibility rules.

Inherited attributes: top-down

But not everything ascends. Some information descends. Let’s take a classic example: variable declaration.

int x, y, z;

The type int is declared only once, at the beginning. It must then be transmitted to each variable in the list. The tree looks like this:

       (Decl)
       /    \
    int     (List)
           /   |   \
          x    y    z

The type int is not calculated by the leaves x, y, z — it is transmitted to them from above. This is an inherited attribute: the parent (or an older sibling) communicates information to its descendants.

Production                    Semantic Rule
──────────────────────────────────────────────────
Decl -> Type List             List.type_decl = Type.type
List -> List1 , id            List1.type_decl = List.type_decl
                              id.type_decl = List.type_decl
List -> id                    id.type_decl = List.type_decl

Deciphering: The type_decl attribute (the declared type) is inherited: it descends from Decl to List, then from List to each id. The leaves do not calculate it — they receive it. This is the opposite of a synthesized attribute.

The analogy is that of a family tree where family customs are passed from parent to child. The grandparent decides “we are integers”, and each descendant inherits this convention.

Important Convention: The start symbol (the root of the tree) has no inherited attributes. This is logical: it has no parent to transmit anything to it. Information entering the tree must start either from the leaves (synthesized) or from the root (but inherited attributes of the root, if any, would have to come from nowhere).

The dependency graph

In a decorated tree, some attributes depend on others. The dependency graph makes these relationships explicit. Each attribute is a node in the graph, and an edge goes from A to B if the value of B depends on that of A.

Let’s revisit our expression 3 + 4 * 2 with a synthesized attribute val and add an inherited attribute base (imagine a configurable numeral system):

            (+)
           /   \
          3    (*)
              /   \
             4     2

Dependencies (arrows = "provides to"):

  3.val ──────────> (+).val
  (*).val ────────> (+).val
  4.val ──────────> (*).val
  2.val ──────────> (*).val

  (+).base ───────> 3.base      (inherited, descends)
  (+).base ───────> (*).base    (inherited, descends)
  (*).base ───────> 4.base      (inherited, descends)
  (*).base ───────> 2.base      (inherited, descends)

The upward arrows are synthesized attributes (values ascend), the downward arrows are inherited attributes (the base descends). The whole forms a directed acyclic graph — at least, if all goes well.

Deciphering — Some graph theory concepts: This diagram is a directed graph — a set of vertices (here, the attributes) connected by arrows (the dependencies). When no path returns to its starting point, it’s called a DAG (Directed Acyclic Graph). A DAG has a valuable property: its vertices can always be ordered such that each vertex appears after all those it depends on. This ordering is called a topological sort (Kahn, 1962). This is exactly what the compiler does to evaluate attributes: a depth-first search (DFS) of the tree is enough to determine the calculation order. DAGs and topological sorts are fundamental tools in computer science — they are found in build systems (Make, Bazel), spreadsheets (cell recalculation order), and package managers (dependency installation order).

The problem of cycles. If a synthesized attribute of A depends on an inherited attribute of A which itself depends on a synthesized attribute of A, a cycle is formed: a circular dependency with no solution. The graph is no longer acyclic, and evaluation is impossible.

Detecting whether an attribute grammar can produce cycles (for some derivation trees, not all) is a very difficult problem. Jazayeri, Ogden, and Rounds (1975) showed that this problem is DEXPTIME-complete (DEXPTIME = Deterministic Exponential Time, the class of problems solvable in exponential time with respect to the input size — concretely, computation time can double with each symbol added to the grammar, making verification impractical for large grammars). In practice, we therefore restrict ourselves to classes of attribute grammars where cycles are impossible by construction.

Practical classes: S-attributed and L-attributed

To avoid cycle problems and enable efficient evaluation, compiler theory has identified two well-behaved classes.

S-attributed Grammars

A grammar is S-attributed (S for synthesized) if all its attributes are synthesized. No inherited attributes, ever. Information flows only bottom-up.

This is the simplest class. A single bottom-up traversal of the tree is sufficient to evaluate everything. This class is perfectly compatible with LR parsing (Left-to-right, Rightmost derivation — the parser reduces bottom-up, exactly the right time to calculate synthesized attributes).

L-attributed Grammars

A grammar is L-attributed (L for left-to-right) if each inherited attribute of a symbol in the right-hand side of a production depends only on:

the inherited attributes of the left-hand side (the parent), and
the attributes (synthesized or inherited) of symbols located to its left in the production.

In other words: a symbol can look at what comes before it (its older siblings), but never what comes after. Evaluation is done in a single left-to-right pass, compatible with LL parsing (Left-to-right, Leftmost derivation — predictive top-down parsing).

Criterion	S-attributed	L-attributed
Allowed Attributes	Synthesized only	Synthesized + inherited (under constraint)
Flow Direction	Bottom-up only	Bottom-up + left-to-right
Required Traversal	One bottom-up pass	One left-to-right, depth-first pass
Compatible with	LR parsing	LL parsing
Expressiveness	Limited	Richer
Possible Cycles	Never	Never
Typical Examples	Calculators, expression evaluation	Type checking, variable scope

The inclusion relationship is:

S-attributed  ⊂  L-attributed  ⊂  Well-defined attribute grammars (acyclic)

Any S-attributed grammar is also L-attributed (since there are no inherited attributes, the L constraint is trivially satisfied). But the converse is false: an L-attributed grammar with inherited attributes is not S-attributed.

The link with BP3: flags as “global attributes”

We saw in B4 that BP3 flags are named counters that control the application of grammar rules. A flag like /Ideas=20/ initializes, tests (/Ideas/), decrements (/Ideas-1/), and conditions derivation. This strongly resembles an inherited attribute that descends the tree to control choices. But there is a crucial difference.

Classic AGs vs. BP3 flags

Aspect	Attributes (Classic AGs)	Flags (BP3)
Scope	Local to the node and its neighbors	Global: visible to all rules
Flow	Structural (parent-child, sibling-sibling)	Non-structural: any rule can read/write
Evaluation	Determined by tree topology	Depends on derivation order (and randomness)
Determinism	Deterministic (same inputs = same result)	Non-deterministic (stochastic choices)
Decidability	Evaluation decidable (for S/L-attributed)	Depends on the number of flags (see below)

The taxonomy of expressiveness

Attribute grammars and BP3 flags fall within a spectrum of increasing expressiveness:

Synthesized attributes (S-attributed)
  ⊂ Local inherited attributes (L-attributed)
    ⊂ General inherited attributes (well-defined AGs)
      ⊂ GLOBAL inherited attributes with bounded number (bounded BP3 flags)
        ⊂ Unbounded global attributes (Turing-complete)

Why is this progression important? Because it corresponds to different levels of decidability:

S/L-attributed: single-pass evaluation, always terminating, fully decidable.
Well-defined AGs: multi-pass evaluation, terminating if acyclic, decidable.
Bounded BP3 flags: flags are integers with finite values. The state space is finite, so termination is decidable in principle (though potentially costly).
Unbounded global attributes: if arbitrarily large flag values were allowed (unbounded), one could simulate the counters of a Minsky machine (two counters are sufficient for Turing-completeness). Termination would become undecidable.

Concrete example in BP3

Consider this grammar where a flag controls a musical choice:

gram#1[1] /Style=1/  |S| --> |Intro| |Dev| |Coda|
gram#1[2]            |Dev| --> /Style/ |A| |Dev| /Style-1/
gram#1[3]            |Dev| --> |B|
gram#1[4]            |A| --> do4 ré4 mi4
gram#1[5]            |B| --> sol4 la4 si4

The Style flag is initialized to 1 by rule 1. Rule 2 tests /Style/ (is it non-zero?) and decrements it. This flag functions like a global inherited attribute: it is defined “at the top” (at initialization) and consulted “further down” (in the derivation rules of Dev). But unlike a classic inherited attribute, it does not flow along the edges of the tree — it is accessible everywhere, like a global variable.

It is this global scope that makes flags both powerful (they can coordinate distant parts of the grammar) and theoretically delicate (their interaction with stochastic choices makes behavior more difficult to analyze than that of a classic attribute grammar).

Applications: what compilers do

Attribute grammars are not a forgotten theoretical object. They are at the heart of modern compilers, even if the vocabulary has sometimes changed.

Type checking. This is the canonical application. Types are synthesized attributes that ascend from leaves (literals, variables) to the root (complete expression). At each operator node, the compiler checks the compatibility of its children’s types and synthesizes the resulting type. If int + float is allowed, the resulting type is float (coercion); if int + string is not, it’s a type error.

Code generation. The compiler transmits descending information (inherited attributes): in which register to store the result? What is the base address of the stack? These constraints descend from the root to the leaves, guiding the translation of each node into machine instructions.

Modern tools. The tradition of attribute grammars has given rise to specialized tools:

JastAdd (Lund, Sweden): a system of Reference Attribute Grammars (RAG) where attributes can reference other nodes in the tree. Used for the ExtendJ (Java) compiler.
Silver (Minnesota, USA): a compiler specification language based on attribute grammars, with support for composable language extensions.
ANTLR (ANother Tool for Language Recognition): although parser-oriented, ANTLR supports actions and attributes attached to grammar rules, in the spirit of AGs.

These tools show that Knuth’s formalism is far from obsolete: it has evolved, been enriched, but the fundamental principle — decorating trees with computed properties — remains the same.

Key takeaways

Attribute grammar = context-free grammar + computed properties on tree nodes. Each production is accompanied by semantic rules that define how attributes are evaluated.
Synthesized attribute: calculated bottom-up, from children. Example: the value of an expression, the type of an operand.
Inherited attribute: transmitted top-down (or left-to-right), from the parent or older siblings. Example: the declared type in a variable list.
The dependency graph links attributes together. If it is acyclic, evaluation is possible; if it contains a cycle, it is impossible.
S-attributed (all synthesized) and L-attributed (inherited limited to left siblings) are the two practical classes that guarantee the absence of cycles and allow single-pass evaluation.
BP3 flags are global inherited attributes: they do not flow along the tree but are accessible by any rule. Their bounded number guarantees a finite state space.
The spectrum of expressiveness ranges from S-attributed (simple, decidable) to unbounded global attributes (Turing-complete, undecidable), with BP3 flags in a controlled intermediate zone.

To go further

Knuth, D.E. (1968). “Semantics of Context-Free Languages”, Mathematical Systems Theory 2(2), pp. 127-145. — The foundational paper. Knuth introduces synthesized and inherited attributes.
Aho, A.V., Sethi, R. & Ullman, J.D. (1986). Compilers: Principles, Techniques, and Tools (the “Dragon Book”), chapters 5-6. — The reference treatment of attribute grammars in the context of compilation.
Paakki, J. (1995). “Attribute Grammar Paradigms — A High-Level Methodology in Language Implementation”, ACM Computing Surveys 27(2). — A comprehensive synthesis of AG paradigms and applications.
Jazayeri, M., Ogden, W.F. & Rounds, W.C. (1975). “The Intrinsically Exponential Complexity of the Circularity Problem for Attribute Grammars”, Communications of the ACM 18(12). — The proof of DEXPTIME-completeness for circularity detection.
Hedin, G. (2000). “Reference Attributed Grammars”, Informatica 24(3). — The modern extension of AGs used in JastAdd.
In this series: L4 for the basic structure, B4 for BP3 flags.

Glossary

Synthesized attribute: an attribute whose value is calculated from the attributes of child nodes. Information flows bottom-up in the tree. Example: the val value of an arithmetic expression node.
Inherited attribute: an attribute whose value is transmitted by the parent node or siblings located to the left. Information flows top-down or left-to-right. Example: the declared type type_decl in a list of variables.
Dependency graph: a directed graph whose nodes are attribute instances and whose edges represent dependencies between them. If it is acyclic, an evaluation order exists.
S-attributed: a class of attribute grammars containing only synthesized attributes. Bottom-up single-pass evaluation, compatible with LR parsing.
L-attributed: a class of attribute grammars where inherited attributes depend only on the parent and left siblings. Left-to-right single-pass evaluation, compatible with LL parsing.
OAG (Ordered Attribute Grammar): an intermediate class between L-attributed and general AG, where a fixed evaluation order can be determined statically for each production. Introduced by Kastens (1980).
Flag (BP3): a named counter with global scope that conditions and modifies the application of grammar rules. Combines the roles of guard (test) and effect (modification).
Multi-pass evaluation: an evaluation strategy where the tree is traversed multiple times, each pass calculating a subset of attributes. Necessary when dependencies do not allow a single pass.
Circularity: a situation where the dependency graph contains a cycle, making evaluation impossible. Detecting whether an attribute grammar can produce circular dependencies is DEXPTIME-complete.
Semantic rule: an equation or function that defines how to calculate the value of an attribute based on other attributes. Each grammar production is accompanied by its semantic rules.

Prerequisites: L4
Reading time: 15 min
Tags: #attribute-grammars #knuth #compilation #synthesized-attributes #inherited-attributes #bp3-flags

Previous article: L9
Next article: L11) Other Semantics (coming soon)
For BP3 flags in practice, see: B4