B3) Derivation Rules

How a Grammar Generates

A grammar is a set of rules. But how are these rules applied to produce concrete results?

Where This Article Fits In

After defining the B1 (probabilities) and the B2 (vocabulary) of BP3, this article details how the rules are applied. The five derivation modes (ORD, RND, LIN, SUB, SUB1) are formalized in SOS (Structural Operational Semantics) in article L6.


Why It’s Important

Having a grammar is not enough. You also need to know how to use it. A grammar defines the rules of the game, but not the strategy for playing.

Let’s take a culinary analogy. A recipe tells you that you can add salt “to taste” and that you can cook the dish “in the oven or in a pan.” But in what order do you do things? Do you salt before or after cooking? Do you choose the cooking method first?

Similarly, a grammar often offers several rules that can be applied simultaneously. The derivation strategy determines which one to apply first, and this strategy can radically change the system’s behavior. BP3 goes further: it offers five different derivation modes, each adapted to a generation style.

The Idea in One Sentence

A derivation is a sequence of production rule applications that transforms the starting symbol into a string of terminals; the strategy (order, choice) determines the result.


Let’s Explain Step by Step

Anatomy of a Production Rule

A production rule (or rewrite rule) has the form:

 

A → α

 

It reads: “The symbol A can be replaced by the sequence α (alpha — any string of symbols).”

Sidebar: The Production Rule as an Instruction

Think of a production rule as a substitution instruction in a word processor:

  • “Replace A with xyz”

Every time you see A in your working text, you CAN replace it with xyz. You don’t have to (there may be other rules for A), but it’s a valid option.

Components:

  • A: the left side, called LHS (Left-Hand Side). This is what is replaced. In a context-free grammar (cf. L2), it is always a SINGLE non-terminal.
  • α: the right side, called RHS (Right-Hand Side). This is what it is replaced by. It can be a sequence of terminals and/or non-terminals, or even the empty string (ε).
  • : the production arrow, which indicates the direction of replacement (from left to right)

Examples:

S → A B           (S becomes the sequence A then B)
A → a A           (A becomes 'a' followed by A - recursion)
A → a             (A simply becomes 'a')
B → b             (B becomes 'b')

 

Derivation: From S to w

A derivation is a sequence of intermediate forms, each obtained by applying a rule to the previous one. It is the “path” that leads from the starting symbol S to a final string w.

Notation:

Sidebar: Derivation Arrows

Symbol Name Meaning Example
=> simple derivation exactly ONE step S => A B
=>* star derivation ZERO or more steps S =>* abc (S can be abc, or reach it in several steps, or S = abc directly)
=>+ plus derivation ONE or more steps S =>+ abc (at least one rule has been applied)

>
The star (*) and plus (+) are conventions borrowed from regular expressions:

  • Star = “zero or more”
  • Plus = “one or more”

Complete Example:

Grammar:

S → A B
A → a A | a
B → b B | b

 

Derivation of aab:

S
=> A B        (S → A B)
=> a A B      (A → a A)
=> a a B      (A → a)
=> a a b      (B → b)

 

We write: S =>* aab

Example 1: Leftmost vs. Rightmost Derivation

When multiple non-terminals are present in the working string, which one should be replaced first? This is a question of derivation strategy.

Sidebar: Why Strategy Matters

For an unambiguous grammar (where each string admits only one derivation tree), the final RESULT is the same regardless of the strategy. But the PATH (the order of steps) differs. This matters for:

  • Parsers (syntax analyzers): an LL parser (Left-to-right, Leftmost derivation) reads from left to right and uses leftmost derivation; an LR parser (Left-to-right, Rightmost derivation) uses rightmost
  • Debugging: following a derivation step by step is easier with a consistent strategy
  • Ambiguous grammars: different strategies can lead to different interpretations

Leftmost (or leftmost derivation):
We ALWAYS replace the leftmost non-terminal first. It’s like reading a book: you process what comes first.

Rightmost (or rightmost derivation):
We ALWAYS replace the rightmost non-terminal first. It’s like undoing a stack: you process what arrived last.

Example Grammar:

S → A B
A → a
B → b

 

Leftmost derivation of ab:

S
=> A B        (S → A B, we have A and B, A is leftmost)
=> a B        (A → a, we replace A)
=> a b        (B → b, we replace B)

 

Rightmost derivation of ab:

S
=> A B        (S → A B, we have A and B, B is rightmost)
=> A b        (B → b, we replace B first)
=> a b        (A → a, we replace A)

 

Identical result, different path.

For unambiguous grammars, leftmost and rightmost derivations produce the same result. The difference becomes important for parsers and ambiguous grammars.

Example 2: When Order Changes the Result (Non-Deterministic Grammars)

With alternative rules, the choice of rule (not just the symbol) affects the result.

Grammar:

S → A B
A → a | aa
B → b | bb

 

Possible derivations of S:

S => A B => a B => a b        → "ab"
S => A B => a B => a bb       → "abb"
S => A B => aa B => aa b      → "aab"
S => A B => aa B => aa bb     → "aabb"

 

Four different results depending on the choices. This is the non-determinism of grammars.


The Five BP3 Modes: Derivation Strategies

BP3 does not settle for pure non-determinism. It offers five modes that define how to choose among alternative rules. These modes transform the grammar into different types of generators.

ORD Mode: Strict Sequential

Principle: Rules are applied in the order they are written in the grammar file. For each non-terminal, its first available rule is used, then the next one is used when the first has been exhausted.

BP3 Example — a tintāl theka:

The theka is the basic pattern of a tāla (Indian rhythmic cycle), played as an ostinato by the tabla. It is the perfect example of ORD mode: the theka is always the same, deterministic, played identically in each cycle [BelKippen1992a].

 

-mo.ORD
gram#1[1] |S| → |Vibhag1| |Vibhag2| |Vibhag3| |Vibhag4|
gram#1[2] |Vibhag1| → dha dhin dhin dha
gram#1[3] |Vibhag2| → dha dhin dhin dha
gram#1[4] |Vibhag3| → dha tin tin ta
gram#1[5] |Vibhag4| → ta dhin dhin dha

 

Step-by-step Derivation:

Step 1: |S|
         → |Vibhag1| |Vibhag2| |Vibhag3| |Vibhag4|      (rule 1)

Step 2: |Vibhag1| → dha dhin dhin dha                  (rule 2)
Step 3: |Vibhag2| → dha dhin dhin dha                  (rule 3)
Step 4: |Vibhag3| → dha tin tin ta                     (rule 4)
Step 5: |Vibhag4| → ta dhin dhin dha                   (rule 5)

 

Result: dha dhin dhin dha | dha dhin dhin dha | dha tin tin ta | ta dhin dhin dha

This is the theka of tintāl — the most common 16-beat cycle in Hindustani music. Always the same, 100% deterministic. Note the contrast between the thali sections (vibhāg 1-2, resonant bols dha/dhin) and the khali section (vibhāg 3, dry bols tin/ta).

Usage: Generating fixed sequences: thekas, scales, technical exercises, any pattern where repeatability is essential.

RND Mode: Weighted Random

Principle: Among all applicable rules for a non-terminal, one is chosen randomly according to the specified weights. The weight <N> indicates the relative importance of the rule.

BP3 Example — a tabla improvisation:

In performance, the tabla player improvises around the theka by choosing from a repertoire of motifs (bols). Resonant motifs (thali section) are more frequent than dry motifs (khali section) — exactly what weights allow to model [BelKippen1992a].

 

-mo.RND
gram#1[1] |S| → <3> |Motif| |S|
gram#1[2] |S| → <1> |Sam|
gram#1[3] |Motif| → <3> dha tirakita dhin
gram#1[4] |Motif| → <2> dha dhin dhin dha
gram#1[5] |Motif| → <1> tin - ta

 

Probability Calculation:

For |S|: total weight = 3 + 1 = 4
  - P(rule 1) = 3/4 = 75% (continue improvisation)
  - P(rule 2) = 1/4 = 25% (land on sam, the first beat)

For |Motif|: total weight = 3 + 2 + 1 = 6
  - P(rule 3) = 3/6 = 50% (ornate motif, most frequent)
  - P(rule 4) = 2/6 = 33% (classic theka motif)
  - P(rule 5) = 1/6 = 17% (dry motif, rare — khali section)

 

Three possible executions:

Execution 1: |S| → |Sam|                                    → "sam"
Execution 2: |S| → dha tirakita dhin |S| → dha tirakita dhin |Sam|
Execution 3: |S| → tin - ta |S| → tin - ta dha dhin dhin dha |S| → ...

 

Result: Variable with each execution. Resonant motifs (dha) dominate, dry motifs (tin) are rare — faithful to actual tabla practice.

Usage: Improvisation, variation, creative exploration. Ideal for modeling the controlled freedom of Indian improvisation.

LIN Mode: Cyclic Linear

Principle: Like ORD, rules are applied in order. But when the last rule is reached, we return to the first (cyclic behavior, called wrap-around).

BP3 Example — a cyclic tāla ostinato:

Indian music is fundamentally cyclic: the tāla (rhythmic cycle) loops, returning tirelessly to the sam (first beat). LIN mode is the natural mechanism to encode this circularity.

 

-mo.LIN
gram#1[1] |S| → |Bol| |Bol| |Bol| |Bol| |Bol| |Bol| |Bol| sam
gram#1[2] |Bol| → dha
gram#1[3] |Bol| → dhin
gram#1[4] |Bol| → ta

 

Step-by-step Derivation:

|S| → |Bol| |Bol| |Bol| |Bol| |Bol| |Bol| |Bol| sam   (rule 1)

Cyclic development of each |Bol|:
|Bol| #1 → dha   (rule 2)
|Bol| #2 → dhin  (rule 3)
|Bol| #3 → ta    (rule 4)
|Bol| #4 → dha   (return to rule 2 -- wrap-around!)
|Bol| #5 → dhin  (rule 3)
|Bol| #6 → ta    (rule 4)
|Bol| #7 → dha   (return to rule 2)

 

Result: dha dhin ta dha dhin ta dha sam

The cycle of three bols repeats indefinitely: dha, dhin, ta, dha, dhin, ta… like a tāla that turns until it returns to sam.

Usage: Perfect for ostinatos (motifs repeated in a loop), tāla cycles, circular rhythmic patterns. In Indian music, all tabla accompaniment relies on this cyclic principle [Bel2001].

SUB Mode: Global Substitution

Principle: When a non-terminal is developed, ALL its occurrences in the string are simultaneously replaced by the same development. This is a “global” substitution, like a “Replace All” in a text editor.

BP3 Example — a tihāī (Indian cadence):

The tihāī is a fundamental cadence in Indian music: a motif is repeated exactly three times to land on sam (the first beat of the cycle). The three repetitions must be identical — this is exactly what SUB mode guarantees [BelKippen1992a].

 

-mo.SUB
gram#1[1] |S| → |A| - |A| - |A|
gram#1[2] |A| → dha tirakita dhin dhin
gram#1[3] |A| → tin ta dha ghe

 

Step-by-step Derivation:

Step 1: |S|
         → |A| - |A| - |A|      (rule 1)

Step 2: We choose how to develop |A|
         Suppose we choose rule 2 (dha tirakita dhin dhin)
         ALL |A|s become dha tirakita dhin dhin SIMULTANEOUSLY:

         |A| - |A| - |A|
         → dha tirakita dhin dhin - dha tirakita dhin dhin - dha tirakita dhin dhin

 

Result: dha tirakita dhin dhin - dha tirakita dhin dhin - dha tirakita dhin dhin (the three |A|s are identical)

This is THE structure of the tihāī: the same motif, three times, separated by silences calibrated so that the third repetition falls exactly on sam. In RND mode, each |A| would be developed independently, which would destroy the structure of the tihāī — the three motifs could differ.

Musical Usage:

  • Tihāī: the quintessential Indian cadence — 3× the same motif to resolve on sam
  • Thematic coherence: if |A| represents a theme, all its appearances are identical
  • Strict canons: voices play exactly the same motif

SUB1 Mode: First Occurrence Substitution

Principle: At each step, only the first occurrence (the leftmost) of each non-terminal is replaced. Other occurrences of the same symbol await their turn.

BP3 Example — a kayda with progressive variations:

The kayda is a tabla composition based on a theme (mukhra) that returns several times with progressive variations. Unlike the tihāī (SUB mode, where repetitions are identical), the kayda develops each appearance of the theme independently — this is exactly what SUB1 does [BelKippen1992a].

 

-mo.SUB1
gram#1[1] |S| → |A| - |A| - |A|
gram#1[2] |A| → dha dhin dhin dha
gram#1[3] |A| → dha tirakita dhin dhin dha
gram#1[4] |A| → dha dhin tirakita dha tirakita dhin dha

 

Step-by-step Derivation:

Step 1: |S|
         → |A| - |A| - |A|          (rule 1)

Step 2: |A| - |A| - |A|
         We replace only the FIRST |A| (the leftmost)
         → dha dhin dhin dha - |A| - |A|     (first |A| → rule 2, simple theme)

Step 3: dha dhin dhin dha - |A| - |A|
         → dha dhin dhin dha - dha tirakita dhin dhin dha - |A|    (second |A| → rule 3, ornate)

Step 4: → dha dhin dhin dha - dha tirakita dhin dhin dha - dha dhin tirakita dha tirakita dhin dha
         (third |A| → rule 4, very ornate)

 

Result: Three variations of the theme, each more developed than the previous one.

Crucial Difference with SUB:
In SUB mode (tihāī), the three |A|s would have been identical. In SUB1 mode (kayda), each |A| is developed independently, allowing for progressive development — from simple to complex, as in actual kayda practice.

Musical Usage:

  • Kayda: theme and progressive variations of the tabla — the core of the Lucknow repertoire
  • Progressive development: each repetition of a theme can evolve
  • Asymmetrical structures: different antecedent and consequent

Sidebar: The Five Modes and Tabla Practice

Each BP3 derivation mode naturally corresponds to an aspect of tabla performance, which is no coincidence: BP was designed to model precisely these practices [BelKippen1992a]:

Mode Tabla Practice Description
ORD Theka (ostinato) The basic pattern of the tāla, always identical, played mechanically
RND Improvisation The tabla player chooses from a repertoire of motifs, some more frequent than others
LIN Tāla Cycle The rhythmic cycle turns indefinitely, returning to the starting point
SUB Tihāī (cadence) A motif repeated 3 times identically to resolve on sam
SUB1 Kayda (theme & variations) A theme that returns with progressive developments

>
This direct correspondence between computational formalism and Indian musical practice shows that BP3 modes are not arbitrary abstractions: they emerge from the observation of real music [Bel1998].

Summary Table of Modes

Mode Meaning Rule Selection Multiple Occurrences Determinism
ORD Ordered Sequential, in file order Independent Yes, 100%
RND Random Random according to weights Independent No
LIN Linear Cyclic (return to beginning) Independent Yes, cyclic
SUB Substitution Random or sequential All identical Variable
SUB1 Substitution 1 Random or sequential One at a time Variable

Sidebar: Choosing the Right Mode

  • Do you want a fixed and reproducible sequence? → ORD
  • Do you want controlled variety? → RND with adjusted weights
  • Do you want a looping pattern? → LIN
  • Do you want repetitions to be identical? → SUB
  • Do you want progressive development where each repetition can evolve? → SUB1

Visualizing a Derivation: The Syntax Tree

A derivation can be represented as a syntax tree (see also L4):

  • The root is the starting symbol
  • Each non-terminal node has the symbols of its development as children
  • The leaves are the terminals

Example:

Grammar:

S → A B
A → a a
B → b

 

Tree for aab:

       S
      / \
     A   B
    / \   |
   a   a  b

 

Reading the leaves from left to right: a a b

Parsers build these trees to analyze code or text.


Key Takeaways

  • A production rule A → α indicates that A can be replaced by α.
  • A derivation is a sequence of replacements from the starting symbol to the terminals.
  • Leftmost and rightmost are two strategies for choosing which non-terminal to replace first.
  • BP3 offers five modes that go far beyond:

ORD: fixed order of rules
RND: weighted random choice
LIN: cycle through rules
SUB: global substitution (coherence)
SUB1: one occurrence at a time substitution

  • The chosen mode radically transforms the generative behavior of the same grammar.
  • Direction: the five modes above are production (generation) modes. BP3 also has two other directions: ANAL mode (recognition — the grammar checks if a sequence belongs to the language) and TEMP mode (template — generation under constraints). See B8.

To Go Further

  • Reference Book: Aho, Lam, Sethi & Ullman, Compilers: Principles, Techniques, and Tools (the “Dragon Book”), chapter 4
  • BP3 Documentation: Bol Processor – Produce All Items
  • BP3 Modes: Bol Processor – Grammar Control
  • Visualization: Syntax Tree Generator
  • The Modes in their Original Context: Bel, B. & Kippen, J. (1992), “Modelling Music with Grammars: Formal Language Representation in the Bol Processor” — description of ORD, RND, SUB modes applied to tabla
  • Overview: Bel, B. (1998), “Migrating Musical Concepts — An Overview of the Bol Processor”, Computer Music Journal 22(2)
  • Beyond Production: B8 — BP3’s ANAL (recognition) and TEMP (template) modes, and QAVAID for grammar inference

Glossary

  • Syntax Tree: A tree-like representation of a derivation, with the starting symbol at the root and terminals at the leaves.
  • Kayda: A tabla composition with a fixed theme (mukhra) and systematic variations. Corresponds to BP3’s SUB1 mode.
  • Khali: The “non-resonant” section of a tāla cycle, where dry bols (tin, ta) dominate.
  • Derivation: A sequence of rule applications transforming the starting symbol into a string of terminals.
  • Leftmost (leftmost derivation): A strategy that always replaces the leftmost non-terminal.
  • LHS (Left-Hand Side): The left part of a production rule, i.e., the symbol being replaced.
  • LIN (BP3 mode): Linear mode, where rules are applied cyclically: after the last one, it returns to the first.
  • Derivation Mode: In BP3, a global strategy for choosing and applying rules (ORD, RND, LIN, SUB, SUB1).
  • ORD (BP3 mode): Ordered mode, where rules are applied in the sequential order of the file.
  • Ostinato: A short musical motif stubbornly repeated throughout a passage or piece.
  • Sam: The first beat of the tāla cycle — the point of resolution towards which tihāī and cadences converge.
  • Theka: The basic pattern of a tāla, played by the tabla as an ostinato. Corresponds to BP3’s ORD mode.
  • Tihāī: An Indian cadence where a motif is repeated exactly three times to land on sam. Corresponds to BP3’s SUB mode.
  • Tintāl: The most common tāla in Hindustani music: 16 beats in 4 vibhāg of 4 beats.
  • Vibhāg: A section of a tāla (e.g., the tintāl has 4 vibhāg).
  • RHS (Right-Hand Side): The right part of a production rule, i.e., what it is replaced by.
  • Rightmost (rightmost derivation): A strategy that always replaces the rightmost non-terminal.
  • RND (BP3 mode): Random mode, where rules are chosen randomly according to their weights.
  • Production Rule: An instruction of the form A → α indicating that A can be rewritten as α.
  • SUB (BP3 mode): Substitution mode, where all occurrences of the same non-terminal are replaced identically.
  • SUB1 (BP3 mode): Substitution 1 mode, where only the first occurrence of a non-terminal is replaced at each step.
  • Wrap-around: Cyclic behavior where, after reaching the end of a sequence, one returns to the beginning.

Prerequisites: B1 (PCFG), B2 (Alphabets), L6 (SOS)
Reading Time: 8 min
Tags: #derivation #grammars #BP3 #modes #generation #parsing