B3) Derivation Rules

How a grammar generates

A grammar is a set of rules. But how are these rules applied to produce concrete results?

Where does this article fit in?

After defining B1 (probabilities) and B2 (vocabulary) of BP3, this article details how rules are applied. The seven derivation modes (ORD, RND, LIN, SUB, SUB1, TEM, POSLONG) are formalized in SOS (Structural Operational Semantics) in article L6.

Why is it important?

Having a grammar is not enough. You still need to know how to use it. A grammar defines the rules of the game, but not the strategy for playing.

Take a culinary analogy. A recipe tells you that you can add salt “to taste” and that you can cook the dish “in the oven or in a pan.” But in what order do you do things? Do you salt before or after cooking? Do you choose the cooking method first?

Similarly, a grammar often offers several rules applicable simultaneously. The derivation strategy determines which one to apply first, and this strategy can radically change the system’s behavior. BP3 goes further: it offers seven different derivation modes, each suited to a generation style.

The idea in one sentence

A derivation is a sequence of production rule applications that transforms the start symbol into a string of terminals; the strategy (order, choice) determines the result.

Explaining step by step

Anatomy of a production rule

A production rule (or rewriting rule) has the form:

A → α

It reads: “The symbol A can be replaced by the sequence α (alpha — any string of symbols).”

Sidebar: The production rule as an instruction

Think of a production rule as a substitution instruction in a word processor:

“Replace A with xyz”

Every time you see A in your working text, you CAN replace it with xyz. You don’t have to (there may be other rules for A), but it is a valid option.

Components:

A: the left side, called LHS (Left-Hand Side). This is what is being replaced. In a context-free grammar (cf. L2), it is always A SINGLE non-terminal.
α: the right side, called RHS (Right-Hand Side). This is what it is replaced with. It can be a sequence of terminals and/or non-terminals, or even the empty string (ε).
→: the production arrow, which indicates the direction of replacement (from left to right)

Examples:

S → A B           (S becomes the sequence A then B)
A → a A           (A becomes 'a' followed by A - recursion)
A → a             (A becomes simply 'a')
B → b             (B becomes 'b')

Derivation: from S to w

A derivation is a sequence of intermediate forms, each obtained by applying a rule to the previous one. It is the “path” leading from the start symbol S to a final string w.

Notation:

Sidebar: Derivation arrows

Symbol Name Meaning Example

=> simple derivation exactly ONE step S => A B

=>* star derivation ZERO or more steps S =>* abc (S can be abc, or get there in several steps, or S = abc directly)

=>+ plus derivation ONE or more steps S =>+ abc (at least one rule has been applied)

>
The star (*) and plus (+) are conventions borrowed from regular expressions:

Star = “zero or more”

Plus = “one or more”

Symbol	Name	Meaning	Example
`=>`	simple derivation	exactly ONE step	S => A B
`=>*`	star derivation	ZERO or more steps	S =>* abc (S can be abc, or get there in several steps, or S = abc directly)
`=>+`	plus derivation	ONE or more steps	S =>+ abc (at least one rule has been applied)

Full example:

Grammar:

S → A B
A → a A | a
B → b B | b

Derivation of aab:

S
=> A B        (S → A B)
=> a A B      (A → a A)
=> a a B      (A → a)
=> a a b      (B → b)

We write: S =>* aab

Example 1: leftmost vs rightmost derivation

When several non-terminals are present in the work string, which one should be replaced first? This is a question of derivation strategy.

Sidebar: Why does strategy matter?

For an unambiguous grammar (where each string has only one derivation tree), the final RESULT is the same regardless of the strategy. But the PATH (the order of steps) differs. This matters for:

Parsers (syntactic analyzers): an LL parser (Left-to-right, Leftmost derivation) reads from left to right and uses leftmost derivation; an LR parser (Left-to-right, Rightmost derivation) uses rightmost

Debugging: following a derivation step by step is easier with a consistent strategy

Ambiguous grammars: different strategies can lead to different interpretations

Leftmost (leftmost derivation):
We ALWAYS replace the leftmost non-terminal first. It’s like reading a book: we process what comes first.

Rightmost (rightmost derivation):
We ALWAYS replace the rightmost non-terminal first. It’s like undoing a stack: we process what arrived last.

Example grammar:

S → A B
A → a
B → b

Leftmost derivation of ab:

S
=> A B        (S → A B, we have A and B, A is further left)
=> a B        (A → a, we replace A)
=> a b        (B → b, we replace B)

Rightmost derivation of ab:

S
=> A B        (S → A B, we have A and B, B is further right)
=> A b        (B → b, we replace B first)
=> a b        (A → a, we replace A)

Identical result, different path.

For unambiguous grammars, leftmost and rightmost produce the same result. The difference becomes important for parsers and ambiguous grammars.

Example 2: when order changes the result (non-deterministic grammars)

With alternative rules, the choice of the rule (not just the symbol) affects the result.

Grammar:

S → A B
A → a | aa
B → b | bb

Possible derivations of S:

S => A B => a B => a b        → "ab"
S => A B => a B => a bb       → "abb"
S => A B => aa B => aa b      → "aab"
S => A B => aa B => aa bb     → "aabb"

Four different results depending on choices. This is the non-determinism of grammars.

Sub-grammars and exhaustion

Before talking about modes, a fundamental principle: a BP3 grammar is not a monolithic block — it is divided into sub-grammars separated by lines of dashes (-----).

gram#1[1] S --> A B C
gram#1[2] A --> dha dhin

-----

gram#2[1] B --> tirakita
gram#2[2] C --> dha ge

The BP3 engine derives sub-grammars in order, with a principle of exhaustion: each gram#N block is derived until no more non-terminals from this block are rewritable in the work string (the string currently being derived). Only then does the engine move to the next block.

Concretely:

gram#1 applies first — it rewrites S into A B C, then A into dha dhin. As long as there is a non-terminal that gram#1 knows how to rewrite, we stay in gram#1.
When gram#1 is exhausted (nothing left to rewrite), we move to gram#2 which rewrites B and C.

This division is powerful: each sub-grammar has its own mode of derivation. You can have a gram#1 in ORD mode (deterministic) followed by a gram#2 in RND mode (random) — the first phase sets the structure, the second fills it with variations.

ORD                          // gram#1: deterministic structure
gram#1[1] S --> Phrase Phrase

-----

RND                          // gram#2: random content
gram#2[1] Phrase --> dha dhin tirakita
gram#2[2] Phrase --> ta tin tirakita
gram#2[3] Phrase --> dha ge tirakita

In BPscript, the separator is ----- and the mode is declared by @mode:X (S5):

@mode:ord
S -> Phrase Phrase

-----

@mode:random
Phrase -> dha dhin tirakita
Phrase -> ta tin tirakita
Phrase -> dha ge tirakita

The seven modes of BP3: derivation strategies

BP3 does not settle for pure non-determinism. It offers seven modes that define how to choose among alternative rules within a sub-grammar. These modes transform the grammar into different types of generators.

ORD Mode: strict sequential

Principle: Rules are applied in the order they are written in the grammar file. For each non-terminal, its first available rule is used, then we move to the next one when the first has been exhausted.

BP3 Example — a tintāl theka:

The theka is the basic pattern of a tāla (Indian rhythmic cycle), played as an ostinato by the tabla. It is the perfect example of ORD mode: the theka is always the same, deterministic, played identically in each cycle [BelKippen1992a].

ORD
gram#1[1] S -->Vibhag1 Vibhag2 Vibhag3 Vibhag4
gram#1[2] Vibhag1 -->dha dhin dhin dha
gram#1[3] Vibhag2 -->dha dhin dhin dha
gram#1[4] Vibhag3 -->dha tin tin ta
gram#1[5] Vibhag4 -->ta dhin dhin dha

Step-by-step derivation:

Step 1: S
         → Vibhag1 Vibhag2 Vibhag3 Vibhag4      (rule 1)

Step 2: Vibhag1 -->dha dhin dhin dha                  (rule 2)
Step 3: Vibhag2 -->dha dhin dhin dha                  (rule 3)
Step 4: Vibhag3 -->dha tin tin ta                     (rule 4)
Step 5: Vibhag4 -->ta dhin dhin dha                   (rule 5)

Result: dha dhin dhin dha | dha dhin dhin dha | dha tin tin ta | ta dhin dhin dha

This is the theka of tintāl — the most common 16-beat cycle in Hindustani music. Always the same, 100% deterministic. Note the contrast between the thali sections (vibhāg 1-2, resonant bols dha/dhin) and the khali section (vibhāg 3, dry bols tin/ta).

Usage: Generating fixed sequences: thekas, scales, technical exercises, any pattern where repeatability is essential.

RND Mode: weighted random

Principle: Among all applicable rules for a non-terminal, one is chosen randomly according to specified weights. The weight <N> indicates the relative importance of the rule.

BP3 Example:

RND
gram#1[1] S --> <3> A S
gram#1[2] S --> <1> x
gram#1[3] A --> <3> a b c
gram#1[4] A --> <2> a a b
gram#1[5] A --> <1> c - a

Probability calculation:

For S: total weight = 3 + 1 = 4
  - P(rule 1) = 3/4 = 75% (continue)
  - P(rule 2) = 1/4 = 25% (finish with x)

For A: total weight = 3 + 2 + 1 = 6
  - P(rule 3) = 3/6 = 50% (motif a b c — most frequent)
  - P(rule 4) = 2/6 = 33% (motif a a b)
  - P(rule 5) = 1/6 = 17% (motif c - a — rare)

Three possible executions:

Execution 1: S → x                              → "x"
Execution 2: S → a b c S → a b c x              → "a b c x"
Execution 3: S → c - a S → c - a a a b S → ...  → "c - a a a b ..."

Result: Variable at each execution. Heavier weighted patterns appear more often.

Usage: Improvisation, variation, creative exploration. Weights model the relative frequency of each alternative.

LIN Mode: random leftmost derivation

Correction (March 2026)

This article initially contained an erroneous description of LIN mode as “cyclic (wrap-around)”. This is false. The correction below is based on official BP3 documentation [Bel, BP2 Grammars] and confirmed by Bernard Bel.

Principle: Among candidate rules, one is chosen randomly that produces a leftmost derivation (leftmost derivation — we rewrite the leftmost non-terminal in the work string). In analysis mode, rightmost derivation is used (rightmost derivation, context-sensitive).

It is the random counterpart of ORD: where ORD takes the first candidate rule, LIN chooses one at random among those that apply to the leftmost position.

BP3 Example:

LIN
gram#1[1] S --> A B
gram#1[2] A --> a
gram#1[3] A --> c
gram#1[4] B --> b
gram#1[5] B --> d

Step-by-step derivation:

S → A B                  (rule 1 — only candidate for S)

A is the leftmost non-terminal.
Candidate rules for A: rule 2 (A → a) and rule 3 (A → c)
Random choice → let's say rule 3:

A B → c B                (rule 3 — chosen randomly)

B is now the leftmost non-terminal.
Candidate rules for B: rule 4 (B → b) and rule 5 (B → d)
Random choice → let's say rule 4:

c B → c b                (rule 4 — chosen randomly)

Possible result: c b (among 4 possible results: a b, a d, c b, c d)

Difference with RND: In RND mode, rules are chosen randomly without position constraints. In LIN mode, the random choice is constrained: only rules applicable to the leftmost position are considered. It is a more disciplined non-determinism — the derivation always progresses from left to right, but with variation at each position.

Usage: When controlled non-determinism is desired: the structure progresses from left to right (like reading), but the content varies with each execution.

SUB Mode: parallel derivation

Correction (March 2026)

Previous descriptions of SUB were incorrect. SUB is NOT “choose a rule and replace all occurrences.” It is simultaneous rewriting — all candidate rules apply at the same time across the entire work string. Definition and example from Bernard Bel, formal BP2 reference section 4: “Substitutions”.

Principle: SUB grammars perform simultaneous rewriting of symbols in the work string. All candidate rules apply at the same time — it is a parallel derivation.

BP3 Example (Bernard Bel):

SUB
gram#1[1] S --> A B B A B B A
gram#1[2] A B --> a B
gram#1[3] B A --> B b
gram#1[4] B B A --> B e A
gram#1[5] B B --> f B

Derivation trace (from BP3):

[Step #1] S --> A B B A B B A

All candidate rules are matched simultaneously:

[Step #2] A B --> a B           (match at position 1-2)
[Step #3] B B --> f B           (match at position 2-3)
[Step #4] B B A --> B e A       (match at position 5-6-7)

The same rules are re-applied to the remaining matches:

[Step #5] A B --> a B           (re-applied)
[Step #6] B B --> f B           (re-applied)
[Step #7] B B A --> B e A       (re-applied)
[Step #8] B A --> B b           (new match)

Result: a f e a f e b

The key insight: everything happens as if all candidate rules applied at the same time across the entire work string. Rules with multi-symbol LHS (A B, B B A) match at specific positions, and all matches are resolved simultaneously. This is fundamentally different from RND (one random rule) or ORD (rules in order).

Usage: Parallel derivation — p-substitutions, simultaneous rewriting of the L-systems type, generation of patterns where the entire string evolves in a single step.

SUB1 Mode: single pass substitution

Correction (March 2026)

The previous version described SUB1 as non-deterministic. This is false. SUB1 is deterministic: rules are tested in the order of the file, applied at the LEFT position (leftmost). Definition and example from Bernard Bel.

Principle: Like SUB, but performs only one pass of rewriting. Rules are tested in order, and the first one whose LHS matches at the leftmost position is applied. Only one replacement per step. Deterministic.

Warning: with a simple LHS (a single non-terminal like A), the first rule for A always wins. For example, this grammar:

SUB1
gram#1[1] S --> A - A - A
gram#1[2] A --> a b
gram#1[3] A --> c d e
gram#1[4] A --> f g

will always produce a b - a b - a b — because rule 2 (A --> a b) is the first and always matches.

SUB1 becomes interesting with multi-symbol LHS (context-sensitive rules):

BP3 Example (Bernard Bel):

SUB1
gram#1[1] S --> A A A A B B B B
gram#1[2] A B --> c d e
gram#1[3] A A --> a b
gram#1[4] B B --> f g

Derivation trace (from BP3):

[Step #1] <127:127> LEFT S --> A A A A B B B B
[Step #2] <127:127> LEFT A B --> c d e
[Step #3] <127:127> LEFT A A --> a b
[Step #4] <127:127> LEFT B B --> f g

Result: a b A c d e f g B

The remaining A and B no longer have an applicable rule — the derivation stops with residual non-terminals. The result is 100% deterministic — changing the rule order changes the result.

Difference with SUB: In SUB mode, all candidate rules apply simultaneously. In SUB1, only one replacement per step, leftmost, deterministic.

TEM Mode: structural templates

Principle: TEM (Templates) mode generates an exhaustive catalog of structural skeletons — all the forms the grammar can produce, without concrete terminals. BP3 exhaustively derives sub-grammars marked TEM and stores the results as reusable templates.

The result is stored in a TEMPLATES: section of the grammar file:

TEMPLATES:
[1] *1/1 __*1/2 _
[5] *1/1 (@0 _)(@1 )

Each template is a skeleton: positions (_, __) indicate where motifs are placed, groups (@N ...) reference sub-grammars whose result is inserted. Ratios (*1/1, *1/2) indicate speed ratios.

Usage: In ANAL mode (recognition, cf. B8), pre-calculated templates allow for quickly testing if a sequence belongs to the language — comparing against skeletons rather than re-deriving. If no template matches, the sequence is rejected without backtracking.

POSLONG Mode: position + maximum length

Principle: POSLONG (Position + Longest) is a variant of SUB1 with an additional disambiguation criterion. Like SUB1, it replaces only one occurrence at a time (the leftmost). But when several rules are candidates at the same position, POSLONG selects the one that produces the longest derivation.

The algorithm, visible in Compute.c:1375-1399, proceeds in three steps:

Find the minimum position (leftmost): among all candidate rules, those that match furthest to the left are retained.
Select the maximum length (longest): among candidates at the same position, the one that produces the longest derivation is retained.
Apply: the selected rule replaces the first occurrence.

It is the leftmost-longest match strategy — the same heuristic that POSIX regular expressions use for pattern matching.

Difference with SUB1:

Criterion	SUB1	POSLONG
Occurrences replaced	One (the leftmost)	One (the leftmost)
Choice among candidate rules	According to weights (RND) or order (ORD)	The longest derivation
Determinism	Variable (depending on choice mode)	Yes (the length criterion decides)

Restrictions: Same as SUB and SUB1 — no flags (/flags/), no _goto, no “Produce all items” (CompileGrammar.c:613-695, ProduceItems.c:757).

Usage: Useful when the grammar contains rules of different lengths and you systematically want the most developed motif — for example, for transliteration where the longest match is the most specific.

Sidebar: The seven modes and tabla practice

Each BP3 derivation mode naturally corresponds to an aspect of tabla performance, which is no coincidence: BP was designed to model exactly these practices [BelKippen1992a]:

Mode Tabla practice Description

ORD Theka (ostinato) The basic pattern of the tāla, always identical, played mechanically

RND Improvisation The tabla player chooses from a repertoire of motifs, some more frequent than others

LIN Positional variation Each position receives random content, but progression remains left→right

SUB Parallel derivation All candidate rules apply simultaneously to the work string

SUB1 Progressive development Each occurrence of a motif is developed independently

TEM Structural skeletons Exhaustive catalog of possible forms

POSLONG Transliteration Longest match at the leftmost position

>
The first five modes emerge directly from the observation of actual music [Bel1998]. TEM and POSLONG are extensions for specialized uses — structural exploration and deterministic pattern matching.

Mode	Tabla practice	Description
ORD	Theka (ostinato)	The basic pattern of the tāla, always identical, played mechanically
RND	Improvisation	The tabla player chooses from a repertoire of motifs, some more frequent than others
LIN	Positional variation	Each position receives random content, but progression remains left→right
SUB	Parallel derivation	All candidate rules apply simultaneously to the work string
SUB1	Progressive development	Each occurrence of a motif is developed independently
TEM	Structural skeletons	Exhaustive catalog of possible forms
POSLONG	Transliteration	Longest match at the leftmost position

Summary Table of Modes

Mode	Meaning	Rule Choice	Multiple Occurrences	Determinism
ORD	Ordered	Sequential, in file order	Independent	Yes, 100%
RND	Random	Random according to weights	Independent	No
LIN	Linear (leftmost)	Random among leftmost	Independent	Yes, leftmost
SUB	Substitution (parallel)	All rules simultaneously	Parallel (L-system)	No
SUB1	Substitution 1 (single-pass)	File order, leftmost	One at a time	Yes (deterministic)
TEM	Templates	Exhaustive (all derivations)	—	Yes (exhaustive)
POSLONG	Position + Longest	Leftmost + longest match	One at a time	Yes

Sidebar: Choosing the right mode

Do you want a fixed and reproducible sequence? → ORD

Do you want controlled variety? → RND with adjusted weights

Do you want non-determinism that progresses from left to right? → LIN

Do you want repetitions to be identical? → SUB

Do you want a progressive development where each repetition can evolve? → SUB1

Do you want to catalog all possible structural forms? → TEM

Do you want the longest match at each position? → POSLONG

Visualizing a derivation: the syntax tree

A derivation can be represented as a syntax tree (see also L4):

The root is the start symbol
Each non-terminal node has as children the symbols of its development
The leaves are the terminals

Example:

Grammar:

S → A B
A → a a
B → b

Tree for aab:

TikZ diagram

Figure 1 — Syntax tree for the derivation of aab (S \(\to\) A B, A \(\to\) a a, B \(\to\) b). Non-terminals are in blue, terminals in orange.

Reading the leaves from left to right: a a b

Parsers build these trees to analyze code or text.

Key takeaways

A production rule A → α indicates that A can be replaced by α.
A derivation is a sequence of replacements from the start symbol to the terminals.
Leftmost and rightmost are two strategies for choosing which non-terminal to replace first.
BP3 offers seven modes that go much further:

– ORD: fixed rule order
– RND: weighted random choice
– LIN: leftmost, random choice
– SUB: parallel derivation (all rules simultaneously)
– SUB1: substitution one occurrence at a time
– TEM: exhaustive derivation to generate a catalog of structural skeletons (TEMPLATES:)
– POSLONG: like SUB1, but selects the longest derivation (leftmost-longest match)

The chosen mode radically transforms the generative behavior of the same grammar.
Modes vs directions: the seven modes above define the derivation strategy of an individual sub-grammar. BP3 also has three directions that determine the overall use of the grammar: PROD (production), ANAL (recognition), and TEMP (templates). See B8.

To go further

Reference book: Aho, Lam, Sethi & Ullman, Compilers: Principles, Techniques, and Tools (the “Dragon Book”), chapter 4
BP3 Documentation: Bol Processor – Produce All Items
BP3 Modes: Bol Processor – Grammar Control
Visualization: Syntax Tree Generator
The modes in their original context: Bel, B. & Kippen, J. (1992), “Modelling Music with Grammars: Formal Language Representation in the Bol Processor” — description of ORD, RND, SUB modes applied to tabla
Overview: Bel, B. (1998), “Migrating Musical Concepts — An Overview of the Bol Processor”, Computer Music Journal 22(2)
Beyond production: B8 — BP3’s ANAL (recognition) and TEMP (exhaustive structural exploration) modes. QAVAID (grammar inference) was a separate Prolog II program, not integrated into BP3

Glossary

Syntax tree: Tree representation of a derivation, with the start symbol at the root and terminals at the leaves.
Kayda: Tabla composition with a fixed theme (mukhra) and systematic variations. Uses complex grammatical structures (sub-grammars, templates), not just SUB1 mode.
Khali: “Non-resonant” section of a tāla cycle, where dry bols (tin, ta) dominate.
Derivation: Sequence of rule applications transforming the start symbol into a string of terminals.
Leftmost (leftmost derivation): Strategy that always replaces the leftmost non-terminal.
LHS (Left-Hand Side): The left part of a production rule, i.e., the symbol being replaced.
LIN (BP3 mode): Linear mode — the candidate rule is chosen randomly among those that produce a leftmost derivation (the leftmost non-terminal is rewritten first).
Derivation mode: In BP3, sub-grammar strategy for choosing and applying rules (ORD, RND, LIN, SUB, SUB1, TEM, POSLONG).
ORD (BP3 mode): Ordered mode, where rules are applied in the sequential order of the file.
POSLONG (BP3 mode): Position + Longest mode, a variant of SUB1 that selects the longest derivation among candidate rules at the leftmost position. Leftmost-longest match strategy.
Ostinato: Short musical motif repeated persistently throughout a passage or piece.
Sam: First beat of the tāla cycle — the resolution point for Indian cadences.
Theka: Basic pattern of a tāla, played by the tabla as an ostinato. Corresponds to BP3’s ORD mode.
Tihāī: Indian cadence where a motif is repeated exactly three times to land on sam. Corresponds to BP3’s SUB mode.
Tintāl: The most common tāla in Hindustani music: 16 beats in 4 vibhāg of 4 beats.
Vibhāg: Section of a tāla (e.g., tintāl has 4 vibhāg).
RHS (Right-Hand Side): The right part of a production rule, i.e., what it is replaced with.
Rightmost (rightmost derivation): Strategy that always replaces the rightmost non-terminal.
RND (BP3 mode): Random mode, where rules are chosen randomly according to their weights.
Production rule: Instruction of the form A → α indicating that A can be rewritten as α.
SUB (BP3 mode): Substitution mode — parallel derivation where all candidate rules apply simultaneously to the work string, as in L-systems.
SUB1 (BP3 mode): Substitution 1 mode — a single pass of rewriting, leftmost, deterministic. Rules are tested in the order of the file.
TEM (BP3 mode): Templates mode, which exhaustively derives a sub-grammar to generate a catalog of structural skeletons (TEMPLATES: section).
Leftmost derivation: Derivation strategy that always rewrites the leftmost non-terminal in the work string. In BP3’s LIN mode, the choice among applicable rules at this position is random.

Prerequisites: B1 (PCFG), B2 (Alphabets), L6 (SOS)
Reading time: 8 min
Tags: #derivation #grammars #BP3 #modes #generation #parsing

← Back to index