B6) Homomorphisms, Variables, and Context

When BP3 exceeds context-free grammars

Certain constructions in BP3 cannot be expressed in a classic context-free grammar. They introduce memory, context, and flow control.

Where does this article fit?

After covering probabilities (B1), vocabulary (B2), derivation rules (B3), and dynamic control (B4), this article explores the advanced constructions of BP3 (Bol Processor 3, cf. I2): variables, homomorphisms, wildcards, and context markers. These mechanisms make BP3 a much more powerful system than a simple context-free grammar—they bring it closer to context-sensitive grammars (L1), or even Turing-complete languages. The details of the translation into SuperCollider code are covered in B7.


Why is it important?

A context-free grammar (CFG) is an elegant but limited tool. It cannot express certain constraints that are nonetheless natural in music:

  • Repeating exactly the same motif: if a theme appears twice in a piece, a CFG cannot guarantee that both occurrences are identical. Each derivation is independent.
  • Transforming a motif according to an external rule: transposing a melody into minor, applying an inversion—these operations require a memory of what was played.
  • Reacting to context: playing a different motif depending on what precedes or follows it.

Let’s take an analogy with kathak, the classical dance of Northern India, which is intimately linked to the tabla and BP3. A CFG is like a kathak choreographer giving general instructions: “do a tatkar (rhythmic footwork), then a chakkar (pirouette).” But he cannot say “do exactly the same tatkar as before” or “if the tabla player played a dha (resonant bol), respond with a tin (dry bol).” For that, you need memory and context-sensitivity—exactly what the constructions we are about to explore provide. This is no coincidence: Bernard Bel and James Kippen developed BP3 precisely to model these interactions between tabla and dance in the Lucknow tradition [BelKippen1992a].

The idea in one sentence

BP3 extends CFGs with variables (exact repetition), homomorphisms (motif transformation), wildcards (pattern matching), and context markers—constructions that make BP3 a context-sensitive language.


Step-by-step explanation

1. Variables: the tihāī and exact repetition

The problem

In Indian music, the tihāī (a cadence where a motif is repeated exactly three times to land on sam, the first beat of the tāla cycle—the rhythmic resolution point toward which all improvisation converges) requires that the three repetitions be identical. If I write:

 

gram#1[1] S --> A - A - A

 

In RND mode (Random—random derivation, cf. B3), each A is derived independently. The first A could result in dha tirakita dhin and the second in tin ta ka. This is not a tihāī—it’s three different motifs, and the cadence effect is lost.

The BP3 solution

BP3 introduces the variable with the notation |x| (identifier between pipes):

 

gram#1[1] S --> |x| - |x| - |x|

 

This rule reads: “derive something, call it x, then play x two more times—exactly the same thing, separated by silences.” The variable captures the result of the first derivation and replicates it everywhere it appears. This is exactly the structure of a tihāī [BelKippen1992a].

Sidebar: Variable vs. non-terminal

Do not confuse |x| (variable) with a non-terminal. A non-terminal is a category that will be derived independently at each occurrence. A variable is a binding: the first occurrence sets the value, and subsequent ones reuse it.

| Construction | Behavior |
| --- | --- |
| Non-terminal A | Each occurrence derived independently |
| Variable \|x\| | First occurrence sets the value, subsequent ones replicate it |

Complete example

 

gram#1[1] S --> |x| - |x| - |x|
gram#1[2] |x| --> dha tirakita dhin dhin
gram#1[3] |x| --> tin ta dha ge na

 

Possible result: dha tirakita dhin dhin - dha tirakita dhin dhin - dha tirakita dhin dhin—a perfect tihāī, where the same pattern of bols is played three times to land on sam (the first beat of the cycle).
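The difference between independent derivation and a bound variable can be sketched in a few lines of Python. This is only an illustration, not BP3's engine: the function names (derive_motif, independent_phrase, bound_phrase, split_on_silence) are hypothetical, and the random choice merely stands in for RND-mode derivation.

```python
import random

# Two candidate motifs, taken from rules gram#1[2] and gram#1[3] above.
MOTIFS = [
    ["dha", "tirakita", "dhin", "dhin"],
    ["tin", "ta", "dha", "ge", "na"],
]

def derive_motif(rng):
    """Pick one motif at random (a stand-in for an RND-mode derivation)."""
    return list(rng.choice(MOTIFS))

def independent_phrase(rng):
    """S --> A - A - A : each A is derived on its own."""
    parts = [derive_motif(rng) for _ in range(3)]
    return parts[0] + ["-"] + parts[1] + ["-"] + parts[2]

def bound_phrase(rng):
    """S --> |x| - |x| - |x| : derive once, then reuse the binding."""
    x = derive_motif(rng)
    return x + ["-"] + x + ["-"] + x

def split_on_silence(phrase):
    """Split a flat token list into the segments between '-' tokens."""
    segments, current = [], []
    for token in phrase:
        if token == "-":
            segments.append(current)
            current = []
        else:
            current.append(token)
    segments.append(current)
    return segments

rng = random.Random(0)
segments = split_on_silence(bound_phrase(rng))
print(segments[0] == segments[1] == segments[2])  # True for every seed
```

With independent_phrase the three segments may or may not coincide, depending on the draws; with bound_phrase they are equal by construction, which is the property a tihāī needs.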

Why it is context-sensitive

The link to the Chomsky hierarchy (L1) is direct. The language $\{ ww \mid w \in \Sigma^* \}$—the set of all strings formed by a word $w$ followed by its exact copy, where $\Sigma^*$ (sigma star) denotes the set of all possible words over the alphabet $\Sigma$ (sigma)—is not context-free. This is a classic result: the pumping lemma (a proof tool showing that a language cannot be generated by a CFG) forbids it. BP3 variables allow for the expression of exactly this type of constraint, placing BP3 at the context-sensitive level (Type 1) of the hierarchy.
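To make the copy language concrete, here is a minimal membership check in Python. The function name is_copy_word is hypothetical. The check is trivial for a program that can compare the two halves directly; the point of the pumping-lemma result is that a context-free grammar has no such means of comparing a word with its copy.

```python
def is_copy_word(tokens):
    """Return True if tokens has the form w w for some sequence w."""
    half, remainder = divmod(len(tokens), 2)
    return remainder == 0 and tokens[:half] == tokens[half:]

print(is_copy_word("dha tin dha tin".split()))  # True: w = dha tin
print(is_copy_word("dha tin tin dha".split()))  # False: mirrored, not copied
```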


2. Homomorphisms: patterns and mapping tables

Correction (March 2026)

The previous version of this section described homomorphisms as “note-to-note transformations (transposition, inversion, retrograde)” with the invented example (: mineur) (: rétrograde). This is fundamentally incorrect. Retrograde is not a homomorphism (it reverses the order, violating $h(\alpha\beta) = h(\alpha)h(\beta)$), and the syntax (: rétrograde) does not exist in BP3. The correction below is based on the official documentation at bolprocessor.org and analysis of the BP3 C source code.

The idea

A homomorphism in mathematics is a function $h$ that preserves structure: $h(\alpha\beta) = h(\alpha)h(\beta)$. Each symbol is transformed independently, and the order is preserved. This is crucial: retrograde (reversing the order) is NOT a homomorphism.

In BP3, homomorphisms are 1:1 mapping tables: each input symbol yields exactly one output symbol. They are declared alongside the alphabet (in the -al.* file; historically in a dedicated -ho.* file) and used in master/slave patterns.

The master/slave mechanism

BP3 uses two markers for patterns:

| Marker | Syntax | Role |
| --- | --- | --- |
| Master | (= ...) | Defines the reference motif |
| Slave | (: ...) | Must reproduce the same structure as the master |

Example (bolprocessor.org):

 

gram#1[1] S --> (= X) X (: X)

 

This rule means: the third element, the slave (: X), must correspond to the first, the master (= X). If the master is derived into a b, the slave must also be a b. This is a constraint of structural identity.

Homomorphisms in the alphabet file (-al.)

Homomorphisms are defined within the alphabet file (-al.), not in a separate file. Historically, Bernard Bel used the prefix -ho. for a dedicated file, but today the -al. file contains both: the list of terminals AND the homomorphism mappings.

 

// File -al.abc: alphabet + homomorphism "*"
*
a --> a' --> a''
b --> b'
c

 

The homomorphism is named *. Each line is a chain of images: applying * once turns a into a' and a' into a''. As written, the chain is open, so a'' has no image and stays a''. If the chain is closed into a cycle (a --> a' --> a'' --> a), applying * three times returns to the starting symbol.

A terminal absent from the table remains unchanged: c has no mapping, so *(c) = c.
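The table lookup described above can be sketched in Python. This is an illustration under assumptions, not BP3's parser: the helper names (parse_chains, apply_once, apply_times) are hypothetical, and the line format is simply the one shown in the -al.abc listing.

```python
def parse_chains(lines):
    """Build a symbol -> image table from lines such as "a --> a' --> a''"."""
    image = {}
    for line in lines:
        symbols = [s.strip() for s in line.split("-->")]
        # Each symbol maps to its right-hand neighbour in the chain.
        for source, target in zip(symbols, symbols[1:]):
            image[source] = target
    return image

def apply_once(image, tokens):
    """Map every token to its image; tokens without a mapping are unchanged."""
    return [image.get(token, token) for token in tokens]

def apply_times(image, tokens, count):
    """Apply the homomorphism `count` times in a row (stacked * * ...)."""
    for _ in range(count):
        tokens = apply_once(image, tokens)
    return tokens

star = parse_chains(["a --> a' --> a''", "b --> b'", "c"])
print(" ".join(apply_once(star, ["a", "b", "c"])))   # a' b' c
print(" ".join(apply_times(star, ["a", "c"], 2)))    # a'' c
```

Closing the chain (a --> a' --> a'' --> a) turns the table into a cycle of length three, so apply_times with count 3 gives back the original tokens.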

How it works

The homomorphism * appears in the rules before the parenthesis it transforms:

 

S --> a * (= a') * * (= a) * (= a') * b (= a c)

 

Here:

  • a → direct terminal
  • * (= a') → applies * to the content of the master → a' becomes a''
  • * * (= a) → applies * twice → a → a' → a''
  • * (= a') → a' → a''
  • b → direct terminal
  • (= a c) → no homomorphism → a c remains a c… but the slave points to the masters, so a is resolved via pointers → a' and c remains c

Result: a a' a'' a'' b a' c

Homomorphisms stack: * * = apply twice. * * * = three times.

When are they applied?

A crucial point confirmed by Bernard Bel (March 2026): terminals are replaced by their images at the end of production, for display—by the SearchOrigin() function in DisplayArg.c. During derivation, slave parentheses contain pointers to the master, not transformed copies.

Homomorphisms do not influence the choice of rules during derivation. They do not modify the temporal structure. Two identical grammars—one with a homomorphism, the other without—produce the same temporal sequence. Only the names of the terminals change upon display.

Formally, a BP3 homomorphism is a morphism of free monoids: $h : \Sigma^* \to \Sigma^*$ such that $h(\alpha\beta) = h(\alpha)h(\beta)$. Each symbol is transformed independently via the p_Image[h][j] table, and the order is always preserved.
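The defining property can be checked numerically on small examples. The sketch below is a plain Python illustration: the IMAGE dictionary is a hypothetical stand-in for a mapping table (it is not the p_Image array itself), and retrograde is included only as a counter-example.

```python
# Hypothetical mapping table, mirroring the -al.abc example.
IMAGE = {"a": "a'", "a'": "a''", "b": "b'"}

def h(tokens):
    """Symbol-by-symbol mapping: each token is replaced independently."""
    return [IMAGE.get(t, t) for t in tokens]

def retrograde(tokens):
    """Order reversal, shown only as a counter-example."""
    return tokens[::-1]

def preserves_concatenation(f, alpha, beta):
    """Check f(alpha beta) == f(alpha) f(beta) for one pair of strings."""
    return f(alpha + beta) == f(alpha) + f(beta)

alpha, beta = ["a", "b"], ["c", "a'"]
print(preserves_concatenation(h, alpha, beta))           # True
print(preserves_concatenation(retrograde, alpha, beta))  # False
```

A single passing pair does not prove the property in general, but for h it holds for every pair because each token is mapped without looking at its neighbours; for retrograde one failing pair is enough to rule it out.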

Sidebar: Graha bheda—raga transposition as a homomorphism

In Indian music, graha bheda (graha = starting note, bheda = change) is a technique where correspondences between degrees are redefined. This is exactly what a BP3 homomorphism table does (today in the -al. file, historically in a -ho. file): it associates each note of the alphabet with another note. The melodic structure is preserved and only the pitches change, which is the very definition of a morphism [Bel1998].

Note: retrograde (playing the melody backward) is not a homomorphism—it reverses the order, which violates $h(\alpha\beta) = h(\alpha)h(\beta)$. BP3 handles it through other mechanisms (the _retro directive).


3. Wildcards: capturing and transforming without defining

Wildcards—notated as ? (anonymous), ?1, ?2 (named)—introduce pattern matching into grammar rules. They are analogous to capture groups in regular expressions (regex): on the left side of a rule, ?1 captures an arbitrary portion of the string; on the right side, ?1 reinjects it.

Wildcards are historically associated with ANAL mode (recognition). In PROD mode, they function via layered transformational grammars: a first sub-grammar generates a sequence with markers, and a second uses wildcards to transform that sequence. The wildcard operates on the output of a previous sub-grammar—it is an inter-layer rewriting, not direct generation.

In PROD mode: context-sensitive rewriting

Imagine a two-stage grammar. The first grammar (GRAM#1) generates a sequence with markers:

 

gram#1[1] S --> R1 dha tirakita dhin dhin R2

 

The second grammar (GRAM#2) uses wildcards to transform this sequence:

 

gram#2[1] R1 ?1 R2 --> ?1 ?1

 

This rule reads: “find the pattern R1 ?1 R2, capture everything between the two markers (here dha tirakita dhin dhin), and replace the whole thing with two copies of that captured content.” Result:

 

dha tirakita dhin dhin dha tirakita dhin dhin

 

The wildcard ?1 captured the content between the markers (left side) and then reinjected it twice (right side)—without ever knowing what that content was. This is context-sensitive rewriting: the rule depends on the context (the markers R1 and R2) and operates on content it does not define.
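The two-stage rewriting can be imitated on a token list in Python. This is a sketch under stated assumptions: the function name rewrite_between_markers is hypothetical, and the matching policy (first R1, then the nearest following R2) is an assumption made for the example, not a description of how BP3 resolves wildcards.

```python
def rewrite_between_markers(tokens, left="R1", right="R2"):
    """Apply  R1 ?1 R2 --> ?1 ?1  to the first matching span, if any."""
    if left not in tokens:
        return tokens
    start = tokens.index(left)
    try:
        end = tokens.index(right, start + 1)
    except ValueError:
        return tokens  # no closing marker: the rule does not apply
    captured = tokens[start + 1:end]  # what ?1 binds to
    # Markers are consumed; the captured content is emitted twice.
    return tokens[:start] + captured + captured + tokens[end + 1:]

stage1 = "R1 dha tirakita dhin dhin R2".split()
print(" ".join(rewrite_between_markers(stage1)))
# dha tirakita dhin dhin dha tirakita dhin dhin
```

The function never inspects the captured tokens, which mirrors the point made above: the rule operates on content it does not define.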

Sidebar: Manipulating objects without defining them

This is the fundamental insight of wildcards in PROD mode: they allow for the manipulation of structures without naming them. The rule R1 ?1 R2 --> ?1 ?1 does not know if ?1 contains tabla bols, sitar notes, or robot commands—it only knows that there is “something” between two boundaries, and it duplicates it.

We are touching here upon a primitive form of the notion of an object in programming: an entity manipulable through its interfaces (the markers) rather than its internal structure. This is also what makes wildcards so powerful—and so different from variables.

Wildcards vs. variables: two copy mechanisms

Both capture and replicate, but at different levels:

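| | Variable \|x\| | Wildcard ?1 |
| --- | --- | --- |
| Scope | Within a single rule | Across sub-grammars (inter-layer rewriting) |
| Captures what | The result of a derivation | An arbitrary portion of the string produced by a previous sub-grammar |
| When | During derivation | After derivation, when a later sub-grammar rewrites the string |
| Musical analogy | “Play this theme, then play it again identically” (tihāī) | “Take whatever lies between these two markers and restate it” |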

Variables create repetition (the tihāī). Wildcards create transformation (extracting a motif and recontextualizing it).

In ANAL mode: the membership test

In analytical mode, wildcards return to their classic role of pattern matching. The rule:

 

gram#1[1] ?1 - ?1 - ?1 --> #

 

means: “verify that the sequence to be analyzed is a tihāī—three repetitions of the same motif separated by silences.” If the sequence dha tirakita dhin - dha tirakita dhin - dha tirakita dhin is submitted as input, the wildcard ?1 captures dha tirakita dhin and verifies that the three occurrences are identical.
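The membership test can be sketched as a small Python function. The name match_tihai is hypothetical, and the sketch only covers this one pattern (?1 - ?1 - ?1); it is not a general wildcard matcher.

```python
def match_tihai(tokens, silence="-"):
    """Return the motif bound to ?1 if tokens read  ?1 - ?1 - ?1, else None."""
    # Three copies of a motif of length n plus two silences: 3n + 2 tokens.
    n, remainder = divmod(len(tokens) - 2, 3)
    if n < 1 or remainder != 0:
        return None
    motif = tokens[:n]
    expected = motif + [silence] + motif + [silence] + motif
    return motif if tokens == expected else None

phrase = "dha tirakita dhin - dha tirakita dhin - dha tirakita dhin".split()
print(match_tihai(phrase))                        # ['dha', 'tirakita', 'dhin']
print(match_tihai("dha - tin - dha".split()))     # None
```

Because the total length fixes the motif length, there is only one candidate binding for ?1 to try, which keeps the check linear in the length of the input.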

This is the dual of PROD mode: in production, the wildcard transforms; in analysis, it recognizes. Two sides of the same coin—a subject developed in B8.


4. Context markers: the grammar looks around

The problem with CFGs

In a CFG, when you replace $A$ with $x\,y\,z$, the decision never depends on what surrounds $A$. This is the very definition of “context-free.” But in music, context is fundamental: you don’t play the same thing after a crescendo as you do after a silence.

BP3 markers

BP3 introduces context markers that make rules sensitive to their environment. The most important is the distant marker:

 

gram#1[1] (|sam|) |Resolution| --> dha tirakita dhin dhin

 

This rule plays the motif dha tirakita dhin dhin only if the marker sam is present somewhere in the current string—exactly as a tabla player “aims” for sam to conclude an improvisation [BelKippen1992a].

A rule that says “replace $A$ with $xyz$ only if $B$ is present somewhere” cannot be expressed in a CFG. It is a context-sensitive rule (Type 1):

 

In CFG (Type 2):  A --> xyz         (always applicable)
In CSG (Type 1):  B A --> B xyz     (only if B precedes A)

 

BP3 generalizes this principle with its markers, allowing for conditions at a distance, not just on immediate neighbors.
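A condition at a distance can be illustrated with a short Python sketch. The function name apply_with_remote_context is hypothetical, and two behaviors are assumptions made for the example: the marker may sit anywhere in the string, and it is left in place after the rule fires.

```python
def apply_with_remote_context(tokens, target, replacement, required):
    """Rewrite the first `target` only if `required` occurs somewhere in tokens."""
    if required not in tokens or target not in tokens:
        return tokens  # condition not met: the rule is not applicable
    i = tokens.index(target)
    return tokens[:i] + replacement + tokens[i + 1:]

motif = "dha tirakita dhin dhin".split()

with_sam = ["Resolution", "x", "sam"]
print(" ".join(apply_with_remote_context(with_sam, "Resolution", motif, "sam")))
# dha tirakita dhin dhin x sam

without_sam = ["Resolution", "x"]
print(" ".join(apply_with_remote_context(without_sam, "Resolution", motif, "sam")))
# Resolution x
```

A context-free rule would correspond to dropping the `required` check entirely: the rewrite would then fire regardless of what else the string contains.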

Sidebar: The ālāp—the textbook case for context markers

The ālāp is the opening of a raga where notes are introduced one by one: first Sa alone, then Sa-Re, then Sa-Re-Ga, and so on. This structure of progressive accumulation is impossible to express properly with the 5 derivation modes (B3) or even flags (B4), which would require one rule per step (see the ālāp sidebar in B4).

Context markers offer an elegant solution:

gram#1[1]                    |Phrase| --> Sa |Ornement_Sa|
gram#1[2] (|Re_introduit|)  |Phrase| --> Re |Ornement_Re| |Phrase|
gram#1[3] (|Ga_introduit|)  |Phrase| --> Ga |Ornement_Ga| |Phrase|

Rule 2 is only applicable if the marker Re_introduit is present in the context—meaning if Re has already been “unlocked” in a previous phrase. The grammar progressively accumulates its vocabulary of notes, exactly like a musician exploring a raga.

This is a strong argument for why Bel added context-sensitive extensions to BP3: the combinatorial structures of tabla (kayda, tihāī) are well-expressed with modes and flags, but the accumulative structures of raga (ālāp, barhat) require context-sensitivity. The two traditions—tabla and raga—demand different extensions.


5. Flow control and empty productions

Two additional constructions complete the arsenal:

_goto(gram, rule)—the arbitrary jump. This instruction jumps to a specific rule in a specific grammar, like a goto in programming. It is a flow control mechanism that goes far beyond CFGs—it allows for arbitrary loops and branching.

_lambda()—the empty production. The non-terminal simply disappears without a trace. In formal theory, this is written as $A \to \varepsilon$ (epsilon, the empty string). This is useful for optional structures: “there might be an ornament here, or nothing at all.”
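The effect of an empty production on an optional structure can be sketched in Python. Everything here is hypothetical: the Ornament symbol, the ORNAMENT_CHOICES table, and the expand_ornaments function are invented for the illustration and say nothing about how BP3 implements _lambda.

```python
import random

# Hypothetical mini-grammar: Ornament --> tirakita | (nothing)
ORNAMENT_CHOICES = [["tirakita"], []]  # [] stands for the empty production

def expand_ornaments(tokens, rng):
    """Replace each Ornament by a short figure or by nothing at all."""
    out = []
    for token in tokens:
        if token == "Ornament":
            out.extend(rng.choice(ORNAMENT_CHOICES))
        else:
            out.append(token)
    return out

rng = random.Random(3)
print(" ".join(expand_ornaments(["dha", "Ornament", "dhin"], rng)))
```

Depending on the draw, the output is either dha tirakita dhin or dha dhin; in the second case the non-terminal has vanished without leaving a token behind.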

Sidebar: The price of power

The more expressive a language is, the harder it is to analyze automatically. CFGs (Type 2) can be parsed in polynomial time. For CSGs (Type 1), membership is still decidable, but the problem is PSPACE-complete, so no polynomial-time algorithm is known. BP3’s context-sensitive constructions provide remarkable compositional power, but at the cost of complexity: some (wildcards, context markers, goto) cannot be resolved by a static transpiler and require a full derivation engine (see B7).


Position in the Chomsky hierarchy

Let’s place these constructions within the hierarchy (L1):

| Level | Type | BP3 Constructions |
| --- | --- | --- |
| Type 3 | Regular | Simple sequences of notes |
| Type 2 | Context-free | Non-terminals, derivation rules, modes (B1-B3) |
| Type 2+ | Mildly context-sensitive | Bounded flags (B4), same zone as TAG, CCG |
| Type 1 | Context-sensitive | Variables, context markers, wildcards |
| Type 0 | Recursively enumerable | _goto (arbitrary jump), unbounded flags |

Articles B1-B3 cover the context-free core. B4 (flags) pushes toward mildly context-sensitive. This article shows how advanced constructions propel BP3 to the context-sensitive level, and even beyond. It is this adjustable gradation—from Type 2 to Type 0 depending on the activated mechanisms—that constitutes BP3’s formal originality.


Key takeaways

  • Variables (|x|) allow for the exact repetition of the same derived motif—the tihāī is the perfect application. This capability is impossible in pure CFGs.
  • Homomorphisms (a mapping table such as *, applied to (= ...) and (: ...) patterns) replace each terminal by its image, symbol by symbol and without changing the order; graha bheda (raga transposition) is the natural musical example.
  • Wildcards (?1, ?2) capture and reinject string fragments—in PROD mode to transform (context-sensitive rewriting), in ANAL mode to recognize (membership test). They allow for the manipulation of structures without defining them.
  • Context markers make rules sensitive to their environment—the ālāp (progressive introduction of notes) is the textbook case.
  • Flow control (_goto) and empty productions (_lambda) complete the arsenal beyond classic grammars.
  • BP3 is a parameterizable formalism: depending on the constructions used, it sits between Type 2 (CFG) and Type 0 (Turing-complete).

Further reading

  • Context-sensitive grammars: Hopcroft, Motwani & Ullman, Introduction to Automata Theory, Chapter 9
  • Formal homomorphisms: Rozenberg & Salomaa, Handbook of Formal Languages, Volume 1, chapter on morphisms
  • Bel, B. & Kippen, J. (1992). “Modelling Music with Grammars: Formal Language Representation in the Bol Processor”—the founding article, including variables and homomorphisms applied to tabla.
  • Bel, B. (1998). “Migrating Musical Concepts — An Overview of the Bol Processor”, Computer Music Journal 22(3)—how musical concepts “migrate” between traditions via homomorphisms.
  • BP3 Documentation: Bol Processor – Pattern Grammars
  • BP3 Variables and Context: Bol Processor – Context-sensitive grammars
  • Translation into SuperCollider: B7—how the transpiler handles (or does not handle) these advanced constructions.
  • The Three Directions of BP3: B8—wildcards in generative and analytical contexts, PROD/ANAL/TEMP modes, QAVAID.

Glossary

  • Ālāp: Opening of a raga where notes are introduced progressively. Its accumulative structure requires context markers—flags alone (B4) are not sufficient.
  • Bol: Mnemonic syllable of the tabla (e.g., dha, dhin, tin, ta). The “Bol” in “Bol Processor.”
  • Context-sensitive: A property of a grammar where the replacement of a symbol depends on its environment. Corresponds to Type 1 of the Chomsky hierarchy.
  • Graha bheda: Technique of raga transposition by shifting the fundamental note (Sa), causing a new raga to emerge. Formalizable as a BP3 homomorphism.
  • Homomorphism: A function that transforms each symbol of a string according to a fixed mapping, preserving structure (morphism of free monoids).
  • Kathak: Classical dance of Northern India, linked to the tabla. Tabla player-dancer interactions are a founding case study for BP3.
  • Lambda (empty production): A rule that rewrites a non-terminal as the empty string ($\varepsilon$). The symbol disappears without a trace.
  • Context marker: A BP3 construction that conditions the application of a rule on the presence of a symbol in the context.
  • Pattern matching: A mechanism for matching motifs. BP3 wildcards capture portions of derived strings for reuse.
  • Pumping lemma: A proof tool in language theory used to show that a language is not context-free.
  • Sam: The first beat of the tāla cycle—the resolution point toward which tihāīs converge.
  • Tihāī: An Indian cadence where a motif is repeated three times to land on sam. Formalizable in BP3 via variables: |x| - |x| - |x|.
  • Variable: The |x| construction where the first occurrence sets a value and subsequent ones replicate it. Allows for the expression of exact repetition constraints impossible in CFGs.
  • Wildcard: The ?N construction that captures an arbitrary portion of the string. In PROD mode: capture on the left side, reinjection on the right side (context-sensitive rewriting between grammars). In ANAL mode: pattern matching for membership testing. Works in both directions—see B8.

Prerequisites: L1, B2, B3
Reading time: 14 min
Tags: #homomorphisms #variables #context #BP3 #context-sensitive

