M4) Generative Grammars and Music
From Chomsky to GTTM
What if music, like language, obeyed hidden rules that our brain unconsciously applies? This idea revolutionized linguistics in 1957, then musicology in 1983.
Where Does This Article Fit?
This is where the two main series meet: the theory of formal languages (L1, L series) and musical representation (M3, M series). This article shows how Chomsky’s grammars were adapted to music, paving the way for GTTM (A Generative Theory of Tonal Music, Lerdahl and Jackendoff, 1983) and for the Bol Processor (see I2).
Why Is This Important?
When you listen to a melody, your brain does remarkable work: it groups notes into phrases, identifies tensions and resolutions, and anticipates what comes next. You do all of this without thinking, even without musical training. How is this possible?
The generative grammar hypothesis offers a fascinating answer: our brain may possess innate “rules” for structuring sounds, whether speech or music. This idea, born in linguistics with Noam Chomsky (American linguist, born in 1928), has profoundly influenced how we understand, analyze, and even generate music.
For anyone interested in formal musical notation (like BP3, the Bol Processor — see I2) or automatic analysis, understanding this intellectual lineage is essential. It explains why some formalisms work and others fail.
Quick Timeline
- 1957: Chomsky publishes Syntactic Structures, birth of generative grammars
- 1983: Lerdahl and Jackendoff publish GTTM (A Generative Theory of Tonal Music)
- 1990s: Bernard Bel develops BP3, applying these ideas to musical generation
The Idea in One Sentence
A generative grammar is a finite set of rules capable of producing an infinite set of valid structures, whether sentences or musical sequences.
Let’s Explain Step by Step
Example 1: French Sentences
Let’s start with language. In French, you can create an infinite number of correct sentences you’ve never heard before:
“Le chat violet de ma voisine mange des spaghettis sur le toit.” (“My neighbor’s purple cat is eating spaghetti on the roof.”)
This sentence is probably new to you, but you immediately know it’s grammatically correct. How? Because your brain applies implicit rules:
Sentence → Subject + Verb + Complement
Subject → Article + Noun + (Noun complement)
Noun complement → "de" + Subject
With these few rules, an infinite number of sentences can be generated. This is Chomsky’s fundamental intuition: a finite number of rules can generate infinite creativity.
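The infinity comes from recursion: the noun-complement rule refers back to the subject rule, so a subject can contain another subject. Here is a minimal Python sketch of that loop. The two-entry lexicon and the function name `subjects` are assumptions made for illustration, and the phrases are chosen so that "de" never needs to contract.

```python
# Hypothetical mini-lexicon: article + noun pairs chosen so that "de" never
# needs to contract (as it would in "de" + "le" -> "du").
NOUN_PHRASES = ["ma voisine", "mon oncle"]

def subjects(max_depth):
    """Return every subject with at most `max_depth` nested "de" complements."""
    if max_depth == 0:
        return list(NOUN_PHRASES)
    shallower = subjects(max_depth - 1)
    result = list(NOUN_PHRASES)            # subjects without any complement
    for head in NOUN_PHRASES:
        for complement in shallower:       # noun complement = "de" + subject
            result.append(f"{head} de {complement}")
    return result

for depth in range(4):
    print(depth, len(subjects(depth)))
```

Each extra level of nesting more than doubles the number of subjects, and nothing in the rules sets a maximum depth.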
Example 2: A Melody That “Works”
Now, mentally listen to these two sequences:
Sequence A: C – D – E – F – G – C (ascending, resolved)
Sequence B: C – F# – Bb – E – Db – A (random intervals)
Even without musical training, sequence A seems “logical” to you, while B seems chaotic. Why? Because A respects implicit rules of Western tonality:
- Mostly stepwise motion (conjunct degrees, that is, neighboring notes)
- Beginning and ending on the tonic (C)
- Coherent direction (ascent then resolution)
These rules are exactly what a musical grammar attempts to formalize.
The Chomsky Revolution (1957)
The Context
Before Chomsky, linguistics was dominated by behaviorism: it was thought that language was learned solely through imitation and reinforcement. A child hears sentences, repeats them, and their errors are corrected.
The Poverty of the Stimulus Problem
Chomsky argued that this explanation was insufficient. Consider:
- Children produce sentences they have never heard
- They make systematic errors (“je sontais” instead of “j’étais” – “I was-ed” instead of “I was”) that reveal the application of rules
- The amount of language heard is insufficient to explain the acquired mastery
His conclusion: the human brain possesses an innate “universal grammar”, a set of principles that constrain possible languages.
Rewrite Rules
Chomsky formalized grammars with production rules:
S → NP VP (a sentence = noun phrase + verb phrase)
NP → Det N (noun phrase = determiner + noun)
VP → V NP (verb phrase = verb + noun phrase)
Det → "the" | "a"
N → "cat" | "dog"
V → "eats" | "sees"
These rules generate sentences like “the cat eats a dog” or “a dog sees the cat”. Simple, yet powerful.
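To see what "generate" means in practice, the rules above can be written as a dictionary and expanded mechanically. This is a minimal Python sketch. The names `GRAMMAR` and `expand` are illustrative choices, not part of any linguistic toolkit.

```python
from itertools import product

# The toy grammar from the text: each non-terminal maps to its alternatives,
# and each alternative is a sequence of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["cat"], ["dog"]],
    "V": [["eats"], ["sees"]],
}

def expand(symbol):
    """Return every word sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:              # terminal: a plain word
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the options.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))
```

Because this toy grammar has no recursive rule, the enumeration is finite: 4 noun phrases × 2 verbs × 4 noun phrases = 32 sentences.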
Transfer to Music: What Works
Hierarchical Structure
The major discovery that transfers from linguistics to music is hierarchy. Just as a sentence breaks down into clauses, then phrases, then words, a musical piece breaks down into:
- Movements
- Sections
- Phrases
- Motifs
- Notes
This tree-like structure allows us to understand how music is perceived at multiple scales simultaneously.
Why a tree and not a simple list?
Imagine “Frère Jacques” as a flat list: C, D, E, C, C, D, E, C…
We lose the crucial information that “C D E C” forms a group (the “Frère Jacques” motif). With a tree:
                  Song
            /            \
       Phrase 1        Phrase 2
       /      \        /      \
    Motif    Motif   Motif   Motif
      |        |       |       |
   C D E C  C D E C  E F G   E F G
The tree captures inclusion relations: a motif belongs to a phrase, which belongs to the song. This is the structure our brain unconsciously builds.
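The contrast between the flat list and the tree can be made concrete with a small data structure. In this Python sketch a node is a (label, children) pair and a leaf is a note name. That representation and the helper names `leaves` and `motifs` are assumptions made for illustration.

```python
# A tree node is a (label, children) pair; a leaf is a plain note name.
SONG = ("Song", [
    ("Phrase 1", [
        ("Motif", ["C", "D", "E", "C"]),
        ("Motif", ["C", "D", "E", "C"]),
    ]),
    ("Phrase 2", [
        ("Motif", ["E", "F", "G"]),
        ("Motif", ["E", "F", "G"]),
    ]),
])

def leaves(node):
    """Flatten the tree back into the note list (the grouping is lost)."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    return [note for child in children for note in leaves(child)]

def motifs(node):
    """Collect the note groups found under every 'Motif' node."""
    if isinstance(node, str):
        return []
    label, children = node
    if label == "Motif":
        return [children]
    return [group for child in children for group in motifs(child)]

print(leaves(SONG))   # the flat list
print(motifs(SONG))   # the groups that the flat list cannot express
```

Going from the tree to the list is trivial. Going back is not, because the list no longer says where one motif ends and the next begins.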
Recursion
Recursion — the ability of a rule to apply to its own result — also exists in music:
In linguistics:
“The cat [that the dog [that Pierre saw] bit] is black.”
In music:
- A theme contains a motif
- A variation develops the theme (which contains the motif)
- A sonata develops the variation (which develops the theme, which contains the motif)
Production Rules
BP3 (Bol Processor 3), developed by Bernard Bel in the 1990s, explicitly uses production rules to generate music:
S → _tempo(80) A B A
A → C D C
C → c d e | e d c
D → f g a g f
B → D C D
How to read this grammar?
- S → _tempo(80) A B A: the start symbol S rewrites as a tempo of 80 BPM (beats per minute) followed by three sections A, B, A
- C → c d e | e d c: the | symbol indicates a choice (random or weighted) between two alternatives
- Uppercase letters (S, A, B, C, D) are non-terminals (intermediate symbols); lowercase letters (c, d, e…) are terminals (the notes played)
This grammar can generate several different melodies depending on the choices made at each | (or). This is exactly Chomsky’s principle applied to music.
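Here is a minimal Python sketch of how such a grammar can be expanded by making one choice at each alternative. It is not BP3's engine or syntax. The names `RULES` and `derive` are illustrative, the choice is uniform rather than weighted, and the `_tempo(80)` control is left out. Note names are written in lowercase so they cannot be confused with the non-terminals C and D.

```python
import random

# Toy grammar from the text; note names are lowercase here so they cannot be
# confused with the non-terminals C and D (a convention of this sketch).
RULES = {
    "S": [["A", "B", "A"]],
    "A": [["C", "D", "C"]],
    "B": [["D", "C", "D"]],
    "C": [["c", "d", "e"], ["e", "d", "c"]],
    "D": [["f", "g", "a", "g", "f"]],
}

def derive(symbol, rng):
    """Rewrite `symbol` until only terminals (notes) remain."""
    if symbol not in RULES:
        return [symbol]
    alternative = rng.choice(RULES[symbol])   # unweighted choice at each "|"
    notes = []
    for s in alternative:
        notes.extend(derive(s, rng))
    return notes

rng = random.Random(0)   # fixed seed so the run is reproducible
melody = derive("S", rng)
print(" ".join(melody))
```

With these rules each A contains two independent choices and B contains one, so the grammar yields 4 × 2 × 4 = 32 possible melodies, each 35 notes long.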
Transfer to Music: What Needs Adaptation
The differences between language and music are often presented as incompatibilities that would invalidate the grammatical approach in music. In reality, they are differences of degree — and recognizing them only helps to better calibrate the necessary formal tools.
Compositionality: Propositional vs. Perceptual
In linguistics, meaning composes: “big black cat” = meaning(big) + meaning(cat) + meaning(black). This is the principle of semantic compositionality (attributed to Frege, German logician, 1848-1925).
In music, composition also exists, but it operates on a different level:
- A cadence V → I (dominant → tonic) composes tension + resolution = rest
- A crescendo on a dissonance composes intensity + instability = anticipation
- A transposed motif composes the original motif + a shift = development
The difference: linguistic compositionality is propositional (it constructs statements about the world, which can be true or false). Musical compositionality is perceptual and affective (it constructs experiences of tension, movement, resolution, surprise). It’s a shift in register, not an absence.
Functional Categories: Fixed vs. Contextual
In linguistics, words are classified into functional categories at the same level of abstraction: noun, verb, adjective, adverb, preposition… In music, comparable functional categories exist at the same level (that of the note):
| Linguistic Categories (word level) | Musical Categories (note level) |
|---|---|
| Noun → can be subject | Tonic note → point of rest |
| Verb → carries the action | Passing note → movement between two poles |
| Adjective → qualifies | Appoggiatura (ornamental note that creates tension before resolving) → expressive ornament |
| Adverb → modifies | Neighbor tone (note that steps away from a main note and returns) → embellishes a main note |
The main difference: in linguistics, “cat” is a noun in (almost) all contexts. In music, a D is “tonic” in D major but a “passing note” in C major. Musical categories are more contextual.
But even in linguistics, context plays a role: “marche” (walk) is a noun (“la marche” – the walk) or a verb (“il marche” – he walks) depending on the sentence. “Run” in English is a noun or a verb. So categories are not absolutely fixed either — the difference is, again, one of degree.
Beware of the trap: functional categories (all at the same level) should not be confused with hierarchical levels (note → motif → phrase → section). This confusion would be equivalent, in linguistics, to comparing “noun, verb, adjective” with “word, phrase, clause, sentence” — these are two distinct axes of analysis.
And the Question of Musical Meaning?
The signifier-signified relationship in music — the connection between the musical sign and what it “designates” — is a vast subject that goes beyond the scope of this article. Let’s simply note that the musical sign IS arbitrary in the sense of Saussure (Swiss linguist, 1857-1913): the same pitch is called “do” in solfège, “C” in Anglo-Saxon notation, and “sa” in Indian music — just as “arbre”, “tree”, and “Baum” designate the same concept. The parallel with linguistics is stronger here than often believed, and an in-depth study of the informational content of musical structures would probably reveal more similarities than differences with language.
What These Differences Imply for Musical Grammars
These two shifts — perceptual compositionality and contextual categories — do not invalidate the grammatical approach. They explain why musical grammars need adaptations compared to linguistic grammars:
| Shift | Necessary Adaptation | Examples |
|---|---|---|
| Perceptual compositionality | Semantics of tension/resolution rather than truth | GTTM: prolongational reduction; BP3: weighted rules, i.e. probabilistic context-free grammars or PCFGs (B1) |
| Contextual categories | Preference rules rather than rigid categories | GTTM: GPR, MPR; BP3: contextual flags (B4) |
This is exactly what GTTM (1983) and BP3 did — and that’s why they work.
The Pioneers: From 1957 to 1983
Early Attempts (1960s-70s)
Before GTTM, several researchers attempted to apply Chomsky’s ideas to music. These attempts met with mixed success but paved the way.
Leonard Meyer (American musicologist, 1918-2007), in Emotion and Meaning in Music (1956, even before Chomsky!), had already proposed that our musical perception relies on expectations and their resolution. When a melody creates tension on the dominant (the 5th degree of the scale, for example G in C major), we “expect” a resolution on the tonic (the 1st degree, C). This intuition would be formalized later.
Sundberg and Lindblom (1976) created one of the first formal musical grammars to generate Swedish children’s songs. Their grammar looked like this:
Song → Verse Verse
Verse → Line Line
Line → Motif Motif | Motif Variant
Motif → Note Note Note Note
The result was functional but rigid: the generated melodies were grammatically correct but lacked “life”.
Heinrich Schenker, long before the computer era (1920s-30s), had developed a theory of musical reduction. For Schenker, any tonal piece could be reduced to a fundamental structure (Ursatz, literally “original structure” in German — the underlying harmonic and melodic skeleton). This idea of hierarchical level analysis would directly influence GTTM.
Why These Early Efforts Failed
Purely Chomskyan grammars applied to music had a problem: they did not capture gradation. In linguistics, a sentence is either grammatical or not — “The cat sleeps” is correct, “Cat the sleeps” is not.
In music, things are more nuanced. A melodic sequence can be:
- Perfectly idiomatic (very “musical”)
- Acceptable but unusual
- Technically possible but strange
- Impossible to play
This gradation required a new type of rules: preference rules, a major innovation of GTTM.
GTTM: The 1983 Synthesis
Lerdahl and Jackendoff
In 1983, Fred Lerdahl (American composer and theorist, born in 1943) and Ray Jackendoff (American linguist, born in 1945, former student of Chomsky) published A Generative Theory of Tonal Music, abbreviated GTTM. Their goal: to formalize the musical intuition of a competent listener of Western tonal music.
GTTM in one sentence
GTTM is a theory that describes how our brain automatically structures tonal music into groups, metrical levels, and tension/resolution relationships.
The Four Components
GTTM proposes that our musical perception constructs four parallel structures:
- Grouping structure: How notes group into motifs, phrases, sections
- Metrical structure: The alternation of strong and weak beats
- Time-span reduction: Which notes are more important than others
- Prolongational reduction: The relationships of tension and relaxation
Preference Rules
GTTM’s major innovation is the concept of preference rules. Unlike Chomsky’s strict rules (a sentence is grammatical or not), GTTM’s rules indicate tendencies:
- GPR 2a: A group tends to end when there is a silence
- GPR 3a: A group tends to end when there is a large interval
What do these acronyms mean?
- GPR = Grouping Preference Rule
- MPR = Metrical Preference Rule
- TSRPR = Time-Span Reduction Preference Rule
- PRPR = Prolongational Reduction Preference Rule
Each rule is numbered: GPR 2a is sub-rule “a” of the 2nd grouping rule.
These rules can conflict. The final interpretation results from their weighting, which explains why different listeners can perceive the same piece differently.
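One way to picture this weighting is to score every gap between two notes and pick the strongest candidate boundary. The Python sketch below does this for two cues, in the spirit of GPR 2a (rest) and GPR 3a (large interval). The melody, the numeric weights, and the leap threshold are all invented for this illustration and are not taken from GTTM.

```python
# Each note: (MIDI pitch, rest after the note in beats). Invented example data.
NOTES = [(60, 0), (62, 0), (64, 0), (60, 1), (67, 0), (69, 0), (67, 0), (60, 0)]

WEIGHTS = {"rest": 3, "leap": 2}   # hypothetical weights
LEAP_THRESHOLD = 5                 # semitones; hypothetical cut-off

def boundary_scores(notes):
    """Score the gap after each note except the last."""
    scores = []
    for (pitch, rest), (next_pitch, _) in zip(notes, notes[1:]):
        score = 0
        if rest > 0:                                    # cue in the spirit of GPR 2a
            score += WEIGHTS["rest"]
        if abs(next_pitch - pitch) >= LEAP_THRESHOLD:   # cue in the spirit of GPR 3a
            score += WEIGHTS["leap"]
        scores.append(score)
    return scores

scores = boundary_scores(NOTES)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, "strongest boundary after note index", best)
```

Changing the weights changes which boundary wins when the cues disagree, which is one way to think about two listeners grouping the same passage differently.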
A Concrete Example: “Frère Jacques”
Let’s apply GTTM to a melody everyone knows. Sing it mentally:
Frè - re  Jac - ques,   Frè - re  Jac - ques,
C     D   E     C       C     D   E     C
Dor - mez vous?   Dor - mez vous?
E     F   G       E     F   G
Grouping structure: Our perception naturally segments into 4 groups of 2 measures. Why?
- GPR 2a: implicit silences between phrases (we take a breath after “Jacques”)
- GPR 6 (parallelism): the repetition of the “Frère Jacques” motif leads us to hear each statement as its own, parallel group
Metrical structure: Strong beats on “Frè”, “Jac”, “Dor”, “vous”. The held note on “vous” begins on a strong beat, as MPR 5 (length) predicts: strong beats are preferred at the onset of relatively long notes.
Time-span reduction: The notes C-E-G form the harmonic framework (the C major chord). The other notes (D, F) are ornamental passing notes: if only C-E-G were kept, the melody would remain recognizable.
This formal analysis corresponds to our listener’s intuition — that is exactly GTTM’s objective.
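The reduction step can be imitated, very crudely, by keeping only the notes of the C major chord. This Python sketch is only an approximation: an actual time-span reduction also relies on the grouping and metrical structures, and the function name is an illustrative choice.

```python
# Opening of "Frère Jacques" as given in the text.
MELODY = ["C", "D", "E", "C", "C", "D", "E", "C", "E", "F", "G", "E", "F", "G"]
CHORD_TONES = {"C", "E", "G"}   # C major triad

def reduce_to_chord_tones(melody):
    """Keep only the notes that belong to the chord (a crude reduction)."""
    return [note for note in melody if note in CHORD_TONES]

skeleton = reduce_to_chord_tones(MELODY)
print(" ".join(skeleton))
```

The 14 notes shrink to 10, and the passing notes D and F disappear while the rising C-E-G outline stays in place.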
Another example: “Au clair de la lune”
Au  clair de  la  lu - ne,   mon a - mi  Pier - rot
C   C     C   D   E    D     C   E   D   D      C
Grouping: two groups (pause after “lune” = GPR 2a)
Reduction: C-E-C = framework (broken C major chord)
BP3: Heir to This Tradition
Bernard Bel developed BP3 (Bol Processor 3) to represent Indian musical structures, particularly tabla compositions. His contribution was to create a formalism that:
- Uses generative grammars to describe rhythmic patterns
- Manages polymetry (superposition of multiple simultaneous meters, see M5 (upcoming))
- Allows contextual rules (similar to Chomsky’s context-sensitive grammars — see L1)
Example of a BP3 grammar for a tabla pattern:
S → Theka Theka Tihai
Theka → dha dhin dhin dha | dha dhin dhin dha dha tin tin ta
Tihai → X X X
X → dha ti dha ge na dha ti dha ge na dha
This grammar captures the hierarchical structure of the repertoire: a tihai (final cadence) repeats a motif three times, and that motif can itself be broken down into sub-patterns.
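The small size of this grammar makes it easy to list everything it produces. The Python sketch below spells out the four possible derivations directly instead of using a general grammar engine. It is written in Python, not in BP3 syntax, and the variable names are illustrative.

```python
from itertools import product

# Theka → two alternatives (4 or 8 strokes)
THEKA = [
    "dha dhin dhin dha".split(),
    "dha dhin dhin dha dha tin tin ta".split(),
]
# X → one 11-stroke motif; Tihai → X X X
X = "dha ti dha ge na dha ti dha ge na dha".split()
TIHAI = X * 3

# S → Theka Theka Tihai, with each Theka chosen independently.
PATTERNS = [first + second + TIHAI for first, second in product(THEKA, repeat=2)]

for pattern in PATTERNS:
    print(len(pattern), " ".join(pattern))
```

Whatever theka alternatives are picked, the last 33 strokes are always the same motif stated three times, which is the defining shape of the tihai.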
Why BP3 Goes Further Than GTTM
| Aspect | GTTM | BP3 |
|---|---|---|
| Direction | Analysis (understanding a piece) | Generation (creating a piece) |
| Tradition | Western tonal music | Any music (Indian, African…) |
| Rules | Preference (tendencies) | Production + weighting |
| Tempo | Single tempo | Native polymetry (see M5 (coming soon)) |
| Output | Structural trees | Playable music (MIDI — Musical Instrument Digital Interface, see M1 — sound) |
BP3 does not seek to model human perception (like GTTM), but to provide a flexible composition tool. This difference in objective explains different design choices.
Summary Timeline
To place the ideas in time:
| Year | Event | Contribution |
|---|---|---|
| 1920s-30s | Schenker, Free Composition (published posthumously) | Analysis by hierarchical reduction |
| 1956 | Meyer, Emotion and Meaning in Music | Expectation and resolution |
| 1957 | Chomsky, Syntactic Structures | Generative grammars |
| 1976 | Sundberg & Lindblom | First formal musical grammar |
| 1983 | Lerdahl & Jackendoff, GTTM | Preference rules |
| 1990s | Bel, Bol Processor 3 | Grammars for generation |
| 2000s | Computational implementations | Automated GTTM |
This intellectual lineage shows how an idea born in linguistics was progressively adapted, criticized, and enriched to apply to music.
Key Takeaways
- Chomsky (1957) showed that language obeys finite generative rules producing infinite creativity.
- What transfers to music: hierarchy (notes → motifs → phrases → sections), recursion, production rules.
- What needs adaptation: compositionality is perceptual (tension/resolution) rather than propositional (true/false), functional categories are contextual rather than fixed. These differences are a matter of degree, not nature — and the signifier-signified parallel between language and music is stronger than often believed.
- GTTM (1983) adapts the generative approach to music with “preference rules” that model our listener intuitions.
- BP3 applies these principles to concrete musical generation, particularly for Indian music.
Further Reading
- Chomsky, N. (1957). Syntactic Structures. Mouton. — The foundational work.
- Lerdahl, F. & Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press. — The reference in musical grammar.
- Bel, B. (1998). “Migrating Musical Concepts: An Overview of the Bol Processor”. Computer Music Journal, 22(2). — On BP3.
- Patel, A. D. (2008). Music, Language, and the Brain. Oxford University Press. — The neuroscience of the music/language parallel.
Glossary
- Behaviorism: A school of psychology (early 20th century) that explains learning solely through stimulus-response, without reference to innate mental structures.
- BP3 (Bol Processor 3): Musical grammar software developed by Bernard Bel (1990s), designed for Indian and African music.
- Chomsky, Noam: American linguist (born in 1928), founder of the theory of generative grammars.
- Neighbor tone: An ornamental note adjacent to a main note, which moves away by one step and then returns (e.g., C-D-C).
- Cadence: A sequence of chords that punctuates the end of a musical phrase (e.g., V → I = dominant → tonic).
- Semantic compositionality: The principle (attributed to Frege) that the meaning of an expression is constructed from the meaning of its parts (“big cat” = meaning(big) + meaning(cat)).
- Dominant: The 5th degree of the scale (e.g., G in C major), which creates tension calling for resolution to the tonic.
- Frege, Gottlob: German logician (1848-1925), founder of modern logic, to whom the principle of compositionality is attributed.
- GPR (Grouping Preference Rule): In GTTM, a preference rule for grouping notes into phrases.
- Generative grammar: A set of formal rules capable of producing all valid structures of a language.
- Universal grammar: Chomsky’s hypothesis that all humans are born with innate linguistic principles.
- GTTM (A Generative Theory of Tonal Music): Lerdahl and Jackendoff’s theory (1983) formalizing the perception of tonal music.
- Hierarchy: Organization in nested levels (tree), where each element belongs to a higher-level element.
- MPR (Metrical Preference Rule): In GTTM, a preference rule for metrical structure (strong/weak beats).
- Passing note: A note that connects two structural notes by conjunct motion (e.g., the D between C and E).
- Recursion: The property of a rule that can apply to its own result, allowing for nested structures.
- Production rule: A rule of the form $A \to B$ indicating how to rewrite a symbol.
- Preference rule: In GTTM, a rule that indicates a tendency rather than a strict obligation.
- Saussure, Ferdinand de: Swiss linguist (1857-1913), founder of structural linguistics, known for the concept of the arbitrariness of the sign.
- Referential semantics: The ability of language to refer to things in the world (the word “cat” refers to an animal).
- Grouping structure: Hierarchical organization of musical events into units (motifs, phrases, sections).
- Tihai: In Indian music, a concluding cadence where a motif is repeated three times, leading to the sam (the first beat of the rhythmic cycle, a point of convergence).
- Tonic: The resting note or chord of a tonality (e.g., C in C major), the point of resolution for tensions.
- Ursatz: Schenker’s term (German, “original structure”) referring to the fundamental harmonic and melodic skeleton of a tonal piece.
Links
- L1 — The Chomsky Hierarchy — the formal framework of grammars
- M3 — The three paradigms of musical representation
- M5 — Polymetry — superposition of meters
- M6 — Hierarchical structure in music: GTTM in depth
- I2 — Bol Processor — presentation of BP3
- M1 — MIDI under the formal microscope
- B1 — BP3 Probabilistic Grammars — first entry in the B series
Prerequisites: L1 — The Chomsky Hierarchy, M3 — The Three Paradigms
Next article: M5 — Polymetry