L19) The P-chain

“Generation-Recognition Asymmetry” Series

This article accompanies the research paper The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory (arXiv:2603.10139).
Reading path: L13 · L14 · L15 · L16 · L17 · L18 · L19 · L20 — or the complete index.

The Brain That Predicts by Producing

Understanding, Producing, Learning: A Single Loop?

When you listen to a sentence, your brain does not just “analyze.” According to an increasingly supported hypothesis, it silently produces the likely continuation — and it is the gap between its prediction and what it actually hears that drives learning. Three activities once thought to be separate may actually form just one.

Where Does This Article Fit?

The previous articles dissected the generation-recognition asymmetry as a formal fact — a property of grammars, measurable in complexity classes (L17) and in bits of surprisal (L15). This article shifts levels: it looks at the psycholinguistic antecedent of this asymmetry. Indeed, the triad we are studying — producing, recognizing, inferring — has already been conceptualized as a unified whole by cognitive science, under the name of the P-chain. It is the link between formal language theory and the brain.

Why It Matters

The research paper that this series popularizes claims a certain originality: it brings together, in a single dimensional framework, asymmetries that were previously scattered. But it would be dishonest to claim that no one had linked production and comprehension. Psycholinguistics did so, at a different level of analysis. The P-chain is exactly the antecedent that we must cite, understand, and distinguish our work from.

Understanding the P-chain also helps us see why two dimensions of the asymmetry, information (D4) and temporality (D6), which look independent in the formal analysis, might be one and the same thing in the brain.

The Idea in One Sentence

The P-chain proposes that language comprehension relies on an implicit prediction made by the production system, and that the error between this prediction and the actual input is the driver of learning — thus linking production, comprehension, and acquisition into a single causal chain.

Let’s Explain Step by Step

1. Three Activities, Long Studied Separately

Historically, psycholinguistics has treated the following as distinct fields:

production (how we transform an intention into speech),
comprehension (how we retrieve meaning from heard speech),
acquisition (how a child learns their language).

Three literatures, three sets of models, few bridges. This is the exact cognitive reflection of the formal asymmetry in L13: generating, recognizing, inferring, each in its own corner.

2. The P-chain: A Chain, Not Three Boxes

Dell & Chang (2014) propose to flip this perspective. Their framework, the P-chain (where P stands for prediction, production, processing), asserts that these three activities are links in a single chain:

[Figure: P-chain loop diagram]

Figure 1 — The P-chain loop. Comprehension calls upon the production system to predict what comes next; the gap with the actual input constitutes a prediction error; this error adjusts the model, which is learning; a better model predicts better. Production is not at the end of the chain — it is at the heart of comprehension.

3. “Prediction is production”

The most surprising link is the first one: to comprehend, the brain supposedly uses its production apparatus. Martin, Branzi & Bar (2018) formulate this right in their title — “Prediction is Production” — and support it experimentally: when the production system is occupied by a secondary verbal task, prediction capacity during comprehension drops. The system that speaks is the same one that anticipates what we are about to hear.

Gambi & Pickering (2017) turn this into a modeling principle: to comprehend is to simulate the other’s production. The listener does not receive passively; they actively re-generate, in advance, what the speaker is saying.

Breakdown. The idea is not that you internally pronounce every word. Rather, the mechanisms of language planning — those that, in production, choose the next word — are recruited, in comprehension, to guess the next word. Production runs “idle,” in prediction mode.
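To make the "idle" metaphor concrete, here is a minimal Python sketch. It is not a model taken from Dell & Chang or from Martin et al.: the table `NEXT_WORD`, its probabilities, and the function names `plan_next_word`, `produce`, and `predict` are all assumptions made for illustration. The point is purely structural: a single planning step feeds both modes.

```python
import random

# Hypothetical toy table of P(next word | previous word).
# The words and probabilities are invented for illustration only.
NEXT_WORD = {
    "<s>": {"the": 0.75, "a": 0.25},
    "the": {"dog": 0.5, "cat": 0.3, "idea": 0.2},
}


def plan_next_word(context):
    """Shared planning step: a distribution over candidate next words."""
    return NEXT_WORD[context]


def produce(context, rng):
    """Production mode: commit to one word, sampled from the plan."""
    dist = plan_next_word(context)
    words = list(dist)
    weights = [dist[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def predict(context, top_k=2):
    """Prediction mode: same plan, nothing is uttered; candidates are ranked."""
    dist = plan_next_word(context)
    return sorted(dist, key=dist.get, reverse=True)[:top_k]


if __name__ == "__main__":
    rng = random.Random(0)
    print("speaker says:", produce("the", rng))
    print("listener expects:", predict("the"))
```

Both `produce` and `predict` call the same `plan_next_word`; they differ only in what they do with its output. With this table, `predict("the")` ranks "dog" ahead of "cat".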

4. Surprisal: The Measurable Trace of Prediction

How do we measure this silent prediction? By its failure. When the heard word is expected, processing is smooth; when it surprises, it incurs a cost. This is exactly the surprisal introduced in L15:

$$S(w_i) = -\log_2 P(w_i \mid w_1, \dots, w_{i-1})$$

The surprisal of word $w_i$ measures the improbability of the word given the preceding context. Hale (2001) uses this as a model of processing difficulty: the more surprising a word is, the longer it takes to integrate. Levy (2008) refines this into expectation-based comprehension: difficulty is the cost of reallocating probability mass among competing hypotheses when the word arrives. Stolcke (1995) had provided the machinery: a probabilistic Earley parser that computes, at each position, the prefix probability.
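A worked example may help fix the scale of the measure. The sketch below assumes a hypothetical lookup table `BIGRAM` in which the previous word stands in for the full preceding context (a simplification), with invented probabilities chosen as powers of two so that the values in bits come out exact. It is not Hale's or Stolcke's parser, only the formula applied to a table.

```python
import math

# Hypothetical P(word | previous word). The numbers are invented;
# powers of two make the surprisal values exact in bits.
BIGRAM = {
    ("<s>", "the"): 0.5,
    ("the", "dog"): 0.5,
    ("dog", "barked"): 0.25,
    ("dog", "sang"): 0.0625,
}


def surprisal(prev, word):
    """Surprisal in bits: -log2 of the conditional probability."""
    return -math.log2(BIGRAM[(prev, word)])


def surprisal_profile(words):
    """Word-by-word surprisal for a sentence, read left to right."""
    context = "<s>"
    profile = []
    for word in words:
        profile.append((word, surprisal(context, word)))
        context = word  # the bigram approximation: only the last word is kept
    return profile


if __name__ == "__main__":
    for sentence in ("the dog barked", "the dog sang"):
        print(sentence, "->", surprisal_profile(sentence.split()))
```

Under these invented numbers, "the dog barked" yields 1, 1, and 2 bits, while "the dog sang" ends on 4 bits: the less expected verb costs more.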

Surprisal is therefore the observable signature of the P-chain: if the brain predicts, then the violation of its prediction must have a cost, and that cost is measurable (reading times, the N400 event-related potential). This is the bridge between the cognitive hypothesis and the temporal dimension (D6) of our asymmetry: the generator, which already knows what it is about to emit, never surprises itself ($S = 0$), while the receiver experiences the surprisal of the input ($S > 0$ whenever the word is not fully predictable).

5. Prediction Error Drives Learning

The final link connects everything to inference (D5, the subject of L13 and soon a dedicated article). In the P-chain, “prediction error drives learning”: every gap between prediction and reality is an error signal that adjusts internal parameters. This is the same principle as learning by minimizing surprise, and it aligns with the idea that comprehending is compressing — finding the model that makes the data as unsurprising as possible.

In other words: a child acquiring their language is not performing an operation foreign to comprehension. They are comprehending, with prediction errors large enough to reconfigure the grammar. Inference is comprehension pushed to its limit, the case where the grammar itself is still unknown.
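The principle can be sketched in a few lines of Python. What follows is a deliberately tiny stand-in, not the learning model of Dell & Chang: the class `ErrorDrivenPredictor`, the vocabulary, the learning rate, and the softmax-over-logits parameterization are all assumptions chosen for illustration. What it shows is the loop itself: predict, compare with the actual word, and adjust the parameters by the prediction error.

```python
import math

VOCAB = ["the", "dog", "chased", "a", "cat"]  # hypothetical vocabulary
START = "<s>"


def softmax(logits):
    """Turn a {word: logit} row into a probability distribution."""
    top = max(logits.values())
    exps = {w: math.exp(v - top) for w, v in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


class ErrorDrivenPredictor:
    """Predicts the next word from the previous one; learns from its errors."""

    def __init__(self, vocab, learning_rate=0.5):
        self.vocab = list(vocab)
        self.learning_rate = learning_rate
        self.logits = {}  # context -> {word: logit}, created on first use

    def predict(self, context):
        row = self.logits.setdefault(context, {w: 0.0 for w in self.vocab})
        return softmax(row)

    def observe(self, context, actual):
        """Return the surprisal (bits) of `actual`, then learn from the error."""
        probs = self.predict(context)
        bits = -math.log2(probs[actual])
        row = self.logits[context]
        for w in self.vocab:
            target = 1.0 if w == actual else 0.0
            # Prediction error = what happened minus what was predicted.
            # This is a gradient step on the log loss; the constant factor
            # between nats and bits is absorbed into the learning rate.
            row[w] += self.learning_rate * (target - probs[w])
        return bits


def total_surprisal_per_pass(sentence, passes=5):
    """Total surprisal of the same sentence on each successive exposure."""
    model = ErrorDrivenPredictor(VOCAB)
    totals = []
    for _ in range(passes):
        context, total = START, 0.0
        for word in sentence:
            total += model.observe(context, word)
            context = word
        totals.append(total)
    return totals


if __name__ == "__main__":
    for i, bits in enumerate(total_surprisal_per_pass("the dog chased a cat".split()), 1):
        print(f"exposure {i}: {bits:.2f} bits")
```

On the first exposure the model knows nothing, so every word costs $\log_2 5$ bits; with each repetition the total falls, because every error has nudged the model toward the input.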

6. What the P-chain Says (and Doesn’t Say) About Our Framework

Here, for the sake of rigor, we must distinguish between levels of analysis.

Our framework is formal: it analyzes generation, recognition, and inference as distinct computational objects, with distinct complexity classes. At this level, information asymmetry (D4: what each agent knows in total) and temporal asymmetry (D6: how uncertainty evolves token by token) are independent: a “batch” mode parser (which receives the entire string at once) suffers from D4 but not from D6.

The P-chain is cognitive: it describes brain mechanisms. And at this level, it challenges this independence. If comprehending is predicting by producing, then the static information gap (D4) might be nothing more than the aggregate of small incremental surprises (D6) accumulated over time. A single machinery, observed at two scales.
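One way to make "aggregate" precise, offered here as an illustrative assumption rather than a result from Dell & Chang or from the research paper, is to measure the static gap by the surprisal of the whole string. The chain rule of probability then decomposes that quantity exactly into the word-by-word surprisals $S(w_i)$ of Section 4:

$$-\log_2 P(w_1, \dots, w_n) = -\log_2 \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1}) = \sum_{i=1}^{n} S(w_i)$$

On this reading, the quantity is the same whether it is accounted for all at once or token by token; only the bookkeeping differs.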

The two readings do not contradict each other: they operate at different levels. Our contribution is not to discover the production-comprehension link — the P-chain did that — but to situate it within the formal framework of languages, where it had not yet been articulated. The cognitive question of whether D4 and D6 are one and the same mechanism remains open.

7. In Music: The Listener Who Anticipates

Music offers the purest testing ground for the P-chain. Listening to a tonal melody means constantly anticipating the next note — and feeling a precise tension when it deviates. Models of melodic expectation (such as IDyOM, which calculates note-by-note surprisal from a statistical model of style) are literally musical P-chains: they predict by mentally “producing” the continuation, and measure the surprise.
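As a rough illustration of note-by-note surprisal, here is a Python sketch of a first-order Markov model over pitch names with add-one smoothing. IDyOM itself is considerably richer than this; the three-melody `CORPUS`, the seven-pitch alphabet, and the function names are assumptions invented for the example.

```python
import math
from collections import Counter, defaultdict

PITCHES = ["C", "D", "E", "F", "G", "A", "B"]  # hypothetical pitch alphabet

# Hypothetical miniature "style": three stepwise melodies.
CORPUS = [
    ["C", "D", "E", "F", "G"],
    ["C", "D", "E", "D", "C"],
    ["E", "F", "G", "F", "E"],
]


def train(corpus):
    """Count how often each pitch follows each other pitch."""
    counts = defaultdict(Counter)
    for melody in corpus:
        for prev, nxt in zip(melody, melody[1:]):
            counts[prev][nxt] += 1
    return counts


def note_surprisal(counts, prev, nxt):
    """Surprisal in bits of `nxt` after `prev`, with add-one smoothing."""
    row = counts[prev]
    prob = (row[nxt] + 1) / (sum(row.values()) + len(PITCHES))
    return -math.log2(prob)


def melodic_surprisal(counts, melody):
    """Surprisal of every note after the first, given the note before it."""
    return [
        (nxt, note_surprisal(counts, prev, nxt))
        for prev, nxt in zip(melody, melody[1:])
    ]


if __name__ == "__main__":
    style = train(CORPUS)
    for melody in (["C", "D", "E"], ["C", "B", "E"]):
        print(melody, "->", melodic_surprisal(style, melody))
```

In this toy style, a step from C to D is cheaper than a leap from C to B, which never occurs in the corpus: the deviation from the expected continuation shows up as extra bits.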

The improvising musician experiences the other end of the chain: they hear while playing. Their production anticipates their own listening. And the apprentice, in turn, adjusts their model of the style with each poorly anticipated phrase — inference in action. The same loop, from composer to listener to student.

This is also why a system like I2, which clearly separates production (PROD mode) and analysis (ANAL mode), captures the formal shape of the asymmetry but not its cognitive loop: it does not predict by producing. Grammar reversibility (L16) is a necessary, but not sufficient, condition to close the loop.

Key Takeaways

The P-chain (Dell & Chang 2014) unifies production, comprehension, and acquisition into a single causal chain.
Central hypothesis, prediction-by-production: to comprehend, the brain predicts what comes next by mobilizing its production system (Martin et al. 2018).
Surprisal (Hale 2001, Levy 2008) is the measurable trace of this prediction: its cost when expectation is violated.
“Prediction error drives learning”: the gap between prediction and reality is the error signal that updates the model; inference is comprehension pushed to its limit.
Levels of analysis: on a formal level, D4 (information) and D6 (temporality) are independent; on a cognitive level, the P-chain suggests they might be one and the same. Our contribution is to situate this link within the formal framework, not to discover it.
In music, the listener who anticipates and the musician who “hears while playing” are living P-chains.

To Go Further

The P-chain Framework and Prediction

Dell, G.S. & Chang, F. (2014). “The P-chain: relating sentence production and its disorders to comprehension and acquisition.” Phil. Trans. R. Soc. B 369(1634), 20120394. DOI:10.1098/rstb.2012.0394
Martin, C.D., Branzi, F.M. & Bar, M. (2018). “Prediction is Production: The missing link between language production and comprehension.” Scientific Reports 8, 1079. DOI:10.1038/s41598-018-19499-4
Gambi, C. & Pickering, M.J. (2017). “Models Linking Production and Comprehension.” The Handbook of Psycholinguistics, 157-181. DOI:10.1002/9781118829516.ch7
Gastaldon, S. et al. (2024). “Predictive language processing: integrating comprehension and production.” Frontiers in Psychology 15, 1369177. DOI:10.3389/fpsyg.2024.1369177
Chater, N. & Manning, C.D. (2006). “Probabilistic models of language processing and acquisition.” Trends in Cognitive Sciences 10(7), 335-344. DOI:10.1016/j.tics.2006.05.006

Surprisal

Hale, J. (2001). “A Probabilistic Earley Parser as a Psycholinguistic Model.” NAACL 2001 — difficulty proportional to surprisal.
Levy, R. (2008). “Expectation-Based Syntactic Comprehension.” Cognition 106(3), 1126-1177 — probability reallocation.
Stolcke, A. (1995). “An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities.” Computational Linguistics 21(2) — the prefix probability machinery.

The Popularized Research Paper

Peyrichou, R. (2026). The Generation-Recognition Asymmetry… §1.2 and §4.6 situate the P-chain in relation to the formal framework. Preprint arXiv:2603.10139 — https://arxiv.org/abs/2603.10139

Within the Corpus

L13 — The asymmetry in 6 dimensions
L15 — Surprisal and other formulas (D6)
L16 — Reversibility: necessary but not sufficient for the loop

Glossary

P-chain: framework by Dell & Chang (2014) linking production, comprehension, and acquisition in a causal chain via prediction.
Prediction-by-production: hypothesis according to which comprehension predicts what comes next by mobilizing the production system.
Surprisal: $-\log_2$ of the probability of a word given its context; measures improbability, and thus processing difficulty.
Prediction error: gap between what the system predicted and the actual input; the signal that drives learning.
N400: brain wave (event-related potential) whose amplitude increases with semantic unexpectedness — neural correlate of surprisal.
Level of analysis: the plane on which a phenomenon is described (formal/computational vs. cognitive/mechanistic); two levels can diverge without contradicting each other.
Melodic expectation: the listener’s anticipation of the next note; modeled by musical surprisal (e.g., IDyOM).

Series Links

L13 — Generating or recognizing — the asymmetry for which the P-chain is the cognitive antecedent
L15 — The formulas of asymmetry — where surprisal comes from (D6)
L18 — The sign reversal — the other major contribution of the paper
M6 — Hierarchical structure in music — structural expectations

Prerequisites: L13, L15
Reading time: 10 min
Tags: #P-chain #psycholinguistics #surprisal #prediction #production #musical-cognition
