L7) Denotational Semantics

When a Program Becomes a Function

What if a program were nothing more than a mathematical function in disguise? This bold idea is at the heart of denotational semantics.

Where does this article fit in?

In L5, we saw three approaches to giving meaning to a program. L6 (Structural Operational Semantics) describes how it executes. Denotational semantics describes what — what mathematical function the program computes. This is the subject of this article.


Why is it important?

Operational semantics (SOS) describes how a program executes. It’s precise, useful for implementing an interpreter, but sometimes too detailed. We want to know what the program does, not how it does it.

Denotational semantics approaches the problem from another angle: it associates each program with a mathematical function. Two programs are equivalent if and only if they denote the same function. This is a more abstract, more mathematical vision, which allows reasoning about equivalence and optimization.

For BP3 (Bol Processor 3, a musical grammar language, see I2), this approach is interesting but limited: stochasticity (behavior governed by probabilities) and side effects (state modifications, inputs/outputs, or anything beyond the simple calculation of a value) complicate the definition of functions. Understanding why helps us better grasp the strengths and weaknesses of each semantic approach.

The idea in one sentence

Denotational semantics interprets each language construct as a mathematical function, transforming the question “what does this program do?” into “what function is it?”


Let’s explain step by step

Example 1: Arithmetic Expressions

Let’s start with a simple example: arithmetic expressions.

Syntax:

e ::= n | x | e + e | e * e

 

Semantic Domain:

Values = Integers
Environment = Variables → Integers

 

The Denotation Function ⟦·⟧:

We define ⟦e⟧ as a function that takes an environment and returns an integer.

 

⟦n⟧ : Env → Int
⟦n⟧(ρ) = n

⟦x⟧ : Env → Int
⟦x⟧(ρ) = ρ(x)

⟦e₁ + e₂⟧ : Env → Int
⟦e₁ + e₂⟧(ρ) = ⟦e₁⟧(ρ) + ⟦e₂⟧(ρ)

⟦e₁ * e₂⟧ : Env → Int
⟦e₁ * e₂⟧(ρ) = ⟦e₁⟧(ρ) × ⟦e₂⟧(ρ)

 

Complete Decoding of these Notations:

The ⟦·⟧ notation (double brackets):

  • Reads as “the denotation of” or “the mathematical meaning of”
  • It is a function that transforms syntax (the program text) into semantics (its mathematical meaning)
  • ⟦e⟧ = “what expression e truly means”

The Env → Int notation:

  • Reads as “function from Env to Int”
  • Env (Environment) = the set of variable-value associations
  • Int = the set of integers
  • Env → Int = “a function that takes an environment and returns an integer”

The letter ρ (rho):

  • Represents a concrete environment (a variable → value dictionary)
  • ρ(x) = “the value of x in environment ρ”
  • Example: if ρ = {x ↦ 5, y ↦ 10}, then ρ(x) = 5

Line-by-line reading:

  • ⟦n⟧(ρ) = n: “the denotation of a number n, in any environment, is that number itself”
  • ⟦x⟧(ρ) = ρ(x): “the denotation of a variable x is its value in the environment”
  • ⟦e₁ + e₂⟧(ρ) = ⟦e₁⟧(ρ) + ⟦e₂⟧(ρ): “the denotation of a sum is the sum of the denotations”

Concrete example:

⟦x + 2 * 3⟧({x ↦ 5})
= ⟦x⟧({x ↦ 5}) + ⟦2 * 3⟧({x ↦ 5})
= 5 + (⟦2⟧({x ↦ 5}) × ⟦3⟧({x ↦ 5}))
= 5 + (2 × 3)
= 11

 

Step-by-step reading of this calculation:

  1. We want the denotation of x + 2 * 3 in the environment where x is 5
  2. By the addition rule, it is the sum of the denotations of x and 2 * 3
  3. The denotation of x in {x ↦ 5} is 5
  4. The denotation of 2 * 3 is 2 times 3 = 6
  5. So the final result is 5 + 6 = 11

Denotation is compositional: the meaning of a compound expression depends only on the meaning of its sub-expressions.

Why is compositionality important?

Imagine a cooking recipe. If you know that “making a shortcrust pastry” yields a certain dough, and “making an apple filling” yields a certain filling, then you know what “apple pie = shortcrust pastry + apple filling” yields without needing to redo the entire recipe. Compositionality allows reasoning about the parts to understand the whole.
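To make compositionality tangible, here is a minimal Python sketch of the denotation function for arithmetic expressions. The function name `denote` and the tuple-based syntax tree are illustrative assumptions, not part of the article's notation; the point is that each syntactic case maps to a function Env → Int built only from the denotations of its sub-expressions.

```python
def denote(e):
    """Map an expression AST to a function Env -> Int (a sketch)."""
    kind = e[0]
    if kind == "num":                      # [[n]](rho) = n
        n = e[1]
        return lambda rho: n
    if kind == "var":                      # [[x]](rho) = rho(x)
        x = e[1]
        return lambda rho: rho[x]
    if kind == "add":                      # [[e1 + e2]](rho) = [[e1]](rho) + [[e2]](rho)
        d1, d2 = denote(e[1]), denote(e[2])
        return lambda rho: d1(rho) + d2(rho)
    if kind == "mul":                      # [[e1 * e2]](rho) = [[e1]](rho) * [[e2]](rho)
        d1, d2 = denote(e[1]), denote(e[2])
        return lambda rho: d1(rho) * d2(rho)
    raise ValueError(f"unknown expression: {e!r}")

# The worked example above: [[x + 2 * 3]]({x -> 5}) = 11
expr = ("add", ("var", "x"), ("mul", ("num", 2), ("num", 3)))
print(denote(expr)({"x": 5}))  # 11
```

Note that `denote` is called recursively on sub-expressions before the lambdas are built: the meaning of the whole is assembled from the meanings of the parts, which is compositionality in code.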


Example 2: Commands with State

Let’s move on to commands that modify the state.

Syntax:

cmd ::= skip | x := e | cmd ; cmd | if b then cmd else cmd

 

Semantic Domain:

A command transforms one state into another state:

cmd : State → State

 

What is a semantic domain?

It is the mathematical space in which the meanings of programs “live”. Here, commands are interpreted as state transformers: functions that take a state (the memory before) and return a new state (the memory after).

The Denotation Functions:

 

⟦skip⟧ : State → State
⟦skip⟧(σ) = σ

⟦x := e⟧ : State → State
⟦x := e⟧(σ) = σ[x ↦ ⟦e⟧(σ)]

⟦c₁ ; c₂⟧ : State → State
⟦c₁ ; c₂⟧(σ) = ⟦c₂⟧(⟦c₁⟧(σ))

⟦if b then c₁ else c₂⟧ : State → State
⟦if b then c₁ else c₂⟧(σ) =
    if ⟦b⟧(σ) = true  then ⟦c₁⟧(σ)
    else ⟦c₂⟧(σ)

 

Decoding each rule:

  • ⟦skip⟧(σ) = σ: the “do nothing” command returns the state unchanged
  • ⟦x := e⟧(σ) = σ[x ↦ ⟦e⟧(σ)]:
      • The assignment x := e first evaluates e (yielding ⟦e⟧(σ))
      • Then updates the state: σ[x ↦ v] = “σ where x now equals v”
      • Example: ⟦x := 5⟧({x ↦ 0, y ↦ 10}) = {x ↦ 5, y ↦ 10}
  • ⟦c₁ ; c₂⟧(σ) = ⟦c₂⟧(⟦c₁⟧(σ)):
      • The sequence “c₁ then c₂” first executes c₁ (obtaining a new state)
      • Then executes c₂ in this new state
      • This is a composition of functions: (⟦c₂⟧ ∘ ⟦c₁⟧)(σ)
  • ⟦if b then c₁ else c₂⟧: depending on the value of b, executes c₁ or c₂

We see that each syntactic construct has a direct functional interpretation.
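These state transformers can be sketched directly in Python, with a state modeled as a dictionary. The function names (`skip`, `assign`, `seq`, `if_then_else`) are illustrative assumptions; each one returns a function State → State, mirroring the rules above.

```python
def skip():
    return lambda sigma: sigma                      # [[skip]](sigma) = sigma

def assign(x, denote_e):
    # [[x := e]](sigma) = sigma[x -> [[e]](sigma)]
    return lambda sigma: {**sigma, x: denote_e(sigma)}

def seq(c1, c2):
    # [[c1 ; c2]] is the composition c2 after c1
    return lambda sigma: c2(c1(sigma))

def if_then_else(denote_b, c1, c2):
    return lambda sigma: c1(sigma) if denote_b(sigma) else c2(sigma)

# Example: x := 5 ; y := x + 1
prog = seq(assign("x", lambda s: 5),
           assign("y", lambda s: s["x"] + 1))
print(prog({"x": 0, "y": 0}))  # {'x': 5, 'y': 6}
```

Running `prog` on an initial state simply applies the composed function: no execution machinery, just function application.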


The Challenge of Loops: Fixed Points

Loops pose a subtle problem. Consider:

 

while b do c

 

Intuitively, this is equivalent to:

if b then (c ; while b do c) else skip

 

But this definition is circular: while is defined in terms of itself.

Solution: The Fixed-Point Equation

We are looking for a function F such that:

⟦while b do c⟧ = F

where F satisfies:

F(σ) = if ⟦b⟧(σ) then F(⟦c⟧(σ)) else σ

 

This is a fixed-point equation: we are looking for F such that F = G(F) for a certain operator G.

What is a fixed point? Concrete examples

A fixed point of a function f is a value x such that f(x) = x (the function “does not move” this value).

Simple examples:

  • For f(x) = x², the fixed points are 0 and 1 (because 0² = 0 and 1² = 1)
  • For f(x) = x + 1, there is no fixed point (no number equals its successor)
  • For a thermostat set to 20°C, the temperature 20°C is a “fixed point”: the system no longer needs to change

For loops:
We are looking for a function F (the behavior of the loop) such that “executing the loop one more time changes nothing”. This is exactly the definition of a fixed point applied to functions.

Kleene’s theorem guarantees that this equation has a solution (the least fixed point) under certain conditions. This is the mathematical basis for the semantics of loops.

Why “the least” fixed point?

There can be multiple fixed points. We choose the least because it corresponds to the “minimal” behavior: the loop does exactly what it is supposed to do, without inventing additional behavior. For an infinite loop, the least fixed point is the “undefined everywhere” function (the loop never terminates).

Intuition: The fixed point represents “the limit” of:

  • 0 iterations: skip
  • 1 iteration: if b then c else skip
  • 2 iterations: if b then (c; if b then c else skip) else skip
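This chain of approximations can be sketched in Python. Following Kleene's construction, the sketch starts from the bottom element (approximation 0 is "undefined everywhere", modeled here by raising an exception) and unrolls the loop one step per level; the names `Bottom` and `approx` are assumptions for illustration.

```python
class Bottom(Exception):
    """Stands in for the bottom element: non-termination / no information."""

def approx(k, denote_b, denote_c):
    """k-th Kleene approximation of [[while b do c]] as a State -> State function."""
    if k == 0:
        def bot(sigma):
            raise Bottom  # zero unrollings: undefined everywhere
        return bot
    prev = approx(k - 1, denote_b, denote_c)
    # one more unrolling: if b then (c ; previous approximation) else skip
    return lambda sigma: prev(denote_c(sigma)) if denote_b(sigma) else sigma

# while x > 0 do x := x - 1, starting from x = 3:
b = lambda s: s["x"] > 0
c = lambda s: {**s, "x": s["x"] - 1}
print(approx(4, b, c)({"x": 3}))  # enough unrollings: {'x': 0}
```

With too few unrollings (`approx(2, b, c)({"x": 3})`) the computation hits `Bottom`: the approximation carries no information yet for that input. The least fixed point is the limit of this increasing chain.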

Domains: The Mathematical Foundation

For fixed points to exist, we need an appropriate mathematical structure. This is the role of domain theory.

Key Ideas:

  1. Partial order: values are ordered by “information”. ⊥ (bottom) represents “no information” or “non-termination”.
  2. Increasing chains: if x₀ ⊑ x₁ ⊑ x₂ ⊑ ..., the sequence converges to a limit.
  3. Continuous functions: functions that preserve chain limits.
  4. Fixed-point theorem: every continuous function on a domain has a least fixed point.

Decoding the symbols and concepts:

  • ⊥ (bottom): symbol representing the absence of information or a non-terminating computation. It is “less” than any other value.
  • ⊑ (partial order): reads as “is less defined than” or “has less information than”. x ⊑ y means “y has at least as much information as x”.
      • Example: ⊥ ⊑ 5 (not terminating has less information than returning 5)
  • Increasing chain: a sequence where each element has “more information” than the previous one
      • Example: ⊥ ⊑ "program that terminates in 1 iteration" ⊑ "program that terminates in 2 iterations" ⊑ ...
  • Continuous function: a function that “preserves approximations”. If we approximate the input better and better, the output also approximates better and better.

In practice: These details are important for theoreticians. For practitioners, the essential thing is that loops and recursion do have a well-defined mathematical interpretation.


What about BP3 in all this?

BP3 is an interesting case for denotational semantics. Let’s try to define ⟦·⟧ for a BP3 grammar.

First attempt: deterministic grammars

For ORD (ordered) mode without stochasticity:

⟦S⟧ : () → String
⟦S⟧() = ⟦Tihai⟧() ⟦Tihai⟧() ⟦Tihai⟧()
⟦Tihai⟧() = "dha dhin dhin"

 

This works: each non-terminal (symbol that can be rewritten according to grammar rules, see L1) denotes a string of symbols.
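In Python, the deterministic ORD case is almost trivial: each non-terminal denotes a plain string, and a production denotes the concatenation of its right-hand side. The function names below are illustrative assumptions, not BP3 syntax.

```python
def tihai():
    # the non-terminal Tihai denotes a fixed string
    return "dha dhin dhin"

def s():
    # S -> Tihai Tihai Tihai: concatenation of the denotations
    return " ".join([tihai(), tihai(), tihai()])

print(s())  # dha dhin dhin dha dhin dhin dha dhin dhin
```

Every call yields the same string: the denotation is a genuine function, with no hidden state.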

The problem: stochasticity

In RND (weighted random) mode with weights (numerical values determining the probability of rule selection):

<3> Tihai → dha dhin dhin
<2> Tihai → fa sol la

 

What is the denotation of Tihai?

 

⟦Tihai⟧ = ???

 

It’s not a single string but a set of possible strings with probabilities. The denotation becomes:

 

⟦Tihai⟧ : () → Distribution(String)
⟦Tihai⟧() = { "dha dhin dhin" ↦ 0.6, "fa sol la" ↦ 0.4 }

 

This is possible but more complex. We enter the realm of probabilistic monads.

What is a probability distribution?

It is a function that associates each possible outcome with its probability of occurrence. The sum of all probabilities is 1 (100%).

Here, { "dha dhin dhin" 0.6, "fa sol la" 0.4 } means:

  • 60% chance of getting “dha dhin dhin”
  • 40% chance of getting “fa sol la”
  • (0.6 + 0.4 = 1, the sum is indeed 100%)

A probabilistic monad is a mathematical structure that allows composing such distributions. If A has a certain distribution and B also, how to calculate the distribution of “A then B”? The monad answers this question.
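A distribution and its monadic composition can be sketched in a few lines of Python: a distribution is a dict mapping outcomes to probabilities, and `bind` answers the “A then B” question by weighting the outcomes of B by the probability of each outcome of A. The names `unit` and `bind` follow standard monad vocabulary; the code itself is an illustrative sketch, not BP3's implementation.

```python
def unit(x):
    """The trivial distribution: outcome x with probability 1."""
    return {x: 1.0}

def bind(dist, f):
    """Compose: for each outcome of dist, run f and weight its outcomes."""
    out = {}
    for x, p in dist.items():
        for y, q in f(x).items():
            out[y] = out.get(y, 0.0) + p * q
    return out

# [[Tihai]] with weights <3> and <2>: probabilities 0.6 and 0.4
tihai = {"dha dhin dhin": 0.6, "fa sol la": 0.4}

# "Tihai then Tihai": a distribution over concatenated strings
two = bind(tihai, lambda a: bind(tihai, lambda b: unit(a + " " + b)))
print(two["dha dhin dhin dha dhin dhin"])  # 0.36 (= 0.6 * 0.6)
```

The composed distribution still sums to 1, which is exactly the invariant the monad structure preserves.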

The problem: mutable state

BP3 flags (conditional variables like /Ideas=20/, see B4) add state:

 

⟦rule with /Ideas-1/⟧ : State → Distribution(String × State)

 

Now the denotation returns both a string AND a new state. We need state monads combined with distributions.

The problem: decremental weights

With <50-12> (initial weight 50 decremented by 12 with each use, see B4), probabilities change with each use:

Tihai after 3 calls ≠ Tihai after 5 calls

 

The denotation depends on the complete history of past derivations.
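The history dependence is easy to see in a Python sketch: the weight table must be threaded through every call, so the probabilities are no longer a fixed property of the rule. The weights and the decrement of 12 come from the article's <50-12> example; the helper names are assumptions.

```python
def pick_probs(weights):
    """Turn a weight table into selection probabilities."""
    total = sum(weights.values())
    return {rule: w / total for rule, w in weights.items()}

def apply_rule(weights, rule, decrement=12):
    """Return the new weight table after one use of `rule` (<50-12> style)."""
    return {**weights, rule: max(0, weights[rule] - decrement)}

w0 = {"dha dhin dhin": 50, "fa sol la": 50}
print(pick_probs(w0))                  # 0.5 / 0.5 before any call
w1 = apply_rule(w0, "dha dhin dhin")   # its weight drops from 50 to 38
print(pick_probs(w1))                  # no longer 0.5 / 0.5
```

The “denotation” of the rule after the call differs from the one before it: without threading the weight history through the semantics, no single function can describe the rule.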

Conclusion for BP3:

A denotational semantics for BP3 is possible but requires:

  • Probabilistic monads
  • State monads
  • Weight history

It’s feasible (researchers have done it for similar languages) but complex. Operational semantics (SOS) is more direct and practical for BP3.


When Denotational Semantics Shines

Despite its limitations for BP3, denotational semantics excels in certain contexts:

  • Pure functional languages: no state, no effects, direct denotation
  • Equivalence proofs: P₁ ≡ P₂ iff ⟦P₁⟧ = ⟦P₂⟧
  • Optimization: if ⟦P'⟧ = ⟦P⟧, P' can be substituted for P
  • Type definition: types are semantic domains
  • Theoretical foundations: links with logic, topology, category theory

Example of equivalence:

⟦x + 0⟧ = ⟦x⟧
⟦x * 1⟧ = ⟦x⟧
⟦if true then e₁ else e₂⟧ = ⟦e₁⟧

 

These equations are laws that any interpreter must respect.
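Such a law can be spot-checked in Python by comparing the two denotations on sample environments. This is a finite check, not a proof (that would require reasoning over all environments); the variable names below are illustrative.

```python
# Spot-check of the laws [[x + 0]] = [[x]] and [[x * 1]] = [[x]]:
# two denotations, each a function Env -> Int, compared pointwise.

d_x_plus_0 = lambda rho: rho["x"] + 0   # [[x + 0]]
d_x_times_1 = lambda rho: rho["x"] * 1  # [[x * 1]]
d_x = lambda rho: rho["x"]              # [[x]]

for v in range(-5, 6):
    rho = {"x": v}
    assert d_x_plus_0(rho) == d_x(rho)
    assert d_x_times_1(rho) == d_x(rho)
print("laws hold on all sampled environments")
```

An optimizer that rewrites `x + 0` to `x` is justified precisely because the two sides denote the same function.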


Key Takeaways

  1. ⟦e⟧ = the function denoted by e: standard notation for “the semantics of”.
  2. Compositionality: the meaning of a whole depends on the meaning of its parts.
  3. Fixed points: mathematical solution for defining loops and recursion.
  4. Domains: mathematical structures that guarantee the existence of fixed points.
  5. Limitations for BP3: stochasticity, mutable state, and dynamic weights make denotational semantics complex. SOS is more suitable.
  6. Strengths: equivalence proofs, optimization, solid theoretical foundations.

To go further

  • Foundational book: Stoy, Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory — the historical classic
  • Modern introduction: Winskel, The Formal Semantics of Programming Languages — chapters 5-8
  • Monads and effects: Moggi, E. (1991). “Notions of Computation and Monads” — how to handle effects in denotational semantics
  • For the curious: Tennent, Semantics of Programming Languages — pedagogical approach

Glossary

  • Bottom (⊥): minimal element of a domain, representing absence of information or non-termination. Pronounced “bottom”. Example: an infinite loop has ⊥ as its denotation.
  • Increasing chain: a sequence of elements x₀ ⊑ x₁ ⊑ x₂ ⊑ ... where each element has “more information” than the previous one. Used to construct fixed points.
  • Compositionality: property according to which the meaning of a compound expression depends only on the meaning of its components. Allows reasoning about the parts to understand the whole.
  • Denotation: the mathematical value (function, set, etc.) associated with an expression. Noted ⟦e⟧ for expression e.
  • Probability distribution: a function associating each possible outcome with its probability. The sum of probabilities is 1.
  • Domain: a mathematical structure with a partial order (⊑) and completeness properties; serves as a “value space” for programs.
  • Semantic domain: the mathematical space in which programs are interpreted (integers, functions, states…).
  • Environment: a dictionary associating each variable with its value. Noted ρ (rho). Example: {x ↦ 5, y ↦ 10}.
  • Continuous function: a function that preserves the limits of increasing chains. Necessary for the existence of fixed points.
  • Monad: a mathematical structure that encapsulates effects (state, non-determinism, exceptions, probabilities) within a functional framework.
  • Probabilistic monad: a monad allowing the composition of probability distributions.
  • Partial order (⊑): “is less defined than” or “has less information than” relation. ⊥ ⊑ x for all x.
  • Fixed point: a value x such that f(x) = x. Examples: 0 and 1 are fixed points of f(x) = x². Used to define loops and recursion.
  • Kleene’s theorem: guarantees that every continuous function on a domain has a least fixed point, obtained as the limit of successive approximations.
  • State transformer: a function of type State → State. An imperative command is viewed as a state transformer.
  • ⟦·⟧ (double brackets): standard notation for “the denotation function”. ⟦e⟧ reads as “the denotation of e” or “what e means mathematically”.
  • ρ (rho): Greek letter conventionally used to represent an environment (variable → value dictionary).
  • σ (sigma): Greek letter conventionally used to represent a state (program memory).

Prerequisite: L5
Reading time: 8 min
Tags: #denotational-semantics #functions #fixed-point #domain-theory


Previous article: L6
For practice, see: R1