Substitution in first-order languages

1. Introduction

This web page discusses syntactical aspects of substitution in first-order logic. In particular, it gives the tricky definition of substitution itself and explains when a substitution is valid. These details are technical. They behave exactly as one would naïvely hope (though they are more awkward to write down than one might expect), and in particular all these notions are computable, i.e. they can be carried out on a computer or represented in a formal system such as a Post system. Readers who can take such details on trust need not read further; only those who want to see the gory details need read on.

2. Substitution

We want to define and investigate the effect of substituting terms $t_{1}, \dots, t_{k}$ for variables $v_{1}, \dots, v_{k}$ in a term or formula. The work here will be entirely syntactical (i.e. manipulations of strings of symbols without meaning attached) but looking ahead to further work we will take care to ensure that our substitutions are potentially meaningful (i.e. can at a later stage be given meaning).

We fix a first-order language $L$ with variables from ${VAR}_{L} = \{u, v, w, \dots\}$ and terms ${TERM}_{L}$ .

In its simplest form a substitution is a function $s : V \to {TERM}_{L}$ where $V = dom (s)$ is a finite set of variables. The idea is that we should replace each $v \in V$ with the term $s (v)$ . This in turn gives more complex substitution functions $s (t)$ and $s (ϕ)$ where $t$ is a term and $ϕ$ is a formula of $L$ . Most of the definition of this extension to terms and formulas is given recursively by the following clauses.

$s (v) = v$ if $v \notin dom (s)$
$s (c) = c$ if $c$ is a constant symbol.
$s (F (t_{1}, \dots, t_{k})) = F (s (t_{1}), \dots, s (t_{k}))$ if $F$ is a $k$ -ary function symbol and $t_{1}, \dots, t_{k}$ are terms.
$s (⊤) = ⊤$ and $s (⊥) = ⊥$
$s ((t_{1} = t_{2})) = (s (t_{1}) = s (t_{2}))$ if $t_{1}, t_{2}$ are terms.
$s (R (t_{1}, \dots, t_{k})) = R (s (t_{1}), \dots, s (t_{k}))$ if $R$ is a $k$ -ary relation symbol and $t_{1}, \dots, t_{k}$ are terms.
$s (\neg ϕ) = \neg s (ϕ)$ for any formula $ϕ$
$s ((ϕ_{1} \land ϕ_{2})) = (s (ϕ_{1}) \land s (ϕ_{2}))$ if $ϕ_{1}, ϕ_{2}$ are formulas.
$s ((ϕ_{1} \lor ϕ_{2})) = (s (ϕ_{1}) \lor s (ϕ_{2}))$ if $ϕ_{1}, ϕ_{2}$ are formulas.
$s ((ϕ_{1} \to ϕ_{2})) = (s (ϕ_{1}) \to s (ϕ_{2}))$ if $ϕ_{1}, ϕ_{2}$ are formulas.

This is an inductive definition on terms and formulas, and as such relies on the Unique Readability Theorem for terms and formulas.
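The clauses above translate almost line for line into a recursive function. The following is only a minimal sketch under stated assumptions: the nested-tuple encoding of terms and formulas, the tag names and the function names `subst_term` and `subst_formula` are all hypothetical choices made for illustration. A variable in $dom (s)$ is handled naïvely here by returning $s (v)$ , following the informal idea of substitution, and quantified formulas are rejected.

```python
# Hypothetical encoding, chosen for this sketch only:
#   terms:    ("var", name) | ("const", name) | ("fn", name, (t1, ..., tk))
#   formulas: ("top",) | ("bot",) | ("eq", t1, t2) | ("rel", name, (t1, ..., tk))
#             | ("not", phi) | ("and" | "or" | "imp", phi1, phi2)
# A substitution s is a dict from variable names to terms.

def subst_term(s, t):
    """Apply s to the term t, one case per clause for terms."""
    tag = t[0]
    if tag == "var":
        # v not in dom(s): unchanged.  v in dom(s): naive replacement by s(v).
        return s.get(t[1], t)
    if tag == "const":
        return t
    if tag == "fn":
        return ("fn", t[1], tuple(subst_term(s, a) for a in t[2]))
    raise ValueError(f"not a term: {t!r}")


def subst_formula(s, phi):
    """Apply s to a quantifier-free formula phi, one case per clause."""
    tag = phi[0]
    if tag in ("top", "bot"):
        return phi
    if tag == "eq":
        return ("eq", subst_term(s, phi[1]), subst_term(s, phi[2]))
    if tag == "rel":
        return ("rel", phi[1], tuple(subst_term(s, a) for a in phi[2]))
    if tag == "not":
        return ("not", subst_formula(s, phi[1]))
    if tag in ("and", "or", "imp"):
        return (tag, subst_formula(s, phi[1]), subst_formula(s, phi[2]))
    # Quantifiers are deliberately not covered by this sketch.
    raise ValueError(f"clause not covered here: {phi!r}")
```

Representing each syntactic construct by a distinct tag plays the role of unique readability: every term or formula matches exactly one case, so the recursion is well defined.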

The only clauses missing from the above are those for $s (v)$ when $v \in dom (s)$ and for the quantifiers $\forall$ and $\exists$ . But these cause difficulties: some substitutions are invalid, i.e. they result in the unacceptable introduction of new variables in the scope of an existing quantifier. To resolve this problem we must specify when substitutions are valid, and this involves a slightly more sophisticated idea of substitution.

From now on, we say that a substitution $s$ is a function $s : V \to {TERM}_{L}$ with domain $V = dom (s)$ and a finite set of forbidden variables, $(s)$ . Variables in $(s)$ are not allowed in the substituted output, though they can be present in unsubstituted output.

The remaining clauses of the definition of substitution will now be given, starting with the substitution of a term for a variable.

$s (v) = v$ if $v \notin dom (s)$
$s (v)$ is the term $s (v)$ itself if $v \in dom (s)$ and no variable $w \in (s)$ occurs in $s (v)$
$s (v)$ is invalid in the remaining case, i.e. if $v \in dom (s)$ and some variable $w \in (s)$ occurs in $s (v)$
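The three cases for a variable can be sketched as follows. This is an illustration only, assuming a hypothetical encoding in which terms are nested tuples, the substitution is a dict `mapping` from variable names to terms, and the forbidden variables are a set of names; `InvalidSubstitution`, `term_vars` and `subst_var` are names invented for this sketch.

```python
class InvalidSubstitution(Exception):
    """Raised when a substituted term contains a forbidden variable."""


def term_vars(t):
    """Return the set of variable names occurring in the term t.

    Hypothetical encoding:
    ("var", name) | ("const", name) | ("fn", name, (t1, ..., tk)).
    """
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in t[2]))


def subst_var(mapping, forbidden, v):
    """Apply the substitution (mapping, forbidden) to the variable named v."""
    if v not in mapping:
        # v is outside the domain: unsubstituted output, even if v is forbidden.
        return ("var", v)
    image = mapping[v]
    clash = term_vars(image) & set(forbidden)
    if clash:
        # Some forbidden variable occurs in s(v): the substitution is invalid.
        raise InvalidSubstitution(
            f"forbidden variable(s) {sorted(clash)} occur in the image of {v}"
        )
    return image
```

Signalling invalidity with an exception is a design choice of this sketch; returning a special value would serve equally well.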

Finally, we can give the rules for substitution of formulas involving quantifiers. There are two cases. In the following, $Q$ is either $\forall$ or $\exists$ .

if $v \in dom (s)$ then $s (Q v ϕ)$ is the formula $Q v t (ϕ)$ where the substitution $t$ is defined by: $dom (t) = dom (s) ∖ \{v\}$ ; $(t) = (s) \cup \{v\}$ ; and $t (w) = s (w)$ for all $w \in dom (t)$
if $v \notin dom (s)$ then $s (Q v ϕ)$ is the formula $Q v r (ϕ)$ where the substitution $r$ is defined by: $dom (r) = dom (s)$ ; $(r) = (s) \cup \{v\}$ ; and $r (w) = s (w)$ for all $w \in dom (r)$

The point here is that a quantified variable must never be substituted, since it already has an intended meaning (ranging over all or some elements of an $L$ -structure). That is why in the first case the domain of the substitution is altered to remove the variable $v$ . Furthermore, in the scope of the quantifier, terms should not be introduced by substitution if they involve the variable being quantified over. That is why the forbidden variable set is made bigger.

In the sequel, when defining a substitution $s$ we shall normally just give the domain of $s$ and the value $s (v)$ for each variable $v \in dom (s)$ . If $(s)$ is not specified it will be assumed to be empty.
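Putting all the clauses together, including the two quantifier cases and the convention that the forbidden set is empty unless specified, gives a complete procedure. The following is a hedged sketch rather than a definitive implementation: the tuple encoding, the tags `"forall"` and `"exists"`, and all function names are assumptions made for illustration.

```python
class InvalidSubstitution(Exception):
    """Raised when a substituted term contains a forbidden variable."""


def term_vars(t):
    """Variable names in a term: ("var", n) | ("const", n) | ("fn", n, args)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in t[2]))


def subst_term(t, mapping, forbidden):
    tag = t[0]
    if tag == "var":
        if t[1] not in mapping:
            return t                      # unsubstituted output
        image = mapping[t[1]]
        if term_vars(image) & forbidden:
            raise InvalidSubstitution(f"forbidden variable in image of {t[1]}")
        return image
    if tag == "const":
        return t
    return ("fn", t[1], tuple(subst_term(a, mapping, forbidden) for a in t[2]))


def subst_formula(phi, mapping, forbidden=frozenset()):
    """Apply the substitution (mapping, forbidden) to the formula phi."""
    tag = phi[0]
    if tag in ("top", "bot"):
        return phi
    if tag == "eq":
        return ("eq", subst_term(phi[1], mapping, forbidden),
                subst_term(phi[2], mapping, forbidden))
    if tag == "rel":
        return ("rel", phi[1],
                tuple(subst_term(a, mapping, forbidden) for a in phi[2]))
    if tag == "not":
        return ("not", subst_formula(phi[1], mapping, forbidden))
    if tag in ("and", "or", "imp"):
        return (tag, subst_formula(phi[1], mapping, forbidden),
                subst_formula(phi[2], mapping, forbidden))
    if tag in ("forall", "exists"):
        v, body = phi[1], phi[2]
        # Remove v from the domain (a no-op when v is not in it) ...
        inner_mapping = {w: t for w, t in mapping.items() if w != v}
        # ... and forbid v inside the scope of the quantifier.
        inner_forbidden = frozenset(forbidden) | {v}
        return (tag, v, subst_formula(body, inner_mapping, inner_forbidden))
    raise ValueError(f"not a formula: {phi!r}")
```

Both quantifier cases are covered by the same code, since filtering $v$ out of the mapping changes nothing when $v \notin dom (s)$ . With this sketch, applying the substitution sending `x` to the variable `y` to the encoding of $\forall y R (x, y)$ raises `InvalidSubstitution`, whereas sending `x` to a different variable `z` succeeds.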