3.1 The Substitution Principle

The random variable \(Y\) above is quite arbitrary: in particular, it may be of the form \(Y = \varphi(X,Z),\) where \(Z\) is a random vector and where \(\varphi\) is a given measurable function. If this is the case, one feels tempted to ask: “conditional on \(X=x,\) does \(X\) behave as a constant?” The answer is affirmative:

Theorem 3.2 Let \(X\) and \(Z\) be random vectors of dimensions \({\mathrm{D}_X}\) and \({\mathrm{D}_Z}\) respectively. If \(\varphi\colon\mathbb{R}^{{\mathrm{D}_X}}\times\mathbb{R}^{{\mathrm{D}_Z}}\to\mathbb{R}\) is a measurable function, then there exists a Borel set \(A^*\subseteq\mathbb{R}^{\mathrm{D}_X}\) with \(\mathbf{P}[X\in A^*] = 1\) such that \[\begin{equation} \mathbf{P}[\varphi(X,Z)\in B\,|\, X=x] = \mathbf{P}[\varphi(x,Z)\in B\,|\, X=x] \end{equation}\] for all Borel sets \(B\subseteq \mathbb{R}\) and all \(x\in A^*.\)

Remark. The equality stated in the above theorem actually means that, for each Borel set \(B\subseteq \mathbb{R},\) one has \[ \int \mathbf{P}[\varphi(X,Z)\in B\,|\,X=x]\,F_X(\mathrm{d}x) = \int \mathbf{P}[Z\in B_x\,|\,X=x]\,F_X(\mathrm{d}x), \] where, for each \(x\in\mathbb{R}^{\mathrm{D}_X},\) \(B_x := \{z\in \mathbb{R}^{{\mathrm{D}_Z}}\colon\,\varphi(x,z)\in B\}.\)
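The substitution principle can be checked by hand when \(X\) and \(Z\) are discrete, since conditional probabilities then reduce to ratios of sums. The following is a minimal sketch under that assumption, not a general implementation: the joint pmf `JOINT`, the function `phi` and the helper names `lhs` and `rhs` are all hypothetical choices made for illustration, and exact rational arithmetic is used so that the two sides can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Z) on a small grid; X and Z are dependent.
JOINT = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8), (0, 2): Fraction(1, 4),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 8), (1, 2): Fraction(1, 8),
}


def phi(x, z):
    """Illustrative choice of the measurable function phi(x, z)."""
    return x * z + z


def prob_x(x):
    """Marginal probability P[X = x]."""
    return sum(p for (xx, _), p in JOINT.items() if xx == x)


def lhs(B, x):
    """P[phi(X, Z) in B | X = x], with phi evaluated at the random pair (X, Z)."""
    num = sum(p for (xx, z), p in JOINT.items() if xx == x and phi(xx, z) in B)
    return num / prob_x(x)


def rhs(B, x):
    """P[Z in B_x | X = x], where B_x = {z : phi(x, z) in B} and x is frozen."""
    B_x = {z for (_, z) in JOINT if phi(x, z) in B}
    num = sum(p for (xx, z), p in JOINT.items() if xx == x and z in B_x)
    return num / prob_x(x)


if __name__ == "__main__":
    for B in ({0}, {2}, {0, 2}, {1, 4}):
        for x in (0, 1):
            print(sorted(B), x, lhs(B, x), rhs(B, x))
```

In this finite setting every \(x\) with \(\mathbf{P}[X=x]>0\) can be placed in \(A^*\), so the two columns printed by the loop agree for each pair \((B, x)\).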

Example 3.3 With the same notation as in the above theorem, assume the function \(\varphi\) does not depend on its second argument, that is, for some \(\psi\colon\mathbb{R}^{\mathrm{D}_X}\to\mathbb{R}\) it holds that \(\varphi(x,z) = \psi(x)\) for all \(x\in\mathbb{R}^{\mathrm{D}_X}\) and \(z\in\mathbb{R}^{{\mathrm{D}_Z}}.\) Then an easy check tells us that \[\begin{equation} \mathbf{P}[\psi(X)\in B\,|\,X=x] = \begin{cases} 0,& \textsf{if } \psi(x)\notin B\\ 1,& \textsf{if } \psi(x)\in B \end{cases} \end{equation}\] for any Borel set \(B\subseteq\mathbb{R}.\) That is, \(\mathbf{P}[\psi(X)\in B\,|\,X=x] = \mathbb{I}[\psi(x)\in B].\) In particular, \(\mathbf{P}[\psi(X) = \psi(x)\,|\,X=x] = 1\) (not surprisingly). The above is also true when \(\psi\) is the identity function (valid when \({\mathrm{D}_X}=1,\) or by generalizing Theorem 3.1 to vector-valued \(Y\)).
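The indicator formula in Example 3.3 can be verified directly for a discrete \(X\). The sketch below assumes a hypothetical pmf `PMF_X` and takes \(\psi(x) = x^2\) purely for illustration; `cond_prob` computes the conditional probability from the elementary definition \(\mathbf{P}[\psi(X)\in B,\,X=x]/\mathbf{P}[X=x]\), and `indicator` evaluates \(\mathbb{I}[\psi(x)\in B]\).

```python
from fractions import Fraction

# Hypothetical pmf of X on {-1, 0, 1, 2}; any strictly positive weights would do.
PMF_X = {
    -1: Fraction(1, 4),
    0: Fraction(1, 4),
    1: Fraction(1, 3),
    2: Fraction(1, 6),
}


def psi(x):
    """Illustrative choice of psi; it is not injective, since psi(-1) == psi(1)."""
    return x * x


def cond_prob(B, x):
    """P[psi(X) in B | X = x], computed as P[psi(X) in B, X = x] / P[X = x]."""
    num = sum(p for xx, p in PMF_X.items() if xx == x and psi(xx) in B)
    return num / PMF_X[x]


def indicator(B, x):
    """I[psi(x) in B], returned as an exact rational (0 or 1)."""
    return Fraction(int(psi(x) in B))


if __name__ == "__main__":
    for B in ({1}, {0, 4}, {3}):
        for x in PMF_X:
            print(sorted(B), x, cond_prob(B, x), indicator(B, x))
```

Because the event \(\{X = x\}\) fixes the value of \(\psi(X)\), the numerator is either all of \(\mathbf{P}[X=x]\) or zero, which is why only the values 0 and 1 appear.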

Exercise 3.1 Show that, if \(Y\) is independent of \(X,\) then \(\mathbf{P}[Y\in B\,|\,X=x] = \mathbf{P}[Y\in B]\) for all admissible \(B\subseteq\mathbb{R}\) and all \(x\in\mathbb{R}^{\mathrm{D}_X},\) that is, one can take \(\pi(B,x) = \mathbf{P}[Y\in B]\) in Theorem 3.1. \(\blacksquare\)
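Before attempting the proof, the claim of Exercise 3.1 can be sanity-checked numerically in a finite setting. The sketch below is only a check of one hypothetical example and does not replace the argument: the marginals `P_X` and `P_Y` are made up, and independence is imposed by building the joint pmf as their product.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal pmfs of X and Y.
P_X = {0: Fraction(1, 3), 1: Fraction(2, 3)}
P_Y = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

# Independence is imposed by construction: the joint pmf is the product.
JOINT_XY = {(x, y): P_X[x] * P_Y[y] for x, y in product(P_X, P_Y)}


def cond_prob_y(B, x):
    """P[Y in B | X = x], computed as P[Y in B, X = x] / P[X = x]."""
    num = sum(p for (xx, y), p in JOINT_XY.items() if xx == x and y in B)
    return num / P_X[x]


def marg_prob_y(B):
    """Unconditional probability P[Y in B]."""
    return sum(p for y, p in P_Y.items() if y in B)


if __name__ == "__main__":
    for B in ({0}, {1, 2}, {0, 2}):
        for x in P_X:
            print(sorted(B), x, cond_prob_y(B, x), marg_prob_y(B))
```

The conditional probability does not vary with `x`, matching the choice \(\pi(B,x) = \mathbf{P}[Y\in B]\) suggested in the exercise.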