3.2 Conditional CDFs and Quantile functions
In Section 2 we put forth the case that univariate probability distributions are essentially the same thing as their corresponding cumulative distribution functions, which in turn are essentially the same thing as their quantile functions. Now, whenever \(Y\) is a scalar random variable (and \(X\) is a random vector), for a fixed \(x\in\mathbb{R}^{\mathrm{D}_X}\) the real-valued function \[\begin{equation} B\mapsto \mathbf{P}[Y\in B\,|\, X=x] \tag{3.3} \end{equation}\] is in fact a probability distribution on the Borel subsets of \(\mathbb{R}.\) Thus, not only does it make sense to define the corresponding cumulative distribution function, but this function fully characterizes the mapping in equation (3.3).

The conditional cumulative distribution function of \(Y\) given \(X\) is the function \((y,x)\mapsto F_{Y|X}(y|x)\) defined on \(\mathbb{R}\times\mathbb{R}^{\mathrm{D}_X}\) by \[\begin{equation} F_{Y|X}(y|x) := \mathbf{P}[Y\le y\,|\,X=x]. \end{equation}\] Importantly, the machinery introduced so far allows us to write the joint cumulative distribution function of \((Y,X)\) as⁸ \[\begin{equation} F_{Y,X}(y,x) = \int_{-\infty}^x F_{Y|X}(y|u)\,F_X(\mathrm{d}u),\quad y\in\mathbb{R},\; x\in\mathbb{R}^{\mathrm{D}_X}. \tag{3.4} \end{equation}\]
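Equation (3.4) lends itself to a quick Monte Carlo check in a simple setting. The model below is a hypothetical choice for illustration only (it is not from the text): \(X\sim N(0,1)\) and \(Y = X + \varepsilon\) with \(\varepsilon\sim N(0,1)\) independent of \(X,\) so that \(F_{Y|X}(y|x) = \Phi(y-x).\) The right-hand side of (3.4) is then \(\mathbf{E}\big[F_{Y|X}(y|X)\,\mathbf{1}\{X\le x\}\big],\) which we estimate by averaging over draws of \(X\) and compare with the empirical joint CDF.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical model: X ~ N(0,1), Y | X = x ~ N(x, 1),
# so F_{Y|X}(y|x) = Phi(y - x).
X = rng.standard_normal(n)
Y = X + rng.standard_normal(n)

y0, x0 = 0.5, 1.0

# Right-hand side of (3.4): the integral of F_{Y|X}(y0|u) dF_X(u) over u <= x0,
# estimated by the Monte Carlo average of Phi(y0 - X) * 1{X <= x0}.
rhs = np.mean(norm.cdf(y0 - X) * (X <= x0))

# Left-hand side of (3.4): the empirical joint CDF P[Y <= y0, X <= x0].
lhs = np.mean((Y <= y0) & (X <= x0))

print(abs(lhs - rhs))  # small, up to Monte Carlo error
```

Both sides are estimated from the same sample, so the discrepancy reflects only the sampling noise of the two estimators.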
Exercise 3.2 Find an expression for \(F_{Y|X}(y|x)\):
- in terms of the joint density function \(f_{Y,X},\) assuming \((Y,X)\) is absolutely continuous.
- in terms of the joint mass function \(p_{Y,X},\) assuming \((Y,X)\) is discrete. \(\blacksquare\)
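The discrete case of Exercise 3.2 can be explored numerically (this is an illustration with a made-up joint mass function, not a substitute for working the exercise): one tabulates \(p_{Y,X},\) sums out \(Y\) to get the marginal mass of \(X,\) and accumulates the joint masses up to \(y.\)

```python
import numpy as np

# Made-up joint mass function p_{Y,X} on {0,1,2} x {0,1}
# (rows indexed by y, columns by x); masses sum to one.
p = np.array([[0.10, 0.05],
              [0.20, 0.25],
              [0.15, 0.25]])

y_vals = np.array([0, 1, 2])

def F_cond(y, x):
    """Conditional CDF F_{Y|X}(y|x) built from the joint mass function."""
    p_x = p[:, x].sum()                   # marginal mass p_X(x)
    return p[y_vals <= y, x].sum() / p_x  # accumulate joint masses up to y

print(F_cond(1, 0))  # (0.10 + 0.20) / 0.45
```

As a sanity check, `F_cond` evaluated at the largest value of \(y\) returns 1 for either column, as any CDF must.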
But this is a text about quantile regression, and now comes a fundamental definition in this direction: in the same setting as above, we let the conditional quantile function of \(Y\) given \(X\) be the function \((\tau,x)\mapsto Q_{Y|X}(\tau|x)\) defined on \((0,1)\times\mathbb{R}^{\mathrm{D}_X}\) by \[\begin{equation} Q_{Y|X}(\tau|x) := \inf\{y\in\mathbb{R}\colon\, F_{Y|X}(y|x)\ge\tau\}. \end{equation}\] Wrapping up what we have seen so far, the upshot is that a probability distribution on \(\mathbb{R}\times\mathbb{R}^{\mathrm{D}_X}\) is entirely determined by the marginal distribution of \(X\) together with the conditional quantile function \(Q_{Y|X},\) since we can recover the conditional CDF \(F_{Y|X}\) from \(Q_{Y|X}\) and reconstruct the joint CDF \(F_{Y,X}\) via (3.4). From \(F_{Y,X}\) we can then compute the probability of any event of the form \([Y\in B, X\in A]\) via the formula \[ \mathbf{P}[Y\in B, X\in A] = \int_{B\times A}\,F_{Y,X}(\mathrm{d}y, \mathrm{d}x). \]
This observation is not a mere curiosity: for instance, if \({\mathrm{D}_X}=1\) it provides us with a recipe to simulate a pair \((Y,X)\) from the distribution \(\mathbf{P}_{Y,X}\) as follows:
- generate two independent pseudo-random numbers \(u_1\) and \(u_2\) from the Uniform\([0,1]\) distribution.
- set \(x = Q_X(u_1)\) and \(y = Q_{Y|X}(u_2|x).\)
It follows that the pair \((y,x)\) is a pseudo-random draw from \(\mathbf{P}_{Y,X}.\)
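The two-step recipe above can be sketched in code. The model is a hypothetical choice for illustration: \(X\sim\) Exponential\((1),\) whose quantile function is \(Q_X(u) = -\log(1-u),\) and \(Y\,|\,X=x \sim N(x,1),\) so that \(Q_{Y|X}(\tau|x) = x + \Phi^{-1}(\tau).\)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical model: X ~ Exponential(1), Y | X = x ~ N(x, 1).
Q_X = lambda u: -np.log1p(-u)                    # quantile function of Exponential(1)
Q_Y_given_X = lambda tau, x: x + norm.ppf(tau)   # conditional quantile of Y given X = x

# Step 1: two independent Uniform[0,1] pseudo-random draws per pair.
u1 = rng.uniform(size=n)
u2 = rng.uniform(size=n)

# Step 2: push them through the quantile functions.
x = Q_X(u1)
y = Q_Y_given_X(u2, x)

print(x.mean(), y.mean())  # both close to 1, since E[X] = E[Y] = 1 here
```

Each row \((y_i, x_i)\) is then a pseudo-random draw from \(\mathbf{P}_{Y,X}\) under the assumed model.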
Theorem 3.3 Assume \(Y\) is a scalar random variable, and \(X\) is a \({\mathrm{D}_X}\)-dimensional random vector. Denote by \(F_X\) the cumulative distribution function of \(X,\) and by \(Q_{Y|X}\) the conditional quantile function of \(Y\) given \(X.\) Now let \(U\) be a Uniform\([0,1]\) random variable independent of \(X,\) and define \(\tilde{Y} = Q_{Y|X}(U\,|\,X).\) Then \(\tilde{Y} \overset{\textsf{dist}}= Y.\)
Exercise 3.3 Prove Theorem 3.3. Hint: the bulk of the proof lies in computing \(\mathbf{P}[Q_{Y|X}(U|x)\le y\,|\,X=x]\) and then integrating with respect to \(F_X(\mathrm{d}x).\) For those with a measure-theoretic eye, however, the question of whether the function \(\tilde{Y}\colon\Omega\to\mathbb{R}\) is measurable may be a deterrent; fortunately, the answer to this question is affirmative: see Theorem 3 in Gowrisankaran (1972).
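Not a proof, of course, but Theorem 3.3 also admits a numerical sanity check. Under the hypothetical model \(X\sim N(0,1)\) and \(Y\,|\,X=x\sim N(x,1)\) (chosen here for illustration), we have \(Q_{Y|X}(\tau|x) = x + \Phi^{-1}(\tau)\) and, marginally, \(Y\sim N(0,2);\) the sample moments of \(\tilde{Y} = Q_{Y|X}(U|X)\) should therefore match those of \(N(0,2).\)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical model: X ~ N(0,1), Y | X = x ~ N(x, 1), hence Y ~ N(0, 2).
X = rng.standard_normal(n)
U = rng.uniform(size=n)        # Uniform[0,1], independent of X

# Y_tilde = Q_{Y|X}(U | X) = X + Phi^{-1}(U), as in Theorem 3.3
Y_tilde = X + norm.ppf(U)

print(Y_tilde.mean(), Y_tilde.var())  # close to 0 and 2, respectively
```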
Exercise 3.4 Let \(Z := g(X) + h(X)Y\) where \(g\) and \(h\) are real valued measurable functions on \(\mathbb{R}^{\mathrm{D}_X},\) \(h\) being non-negative. Show that \(Q_{Z|X}(\tau|x) = g(x) + h(x)Q_{Y|X}(\tau|x),\) for \(\tau\in(0,1).\)
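The equivariance property asserted in Exercise 3.4 can be sanity-checked by Monte Carlo. All ingredients below are made up for illustration: \(Y\,|\,X=x\sim N(0,1),\) \(g(x)=\sin(x),\) and \(h(x)=1+x^2\ge 0,\) in which case the claim reads \(Q_{Z|X}(\tau|x) = g(x) + h(x)\Phi^{-1}(\tau).\)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 500_000
tau, x = 0.75, 1.3

# Made-up ingredients: Y | X = x ~ N(0,1), g(x) = sin(x), h(x) = 1 + x^2 >= 0.
g = lambda x: np.sin(x)
h = lambda x: 1.0 + x**2

Y = rng.standard_normal(n)   # draws from the conditional law of Y given X = x
Z = g(x) + h(x) * Y          # Z = g(X) + h(X) Y, conditionally on X = x

empirical = np.quantile(Z, tau)            # empirical tau-quantile of Z given X = x
closed_form = g(x) + h(x) * norm.ppf(tau)  # g(x) + h(x) * Q_{Y|X}(tau|x)

print(empirical, closed_form)  # nearly equal, up to Monte Carlo error
```

Non-negativity of \(h\) matters: a sign flip would exchange the roles of the \(\tau\) and \(1-\tau\) quantiles.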
Exercise 3.5 For \(\tau\in(0,1),\) let \(g_\tau\colon\mathbb{R}^{\mathrm{D}_X}\to\mathbb{R}\) be a measurable function.
- Show that \[\begin{equation} \mathbf{E}\rho_\tau\big(Y - Q_{Y|X}(\tau|X)\big) \le \mathbf{E}\rho_\tau\big(Y - g_\tau(X)\big). \end{equation}\]
- Show that, if the mapping \((\tau,x)\mapsto g_\tau(x)\) is measurable, then \[\begin{equation} \int_0^1\mathbf{E}\rho_\tau\big(Y - Q_{Y|X}(\tau|X)\big)\,\mathrm{d}\tau \le \int_0^1\mathbf{E}\rho_\tau\big(Y - g_\tau(X)\big)\,\mathrm{d}\tau. \end{equation}\]
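Exercise 3.5 rests on the fact that \(q\mapsto\mathbf{E}\rho_\tau(Y-q)\) is minimized at the \(\tau\)-quantile of \(Y,\) where \(\rho_\tau(u) = u\,(\tau - \mathbf{1}\{u<0\})\) is the check function. A quick numerical illustration under the hypothetical choice \(Y\sim N(0,1)\) with \(\tau = 0.8\): the empirical risk over a grid of candidate values \(q\) should bottom out near \(\Phi^{-1}(0.8).\)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
Y = rng.standard_normal(200_000)   # hypothetical Y ~ N(0,1)
tau = 0.8

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

# Empirical risk q -> mean of rho_tau(Y - q), evaluated on a grid of q's.
grid = np.linspace(-2.0, 2.0, 401)
risk = np.array([check_loss(Y - q, tau).mean() for q in grid])

q_hat = grid[risk.argmin()]
print(q_hat, norm.ppf(tau))  # argmin close to the 0.8-quantile of N(0,1)
```

The same one-dimensional picture, applied conditionally on \(X = x\) and then averaged over \(x,\) is the route to both parts of the exercise.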
⁸ Of course, the integral in (3.4) can be simplified in the cases where \(X\) is discrete or continuous, as we discussed earlier.