5 Quantile regression in time series

In the time series literature, the established pedagogy recommends that dynamic models be introduced in terms of stochastic difference equations of the form \[\begin{equation} Y_t = h(\textsf{past})+\textsf{innovation}_t,\qquad\textsf{for all admissible}\, t \tag{5.1} \end{equation}\] for a suitable function \(h,\) usually accompanied by the assumption that the innovations do not depend “too much” on the past, that they are in some sense “well behaved,” and so on. This approach has the merit of conveying intuitive, mechanistic content, as it (sort of) tells us how the random variable \(Y_t\) comes into being, given what has happened in the past.16 Nevertheless, a stochastic difference equation, just like an ordinary differential equation, may or may not admit a solution, which in turn may or may not be stationary, ergodic, etc. Existence of a solution here means that one can find a probability law of a stochastic process satisfying the required stochastic difference equations.

Exercise 5.1 (AR(1) model) Let \((U_t\colon t\in\mathbb{Z})\) be a doubly infinite sequence of iid Gaussian random variables, and let \(\alpha\) be a real number with \(|\alpha|<1.\) Define \[ Y_t := \lim_{p\to \infty}\sum_{\ell=0}^{p} \alpha^\ell U_{t-\ell},\qquad t\in\mathbb{Z}. \]

  1. Show that the above limit exists almost surely (use the Borel–Cantelli lemma), so that the \(Y_t\)’s are well defined.
  2. Show that the sequence \((Y_t\colon\,t\in\mathbb{Z})\) satisfies the stochastic difference equations \[\begin{equation} Y_t = \alpha Y_{t-1} + U_t,\qquad t\in\mathbb{Z}. \end{equation}\]
  3. Find the conditional quantile function of \(Y_t\) given \(Y_{t-1}.\) Is it the same as the conditional quantile function of \(Y_t\) given \((Y_{t-1},\dots,Y_{t-p})\) for arbitrary \(p\ge1?\) (A simulation sketch follows this exercise.) \(\blacksquare\)
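
The construction above lends itself to a quick numerical experiment. Below is a minimal simulation sketch, assuming standard Gaussian innovations (the exercise does not pin down the innovation variance): it builds the process recursively and compares the empirical conditional \(\tau\)-quantile of \(Y_t,\) among time points where \(Y_{t-1}\) falls near a fixed value \(y_0,\) with the Gaussian formula \(\alpha y_0 + \Phi^{-1}(\tau).\)

```python
import numpy as np
from scipy.stats import norm

# Simulation sketch for Exercise 5.1, assuming standard Gaussian innovations
# (the exercise does not pin down the innovation variance).
rng = np.random.default_rng(0)
alpha, tau = 0.7, 0.9
n, burn_in = 100_000, 500

# Build the series recursively; after the burn-in, the effect of the arbitrary
# initial condition y[0] = 0 is negligible, so the sample path is numerically
# indistinguishable from the stationary solution sum_{l >= 0} alpha^l U_{t-l}.
u = rng.standard_normal(n + burn_in)
y = np.zeros(n + burn_in)
for t in range(1, n + burn_in):
    y[t] = alpha * y[t - 1] + u[t]
y = y[burn_in:]

# Empirical conditional tau-quantile of Y_t among pairs with Y_{t-1} near y0,
# versus the Gaussian conditional quantile alpha * y0 + Phi^{-1}(tau).
y0 = 1.0
near_y0 = np.abs(y[:-1] - y0) < 0.05
print("empirical:  ", np.quantile(y[1:][near_y0], tau))
print("theoretical:", alpha * y0 + norm.ppf(tau))
```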

The idea of presenting time series models in terms of stochastic difference equations with “innovation terms” is reminiscent of the ubiquity of “error terms” in regression models. Well, while “observable” random variables work as a formal model for observable, quantitative phenomena, one may argue that error terms are a fiction (they are, after all, commonly preceded by the adjective “unobservable”), and writing them down in an equation certainly does not confer any material substance on them.17 An alternative route is to present regression models in terms of the conditional distribution of the response given the covariates; as we have seen in the preceding sections, this is precisely how quantile regression models proceed, and fortunately this approach carries over to the time series framework.

Before taking the first step in this direction, however, we need to come up with a precise definition of the idea of “conditioning on the entire past.” To that end, suppose that \(\big((Y_t,Z_t)\colon\,t\in\mathbb{Z}\big)\) is a stochastic process18 where, for all \(t,\) \(Y_t\) is scalar valued and \(Z_t\) is \(\mathbb{R}^{\mathrm{D}_Z}\) valued. For each \(t\in \mathbb{Z},\) let \(\mathfrak{F}_t\) denote the \(\sigma\)-field generated by the sequence of random vectors \(\big((Y_s,Z_s)\colon s\le t\big).\) For short, we shall write \[ \mathbf{P}[E\,|\,\mathfrak{F}_{t-1}] := \mathbf{P}[E\,|\,(Y_{s},Z_{s}),\,s < t],\quad t\in \mathbb{Z}, \] for each event \(E\) determined by the \(Y\)’s and \(Z\)’s. Of special interest here are events \(E\) of the form \(E = [Y_t\le y, Z_t\le z]\) with \(y\in\mathbb{R}\) and \(z\in \mathbb{R}^{\mathrm{D}_Z}\) (vector inequalities are interpreted component-wise), as the Kolmogorov Extension Theorem tells us that the probability law of the process \(\big((Y_t,Z_t)\colon\,t\in\mathbb{Z}\big)\) can be reconstructed from the sequence of conditional distributions \[ \mathbf{P}[Y_t\le\cdot, Z_t\le \cdot\,|\,\mathfrak{F}_{t-1}],\quad t\in\mathbb{Z}. \]
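
As a concrete illustration (under the standard-Gaussian assumption from before), take the AR(1) process of Exercise 5.1, with no covariates \(Z_t\) in sight. Since \(U_t\) is independent of \(\mathfrak{F}_{t-1},\) conditioning on the entire past collapses to conditioning on \(Y_{t-1}\) alone: \[ \mathbf{P}[Y_t\le y\,|\,\mathfrak{F}_{t-1}] = \mathbf{P}[\alpha Y_{t-1}+U_t\le y\,|\,\mathfrak{F}_{t-1}] = \Phi(y-\alpha Y_{t-1}),\quad y\in\mathbb{R}, \] where \(\Phi\) denotes the standard Gaussian distribution function. This is the Markov property at work, and it is the reason behind the answer to the last item of Exercise 5.1.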

The notation introduced above calls for a bit of caution, as the conditional probabilities \(\mathbf{P}[E\,|\,\mathfrak{F}_t]\) are random variables: they are analogous to our previous \(\mathbf{P}[E\,|\,X],\) and not to the more down-to-earth \(\mathbf{P}[E\,|\, X=x].\) This has the drawback of requiring us to introduce additional notation for the conditional quantile functions. We shall write, for \(\tau\in(0,1),\) \[\begin{equation} Q_{Y_t}(\tau\,|\,\mathfrak{F}_{s}) := \inf\{y\in\mathbb{R}\colon\, \mathbf{P}[Y_t\le y\,|\,\mathfrak{F}_{s}] \ge \tau\},\quad t,s\in \mathbb{Z}. \end{equation}\] The careful reader will notice the random set on the right-hand side above, which may raise concerns about measurability. The thing is, notational extravagances aside, what is really going on here is that, at the outset, we are working with a sequence of regular conditional distributions of the form \[ \pi_t(E,v) = \mathbf{P}[E\,|\,V_t = v],\quad t\in\mathbb{Z}, \] where \(V_t\) is the stochastic process \(\big((Y_s,Z_s)\colon s\le t\big)\) and where \(v\) runs through admissible values of \(V_t,\) that is, \(v\in\operatorname{support}(V_t).\) In this case we have \(\mathbf{P}[E\,|\,\mathfrak{F}_{t}] = \pi_t(E, V_t),\) and then \(Q_{Y_t}(\tau\,|\,\mathfrak{F}_{t-1})\) is simply the composition of the function \(v\mapsto Q_{Y_t|V_{t-1}}(\tau\,|\,v)\) with the “infinite dimensional random vector” \(V_{t-1}.\) That is, we have \[ Q_{Y_t}(\tau\,|\,\mathfrak{F}_{t-1})(\omega) = Q_{Y_t|V_{t-1}}(\tau\,|\,V_{t-1}(\omega)), \] and so on. I hope this illustrates how clumsy notation can get, and here is one of those places I mentioned in the foreword where ambiguity gets in the way.
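
In the AR(1) illustration above, this composition is fully explicit: inverting the conditional distribution function \(y\mapsto\Phi(y-\alpha Y_{t-1})\) yields \[ Q_{Y_t}(\tau\,|\,\mathfrak{F}_{t-1})(\omega) = \alpha Y_{t-1}(\omega) + \Phi^{-1}(\tau),\quad \tau\in(0,1), \] that is, the deterministic map \(v\mapsto \alpha v + \Phi^{-1}(\tau)\) composed with the random variable \(Y_{t-1};\) no measurability concerns arise in this case.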

Exercise 5.2 (Vector autoregression) Let \((W_t\colon\,t\in\mathbb{Z})\) be a stochastic process with state space \(\mathbb{R}^{{\mathrm{D}_Z}+1}.\) For each \(t,\) write \(W_t = (Y_t\quad Z_t^\prime)^\prime.\) Assume that, for each \(t,\) conditional on \(\mathfrak{F}_{t-1},\) the random vector \(W_t\) has a multivariate Gaussian distribution with mean \(\mathbf{A}W_{t-1},\) for some (non-random) \(({\mathrm{D}_Z}+1)\times({\mathrm{D}_Z}+1)\) matrix \(\mathbf{A},\) and covariance matrix \(\boldsymbol{\Sigma}_1.\) The process \((W_t\colon\,t\in\mathbb{Z})\) is called a vector autoregressive process of order 1.

For each \(t,\) find the conditional quantile function \(\tau \mapsto Q_{Y_t}(\tau\,|\,\mathfrak{F}_{t-1}).\) \(\blacksquare\)
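
As with Exercise 5.1, a simulation can serve as a sanity check for your answer. The sketch below uses hypothetical parameter values (\(\mathrm{D}_Z = 2,\) so \(W_t\) is three-dimensional): it draws repeatedly from the conditional Gaussian distribution of \(W_t\) given a fixed value of \(W_{t-1}\) and reports the empirical \(\tau\)-quantile of the first coordinate \(Y_t,\) alongside the Gaussian closed form against which the derivation can be checked.

```python
import numpy as np
from scipy.stats import norm

# Simulation sketch for Exercise 5.2 with hypothetical parameter values
# (D_Z = 2, so W_t is three-dimensional).
rng = np.random.default_rng(1)

A = np.array([[0.5, 0.1, 0.0],       # the (D_Z + 1) x (D_Z + 1) matrix A
              [0.0, 0.4, 0.2],
              [0.1, 0.0, 0.3]])
Sigma1 = np.array([[1.0, 0.3, 0.0],  # the conditional covariance matrix
                   [0.3, 1.0, 0.2],
                   [0.0, 0.2, 1.0]])
L = np.linalg.cholesky(Sigma1)       # lower triangular, L @ L.T == Sigma1

tau, n_draws = 0.75, 200_000
w_prev = np.array([0.5, -1.0, 2.0])  # a fixed value of W_{t-1}

# Draw from W_t | F_{t-1} ~ N(A w_prev, Sigma1); Y_t is the first coordinate.
draws = (A @ w_prev)[:, None] + L @ rng.standard_normal((3, n_draws))
print("empirical:  ", np.quantile(draws[0], tau))

# Gaussian closed form for the tau-quantile of the first coordinate, against
# which the answer derived in the exercise can be checked.
print("theoretical:", (A @ w_prev)[0] + np.sqrt(Sigma1[0, 0]) * norm.ppf(tau))
```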

  16. Most of us, I believe, think about probability models in terms of an iterative mechanism, akin to simulation, and this is what an equation like (5.1) encodes.↩︎

  17. Actually, it is just a matter of writing, e.g., \(Y = X^\prime\beta + (Y - X^\prime\beta),\) so it really all boils down to distributional assumptions about the random variable \(Y - X^\prime\beta.\)↩︎

  18. Considering \(\mathbb{Z}\) as the index set is a matter of convenience; the times could run through the non-negative integers as well, requiring some minor adjustments in notation.↩︎