Summary: Recent research has emphasized that permanent changes in the innovation variance (caused by structural shifts or an integrated volatility process) lead to size distortions in conventional unit root tests. It has been shown how these size distortions can be resolved using the wild bootstrap. In this paper, we first derive the asymptotic power envelope for the unit root testing problem when the non‐stationary volatility process is known. Next, we show that under suitable conditions, adaptation with respect to the volatility process is possible, in the sense that non‐parametric estimation of the volatility process leads to the same asymptotic power envelope. Implementation of the resulting test involves cross‐validation and the wild bootstrap. A Monte Carlo experiment shows that the asymptotic results are reflected in finite sample properties, and an empirical analysis of real exchange rates illustrates the applicability of the proposed procedures.
Adaptive testing; Non‐parametric estimation; Power envelope; Unit root; Wild bootstrap
Over the past decade, a large amount of research has been devoted to the effect of heteroscedasticity on unit root tests. When the heteroscedasticity follows a stationary GARCH‐type specification, such that the unconditional variance is well‐defined and constant, then the invariance principle guarantees that the usual Dickey–Fuller (DF) tests remain valid asymptotically. This was illustrated using Monte Carlo simulations by Kim and Schmidt ([
In empirical applications, the assumption that the variation in volatility effectively averages out over the relevant sample is often questionable. On the one hand, in applications involving daily financial prices (interest rates, exchange rates), the degree of mean reversion in the volatility is usually so weak that the volatility process shows persistent deviations from its mean over the relevant time‐span (often ten years or less). On the other hand, in applications involving macro‐economic time series observed at a lower frequency but over a longer time‐span, one often finds level shifts in the volatility, instead of volatility clustering. Intermediate cases (slowly mean‐reverting volatility with changing means) may also occur.
In the presence of such persistent variation in volatility, the invariance principle cannot be expected to apply, such that the null distribution of unit root tests will be affected. The resulting size distortions have been investigated by Boswijk ([
There is no guarantee that these approaches, which deliver tests with correct asymptotic size, will also yield tests with the highest possible power. In particular, in the presence of heteroscedasticity we can expect higher power from a method that gives the highest weight to observations with the lowest volatility, and this is not the case for the tests discussed above.
In this paper, we address this issue by deriving the asymptotic power envelope; that is, the maximum possible power against a sequence of local alternatives to the unit root, for a given and known realization of the volatility process. This allows us to evaluate the power loss of various tests, and to construct a class of admissible tests, that have a point of tangency with the envelope. For the empirically more relevant case where the volatility function is not observed, we show that under suitable conditions, adaptation with respect to the volatility process is possible, in the sense that non‐parametric estimation of the volatility process leads to the same asymptotic power envelope. Similar adaptivity results were obtained for stable (auto‐)regressions by Hansen ([
The plan of the paper is as follows. In Section , we present the model, and we obtain some preliminary asymptotic results. In Section , we characterize the power envelope (conditional on the volatility process) and we illustrate the power gain possibilities in four examples. In Section , we discuss non‐parametric estimation of the volatility process, and its use in the construction of a class of adaptive tests; we also discuss various bootstrap implementations of the tests. In Section , we extend the test to allow for deterministic components and short‐run dynamics. The finite‐sample behaviour of these tests is investigated in a Monte Carlo experiment in Section ; simulation results are reported in the online Appendix. In Section , we discuss an empirical application, and in Section we provide some concluding remarks. Proofs are given in Appendix .
Throughout the paper, we use the notation and to denote convergence in probability and convergence in distribution, respectively, for sequences of random variables or vectors. We let denote weak convergence in D[0, 1], the space of right‐continuous functions with finite left limits, under the Skorohod metric, and denotes weak convergence in probability; see Giné and Zinn ([
Consider the heteroscedastic first‐order autoregressive modelwhere , and where is the filtration generated by . Extensions to models with deterministic components and higher‐order autoregressions are considered in Section . The null hypothesis of interest is the unit root hypothesis .
We assume that is a deterministic sequence, such that is a martingale difference sequence, with conditional (and unconditional) variance and hence volatility . The theory developed here can be extended to allow for an exogenous stochastic volatility process, in which case the results would hold conditionally on this process. Furthermore, the analysis could be extended to the case where is a stationary GARCH‐type process, but this will not be considered explicitly.
If the variation in averages out over subsamples (i.e., if as , for all ), then under additional technical conditions, satisfies an invariance principle. This implies that conventional DF tests for a unit root will be asymptotically valid, even though more powerful tests can be obtained by explicitly modelling the volatility process; see, e.g., Seo ([
In contrast, in this paper we are concerned with cases where the volatility displays permanent shifts or trends. We do not assume a particular parametric specification, but instead require the following.
In the model –: (a) defining for and , as , in D[0, 1] where is strictly positive; (b) the sequence satisfies an invariance principle, i.e., as ,where is a standard Brownian motion.
Assumption (a) preserves persistent changes in the volatility as . It is closely related to the assumption , considered, inter alia, by Cavaliere ([
The invariance principle for would follow if the martingale difference assumption is strengthened to an independent and identically distributed (i.i.d.) assumption, or augmented with a (conditional) Lindeberg condition.
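To fix ideas, a process satisfying the model of this section under a local alternative ρ = 1 + c/n, with deterministic volatility σt = σ(t/n), can be simulated as follows. This is a minimal sketch: the function name and interface are ours, and the innovations ηt are taken i.i.d. N(0, 1) as in the Gaussian benchmark case.

```python
import numpy as np

def simulate_heteroscedastic_ar1(n, sigma_fn, c=0.0, seed=0):
    """Simulate X_t = (1 + c/n) X_{t-1} + sigma(t/n) * eta_t, X_0 = 0,
    with eta_t i.i.d. N(0, 1). `sigma_fn` maps u in (0, 1] to sigma(u)
    (an illustrative interface, not the paper's notation)."""
    rng = np.random.default_rng(seed)
    rho = 1.0 + c / n
    eta = rng.standard_normal(n)
    sigma = sigma_fn(np.arange(1, n + 1) / n)  # sigma_t = sigma(t/n)
    x = np.zeros(n + 1)
    for t in range(1, n + 1):
        x[t] = rho * x[t - 1] + sigma[t - 1] * eta[t - 1]
    return x[1:], sigma

# Example: unit root (c = 0) with a late upward level shift in volatility.
x, sig = simulate_heteroscedastic_ar1(500, lambda u: np.where(u < 0.9, 1.0, 5.0))
```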
The following lemma characterizes the limiting behaviour of the process under a near‐integrated parameter sequence , with a fixed constant.
Consider the model – under Assumption . Under , and as ,jointly with , where satisfies
All proofs are given in the Appendix. The lemma has direct consequences for the asymptotic properties of the conventional DF tests. In particular, let DF
The distribution of the expression on the right‐hand side of does not coincide with the usual DF null distribution, unless (constant), such that . Thus, the DF tests are not robust to persistent variation in , leading to a non‐constant . As shown by Cavaliere and Taylor ([
The focus of this paper is not on solving the size distortions caused by non‐stationary volatility, but on developing tests with higher power. In the next section, we derive the maximum possible asymptotic power of any test of the unit root null against local alternatives, for the (infeasible) case where is observed, and is an i.i.d. N(0, 1) sequence. Next, we show that the asymptotic volatility function is consistently estimable, and this can be used to construct a family of point optimal tests that reach the Gaussian asymptotic power envelope. The resulting tests are adaptive, in the sense that there is no loss of asymptotic efficiency or power caused by estimating .
In this section, we derive the Gaussian asymptotic power envelope for the unit root hypothesis in the model –, with known. This power envelope will then be compared to the asymptotic power of the DF test, and of the likelihood ratio (LR) test based on known (which, in practice, when is not observed, will be infeasible).
Under Gaussianity, the log‐likelihood is given byDefine the log‐likelihood ratio of relative to :wherewith .
The envelope is based on the power of the Neyman–Pearson test in a limit experiment that provides an asymptotic approximation of the model in a neighbourhood of the null hypothesis. This limit experiment is locally asymptotically quadratic (LAQ); see, e.g. Jeganathan ([
Consider the model –, under Assumption . LetUnder , we have as ,and hence, for fixed ,
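To make the structure of the theorem concrete: under Gaussianity with known σt, the log‐likelihood ratio of ρ = 1 + c̄/n against ρ = 1 is exactly quadratic in c̄. The following display is our reconstruction from the surrounding definitions; the labels $S_n, I_n, S, I$ for the score and information quantities are ours, and the original displays should be consulted for the paper's exact notation:
$$
\Lambda_n(\bar c) \;=\; \bar c\, S_n \;-\; \tfrac12\, \bar c^{\,2} I_n,
\qquad
S_n = \frac{1}{n}\sum_{t=1}^{n} \frac{X_{t-1}\,\Delta X_t}{\sigma_t^2},
\qquad
I_n = \frac{1}{n^2}\sum_{t=1}^{n} \frac{X_{t-1}^{2}}{\sigma_t^2},
$$
and under the null hypothesis ($c = 0$), jointly,
$$
(S_n, I_n) \;\xrightarrow{d}\; (S, I)
= \left( \int_0^1 \frac{X(u)}{\sigma(u)}\, \mathrm{d}W(u),\;
\int_0^1 \frac{X(u)^2}{\sigma(u)^2}\, \mathrm{d}u \right),
\qquad X(u) = \int_0^u \sigma(s)\,\mathrm{d}W(s),
$$
so that $\Lambda_n(\bar c) \xrightarrow{d} \bar c\, S - \tfrac12 \bar c^{\,2} I$ for fixed $\bar c$.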
As usual in a likelihood analysis, the asymptotic distributions and hence power functions derived below will continue to hold when the Gaussianity assumption is violated, as long as satisfies an invariance principle. However, the optimality claims in the results to follow critically depend on its validity: if the actual density differs from the Gaussian density, then more powerful tests can be constructed from a likelihood function derived from the actual density. In an earlier working paper version of this paper, we considered the power envelope for an arbitrary but known density ; see Boswijk ([
A similar remark applies to the possible presence of conditional heteroscedasticity in , i.e., when , where follows a stationary GARCH specification with . Under suitable additional conditions, the asymptotic properties derived below will continue to hold, but more powerful testing procedures can be obtained from a likelihood analysis of the model under a parametric specification for , analogous to Ling and Li ([
Note that in , c refers to the true data‐generating process (the probability measure with ), whereas characterizes a chosen local alternative. Therefore, setting gives the asymptotic null distribution of the Neyman–Pearson test statistic for against , whereas setting gives the asymptotic distribution under local alternatives, and hence can be used to evaluate local power.
An interpretation of Theorem is that the model is locally approximated, for , by the limit experiment , where and are the relevant Borel σ‐fields, and where is the distribution of , with log‐likelihood ratio . An interpretation of this limit experiment is that we observe , generated by , to make inference on c. The limit experiment is a curved exponential model with one parameter c and two sufficient statistics . Note that the information is not ancillary, since its distribution under depends on c. This implies that the log‐likelihood ratio is not locally asymptotically mixed normal (LAMN), but locally asymptotically Brownian functional (LABF); see Jeganathan ([
The power of the Neyman–Pearson test for against , which rejects for large values of , defines the asymptotic power envelope (conditional on ) for testing against . We evaluate this power envelope by Monte Carlo simulation, for , and for four different volatility functions, inspired by the simulations in Cavaliere ([
1. σ1(u)=1[0,0.9)(u)+5·1[0.9,1](u): a level shift in the volatility from 1 to 5 at time t=(9/10)n (i.e., late in the sample).
2. σ2(u)=1[0,0.1)(u)+5·1[0.1,1](u): an early level shift from 1 to 5.
3. σ3(u)=exp((1/2)H(u)), where dH(u)=−10H(u)du+10dB(u), with B(·) a standard Brownian motion independent of W(·): a realization of a stochastic volatility process with a low degree of mean‐reversion and a fairly high volatility‐of‐volatility.
4. σ4(u)=exp((1/2)H(u)), where H(u)=5B(u): a realization of a stochastic volatility process with no mean‐reversion and a lower volatility‐of‐volatility.
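Discretized versions of these four paths on the grid u = t/n can be generated as follows. This is a sketch: the simple Euler scheme for the Ornstein–Uhlenbeck process H in σ3, and all function names, are our illustrative choices.

```python
import numpy as np

def volatility_paths(n, seed=0):
    """Generate discretized versions of sigma_1..sigma_4 on u = t/n.
    The OU path H in sigma_3 is approximated by an Euler scheme."""
    rng = np.random.default_rng(seed)
    u = np.arange(1, n + 1) / n
    s1 = np.where(u < 0.9, 1.0, 5.0)           # late level shift from 1 to 5
    s2 = np.where(u < 0.1, 1.0, 5.0)           # early level shift from 1 to 5
    db = rng.standard_normal(n) / np.sqrt(n)   # Brownian increments dB(u)
    h3 = np.zeros(n)                           # dH = -10 H du + 10 dB
    for t in range(1, n):
        h3[t] = h3[t - 1] - 10.0 * h3[t - 1] / n + 10.0 * db[t]
    s3 = np.exp(0.5 * h3)
    h4 = 5.0 * np.cumsum(db)                   # H(u) = 5 B(u)
    s4 = np.exp(0.5 * h4)
    return s1, s2, s3, s4
```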
Figure depicts the volatility paths – that we use in our simulations. The two stochastic volatilities σ
The power envelopes are based on Monte Carlo simulations of under , with . The simulations of under Q
From these figures, we observe that the power of the LR test is close to the envelope, but not equal to it, especially in case of stochastic volatility. Furthermore, in most cases (with the exception of σ
In the previous section, we have studied the power of procedures that assume that is known and observed. In practice, this is not the case, and will have to be estimated. One option is to specify a parametric model for , such as a GARCH model, and then to consider maximum likelihood estimation of that model. However, it is desirable to have a testing procedure that is not too sensitive to deviations from such an assumption, and that will also work well, for example, in the case of (gradual) changes in the level of the volatility.
Therefore, inspired by Hansen ([
To prove uniform consistency of , we need the following assumptions.
is continuous on [0, 1] (i.e., ).
For some , .
Consider the model –, under Assumptions , and . If for some a and b satisfying and , then both under and under , as ,
Uniform consistency of the kernel estimator requires continuity of (Assumption ). Hence we exclude level shifts in , as considered in some of the examples in the previous section. Such level shifts can be approximated arbitrarily well by a smooth transition function, such as the logistic function; but it is expected that the non‐parametric estimator will perform relatively badly around the change point. As noted later in Remark , it is possible to develop the main result of this section allowing for a finite number of discontinuities in , bypassing Lemma . This is not considered explicitly, for simplicity.
The lemma involves a trade‐off between the existence of moments and the window width; for distributions with relatively fat tails, such that extreme observations occur with some frequency, more smoothing is needed to obtain consistency.
A simple example of an implementation of the kernel estimator is that of an exponentially weighted (double‐sided) moving average. Take , where the coefficient 5 is chosen such that . Then, letting , we have , and , such that . For , this corresponds to a smoothing parameter of . As the sample size increases, would have to converge to 1 to guarantee consistency, at the rate determined by Lemma .
In practice, the window width can be chosen by a leave‐one‐out cross‐validation procedure, which involves minimizingover N; see Wasserman ([
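The leave‐one‐out criterion can be sketched as follows: for each candidate window width, every squared residual is predicted from all other squared residuals, and the width minimizing the sum of squared prediction errors is selected. The exponential kernel and the grid of candidates are our illustrative choices.

```python
import numpy as np

def cv_window_width(resid, candidates):
    """Leave-one-out cross-validation for the kernel window width N:
    predict eps_t^2 from all s != t and pick the N minimizing the
    sum of squared prediction errors."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    e2 = resid**2
    t = np.arange(n)
    best_N, best_cv = None, np.inf
    for N in candidates:
        w = np.exp(-np.abs(t[:, None] - t[None, :]) / N)
        np.fill_diagonal(w, 0.0)            # leave observation t out
        pred = (w @ e2) / w.sum(axis=1)     # sigma_hat_{(-t)}(t/n)^2
        cv = np.sum((e2 - pred) ** 2)
        if cv < best_cv:
            best_N, best_cv = N, cv
    return best_N
```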
The consistency of the kernel estimator can be used for constructing tests for a unit root as follows. First, we can estimate the asymptotic score and information byThese equations can be used to construct approximate point‐optimal test statistics , or a one‐sided LR statistic . Consistency of is considered in the next theorem.
Consider the model –, under Assumptions , and . Under , we have as ,
This theorem implies that we may asymptotically recover (in a weak convergence sense) the likelihood ratio by non‐parametric estimation of the infinite‐dimensional nuisance parameter , meaning that adaptive estimation and testing is possible. Note that the result applies both when (i.e., under the null hypothesis) and when . A formal analysis of adaptivity involves finding a so‐called least‐favourable parametric submodel , where is a parameter vector characterizing ; see Chapter 25 of Van der Vaart ([
Xu and Phillips ([
Theorem implies that the limiting null distribution of adaptive tests will be affected by nuisance parameters, just like the DF test – see . This means that critical values for such tests cannot be tabulated, and should be generated on a case‐by‐case basis. One possibility is to simulate the asymptotic null distribution of , with replaced by . If the simulated continuous‐time processes involved in the limiting distributions are discretized on a grid , thenwhere , with an i.i.d. N(0, 1) sequence. From this it can be shown that simulating the asymptotic null distribution based on can be interpreted as a ‘volatility bootstrap’ procedure, where the bootstrap is based on the model under the null and uses bootstrap errors .
An alternative is given by the wild bootstrap; see Liu ([
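The wild bootstrap scheme can be sketched generically as follows: bootstrap errors ε*t = ε̂t z*t, with z*t i.i.d. N(0, 1), preserve the volatility pattern of the residuals, and bootstrap samples are generated under the null by taking partial sums. The `statistic_fn` interface below is a hypothetical simplification, not the paper's notation.

```python
import numpy as np

def wild_bootstrap_pvalue(stat, resid, statistic_fn, B=999, seed=0):
    """Wild bootstrap p-value for a left-tailed test statistic.
    `stat` is the statistic computed on the data, `resid` are residuals
    obtained under the null, and `statistic_fn` maps a sample to a
    statistic (user-supplied)."""
    rng = np.random.default_rng(seed)
    resid = np.asarray(resid, dtype=float)
    boot = np.empty(B)
    for b in range(B):
        eps_star = resid * rng.standard_normal(len(resid))  # eps*_t = eps_hat_t z*_t
        boot[b] = statistic_fn(np.cumsum(eps_star))         # sample built under the null
    # left-tailed: small statistic values are evidence against the unit root
    return (1 + np.sum(boot <= stat)) / (B + 1)
```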
Let , the adaptive one‐sided LR statistic, and let , its bootstrap version (either the volatility bootstrap or the wild bootstrap).
Consider the model –, under Assumptions , and . Under both and , we have as ,so that
The theorem implies that the (wild or volatility) bootstrap is asymptotically valid, in the sense that the bootstrap p‐value, defined as , is asymptotically uniformly distributed on the unit interval under the null hypothesis. Because has the same limiting null distribution under with as under , it follows that the bootstrap test has the same asymptotic power function as the test based on the true but unknown critical values.
The first‐order autoregression with a known zero mean and a zero starting value is too restrictive in many empirical applications. Therefore, in this section we discuss how the adaptive test derived in the previous section can be extended in these directions.
Suppose, first, that the observed data arewhere satisfies the same assumptions as in the previous sections, and is a vector of deterministic functions of t, with μ being a conformable parameter vector. As usual in the unit root literature, we focus on the cases (constant mean μ) and (linear trend ). Maintaining the assumption that , this implies that the point‐optimal invariant test for against , with observed , follows as a straightforward extension of the analysis of Elliott et al. ([
The asymptotic distribution of is given next.
Consider the model defined by – and , under Assumptions , and . (a) If (constant mean), then under , as ,where are as in Theorem ; (b) if (linear trend), then under , as ,where , and
Analogously to Elliott et al. ([
In general, the first‐order autoregressive model for might be misspecified. Therefore, the testing procedures can be extended to higher‐order dynamics as follows. Suppose that we maintain for the observed time series , but now is replaced bywith , where L is the lag operator and has all roots outside the unit circle. This corresponds to the AR(p) modelwhere the errors are still assumed to satisfy – and Assumption .
Generalizing the approach of Elliott et al. ([
Adaptive wild bootstrap unit root LR test in AR(p) model
Step 1. Estimate σt based on OLS residuals ε̂t in an AR(p−1) for ΔYt (i.e., an AR(p) for Yt with a unit root imposed), including a constant if dt=(
Step 2. Construct Xtd(c¯)=Yt−μ̂(c¯)′dt, with μ̂(c¯) as in , with σt replaced by σ̂t.
Step 3. Calculate the t‐statistic LR̂n for δ=0 in , with σt replaced by σ̂t.
Step 4. Construct bootstrap errors εt∗=ε̂tzt∗, and generate bootstrap observations Yt∗ from the same estimated AR(p) model under the unit root restriction as in Step 1 (using starting values (Y1∗,…,Yp∗)=(Y1,…,Yp)); construct bootstrap statistics LR̂n∗ by applying Steps 2 and 3 to Yt∗, and use these to calculate the bootstrap p‐value.
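Under strong simplifying assumptions, Steps 1–4 might be sketched as follows for the special case p = 1 with a constant mean. The window width N, the quasi‐differencing constant c̄, the number of bootstrap replications B and the handling of the initial observation's weight are all our illustrative choices, not the paper's recommendations.

```python
import numpy as np

def adaptive_lr_test(y, N=20, c_bar=-7.0, B=199, seed=0):
    """Sketch of the adaptive wild bootstrap LR test for p = 1, d_t = 1."""
    y = np.asarray(y, dtype=float)
    n = len(y)

    def sigma_hat(e):                              # kernel volatility estimate
        t = np.arange(len(e))
        w = np.exp(-np.abs(t[:, None] - t[None, :]) / N)
        return np.sqrt((w @ e**2) / w.sum(axis=1))

    def lr_stat(yy):
        e0 = np.diff(yy)                           # Step 1: residuals under the null
        s = sigma_hat(e0)
        # Step 2: weighted GLS demeaning via quasi-differencing at rho = 1 + c_bar/n;
        # the weight for the initial observation is an ad hoc simplification.
        rho = 1.0 + c_bar / n
        zy = np.concatenate(([yy[0]], yy[1:] - rho * yy[:-1]))
        zd = np.concatenate(([1.0], np.full(n - 1, 1.0 - rho)))
        wgt = np.concatenate(([s[0]], s))
        mu = np.sum(zd * zy / wgt**2) / np.sum(zd**2 / wgt**2)
        x = yy - mu
        # Step 3: weighted DF t-statistic for delta = 0 in
        # (Delta x_t)/s_t = delta * x_{t-1}/s_t + error
        u, v = np.diff(x) / s, x[:-1] / s
        delta = np.sum(u * v) / np.sum(v**2)
        r = u - delta * v
        se = np.sqrt(np.sum(r**2) / (len(u) - 1) / np.sum(v**2))
        return delta / se, e0

    stat, e0 = lr_stat(y)
    rng = np.random.default_rng(seed)
    boot = np.empty(B)
    for b in range(B):                             # Step 4: wild bootstrap under the null
        eps_star = e0 * rng.standard_normal(n - 1)
        y_star = np.concatenate(([y[0]], y[0] + np.cumsum(eps_star)))
        boot[b], _ = lr_stat(y_star)
    pval = (1 + np.sum(boot <= stat)) / (B + 1)
    return stat, pval
```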
In practice, the first step will have to be preceded by a lag order selection procedure, based on information criteria, residual autocorrelation tests, or a combination of both. In the next theorem, we assume that this has led to a selected autoregressive order p that is (larger than or) equal to the true order.
Consider the model defined by , , and , under Assumptions , and . Then, under , as ,where are as in Theorem , and are as in Theorem . Under both , and , we have as ,so that bootstrap p‐values are asymptotically uniformly distributed on [0, 1] under .
In this section, we compare the finite‐sample behaviour of the adaptive one‐sided LR test for a unit root with that of the DF‐type t‐test in a Monte Carlo experiment. We consider four data‐generating processes, corresponding to the volatility functions – considered in Section ; i.e., for any sample size n, we set for . The innovations are generated as i.i.d. N(0, 1). The lag length is fixed at 1, both in the data‐generating process and in the test regressions. Both tests allow for an unknown mean, removed by GLS demeaning with – so that DF is in fact the DF–GLS
The volatility smoother uses restricted residuals in all the scenarios. For both test statistics, we consider two approaches to obtain their critical values. For the adaptive LR statistic, we compare the wild bootstrap implementation to a test based on simulated asymptotic critical values, replacing the unknown with the estimate (i.e., the volatility bootstrap). For the DF statistic, we compare the wild bootstrap implementation with a version using the standard asymptotic critical values (which are valid only in case of unconditional homoscedasticity). All results are based on 10,000 Monte Carlo replications and 999 bootstrap replications (which is also the number of replications used for simulating the asymptotic p‐value of the LR test). Both the wild bootstrap DF test and the wild bootstrap LR test use the restricted OLS residuals to generate the bootstrap samples.
The simulation results are provided in the online Appendix. They show that the wild bootstrap is an effective way of correcting size distortions. As the sample size grows, the adaptive LR test realizes an increasing part of the power gain potential over the DF test, as predicted by the power envelope. Although our assumptions do not allow for it, the adaptive LR test has good power in case of a discontinuous change in the volatility process. This supports our claim that such abrupt changes in volatility are not a problem for the test in practice.
In this section, we apply the adaptive LR test developed in this paper to study the validity of the purchasing power parity (PPP) hypothesis in 16 EU countries. The PPP hypothesis states that in a well‐functioning world market, foreign currencies should have the same purchasing power in the long run, which implies that the real exchange rate should exhibit stationary, mean‐reverting properties. Macro‐economists often use classical unit root tests with real exchange rate data to test the PPP hypothesis, where a rejection of the unit root hypothesis is used as evidence to support the PPP hypothesis; see, e.g., Froot and Rogoff ([
For 16 EU member countries, we analyse their real effective exchange rate (REER), i.e., the average of the bilateral real exchange rates of their trading partners, weighted by the respective trade shares of each partner. The use of REERs provides a test of the multi‐country version of PPP; rejection of the unit root hypothesis based on REERs can be viewed as stronger evidence for PPP to hold than tests using bilateral rates; see Bahmani‐Oskooee et al. ([
The data are depicted in Figure . For almost all countries, we do not observe a clear pattern of strong mean reversion: the REERs can deviate persistently from their mean for many years. This illustrates the common empirical difficulty of finding strong evidence supporting PPP.
The results are based on AR(p) models with an unknown mean for each of the 16 time series. The autoregressive orders p have been chosen to obtain residuals with no significant autocorrelation.
Figure displays the non‐parametric kernel estimate of the volatility (monthly percentage standard deviation) of the 16 real exchange rate series, where we have used the exponential kernel , with the window width N selected by the leave‐one‐out cross‐validation method. The volatility estimator is based on the OLS residuals from the selected AR(p) model under the unit root restriction, in agreement with Algorithm .
It is observed that the volatility of most series decreases gradually, although with different patterns, over the sample period considered; exceptions are Norway, Switzerland and the UK, which display an increase in volatility around the financial crisis. The volatility paths suggest that the constant volatility assumption in classical unit root tests might be violated, and it seems reasonable to entertain the possibility of non‐stationary volatility.
Table reports wild bootstrap p‐values of the DF and adaptive LR tests. For comparison, the asymptotic p‐values of the DF test (valid only in case of unconditional homoscedasticity) are also provided; the difference with the bootstrap p‐values is small in most cases. We observe that the p‐values of the adaptive LR test may be both lower and higher than those of the DF test, and are often in the same order of magnitude. Most remarkable is the result for Italy: using a 5% significance level, the unit root hypothesis is not rejected based on the DF test, but application of the adaptive LR test leads to a clear rejection, with a p‐value of around 4%. To a lesser extent, similar conclusions apply to Belgium and the UK. In summary, the example illustrates that the use of the more powerful adaptive LR test can indeed provide stronger evidence for the PPP hypothesis than using conventional tests, which confirms its useful role in the empirical analysis of macro‐economic data.
Asymptotic and wild bootstrap p‐values of DF and adaptive LR test
Country          DF asy p‐value   DF WB p‐value   LR WB p‐value    p
Austria              0.697            0.718           0.799       12
Belgium              0.086            0.101           0.060        1
Denmark              0.050            0.062           0.059       12
Finland              0.046            0.034           0.299       11
France               0.140            0.141           0.252       10
Germany              0.060            0.056           0.069       10
Greece               0.160            0.166           0.219       12
Ireland              0.197            0.208           0.338       13
Italy                0.215            0.229           0.038        1
Netherlands          0.009            0.005           0.006       12
Norway               0.020            0.019           0.022        1
Portugal             0.097            0.123           0.161       12
Spain                0.519            0.492           0.799        1
Sweden               0.757            0.785           0.752        1
Switzerland          0.604            0.628           0.440        1
United Kingdom       0.131            0.130           0.062       12
Note: The table reports asymptotic and wild bootstrap p‐values for the DF–GLSμ test, and wild bootstrap p‐values for the adaptive LR test, for the real effective exchange rates of 16 EU countries; p refers to the autoregressive order used in the test regressions.
In this paper, we have demonstrated that substantial power differences of unit root tests can arise in models with non‐stationary volatility. We have shown that it is possible to construct a class of tests that have asymptotic power close to the envelope. The tests are based on non‐parametric volatility estimation, and therefore do not require very specific assumptions on the parametric form of the volatility process. This approach can be extended in various directions.
First, for uniform consistency of the non‐parametric volatility estimator, the volatility process needs to have continuous sample paths. This means that sudden level shifts are excluded. In practice, one might argue that these can be approximated arbitrarily well by smooth transition functions; furthermore, as shown by Xu and Phillips ([
Secondly, the analysis is based on a deterministic volatility sequence. The asymptotic theory and the bootstrap method could be extended to allow for an exogenous volatility process, as long as it is independent of the Brownian motion defined from the standardized innovations. Hence, this excludes non‐stationary volatility processes with statistical leverage effects, which are relevant in applications to equity prices. Note that our approach does not allow for stationary (GARCH‐type) conditional heteroscedasticity, with or without leverage effects. It would be of interest to extend the analysis in this direction, leading to further possibilities for higher power.
The analysis in this paper can be extended to the multivariate case. The non‐parametric volatility estimator has a very obvious extension to an estimator of a time‐varying variance matrix; as long as the same kernel and window width is used for all variances and covariances, the resulting estimator will be positive semi‐definite by construction. This can be used to construct more efficient cointegration tests or adaptive estimators of cointegrating vectors in the presence of non‐stationary volatility. We are currently exploring this possibility; see Boswijk and Zu ([
We would like to thank the Co‐Editor, Michael Jansson, and an anonymous referee for helpful comments and suggestions. Comments on earlier versions from Rob Taylor, Oliver Linton, Peter Phillips, Anders Rahbek, Ulrich Müller and Barbara Rossi are also gratefully acknowledged.
Online Appendix
Replication files
Figure: Realization of volatility processes σ1–σ4.
Figure: Asymptotic power envelope and power curves for σ1–σ4.
Figure: Log‐real effective exchange rates for 16 EU countries, 1973:1–2015:12.
Figure: Non‐parametric volatility estimate (percentage) for 16 EU countries, 1973:1–2015:12.
By H. Peter Boswijk and Yang Zu