- The Economist

Sunday, 14 May 2017

When conducting regression analysis, the relationship between two variables is not always clear-cut. For instance, covariates (or control variables) can often affect the relationship between a dependent and an independent variable. In this context, ANCOVA (analysis of covariance) can support (or indeed undermine, as is the case here) the importance of a control variable in explaining such a relationship.
Let us take the example of ice-cream consumption (Y) and income (X). At its most basic, an increase in income will lead to an increase in ice-cream consumption, all else being equal. However, let us throw a covariate into the mix: temperature. Should temperatures rise, it stands to reason that more income will be spent on ice cream. We therefore expect our control variable to have a significant impact on the relationship between income and ice-cream consumption, and including it should improve the accuracy of our model.
In this particular example, we use ANCOVA to examine the relationship between earnings and stock returns, with volatility as our covariate (all variables are in percentage terms). The earnings of a company are likely to have a significant impact on stock returns; however, volatility is hypothesised to have an impact on stock returns as well.
First, we check whether our data follow a normal distribution by applying the Shapiro-Wilk test.
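As a minimal sketch of this step, the Shapiro-Wilk test can be applied to each variable with base R's `shapiro.test`. The data frame `stocks`, its column names, and the simulated values are assumptions made for illustration; they are not the dataset used in this post.

```r
# Hypothetical data: column names and simulated values are assumptions,
# not the dataset analysed in this post.
set.seed(42)
stocks <- data.frame(
  group      = factor(rep(c("small", "medium", "large"), each = 30)),
  earnings   = rexp(90, rate = 0.2),            # deliberately skewed
  volatility = rnorm(90, mean = 20, sd = 5)
)
stocks$returns <- 2 + 0.5 * stocks$earnings + rnorm(90, sd = 3)

# Shapiro-Wilk test per variable; the null hypothesis is normality.
numeric_cols <- c("returns", "earnings", "volatility")
p_values <- sapply(stocks[numeric_cols], function(x) shapiro.test(x)$p.value)
print(round(p_values, 4))

# TRUE where normality is rejected at the 5% level
print(p_values < 0.05)
```

A small p-value is evidence against normality; the same call can be run within each group of stocks if group-level normality is of interest.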
Most of our Shapiro-Wilk statistics are significant at the 5% level, which suggests that our data are not normally distributed. With this in mind, we check whether the variance of the covariate and of the independent variable (earnings) is comparable across our three main groups of stocks (small-cap, medium-cap, and large-cap) by running Levene’s Test for Homogeneity of Variance, which is less sensitive to departures from normality than Bartlett’s test.
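The idea behind Levene's test can be sketched in base R as a one-way ANOVA on the absolute deviations of each observation from its group median (the Brown-Forsythe variant). In practice `leveneTest` from the `car` package is the usual choice; the helper `levene_test` and the simulated `stocks` data below are hypothetical and serve only to illustrate the calculation.

```r
# Hypothetical data: three groups with deliberately unequal spread.
set.seed(1)
stocks <- data.frame(
  group    = factor(rep(c("small", "medium", "large"), each = 30)),
  earnings = c(rnorm(30, 5, 1), rnorm(30, 5, 1.2), rnorm(30, 5, 4))
)

# Median-centred Levene test: ANOVA on absolute deviations from group medians.
# levene_test is an illustrative helper, not a function from any package.
levene_test <- function(y, g) {
  abs_dev <- abs(y - ave(y, g, FUN = median))
  anova(lm(abs_dev ~ g))
}

result <- levene_test(stocks$earnings, stocks$group)
print(result)

# p-value for the group term; a small value suggests unequal variances
p_value <- result[["Pr(>F)"]][1]
print(p_value)
```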
Next, we run a more formal ANOVA across our three groups. The order in which we list our independent variables matters here, because the summary.aov function in R reports sequential (Type I) sums of squares, so each term is assessed only after the terms listed before it. When volatility is listed first, the covariate is statistically significant, and earnings is also statistically significant in the model. However, when earnings is listed first, volatility ceases to be significant.
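The order dependence can be reproduced with a small simulation. The sketch below assumes two correlated predictors and a response that depends only on earnings; the variable names and coefficients are illustrative assumptions rather than the post's data or model.

```r
# Hypothetical data: volatility and earnings are correlated, and returns
# depend on earnings only.
set.seed(7)
n <- 90
volatility <- rnorm(n, mean = 20, sd = 5)
earnings   <- 0.8 * volatility + rnorm(n, sd = 3)
returns    <- 1 + 0.6 * earnings + rnorm(n, sd = 4)
stocks <- data.frame(returns, earnings, volatility)

# aov() uses sequential sums of squares, so term order changes the table.
fit_vol_first  <- aov(returns ~ volatility + earnings, data = stocks)
fit_earn_first <- aov(returns ~ earnings + volatility, data = stocks)

tab_vol_first  <- summary(fit_vol_first)[[1]]
tab_earn_first <- summary(fit_earn_first)[[1]]
print(tab_vol_first)
print(tab_earn_first)

# Sum of squares credited to volatility when it enters first vs. last
ss_vol_first <- tab_vol_first[["Sum Sq"]][1]
ss_vol_last  <- tab_earn_first[["Sum Sq"]][2]
print(c(first = ss_vol_first, last = ss_vol_last))
```

When volatility enters first it is credited with the variation it shares with earnings; when it enters last it is credited only with what earnings leaves unexplained.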
To resolve the conflicting results, we need to use Type II or Type III sums of squares. Both evaluate the effects of our predictors without taking their order into account. The main difference is that Type II tests each main effect after adjusting for the other main effects but not for any interaction involving it, which makes it the better choice when we do not expect an interaction between main effects. When an interaction is present, a Type III test, which adjusts each effect for every other term in the model including interactions, will deliver more meaningful results.
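For a model with main effects only, order-independent tests can be sketched in base R with `drop1`, which tests each term after adjusting for all the others; without an interaction term, these coincide with the Type II and Type III results. The `Anova` function in the `car` package (with `type = "II"` or `type = "III"`) is the more common route. The data below are simulated and the variable names are assumptions.

```r
# Hypothetical data: correlated predictors, returns driven by earnings only.
set.seed(7)
n <- 90
volatility <- rnorm(n, mean = 20, sd = 5)
earnings   <- 0.8 * volatility + rnorm(n, sd = 3)
returns    <- 1 + 0.6 * earnings + rnorm(n, sd = 4)
stocks <- data.frame(returns, earnings, volatility)

fit_a <- lm(returns ~ volatility + earnings, data = stocks)
fit_b <- lm(returns ~ earnings + volatility, data = stocks)

# Each term is tested after adjusting for the other, regardless of order.
type2_a <- drop1(fit_a, test = "F")
type2_b <- drop1(fit_b, test = "F")
print(type2_a)
print(type2_b)

# The adjusted sum of squares for volatility equals its sequential sum of
# squares when it is entered last in aov().
seq_tab <- summary(aov(returns ~ earnings + volatility, data = stocks))[[1]]
print(c(adjusted = type2_a["volatility", "Sum of Sq"],
        sequential_last = seq_tab[["Sum Sq"]][2]))
```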
We now see that when we run a Type II and a Type III ANOVA, our covariate volatility is not statistically significant in either case. This implies that the covariate may not be the best variable for explaining the relationship between earnings and stock returns. To test this, two OLS models, one containing earnings only (Model 1) and the other containing volatility and earnings (Model 2), were used to predict actual stock returns. Model 1 was the better predictor approximately 48% of the time, while Model 2 was the better predictor approximately 51% of the time.
The ANOVA Type II and Type III tests showed the p-value for volatility to be insignificant, which suggests that the covariate does little to explain the relationship between earnings and stock returns. Had the covariate been significant, Model 2 would likely have been a noticeably better predictor of stock returns. In this regard, an ANCOVA analysis allows us to assess the suitability of a “control” variable in enhancing a model describing an existing relationship. In this case, the analysis gave no evidence that volatility is an important control variable in analyzing the relationship between earnings and stock returns.
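A comparison of this kind can be sketched by fitting both models and counting how often each one lands closer to the actual value. The post does not say whether predictions were made in-sample or on held-out data, so the holdout split below is an assumption, as are the simulated data and all variable names.

```r
# Hypothetical data: volatility has no true effect on returns here.
set.seed(123)
n <- 200
stocks <- data.frame(
  earnings   = rnorm(n, mean = 5, sd = 2),
  volatility = rnorm(n, mean = 20, sd = 5)
)
stocks$returns <- 1 + 0.6 * stocks$earnings + rnorm(n, sd = 3)

# Assumed split: fit on the first 150 rows, evaluate on the last 50.
train   <- stocks[1:150, ]
holdout <- stocks[151:200, ]

model1 <- lm(returns ~ earnings, data = train)               # earnings only
model2 <- lm(returns ~ earnings + volatility, data = train)  # plus covariate

# Absolute prediction error of each model on the holdout rows
err1 <- abs(holdout$returns - predict(model1, newdata = holdout))
err2 <- abs(holdout$returns - predict(model2, newdata = holdout))

# Share of observations on which each model is the closer predictor
share_model1 <- mean(err1 < err2)
share_model2 <- mean(err2 < err1)
print(c(model1 = share_model1, model2 = share_model2))
```

Shares close to an even split indicate that adding the covariate changes the predictions very little.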