QUESTION 1

Consider the regression model that a researcher will estimate with T=26 annual observations:

$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$ (1)

List the ordinary least squares (OLS) regression estimator assumptions and, one by one, discuss: (i) the implications of violating each of them, (ii) how to detect a violation of each of them, (iii) how to remedy each of them.

(8 marks)

There is one salient event in the sample period and the researcher wants to investigate whether two specific parameters, $\beta_3$ and $\beta_4$, in model (1) above are stable throughout the sample period. For this purpose, he conducts a joint hypothesis test for parameter stability (known as a structural break test or Chow breakpoint test) by considering the dummy variable $D_t$ that takes the value 0 in all the years before the event and 1 in the years after the event. Write down: (i) the re-formulated model that allows conducting such a joint hypothesis test, (ii) the marginal effect of $x_{3t}$ and $x_{4t}$ on $y_t$ before and after the event according to the reformulated model, (iii) the null hypothesis and the alternative hypothesis for the test; (iv) the name and expression of the test statistic, and the probability distribution that it follows.
1. marks)
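A hedged sketch of one way to set this up, using slope dummies on the two regressors of interest. Here $y_t$ is the dependent variable, $x_{2t}, x_{3t}, x_{4t}$ the regressors, $u_t$ the error term and $D_t$ the event dummy. The labels $\gamma_3$ and $\gamma_4$ are introduced here for illustration, $RRSS$ denotes the residual sum of squares of model (1), and $URSS$ that of the augmented model. The formulation assumes that only the two slopes, not the intercept, are allowed to shift.

```latex
\begin{align*}
y_t &= \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t}
       + \gamma_3 D_t x_{3t} + \gamma_4 D_t x_{4t} + u_t \\
\frac{\partial y_t}{\partial x_{3t}} &= \beta_3 + \gamma_3 D_t, \qquad
\frac{\partial y_t}{\partial x_{4t}} = \beta_4 + \gamma_4 D_t \\
H_0 &: \gamma_3 = 0 \text{ and } \gamma_4 = 0, \qquad
H_1 : \gamma_3 \neq 0 \text{ or } \gamma_4 \neq 0 \\
F &= \frac{(RRSS - URSS)/m}{URSS/(T-k)} \sim F(m,\, T-k) \text{ under } H_0,
\qquad m = 2,\; T - k = 26 - 6 = 20
\end{align*}
```

Before the event ($D_t = 0$) the marginal effects reduce to $\beta_3$ and $\beta_4$; after it ($D_t = 1$) they become $\beta_3 + \gamma_3$ and $\beta_4 + \gamma_4$.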

Inspired by finance theory, a researcher conjectures that the marginal effect of $x_{2t}$ on $y_t$ is not constant but instead depends on the level of $x_{4t}$. How should she reformulate the model equation in order to be able to conduct a test for this effect? Write down: (i) the re-formulated regression model; (ii) the expression of the marginal effect of $x_{2t}$ on $y_t$ according to the reformulated model; (iii) the null hypothesis and the alternative hypothesis for this test so that a test rejection implies that the researcher was right in her conjecture; (iv) the name and expression of the test statistic, and its probability distribution.
1. marks)
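One common way to capture such a dependence is an interaction term. The sketch below assumes a linear interaction between $x_{2t}$ and $x_{4t}$, where $y_t$ is the dependent variable and $u_t$ the error term. The coefficient label $\beta_5$ is introduced here for illustration and is not part of the original question.

```latex
\begin{align*}
y_t &= \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t}
       + \beta_5 x_{2t} x_{4t} + u_t \\
\frac{\partial y_t}{\partial x_{2t}} &= \beta_2 + \beta_5 x_{4t} \\
H_0 &: \beta_5 = 0, \qquad H_1 : \beta_5 \neq 0 \\
t &= \frac{\hat{\beta}_5}{SE(\hat{\beta}_5)} \sim t(T-k) \text{ under } H_0,
\qquad T - k = 26 - 5 = 21
\end{align*}
```

Rejecting $H_0$ indicates that the marginal effect of $x_{2t}$ varies with $x_{4t}$, which is the researcher's conjecture.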

Suppose that a researcher estimates the regression model (1) by OLS and obtains the following results: (i) the $R^2$ of the model is 78%, (ii) the p-value of the F-statistic for the overall significance of the model is 0.03%; (iii) the p-values of the individual significance t-tests for each of the slopes are 0.23%, 0.64% and 0.12%, respectively. Is there any contradiction in these results? If yes, what is this contradiction likely to be stemming from?

(10 marks)

Suppose that in model (1) above the dependent variable $y_t$ is a binary variable that takes the value 1 with probability $p_t$ and 0 with probability $1 - p_t$. What are the potential problems of estimating the model by OLS? How can the model be reformulated in order to circumvent those problems? Write down the expression of the reformulated model and explain how it circumvents those problems. What is the name of the reformulated model? Can the OLS estimation method be used to obtain the parameter values of your reformulated model? Give an example of such a regression model (clearly indicate what the variables $y_t$, $x_{2t}$, $x_{3t}$ and $x_{4t}$ are in your example).

Suppose that the model parameter estimates are $\beta_1 = 0$, $\beta_2 = 0.5$, $\beta_3 = 3$, $\beta_4 = 0$. Calculate according to the reformulated model what is the expected value of $y_t$ conditional on these values for the independent variables: $x_{2t} = 1$, $x_{3t} = 1$ and $x_{4t} = 1$; that is, calculate the fitted value $\hat{y}_t = E(y_t \mid x_{2t}, x_{3t}, x_{4t})$. What is the interpretation of the resulting fitted value $\hat{y}_t$?

(10 marks)
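The fitted-value calculation can be sketched numerically. The question does not say which reformulated model is intended, so the snippet below assumes a logit link and also shows a probit link for comparison; both choices are assumptions, and the variable names are illustrative.

```python
import math

# Estimates and regressor values as stated in the question.
beta = [0.0, 0.5, 3.0, 0.0]   # beta_1 (intercept), beta_2, beta_3, beta_4
x = [1.0, 1.0, 1.0, 1.0]      # constant, x_2t, x_3t, x_4t

# Linear index z_t = beta_1 + beta_2*x_2t + beta_3*x_3t + beta_4*x_4t
z = sum(b * v for b, v in zip(beta, x))

# Assumption: logit link, P(y_t = 1 | x) = 1 / (1 + exp(-z_t))
p_logit = 1.0 / (1.0 + math.exp(-z))

# Alternative assumption: probit link, P(y_t = 1 | x) = Phi(z_t)
p_probit = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"z = {z:.2f}, logit = {p_logit:.4f}, probit = {p_probit:.4f}")
```

Under the logit assumption the index is 3.5 and the fitted value is roughly 0.97. It is read as the estimated probability that the dependent variable equals 1 given the regressor values, and unlike an OLS fitted value it always lies between 0 and 1.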

QUESTION 2 [25 marks]

A researcher is interested in assessing the risk exposure of a particular managed portfolio of UK stocks. She has collected a sample of monthly excess returns on the portfolio (PORTFOLIO) as well as monthly returns on four UK risk factors covering the same time period: the market factor (MARKET) is the excess return on the FTSE 100 index, the size factor (SIZE) is the return on a portfolio of small UK stocks less the return on a portfolio of large UK stocks (where size is measured using market capitalization), the value factor (VALUE) is the return on a portfolio of high value UK stocks less the return on a portfolio of low value UK stocks (where value is measured using the book-to-market ratio), and the momentum (MOMENTUM) factor is the return on UK stocks that have performed strongly over the last year less the return on a portfolio of UK stocks that have performed poorly. All the returns are expressed in decimals. The researcher estimates the multiple linear regression model

$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$ (2)

where $y_t$ = PORTFOLIO; $x_{1t}$ = MARKET; $x_{2t}$ = SIZE; $x_{3t}$ = MOMENTUM; $x_{4t}$ = VALUE.

Using the OLS estimation method she obtains the estimation output shown in EXHIBIT 1.

EXHIBIT 1

Dependent Variable: PORTFOLIO

Method: Least Squares; Sample: 1986M10 2016M12;

363 observations

Variable	Coefficient	Std. Error	t-Statistic	Prob.
C	0.005379	0.001898	2.834781	0.0048
MARKET	0.669732	0.039124	17.11797	0.0000
SIZE	0.833417	0.053539	15.56653	0.0000
MOMENTUM	0.095220	0.050715	1.877544	0.0613
VALUE	0.104161	0.055652	1.871643	0.0621

R-squared	0.602309	Mean dependent var	0.010881
Adjusted R-squared	0.597866	S.D. dependent var	0.054137
S.E. of regression	0.034331	Akaike info criterion	-3.891877
Sum squared resid	(not reported)	Schwarz criterion	-3.838235
Log likelihood	711.3757	Hannan-Quinn criter.	-3.870555
F-statistic	135.5493	Durbin-Watson stat	0.784676
Prob(F-statistic)	0.000000

What is the residual sum of squares of the model? What does it represent?
1. marks)

With the estimation results provided in the above exhibit, can we say anything about whether there is correlation in the residuals? Which specific estimation result gives us information about residual autocorrelation? Can we say anything about the order of the autocorrelation? Can we tell if the correlation is positive or negative?
1. marks)
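Two of the quantities asked about EXHIBIT 1 can be backed out from the reported statistics. The sketch below assumes the usual definition S.E. of regression = sqrt(RSS / (T - k)) and the large-sample approximation DW ≈ 2(1 - rho), where rho is the first-order residual autocorrelation. The variable names are illustrative.

```python
# Figures read from EXHIBIT 1.
n_obs = 363
n_params = 5              # intercept plus four factors
se_regression = 0.034331
durbin_watson = 0.784676

# S.E. of regression = sqrt(RSS / (T - k)), so RSS = s^2 * (T - k)
rss = se_regression ** 2 * (n_obs - n_params)

# Large-sample approximation: DW is roughly 2 * (1 - rho_hat)
rho_hat = 1.0 - durbin_watson / 2.0

print(f"RSS is about {rss:.4f}")
print(f"Implied first-order autocorrelation is about {rho_hat:.3f}")
```

The residual sum of squares comes out at roughly 0.42. A Durbin-Watson value well below 2 points to positive first-order autocorrelation, and the statistic by itself says nothing about higher orders.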

In order to find if the model above improves upon the simple CAPM model, the researcher specifies a regression excluding the SIZE, MOMENTUM and VALUE factors. The OLS estimation results are shown in EXHIBIT 2 below. She also estimates the same model augmented with the MOMENTUM factor as shown in EXHIBIT 3 below. Construct a table that provides 3 measures or criteria to compare the models in EXHIBITS 1, 2 and 3. Which model shall the researcher select on the basis of this comparison exercise?
1. marks)

EXHIBIT 2

Dependent Variable: PORTFOLIO

Method: Least Squares;

Sample: 1986M10 2016M12; Included observations: 363

Variable	Coefficient	Std. Error	t-Statistic	Prob.
C	0.007468	0.002347	3.181679	0.0016
MARKET	0.663586	0.050026	13.26481	0.0000

R-squared	0.327691	Mean dependent var	0.010881
Adjusted R-squared	0.325828	S.D. dependent var	0.054137
S.E. of regression	0.044451	Akaike info criterion	-3.383362
Sum squared resid	0.713298	Schwarz criterion	-3.361905
Log likelihood	616.0802	Hannan-Quinn criter.	-3.374833
F-statistic	175.9553	Durbin-Watson stat	1.398702
Prob(F-statistic)	0.000000

EXHIBIT 3

Dependent Variable: PORTFOLIO

Method: Least Squares

Sample: 1980M10 2010M12; Included observations: 363

Variable	Coefficient	Std. Error	t-Statistic	Prob.
C	0.007722	0.002397	3.222133	0.0014
MARKET	0.659565	0.050631	13.02694	0.0000
MOMENTUM	-0.030513	0.056758	-0.537591	0.5912

R-squared	0.328230	Mean dependent var	0.010881
Adjusted R-squared	0.324498	S.D. dependent var	0.054137
S.E. of regression	0.044495	Akaike info criterion	-3.378655
Sum squared resid	0.712726	Schwarz criterion	-3.346470
Log likelihood	616.2259	Hannan-Quinn criter.	-3.365862
F-statistic	87.94887	Durbin-Watson stat	1.395260
Prob(F-statistic)	0.000000
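For the model comparison across EXHIBITS 1, 2 and 3, one possible choice of three criteria is the adjusted R-squared (higher is better) together with the Akaike and Schwarz information criteria (lower is better). The sketch below tabulates the reported values and assumes all three models were estimated on the same sample. The dictionary layout and key names are illustrative.

```python
# Values copied from the three exhibits.
models = {
    "EXHIBIT 1": {"adj_r2": 0.597866, "aic": -3.891877, "sbic": -3.838235},
    "EXHIBIT 2": {"adj_r2": 0.325828, "aic": -3.383362, "sbic": -3.361905},
    "EXHIBIT 3": {"adj_r2": 0.324498, "aic": -3.378655, "sbic": -3.346470},
}

print(f"{'Model':<10} {'Adj. R2':>9} {'AIC':>10} {'SBIC':>10}")
for name, m in models.items():
    print(f"{name:<10} {m['adj_r2']:>9.4f} {m['aic']:>10.4f} {m['sbic']:>10.4f}")

# Higher adjusted R-squared is preferred; lower AIC and SBIC are preferred.
best_adj_r2 = max(models, key=lambda k: models[k]["adj_r2"])
best_aic = min(models, key=lambda k: models[k]["aic"])
best_sbic = min(models, key=lambda k: models[k]["sbic"])
print("Preferred by adj. R2 / AIC / SBIC:", best_adj_r2, best_aic, best_sbic)
```

On these figures all three criteria point to the four-factor model of EXHIBIT 1, while adding MOMENTUM alone to the CAPM regression (EXHIBIT 3) worsens every criterion relative to EXHIBIT 2.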

An analyst conjectures that the portfolio has a risk exposure to SIZE that is less than 1. Test this conjecture by carrying out a statistical test in the context of the model reported in EXHIBIT 1. Write down the null and alternative hypotheses of the test so that a rejection represents evidence supportive of the analyst’s conjecture. What is the name of the test statistic and the name of the probability distribution that it follows? Draw a graph showing the sample value of the test statistic, the 1% critical value of the test, and its p-value. What shall the analyst conclude?
1. marks)
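The one-sided test on the SIZE exposure can be sketched as follows, with H0: the SIZE coefficient equals 1 against H1: it is less than 1. With 363 - 5 = 358 degrees of freedom the t distribution is very close to the standard normal, so the snippet uses the normal distribution as an approximation for the critical value and p-value. That substitution is an assumption made to stay within the Python standard library.

```python
from statistics import NormalDist

# SIZE coefficient and standard error from EXHIBIT 1.
beta_size = 0.833417
se_size = 0.053539

# t-ratio for H0: beta_SIZE = 1 against H1: beta_SIZE < 1
t_stat = (beta_size - 1.0) / se_size

# Assumption: standard normal used in place of t(358).
crit_1pct = NormalDist().inv_cdf(0.01)   # left-tail 1% critical value
p_value = NormalDist().cdf(t_stat)       # left-tail p-value

reject = t_stat < crit_1pct
print(f"t = {t_stat:.3f}, 1% critical value = {crit_1pct:.3f}, p = {p_value:.4f}")
print("Reject H0 at 1%:", reject)
```

The test statistic is about -3.11, which lies to the left of the 1% critical value of about -2.33, so the null is rejected in favour of an exposure below 1.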

In the context of the model in EXHIBIT 3, which probability distribution does the test statistic for the overall significance of the model follow? What is the name of the test statistic? Write down the null and alternative hypotheses of the test in two different ways (in terms of the model parameters; in terms of the coefficient of determination). What is the 1% critical value?

1. marks)
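The two ways of writing the overall-significance test are linked by the identity F = [R² / (k - 1)] / [(1 - R²) / (T - k)]. As a check, the sketch below recomputes the F-statistic of EXHIBIT 3 from its reported R-squared; the variable names are illustrative.

```python
# Figures read from EXHIBIT 3.
r_squared = 0.328230
n_obs = 363
n_params = 3   # intercept, MARKET, MOMENTUM

# F-statistic for H0: all slope coefficients are zero (equivalently R^2 = 0)
f_stat = (r_squared / (n_params - 1)) / ((1.0 - r_squared) / (n_obs - n_params))

print(f"F({n_params - 1}, {n_obs - n_params}) = {f_stat:.2f}")
```

The result agrees with the reported F-statistic of 87.94887 up to rounding of the R-squared, and the relevant reference distribution is F(2, 360).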

QUESTION 3 [25 marks]

The market performance of environmentally certified and green commercial buildings and the rent premium they command is a topical research area in finance. In an empirical study investigating the relationship between office rent and the green characteristics of the building (in central London) where the office is located, the following OLS estimation results are obtained for a cross-sectional regression model:

ln(RENT) = 3.981 + 0.073 RATING - 0.038 VACANCY - 0.013 SIZE    (3)
          (0.08)  (0.02)         (0.01)          (0.005)

with the numbers in parentheses representing OLS standard errors, the sum of squared residuals (RSS) is 4.980, and the number of central London buildings sampled for the study is N=220.

RENT is the achieved rent in £s per square foot reported in CoStar (a leading commercial real estate information company);

RATING is the rating CoStar assigns to buildings for their sustainability and green characteristics; the scale of the rating scheme runs from 1 to 5, with 1 representing poor environmental specification and 5 excellent green features.

VACANCY is the vacancy rate in the building, in percent.

SIZE is the net leasable area in thousands of square feet.

(a) Comment on the impact of the explanatory variables on rents in terms of signs and statistical significance. Do the signs make sense intuitively? What is the estimate for the rent premium of green buildings according to this study?

(6 marks)
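For part (a), the t-ratios and the implied rent premium can be computed directly from the reported coefficients and standard errors. Because the dependent variable is in logs, the sketch assumes the usual exact conversion 100·(exp(b) - 1) for the percentage effect of a one-point increase in RATING. The names used are illustrative.

```python
import math

# (coefficient, standard error) pairs from equation (3).
coefs = {
    "RATING": (0.073, 0.02),
    "VACANCY": (-0.038, 0.01),
    "SIZE": (-0.013, 0.005),
}

# t-ratio for H0: coefficient = 0
t_ratios = {name: b / se for name, (b, se) in coefs.items()}
for name, t in t_ratios.items():
    print(f"{name:<8} t = {t:.2f}")

# Log-linear model: a one-point rise in RATING changes RENT by 100*(exp(b) - 1) percent.
premium_pct = 100.0 * (math.exp(coefs["RATING"][0]) - 1.0)
print(f"Rent premium per rating point is about {premium_pct:.1f}%")
```

All three t-ratios exceed 2 in absolute value, and the premium works out at roughly 7.6% per rating point, close to the 7.3% given by reading the coefficient directly.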

(b) What is heteroskedasticity? What would be the intuition as to the source of heteroskedasticity in the above model? Would a p-value of 0.11 (11%) obtained for the White test suggest that the model suffers from heteroskedasticity? Write down the null and alternative hypotheses of this White test in the context of the above model. Which probability distribution does the White test statistic follow in the present context? Draw the probability distribution graph and indicate in the graph where the White test statistic obtained from the sample would be positioned, the 5% critical value (obtained from the Cambridge statistical tables), the test p-value, the rejection region and the non-rejection region.

(7 marks)

(c) Two further diagnostics are calculated for the above model (i) the Jarque-Bera test statistic value is 6.14, and (ii) the RESET test statistic value is 3.09. State clearly the null and alternative hypotheses for each of the above two tests – if either of these tests is based on an auxiliary regression, write down this regression. What is the critical value of the Jarque-Bera test and the RESET test that you are using to answer this question? What probability distributions have you obtained the critical values from? Do you detect any misspecification?

If yes, discuss the type of misspecification. (Note: adopt a significance level of 5%).

(6 marks)
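For the Jarque-Bera part of (c), the statistic is asymptotically chi-square with 2 degrees of freedom under the null of normal errors. With 2 degrees of freedom the chi-square tail probability is exp(-x/2), so the critical value and p-value have closed forms. The sketch below uses that fact and does not cover the RESET test, whose critical value depends on how many powers of the fitted values are added.

```python
import math

jb_stat = 6.14   # Jarque-Bera statistic given in the question
alpha = 0.05

# Chi-square(2): P(X > x) = exp(-x / 2), so the upper-alpha quantile is -2 * ln(alpha)
crit = -2.0 * math.log(alpha)
p_value = math.exp(-jb_stat / 2.0)

reject_normality = jb_stat > crit
print(f"5% critical value = {crit:.3f}, p-value = {p_value:.4f}")
print("Reject normality at 5%:", reject_normality)
```

The critical value is about 5.99 and the p-value about 0.046, so normality of the errors is narrowly rejected at the 5% level.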

(d) The existing literature on the subject of green buildings and rent premia has suggested further control variables in the models. Two such variables are the LEASE_TERM (number of years) and RENT-FREE PERIOD (in months) which refers to the suspension of rent during the initial period until the business is up and running. We re-estimate model (3) adding these two variables. The sum of the squared residuals of this augmented equation (with 5 regressors in total) is 4.650. Use an F-statistic to address the question of whether the model should include both of these variables (as a group) using the 10% significance level. What is the appropriate critical value for this test? Which probability distribution have you obtained it from?

(6 marks)
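The joint test in (d) compares the restricted model (3) with the augmented model. The sketch below computes the F-statistic from the two sums of squared residuals. For the critical value it uses the large-sample approximation that m·F is roughly chi-square(m), which has a closed form for m = 2. That approximation is an assumption; the exact F(2, 214) value from statistical tables is slightly larger.

```python
import math

rss_restricted = 4.980     # model (3)
rss_unrestricted = 4.650   # model (3) plus LEASE_TERM and RENT-FREE PERIOD
n_obs = 220
k_unrestricted = 6         # intercept plus five regressors
m = 2                      # number of restrictions

f_stat = ((rss_restricted - rss_unrestricted) / m) / (
    rss_unrestricted / (n_obs - k_unrestricted)
)

# Assumption: asymptotic 10% critical value via chi-square(2) / 2 = -2 * ln(0.10) / 2
crit_approx = -2.0 * math.log(0.10) / m

print(f"F({m}, {n_obs - k_unrestricted}) = {f_stat:.2f}, approx. 10% critical value = {crit_approx:.2f}")
print("Reject H0 (both coefficients zero):", f_stat > crit_approx)
```

The statistic is about 7.59 against a critical value of roughly 2.3, so the two variables are jointly significant at the 10% level.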