The labor supply of married women has been a subject of a great deal of economic research
Economics
Share With
The labor supply of married women has been a subject of a great deal of economic research. Given the data on women who worked in the previous year and those who have not, several models have been estimated. The variable indicating whether a woman worked is LFP=Labor Force Participation, which takes the value 1 if a woman worked and 0 if she did not.
(a) Consider the following supply equation specification:
What sign do you expect each of the coefficients to have, and why? (b) The variable nwifeinc=non-wife income is defined as
nwifeinc = faminc − wage
hours
The summary statistics for the variables: wife’s age, number of children less than
6 years old in the household, and the family income for the women who worked (LFP=1) and those who did not work (LFP=0) are provided in the table. Comment on any differences you observe.
MEAN Standard Deviation
Variable
LFP=1 LFP=0
LFP=1 LFP=0 age
41.97
43.28 7.72
8.47 kidsl6 0.14
0.37
0.39 0.6 4
faminc 24130
21698
11671 12728
NWIFEINC and ln(WAGE) variables in part (a) are endogenous. Can the HOURS equation be estimated satisfactory using the
Table 1 provides the results of the least squares regression on only the women who worked (LFP=1). Did things come out as expected? If not, why?
TABLE 1
Dependent variable: HOURS
Included observations: 428
Coef.
Std.Err.
z P > |z|
Intercept
2114.69
340.13
6.22
0.000
lnwage
-17.41
54.22
-0.32
0.75
educ
-14.44
17.96
-0.80
0.42
age
-7.73
5.53
-1.40
0.16
kidsl6
-342.51
100.01
-3.43
0.000
kids618
-115.02
30.83
-3.73
0.000
nwifeinc
-0.01
0.01
-1.16
0.25
The results of the following wage equation for the women who worked are provided in Table 2. What is the effect on wage of an additional year of education?
Dependent variable: LNWAGE Included observations: 428
Coef.
Std.Err.
z P > |z|
Intercept
-0.357
0.31
-1.13
0.26
educ
0.099
0.015
6.615
0.00
age
-0.001
0.005
-0.650
0.52
kidsl6
-0.056
0.088
-0.631
0.53
kids618
-0.0176
0.027
-0.633
0.53
nwifeinc
-5.69E-06
3.32E-06
1.715
0.08
exper
0.0407
0.013
3.044
0.00
exper2
0.001
0.011
-1.860
0.06
Is the supply equation in part (a) identified? Name the instrumental variable if any.
Two-stage least squares estimation of the supply equation is provided in Table
3. Discuss the sign and significance of the estimated parameters.
TABLE 3
Dependent variable: HOURS
Included observations: 428
Coef.
Std.Err.
z P > |z|
Intercept
2432.198
594.17
4.093
0.000
lnwage
1544.818
480.7387
3.213
0.004
educ
-177.449
58.142
-3.0519
0.002
age
-10.784
9.577
-1.126
0.260
kidsl6
-210.834
176.934
-1.192
0.234
kids618
-47.557
53.917
-0.835
0.404
nwifeinc
-0.001
0.006
-1.427
0.154
It is common for companies to spend money and time on employee training pro- grams to improve productivity in the workplace. A company is interested in an- alyzing the effect of a training program on employees’ productivity. Data from a random sample of 10,000 workers ages 18 and older were collected and a regression
P RODUCT IV IT Y = Xβ + αT RAINING + µ
was estimated, where X is a matrix of socio-demographic characteristics, and TRAINING is a dummy variable that is equal to 1 if the person participated in the training program and is equal to zero otherwise. TRAINING can depend on many factors and one of them could be whether a company received a job training grant to help subsidize the cost of the training, therefore
T RANING = Zγ + θGRANT + .
Here Z is a matrix of firm related factors and GRANT=1 if company received a training grant and =0 otherwise.
In estimating the PRODUCTIVITY equation the variable TRAINING can potentially be endogenous. Why using OLS can fail in estimating this regres- sion model? Explain.
Propose an estimation method that will help resolve this endogeneity problem. Describe the estimation method in detail.
Is the PRODUCTIVITY equation identified? If so why? Is it exact, under or over -identified?
Now suppose that you only observe the positive level of productivity for those employees who did the training and zero for those who did not do the training. Discuss and name the model you will use in estimating PRODUCTIVITY equation. What will happen if you use OLS to estimate this model? Is endogeneity of TRANING is still a problem? If so, how can it be fixed?
What will happen if you only select positive levels of productivity and esti- mate the model? Explain.
In general, the two-stage least squares estimation (2SLS) procedure can be used to estimate the parameters of any identified equation within a simultaneous equation system. In a system of M simultaneous equations let y1, y2, y3, ..., yn be the en- dogenous variables. Let k be the number of exogenous variables: x1, x2, x3, ..., xk. Suppose the first equation is:
y1 = α2y2 + α3y3 + β1x1 + β2x2 + µ1
Describe the steps of 2SLS estimation method in detail, so a researcher who is reading your answer can easily follow the steps you described and estimate the model.
What is an Instrumental Variable(IV)? Be specific.
Write the following models in terms of a latent variable, comment on how to interpret the results of each model and give examples.
Probit
Multinomial Logit
Conditional Logit
Ordered Probit
Tobit
The first-order autoregressive model is given by
yt = α0 + α1yt−1 + t
lets assume that the value of y0 is known. Then we can write y1 as
y1 = α0 + α1y0 + 1.
Using the above information write y2 as a function of y1 and then as a function of y0 by replacing y1 with above given expression.
Continue the process and write y3 as a function of y0. Simplify your answer by combining like terms.
Since yt = α0 + α1yt−1 + t, write the expression for yt using y5 knowing that the value of yt−1 is obtained from yt−2 and the value of yt−2 is obtained from yt−3 and so on. Identify the expressions for intercept, slope, and error terms.
Suppose that your company was hired by the Mayor of one of the America’s large cities to help address traffic congestion problems during rush hour. You were given a data on the mode of transportation chosen by commuters, distance trav- eled from home to work/subway, etc., and all the individual and household level characteristics that are needed for the analysis.
After careful analysis of the data, you found that you can group the modes of transportation into four general categories ’subway’, ’bus’, ’car’, and ’human- powered’ (walk, bicycle). Specify an econometric model that fits the data best, i.e. write down the model you will be estimating and label your variables carefully. Is your model linear or non-linear? How are you going to interpret your results?
List all the factors (independent variables) that you as a research think might affect the choice of transportation.
The RESET test (Regression Specification Error Test) is designed to detect omit- ted variables and incorrect functional form. It proceeds as follows. Suppose that we have specified and estimated the regression model
yi = β1 + β2xi2 + β3xi3 + µi.
Let (βˆ1, βˆ2, βˆ3) be the least squares estimates and let
In (1) a test for misspecification is a test of H0 : γ1 = 0 against the alternative
H1 : γ1 = 0. In (2), testing H0 : γ1 = γ2 = 0 against the alternative H1 : γ1 = 0 and/or γ2 = 0 is a test for misspecification. In the first case a t- or an F-test can be used. An F-test is required for the second equation. Rejection of H0 implies the original model is inadequate and can be improved. A failure to reject H0 says
the test has not been able to detect any misspecification. The following table contains output for the two models
yi = β1 + β2xi + β3wi + µi (3) yi = β1 + β2xi + i (4)
obtained using n=35 observations. RESET test applied to the second model yields
Variable
Coef. Std.Error t-Statistic
Coef. Std.Error t-Statistic
C
3.6356 2.763 1.316
-5.8382 2.000 -2.919
X
-0.99845 1.235 -0.8085
4.1072 0.3383 12.14
W
0.49785 0.1174 4.240
F-values of 17.98 (for yˆ2i ) and 8.72 (for yˆ2i and yˆi3 ). Answer the following questions using the results provided in the table:
Should wi be included in the model? What can you say about omitted- variable bias?
You have discovered that the correlation between x and w is rxw=0.975. What can you say about the existence of collinearity and its possible effects?
(a) Why is perfect multicollinearity a problem?
Give an example of two (or more) perfectly collinear variables?
Suppose that X1 and X2 are highly collinear and we estimate the following model Yi = β1 + β21X1i + β3X2i + ei. Can we use OLS to find unbiased estimates of the coefficients. Why are these coefficients difficult to estimate precisely (give intuition).
(a) Suppose you are interested in the effect of high school GPA (gpai) on ad- mission to Texas University (admiti), where admiti is a binary variable with admiti = 1 indicating admission and admiti = 0 indicating rejection. You estimate the following regression using OLS, admiti = β1 + β2gpai + ei and find addmiti = 1.2 for some units. Why is this a questionable predicted value? How could you model this to avoid this problem?
(a) Suppose you are interested in the effect of years of education on income. Why would a regression of income on education give a biased estimate of this effect? Could this biased estimate be used for forecasting?
Suppose you are interested in modeling U.S. GDP data using an autoregression model. Describe two approaches to choosing the lag-length. What are the advan- tages and disadvantages of each?
High humidity levels can cause drivers to be drowsy and increase traffic accidents. We would like to estimate the effect of humidity on traffic accidents. We have a cross sectional data set of per-capita traffic accident (accidenti) and average humidity (humidityi) data for all states i in 2005. We also have a panel data set of percapita traffic accident (accidentit) and average humidity (humidityit) data for all states i from 2005-2010.
Suppose I can only use one data set, which would be the best data set for finding an unbiased estimate of the impact of humidity on traffic accidents? Why?
What would be the best model to estimate this effect. Carefully explain why.
(a) Specify the Pooled Model and state when it can be used?
What are the consequences of estimating Pooled Model using panel data?
Specify Fixed and Random Effects models. Write down the regression models and state the differences between the two.
In developing an econometric model, you might exclude one or more relevant re- gressors or include one or more irrelevant regressors.
What are the consequences of including 1 too many regressors? (b) What are the consequences of excluding 1 too few regressors?
When using non-experimental data in estimating the regression model it is common to have high correlation among some of the regressors which can lead to Exact and Near Multicollinearity.
What is Exact Multicollinearity? How to detect Exact Multicollinearity and what are the remedies?
What is Near Multicollinearity? How to detect Near Multicollinearity and what are the remedies?
Suppose you are interested in estimating the elasticity of demand for wheat using:
ln(Qi wheat) = β1 + β2ln(Pi wheat) + ei
Is the independent variable (ln(P wheati )) positively or negatively correlated with the error term?
If we estimated this equation with OLS, would you expect βˆ2 to be larger or smaller than β2?
In estimating the regression model y = β1 +β2x a researcher had options of scaling either both or one of the variables. Without scaling any of the variables he/she obtained
yˆ = 83.42 + 0.1021x (se) (43.41)
(0.0209)
where
y=weekly food expenditure by household, in dollars
x=weekly household income * indicates significance at 10% *** indicates significance at 1%.
Interpret the above estimated regression model.
Re-write the estimated regression model if x was divided by 100, i.e. x 100. Interpret the result.
Re-write the estimated regression model if y was divided by 100, i.e. y 100. Interpret the result.
Re-write the estimated regression model if both x and y were scaled, i.e. x 100 and 100y . Interpret the result.
Purchase A New Answer
Custom new solution created by our subject matter experts