Alastair Hall
ECON60622: Further Econometrics
Spring 2009

Problem Set for Tutorial 1

You will work through these problems in Tutorial 1, so please attempt them beforehand and bring your answers to the tutorial. The answers do not need to be handed in.

This tutorial involves four questions relating to our discussion of categorical variables in Lecture 2. Questions 1, 3 and 4 use the data set CPS85, which can be downloaded from the class web page.

1. Using the variable names in the CPS85 data set, suppose we are interested in the following wage equation:

   $LNWAGE = \beta_0 + \beta_1 ED + \beta_2 EX + \beta_3 EXSQ + u$

   where $LNWAGE$ denotes the log of average hourly earnings, $ED$ denotes the number of years of education, $EX$ denotes experience and $EXSQ$ is the square of experience.

   (a) Why do you think that both $EX$ and $EXSQ$ have been included?
   (b) Estimate the model using the data in CPS85. Do the OLS estimates of $\beta_1$, $\beta_2$ and $\beta_3$ have the signs you would expect?
   (c) Calculate the semi-elasticity of the wage with respect to education.
   (d) Calculate the semi-elasticity of the wage with respect to experience.

2. Suppose we have a cross-sectional sample of observations on the log(wage) of $n$ individuals. Further suppose that the first $n_1$ in the sample are men; that is, letting $y_i$ denote the log(wage) of the $i$th person (observation) in the sample, we have that $\{y_1, y_2, \ldots, y_{n_1}\}$ are the observations on the men and $\{y_{n_1+1}, y_{n_1+2}, \ldots, y_n\}$ are the observations on the women. The number of men in the sample is, therefore, $n_1$ and the number of women is $n - n_1 = n_2$, say. As in class, we introduce a dummy variable to indicate gender, defined as follows:

   $x = 0$, if the individual is male
   $x = 1$, if the individual is female

   Now consider the regression model:

   $y = \beta_0 + \beta_1 x + u$   (1)
   (a) Show that the normal equations associated with OLS estimation of (1) can be written as:

       $n\bar{y} - n\hat{\beta}_0 - n_2\hat{\beta}_1 = 0$   (2)
       $n_2\bar{y}_f - n_2\hat{\beta}_0 - n_2\hat{\beta}_1 = 0$   (3)

       where the sample mean of $y$ is given by $\bar{y} = n^{-1}\sum_{i=1}^{n} y_i$ and the sample mean of $y$ for women is $\bar{y}_f = n_2^{-1}\sum_{i=n_1+1}^{n} y_i$.
   (b) Using (2)-(3), show that $\hat{\beta}_0 = \bar{y}_m$ and $\hat{\beta}_1 = \bar{y}_f - \bar{y}_m$, where $\bar{y}_m = n_1^{-1}\sum_{i=1}^{n_1} y_i$ is the sample mean of $y$ for men.

3. Suppose that we wish to study the relationship between log(wage) and the number of years of education, and suspect that the relationship depends on the gender and marital status (i.e. married or non-married) of the individual.

   (a) Propose a linear regression model for this relationship in which both the intercept and slope depend on gender and marital status.
   (b) Write down the conditional expectation of log(wage) for each of the four types (in terms of gender and marital status) of individual in the sample, and use this to deduce the interpretation of the coefficients in the model.
   (c) Estimate this model by OLS using the data in CPS85 and report your results.
   (d) Test whether the relationship is affected by marital status. Be sure to state the null and alternative hypotheses and the decision rule of the test.

4. In the lecture, we considered a wage equation of the form

   $LNWAGE = \beta_0 + \beta_1 ED + \beta_2 FE + u$   (4)

   and discussed how this formulation allows the intercept to be different for men and women. Another way to parameterize such a relationship is to introduce a dummy variable for males, $MA = 1 - FE$, and estimate the model

   $LNWAGE = \gamma_1 ED + \gamma_2 FE + \gamma_3 MA + u$   (5)

   (a) What is the interpretation of the coefficients $\gamma_1$, $\gamma_2$ and $\gamma_3$?
   (b) Estimate the models in (4) and (5) using the CPS85 data. Given your answer to part (a) and the discussion in the lecture, are the OLS estimates from the two models logically consistent? Explain.
   (c) Are the reported $R^2$'s in the Stata analysis the same? If not, then which should we believe? Explain.
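
A numerical check of the result stated in Question 2(b) can be a useful sanity test before attempting the algebra. The sketch below is only an illustration and not part of the problem set: it uses simulated data rather than CPS85, the sample sizes and distribution parameters are arbitrary assumptions, and the helper `ols_simple` is a hypothetical name for the standard closed-form OLS formulas with one regressor.

```python
import random


def ols_simple(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1*x + u."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Simulated log(wage) data (an assumption, NOT the CPS85 data):
# the first n1 observations are men, the remaining n2 are women.
rng = random.Random(0)
n1, n2 = 60, 40
y_men = [rng.gauss(2.2, 0.5) for _ in range(n1)]
y_women = [rng.gauss(2.0, 0.5) for _ in range(n2)]

y = y_men + y_women
x = [0] * n1 + [1] * n2  # dummy: 0 = male, 1 = female

b0_hat, b1_hat = ols_simple(x, y)
mean_men = sum(y_men) / n1
mean_women = sum(y_women) / n2

# The intercept should equal the men's mean, and the slope the
# women's mean minus the men's mean, up to floating-point error.
print("b0_hat:", b0_hat, " mean (men):", mean_men)
print("b1_hat:", b1_hat, " mean (women) - mean (men):", mean_women - mean_men)
```

The two printed pairs should agree up to rounding for any seed or sample split, which is what the algebra in parts (a) and (b) establishes in general.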