At this final stage of the study, you are required to write a summary report with
recommendations for the EqRA Board to inform its annual report to members.
The Board is very keen to understand the dynamics of the labour market and the sources of wage discrimination. Its chief economists point out that economists have long documented a ‘marriage premium’: married men on average earn more than their single coworkers.
Dr. Ng is very pleased with the analysis you have provided and with your answers to her initial questions. She has read widely in this literature and has come back asking you to use the data to clarify a few outstanding claims and questions.
(A) The chief economist dismisses the findings in the literature and argues that the ‘marriage
premium’ story is really just a ‘productivity story.’ He claims that:
“Wage differentials are a result of productivity differentials: that is, marriage per se makes workers more productive. There exists no basis to this observed discrimination once productivity is taken into account.”
(B) Dr. Ng is puzzled by some findings in the literature. She read that one of the
explanations proposed for why married men earn more than single men in standard
cross-section regressions is that:
“Marriage has a positive effect on workers’ productivity. The question remains
whether married men are more productive because they are married, or whether men
who do better on the ‘Marriage market’ tend to also do better on the ‘Job market.’”
(C) Dr. Ng informs you that the Board is also concerned with how ‘race’ affects the
distribution of wages. She wants to report to the Board on the following claim:
“Married black players earn higher wages than married non-black players.”
(D) Recall Task 1 in Milestone 2:
Marriage & Age. The marriage premium has the greatest impact on younger players and high performers.
In addition to Dr. Ng’s concerns in (C), estimate a model that can also address the claims in (D). Describe which variables/controls you need to add to the model in order to estimate, and draw inference about, the (ceteris paribus) effect of marriage across age groups and performance levels.
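As a rough sketch only, one specification that could speak to (C) and (D) together uses interaction terms. The dummy variables Black, Young and HighPerf and the control vector X are hypothetical names chosen for illustration; they are not taken from the dataset documentation, and how "young" and "high performer" are defined is an assumption you would need to state.

```latex
\begin{aligned}
lwage_i = {} & \beta_0 + \beta_1 Married_i + \beta_2 Black_i + \beta_3 (Married_i \times Black_i) \\
& + \beta_4 Young_i + \beta_5 (Married_i \times Young_i) \\
& + \beta_6 HighPerf_i + \beta_7 (Married_i \times HighPerf_i) + X_i'\gamma + u_i
\end{aligned}
```

Under this sketch, the marriage effect for a player with all three dummies equal to zero is β1; it becomes β1 + β5 for a young player and β1 + β7 for a high performer, so claim (D) concerns the sign and significance of β5 and β7. Among married players, the black versus non-black difference in log wages is β2 + β3, which is the quantity relevant to (C).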
Dr. Ng wants you to address points (A)-(D) to the best of your ability given the dataset you
have and given your statistical knowledge from ECON1203.
Hints: some ideas you may pursue in your investigation
You can address all of the points in this milestone in the context of multiple linear regression. Adding controls and variables to the regression will allow you to separate groups along their characteristics.
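To illustrate why adding controls matters, here is a minimal sketch on simulated data (not the assignment dataset). The variable names `married`, `exper` and `lwage`, and every number used to generate the data, are assumptions made purely for illustration. The standard errors are the conventional ones, which assume homoskedastic errors.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and conventional standard errors.

    X must already contain a column of ones for the intercept.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)        # unbiased estimate of the error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # valid only under homoskedasticity
    return beta, np.sqrt(np.diag(cov))

# Simulated data (purely illustrative, NOT the assignment dataset).
rng = np.random.default_rng(0)
n = 500
exper = rng.uniform(0, 15, n)               # years of experience
# More experienced players are more likely to be married.
married = (rng.uniform(0, 15, n) < exper).astype(float)
lwage = 6.5 + 0.10 * married + 0.08 * exper + rng.normal(0, 0.5, n)

ones = np.ones(n)
b_short, se_short = ols(lwage, np.column_stack([ones, married]))
b_long, se_long = ols(lwage, np.column_stack([ones, married, exper]))

print(f"married, no controls:   {b_short[1]:.3f} ({se_short[1]:.3f})")
print(f"married, exper control: {b_long[1]:.3f} ({se_long[1]:.3f})")
```

In this simulation the marriage coefficient shrinks once experience is controlled for, because married players are, by construction, more experienced. Whether anything similar happens in the real data is for you to check.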
As part of your reporting requirements please state your assumptions. Remember to
show all your working.
Keep in mind that this is a kind of research exercise in which you are dealing with a real dataset. There is no single right or wrong answer, BUT there is such a thing as incorrect use of statistical techniques and assumptions or, worse, incorrect interpretation of the results that misleads the Board! The latter is what we want to learn to avoid.
For the linear regression, please report the following:
• Interpret the coefficients.
• Comment on whether each coefficient is statistically and economically significant, defining what you mean by each. Remember to state your assumptions.
• Limitations and issues with your model. The main concerns here are multicollinearity, heteroskedasticity (failure of the homoskedasticity assumption), and omitted variable bias. You need to place these concepts in the context of your problem and explain how they may affect your statistical inference.
No need to discuss these for each regression you perform. You can discuss these in the final model in Part (D).
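One way to put the multicollinearity concern in context is to compute a variance inflation factor (VIF) for each regressor. The sketch below does this on simulated data in which experience tracks age closely; the variable names and data-generating numbers are assumptions for illustration, not features of the assignment dataset.

```python
import numpy as np

def r_squared(y, X):
    """R-squared from an OLS regression of y on X (X includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def vif(regressors):
    """VIF for each column of `regressors` (pass the columns without an intercept)."""
    n, k = regressors.shape
    out = []
    for j in range(k):
        others = np.delete(regressors, j, axis=1)
        X = np.column_stack([np.ones(n), others])
        # VIF_j = 1 / (1 - R^2 from regressing column j on the other regressors)
        out.append(1.0 / (1.0 - r_squared(regressors[:, j], X)))
    return np.array(out)

# Simulated data (purely illustrative).
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(20, 38, n)
exper = age - 19 + rng.normal(0, 1.0, n)    # experience moves almost one-for-one with age
married = (rng.uniform(size=n) < 0.5).astype(float)

vifs = vif(np.column_stack([age, exper, married]))
for name, v in zip(["age", "exper", "married"], vifs):
    print(f"VIF({name}) = {v:.1f}")
```

A commonly quoted rule of thumb treats a VIF above about 10 as a sign that a coefficient's standard error is heavily inflated. It is only a rule of thumb, and a high VIF does not bias the estimates; it makes them imprecise.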
Representation of the OLS results
Use the following style to report estimation results, with the standard errors of the coefficient estimates in parentheses, e.g.,
\[
\text{MODEL 1: } \mathit{lwage} = 6.805 + 0.3165\, \mathit{Married}; \quad R^2 = 0.031, \quad F_{stat} = 6.42, \quad p_{val} = 0.012; \quad SEE^2 = 0.77
\]
Excel tables can be placed in the appendix.
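If you prefer to generate the reporting line rather than type it, a small helper along the following lines could be used. The function names are hypothetical. The slope's standard error here is not reported in the brief: it is backed out from the example's F statistic, using the fact that F = t² when there is a single regressor. The intercept's standard error cannot be recovered this way, so it is left out.

```python
import math

def term(value, se, decimals=4):
    """Format one estimate, with its standard error in parentheses if known."""
    text = f"{value:.{decimals}f}"
    return text if se is None else f"{text} ({se:.{decimals}f})"

def format_ols(label, depvar, intercept, slopes, decimals=4):
    """Build the equation line.

    intercept is a (coefficient, standard_error) pair;
    slopes is a list of (name, coefficient, standard_error) tuples.
    A standard error of None is simply omitted from the output.
    """
    parts = [term(intercept[0], intercept[1], decimals)]
    for name, b, se in slopes:
        sign = "+" if b >= 0 else "-"
        parts.append(f"{sign} {term(abs(b), se, decimals)} {name}")
    return f"{label}: {depvar} = " + " ".join(parts)

# With one regressor F = t^2, so the reported F statistic implies the slope's SE.
slope, f_stat = 0.3165, 6.42
se_slope = slope / math.sqrt(f_stat)

# The intercept's SE is not reported in the example above, so it is passed as None.
line = format_ols("MODEL 1", "lwage", (6.805, None), [("Married", slope, se_slope)])
print(line)
```

In your own report every coefficient, including the intercept, should carry its standard error taken from the Excel regression output.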
– If you use the log of wages as the dependent variable, pay attention to the interpretation of the coefficients.
– Dummy variable trap: always leave a base group out of the regression model. The coefficient on a dummy is then interpreted as the difference in (log) wages between that group and the base group.
– Omitted variable problem: control for as many relevant factors as your data allow. In this case, you should think about the role of experience, age and other player information.
– Nonlinearities between (log) wages and some characteristics, such as the players’ age.
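Two of the hints above come down to short calculations, sketched here. With log wages as the dependent variable, a dummy coefficient b corresponds to an exact percentage difference of 100·(exp(b) − 1), which is close to 100·b only when b is small. With a quadratic in age, the fitted log wage turns at age = −b_age / (2·b_age_sq). The function names and the quadratic coefficients used in the second call are hypothetical; the value 0.3165 is the Married coefficient from the Model 1 example.

```python
import math

def pct_effect(beta):
    """Exact percentage difference in wages implied by a dummy coefficient in a log-wage model."""
    return 100.0 * (math.exp(beta) - 1.0)

def turning_point(b_age, b_age_sq):
    """Age at which fitted log wage peaks (or troughs) given b_age*age + b_age_sq*age**2."""
    return -b_age / (2.0 * b_age_sq)

# Married coefficient from the Model 1 example: about 37.2%, not 31.65%.
print(round(pct_effect(0.3165), 1))

# Hypothetical quadratic coefficients, chosen only to show the calculation.
print(turning_point(0.30, -0.005))
```

The gap between the approximate and exact percentage effect is worth stating explicitly when you interpret a coefficient as large as the one in the example.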
Final executive report
The final report unifies all three milestones/sections that you have completed and showcases all the statistical techniques you have learnt to date. It is your final take on this problem and should include the key results that you wish the Board to learn about this puzzling phenomenon in the labour market. Your report will consist of five core sections:
• Introduction and executive summary (1/2 page max for executive summary).
• Section 1 (using Milestone 1) which comprises a summary of key features of data based on your first milestone and any feedback received.
• Section 2 (using Milestone 2) comprises a summary of statistical tests conducted based on your second milestone and any feedback received.
• Section 3 (using Milestone 3) will investigate the productivity story in the claims above.
• Concluding remarks.
Please note that Section 1 and Section 2 are not a copy/paste of your reports for the first two milestones. These sections should report the take-home messages from the detailed analysis that you have completed. There is no need to re-report the tables and graphs from those milestones. You should summarize the key points and briefly state the statistical evidence/argument that supports them.
Your report should tell a story that uses all the key conclusions you have come to from the first
two milestones, and which will probably also inform your analysis for the last milestone.