Historical Data Coverage #17

Open
AnastassiaFedyk opened this issue May 9, 2019 · 2 comments

@AnastassiaFedyk
Collaborator

Hi Honghao,

We would like to investigate how good the data coverage actually is (what percentage of individuals employed at a certain firm we capture), and how that has changed over time. To this end, could you please do the following?

  1. Out of all the firms that we have looked into so far, take those that are publicly traded.
  2. Download the historical numbers of employees (let's say going back to 1990) from Compustat, which you can access through WRDS. By the way, do you have access to WRDS through Northwestern?
  3. Compare the number of employees that we see in our data each year against the numbers in Compustat. What fraction of employees do we capture for each firm? How does that fraction change over time?
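
The comparison in step 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the firm name `ACME`, the counts, and the function name `coverage_by_firm_year` are hypothetical, and the Compustat employee figure is assumed to be reported in thousands (as the annual `emp` item is), so it is scaled by 1,000 before dividing.

```python
def coverage_by_firm_year(resume_counts, compustat_emp_thousands):
    """Return {(firm, year): fraction of Compustat headcount seen in our data}.

    Both inputs are dicts keyed by (firm, year). Firm-years with no
    Compustat figure, or a non-positive one, are skipped rather than guessed.
    """
    coverage = {}
    for key, observed in resume_counts.items():
        reported = compustat_emp_thousands.get(key)
        if reported is None or reported <= 0:
            continue
        # Assumption: Compustat headcount is in thousands of employees.
        coverage[key] = observed / (reported * 1000)
    return coverage


# Hypothetical inputs, for illustration only (not real data).
resume_counts = {("ACME", 2000): 1200, ("ACME", 2001): 1500, ("ACME", 2002): 1800}
compustat_emp_thousands = {("ACME", 2000): 4.0, ("ACME", 2001): 5.0}

coverage = coverage_by_firm_year(resume_counts, compustat_emp_thousands)
for (firm, year), fraction in sorted(coverage.items()):
    print(f"{firm} {year}: {fraction:.1%}")
```

Sorting the result by firm and year gives the per-firm time series of coverage that the last two questions ask about.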

Please let me know if you have any questions.

Thank you,
Anastassia

@AnastassiaFedyk AnastassiaFedyk changed the title from "Data Coverage over Time" to "Historical Data Coverage" May 9, 2019
@c-forrest c-forrest self-assigned this May 14, 2019
@c-forrest
Collaborator

Hi Professor Fedyk,

Please find the comparison here. Our resume data appear to cover half or less of the total employees at the 15 public companies. Coverage usually grows slowly over the years. One significant difference between the two sources is that the Compustat figures are heavily affected by M&A activity and often change abruptly, while the resume counts always grow smoothly.
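
One way to make the abrupt changes visible is to flag large year-over-year moves in each headcount series. The sketch below is illustrative only: the function name `flag_jumps`, the 25% threshold, and the numbers are assumptions, and a flagged year is merely a candidate for an M&A event, not evidence of one.

```python
def flag_jumps(series, threshold=0.25):
    """Return the years whose headcount changed by more than `threshold`
    (as a fraction) relative to the previous available year.

    `series` maps year -> headcount. The 25% default is an arbitrary
    illustrative cutoff, not a value taken from the comparison.
    """
    years = sorted(series)
    flagged = []
    for prev, cur in zip(years, years[1:]):
        base = series[prev]
        if base > 0 and abs(series[cur] - base) / base > threshold:
            flagged.append(cur)
    return flagged


# Hypothetical series for one firm (not real data).
compustat_series = {2000: 10.0, 2001: 10.5, 2002: 16.0, 2003: 16.4}
resume_series = {2000: 3000, 2001: 3300, 2002: 3600, 2003: 3900}

compustat_flags = flag_jumps(compustat_series)
resume_flags = flag_jumps(resume_series)
print("Compustat jumps:", compustat_flags)
print("Resume jumps:", resume_flags)
```

Years flagged in the Compustat series but not in the resume series are where the coverage fraction is likely to shift for reasons unrelated to the resume data itself.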

Please let me know if you have any questions about this.

Best,
Honghao

@hodsonjames
Owner

Thanks Honghao!

I think coverage is sufficient, though obviously not complete. For larger companies, I would generally be happy with anything above a 10% sample of the employees if it is well distributed across positions/hierarchy.
