Improve the dataset #2
Comments
With some effort a new database could be constructed. This dataset covers the USA: https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-level-data There are also datasets for Australia, England and Wales, and Northern Ireland. I wouldn't be surprised if name usage was Zipfian.
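A rough way to test the Zipfian guess once per-name counts are available: sort the counts, then fit a line to log(count) against log(rank). A slope near -1 would be consistent with Zipf's law. This is only a sketch in Python; the function name `zipf_slope` and the synthetic counts are made up for illustration and are not taken from any of the datasets above.

```python
import math


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Rank 1 is the most frequent name; zero counts are dropped
    # because their logarithm is undefined.
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic counts that follow count = 1000 / rank exactly (an assumption,
# not real data), so the fitted slope should come out very close to -1.
synthetic_counts = [1000 / rank for rank in range(1, 51)]
print(zipf_slope(synthetic_counts))
```

Real name counts would be noisier, especially in the long tail, so the slope is only a first indication and not a proper goodness-of-fit test.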
Julio Raffo, 2016. "Worldwide Gender-Name Dictionary," WIPO Economics & Statistics Related Resources 10, World Intellectual Property Organization - Economics and Statistics Division. The author created a dataset from several sources, including various government statistics, Facebook and Wikipedia: 6.2 million names for 182 different countries. Making that work would mean adding DataDeps.jl as a dependency, because the dataset is nontrivial in size.
This is more feedback than an issue; I'm not sure it's actionable.
I tried this on a real-world list of almost 18K names, and got a hit rate of around 34%.
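For anyone wanting to reproduce that kind of measurement, here is a minimal sketch of how a hit rate could be computed, assuming "hit" means the name appears in the lookup list after trimming whitespace and lowercasing. It is written in Python for illustration; `hit_rate` and the sample names are hypothetical and not part of this package.

```python
def hit_rate(names, known_names):
    """Fraction of non-empty names that appear in known_names (case-insensitive)."""
    known = {n.strip().lower() for n in known_names}
    cleaned = [n.strip().lower() for n in names if n.strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for n in cleaned if n in known)
    return hits / len(cleaned)


# Made-up sample: three of the four names are in the lookup list.
sample = ["Alice", "bob", "Zaphod", " Carol "]
lookup = ["alice", "bob", "carol"]
print(hit_rate(sample, lookup))
```

Listing the names that miss, alongside the rate, would show which kinds of names a larger dataset most needs to cover.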