clean_emdat_df function from impact_data module produces KeyError: 'Entry Criteria' #690

Closed
simonameiler opened this issue Apr 4, 2023 · 4 comments
Labels
bug, dependencies, help wanted, task (something that needs to be done)

Comments

@simonameiler
Collaborator

The clean_emdat_df function from the impact_data module raises a KeyError: 'Entry Criteria'. The error escaped the tests on Jenkins because the test data has a different structure than files recently downloaded from the EMDAT database: the header of newly downloaded files differs from the one we use for testing.

Here's the header for the test file:
Dis No,Year,Seq,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,Entry Criteria,Country,ISO,Region,Continent,Location,Origin,Associated Dis,Associated Dis2,OFDA Response,Appeal,Declaration,Aid Contribution,Dis Mag Value,Dis Mag Scale,Latitude,Longitude,Local Time,River Basin,Start Year,Start Month,Start Day,End Year,End Month,End Day,Total Deaths,No Injured,No Affected,No Homeless,Total Affected,Reconstruction Costs ('000 US$),Insured Damages ('000 US$),Total Damages ('000 US$),CPI

Here is the one for the new dataset:
Dis No,Year,Seq,Glide,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,Country,ISO,Region,Continent,Location,Origin,Associated Dis,Associated Dis2,OFDA Response,Appeal,Declaration,AID Contribution ('000 US$),Dis Mag Value,Dis Mag Scale,Latitude,Longitude,Local Time,River Basin,Start Year,Start Month,Start Day,End Year,End Month,End Day,Total Deaths,No Injured,No Affected,No Homeless,Total Affected,Reconstruction Costs ('000 US$),"Reconstruction Costs, Adjusted ('000 US$)",Insured Damages ('000 US$),"Insured Damages, Adjusted ('000 US$)",Total Damages ('000 US$),"Total Damages, Adjusted ('000 US$)",CPI,Adm Level,Admin1 Code,Admin2 Code,Geo Locations
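To make the difference between the two headers explicit, here is a minimal sketch that parses both header lines quoted above and lists the columns that appear in only one of them. It uses only the Python standard library and nothing from CLIMADA; the variable names are illustrative. The `csv` module is used instead of a plain `split(",")` because some of the new column names contain commas inside quotes.

```python
import csv

# Header of the test file, copied from above.
OLD_HEADER = "Dis No,Year,Seq,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,Entry Criteria,Country,ISO,Region,Continent,Location,Origin,Associated Dis,Associated Dis2,OFDA Response,Appeal,Declaration,Aid Contribution,Dis Mag Value,Dis Mag Scale,Latitude,Longitude,Local Time,River Basin,Start Year,Start Month,Start Day,End Year,End Month,End Day,Total Deaths,No Injured,No Affected,No Homeless,Total Affected,Reconstruction Costs ('000 US$),Insured Damages ('000 US$),Total Damages ('000 US$),CPI"

# Header of the newly downloaded file, copied from above.
NEW_HEADER = """Dis No,Year,Seq,Glide,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,Country,ISO,Region,Continent,Location,Origin,Associated Dis,Associated Dis2,OFDA Response,Appeal,Declaration,AID Contribution ('000 US$),Dis Mag Value,Dis Mag Scale,Latitude,Longitude,Local Time,River Basin,Start Year,Start Month,Start Day,End Year,End Month,End Day,Total Deaths,No Injured,No Affected,No Homeless,Total Affected,Reconstruction Costs ('000 US$),"Reconstruction Costs, Adjusted ('000 US$)",Insured Damages ('000 US$),"Insured Damages, Adjusted ('000 US$)",Total Damages ('000 US$),"Total Damages, Adjusted ('000 US$)",CPI,Adm Level,Admin1 Code,Admin2 Code,Geo Locations"""


def parse_header(line):
    """Split one CSV header line into column names, respecting quoted commas."""
    return next(csv.reader([line]))


old_cols = parse_header(OLD_HEADER)
new_cols = parse_header(NEW_HEADER)

# Columns present in one header but not the other, in original order.
removed = [c for c in old_cols if c not in new_cols]
added = [c for c in new_cols if c not in old_cols]

print("only in test file:", removed)
print("only in new download:", added)
```

Run on the two headers above, this reports `Entry Criteria` and `Aid Contribution` as present only in the test file. The former is the key named in the KeyError; the latter reappears in the new file under the name `AID Contribution ('000 US$)`.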

Why did EMDAT change and does this matter for CLIMADA?

@simonameiler added the bug, help wanted, dependencies, and task labels on Apr 4, 2023
@simonameiler
Collaborator Author

Here's a start to fixing this issue:

I added the newest set of variables downloaded from EMDAT to the VARNAMES dictionary in the impact_data module.

What still needs to be done is to correct the clean_emdat_df function so that it uses this updated list of variables.

https://github.com/CLIMADA-project/climada_python/blob/feature/emdat_datastructure_update/climada/engine/impact_data.py
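One possible shape for such a correction, shown only as a sketch: detect which known header layout a file uses before accessing any column, and fail with a descriptive message rather than a bare KeyError. Everything below is hypothetical. `REQUIRED_COLUMNS`, `detect_layout`, and the layout names are invented for illustration and are not the actual VARNAMES dictionary or the clean_emdat_df implementation; the column lists are a small excerpt of the two headers quoted in this issue.

```python
# Hypothetical, heavily shortened stand-in for a layout-keyed lookup.
# The real VARNAMES dictionary in impact_data is not reproduced here.
REQUIRED_COLUMNS = {
    "test_file": ["Dis No", "Entry Criteria", "Aid Contribution"],
    "new_download": ["Dis No", "Glide", "AID Contribution ('000 US$)"],
}


def detect_layout(columns, required=REQUIRED_COLUMNS):
    """Return the name of the first layout whose required columns are all present.

    Raises ValueError listing the missing columns per layout if none matches.
    """
    present = set(columns)
    for name, cols in required.items():
        if set(cols) <= present:
            return name
    missing = {
        name: sorted(set(cols) - present) for name, cols in required.items()
    }
    raise ValueError(
        f"Unrecognised EM-DAT header; missing columns per layout: {missing}"
    )


# Example: a header containing 'Glide' but not 'Entry Criteria'.
layout = detect_layout(["Dis No", "Year", "Glide", "AID Contribution ('000 US$)"])
print(layout)
```

With a check like this, a file with an unknown header would produce an error that names the missing columns for each known layout, which makes a future change in the download format easier to diagnose.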

@peanutfun
Member

Looks fine! Seems like you already did the most important thing which is setting the 2023 case in VARNAMES_EMDAT. I am still a bit unclear about the use of this variable. For 2020 and 2023, all key-value pairs are the same, or am I missing something?

Can you raise a PR from that branch?

@simonameiler
Collaborator Author

Yes, for 2020 and 2023 all key-value pairs are the same. But 2020 and 2023 contain different key-value pairs.

@emanuel-schmid
Collaborator

See #701, #722


3 participants