Add run scenarios #543

Merged
merged 13 commits into from
Jan 26, 2023

davide-f
Member

@davide-f davide-f commented Dec 25, 2022

Closes #503

Changes proposed in this Pull Request

This branch aims to create rules and scripts to iteratively test the workflow on a list of countries.
For each country, the workflow is executed and general statistics are collected.
Intermediate results are also stored.
This helps track the overall status of the workflow across the globe and collects statistics that can be useful for validation.
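
A minimal sketch of the idea described above, not the code in this PR: loop over a list of countries, run something per country, and collect one row of statistics each, without letting a single failure stop the loop. The names `run_scenarios`, `run_country` and `fake_run`, and the `n_buses` statistic, are hypothetical placeholders.

```python
import pandas as pd


def run_scenarios(countries, run_country):
    """Call `run_country` for each country and collect one row of stats per country."""
    rows = []
    for country in countries:
        try:
            stats = dict(run_country(country))
            stats["status"] = "ok"
        except Exception as exc:  # keep iterating even if one country fails
            stats = {"status": "failed", "error": str(exc)}
        stats["country"] = country
        rows.append(stats)
    return pd.DataFrame(rows).set_index("country")


def fake_run(country):
    """Hypothetical stand-in for executing the workflow on one country."""
    if country == "XX":
        raise ValueError("no data")
    return {"n_buses": 10}


summary = run_scenarios(["NG", "XX"], fake_run)
print(summary)
```

In the real workflow the per-country step would be a Snakemake run rather than a Python callable; the sketch only illustrates the collect-and-continue pattern.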

Suggestions are welcome, especially on the statistics to include.

Checklist

  • I tested my contribution locally and it seems to work fine.
  • Code and workflow changes are sufficiently documented.
  • Newly introduced dependencies are added to envs/environment.yaml and envs/environment.docs.yaml.
  • Changes in configuration options are added in all of config.default.yaml and config.tutorial.yaml.
  • Add a test config or line additions to test/ (note tests are changing the config.tutorial.yaml)
  • Changes in configuration options are also documented in doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
  • A note for the release notes doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.

@ekatef
Member

ekatef commented Dec 25, 2022

A technical suggestion: would it be possible to keep all the country names for which GADM_ID codes contain non-standard values, that is, where the letters of the GADM_ID do not correspond to the country code?

It would be very helpful to understand how often fine-tuning may be needed when working with administrative data globally, and how far hard-coding could be a solution for non-standard cases :)

@davide-f
Member Author

A technical suggestion: would it be possible to keep all the country names for which GADM_ID codes contain non-standard values, that is, where the letters of the GADM_ID do not correspond to the country code?

It would be very helpful to understand how often fine-tuning may be needed when working with administrative data globally, and how far hard-coding could be a solution for non-standard cases :)

Yeah :) that can absolutely be done. Do you have some lines already developed to include that?
I'd just need a line that performs the check, including a fail-safe in case the datatype does not match.
I was wondering if you may like a test on the time series of renewables. Do you think it is needed?
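
One possible shape for the check discussed above, as a hedged sketch: flag rows whose GADM_ID does not start with the expected country code, and treat any non-string value (missing or wrong datatype) as non-standard instead of raising. The column names `GADM_ID` and `country`, the sample IDs, and the assumption that the ID should begin with the country code are illustrative and not taken from the PR.

```python
import pandas as pd


def nonstandard_gadm_rows(df, id_col="GADM_ID", code_col="country"):
    """Return rows whose GADM_ID does not start with the country code.

    Non-string values (None, NaN, numbers) are flagged rather than raising.
    """
    matches = pd.Series(
        [
            isinstance(gadm_id, str)
            and isinstance(code, str)
            and gadm_id.startswith(code)
            for gadm_id, code in zip(df[id_col], df[code_col])
        ],
        index=df.index,
        dtype=bool,
    )
    return df[~matches]


shapes = pd.DataFrame(
    {
        "country": ["NG", "NG", "NG", "NG"],
        "GADM_ID": ["NG.1_1", "XK.2_1", None, 42],  # hypothetical values
    }
)
flagged = nonstandard_gadm_rows(shapes)
print(flagged)
```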

@ekatef
Member

ekatef commented Dec 26, 2022

Yeah :) that can absolutely be done. Do you have some lines already developed to include that?
I'd just need a line that performs the check, including a fail-safe in case the datatype does not match.

Super :)
I'd suggest something like https://github.com/ekatef/pypsa-earth/blob/019a49bc95087d3f47fefd6fd8dc155f637cdd91/scripts/build_shapes.py#L145-L175

I have tried here to make use of the pandas DataFrame to_csv() method with two approaches:

  • one utilising a pre-defined dataframe structure (and thus hopefully safer!),
  • another one writing a filtered-out chunk of the original dataframe (which would be more useful for analysis).

I agree that we may need an additional safety check to ensure that the workflow is not stopped by I/O errors. But I don't have any particular ideas on this (except catching a non-specific exception...) and would be very interested in yours :)
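
As one option for the safety check raised here, a sketch that catches `OSError` (the base class of file-system errors such as a missing directory or denied permission) around `DataFrame.to_csv()`, which is narrower than a non-specific exception. The helper name `safe_to_csv` and the file names are hypothetical.

```python
import logging
import os
import tempfile

import pandas as pd

logger = logging.getLogger(__name__)


def safe_to_csv(df, path):
    """Write `df` to `path`; log and return False on I/O errors instead of raising."""
    try:
        df.to_csv(path)
        return True
    except OSError as exc:  # missing directory, permission denied, disk full, ...
        logger.warning("Could not write %s: %s", path, exc)
        return False


stats = pd.DataFrame({"country": ["NG"], "n_shapes": [37]})  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    ok = safe_to_csv(stats, os.path.join(tmp, "stats.csv"))
    failed = safe_to_csv(stats, os.path.join(tmp, "missing_dir", "stats.csv"))
```

Returning a flag lets the calling rule decide whether a missing statistics file matters, while the main outputs are still produced.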

@ekatef
Member

ekatef commented Dec 26, 2022

I was wondering if you may like a test on the time series of renewables. Do you think it is needed?

Absolutely! I'd be more than interested in any kind of validation procedures for renewables. I see here two options:

  1. we may try to validate the wind/solar potential calculated for (known) locations of the wind/PV generation against the known generation time series in these places;

  2. we can extract climate variables for the locations of the meteorological stations and compare them against the observations and introduce the bias corrections.

Regarding the first point, I'm afraid we are somewhat constrained by RES data availability for 2013.

As for the second approach, in my opinion it's the more "correct" way, and data availability looks much better there. There are great global-scale meteorological datasets like GHCN or ISD. I think it would be very useful to have an opportunity to integrate them into our workflow, but it may not be trivial at all :)

Perhaps, as an initial test, we could take a couple of dozen locations of meteorological stations (e.g. ones that are representative of a more or less global picture, or just those with the highest data quality) and extract the wind/solar potential for the buses these stations belong to? The resulting array could be supplemented with meteorological observation data and used for numerical exercises to obtain a big picture, at least in an initial version.
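
To make the station comparison concrete, here is a small sketch of the kind of numerical exercise this could start with: pair a modelled series with station observations, drop timestamps where either is missing, and report the mean bias and RMSE. The function name and the sample wind-speed values are made up for illustration; no real GHCN or ISD data is used.

```python
import numpy as np
import pandas as pd


def compare_to_observations(modelled, observed):
    """Pair two series on their index and return simple error statistics."""
    paired = pd.concat({"modelled": modelled, "observed": observed}, axis=1).dropna()
    error = paired["modelled"] - paired["observed"]
    return {
        "n": len(paired),
        "mean_bias": float(error.mean()),
        "rmse": float(np.sqrt((error**2).mean())),
    }


# Hypothetical wind speeds (m/s) at one station location
modelled = pd.Series([5.0, 7.0, 9.0])
observed = pd.Series([4.0, 6.0, np.nan])  # last observation missing

result = compare_to_observations(modelled, observed)
print(result)
```

A constant mean bias like this could then be subtracted from the modelled series as the simplest form of bias correction.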

What do you think?

@davide-f
Member Author

So, I kind of implemented everything on the wish list, at least in draft form:

  • build_shapes: please, check the function collect_shape_stats for the statistics that are currently implemented
  • build_renewable_profiles: please, check the function collect_renewable_stats

They are meant to be simple statistics to check the functionality of the workflow, yet they can be used for validation purposes and are easy to expand with additional features.

I think this PR is ready for preliminary review and comparison.
Moreover, we should discuss whether to include this in the codebase or not.

@pz-max and @ekatef , will you be available tomorrow for the workflow meeting?

@ekatef
Member

ekatef commented Dec 29, 2022

Fantastic! Thank you so much :)
I think both collect_shape_stats and collect_renewable_stats look great and we'll be able to use their results very well.

Will be happy to discuss details during today's workflow meeting.

@pz-max
Member

pz-max commented Dec 29, 2022

Yes, let's talk today

@davide-f
Member Author

davide-f commented Dec 30, 2022

A run :)
Some work is still needed.
Legend:

  • green: network solved
  • dark green: at least add_electricity worked, but network was not solved
  • yellow: base_network was successful but simplify_network didn't
  • orange: build_shapes worked but base_network not
  • red: build_shapes couldn't succeed
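
The legend above can be read as a mapping from the last rule that succeeded to a colour. The sketch below is one way to encode it; the rule name `solve_network` is an assumption (the legend only says "network solved"), and the case where simplify_network succeeds but add_electricity fails is not covered by the legend, so it is lumped in with yellow here.

```python
def status_color(completed):
    """Map the set of rules that completed for a country to a legend colour."""
    if "solve_network" in completed:  # assumed rule name for "network solved"
        return "green"
    if "add_electricity" in completed:
        return "darkgreen"
    if "base_network" in completed:
        return "yellow"
    if "build_shapes" in completed:
        return "orange"
    return "red"


colors = {
    "all rules": status_color(
        {"build_shapes", "base_network", "simplify_network",
         "add_electricity", "solve_network"}
    ),
    "up to base_network": status_color({"build_shapes", "base_network"}),
    "nothing": status_color(set()),
}
print(colors)
```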

img

[Update 3-1-2023]

@pz-max
Member

pz-max commented Dec 30, 2022

Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

@davide-f
Member Author

davide-f commented Dec 30, 2022

Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

I'm working on it and we can improve it; it was a nice mapping and testing. I think most of the yellow can be easily fixed.
A lot will be improved when PyPSA/powerplantmatching#96 is merged and included in conda.
Currently that is running again.

I believe most of the yellow is because of that.
Moreover, most likely some countries will be failing, especially the small ones because of missing data or not enough buses.

Once we have more green for Africa, I'll run other countries as well.

@ekatef
Member

ekatef commented Dec 30, 2022

A nice and very insightful plot!

Would it make sense to switch green and dark green to make the color coding more intuitive? (or perhaps replace dark green with something like gold or yellowgreen?..)

Results are a bit surprising: my a priori feeling was that the model works perfectly in 80% of cases and tends to cause trouble in really complicated cases only :) Is the yellow perhaps an effect of #531? And does red mean some uncertainties in regional borders?

Agree that the visualisation works perfectly to highlight the modeling status and it's worth including it in the regular workflow. Apart from this practical meaning, it's just beautiful :)

@davide-f
Member Author

A nice and very insightful plot!

Would it make sense to switch green and dark green to make the color coding more intuitive? (or perhaps replace dark green with something like gold or yellowgreen?..)

Results are a bit surprising: my a priori feeling was that the model works perfectly in 80% of cases and tends to cause trouble in really complicated cases only :) Is the yellow perhaps an effect of #531? And does red mean some uncertainties in regional borders?

Agree that the visualisation works perfectly to highlight the modeling status and it's worth including it in the regular workflow. Apart from this practical meaning, it's just beautiful :)

Green/dark green currently is not a problem, though I agree.
Before I was using blue.
The major issue now is that we need some small fixes, and this mapping is actually something we really needed.
There was a problem with powerplantmatching that I'm investigating, and I have proposed a fix.
The new map should be more realistic and close to the Africa paper as well.

@davide-f
Member Author

davide-f commented Dec 30, 2022

Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

The picture has been created by a notebook.
Maybe it is more suitable for the documentation package.

@davide-f
Member Author

davide-f commented Jan 2, 2023

Image above updated

@ekatef
Member

ekatef commented Jan 2, 2023

Image above updated

Wow 🤩 Impressive progress!

@davide-f
Member Author

davide-f commented Jan 3, 2023

Latest changes are updated above.
The image now looks pretty good. Some bug fixing was needed in powerplantmatching/atlite and pypsa-earth.
That is the result.

For the purpose of this PR, no more bug fixing is expected, but I'll try to run the same code on regions beyond Africa.

The main needs for improving the current picture are:

  • adjust the clustering when DC buses are included. Probably the best way would be to do the groupby by ["countries", "carrier"]; that would keep the carriers consistent by bus without making the number of clusters explode; the effects of sub_networks should be tested anyway.
  • (second priority) Handling empty dataframes where necessary. The workflow failed in some countries because clustering buses without lines seemed to lead to errors inside pypsa that may be addressed. An issue has been added in pypsa.
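
A toy illustration of the two points above, using plain pandas rather than the actual clustering code: group buses by country and carrier so that AC and DC buses never end up in the same group, and return early on an empty dataframe. The column name `country` (the comment writes "countries"), the bus names and the helper `group_buses` are assumptions for this sketch.

```python
import pandas as pd


def group_buses(buses, by=("country", "carrier")):
    """Return {(country, carrier): [bus names]}; an empty input gives an empty dict."""
    if buses.empty:  # guard against countries with no buses
        return {}
    grouped = buses.groupby(list(by)).groups
    return {key: list(index) for key, index in grouped.items()}


buses = pd.DataFrame(
    {
        "country": ["NG", "NG", "NG", "BJ"],
        "carrier": ["AC", "AC", "DC", "AC"],
    },
    index=["b0", "b1", "b2", "b3"],  # hypothetical bus names
)

groups = group_buses(buses)
print(groups)
```

Each (country, carrier) pair yields one group, so the number of groups grows with the number of carriers present per country rather than with the number of buses.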

To achieve this image, the number of clusters used for the clustering is adjusted by country.
The default used was 5 clusters but that can be relatively high especially for small countries.
A dictionary has been added to track the number of clusters by country.
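
The per-country dictionary could look roughly like the following sketch. The default of 5 comes from the comment above; the dictionary name, the helper `n_clusters` and the override values are hypothetical.

```python
DEFAULT_CLUSTERS = 5  # default mentioned above

# Hypothetical overrides for small countries where 5 clusters is too many
CLUSTERS_BY_COUNTRY = {
    "GM": 2,
    "LS": 3,
}


def n_clusters(country):
    """Return the country-specific number of clusters, falling back to the default."""
    return CLUSTERS_BY_COUNTRY.get(country, DEFAULT_CLUSTERS)


print(n_clusters("GM"), n_clusters("NG"))
```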

@pz-max
Member

pz-max commented Jan 5, 2023

Update. The picture is becoming more green 🥇 .. White areas still need to be tested.
Screenshot from 2023-01-05 16-17-40

Legend:
green: network solved
dark green: at least add_electricity worked, but network was not solved
yellow: base_network was successful but simplify_network didn't
orange: build_shapes worked but base_network not
red: build_shapes couldn't succeed

@davide-f
Member Author

davide-f commented Jan 8, 2023

Latest updates.

I think this image cannot be released as of now: too much non-green in Asia.
Maybe the Africa + South America map could be released (upper image).

Priorities to be fixed:

  1. geometry shape, whose draft fix is at Fix geometry issues #532 . With the revised version and the option drop/set_to_country, PK, IN and CH should become at least dark green
  2. fix cluster network problem at Verify the effect of using/disregarding clustering by subnetworks in cluster_network #531 . With this most dark green areas should become clear.

After these fixes, I think the +Asia shape should be ok to be published.
In that case, we can also run Europe and North America

img

@pz-max pz-max mentioned this pull request Jan 14, 2023
@davide-f
Member Author

This PR will be heavily revised in line with the new developments on main.
A force push will update the code and a copy will be stored in my fork.

@davide-f davide-f reopened this Jan 25, 2023
@davide-f davide-f mentioned this pull request Jan 25, 2023
@davide-f davide-f marked this pull request as ready for review January 25, 2023 15:16
@davide-f davide-f merged commit fd02021 into pypsa-meets-earth:main Jan 26, 2023
@davide-f davide-f deleted the add_run_scenarios branch March 16, 2023 12:21

Successfully merging this pull request may close these issues.

Create a list in the documentation of countries with model status (runs, validated, year of validation)
3 participants