Add run scenarios #543

davide-f · 2022-12-25T16:57:44Z

Closes #503

Changes proposed in this Pull Request

This branch adds rules and scripts to iteratively test the workflow on a list of countries.
For each country, the workflow is executed and general statistics are collected.
Intermediate results are also stored.
This helps track the overall status of the workflow across the globe and collects statistics that can be useful for validation.
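
As a rough illustration of the idea (not the code in this branch, which relies on workflow rules and scripts), the per-country loop could be sketched as below. The names `run_countries` and `fake_workflow`, the country code "XX" and the `n_buses` statistic are all hypothetical placeholders.

```python
def run_countries(countries, run_workflow):
    """Run the workflow once per country and collect a status record for each."""
    results = {}
    for country in countries:
        try:
            stats = run_workflow(country)
            results[country] = {"success": True, "stats": stats}
        except Exception as err:
            # Keep going so that one failing country does not stop the sweep
            results[country] = {"success": False, "error": str(err)}
    return results


def fake_workflow(country):
    # Placeholder for the real workflow; fails for a made-up code
    # to exercise the error path
    if country == "XX":
        raise ValueError("no data available")
    return {"n_buses": 5}


results = run_countries(["NG", "XX"], fake_workflow)
```

Catching the exception per country is the design choice that matters here: the sweep records the failure and moves on, so the final table covers every country in the list.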

Suggestions are welcome, especially on the statistics to include.

Checklist

I tested my contribution locally and it seems to work fine.
Code and workflow changes are sufficiently documented.
Newly introduced dependencies are added to envs/environment.yaml and envs/environment.docs.yaml.
Changes in configuration options are added in all of config.default.yaml and config.tutorial.yaml.
Add a test config or line additions to test/ (note tests are changing the config.tutorial.yaml)
Changes in configuration options are also documented in doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
A note for the release notes doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.

ekatef · 2022-12-25T17:35:04Z

A technical suggestion: would it be possible to keep all the country names for which the GADM_ID codes contain non-standard values, that is, where the letters of GADM_ID do not correspond to the country code?

It would be very helpful to understand how often fine-tuning may be needed when working with administrative data globally, and how far hard-coding could be a solution for non-standard cases :)

davide-f · 2022-12-25T18:44:43Z

> A technical suggestion: would it be possible to keep all the country names for which the GADM_ID codes contain non-standard values, that is, where the letters of GADM_ID do not correspond to the country code?
>
> It would be very helpful to understand how often fine-tuning may be needed when working with administrative data globally, and how far hard-coding could be a solution for non-standard cases :)

Yeah :) that can absolutely be done. Do you have some lines already developed to include that?
I'd just need a line that performs the check, including a fail-safe check in case the datatype does not match.
I was wondering if you may like a test on the time series of renewables. Do you think it is needed?
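
A minimal sketch of such a check, assuming that a GADM_ID starts with a run of letters (for example "NG.1_1") that should equal the country code. The ID format, the helper name `gadm_prefix_matches` and the sample records are assumptions made for illustration only.

```python
import re


def gadm_prefix_matches(gadm_id, country_code):
    """Return True if the leading letters of gadm_id equal country_code."""
    # Fail-safe: NaN, None or numeric values never match instead of raising
    if not isinstance(gadm_id, str) or not isinstance(country_code, str):
        return False
    match = re.match(r"[A-Za-z]+", gadm_id)
    return match is not None and match.group(0) == country_code


# Made-up (country code, GADM_ID) pairs, including a wrong datatype
records = [("NG", "NG.1_1"), ("MA", "ESH.1_1"), ("KE", float("nan"))]
non_standard = [c for c, gid in records if not gadm_prefix_matches(gid, c)]
```

With a dataframe, the same helper could be applied row by row to build the list of countries to keep for later inspection.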

ekatef · 2022-12-26T18:42:04Z

> Yeah :) that can absolutely be done. Do you have some lines already developed to include that?
> I'd just need a line that performs the check, including a fail-safe check in case the datatype does not match.

Super :)
I'd suggest something like https://github.com/ekatef/pypsa-earth/blob/019a49bc95087d3f47fefd6fd8dc155f637cdd91/scripts/build_shapes.py#L145-L175

I have tried here to make use of the pandas DataFrame to_csv() method with two approaches:

one utilising a pre-defined dataframe structure (and thus hopefully safer!),
another one writing a filtered-out chunk of the original dataframe (which would be more useful for analysis).

I agree that we may need an additional safety check to ensure that the workflow is not stopped by I/O errors. But I don't have any particular ideas on this (except catching a non-specific exception...) and would be very interested in yours :)
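
One possible shape for such a safety check, sketched under the assumption that a failed write should only be logged. It catches `OSError` (the base class of file-system errors in Python) rather than a fully non-specific exception. The helper name `safe_to_csv` and the sample columns are hypothetical.

```python
import logging
import os
import tempfile

import pandas as pd

logger = logging.getLogger(__name__)


def safe_to_csv(df, path):
    """Write df to path; log and return False instead of raising on I/O errors."""
    try:
        df.to_csv(path, index=False)
        return True
    except OSError as err:
        logger.warning("Could not write statistics to %s: %s", path, err)
        return False


# Made-up rows standing in for the filtered chunk of the original dataframe
df = pd.DataFrame({"country": ["MA"], "GADM_ID": ["ESH.1_1"]})

with tempfile.TemporaryDirectory() as tmp:
    ok = safe_to_csv(df, os.path.join(tmp, "stats.csv"))
    # The target directory does not exist, so the write fails but does not raise
    bad = safe_to_csv(df, os.path.join(tmp, "missing_dir", "stats.csv"))
```

Returning a boolean lets the calling script decide whether a missing statistics file matters, without interrupting the rest of the workflow.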

ekatef · 2022-12-26T19:08:33Z

> I was wondering if you may like a test on the time series of renewables. Do you think it is needed?

Absolutely! I'd be more than interested in any kind of validation procedures for renewables. I see here two options:

we may try to validate the wind/solar potential calculated for (known) locations of the wind/PV generation against the known generation time series in these places;
we can extract climate variables for the locations of the meteorological stations and compare them against the observations and introduce the bias corrections.

Regarding the first point, I'm afraid we are somewhat constrained by RES data availability for 2013.

As for the second approach, in my opinion it's the more "correct" way, and data availability looks much better there. There are great global-scale meteorological datasets like GHCN or ISD. I think it would be very useful to be able to integrate them into our workflow, but it may not be trivial at all :)

Probably, as an initial test, we could take a couple of dozen locations of meteorological stations (e.g. ones that are representative of the more or less global picture, or just those with the highest data quality) and extract the wind/solar potential for the buses to which these stations belong? The resulting array could be supplemented with meteorological observation data and used for numerical exercises to obtain a big picture, at least in an initial version.

What do you think?
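
To make the second option concrete, here is a small sketch of comparing a modelled series against station observations and applying a simple mean-bias correction. The numbers are synthetic, the function names are hypothetical, and mean-bias removal is only one of many possible correction methods.

```python
import math


def bias_and_rmse(modelled, observed):
    """Return the mean bias and RMSE of modelled values against observations."""
    if len(modelled) != len(observed):
        raise ValueError("series must have the same length")
    diffs = [m - o for m, o in zip(modelled, observed)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


def remove_mean_bias(modelled, bias):
    """Shift the modelled series so that its mean bias becomes zero."""
    return [m - bias for m in modelled]


# Synthetic wind speeds in m/s, for illustration only
modelled = [5.0, 7.0, 9.0]
observed = [4.0, 6.0, 8.0]
bias, rmse = bias_and_rmse(modelled, observed)
corrected = remove_mean_bias(modelled, bias)
```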

davide-f · 2022-12-28T23:38:09Z

So, I kind of implemented everything on the wish list, at least in draft form:

build_shapes: please, check the function collect_shape_stats for the statistics that are currently implemented
build_renewable_profiles: please, check the function collect_renewable_stats

They are meant to be simple statistics to check the functionality of the workflow, yet they can be used for validation purposes and are easy to expand to include additional features.

I think this PR is ready for preliminary review and comparison.
Moreover, we should discuss whether to include this in the codebase or not.

@pz-max and @ekatef , will you be available tomorrow for the workflow meeting?

ekatef · 2022-12-29T12:48:58Z

Fantastic! Thank you so much :)
I think both collect_shape_stats and collect_renewable_stats look great and we'll be able to use their results very well.

Will be happy to discuss details during today's workflow meeting.

pz-max · 2022-12-29T13:10:08Z

Yes, let's talk today

davide-f · 2022-12-30T22:13:09Z

A run :)
Some work is still needed.
Legend:

green: network solved
dark green: at least add_electricity worked, but network was not solved
yellow: base_network was successful but simplify_network didn't
orange: build_shapes worked but base_network not
red: build_shapes couldn't succeed
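
The legend can be read as a mapping from the set of rules that succeeded for a country to a colour. The sketch below is one possible reading; the function name `status_color` and the use of "solve_network" as the marker for "network solved" are assumptions. The legend does not say how to colour a country where simplify_network succeeded but add_electricity failed, so that case is returned as "unclassified".

```python
def status_color(succeeded):
    """Map the set of rules that succeeded for a country to a legend colour."""
    if "solve_network" in succeeded:
        return "green"
    if "add_electricity" in succeeded:
        return "dark green"
    if "base_network" in succeeded and "simplify_network" not in succeeded:
        return "yellow"
    if "build_shapes" in succeeded and "base_network" not in succeeded:
        return "orange"
    if "build_shapes" not in succeeded:
        return "red"
    # Case not covered by the legend
    return "unclassified"


colors = {
    "nothing worked": status_color(set()),
    "shapes only": status_color({"build_shapes"}),
    "up to base_network": status_color({"build_shapes", "base_network"}),
}
```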

[Update 3-1-2023]

pz-max · 2022-12-30T22:41:51Z

Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

davide-f · 2022-12-30T23:45:32Z

> Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

I'm working on it and we can improve it; it was a nice mapping and testing exercise. I think most of the yellow can be easily fixed.
A lot will improve once PyPSA/powerplantmatching#96 is merged and included in the conda package.
Currently that is running again.

I believe most of the yellow is because of that.
Moreover, some countries will most likely fail, especially the small ones, because of missing data or not enough buses.

Once we have more green for Africa, I'll run other countries as well.

ekatef · 2022-12-30T23:45:57Z

A nice and very insightful plot!

Would it perhaps make sense to switch green and dark green to make the color coding more intuitive? (or perhaps replace dark green with something like gold or yellowgreen?..)

Results are a bit surprising: my a priori feeling was that the model works perfectly in 80% of cases and tends to cause trouble only in really complicated cases :) Is the yellow perhaps an effect of #531? And does red mean some uncertainties in regional borders?

I agree that the visualisation works perfectly to highlight the modeling status and that it's worth including in the regular workflow. Apart from this practical meaning, it's just beautiful :)

davide-f · 2022-12-30T23:49:13Z

> A nice and very insightful plot!
>
> Would it perhaps make sense to switch green and dark green to make the color coding more intuitive? (or perhaps replace dark green with something like gold or yellowgreen?..)
>
> Results are a bit surprising: my a priori feeling was that the model works perfectly in 80% of cases and tends to cause trouble only in really complicated cases :) Is the yellow perhaps an effect of #531? And does red mean some uncertainties in regional borders?
>
> I agree that the visualisation works perfectly to highlight the modeling status and that it's worth including in the regular workflow. Apart from this practical meaning, it's just beautiful :)

Green/dark green is currently not a problem, though I agree.
Before, I was using blue.
The major issue now is that we need some minor fixes, and this mapping is actually something we really needed.
There was a problem with powerplantmatching that I'm investigating, and I have proposed a fix.
The new map should be more realistic and closer to the Africa paper as well.

davide-f · 2022-12-30T23:52:40Z

> Nice map @davide-f . Is this figure created by the workflow? (We should regularly check the situation, e.g. on a weekly basis, in the near future.) It really shows that pypsa-earth is not yet stable enough to run everywhere.

The picture has been created by a notebook.
Maybe it is more suitable for the documentation package.

davide-f · 2023-01-02T17:20:46Z

Image above updated

ekatef · 2023-01-02T17:48:42Z

> Image above updated

Wow 🤩 Impressive progress!

davide-f · 2023-01-03T20:40:00Z

Latest changes are updated above.
The image now looks pretty good. Some bug fixes were needed in powerplantmatching/atlite and pypsa-earth.
That is the result.

For the purpose of this PR, no more bug fixing is expected, but I'll try to run the same code on regions beyond Africa.

The major needs for improving the current picture are:

adjust the clustering when DC buses are included. Probably, the best way would be to do the groupby by ["countries", "carrier"]; that would keep the carriers consistent by bus without making the number of clusters explode; the effects of sub_networks should be tested anyway.
(second priority) handle empty dataframes where necessary. The workflow failed in some countries because the clustering process applied to buses without lines seemed to lead to errors inside pypsa that may be addressed. An issue has been added in pypsa.
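
The grouping mentioned in the first item can be illustrated on a toy table of buses. This is only a sketch of the groupby itself, not of the clustering code; the column names "country" and "carrier", the bus names and the values are assumptions.

```python
import pandas as pd

# Toy bus table; names and values are made up for illustration
buses = pd.DataFrame(
    {
        "country": ["NG", "NG", "NG", "BJ", "BJ"],
        "carrier": ["AC", "AC", "DC", "AC", "DC"],
    },
    index=["b0", "b1", "b2", "b3", "b4"],
)

grouped = buses.groupby(["country", "carrier"])

# Number of buses in each (country, carrier) group
group_sizes = grouped.size()

# One integer label per (country, carrier) pair, aligned with buses.index
group_id = grouped.ngroup()
```

Each (country, carrier) pair then forms its own group, so AC and DC buses of the same country are never merged, while the number of groups grows only with the number of carriers present in each country.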

To achieve this image, the number of clusters used for the clustering is adjusted by country.
The default used was 5 clusters but that can be relatively high especially for small countries.
A dictionary has been added to track the number of clusters by country.
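
A minimal sketch of such a lookup, assuming a plain dictionary with a fallback to the default of 5 clusters. The country codes and cluster numbers below are illustrative only and are not taken from this PR.

```python
DEFAULT_CLUSTERS = 5

# Illustrative overrides for small countries (values are made up)
clusters_by_country = {"GM": 2, "LS": 3}


def n_clusters(country):
    """Return the number of clusters for a country, falling back to the default."""
    return clusters_by_country.get(country, DEFAULT_CLUSTERS)


small = n_clusters("GM")
large = n_clusters("NG")
```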

pz-max · 2023-01-05T16:19:05Z

Update. The picture is becoming more green 🥇 .. White areas still need to be tested

Legend:
green: network solved
dark green: at least add_electricity worked, but network was not solved
yellow: base_network was successful but simplify_network didn't
orange: build_shapes worked but base_network not
red: build_shapes couldn't succeed

davide-f · 2023-01-08T15:54:29Z

Latest updates.

I think this image cannot be released as of now: too much non-green in Asia.
Maybe the Africa + South America one (upper image) can be released.

Priorities to be fixed:

geometry shape, whose draft fix is at Fix geometry issues #532. With the revised version and the option drop/set_to_country, PK, IN and CH should become at least dark green.
fix the cluster network problem at Verify the effect of using/disregarding clustering by subnetworks in cluster_network #531. With this, most dark green areas should become clear.

After these fixes, I think the +Asia shape should be OK to publish.
In that case, we can also run Europe and North America.

davide-f · 2023-01-24T09:21:33Z

This PR will be heavily revised in line with the new developments on main.
A force push will update the code and a copy will be stored in my fork.

davide-f force-pushed the add_run_scenarios branch from ef41896 to f15504c Compare January 5, 2023 19:44

pz-max mentioned this pull request Jan 14, 2023

add earth-osm package, remove esy-osm #547

Merged

9 tasks

davide-f added 2 commits January 25, 2023 10:53

Revise test config style

fba027a

Revise test_config to enhance script capabilities

35122a3

davide-f closed this Jan 25, 2023

davide-f force-pushed the add_run_scenarios branch from 6e3fd1d to a3dac09 Compare January 25, 2023 10:07

davide-f added 2 commits January 25, 2023 11:14

Add multi-scenario management

de6efc1

Finalize rule

9f28eeb

davide-f reopened this Jan 25, 2023

davide-f added 2 commits January 25, 2023 11:38

Test custom workflow with multi scenario management

4a7f81b

Minor revision and bug fixing

09e078b

davide-f mentioned this pull request Jan 25, 2023

Create statistics #579

Merged

7 tasks

davide-f marked this pull request as ready for review January 25, 2023 15:16

Finalize revision

92737a1

pz-max requested changes Jan 26, 2023

Resolved review comments: 1 on scripts/build_test_configs.py, 5 on Snakefile.

davide-f added 3 commits January 26, 2023 14:07

Refuse

ba48924

Get stem from path in run_all_scenarios

2fa12b8

Change config file for custom scenario analysis

399820d

pz-max approved these changes Jan 26, 2023

davide-f added 3 commits January 26, 2023 14:41

Revise default test config depending on tutorial status

7da5708

Fix typo

82a44ee

Fix backup of config file in scenario managemetn

38eb81b

davide-f merged commit fd02021 into pypsa-meets-earth:main Jan 26, 2023

davide-f deleted the add_run_scenarios branch March 16, 2023 12:21
