Lat/lon output error when processing CMCC-CESM2 #105

Closed
jdldeauna opened this issue Mar 10, 2021 · 17 comments

@jdldeauna

Hello! Thank you so much for this package, it helps a lot with processing CMIP6 datasets.

I'm encountering an issue when using the package with the model CMCC-CESM2. The package proceeds with no errors, but the output lat/lon is very different from the input. I'm attaching a notebook with test code.

Thanks!

@jdldeauna
Author

Hello! I was able to use the example from #93 for CMCC-ESM2, sorry for the oversight.

Following #104 , I'm trying to install an updated version of cmip6_preprocessing. However, using conda install -c conda-forge cmip6_preprocessing results in this error when I try to use it: UserWarning: No input dictionary entry for source_id: CMCC-ESM2.

Then trying to install from master pip install git+https://github.com/jbusecke/cmip6_preprocessing.git results in this error: RuntimeError: Failed to apply pre-processing function: combined_preprocessing. I'm not sure, but for this one it might be connected to broadcast_lonlat?

@jbusecke jbusecke reopened this Apr 7, 2021
@jbusecke
Owner

jbusecke commented Apr 7, 2021

Thank you very much for raising an issue, @jdldeauna.
Were you able to figure out a solution? Otherwise I think we should leave this issue open.

I think there are several things going on there. I'll try to unpack this a little:

  • The versioning is a bit of a mess (sorry for that). I really need to release a version so that a conda install gets you the newest code.

  • Could you provide the full code you are executing, as well as the full error message? This will help narrow down what is going wrong and let me give more useful advice on how to fix or work around the issues.

Following #104 , I'm trying to install an updated version of cmip6_preprocessing. However, using conda install -c conda-forge cmip6_preprocessing results in this error when I try to use it: UserWarning: No input dictionary entry for source_id: CMCC-ESM2.

I suspect you want to create an xgcm grid here? This does not work for this specific model, because no information on the grid staggering is available for it in this file. A bit of background: I decided to store this information in a file because determining the staggering (the shift between e.g. tracer and velocity fields) is not very fast. My hacky approach so far was to run a maintenance notebook to update that file. This is not a very elegant or robust way to do it, and down the line we could think about how to improve it. As a quick fix, you could provide your own dictionary to the functions you are using. If you provide the code, I'll work out a quick example of how to do that.

Regardless of this, I think the warning is not very helpful. We should try to incorporate some of this explanation into the warning, so that future users can provide their own input.
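A minimal sketch of what a user-supplied staggering dictionary and a more descriptive warning could look like. The nesting (source_id, then grid_label, then axis_shift) mirrors the yaml entry discussed later in this thread; `EXAMPLE-MODEL`, its shift values, and the helper `lookup_axis_shift` are hypothetical and are not part of cmip6_preprocessing.

```python
import warnings

# Hypothetical user-supplied staggering information.
# "EXAMPLE-MODEL" and its shift values are placeholders, not real grid metadata.
custom_grid_dict = {
    "EXAMPLE-MODEL": {"gn": {"axis_shift": {"X": "left", "Y": "left"}}},
}


def lookup_axis_shift(grid_dict, source_id, grid_label):
    """Return the axis_shift mapping for a model, or None with a descriptive warning."""
    try:
        return grid_dict[source_id][grid_label]["axis_shift"]
    except KeyError:
        # Explain what is missing and how the user can work around it.
        warnings.warn(
            f"No grid staggering information for source_id={source_id!r}, "
            f"grid_label={grid_label!r}. Without it, the shift between tracer "
            "and velocity points is unknown. You can pass your own dictionary "
            "with an 'axis_shift' entry for this model.",
            UserWarning,
        )
        return None


print(lookup_axis_shift(custom_grid_dict, "EXAMPLE-MODEL", "gn"))
```

Returning `None` instead of raising keeps the lookup usable in a loop over many models; a stricter variant could raise instead.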

Then trying to install from master pip install git+https://github.com/jbusecke/cmip6_preprocessing.git results in this error: RuntimeError: Failed to apply pre-processing function: combined_preprocessing. I'm not sure, but for this one it might be connected to broadcast_lonlat?

This sounds like a different (and unexpected) problem. Once I see the full code I will be able to dig deeper into what is going on here.

@jdldeauna
Author

jdldeauna commented Apr 8, 2021

Hi @jbusecke , thanks for the reply!

My notebook is such a mess! I'll do better next time. Basically, I was just trying to install the new version of cmip6_preprocessing that could do combined_preprocessing with CMCC-ESM2, and still play nicely with my other packages. I've discovered that installing packages in this exact order in a new virtual environment is what worked best:

  1. intake (including intake-xarray and intake-esm)
  2. gcsfs
  3. gdal
  4. fiona
  5. pip install git+https://github.com/jbusecke/cmip6_preprocessing.git
  6. xesmf
  7. xgcm

Somehow, if I installed in a different order, all sorts of different errors would come up. I think it's because whichever version of a dependency got installed first would then affect the packages installed afterwards. But yes, they are working well now (at least until I see a shiny new package, then maybe I would have to reconfigure haha).
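When install order changes which dependency versions end up in an environment, it can help to record what was actually installed. A small sketch using only the standard library; the distribution names below are taken from the list above and are assumptions about how each package is registered in the environment.

```python
from importlib import metadata

# Distribution names assumed from the install list above; adjust as needed.
PACKAGES = [
    "intake", "intake-xarray", "intake-esm", "gcsfs", "gdal",
    "fiona", "cmip6_preprocessing", "xesmf", "xgcm",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output between a working and a broken environment shows which versions differ.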

@jbusecke
Owner

jbusecke commented Apr 8, 2021

I am about to merge #104. Once that is done, the version should behave better with CMCC-ESM2 (but the lon values won't be sorted anymore). I'll close this for now, but please feel free to reopen the issue if you continue to have problems.

@jbusecke jbusecke closed this as completed Apr 8, 2021
@jdldeauna
Author

jdldeauna commented Apr 9, 2021

Hello, I'd like to reopen this issue. I was able to use cmip6_preprocessing on CMCC-ESM2, but then had issues with using xesmf and xgcm for regridding. I think it's because of the reasons you mentioned above. Is there a way I can help out with updating this file? I've tried looking up the staggered grid information for CMCC, and have reached out to them for more info. Thanks!

@jdldeauna
Author

@jbusecke Thanks for your guidance on this! I just wanted to clarify, so even if a model is not included in the yaml file (for example CMCC-ESM2), combined_preprocessing would still work?

@jbusecke
Owner

jbusecke commented May 20, 2021

I just wanted to clarify, so even if a model is not included in the yaml file (for example CMCC-ESM2), combined_preprocessing would still work?

Yes, that is correct. The yaml file is only used by the logic in cmip6_preprocessing.grids; that is the only place where it is read.

The logic in cmip6_preprocessing.preprocessing should be totally separate, and if that does not work I presume it would be a different problem.

I think that it is indeed crucial to improve the way the grid information is parsed. I have opened another issue (#122) to keep track of that problem in a more isolated way, but if I remember correctly, we were able to narrow down the issues in the notebook to other problems (related to xgcm/xgcm#328 and xgcm/xgcm#208)?

@jbusecke
Owner

Also, please always feel free to just reopen an issue here whenever you have further questions. I have reopened it now, just in case something is still unsolved.

@jbusecke jbusecke reopened this May 20, 2021
@jdldeauna
Author

Unfortunately, I can't reopen this issue.

Yes, we have sorted out the previous issues in working with xgcm. I see that including more grid-specific info has become a separate goal (#122). I'd still like to keep this issue open for now. Thanks!

@jbusecke
Owner

Oh, wow. I did not know that! That should definitely be changed, but until then it's noted.

Leaving this open. Have a great weekend.

@jdldeauna
Author

Hi Julius, I'm trying to use CMCC-ESM2 with xgcm and it would be easier if it had an entry on this yaml file. I'm exploring the grid shifts to try and update the yaml file, but I'm a little confused because it seems the velocity grids are not wholly shifted with respect to temperature. For example, in northerly latitudes, the uo grid has similar values to the thetao grid, but in the mid-latitudes they are less. Would that still be considered as shifted to the left along the x-axis? Also if you think this should be a new issue let me know, since this is related to this old issue I thought I would just continue this discussion.

@jbusecke
Owner

jbusecke commented Mar 7, 2022

Hey Dianne,

can you post some output for this model in question? Maybe some smaller examples for the regions you mentioned?

@jdldeauna
Author

Hi! I made a mistake: the uo and thetao latitudes are similar from the tropics to the subtropics. When I run the following:

import intake
import matplotlib.pyplot as plt
from cmip6_preprocessing.preprocessing import combined_preprocessing

cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6-noQC.json"
col = intake.open_esm_datastore(cat_url)
cat = col.search(table_id='Omon', experiment_id='historical', variable_id=['thetao', 'uo'],
                 grid_label='gn', member_id='r1i1p1f1', source_id='CMCC-ESM2')
# aggregate=False keeps one dataset per variable, keyed by its full store path
ds = cat.to_dataset_dict(
                    zarr_kwargs={'consolidated': True, 'decode_times': True, 'use_cftime': True},
                    preprocess=combined_preprocessing,
                    aggregate=False)

thetao = ds['CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.Omon.thetao.gn.gs://cmip6/CMIP6/CMIP/CMCC/CMCC-ESM2/historical/r1i1p1f1/Omon/thetao/gn/v20210114/.nan.20210114.good.none.none']
uo = ds['CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.Omon.uo.gn.gs://cmip6/CMIP6/CMIP/CMCC/CMCC-ESM2/historical/r1i1p1f1/Omon/uo/gn/v20210114/.nan.20210114.good.none.none']
plt.figure(figsize=(8,4))
uo.lat.plot()
plt.show()

[Figure: CMCC-ESM2 lat]

plt.figure(figsize=(8,4))
diff_lat = thetao.lat-uo.lat
diff_lat.plot(vmin=-0.005,vmax=0.005)
plt.show()

[Figure: CMCC-ESM2 grid]

@jbusecke
Owner

My conclusion from this would be that the u velocity is on the same y position as the tracer, which is expected from a C grid.
Could you plot the same thing for lon instead of lat?

Ultimately, this 'playing grid detective' is not a good solution. It really motivates me to push on the efforts to bring the full grid information of CMIP6 models to the cloud. That will give us proper knowledge of the individual grids.
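The comparison done by eye in the plots above can be sketched as a small function: take the velocity coordinate minus the tracer coordinate and classify the sign of the mean difference. This is an illustration under assumptions, not the detection logic of cmip6_preprocessing. The helper name, the tolerance (borrowed from the colour limits of the difference plots), the convention that a positive difference means "right", and the sample values are all hypothetical. It also ignores longitude wrap-around.

```python
def classify_shift(tracer_coord, velocity_coord, tol=0.005):
    """Classify the velocity position relative to the tracer along one axis.

    Returns "center" if the coordinates agree within `tol`, "right" if the
    velocity coordinate is larger on average, and "left" if it is smaller.
    """
    if len(tracer_coord) != len(velocity_coord) or not tracer_coord:
        raise ValueError("coordinates must be non-empty and of equal length")
    diffs = [v - t for t, v in zip(tracer_coord, velocity_coord)]
    mean_diff = sum(diffs) / len(diffs)
    if abs(mean_diff) <= tol:
        return "center"
    return "right" if mean_diff > 0 else "left"


# Made-up coordinate values, for illustration only.
tracer_lat = [-0.5, 0.5, 1.5]
velocity_lat = [-0.5, 0.5, 1.5]   # identical to the tracer latitudes
tracer_lon = [0.0, 1.0, 2.0]
velocity_lon = [0.5, 1.5, 2.5]    # half a cell further east

print(classify_shift(tracer_lat, velocity_lat))  # center
print(classify_shift(tracer_lon, velocity_lon))  # right
```

Using the mean over many points rather than a single grid cell makes the result less sensitive to local irregularities in a curvilinear grid.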

@jdldeauna
Author

The uo longitude is on the same x position as the tracer:

cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6-noQC.json"
col = intake.open_esm_datastore(cat_url)
cat = col.search(table_id='Omon',experiment_id='historical', variable_id=['thetao','uo'], 
                 grid_label='gn', member_id='r1i1p1f1',source_id='CMCC-ESM2')
ds = cat.to_dataset_dict(
                    zarr_kwargs={'consolidated':True, 'decode_times': True, 'use_cftime': True},
                    preprocess=combined_preprocessing,
                    aggregate=False)

thetao = ds['CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.Omon.thetao.gn.gs://cmip6/CMIP6/CMIP/CMCC/CMCC-ESM2/historical/r1i1p1f1/Omon/thetao/gn/v20210114/.nan.20210114.good.none.none']
uo = ds['CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.Omon.uo.gn.gs://cmip6/CMIP6/CMIP/CMCC/CMCC-ESM2/historical/r1i1p1f1/Omon/uo/gn/v20210114/.nan.20210114.good.none.none']
plt.figure(figsize=(8,4))
uo.lon.plot()
plt.show()

[Figure: lon_plot]

plt.figure(figsize=(8,4))
diff_lon = thetao.lon-uo.lon
diff_lon.plot(vmin=-0.005,vmax=0.005)
plt.show()

[Figure: diff_plot]

Ultimately, this 'playing grid detective' is not a good solution. It really motivates me to push on the efforts to bring the full grid information of CMIP6 models to the cloud. That will give us proper knowledge of the individual grids.

I think this is a great effort, and I hope all the grid info can be compiled :) I do wonder in the meantime, would it be useful to update this yaml file to include CMCC-ESM2? And if so, would this be an accurate entry (since uo lon is not shifted with respect to tracer)?

CMCC-ESM2:
    gn:
        axis_shift:
            Y: left

@jbusecke
Owner

If uo lon is not shifted, it would have to be X: center; if vo lat is not shifted, it's Y: center. You can always manually overwrite these when you create a grid object, so I think that would be your best choice for now.
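One way to picture the manual override is to turn an axis_shift mapping into the nested `coords` dictionary that `xgcm.Grid` accepts. The helper `build_xgcm_coords` and the dimension names (`x`, `y`, and suffixed variants such as `x_left`) are assumptions for illustration. The real dimension names depend on how the dataset was prepared.

```python
VALID_SHIFTS = ("center", "left", "right")


def build_xgcm_coords(axis_shift, dims=None):
    """Build an xgcm-style coords dict from an axis_shift mapping.

    `dims` maps each axis name to its cell-center dimension name. A shifted
    position gets a hypothetical dimension name of the form "<dim>_<shift>".
    """
    dims = dims or {"X": "x", "Y": "y"}
    coords = {}
    for axis, dim in dims.items():
        shift = axis_shift.get(axis, "center")
        if shift not in VALID_SHIFTS:
            raise ValueError(f"unknown shift {shift!r} for axis {axis!r}")
        positions = {"center": dim}
        if shift != "center":
            positions[shift] = f"{dim}_{shift}"
        coords[axis] = positions
    return coords


# Override using the values suggested above: no shift along either axis.
coords = build_xgcm_coords({"X": "center", "Y": "center"})
print(coords)  # {'X': {'center': 'x'}, 'Y': {'center': 'y'}}

# With xgcm installed and a dataset `ds` loaded, the result could be passed as:
# grid = xgcm.Grid(ds, coords=coords)
```

Keeping the override as a plain dictionary makes it easy to inspect before building the grid object.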

@jdldeauna
Author

Alright, thanks Julius!
