KeyError when cmorising EC-Earth3-Veg-LR AMIP run #733

Open
uwefladrich opened this issue Mar 14, 2022 · 14 comments

@uwefladrich

Hi,

I am getting this error when trying to cmorise an EC-Earth3-Veg-LR AMIP run:

Traceback (most recent call last):
    File "[...]/.conda/envs/ece2cmor/bin/ece2cmor", line 11, in <module>
        load_entry_point('ece2cmor3==1.8.1', 'console_scripts', 'ece2cmor')()
    File "[...]/.conda/envs/ece2cmor/lib/python2.7/site-packages/ece2cmor3-1.8.1-py2.7.egg/ece2cmor3/
        cdothreads=args.ncdo)
    File "[...]/.conda/envs/ece2cmor/lib/python2.7/site-packages/ece2cmor3-1.8.1-py2.7.egg/ece2cmor3/
        ifs2cmor.execute(ifs_tasks, nthreads=taskthreads)
    File "[...]/.conda/envs/ece2cmor/lib/python2.7/site-packages/ece2cmor3-1.8.1-py2.7.egg/ece2cmor3/
        pool.map(cmor_worker, proctasks)
    File "[...]/.conda/envs/ece2cmor/lib/python2.7/multiprocessing/pool.py", line 253, in map
        return self.map_async(func, iterable, chunksize).get()
    File "[...]/.conda/envs/ece2cmor/lib/python2.7/multiprocessing/pool.py", line 572, in get
        raise self._value
    KeyError: 'lat_bnds'

Immediately followed by this error in the ece2cmor log:

ERROR:ece2cmor3.ifs2cmor: CMOR failed to load table Amon, the following variable will be skipped: ts. Reason: Problem with 'cmor.load_table'. Please check the logfile (if defined).

When trying to cmorise another leg of the same experiment, I've seen the same error, but with another variable of the same table. Thus, I do not think it is related specifically to ts.

Any hints what this could be or what I can test?
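The traceback above ends inside multiprocessing/pool.py because Pool.map re-raises the worker's exception in the parent process, and on Python 2.7 the worker-side stack is lost along the way. A minimal sketch of one way to recover it is shown below. The wrapper name cmor_worker_with_traceback is hypothetical, and the cmor_worker here is only a stand-in that mimics the reported failure, not the real ece2cmor3 function.

```python
import traceback


def cmor_worker(task):
    # Stand-in for the real worker: it only mimics the failure reported above.
    raise KeyError("lat_bnds")


def cmor_worker_with_traceback(task):
    """Run cmor_worker and re-raise any failure with the worker-side stack attached."""
    try:
        return cmor_worker(task)
    except Exception:
        # format_exc() has to be called inside the except block, while the
        # worker's traceback is still available in this process.
        raise RuntimeError("worker failed:\n" + traceback.format_exc())


# Hypothetical usage in place of pool.map(cmor_worker, proctasks):
#     pool.map(cmor_worker_with_traceback, proctasks)
```

With such a wrapper, the message of the exception raised in the parent would contain the file and line in the worker where the KeyError originated.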

@goord
Collaborator

goord commented Mar 14, 2022

hmm seems to be an issue in the CMOR library, perhaps a mismatch with the table versions...

@goord goord self-assigned this Mar 16, 2022
@uwefladrich
Author

Hi @goord, is there anything I can try or test? Is it helpful to try and minimise the example to make it easier to reproduce? Or can I check table versions somehow?

@goord
Collaborator

goord commented Mar 21, 2022

Hi @uwefladrich yes sorry, there is something you can do: (i) post the version of ece2cmor3 and make sure the cmor-tables are up-to-date and (ii) run sequentially only the Amon-variables and post the full log output here. I will try to reproduce it this evening.

@uwefladrich
Author

(i) ece2cmor v1.8.1. I updated the git submodules recursively, but the tables stayed the same, so I assume they are up-to-date.

(ii) l610-ifs-005-20220321151854.log (the *.cmor.log file is empty)

@goord
Collaborator

goord commented Mar 21, 2022

Strange. In your log you posted, it is the table day that gives an error when being loaded, so there is some randomness in the loading failures. It is also remarkable that the log file is empty, while the message from cmor clearly says 'check the log file'.

@goord
Collaborator

goord commented Mar 21, 2022

@uwefladrich could you change line 69 in ece2cmorlib.py from

cmor.setup(table_dir, cmor_mode, logfile=logname, create_subdirectories=(1 if create_subdirs else 0))

to

cmor.setup(table_dir, cmor_mode, logfile=None, create_subdirectories=(1 if create_subdirs else 0))

and then run the cmorization without specifying a log file, maybe more information will be sent to stderr?

@uwefladrich
Author

Strange. In your log you posted, it is the table day that gives an error when being loaded, [...]

I realise that I created an Amon-only varlist file but did not use it in the test run. So I will have to repeat it, but haven't had the time today... I will also try your other suggestion.

@uwefladrich
Author

I made a few more tests. First of all, I tried the log file changes, but it only got me the same messages on stderr instead of the log file.

Then I ran a couple of tests to isolate the table that causes the issue, and tracked it down to fx. If I remove ifs/fx from the varlist, everything works fine.

Note that the error reported in the logs (about the Amon table) seems to be misleading. Not only is there no problem with Amon if I remove fx, but also if I have only fx in the varlist, the run crashes with the KeyError without an error in the log file.

So with fx being a likely candidate for problems, this leads me to think that it could have something to do with resolution. This is a cmorisation of EC-Earth3-Veg-LR; has the LR variant had issues with the fx cmorisation?
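The table-by-table isolation described in this comment can be sketched as follows. This is only an illustration: it assumes the varlist is a JSON-like mapping of component to table to variable names (inferred from "ifs/fx" in this thread), and the helper names split_varlist_by_table and drop_table are hypothetical, not part of ece2cmor3.

```python
import copy


def split_varlist_by_table(varlist, component="ifs"):
    """Return one single-table varlist per table of `component`."""
    tables = varlist.get(component, {})
    return {
        table: {component: {table: list(variables)}}
        for table, variables in tables.items()
    }


def drop_table(varlist, component, table):
    """Return a copy of `varlist` without `table` under `component`."""
    reduced = copy.deepcopy(varlist)
    reduced.get(component, {}).pop(table, None)
    return reduced


# Assumed layout, using only tables and variables named in this thread.
example = {"ifs": {"Amon": ["ts"], "fx": ["sftlf", "areacella", "orog"]}}

only_fx = split_varlist_by_table(example)["fx"]   # varlist with fx alone
without_fx = drop_table(example, "ifs", "fx")     # varlist with fx removed
```

Each resulting dictionary could be written out with json.dump and used as the varlist for a separate sequential run, so that a crash points at a single table.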

@treerink
Collaborator

For instance ece2cmor3/resources/b2share-data/fx-sftlf-EC-Earth3-T159.nc is used.

Do you have the error for all of the fx? If not, do you know which one of the fx causes the problem?

@uwefladrich
Author

It is fx/sftlf. The other two fx variables (areacella and orog) do not trigger the error.

@goord
Collaborator

goord commented Mar 24, 2022

Thanks for tracking this down @uwefladrich. sftlf is a special variable that requires downloading a file from b2share (there is a download_sftlf function in ifs2cmor.py). Maybe the function hangs on the download, which somehow causes the cmor library to report a failure to load a table (speculating here). Could you try to debug on your system by inserting some print messages in download_sftlf to see whether the download is needed, whether it is successful, etc.?

The actual download is done on line 1012, cmor_utils.get_from_b2share(fname, fullpath).
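A minimal sketch of the kind of tracing meant here, assuming only that the download is a callable taking (fname, fullpath) as in the call quoted above. The helpers file_state and traced_download are hypothetical names, and what get_from_b2share returns is not stated in this thread, so the return value is simply passed through.

```python
import logging
import os

log = logging.getLogger("sftlf-debug")


def file_state(path):
    """Describe whether `path` exists and how large it is."""
    if not os.path.isfile(path):
        return "missing"
    return "present, %d bytes" % os.path.getsize(path)


def traced_download(download, fname, fullpath):
    """Call download(fname, fullpath) and log the target file's state around it."""
    log.info("before download: %s is %s", fullpath, file_state(fullpath))
    try:
        result = download(fname, fullpath)
    except Exception:
        # Log the full stack of the failed download, then let it propagate.
        log.exception("download of %s raised", fname)
        raise
    log.info("after download: %s is %s", fullpath, file_state(fullpath))
    return result
```

For a quick test one could call logging.basicConfig(level=logging.INFO) and temporarily replace the download call with traced_download(cmor_utils.get_from_b2share, fname, fullpath). A "before" line that never gets a matching "after" line would point at a hanging download.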

@treerink
Collaborator

Usually on an HPC platform I would recommend (at installation) to run from your ece2cmor3 root directory:
./download-b2share-dataset.sh ${HOME}/cmorize/ece2cmor3/ece2cmor3/resources/b2share-data
which makes sure all b2share files are downloaded. If the download is the problem, this might solve it.

@uwefladrich
Author

I re-initiated the download manually, but it didn't get more/new files, so the problem remains. In particular, fx-sftlf-EC-Earth3-T159.nc is not changed.

@goord
Collaborator

goord commented Mar 24, 2022

@uwefladrich can you put a month or year of data on an FTP server together with the varlist and metadata json files? If it is not a networking problem, it should be reproducible on our HPC (or KNMI's).
