Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grib support #394

Open
aasdelat opened this issue May 17, 2024 · 11 comments
Open

Grib support #394

aasdelat opened this issue May 17, 2024 · 11 comments
Labels

Comments

@aasdelat
Copy link
Contributor

aasdelat commented May 17, 2024

Does YAXArrays support grib files?. If so, how can I load one? If not, could this feature be added?
Thanks a lot.

@aasdelat aasdelat changed the title Grig support Grib support May 17, 2024
@Balinus
Copy link
Contributor

Balinus commented May 17, 2024

Not a solution, but I very much dislike grib files, so I usually convert them to netcdf or zarr files. Even xarray has in my experience limited support for grib files (it doesn't work with 50% of the files I'd like to read).

https://www.ncl.ucar.edu/Applications/griball.shtml
https://confluence.ecmwf.int/display/LDAS/Converting+from+grib+to+netCDF+with+cdo

@felixcremer
Copy link
Member

YAXArrays.jl itself does not support grib files but Rasters does. And from a Raster object you can then convert that back into a YAXArray object if you want to use the YAXArrays machinery.

using Rasters, GRIBDatasets
using YAXArrays

r = Raster("path/to/gribfile", lazy=true)
yax = YAXArray(dims(r), parent(r))

This is a bit of a stop gap solution because you would loose the metadata of your file on that but it might help you in starting your analysis.

@aasdelat
Copy link
Contributor Author

aasdelat commented May 20, 2024

Thanks, both of you. I have tested @felixcremer s idea, but got a weird structure for the data. So I've also tested @Balinus s idea, but It seems that YAXArrays does not support the structure of my data.
Here is the structure of my data after converted into NetCDF, obtained from the head of ncdump:

dimensions:
        values = 338880 ;
        valid_time = 744 ;
variables:
        int64 valid_time(valid_time) ;
                valid_time:units = "seconds since 1970-01-01T00:00:00" ;
                valid_time:calendar = "proleptic_gregorian" ;
                valid_time:long_name = "time" ;
                valid_time:standard_name = "time" ;
        double longitude(values) ;
                longitude:units = "degrees_east" ;
                longitude:long_name = "longitude" ;
                longitude:standard_name = "longitude" ;
        double latitude(values) ;
                latitude:units = "degrees_north" ;
                latitude:long_name = "latitude" ;
                latitude:standard_name = "latitude" ;
        double msl(valid_time, values) ;
                msl:name = "Mean sea level pressure" ;
                msl:units = "Pa" ;

This is what I get when I try to load the file with YAXArrays:

julia> ds = Cube("nc_file.nc")
ERROR: ArgumentError: invalid index: nothing of type Nothing

Any ideas?

@felixcremer
Copy link
Member

felixcremer commented May 20, 2024

Could you please show the full stacktrace of your error?

@felixcremer
Copy link
Member

Could you please show the full stacktrace of you error?

You can also try the open_dataset function to open the file as a dataset and select the relevant data from there.

@aasdelat
Copy link
Contributor Author

I have tried open_dataset and it works!
I do not know the difference between Cube() and open_dataset().
But I need to open a lot of files into a singe dataset, How can I do this?
Anyway, I copy and paste the output error of Cube():

ds = Cube("nc_file.nc")
ERROR: ArgumentError: invalid index: nothing of type Nothing
Stacktrace:
  [1] to_index(i::Nothing)
    @ Base ./indices.jl:300
  [2] to_index(A::Vector{Any}, i::Nothing)
    @ Base ./indices.jl:277
  [3] _to_indices1(A::Vector{Any}, inds::Tuple{Base.OneTo{Int64}}, I1::Nothing)
    @ Base ./indices.jl:359
  [4] to_indices
    @ ./indices.jl:354 [inlined]
  [5] to_indices
    @ ./indices.jl:345 [inlined]
  [6] getindex
    @ ./abstractarray.jl:1291 [inlined]
  [7] concatenatecubes(cl::Vector{Any}, cataxis::Dim{:Variable, Vector{String}})
    @ YAXArrays.Cubes ~/.julia/packages/YAXArrays/jdA1f/src/Cubes/TransformedCubes.jl:39
  [8] Cube(ds::Dataset; joinname::String, target_type::Nothing)
    @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/jdA1f/src/DatasetAPI/Datasets.jl:382
  [9] Cube
    @ ~/.julia/packages/YAXArrays/jdA1f/src/DatasetAPI/Datasets.jl:341 [inlined]
 [10] Cube(s::String)
    @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/jdA1f/src/DatasetAPI/Datasets.jl:815
 [11] top-level scope
    @ REPL[4]:1

@danlooo
Copy link
Member

danlooo commented May 21, 2024

A Dataset is just a dictionary of YAXArrays. These arrays may have shared axes, e.g. longitude and latitude for arrays temperature and precipitation. These arrays may also be completely unrelated. The type of the cell elements may also be different.
A Cube is just one YAXArray in which all variables from the dataset are squeezed together. If temperature and precipitation do not share all axes e.g. they are in a different grid, one can not make a Cube out of it.
Maybe you're files have some different axes so you just use a Dataset instead of a Cube.

@aasdelat
Copy link
Contributor Author

Wow!, I understand now.
I have two shared axes: "values" and "valid_time" and three YAXArrays: "longitude", "latitude" and "msl".
So it is a data set, but not a Cube.
On the other hand, I have tried to read it as a Cube by adding the following attribute to the "msl" array:
msl:coordinates = "longitude latitude" ;
but with no success.
So, I think I have to read it as a dataset.
But, Can I read a set of files as a single dataset (as can be made by Python's xarray.open_mfdataset)?

@lazarusA
Copy link
Collaborator

That's what 'open_dataset' will do also here. Similar to https://juliadatacubes.github.io/YAXArrays.jl/dev/UserGuide/openZarr.

@felixcremer
Copy link
Member

Could you share two of your files, so that I can debug this. We also have an open_mfdataset function but it is a bit buggy.

@aasdelat
Copy link
Contributor Author

aasdelat commented May 21, 2024

Here are two of them. The first one has the
msl:coordinates = "longitude latitude" ;
attribute on the msl array and the second one, not.
They are in a zip file 580 Mb, so I send them by wetransfer:
https://we.tl/t-v5IqMyWDG8

@lazarusA lazarusA added the todo label Sep 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants