Indexing does not work for Disk-based stores #416

Open
ConnectedSystems opened this issue Jul 24, 2024 · 3 comments
Labels
bug Something isn't working

Comments

ConnectedSystems commented Jul 24, 2024

I'm running the same code on two separate computers. After an update, most approaches to indexing stopped working on one computer, but not the other.

I've narrowed it down to YAXArray datasets that are disk-based.
To reiterate, the failure occurs on only one of the two computers.

The odd part is that `status YAXArrays` reports the same version on both machines: v0.5.8

# ]add NetCDF YAXArrays

using NetCDF
using YAXArrays


axlist = (
    Dim{:v1}(range(1, 3, length=3)),
    Dim{:v2}(["x1", "x2", "x3"])
)
test_arr = YAXArray(axlist, rand(3,3))

# All of these work
test_arr[v1=BitVector([true, false, true])]
test_arr[v1=[1, 3]]

test_arr[v2=BitVector([true, false, true])]
test_arr[v2=[1, 3]]

test_arr[v1=At([1, 2])]
test_arr[v2=At(["x1", "x2"])]

test_arr[v1=1:2]
test_arr[v2=2:3]

savecube(test_arr, "test_cube.nc", driver=:netcdf)

# Errors out because `test_arr` is not a dataset, but maybe it should be
# saved as a dataset with a single entry?
# savedataset(test_arr, path="test_dataset.nc", driver=:netcdf)

# Open file as disk-based store
ds = open_dataset("test_cube.nc")
disk_arr = ds.layer

# None of these work
disk_arr[v1=BitVector([true, false, true])]
disk_arr[v1=[1, 3]]
disk_arr[v1=At([1, 2])]
disk_arr[v2=At(["x1", "x2"])]

# But ranges do work for some reason?
disk_arr[v1=1:2]
disk_arr[v2=2:3]

The error is:

ERROR: ArgumentError: Unable to determine chunksize of non-range views.
Stacktrace:
  [1] eachchunk_view(::DiskArrays.Chunked{…}, vv::SubArray{…})
    @ DiskArrays C:\Users\tiwanaga\.julia\packages\DiskArrays\MpOpv\src\subarray.jl:29
  [2] eachchunk
    @ C:\Users\tiwanaga\.julia\packages\DiskArrays\MpOpv\src\subarray.jl:25 [inlined]
  [3] YAXArray
    @ C:\Users\tiwanaga\.julia\packages\YAXArrays\zyFvF\src\Cubes\Cubes.jl:136 [inlined]
  [4] rebuild(A::YAXArray{…}, data::DiskArrays.SubDiskArray{…}, dims::Tuple{…}, refdims::Tuple{}, name::DimensionalData.NoName, metadata::Dict{…})
    @ YAXArrays.Cubes C:\Users\tiwanaga\.julia\packages\YAXArrays\zyFvF\src\Cubes\Cubes.jl:200
  [5] rebuild
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\array.jl:85 [inlined]
  [6] rebuildsliced
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\array.jl:100 [inlined]
  [7] rebuildsliced
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\array.jl:99 [inlined]
  [8] view
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\indexing.jl:125 [inlined]
  [9] _dim_view
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\indexing.jl:110 [inlined]
 [10] #view#110
    @ C:\Users\tiwanaga\.julia\packages\DimensionalData\BZbYQ\src\array\indexing.jl:81 [inlined]
 [11] getindex(::YAXArray{Float64, 2, YAXArrayBase.NetCDFVariable{…}, Tuple{…}, Dict{…}}; kwargs::@Kwargs{v1::BitVector})
    @ YAXArrays.Cubes C:\Users\tiwanaga\.julia\packages\YAXArrays\zyFvF\src\Cubes\Cubes.jl:487
 [12] top-level scope
    @ c:\Users\tiwanaga\projects\ADRIA.jl\sandbox\yaxarray_issue\main.jl:30
Some type information was truncated. Use `show(err)` to see complete types.
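
A possible workaround while the disk-based path is broken, sketched under assumptions: the cube is small enough to fit in memory, and `readcubedata` behaves on this setup as it does elsewhere in YAXArrays. It has not been verified against DiskArrays.jl v0.4.3. The idea is to load the data first so that non-range selections are applied to an in-memory array, which is the case that works in the example above.

```julia
using NetCDF
using YAXArrays

# Assumes "test_cube.nc" was written by the `savecube` call in the example above.
ds = open_dataset("test_cube.nc")
disk_arr = ds.layer

# Load the whole cube into memory (only sensible for data that fits in RAM).
mem_arr = readcubedata(disk_arr)

# Non-range selections now operate on an in-memory array.
mem_arr[v1=[1, 3]]
mem_arr[v2=At(["x1", "x2"])]
```

This gives up lazy access, so it is a stopgap rather than a fix.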

ConnectedSystems commented Jul 24, 2024

I suspect the cause is the DiskArrays.jl dependency.

On both machines YAXArrays.jl is at v0.5.8 but:

  • On the working machine, DiskArrays.jl is at the older v0.3.23, with upgrades being blocked by something
  • On the non-working machine, DiskArrays.jl is at v0.4.3
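
For anyone comparing environments the same way, a small sketch using the standard `Pkg` API. It assumes a reasonably recent Julia (the `outdated` keyword is not available in old releases) and that the project in question is the active environment.

```julia
using Pkg

# Show the resolved versions from the Manifest, so indirect dependencies
# such as DiskArrays are listed alongside YAXArrays.
Pkg.status(["YAXArrays", "DiskArrays"]; mode=Pkg.PKGMODE_MANIFEST)

# List packages that are held back from their latest release; the output
# indicates what is restricting each upgrade.
Pkg.status(; outdated=true, mode=Pkg.PKGMODE_MANIFEST)
```

Running this on both machines should make a version difference in an indirect dependency visible even when the top-level package versions match.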


Confirming that the example above works if I revert back to DiskArrays.jl v0.3.23
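
One way to perform that revert, as a sketch only: it assumes the other packages in the environment are compatible with v0.3.23, and note that `Pkg.add` makes DiskArrays a direct dependency of the project.

```julia
using Pkg

# Install the older release explicitly (adds DiskArrays to Project.toml).
Pkg.add(name="DiskArrays", version="0.3.23")

# Keep the resolver from upgrading it again on the next `Pkg.update()`.
Pkg.pin("DiskArrays")

# Once a fixed release is available, release the pin:
# Pkg.free("DiskArrays")
```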


ConnectedSystems commented Jul 26, 2024

@meggart @felixcremer just submitted a potential fix to DiskArrays.jl.

I'm happy to submit a PR adding the above code as a test case to YAXArrays.jl.
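
A rough sketch of what such a test case might look like, built from the reproduction above. This is illustrative, not the contents of any actual PR; the testset name and the temporary file path are made up for the example, and it assumes the default variable name `layer` as in the reproduction.

```julia
using Test
using NetCDF
using YAXArrays

@testset "non-range indexing of disk-based cubes" begin
    axlist = (
        Dim{:v1}(range(1, 3, length=3)),
        Dim{:v2}(["x1", "x2", "x3"])
    )
    test_arr = YAXArray(axlist, rand(3, 3))

    # Write to a temporary file and reopen it as a disk-based store.
    path = tempname() * ".nc"
    savecube(test_arr, path, driver=:netcdf)
    disk_arr = open_dataset(path).layer

    # Each selection on the disk-based array should succeed and match the
    # shape of the same selection on the in-memory array.
    @test size(disk_arr[v1=BitVector([true, false, true])]) == size(test_arr[v1=BitVector([true, false, true])])
    @test size(disk_arr[v1=[1, 3]]) == size(test_arr[v1=[1, 3]])
    @test size(disk_arr[v1=At([1, 2])]) == size(test_arr[v1=At([1, 2])])
    @test size(disk_arr[v2=At(["x1", "x2"])]) == size(test_arr[v2=At(["x1", "x2"])])
    @test size(disk_arr[v1=1:2]) == size(test_arr[v1=1:2])
end
```

Comparing against the in-memory results avoids hard-coding expected shapes; a stricter version could also compare the selected values.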

@lazarusA lazarusA added the bug Something isn't working label Sep 21, 2024