Cannot write CSV from diskbased DimArray #788

Open
felixcremer opened this issue Aug 27, 2024 · 1 comment

@felixcremer

I can't write a DiskArray-backed DimArray to CSV, but if I load it into memory beforehand it works.
I am not sure whether this should rather be a DiskArrays or YAXArrays issue; I can try to reduce the example further to an MWE at the end of the week.
jena is a selection of a disk-based YAXArray for a single pixel.
I can write the data if I use readcubedata from YAXArrays beforehand to load the cube into memory.
If I use DimTable explicitly, the writing seems to work, but it is very slow while the data is still a DiskArray.
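
For reference, the two workarounds look roughly like the sketch below. It is not the failing case itself: the cube here is a small in-memory stand-in (the axis names, sizes and output path are made up for illustration), so it only shows the call pattern, assuming YAXArrays, DimensionalData and CSV are installed.

```julia
using YAXArrays, DimensionalData, CSV

# Hypothetical stand-in for `jena`: a small in-memory cube.
# In the real case this would be a selection from a disk-based dataset.
axlist = (Dim{:time}(1:4), Dim{:variable}(["a", "b"]))
jena = YAXArray(axlist, rand(4, 2), Dict{String,Any}())

# Illustrative output location (not the path from the issue).
outfile = joinpath(mktempdir(), "jena.csv")

# Workaround 1: load the cube into memory first, then write it.
in_memory = readcubedata(jena)
CSV.write(outfile, in_memory)

# Workaround 2: build the table explicitly. This also works on the
# disk-based array, but is slow there because it is read element by element.
CSV.write(outfile, DimTable(in_memory))
```

The table is in long format, so the file has one row per array element plus a header line.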

julia> CSV.write("examples/data/jena.csv", jena)
ERROR: BoundsError: attempt to access Tuple{Colon} at index [2]
Stacktrace:
  [1] getindex
    @ ./tuple.jl:31 [inlined]
  [2] #79
    @ ./range.jl:429 [inlined]
  [3] ntuple
    @ ./ntuple.jl:19 [inlined]
  [4] getindex
    @ ./range.jl:429 [inlined]
  [5] view(a::DiskArrayTools.DiskArrayStack{Union{…}, 2, DiskArrays.SubDiskArray{…}, 1}, i::Function)
    @ DiskArrayTools ~/.julia/packages/DiskArrayTools/141OI/src/DiskArrayTools.jl:67
  [6] vec(a::DiskArrayTools.DiskArrayStack{Union{…}, 2, DiskArrays.SubDiskArray{…}, 1})
    @ DiskArrays ~/.julia/packages/DiskArrays/6JA8Z/src/subarray.jl:52
  [7] vec(A::DimMatrix{Union{…}, Tuple{…}, Tuple{}, DiskArrayTools.DiskArrayStack{…}, Symbol, Dict{…}})
    @ DimensionalData ~/.julia/packages/DimensionalData/RxCda/src/array/array.jl:110
  [8] map
    @ ./tuple.jl:291 [inlined]
  [9] map(::Function, ::@NamedTuple{value::DimMatrix{…}})
    @ Base ./namedtuple.jl:266
 [10] DimTable(s::DimStack{…}; mergedims::Nothing)
    @ DimensionalData ~/.julia/packages/DimensionalData/RxCda/src/tables.jl:106
 [11] DimTable(x::YAXArray{…}; layersfrom::Nothing, mergedims::Nothing)
    @ DimensionalData ~/.julia/packages/DimensionalData/RxCda/src/tables.jl:144
 [12] DimTable(x::YAXArray{Union{…}, 2, DiskArrayTools.DiskArrayStack{…}, Tuple{…}, Dict{…}})
    @ DimensionalData ~/.julia/packages/DimensionalData/RxCda/src/tables.jl:131
 [13] columns(x::YAXArray{Union{…}, 2, DiskArrayTools.DiskArrayStack{…}, Tuple{…}, Dict{…}})
    @ DimensionalData ~/.julia/packages/DimensionalData/RxCda/src/tables.jl:14
 [14] _rows(x::YAXArray{Union{…}, 2, DiskArrayTools.DiskArrayStack{…}, Tuple{…}, Dict{…}})
    @ Tables ~/.julia/packages/Tables/8p03y/src/fallbacks.jl:93
 [15] rows(m::YAXArray{Union{…}, 2, DiskArrayTools.DiskArrayStack{…}, Tuple{…}, Dict{…}})
    @ Tables ~/.julia/packages/Tables/8p03y/src/matrix.jl:5
 [16] write(file::String, itr::YAXArray{…}; append::Bool, compress::Bool, writeheader::Nothing, partition::Bool, kw::@Kwargs{})
    @ CSV ~/.julia/packages/CSV/cwX2w/src/write.jl:197
 [17] write(file::String, itr::YAXArray{Union{…}, 2, DiskArrayTools.DiskArrayStack{…}, Tuple{…}, Dict{…}})
    @ CSV ~/.julia/packages/CSV/cwX2w/src/write.jl:162
 [18] top-level scope
    @ REPL[361]:1
Some type information was truncated. Use `show(err)` to see complete types.

rafaqz commented Oct 1, 2024

You probably need to call modify(cache, dimarray) first to use chunk caching in DiskArrays; that should help a bit. Otherwise I'm not sure how we can get Tables.jl sources to read in chunk order, since a CSV has to be written sequentially. You could also use RechunkedDiskArray to force rows to be contiguous. We should put this in the DimensionalDataDiskArraysExt whenever we add it.

Tables.jl doesn't use iteration, it uses indexing, so the iterate optimisations don't help.
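
To make the cost concrete, here is a toy model in plain Julia. It does not use the DiskArrays API, and it assumes that the table reads elements one at a time in linear (column-major) order and that only one chunk is held in memory at a time. The function names and sizes are made up for illustration.

```julia
# Toy model: count how many chunk loads are needed to visit every element
# of a chunked matrix when only a single chunk is kept in memory.

# Index of the chunk that contains element (i, j), for chunks of size (ci, cj).
chunk_of(i, j, ci, cj) = (cld(i, ci), cld(j, cj))

# Walk the elements in the given order and count one-chunk-cache misses.
function count_loads(order, ci, cj)
    loads = 0
    current = (0, 0)          # no chunk loaded yet
    for (i, j) in order
        c = chunk_of(i, j, ci, cj)
        if c != current       # cache miss: the chunk has to be read again
            loads += 1
            current = c
        end
    end
    return loads
end

nrow, ncol = 8, 6
ci, cj = 4, 3                 # 2 x 2 = 4 chunks in total

# Sequential table order: linear indexing, column by column.
linear_order = [(i, j) for j in 1:ncol for i in 1:nrow]

# Chunk order: finish one chunk completely before moving to the next.
chunk_order = [(i, j) for cjs in 1:cj:ncol for cis in 1:ci:nrow
                      for j in cjs:cjs+cj-1 for i in cis:cis+ci-1]

linear_loads = count_loads(linear_order, ci, cj)
chunk_loads = count_loads(chunk_order, ci, cj)
println("linear order: ", linear_loads, " loads; chunk order: ", chunk_loads, " loads")
```

In this toy setup the sequential order reloads every chunk once per column it spans (12 loads), while chunk order reads each chunk once (4 loads). A cache that holds more chunks, or chunks that are contiguous along the read order, would close that gap.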
