read/write compressed csv file #475

Closed
norci opened this issue Jul 25, 2019 · 8 comments

@norci

norci commented Jul 25, 2019

Should CSV.jl have a feature to read and write compressed files transparently?
For comparison, see pandas' read_csv:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

@quinnj
Member

quinnj commented Jul 25, 2019

You can already do:

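using CSV, CodecZlib

CSV.read(GzipDecompressorStream(open(file)))

# `table` is the data to write; the stream must be opened for writing and closed to flush the gzip trailer
open(io -> CSV.write(io, table), GzipCompressorStream, file, "w")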
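
A full round trip along these lines might look like the following sketch. It assumes CodecZlib.jl supplies the gzip stream types and that DataFrames.jl is installed; the table contents and the temporary file name are made up for illustration.

```julia
using CSV, CodecZlib, DataFrames

# Hypothetical sample table and file path, for illustration only.
df = DataFrame(id = 1:3, name = ["a", "b", "c"])
path = joinpath(mktempdir(), "data.csv.gz")

# Write: the do-block form closes the stream on exit,
# which flushes the remaining compressed data to disk.
open(GzipCompressorStream, path, "w") do io
    CSV.write(io, df)
end

# Read: decompress into memory and hand the bytes to CSV.File.
roundtrip = open(GzipDecompressorStream, path) do io
    CSV.File(read(io)) |> DataFrame
end
```

Closing the compressor stream matters: without it, the gzip file can end up truncated.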

@xiaodaigh
Contributor

How about picking a particular file from within a zip file? Can that be done?

@quinnj
Member

quinnj commented Aug 2, 2019

There's nothing in CSV.jl to support processing zip archives. It looks like https://github.com/fhs/ZipFile.jl is fairly well maintained. I've avoided taking direct dependencies in CSV.jl for this kind of thing because the pieces compose easily: CSV.jl can accept any IO or byte stream/array, and these other packages can provide the specific compression/archive support.
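
To illustrate the composition point, here is a minimal sketch, assuming a CSV.jl version in which `CSV.File` accepts both a byte vector and an `IO`. The sample data is made up for illustration.

```julia
using CSV

# Hypothetical CSV content; in practice these bytes would come from
# a decompressor, an archive entry, a network response, and so on.
text = "a,b\n1,2\n3,4\n"

# Parse from a raw byte vector.
from_bytes = CSV.File(Vector{UInt8}(text))

# Parse from any IO object, here an in-memory buffer.
from_io = CSV.File(IOBuffer(text))
```

Anything that can produce bytes or an `IO` can therefore sit in front of CSV.jl without CSV.jl knowing about the format.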

quinnj closed this as completed Aug 2, 2019
@xiaodaigh
Contributor

OK then. Perhaps an example of how to do this with ZipFile.jl would be cool.

@quinnj
Member

quinnj commented Aug 6, 2019

A PR to the docs would be very welcome! We have a whole "examples" section, so it'd be a great spot to contribute.

@xiaodaigh
Contributor

Once I figure out how to use ZipFile.jl, I will. Currently, it is mystifying.

@lungben
Contributor

lungben commented Jul 23, 2020

Just for reference, this is how reading and writing a single CSV file inside a zip file worked for me using ZipFile.jl.

Reading:

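using CSV, DataFrames, ZipFile

zf = ZipFile.Reader("filename.zip")
# read the first entry's bytes and parse them (`DataFrame!` was later deprecated; newer versions use `DataFrame`)
cal_df = CSV.File(read(zf.files[1])) |> DataFrame!
close(zf)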

Writing:

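using CSV, ZipFile

filename = "myfile"
dir = ZipFile.Writer("$filename.zip")
# add a Deflate-compressed entry and write the table `df` into it
f = ZipFile.addfile(dir, "$filename.csv", method=ZipFile.Deflate)
df |> CSV.write(f)
close(dir)  # closing the writer finalizes the archive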

The CSV.jl documentation already has an example for reading, but not yet one for writing.
I could make a PR if this would help.
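
For the earlier question about picking a particular file out of an archive, one possible approach is to look the entry up by its `name` field instead of by position. This is only a sketch: the archive, the entry names, and the data are hypothetical and are created here so the example is self-contained.

```julia
using CSV, DataFrames, ZipFile

# Build a small hypothetical archive with two CSV entries.
zippath = joinpath(mktempdir(), "archive.zip")
w = ZipFile.Writer(zippath)
write(ZipFile.addfile(w, "first.csv"), "x,y\n1,2\n")
write(ZipFile.addfile(w, "second.csv"), "x,y\n3,4\n")
close(w)

# Select one entry by name rather than by index.
r = ZipFile.Reader(zippath)
idx = findfirst(f -> f.name == "second.csv", r.files)
idx === nothing && error("entry not found in archive")
picked = CSV.File(read(r.files[idx])) |> DataFrame
close(r)
```

Listing `[f.name for f in r.files]` first is a simple way to see which entries an archive contains.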

@xiaodaigh
Contributor

> I could make a PR if this would help.

From a user's perspective, it always helps.

lungben added a commit to lungben/CSV.jl that referenced this issue Jul 23, 2020
Added an example how to write a CSV file directly into a zip archive using ZipFiles.jl

See
JuliaData#475

4 participants