start outputting zip collections instead of .sig files? #1440

Closed
ctb opened this issue Apr 3, 2021 · 4 comments

Comments

ctb commented Apr 3, 2021

from @bluegenes in #1349 (comment) -

  • Future thoughts: Are you set on zipfile generation being external? If not --
  • Zipped output from search or gather (when outputting matches/unmatched) could be useful?
  • If sketching a bunch of sigs, could we write directly to a zip file? Not sure I would actually do this for non --singleton cases as it would eliminate parallelization, but just throwing it out there.

I think we could adjust or extend save_signatures(...) to create a .zip file when the output filename ends in .zip. Could do something similar with .sbt.zip, .lca.json, and perhaps other formats?

see also #1349 (comment)
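
A minimal sketch of the extension-based dispatch suggested above, using only the Python standard library. The function name `save_records`, the `signatures/<n>.sig` member layout, and the plain-dict records are all assumptions made for illustration; this is not sourmash's actual `save_signatures(...)`.

```python
import json
import zipfile


def save_records(records, filename):
    """Write `records` (a list of JSON-serializable dicts) to `filename`.

    Hypothetical stand-in for the idea in this issue; not the real
    sourmash save_signatures().
    """
    if filename.endswith(".zip"):
        # One zip member per record, so each can be read back on its own.
        with zipfile.ZipFile(filename, "w", zipfile.ZIP_DEFLATED) as zf:
            for i, record in enumerate(records):
                # Member naming scheme is an assumption for this sketch.
                zf.writestr(f"signatures/{i}.sig", json.dumps(record))
    else:
        # Fallback: a single file holding one JSON list.
        with open(filename, "w") as fp:
            json.dump(records, fp)
```

Because `ZipFile.writestr` adds one member per call, the same loop could be fed records as they are produced rather than from a finished list.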

ctb commented Apr 26, 2021

trying this out in #1370.

thoughts -

  • could use a helper class to allow/enable progressive output of signatures where possible
  • output to .gz by convention? or by default (e.g. in directories)?
  • can we somehow make valid JSON with progressive output to a single file?
  • support .zip output? can we make that progressive, too?
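
One way the "helper class" and "valid JSON with progressive output" ideas above could fit together, as a rough sketch: a context manager that writes the opening bracket up front, emits each record as it arrives, and writes the closing bracket on exit. The class name `ProgressiveJsonList` and its `add` method are hypothetical and are not the helper that ended up in sourmash.

```python
import json


class ProgressiveJsonList:
    """Write a JSON list to a single file, one record at a time.

    Hypothetical helper for illustration. The file only becomes valid
    JSON once the closing bracket is written in __exit__.
    """

    def __init__(self, filename):
        self.filename = filename
        self.fp = None
        self.count = 0

    def __enter__(self):
        self.fp = open(self.filename, "w")
        self.fp.write("[")
        return self

    def add(self, record):
        # Separate records with commas, but not before the first one.
        if self.count:
            self.fp.write(",")
        self.fp.write(json.dumps(record))
        self.count += 1

    def __exit__(self, exc_type, exc, tb):
        # Close the list even on error so the partial output still parses.
        self.fp.write("]")
        self.fp.close()
        return False  # do not swallow exceptions
```

The trade-off is that a reader cannot parse the file until the writer has finished, which is part of why a per-record format or a zip with one member per record may be more attractive.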

ctb commented Apr 27, 2021

see b1d54df for creation of helper class, sourmash_args.SaveMatchingSignatures.

ctb commented Apr 27, 2021

reference to explore streaming JSON output - https://medium.com/galvanize/streaming-structured-json-18da4edd4f20

maybe we should adjust JSON signature format to support a collection of individual records, rather than a list?
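
One possible reading of "a collection of individual records, rather than a list" is newline-delimited JSON, where each line is a complete JSON document. That interpretation is an assumption, and the helper names `append_record` and `iter_records` below are hypothetical; nothing here reflects the actual sourmash signature format.

```python
import json


def append_record(fp, record):
    """Write one record as a single line of JSON."""
    # json.dumps without `indent` never emits raw newlines, so one
    # record always occupies exactly one line.
    fp.write(json.dumps(record) + "\n")


def iter_records(fp):
    """Yield records one at a time, skipping blank lines."""
    for line in fp:
        if line.strip():
            yield json.loads(line)
```

With this layout the writer can stop at any point and the output so far stays readable, at the cost of the file as a whole no longer being a single JSON value.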

ctb commented May 8, 2021

closed by #1493.
