Directly coerce to DataOutputStream without using output-stream #14

Merged: 2 commits, Jun 28, 2019

axel-angel
Contributor

This is problematic because io/output-stream in Clojure adds yet another
buffer in the middle. That buffer is unnecessary, since every instance in the
code is already an OutputStream and DataOutputStream accepts one directly.
It also corrupts the output when other streams such as GZIPOutputStream are
involved: for reasons I have not identified, the BufferedOutputStream
introduced by io/output-stream prevents the last flush, which corrupts the
output.

To reproduce the bug try this:

> (with-open [fd (->> "myfile.cbor.gz" (clojure.java.io/output-stream) (java.util.zip.GZIPOutputStream.))] (clj-cbor.core/encode clj-cbor.core/default-codec fd (into [] (range 1e6))))
4868653
> (with-open [fd (->> "myfile.cbor.gz" (clojure.java.io/input-stream) (java.util.zip.GZIPInputStream.))] (doall (clj-cbor.core/decode clj-cbor.core/default-codec fd)))

Execution error (ExceptionInfo) at clj-cbor.error/codec-exception! (error.clj:8).
Input data ended while parsing a CBOR value.

The code above works if we do:

(with-open [fd (->> "myfile.cbor.gz" (clojure.java.io/output-stream) (java.util.zip.GZIPOutputStream.) (java.io.DataOutputStream.))] (clj-cbor.core/encode clj-cbor.core/default-codec fd (into [] (range 1e6))))

As you can see, we provide the DataOutputStream ourselves, so encoding no longer goes through the extra io/output-stream wrapping described above.
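One plausible explanation, not confirmed anywhere in this thread, is that bytes still sitting in the intermediate BufferedOutputStream are lost when only the inner stream is closed. The sketch below illustrates that general buffering behaviour with plain java.io classes. The namespace and the `bytes-reaching-sink` helper are hypothetical names used only for illustration, and the example does not involve clj-cbor or GZIP at all.

```clojure
(ns example.buffer-demo
  (:import (java.io BufferedOutputStream ByteArrayOutputStream)))

(defn bytes-reaching-sink
  "Writes `n` zero bytes through a fresh BufferedOutputStream and returns how
  many of them reached the underlying sink. When `flush?` is false the buffer
  is never flushed, mimicking a wrapper that is dropped without being closed."
  [n flush?]
  (let [sink     (ByteArrayOutputStream.)
        buffered (BufferedOutputStream. sink)]
    (.write buffered (byte-array n))
    (when flush?
      (.flush buffered))
    ;; Only the inner stream is closed here; the buffered wrapper is abandoned.
    (.close sink)
    (.size sink)))
```

With 100 bytes, well under the default 8192-byte buffer, `(bytes-reaching-sink 100 false)` leaves everything in the buffer, while `(bytes-reaching-sink 100 true)` delivers all 100 bytes to the sink.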

@codecov-io

codecov-io commented Jun 27, 2019

Codecov Report

Merging #14 into develop will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           develop    #14   +/-   ##
======================================
  Coverage      100%   100%           
======================================
  Files           13     13           
  Lines          805    804    -1     
======================================
- Hits           805    804    -1
Impacted Files Coverage Δ
src/clj_cbor/core.clj 100% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 67e09a5...3181151.

Owner

@greglook greglook left a comment

Thanks for the investigation and fixes! In hindsight, this is an obvious symmetry of #13 which dealt with the same issue on the decoding side.

@@ -78,15 +78,6 @@

;; ## Encoding Functions

(defn- data-output-stream
Owner

I think it'd be better to keep this function and do something similar to 760124f - if the argument is already a DataOutputStream then return it, if it is an OutputStream already then wrap it directly, otherwise use io/output-stream to coerce it before wrapping.
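The three-way coercion described in this comment could look roughly like the sketch below. It is an illustration written from the description alone, not the code that was merged or the contents of 760124f; the `example.coerce` namespace is a hypothetical name, and the function body is an assumption about how the cases might be ordered.

```clojure
(ns example.coerce
  (:require [clojure.java.io :as io])
  (:import (java.io DataOutputStream OutputStream)))

(defn data-output-stream
  "Coerces `output` to a DataOutputStream without adding needless layers."
  ^DataOutputStream
  [output]
  (cond
    ;; Already the right type: return it unchanged to avoid double wrapping.
    (instance? DataOutputStream output)
    output

    ;; A plain OutputStream: wrap it directly, with no extra buffer in between.
    (instance? OutputStream output)
    (DataOutputStream. output)

    ;; Anything else (file, path string, ...): let io/output-stream coerce it.
    :else
    (DataOutputStream. (io/output-stream output))))
```

Checking for DataOutputStream first matters because it is itself an OutputStream, so the second branch would otherwise wrap it again.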

 (encode-seq encoder output values))
 (.toByteArray buffer)))
 ([encoder ^OutputStream output values]
-(let [data-output (data-output-stream output)]
+(let [data-output (DataOutputStream. output)]
Owner

This would double-wrap an existing DataOutputStream, which is probably fine in practice but not ideal.

@greglook greglook self-assigned this Jun 27, 2019
@greglook greglook added the bug label Jun 27, 2019
@axel-angel
Contributor Author

@greglook Oh, that's a very good point! I completely forgot about the double wrap, which was a bit stupid of me. I have adapted the function and now call it only where the stream comes from outside. I hope I got it right this time. Thanks!

Owner

@greglook greglook left a comment

👍

@greglook greglook merged commit 2b8d5cf into greglook:develop Jun 28, 2019