Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs for blob.jl #22555

Merged
merged 7 commits into from
Jul 8, 2017
Merged

Add docs for blob.jl #22555

merged 7 commits into from
Jul 8, 2017

Conversation

kshyatt
Copy link
Contributor

@kshyatt kshyatt commented Jun 26, 2017

No description provided.

@kshyatt kshyatt added docs This change adds or pertains to documentation libgit2 The libgit2 library or the LibGit2 stdlib module labels Jun 26, 2017
@kshyatt kshyatt requested a review from ararslan June 26, 2017 18:00

Fetch the *raw* contents of the [`GitBlob`](@ref) `blob`. This is a read-only
`Array` containing the contents of the blob, which may be binary or may be ASCII
`String` data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused a bit by this wording. Is it always just raw bytes which may or may not be valid ASCII? If it's only sometimes ASCII, is that worth noting? Under what circumstances do each of these outcomes occur?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's basically content but it won't scream if you try to load raw binary data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be good to just say that? ¯\_(ツ)_/¯

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good call.

@kshyatt
Copy link
Contributor Author

kshyatt commented Jun 26, 2017

make check-whitespace passed locally

`String` data.
`String` data. `rawcontent` will allow the user to load the raw binary data into
the output `Array` and will not check to ensure it is a valid `String`, so errors
may occur if the result is passed to functions which expect valid `String` data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the result an array or a string? if this is type stable and always array, then you won't be passing it directly to most String functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counterpoint: just a few lines below, s = String(rawcontent(blob)) 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the constructor yes, but not things that expect a String


See also [`content`](@ref).
See also [`content`](@ref), which *will* throw an error the content of the `blob`
is binary and not valid ASCII.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it error on unicode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think it shouldn't, the libgit2 docs only refer to "printable characters"

@kshyatt
Copy link
Contributor Author

kshyatt commented Jun 27, 2017

OK to merge?

"""
rawcontent(blob::GitBlob) -> Array

Fetch the *raw* contents of the [`GitBlob`](@ref) `blob`. This is a read-only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does read only mean here? you can always write to julia arrays

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the revised wording make this clearer?

`Array` containing the contents of the blob, which may be binary or may be Unicode.
If you write to this `Array` the blob on disk will not be updated.
`rawcontent` will allow the user to load the raw binary data into
the output `Array` and will not check to ensure it is a valid Unicode, so errors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "a valid Unicode" a typical phrase? Perhaps remove "a"?

the output `Array` and will not check to ensure it is valid Unicode, so errors
may occur if the result is passed to functions which expect valid Unicode data.

See also [`content`](@ref), which *will* throw an error the content of the `blob`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw an error if the content

@kshyatt
Copy link
Contributor Author

kshyatt commented Jun 29, 2017

Ready to roll?

@tkelman
Copy link
Contributor

tkelman commented Jun 29, 2017

does git always use utf8 on all platforms? that might be more specific to say, as unicode has multiple encodings

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 1, 2017

I don't think so, it appears from some reading that people have used UTF-16 and UTF-32 with git but it's a bit of a wild ride (esp on Windows).

@tkelman
Copy link
Contributor

tkelman commented Jul 1, 2017

anything we could test for that?

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 1, 2017

I'm not really sure how

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 6, 2017

Can we merge this?

@@ -4,15 +4,28 @@ function Base.length(blob::GitBlob)
return ccall((:git_blob_rawsize, :libgit2), Int64, (Ptr{Void},), blob.ptr)
end

"""
rawcontent(blob::GitBlob) -> Array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Vector{UInt8} (assuming that's true)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

rawcontent(blob::GitBlob) -> Array

Fetch the *raw* contents of the [`GitBlob`](@ref) `blob`. This is an
`Array` containing the contents of the blob, which may be binary or may be Unicode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"may or may not be valid Unicode"? I find "may be binary or may be Unicode" a little confusing since it's always binary; it doesn't return Chars.

@tkelman
Copy link
Contributor

tkelman commented Jul 6, 2017

"valid unicode" is maybe a bit ambiguous still

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 7, 2017

Are we ok with the Unicode phrasing? Can we squash?

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 7, 2017

make docs passed locally btw

@kshyatt
Copy link
Contributor Author

kshyatt commented Jul 7, 2017

Anyone?

@ararslan
Copy link
Member

ararslan commented Jul 7, 2017

I still think "may be binary or may be Unicode" is weird but I don't care enough to make you change it. This LGTM.

@kshyatt kshyatt merged commit 37e911e into master Jul 8, 2017
@kshyatt kshyatt deleted the ksh/docblob branch July 8, 2017 13:16
@tkelman
Copy link
Contributor

tkelman commented Jul 8, 2017

valid utf8 string would make a lot more sense since that describes what the Julia functions will do here

@tkelman
Copy link
Contributor

tkelman commented Jul 10, 2017

also xref libgit2/libgit2#4300

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation libgit2 The libgit2 library or the LibGit2 stdlib module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants