Document/export copy-free string allocation? #19945

Open
stevengj opened this issue Jan 9, 2017 · 8 comments
Labels
docs (This change adds or pertains to documentation), strings ("Strings!")

Comments

@stevengj
Member

stevengj commented Jan 9, 2017

There are now (as of #19449) undocumented functions Base._string_n(n) (to allocate an n-byte string) and Base.StringVector(n) (to allocate an n-byte array that can be converted to a string without copying). Should some version of these be documented and exported? They seem useful for e.g. string processing and calling C APIs expecting pre-allocated string buffers.

Note also that the String(v::Vector{UInt8}) documentation, which says that it takes "ownership" of the array, seems to be wrong now (unless v was allocated with StringVector).
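To illustrate the pre-allocated-buffer use case mentioned above, here is a minimal sketch. It assumes a recent Julia 1.x release and relies on `Base.StringVector`, which is internal and unexported, so its behavior may change. The `memcpy` call is only a stand-in for a real C API that fills a caller-provided buffer; the names `src` and `buf` are illustrative.

```julia
# Sketch only: Base.StringVector is an internal, unexported function.
src = "hello"
buf = Base.StringVector(16)   # uninitialized Vector{UInt8}, deliberately oversized

# Stand-in for a C function that writes into a caller-provided buffer.
ccall(:memcpy, Ptr{Cvoid}, (Ptr{UInt8}, Ptr{UInt8}, Csize_t),
      buf, src, sizeof(src))

resize!(buf, sizeof(src))     # drop the unused tail before converting
s = String(buf)               # intended to reuse the buffer rather than copy it
```

After the conversion, `buf` should no longer be used to modify the string's contents; allocate a fresh buffer for the next call.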

cc @JeffBezanson, @nalimilan

@stevengj added the docs and strings labels on Jan 9, 2017
@stevengj
Member Author

stevengj commented Jan 9, 2017

At some point, @JeffBezanson said in #19449 that something like StringVector was the default for all byte arrays, but this no longer seems to be the case?

@nalimilan
Member

BTW, am I right that String(take!(b)) now makes a copy? This is annoying since we just deprecated takebuf_string...

@stevengj
Member Author

@nalimilan, no, that does not typically make a copy, because IOBuffer by default uses StringVector to allocate its buffer.
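For reference, the pattern under discussion looks like the sketch below. It uses only exported functions (`IOBuffer`, `write`, `print`, `take!`, `String`); whether the final conversion avoids a copy depends on how the buffer was allocated internally, as described above.

```julia
io = IOBuffer()            # the buffer is allocated internally by IOBuffer
write(io, "hello, ")
print(io, 42)

bytes = take!(io)          # Vector{UInt8} holding the written data; io is reset
s = String(bytes)          # typically reuses the buffer's memory, per the comment above
```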

@JeffBezanson
Sponsor Member

StringVector is used by IOBuffer, but is otherwise not the default for all byte arrays.

@stevengj
Member Author

Would it be bad to make this the default for all byte arrays?

@StefanKarpinski
Sponsor Member

If we're going to export some of these interfaces, I'd like to put some thought into making them future-proof, in the sense that the same API will remain usable even if we change the underlying string implementation. The codeunit change is a good example of this. If we're not sure about that, I'd rather not export them, so that even if people do use these interfaces in performance-critical string code, those uses are fairly easy to find because they have to reach into Base to get at them.

@simonbyrne
Contributor

Any more thoughts on this? It is a very convenient API, and it is currently used by HDF5.jl.

@stevengj
Member Author

It's used in CSTParser.jl, CSV.jl, DelimitedFiles.jl, ExactConvolution.jl, Format.jl, GraphQLParser.jl, HDF5.jl, HTTP.jl, JSON2.jl, JSON3.jl, JuliaSyntax.jl, LazyJSON.jl, MySQL.jl, Parsers.jl, Pidfile.jl, ShortStrings.jl, SourceWalk.jl, StrBase.jl, StringViews.jl, and ZMQ.jl …


5 participants