Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: change lowering of ccall cconvert arguments and add Ref{T} type #9986

Merged
merged 6 commits into from
Mar 7, 2015

Conversation

vtjnash
Copy link
Sponsor Member

@vtjnash vtjnash commented Jan 31, 2015

continuation of #9913, but targeted for merging into master

(note: this is part of the implementation of Julep https://gist.github.com/vtjnash/7296634)

@vtjnash vtjnash changed the title WIP: change lowering of ccall cconvert arguments and add Ref{T} type RFC: change lowering of ccall cconvert arguments and add Ref{T} type Jan 31, 2015
@JeffBezanson
Copy link
Sponsor Member

Overall this is really good. I think we can rename cconvert_gcroot to cconvert; the name is ugly and there aren't really any meaningful definitions of cconvert as it is, so we might as well merge them. There is also too much complexity in the several subtypes of Ref. They are somewhat redundant with SubArray, and I think it is easy enough to simply avoid anything like pointer(allocate_array(), n). I liked the PR better when it was a net deletion of code:)

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 1, 2015

yeah, it may make more sense to just use SubArray instead. part of the intent of this PR is that you can replace most usages of pointer with Ref. essentially, Ref is intended to act as a gc-safe version of Ptr for almost anywhere

there is one meaningful cconvert call, which is cconvert(Ptr{T}, Ref{T}). there may be lots of valid cconvert_gcroot functions

@JeffBezanson
Copy link
Sponsor Member

Yes, the fact that cconvert_gcroot is the more useful function to define is part of the problem: the good function has the ugly name.

I'm not sure it's worth trying to replace Ptr with Ref. A Ref is not a pointer, and is not as efficient as a pointer, so nobody will be fooled.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 1, 2015

I'm not sure it's worth trying to replace Ptr with Ref. A Ref is not a pointer, and is not as efficient as a pointer, so nobody will be fooled

but a Ref is much more versatile and is often easy to allocate on the stack. i'm hoping to fool people someday

@@ -6,6 +6,17 @@ typealias AbstractVecOrMat{T} Union(AbstractVector{T}, AbstractMatrix{T})

## Basic functions ##

# convert Arrays to pointer arrays for ccall
cconvert_gcroot{P<:Ptr}(::Type{Ptr{P}}, a::Array{P}) = a
function cconvert_gcroot{P<:Ptr}(::Type{Ptr{P}}, a::Array)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should go in array.jl as it is Array-specific.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 1, 2015

actually, there might be a few more useful calls for the method currently named cconvert, namely those with a first argument of Ptr{T} for some T. in general, it's not ideal to allow arrays and strings and such to be implicitly converted to Ptr on assignment. those operations should probably only be allowed in the context of ccall. however, it is fine to allow them to implicitly convert to a Ref object.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 2, 2015

They are somewhat redundant with SubArray, and I think it is easy enough to simply avoid anything like pointer(allocate_array(), n)

i'm more concerned about whether it remains valid under all code transforms. for example, in the following somewhat-contrived example, the string has become unrooted by the time it reached function i, after the inlining pass has taken care of optimizing away the function call to g(x) and corresponding gc-root for x:

f() = g("hello world")
g(x) = i(h(x))
@noinline h(x) = pointer(x)
@noinline i(x) = ccall(:write, Cint, (Ptr{Uint8},), x)

@JeffBezanson
Copy link
Sponsor Member

But if you can still call pointer, adding an extra layer of Ref doesn't
fix this. Granted, if nobody ever called pointer we'd be fine, but that's
true with or without Ref.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 3, 2015

true, but i would suggest most usages of pointer that exist now could be replaced by a Ref object construction instead. then the conversion to a Ptr would only implicitly happen during the ccall construction, when codegen is able to ensure, under all possible code transformations, that there has been a gc-root constructed (even though that gc-root may be actually on the stack and thus transiently destroyed very shortly thereafter via #8134)

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 3, 2015

do you want me to merge this separately, or is it acceptable if I continue this under #7906 and interleave these two changes when it gets merge

@@ -100,6 +101,7 @@ end
function study(regex::Ptr{Void}, options::Integer)
# NOTE: options should always be zero in current PCRE
errstr = Array(Ptr{UInt8},1)
errstr[1] = C_NULL
extra = ccall((:pcre_study, :libpcre), Ptr{Void},
(Ptr{Void}, Int32, Ptr{Ptr{UInt8}}),
regex, options, errstr)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like bug fixes in this file? Could you apply these to master?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exactly a bug. Pcre should zero this field, but I had errors in the way I lowered ccall and this helped me get further

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 16, 2015

i think this is ready to merge once CI checks pass

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 17, 2015

hm, i'm not really sure why travis linux is the only place this PR seems to fail

@JeffBezanson
Copy link
Sponsor Member

Looks like one of those mysterious hangs. Heck, if it even works on windows, it must be right, right? :)

@JeffBezanson
Copy link
Sponsor Member

  • Yay for net deletion.
  • I'm worried that removing so many conversions to Ptr is a pretty big breaking change.
  • Needs documentation / comments

return Any
end
t = rt.parameters[1]
if isa(t,DataType) && is((t::DataType).name,Ref.name)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean if a ccall returns a Ref?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See gist or doc updates on the jn/ccall3 PR. Some of these updates are primarily only meaningful in that context

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 21, 2015

Looks like one of those mysterious hangs. Heck, if it even works on windows, it must be right, right? :)

that sounds great, but it seemed to happen on every commit build, so i don't know what to make of it or how to repeat it

edit: restarted again and it still failed. note that the missing tests appear to be parallel and random (consistently)

@vtjnash vtjnash force-pushed the jn/ccall_cconvert2 branch 3 times, most recently from b667509 to 4cd0665 Compare February 22, 2015 07:40
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Feb 22, 2015

now that I found the bug (affecting travis, but not source builds) and fixed it in fdad916, i should be able to merge this now, pending a green-light from travis

I'm worried that removing so many conversions to Ptr is a pretty big breaking change.

perhaps, but that was sort of the point of deleting them

Needs documentation / comments

it's written, and will merged as part of #7906

bikeshedding @JeffBezanson

do you want me to change the names of any of these methods (Base.cconvert, Base.cconvert_gcroot)?

@@ -3244,21 +3244,26 @@ void jl_init_types(void)
jl_any_type, jl_null, 32);

tv = jl_tuple1(tvar("T"));
jl_ref_type =
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeffBezanson should we explicitly disable gc in this jl_init_types function? it would be bad if it somehow ran, even though it shouldn't have managed to allocate that much memory by here

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC is already disabled at this point in the init process. I guess we could assert at the top of the function that GC is disabled.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah, just wanted to make sure i wasn't missing something

@JeffBezanson
Copy link
Sponsor Member

I don't think we can just sneak in removing all these conversions to Ptr. cc @StefanKarpinski @ViralBShah

Also, convert and cconvert is a much better set of names than convert, cconvert, and cconvert_gcroot. In fact that last name is so ugly that I veto it.

@JeffBezanson
Copy link
Sponsor Member

Part of the problem with the name is that in reality, cconvert_gcroot is a relatively high-level function that converts a julia structure to a different one more suitable for consumption by C code. This behavior is the main purpose, and it's not directly related to the GC. Sure its result will be rooted, but so is everything passed to a ccall.

@JeffBezanson
Copy link
Sponsor Member

But cconvert could make a Ref for anything that needed to be converted to Ptr, allowing conversions to Ptr to be defined only for Ref. This at least seems possible.

As it stands, this has ccall depend on four functions plus convert, and it creates garbage where we didn't previously. So I'm not sure this is enough of a net improvement.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 4, 2015

But you are proposing making more garbage Ref objects to avoid defining one functio no? (Also, I don't count the ptr_arg functions since this will also deprecate all that machinery)

@JeffBezanson
Copy link
Sponsor Member

Not necessarily. For objects like Vector{Int} that don't need much translation, cconvert can still just return a Ptr, and have the following convert be a no-op. If more translation is needed, cconvert can return a Ref or other object that gets rooted, followed by the actual conversion to Ptr by convert.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 4, 2015

But what would convert(::String) return? Plus that means I need to add another variable to hold your new proposed gc root

@JeffBezanson
Copy link
Sponsor Member

Converting a string could return a Ref to the Vector{UInt8} inside a ByteString.

Wouldn't this be the same variable currently used for cconvert_gcroot?

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 6, 2015

that is true for ByteString, but it doesn't generalize particularly well. for example, what would you do for PyObject?

Base.getindex(b::RefArray) = b.x[b.i]

Base.setindex!(b::RefValue, x) = (b.x = x; b)
Base.setindex!(b::RefArray, x) = (b.x[b.i] = x; b)
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should i undefine the setindex! operation for a RefArray with roots? thoughts?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely. Arguably getindex is not needed either.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, there does need to be some way of turning a Ref back into a value

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was thinking more along the lines of adding:
Base.setindex!{T}(b::RefArray{T,Void}, x) = (b.x[b.i] = x; b)
Base.setindex!{T}(b::RefArray{T}, x) = error("this Ref has associated GC-roots and does not know how to do setindex! for them")

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1
I don't know why you'd need to setindex through a ref, and if you did that error message is pretty scary.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to set an initial value, or populate an output of a function call, if used as an output argument

ccall(:a, B, (C,), d) is now lowered as:
  d#1 = cconvert_gcroot(C, d)
  Expr(:call, :ccall, :a, B, (C,), cconvert(C, d#1), d#1)

This allows implementation of the special-case ccall lowering logic in Julia
and still create a gcroot for it as needed

the Ref{T} type is the new supertype of Ptr{T}. the ccall emission
turns a Ref{T} into a Ptr{T}, as if the &addressOf operator had been
applied. the specific rules for this will be moved from the current
gist to the documentation at a later point.
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 7, 2015

this is mostly just waiting for CI to turn green to merge now

this also makes changes such that cconvert is mostly unchanged from its
functionality before this change. cconvert is expected to typically call
convert. in some cases (such as when intending to create a Ptr for
C-compatiblity), it should defer the actual unsafe step of making the
Ptr to unsafe_convert.

any code intending to allow conversion of an object to a Ptr should
define the operation as a method of unsafe_convert after this commit,
rather than the current behavior of making them a method of convert
vtjnash added a commit that referenced this pull request Mar 7, 2015
change lowering of ccall cconvert arguments to provide user flexibility in the final convert step in ccall, and add Ref{T} type for general-purpose output argument usage
@vtjnash vtjnash merged commit 3387e77 into master Mar 7, 2015
@vtjnash vtjnash deleted the jn/ccall_cconvert2 branch March 7, 2015 19:03
@JeffBezanson
Copy link
Sponsor Member

Maybe we should remove the pointer function as well.

@JeffBezanson
Copy link
Sponsor Member

This seems to make the system image significantly larger (more than 1MB, about 6%). Probably due to the lowering of ccall I'd guess.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 7, 2015

that's probably true. i removed the pointertoref intrinsic, but i'm thinking it might be necessary to reverse that and remove jl_value_ptr, to give a more concise AST after lowering

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 7, 2015

Maybe we should remove the pointer function as well.

i'm on board with that, although it would need to be deprecated for a time first

@JeffBezanson
Copy link
Sponsor Member

I wouldn't have thought the increase was due to jl_value_ptr, just the extra unsafe_convert call. pointertoref wasn't used that much.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 7, 2015

perhaps not then. the lowering of Ref gets a bit verbose right now, for something that eventually turns into a single llvm bitcast. but it's true that base doesn't use it much right now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants