
Add PyFuncWrap #483

Closed · wants to merge 2 commits

Conversation

@JobJob (Contributor) commented Apr 4, 2018

Some performance issues were reported here: JuliaML/OpenAIGym.jl#9

This adds a wrapper for a Python function (similar to the FuncWrapper in callback.jl for Julia functions) so that, when calling it in tight loops, the number of PyObjects created to store the arguments and the return value is reduced. The included benchmarks show a significant increase in calling speed.

Mean times
----------
nprand_pyo             : TrialEstimate(16.380 μs)  (original 2-arg function)
nprand2d_wrap          : TrialEstimate(2.071 μs)   (wrapped - args set at call time)
nprand2d_wrap_noargs   : TrialEstimate(1.239 μs)   (wrapped - args previously set)

nprand_pyo2            : TrialEstimate(925.201 μs) (original 15-arg function)
nprand2d_wrap2         : TrialEstimate(2.348 μs)   (wrapped - args set at call time)
nprand2d_wrap_noargs2  : TrialEstimate(1.320 μs)   (wrapped - args previously set)
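
The benchmark labels above correspond roughly to the following usage pattern. This is a hypothetical sketch based on this PR's description; the constructor signature and the `setargs!` helper name are assumptions, not necessarily the PR's actual API:

```julia
using PyCall

np = pyimport("numpy")

# wrap a 2-arg Python function once, pre-allocating PyObjects for the
# arguments and the return value (constructor signature assumed)
randwrap = PyFuncWrap(np["random"]["rand"], (Int, Int))

# "args set at call time": arguments are converted into the pre-allocated
# slots on each call, but no new PyObjects are created
arr = randwrap(10, 10)

# "args previously set": set the arguments once, then call repeatedly
# with no per-call conversion at all (helper name assumed)
setargs!(randwrap, 10, 10)
for i in 1:1000
    arr = randwrap()
end
```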

@stevengj (Member) commented Apr 4, 2018

Couldn't we just cache these for all Python functions? i.e. have a

const pycall_args = Vector{PyObject}[] # pycall_args[i] holds the i-1 argument slots
const pycall_ret = PyNULL()

where pycall_args is extended as needed whenever pycall is invoked with length(args)+1 > length(pycall_args)?
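
The grow-on-demand logic described above could look something like this. This is a sketch of the proposal only, with assumed details; it is not the actual PyCall implementation:

```julia
# Global caches: pycall_args[i] is a pre-allocated vector of i-1 argument
# slots, and pycall_ret is a single pre-allocated return-value slot.
const pycall_args = Vector{PyObject}[]
const pycall_ret = PyNULL()

# Return the cached argument vector for an nargs-argument call, extending
# the cache as needed whenever nargs + 1 > length(pycall_args).
function cached_args(nargs::Int)
    while length(pycall_args) < nargs + 1
        # the comprehension runs before push!, so the new vector has
        # length(pycall_args) == i-1 slots for the i-th entry
        push!(pycall_args, [PyNULL() for _ in 1:length(pycall_args)])
    end
    return pycall_args[nargs + 1]
end
```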

@JobJob (Contributor, Author) commented Apr 4, 2018

Yeah, in an earlier iteration I had (the equivalent of) pycall!(ret=PyNULL(), ...) and a preallocated pycall_args, but I thought this was cleaner, since all pycalls sharing the same args wouldn't be thread-safe.

@JobJob (Contributor, Author) commented Apr 4, 2018

Not thread-safe (in my possibly flawed mental model) because, for example, thread 1 starts setting some args, then thread 2 gets scheduled and sets its own args, then thread 1 is scheduled again: thread 1 could make the call with thread 2's args.

@JobJob (Contributor, Author) commented Apr 4, 2018

This method was also faster, I think, though I can't remember by how much. Let me know if you want me to check.

@stevengj (Member) commented Apr 4, 2018

I don't think the CPython API is thread-safe anyway, but if we were worried about that we could always use a per-thread array similar to JuliaLang/julia#26562. But the lack of thread-safety of the CPython API makes me disinclined to bother.

@stevengj (Member) commented Apr 4, 2018

In particular, if we want to make pycall thread-safe (separate PR) we will need to acquire the global interpreter lock (GIL) anyway, and once we have a lock we can manipulate our own global state safely too.

@JobJob (Contributor, Author) commented Apr 4, 2018

Yeah, all good points. I will benchmark the difference between this and a pycall! with a global set of args.
I'm not sure whether every call could share the const pycall_ret = PyNULL(); I think I'd just have pycall(...) call pycall!(PyNULL(), ...), and users who want to can call pycall! directly with their own ret::PyObject. I have an earlier commit with this, so I'll check it out.
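
In outline, the split described above would look something like this. The signatures here are assumptions based on this thread, not the final implementation (which landed separately):

```julia
# Mutating variant: convert args, invoke the Python callable, and store
# the result in `ret`, reusing its storage instead of allocating a fresh
# PyObject on every call. (Body elided; signature assumed.)
function pycall!(ret::PyObject, o::PyObject, returntype::Type, args...)
    # ... convert args, call the Python function, write result into ret ...
end

# The allocating entry point just delegates with a fresh return object,
# so existing pycall callers are unaffected.
pycall(o::PyObject, returntype::Type, args...) =
    pycall!(PyNULL(), o, returntype, args...)
```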

@JobJob (Contributor, Author) commented Apr 11, 2018

You can check out a version of pycall! at https://github.com/JobJob/PyCall.jl/blob/pycall-bang/src/PyCall.jl#L674-L716

The benchmark results (from that branch, shown below) suggest that the wrap in this PR is faster. Having said that, I'm not really sure why the pycall! version benchmarks slower; the code is pretty similar. Presumably the extra type information is helping somehow?

A cleaned-up version of pycall! could be done in a separate PR, but I don't think I'll have time for that right now, so anyone interested should feel free to adapt it into something merge-worthy.

Mean times
----------
nprand_pyo ()                    : TrialEstimate(943.182 ns) (pycall version)
nprand_pyo! ()                   : TrialEstimate(589.341 ns) (pycall! version)
nprand_wrap ()                   : TrialEstimate(426.096 ns) (wrapped version)
nprand_wrap_noargs ()            : TrialEstimate(427.233 ns) (wrapped version - args already set)

nprand_pyo (1, 1, 1)             : TrialEstimate(18.899 μs)
nprand_pyo! (1, 1, 1)            : TrialEstimate(6.176 μs)
nprand_wrap (1, 1, 1)            : TrialEstimate(2.215 μs)
nprand_wrap_noargs (1, 1, 1)     : TrialEstimate(1.133 μs)

nprand_pyo 7*(1,1,...)           : TrialEstimate(34.965 μs)
nprand_pyo! 7*(1,1,...)          : TrialEstimate(7.714 μs)
nprand_wrap 7*(1,1,...)          : TrialEstimate(2.786 μs)
nprand_wrap_noargs 7*(1,1,...)   : TrialEstimate(1.338 μs)

nprand_pyo 12*(1,1,...)          : TrialEstimate(50.625 μs)
nprand_pyo! 12*(1,1,...)         : TrialEstimate(9.137 μs)
nprand_wrap 12*(1,1,...)         : TrialEstimate(4.344 μs)
nprand_wrap_noargs 12*(1,1,...)  : TrialEstimate(1.517 μs)

nprand_pyo 17*(1,1,...)          : TrialEstimate(68.642 μs)
nprand_pyo! 17*(1,1,...)         : TrialEstimate(11.808 μs)
nprand_wrap 17*(1,1,...)         : TrialEstimate(5.610 μs)
nprand_wrap_noargs 17*(1,1,...)  : TrialEstimate(1.726 μs)

@stevengj (Member) commented:

I'd rather have the time spent on making pycall and pycall! faster than on maintaining a separate function-wrapper type.

@JobJob (Contributor, Author) commented Apr 13, 2018

So, you've managed to sucker me into cleaning up my pycall! branch; well played, old sport. I should have something to push soonish.

@JobJob (Contributor, Author) commented Apr 13, 2018

Closing in favour of #492
