take better advantage of static compilation #167

Closed
stevengj opened this issue Jul 20, 2015 · 12 comments
Comments

@stevengj
Member

From JuliaPy/PyPlot.jl#143, almost all the time to load PyPlot is eliminated if I put pyinitialize() at the end of PyCall.jl.

However, it goes back to 3.8 seconds (vs. 7.2 seconds without precompilation) if I put pyinitialize() in __init__ (which is where it belongs if I'm going to execute it automatically, since it needs to initialize lots of things at runtime).

Is there a way to re-arrange things so that we get more benefit from precompilation while still being safe?

@stevengj
Member Author

stevengj commented Jul 21, 2015

I've been thinking about it, and maybe the best approach is to pick a specific libpython in Pkg.build and stick with it, rather than detecting it at runtime.

Advantages:

  • Should allow for full static compilation.
  • You only have to set PYTHON and PYTHONPATH once, and thereafter it will remember it.
  • Should eliminate the need for hacks such as pyjulia#33 ("make 'pyinitialize' work under windows") that try to load the running libpython (as long as you use PyCall from the same python that it was built with).
  • Analogous to how other Julia packages are configured to use external libraries.

Disadvantages:

  • Changing your Python requires you to re-run Pkg.build("PyCall"). (e.g. from Python 2 to Python 3, or to a different Python distro; should not be required from e.g. 2.7.10 to 2.7.11, because that does not change the libpython path.)
  • The pyjulia module will only work with a single Python version on your system at a time. (Unless multiple versions of PyCall are built into different paths, which I suppose could be done by someone like Debian via fiddling with environment variables.)
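
A rough sketch of the build-time approach proposed above, assuming a build script that records the chosen library in a generated `deps.jl` file which is then included at load time. The temporary directory, the hard-coded `detected` value, and the variable names are hypothetical placeholders for illustration, not PyCall's actual build script:

```julia
# Build step (roughly what a deps/build.jl script might do):
# pick a libpython once and write it out as a constant.
depsfile = joinpath(mktempdir(), "deps.jl")  # hypothetical location for this demo
detected = "libpython3.4m"                   # placeholder; a real script would query the chosen python

open(depsfile, "w") do io
    # Emits the line: const libpython = "libpython3.4m"
    println(io, "const libpython = ", repr(detected))
end

# Load step: include the generated file, so the library name is a
# constant fixed at build time rather than something detected at runtime.
include(depsfile)

println("using ", libpython)
```

Because the library name is fixed when the file is generated, nothing has to be detected when the module loads; the trade-off is that switching to a different Python means regenerating the file, i.e. re-running the build.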

cc: @malmaud, @jakebolewski

@malmaud
Contributor

malmaud commented Jul 21, 2015

+1, as long as it's clearly documented. Maybe export an intuitively-named function to delete the cache and rebuild against the current runtime environment?

In the long term, I have been wondering if it might make sense to integrate an Anaconda python distribution with the Julia package manager, so the Python distribution is in a known location and configuration. Advanced users could override this to point to a different Python environment, similar to the situation with BLAS in Julia today. That might help with the constant IJulia problems people experience.

@stevengj
Member Author

The rebuild has to be done when the PyCall module is not loaded yet, so I think we are stuck with Pkg.build("PyCall"). On the bright side, that's how you do it for every other Julia package. Certainly it will be documented prominently in the PyCall README.

@jakebolewski
Collaborator

The user benefit here in terms of performance trumps the downside of statically picking one libpython to use. If you are doing serious work, I feel that switching Python versions is pretty rare.

@stevengj
Member Author

Okay, I have a working branch that does this. With Base.compile(:PyCall), this gets using PyCall (including pyinitialize) down to 2 seconds (vs. 3.7 seconds without compiling, or 6.8 seconds for the old version ... just hardcoding libpython made a huge difference).

I'm having trouble figuring out how to get it down further. I tried adding lots of precompile calls, but it changed the time by less than 2%:

for T in (Int, Float64, Complex{Float64}, Bool, ASCIIString, UTF8String, Array{Float64}, Dict{ASCIIString, Int}, Function, IO, @compat(Tuple{Int,Float64}), StepRange{Int,Int}, Dates.DateTime)
    precompile(PyObject, (T,))
    precompile(convert, (T, PyObject))
end
precompile(pycall, (PyObject, Type, Int))
precompile(pyimport, (ASCIIString,))
precompile(pywrap, (PyObject, Symbol))
precompile(__init__, ())
precompile(array2py, (Array{Int,3},))
precompile(py2array, (Type, PyObject))
precompile(writedims, (IO, PyBuffer, Int, Int))

@vtjnash, is there any way to profile the loading time to figure out where it is going?

@stevengj
Member Author

Hmm, looks like about 1.5 seconds of the remaining 2 seconds is happening inside __init__ somewhere.

@stevengj
Member Author

Okay, it looks like 0.8 seconds were being taken to compile 6 functions:

    global const jl_Function_call_ptr =
        cfunction(jl_Function_call, PyPtr, (PyPtr,PyPtr,PyPtr))
    global const pyio_repr_ptr = cfunction(pyio_repr, PyPtr, (PyPtr,))
    global const pyjlwrap_dealloc_ptr = cfunction(pyjlwrap_dealloc, Void, (PyPtr,))
    global const pyjlwrap_repr_ptr = cfunction(pyjlwrap_repr, PyPtr, (PyPtr,))
    global const pyjlwrap_hash_ptr = cfunction(pyjlwrap_hash, Uint, (PyPtr,))
    global const pyjlwrap_hash32_ptr = cfunction(pyjlwrap_hash32, Uint32, (PyPtr,))

The time for these lines is cut down to 0.27 seconds by precompiling them:

precompile(jl_Function_call, (PyPtr,PyPtr,PyPtr))
precompile(pyio_repr, (PyPtr,))
precompile(pyjlwrap_dealloc, (PyPtr,))
precompile(pyjlwrap_repr, (PyPtr,))
precompile(pyjlwrap_hash, (PyPtr,))
precompile(pyjlwrap_hash32, (PyPtr,))

However, it's a bit surprising that cfunction is so expensive even after I precompile. @vtjnash, can you comment?
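
For readers who want to reproduce this kind of measurement, here is a minimal self-contained sketch. It assumes a current Julia release, where the `cfunction` function and `Void` used above have been replaced by the `@cfunction` macro and `Cvoid`; `demo_hash32` is a hypothetical stand-in for a callback like `pyjlwrap_hash32`, not PyCall code, and the timings it prints will not match the numbers in this thread:

```julia
# Hypothetical stand-in for a C-callable callback such as pyjlwrap_hash32:
# it returns the low 32 bits of the pointer value.
demo_hash32(p::Ptr{Cvoid}) = UInt32(UInt(p) & 0xffffffff)

# Ask Julia to compile the method for this argument signature up front.
precompile(demo_hash32, (Ptr{Cvoid},))

# Time only the creation of the C function pointer.
t0 = time_ns()
demo_hash32_ptr = @cfunction(demo_hash32, UInt32, (Ptr{Cvoid},))
t = (time_ns() - t0) / 1e9
println("creating the C function pointer took ", t, " seconds")

# Sanity check: call through the pointer the way C code would.
result = ccall(demo_hash32_ptr, UInt32, (Ptr{Cvoid},), C_NULL)
println("callback returned ", result)
```

Running the script once with the `precompile` line and once without it gives a rough picture of how much of the pointer-creation cost is method compilation, though a single `time_ns` measurement is noisy.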

@swt30

swt30 commented Jul 25, 2015

I love the speed increase and I'm cool with re-building PyCall to use a different Python, but unfortunately that commit is breaking PyCall in my virtual environment even when I don't change environments between Julia sessions:

~❯ workon py3
(py3)~❯ julia

julia> Pkg.build("PyCall")
INFO: Building PyCall
INFO: PyCall is using python (Python 3.4.3) at ~/.virtualenvs/py3/bin/python, libpython = libpython3.4m

julia> using PyCall

julia> # works great

julia> ^D

(py3)~❯ julia

julia> using PyCall
WARNING: error initializing module PyCall:
ErrorException("cglobal: could not find symbol PyCObject_FromVoidPtr in library libpython3.4m")

That's even before using any Base.compile magic. Is some sort of PATH being lost between those Julia sessions? [Just saw the README updates so will have a look through those and see if the answer is there]

@stevengj
Member Author

@swt30, good question. The only environment variable that is set in the build script is PYTHONHOME, but this is saved in deps.jl and restored when PyCall is loaded. So, I'm not sure what is going on here.

@stevengj
Member Author

@swt30, what does ~/.julia/PyCall/deps/deps.jl contain?

@swt30

swt30 commented Jul 25, 2015

Opened issue #173 with more details

@stevengj
Member Author

Closing this issue. If there are other ways to improve static compilation, they can be separate issues.
