Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

awkward rdataframe source tests #1446

Merged
merged 10 commits into from
May 2, 2022
Merged

awkward rdataframe source tests #1446

merged 10 commits into from
May 2, 2022

Conversation

ianna
Copy link
Collaborator

@ianna ianna commented Apr 27, 2022

  • Add more tests and automate checks
  • Add a lock when generating the c++ code
  • Hold a reference to the associated lookups so that the data cannot go out of scope as long as the datasource survives

@codecov
Copy link

codecov bot commented Apr 27, 2022

Codecov Report

Merging #1446 (19f92b5) into main (edfce38) will increase coverage by 0.85%.
The diff coverage is 55.84%.

❗ Current head 19f92b5 differs from pull request most recent head 765eb76. Consider uploading reports for the commit 765eb76 to get more accurate results

Impacted Files Coverage Δ
src/awkward/_v2/_connect/cuda/__init__.py 0.00% <0.00%> (ø)
src/awkward/_v2/_connect/jax/__init__.py 0.00% <0.00%> (ø)
...c/awkward/_v2/_connect/rdataframe/to_rdataframe.py 0.00% <0.00%> (ø)
src/awkward/_v2/contents/recordarray.py 82.26% <ø> (+0.76%) ⬆️
src/awkward/_v2/contents/unionarray.py 86.50% <ø> (ø)
src/awkward/_v2/operations/convert/ak_from_cupy.py 50.00% <0.00%> (+23.33%) ⬆️
src/awkward/_v2/operations/convert/ak_from_iter.py 93.75% <ø> (ø)
src/awkward/_v2/operations/convert/ak_from_jax.py 50.00% <0.00%> (-25.00%) ⬇️
src/awkward/_v2/operations/convert/ak_to_cupy.py 33.33% <0.00%> (+23.95%) ⬆️
src/awkward/_v2/operations/convert/ak_to_jax.py 33.33% <0.00%> (-41.67%) ⬇️
... and 100 more

@ianna ianna requested a review from jpivarski April 27, 2022 13:34
@ianna ianna changed the title rdataframe tests awkward rdataframe source tests Apr 27, 2022
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all better tests than checking Display().Print() by eye.

Even though they aren't failing, some of them are letting memory go out of scope, such as test_empty_array and test_empty_list_array, because the internal Layout objects are not held. This test might not be able to do that correctly because we need to find out from Enrico and Enric how to attach Python objects to the RDataSource.

tests/v2/test_1374-to-rdataframe.py Outdated Show resolved Hide resolved
@ianna
Copy link
Collaborator Author

ianna commented Apr 28, 2022

@jpivarski - I've added a reference to the associated lookups to the C++ source class. This guarantees that the memory does not go out of scope. I did not remove the lookups attribute attached to the Python rdf to test a no-copy array creation after filtering. I hope, there is no harm in keeping it for a while.

@ianna ianna requested a review from jpivarski April 28, 2022 08:36
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this is looking good! All of my comments are inline, and I'd like to get an answer from @eguiraud or @etejedor about GIL-safety before calling it done.

It's really nice that PyROOT just turns Python objects into PyObject* and back.

@@ -137,6 +144,8 @@ def data_frame(self, layouts):

if not hasattr(ROOT, array_data_source):
cpp_code = f"""
#include <Python.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python headers would get re-included every time a new data type's ArrayView is compiled. Cling's compilation is in a single unit. Python.h probably has #ifndef/#define/#endif guards, but still, it's loading a lot of text into the C preprocessor, at least.

I just checked and it seems that Python.h is already included in the context where ROOT.gInterpreter.Declare runs.

>>> import ROOT
>>> ROOT.gInterpreter.Declare("#include <Python.h>\nvoid f(PyObject* stuff) { Py_INCREF(stuff); Py_DECREF(stuff); }")

But I don't know if that's guaranteed to be true. Maybe this would be a good way to check, to see whether the "#include <Python.h>" line needs to be prepended to the cpp_code?

>>> hasattr(ROOT, "PyObject")
True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, not all of the Python.h header is in the gInterpreter compilation unit at startup. The Py_REFCNT macro isn't (which we don't need, beyond debugging). This is another question we should ask @eguiraud or @etejedor: whether we can expect the bits we need to be there on all platforms: PyObject, Py_INCREF, Py_DECREF, and possibly the functions for acquiring and releasing the GIL.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it up out of the function for now.

Comment on lines 255 to 259
rdf = getattr(ROOT.awkward, f"MakeAwkwardArrayDS_{array_data_source}")(
self.lookups,
self.length,
(self.data_ptrs_list),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyROOT just took the self.lookups object from the Python call and turned it into a PyObject* in C++? Cool! That seems to have been easy. I wonder, though, if it permanently changes the reference count?

>>> import ROOT
>>> ROOT.gInterpreter.Declare("""
... #include <Python.h>
... 
... void f(PyObject* obj) {
...     std::cout << "HERE " << Py_REFCNT(obj) << std::endl;
... }
... """)
True
>>> import sys
>>> some_object = (1, 2.2, "three")
>>> sys.getrefcount(some_object)
2
>>> ROOT.f(some_object)
HERE 3
>>> sys.getrefcount(some_object)
2

No, it seems to be doing the right thing: the extra reference only lives as long as the function call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ROOT change the reference count when you return a PyObject? (And thus, we'd have to be careful not to overcount...)

>>> import ROOT
>>> ROOT.gInterpreter.Declare("""
... #include <Python.h>
... 
... PyObject* g(PyObject* obj) {
...     std::cout << "HERE " << Py_REFCNT(obj) << std::endl;
...     return obj;
... }
... """)
True
>>> import sys
>>> some_object = (1, 2.2, "three")
>>> sys.getrefcount(some_object)
2
>>> another_reference = ROOT.g(some_object)
HERE 3
>>> sys.getrefcount(some_object)
2
>>> del some_object
>>> anoth *** Break *** segmentation violation

Nope!

So the safe way to do this is to manually increment:

>>> import ROOT
>>> ROOT.gInterpreter.Declare("""
... #include <Python.h>
... 
... PyObject* g(PyObject* obj) {
...     std::cout << "HERE " << Py_REFCNT(obj) << std::endl;
...     Py_INCREF(obj);
...     return obj;
... }
... """)
True
>>> import sys
>>> some_object = (1, 2.2, "three")
>>> sys.getrefcount(some_object)
2
>>> another_reference = ROOT.g(some_object)
HERE 3
>>> sys.getrefcount(some_object)
3
>>> del some_object
>>> another_reference
(1, 2.2, 'three')

That means that we could implement an accessor on our RDataSource by writing a method like

PyObject* GetLookup() {
    Py_INCREF(fPyLookup);
    return fPyLookup;
}

possibly with GIL-safety, but probably not (because it would only be called from Python, and presumably PyROOT doesn't release the GIL for a direct call like this).

Then we would just need to get from an RDataFrame object to the RDataSource at the root of its DAG.

src/awkward/_v2/_connect/rdataframe/to_rdataframe.py Outdated Show resolved Hide resolved
Comment on lines +186 to +194
fPyLookup(lookup)
{{
Py_INCREF(fPyLookup);
{cpp_code_init_slots}
}}

~{array_data_source}() {{
Py_DECREF(fPyLookup);
}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's something we'll need @eguiraud or @etejedor to verify: whether the RDataSource is guaranteed to hold Python's GIL when it's being constructed and when it's being destructed. The Py_INCREF and Py_DECREF macros are only valid when the GIL is held, but like all segfault-producing errors, a successful test probably doesn't mean it's being used correctly.

That is, the question is whether we need to do this: https://docs.python.org/3/c-api/init.html#non-python-created-threads around the Py_INCREF in the constructor and/or around the Py_DECREF in the destructor. I suspect that we don't need to for the constructor, since the RDataFrame is being constructed by a call that initiated in Python—the GIL is probably held (Py_INCREF is safe) the whole time that is being called—but maybe not for the destructor, since that happens when some std::unique_ptr goes out of scope, which might not be related to a Python function call at all. (It might be triggered by a Python object going out of scope, but who knows? maybe another C++ object holds onto the RDataFrame and it actually gets released sometime after its Python proxy had been released.)

Copy link
Collaborator

@agoose77 agoose77 Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe unrelated, but I know that cppyy looks for special attributes on Pythonisations that indicate whether e.g. the GIL can be released. If the GIL is released manually by a C++ method, then I doubt we'd have any introspective mechanism for identifying this Python-side, but in the Python->CPPYY case it should be possible to identify:
https://cppyy.readthedocs.io/en/latest/misc.html

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, any call to from Python to C++ via cppyy does not release the GIL by default, so the construction should be safe. That default behaviour can be modified on a per-function basis with the __release_gil__ attribute (as explained in the link that @agoose77 shared).

As @jpivarski pointed out, the destruction does not happen as a result of any explicit call from Python, but it should still hold the GIL since I can't see any explicit release. It happens here:

https://github.com/root-project/root/blob/master/bindings/pyroot/cppyy/CPyCppyy/src/CPPInstance.cxx#L210-L217

One way to double check would be to call PyGILState_Check() from the destructor code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agoose77 and @etejedor! I'm getting these messages out of order. In #1448, @ianna pointed to the NumPy RDataSource, which has the same ownership-issues as ours. In RNumpyDS, the ownership of the original NumPy array (or a thin subclass of it) is directly taken by Py_INCREF and Py_DECREF (with no checks for PyGILState_Check):

https://github.com/root-project/root/blob/4f9759c604d05a8304bf0b55f5738a479441e151/bindings/pyroot/pythonizations/inc/RNumpyDS.hxx#L129-L150

Since the constructor and destructor aren't invoked in any __release_gil__ functions, they're safe and the same logic applies to our Lookup objects.

That answers what I was calling "question (1)" in the email: this is an appropriate way for our RDataSources to hold references to the Python data that own the arrays. You also had a question @etejedor, about whether we can avoid holding that reference. We can't avoid it for the same reasons as RNumpyDS—something has to keep the Python objects that own the memory in scope for as long as the data service lives. In a possible different design, we could have had all of our ArrayViews hold shared pointers to a single PyObject* and Py_DECREF it when the std::shared_ptr count goes to zero. That would still require ownership of the Python data on the C++ side, would be slower because all of the (many tiny) ArrayViews would have to maintain an atomic reference count in the std::shared_ptr, and then we'd have less control about when the Py_DECREF actually happens (though we could protect it with PyGILState_Check, as you point out).

What I called "question (2)", accessing the RDataSource from any of the RDataFrame nodes derived from it, is still something that we'd want to know for the sake of the RDataFrame → Awkward Array direction (PR #1448).

@jpivarski jpivarski mentioned this pull request Apr 29, 2022
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider all of the issues resolved and this PR is ready to be merged. In particular, we know that Py_INCREF/Py_DECREF of the Lookups in the constructor and destructor protect computations in the RDataFrame from undefined behavior/segfaults and there's no special handling to be done for GIL-checking.

Well, I made a suggested clean-up of the compiler/compile handling at the top of the file. Feel free to "auto-merge (squash)" when that and the corresponding fix on line 62 are done.

src/awkward/_v2/_connect/rdataframe/to_rdataframe.py Outdated Show resolved Hide resolved
@ianna
Copy link
Collaborator Author

ianna commented Apr 29, 2022

@jpivarski - the build fails with unable to access 'https://github.com/pybind/pybind11.git/'

@jpivarski
Copy link
Member

@jpivarski - the build fails with unable to access 'https://github.com/pybind/pybind11.git/'

That must be an intermittent GitHub outage. The URL does work—now, anyway.

I restarted the failed jobs. Also, I'm pleased that GitHub Actions now has that option—for a long time, it was a drop-down menu with only the option to restart all jobs. They must have been planning it for a while (hence the drop-down menu).

@ianna ianna merged commit edb2ff2 into main May 2, 2022
@ianna ianna deleted the ianna/awkward-rdataframe-tests branch May 2, 2022 02:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants