Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keeping Parent Structures Alive While Substructures Are #141

Open
nhusung opened this issue Oct 29, 2024 · 7 comments
Open

Keeping Parent Structures Alive While Substructures Are #141

nhusung opened this issue Oct 29, 2024 · 7 comments

Comments

@nhusung
Copy link

nhusung commented Oct 29, 2024

I’m not sure whether this is a CFFI or a CPython bug, but recently this issue was reported: OxiDD/oxidd#23. Today, I created a stripped-down version of the CFFI-based bindings at https://github.com/nhusung/cffi-cpython-bug. On the C side, we have some handle type handle_t, which is a pair of a pointer and an index. Conceptually, these handles refer to binary nodes, but in the example I just use arbitrary values. The C function handle_pair_t children(handle_t) should just return the handles to the two “child nodes” as a pair, but the bindings appear to modify the Python object corresponding to the handle passed as argument to children(). (Note that the C function takes the handle by value, so it cannot be the culprit.) Here is the Python test code:

from _foo import lib

def children(handle):
    print(f"before: ({handle._p}, {handle._i})")
    c = lib.children(handle).first
    print(f"after: ({handle._p}, {handle._i})")
    return c


handle = lib.new_handle(1)
c = children(handle)
children(c)  # modifies c although it shouldn't

Output (the line between “before” and “after” is from the C code):

before: (<cdata 'void *' NULL>, 1)
children({._p = (nil), ._i = 1}) = {.first = {._p = 0x7efec7aa0760, ._i = 21}, .second = {._p = 0x7efec7aa0760, ._i = 101}}
after: (<cdata 'void *' NULL>, 1)
before: (<cdata 'void *' 0x7efec7aa0760>, 21)
children({._p = 0x7efec7aa0760, ._i = 21}) = {.first = {._p = (nil), ._i = 41}, .second = {._p = (nil), ._i = 121}}
after: (<cdata 'void *' NULL>, 41)

I could reproduce this with CFFI 1.17.1 on both CPython 3.10.12 (Ubuntu 22.04) and CPython 3.12.7 (Fedora 40). PyPy does not appear to be affected (tested with PyPy 7.3.15 / Python 3.10.13).

With GCC’s address sanitizer, I get the following error message directly after before: (<cdata 'void *' 0x7efec7aa0760>, 21):

=================================================================
==50160==ERROR: AddressSanitizer: heap-use-after-free on address 0x5070000011d0 at pc 0x7f0761ef5676 bp 0x7ffc454112c0 sp 0x7ffc45410a80
READ of size 16 at 0x5070000011d0 thread T0
    #0 0x7f0761ef5675 in memcpy (/usr/lib64/libasan.so.8+0xf5675) (BuildId: a4ad7eb954b390cf00f07fa10952988a41d9fc7a)
    #1 0x7f0761502bdc in convert_from_object src/c/_cffi_backend.c:1792
    #2 0x7f0761deba73 in _cffi_f_children /tmp/tmpmx2jcyr6.build-temp/_foo.c:602
    #3 0x7f076197385d in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x17385d) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #4 0x7f07619fe3b3 in PyEval_EvalCode (/lib64/libpython3.12.so.1.0+0x1fe3b3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #5 0x7f0761a1c9b6 in builtin_exec (/lib64/libpython3.12.so.1.0+0x21c9b6) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #6 0x7f076198a2bb in cfunction_vectorcall_FASTCALL_KEYWORDS (/lib64/libpython3.12.so.1.0+0x18a2bb) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #7 0x7f0761978c2e in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x178c2e) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #8 0x7f07619920ba in object_vacall (/lib64/libpython3.12.so.1.0+0x1920ba) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #9 0x7f07619b1919 in PyObject_CallMethodObjArgs (/lib64/libpython3.12.so.1.0+0x1b1919) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #10 0x7f07619b11e7 in PyImport_ImportModuleLevelObject (/lib64/libpython3.12.so.1.0+0x1b11e7) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #11 0x7f0761978dc5 in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x178dc5) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #12 0x7f07619fe3b3 in PyEval_EvalCode (/lib64/libpython3.12.so.1.0+0x1fe3b3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #13 0x7f0761a23ad9 in run_eval_code_obj (/lib64/libpython3.12.so.1.0+0x223ad9) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #14 0x7f0761a1e10d in run_mod (/lib64/libpython3.12.so.1.0+0x21e10d) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #15 0x7f0761a10bf5 in PyRun_StringFlags (/lib64/libpython3.12.so.1.0+0x210bf5) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #16 0x7f0761a108e3 in PyRun_SimpleStringFlags (/lib64/libpython3.12.so.1.0+0x2108e3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #17 0x7f0761a300c2 in Py_RunMain (/lib64/libpython3.12.so.1.0+0x2300c2) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #18 0x7f07619e7e6b in Py_BytesMain (/lib64/libpython3.12.so.1.0+0x1e7e6b) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #19 0x7f0761639087 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #20 0x7f076163914a in __libc_start_main_impl ../csu/libc-start.c:360
    #21 0x56292c0dd094 in _start (/usr/bin/python3.12+0x1094) (BuildId: d1c1e9a6a04c8224290e342a507a918dcff95537)

0x5070000011d0 is located 48 bytes inside of 80-byte region [0x5070000011a0,0x5070000011f0)
freed by thread T0 here:
    #0 0x7f0761ef6638 in free.part.0 (/usr/lib64/libasan.so.8+0xf6638) (BuildId: a4ad7eb954b390cf00f07fa10952988a41d9fc7a)
    #1 0x7f076197e59f in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x17e59f) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #2 0x7f07619fe3b3 in PyEval_EvalCode (/lib64/libpython3.12.so.1.0+0x1fe3b3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #3 0x7f0761a1c9b6 in builtin_exec (/lib64/libpython3.12.so.1.0+0x21c9b6) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #4 0x7f076198a2bb in cfunction_vectorcall_FASTCALL_KEYWORDS (/lib64/libpython3.12.so.1.0+0x18a2bb) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #5 0x7f0761978c2e in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x178c2e) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #6 0x7f07619920ba in object_vacall (/lib64/libpython3.12.so.1.0+0x1920ba) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #7 0x7f07619b1919 in PyObject_CallMethodObjArgs (/lib64/libpython3.12.so.1.0+0x1b1919) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #8 0x7f07619b11e7 in PyImport_ImportModuleLevelObject (/lib64/libpython3.12.so.1.0+0x1b11e7) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #9 0x7f0761978dc5 in _PyEval_EvalFrameDefault (/lib64/libpython3.12.so.1.0+0x178dc5) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #10 0x7f07619fe3b3 in PyEval_EvalCode (/lib64/libpython3.12.so.1.0+0x1fe3b3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #11 0x7f0761a23ad9 in run_eval_code_obj (/lib64/libpython3.12.so.1.0+0x223ad9) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #12 0x7f0761a1e10d in run_mod (/lib64/libpython3.12.so.1.0+0x21e10d) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #13 0x7f0761a10bf5 in PyRun_StringFlags (/lib64/libpython3.12.so.1.0+0x210bf5) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #14 0x7f0761a108e3 in PyRun_SimpleStringFlags (/lib64/libpython3.12.so.1.0+0x2108e3) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #15 0x7f0761a300c2 in Py_RunMain (/lib64/libpython3.12.so.1.0+0x2300c2) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #16 0x7f07619e7e6b in Py_BytesMain (/lib64/libpython3.12.so.1.0+0x1e7e6b) (BuildId: 180b9c2b9ddff4d7006aec0fb9c92af57c2025f4)
    #17 0x7f0761639087 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #18 0x7f076163914a in __libc_start_main_impl ../csu/libc-start.c:360
    #19 0x56292c0dd094 in _start (/usr/bin/python3.12+0x1094) (BuildId: d1c1e9a6a04c8224290e342a507a918dcff95537)

previously allocated by thread T0 here:
    #0 0x7f0761ef7997 in malloc (/usr/lib64/libasan.so.8+0xf7997) (BuildId: a4ad7eb954b390cf00f07fa10952988a41d9fc7a)
    #1 0x7f07614f8d4b in allocate_owning_object src/c/_cffi_backend.c:3760
    #2 0x7f07614f8d4b in convert_struct_to_owning_object src/c/_cffi_backend.c:3790

SUMMARY: AddressSanitizer: heap-use-after-free (/usr/lib64/libasan.so.8+0xf5675) (BuildId: a4ad7eb954b390cf00f07fa10952988a41d9fc7a) in memcpy
Shadow bytes around the buggy address:
  0x507000000f00: fd fd fd fd fd fd fd fd fd fd fa fa fa fa 00 00
  0x507000000f80: 00 00 00 00 00 00 06 fa fa fa fa fa 00 00 00 00
  0x507000001000: 00 00 00 00 06 fa fa fa fa fa 00 00 00 00 00 00
  0x507000001080: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x507000001100: 00 fa fa fa fa fa 00 00 00 00 00 00 00 00 00 03
=>0x507000001180: fa fa fa fa fd fd fd fd fd fd[fd]fd fd fd fa fa
  0x507000001200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x507000001280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x507000001300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x507000001380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x507000001400: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==50160==ABORTING
@mattip
Copy link
Contributor

mattip commented Oct 29, 2024

It can be tricky to manage object lifetimes when one mixes C and Python like CFFI allows you to do.

The C function handle_pair_t children(handle_t) should just return the handles to the two “child nodes” as a pair

You must explicitly keep the return value alive, otherwise the GC will destroy it. Making it explicit may be clearer, and suggests a path to solving the problem. I think this would solve the problem, does it make sense?

def children(handle):
    ret = ffi.new("handle_t[1]")
    pair = children(handle)
    # What keeps `pair` alive?
    # Copy the data, so when pair is collected the data will still be valid
    ret[0] = pair.first
    return ret

@nhusung
Copy link
Author

nhusung commented Oct 29, 2024

Thank you for your quick answer! I was assuming that pair.first would create a copy of the handle_t, but it is just an unowned reference to the handle_t, as you pointed out. Unfortunately, I cannot simply change the return type to handle_t[1] or handle_t *, because lib.children() expects a handle_t and (unless I missed something) I cannot pass a handle_t[1] or a handle_t * in place of a handle_t. Is there a simple way to copy the struct behind a handle_t & into a new handle_t? ffi.new() appears to support pointer and array types only. In principle, I could write a C function handle_t identity(handle_t h) { return h; }, but doing so for every type seems like a lot of boilerplate and (also does not sound terribly efficient). Alternatively, I could always copy the result of a function returning a struct into a new allocation (e.g., handle = ffi.new("struct handle_t *", lib.new_handle(1))), and then write lib.children(handle[0]). Allowing the handle parameter of the Python children function to be either handle_t * or handle_t (with a runtime check in the Python function) does not really appear to be an option. Which solution is “intended” by CFFI here?

Then, I have another question regarding the fix you proposed, just to be sure: Until when is pair guaranteed to live? Is it until the function returns (akin to C++ and Rust) or only until the last reference of pair? I.e., is there a possibility that pair is GC’d right after pair.first is evaluated but before the handle_t is copied into ret[0]?

@arigo
Copy link
Contributor

arigo commented Oct 30, 2024

@mattip I think you're going in the wrong direction. The C code doesn't call malloc() and the Python code doesn't call ffi.new() in this example, so there is no keepalive issue that should be visible to the user.

@arigo
Copy link
Contributor

arigo commented Oct 30, 2024

@mattip Sorry, now I understood why your proposed fix is correct. Another way to view it is by changing the prints:

def children(handle):
    print(f"The parameter I got: {handle}")
    c = lib.children(handle).first
    return c

You can see that the first time it is called, it is with a <cdata 'struct handle_t' owning 16 bytes>. The second time, it is with a <cdata 'struct handle_t &' 0x...>. The problem is, like I now realize you said, that the expression lib.children(handle).first is buggy because it gets a pointer to the first field inside that structure, which is a substructure, but then it frees the parent structure because the Python object returned by lib.children(handle) disappears.

@arigo
Copy link
Contributor

arigo commented Oct 30, 2024

@nhusung Python semantics should guarantee that pair won't be freed until the exit of the function, so the line ret[0] = pair.first should not have a free-before-read issue. At least, this should work in both CPython and PyPy.

But that's arguably an unclean workaround for a bug of cffi. Maybe cffi itself should keep the parent structure alive when we get a substructure. There isn't much of a point about getting a substructure, which is really a pointer somewhere inside the parent structure, if we don't keep the parent alive.

@mattip
Copy link
Contributor

mattip commented Oct 30, 2024

Maybe cffi itself should keep the parent structure alive when we get a substructure

There must be previous art for this kind of problem. CTypes or C/C++?

@nhusung nhusung changed the title Use-After-Free in convert_from_object() for Structs(?) Keeping Parent Structures Alive While Substructures Are Oct 30, 2024
@arigo
Copy link
Contributor

arigo commented Oct 30, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants