Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[subinterpreters] Global C variables are a problem #81057

Closed
ericsnowcurrently opened this issue May 10, 2019 · 23 comments
Closed

[subinterpreters] Global C variables are a problem #81057

ericsnowcurrently opened this issue May 10, 2019 · 23 comments
Assignees
Labels
3.8 only security fixes topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented May 10, 2019

BPO 36876
Nosy @vsajip, @vstinner, @tiran, @phsilva, @ericsnowcurrently, @soltysh, @pablogsal, @sweeneyde
PRs
  • bpo-36876: Avoid static locals. #13372
  • bpo-36876: Use a consistent variable name for kwlist. #13531
  • bpo-36876: Make some static string literal arrays constant. #15760
  • bpo-36876: Add a tool that identifies unsupported global C variables. #15877
  • bpo-36876: Skip test_check_c_globals for now. #16017
  • bpo-36876: Fix the globals checker tool. #16058
  • bpo-36876: Moved Parser/listnode.c statics to interpreter state. #16328
  • bpo-36876: Re-organize the c-analyzer tool code. #16841
  • bpo-36876: Fix the C analyzer tool. #22841
  • bpo-36876: Small adjustments to the C-analyzer tool. #23045
  • bpo-36876: [c-analyzer tool] Tighten up the results and output. #23431
  • bpo-36876: [c-analyzer tool] Add a "capi" subcommand to the c-analyzer tool. #23918
  • bpo-36876: [c-analyzer tool] Additional CLI updates for "capi" command. #23929
  • bpo-36876: Update the c-analyzer whitelist. #31225
  • bpo-36876: Minor cleanup to c-analyzer "ignored" data.' #31239
  • bpo-36876: Make sure the c-analyzer is checking all the source files.' #31264
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericsnowcurrently'
    closed_at = None
    created_at = <Date 2019-05-10.19:01:42.087>
    labels = ['expert-subinterpreters', 'type-bug', '3.8']
    title = '[subinterpreters] Global C variables are a problem'
    updated_at = <Date 2022-02-10.23:14:23.025>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2022-02-10.23:14:23.025>
    actor = 'eric.snow'
    assignee = 'eric.snow'
    closed = False
    closed_date = None
    closer = None
    components = ['Subinterpreters']
    creation = <Date 2019-05-10.19:01:42.087>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36876
    keywords = ['patch']
    message_count = 22.0
    messages = ['342119', '342125', '352013', '352032', '352063', '352085', '352208', '354928', '356181', '357351', '357352', '357353', '368910', '379390', '379999', '381511', '383698', '383778', '393785', '412886', '412957', '413029']
    nosy_count = 8.0
    nosy_names = ['vinay.sajip', 'vstinner', 'christian.heimes', 'phsilva', 'eric.snow', 'maciej.szulik', 'pablogsal', 'Dennis Sweeney']
    pr_nums = ['13372', '13531', '15760', '15877', '16017', '16058', '16328', '16841', '22841', '23045', '23431', '23918', '23929', '31225', '31239', '31264']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36876'
    versions = ['Python 3.8']

    @ericsnowcurrently
    Copy link
    Member Author

    We still have a bunch of "global" C variables (static globals, static locals, maybe thread-local storage) in our code-base that break the isolation between interpreters. I added Tools/c-globals/check-c-globals.py a while back to help identify such variables, however more have crept in. I also did not take static locals or thread-locals into account.

    To address the above, we should do the following:

    1. update check-c-globals.py to identify static locals (and thread-locals)
    2. deal with any identified globals
    3. add check-c-globals.py to "make check"
    4. (if "make check" isn't already there), ensure check-c-globals.py is run at some point in CI

    Separately, we should move fields out of _PyRuntimeState into PyInterpreterState wherever possible. That can also be done at step 2 if it's not too much work.

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.8 only security fixes labels May 10, 2019
    @ericsnowcurrently ericsnowcurrently self-assigned this May 10, 2019
    @ericsnowcurrently ericsnowcurrently added the type-bug An unexpected behavior, bug, or error label May 10, 2019
    @ericsnowcurrently
    Copy link
    Member Author

    Also, Tools/c-globals/ignored-globals.txt is a bit out of date (some vars have been removed, renamed, or moved to another file). That should get cleaned up. It might also make sense to update check-c-globals.py to verify that all variables in ignored-globals.txt actually exist.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset ee536b2 by Eric Snow in branch 'master':
    bpo-36876: Add a tool that identifies unsupported global C variables. (bpo-15877)
    ee536b2

    @db3l
    Copy link
    Contributor

    db3l commented Sep 11, 2019

    The new test_check_c_globals.ActualChecks test is failing with an "unexpected success" on the bolen-ubuntu buildbot (under Ubuntu 18.04.3). I can reproduce the failure in a manually built tree.

    @ericsnowcurrently
    Copy link
    Member Author

    @db3l, I'll take a look right away.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 64535fc by Eric Snow in branch 'master':
    bpo-36876: Skip test_check_c_globals for now. (gh-16017)
    64535fc

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 088b63e by Eric Snow in branch 'master':
    bpo-36876: Fix the globals checker tool. (gh-16058)
    088b63e

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset e4c431e by Eric Snow in branch 'master':
    bpo-36876: Re-organize the c-analyzer tool code. (gh-16841)
    e4c431e

    @vsajip
    Copy link
    Member

    vsajip commented Nov 7, 2019

    New changeset 9def81a by Vinay Sajip in branch 'master':
    bpo-36876: Moved Parser/listnode.c statics to interpreter state. (GH-16328)
    9def81a

    @ericsnowcurrently
    Copy link
    Member Author

    Thanks, Vinay!

    @ericsnowcurrently
    Copy link
    Member Author

    FYI, others have been tackling this in separate issues (e.g. Victor, anyone relative to PEP-384).

    @ericsnowcurrently
    Copy link
    Member Author

    And I *am* still working on the c-analyzer + a test to that runs it, so we can keep from adding more globals. :)

    @vstinner vstinner added topic-subinterpreters and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 15, 2020
    @vstinner vstinner changed the title Global C variables are a problem. [subinterpreters] Global C variables are a problem May 15, 2020
    @vstinner
    Copy link
    Member

    More and more C extensions are converted to multiphase initialization API (PEP-489) and their global variables are moved into a new module state.

    bpo-39465 will make _Py_IDENTIFIER compatible with subinterpreters.

    See also bpo-40521 for caches like free lists and Unicode interned strings.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 345cd37 by Eric Snow in branch 'master':
    bpo-36876: Fix the C analyzer tool. (GH-22841)
    345cd37

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 4fe7209 by Eric Snow in branch 'master':
    bpo-36876: Small adjustments to the C-analyzer tool. (GH-23045)
    4fe7209

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 9f02b47 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Tighten up the results and output. (GH-23431)
    9f02b47

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 7ec59d8 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Add a "capi" subcommand to the c-analyzer tool. (gh-23918)
    7ec59d8

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 5ae9be6 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Additional CLI updates for "capi" command. (gh-23929)
    5ae9be6

    @sweeneyde
    Copy link
    Member

    I'm getting the following FutureWarning on a certain regular expression. I think it just needs "[]" to become "\[\]".

    0:05:36 load avg: 0.00 [ 53/427] test_check_c_globals
    ...\Tools\c-analyzer\c_common\tables.py:236: FutureWarning: Possible nested set at position 12
    _COLSPEC_RE = re.compile(textwrap.dedent(r'''

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 77bab59 by Eric Snow in branch 'main':
    bpo-36876: Update the c-analyzer whitelist. (gh-31225)
    77bab59

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset cb68788 by Eric Snow in branch 'main':
    bpo-36876: Minor cleanup to c-analyzer "ignored" data.' (gh-31239)
    cb68788

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 80e4f26 by Eric Snow in branch 'main':
    bpo-36876: Make sure the c-analyzer is checking all the source files.' (gh-31264)
    80e4f26

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    An earlier commit only defined check_ticks_per_second() when HAVE_TIMES is defined. However, we also need it when HAVE_CLOCK is defined. This primarily affects Windows.
    
    #81057
    @vstinner
    Copy link
    Member

    vstinner commented Dec 9, 2022

    I still don't understand how moving global variables into _PyRuntime does solve any problem. What are the advantages of moving a variable declared as "static" in a C file into _PyRuntime? I don't get the rationale.

    For me, it just sounds more complicated to get _PyRuntime.threads._condattr_monotonic.ptr instead of condattr_monotonic. It's annoying to have to repeat all conditions to decide if a variable should be declared or not in _PyRuntimeState, since they are tons of variables, and many are platform specific. A side effect is that any C file using _PyRuntime should now include headers files required to decide if platform specific variables should be declared or not.

    I already noticed this problem when I made global variables per-interpreter: move them into PyInterpreterState. See the long list of direct includes in pycore_interp.h:

    #include "pycore_atomic.h"        // _Py_atomic_address
    #include "pycore_ast_state.h"     // struct ast_state
    #include "pycore_code.h"          // struct callable_cache
    #include "pycore_context.h"       // struct _Py_context_state
    #include "pycore_dict.h"          // struct _Py_dict_state
    #include "pycore_exceptions.h"    // struct _Py_exc_state
    #include "pycore_floatobject.h"   // struct _Py_float_state
    #include "pycore_genobject.h"     // struct _Py_async_gen_state
    #include "pycore_gil.h"           // struct _gil_runtime_state
    #include "pycore_gc.h"            // struct _gc_runtime_state
    #include "pycore_list.h"          // struct _Py_list_state
    #include "pycore_tuple.h"         // struct _Py_tuple_state
    #include "pycore_typeobject.h"    // struct type_cache
    #include "pycore_unicodeobject.h" // struct _Py_unicode_state
    #include "pycore_warnings.h"      // struct _warnings_runtime_state
    

    But they are also many indirect includes, like <windows.h> which is quite big and has many side effects :-(

    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 12, 2022
    We can't move it to _PyRuntimeState because the symbol is exposed in the stable ABI. We'll have to sort that out before a per-interpreter GIL, but it shouldn't be too hard.
    
    #81057
    carljm added a commit to carljm/cpython that referenced this issue Dec 14, 2022
    * main: (103 commits)
      pythongh-100248: Add missing `ssl_shutdown_timeout` parameter in `asyncio` docs (python#100249)
      Assorted minor fixes for specialization stats. (pythonGH-100219)
      pythongh-100176: venv: Remove redundant compat code for Python <= 3.2 (python#100177)
      pythonGH-100222: Redefine _Py_CODEUNIT as a union to clarify structure of code unit. (pythonGH-100223)
      pythongh-99955: undef ERROR and SUCCESS before redefining (fixes sanitizer warning) (python#100215)
      pythonGH-100206: use versionadded for the addition of sysconfig.get_default_scheme (python#100207)
      pythongh-81057: Move _Py_RefTotal to the "Ignored Globals" List (pythongh-100203)
      pythongh-81057: Move Signal-Related Globals to _PyRuntimeState (pythongh-100085)
      pythongh-81057: Move faulthandler Globals to _PyRuntimeState (pythongh-100152)
      pythongh-81057: Move tracemalloc Globals to _PyRuntimeState (pythongh-100151)
      pythonGH-100143: Improve collecting pystats for parts of runs (pythonGH-100144)
      pythongh-99955: standardize return values of functions in compiler's code-gen (python#100010)
      pythongh-79218: Define `MS_WIN64` macro for Mingw-w64 64bit on Windows (pythonGH-100137)
      Fix: typo (Indention) (pythonGH-99904)
      pythongh-96715 Remove redundant NULL check in `profile_trampoline` function (python#96716)
      pythongh-100176: remove incorrect version compatibility check from argument clinic (python#100190)
      clarify the 4300-digit limit on int-str conversion (python#100175)
      pythongh-70393: Clarify mention of "middle" scope (python#98839)
      pythongh-99688: Fix outdated tests in test_unary (python#99712)
      pythongh-100174: [Enum] Correct PowersOfThree example. (pythonGH-100178)
      ...
    carljm added a commit to carljm/cpython that referenced this issue Dec 16, 2022
    * main:
      Improve stats presentation for calls. (pythonGH-100274)
      Better stats for `LOAD_ATTR` and `STORE_ATTR` (pythonGH-100295)
      pythongh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (python#100277)
      Document that zipfile's pwd parameter is a `bytes` object (python#100209)
      pythongh-99767: mark `PyTypeObject.tp_watched` as internal use only in table (python#100271)
      Fix typo in introduction.rst (python#100266)
      pythongh-78997: AttributeError if loading fails in LibraryLoader.__getattr__
      pythonGH-100234: Set a default value for random.expovariate() (pythonGH-100235)
      Remove uninformative itertools recipe (pythonGH-100253)
      pythonGH-99767: update PyTypeObject docs for type watchers (pythonGH-99928)
      Move stats for the method cache into the `Py_STAT` machinery (pythonGH-100255)
      pythonGH-100222: fix typo _py_set_opocde -> _py_set_opcode (pythonGH-100259)
      pythonGH-100000: Cleanup and polish various watchers code (pythonGH-99998)
      pythongh-90111: Minor Cleanup for Runtime-Global Objects (pythongh-100254)
    shihai1991 added a commit to shihai1991/cpython that referenced this issue Dec 18, 2022
    * origin/main: (1306 commits)
      Correct CVE-2020-10735 documentation (python#100306)
      pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273)
      pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551)
      Docs: Use `PY_VERSION_HEX` for version comparison (python#100179)
      pythongh-97909: Fix markup for `PyMethodDef` members (python#100089)
      pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
      pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
      pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205)
      pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
      pythongh-99540: Constant hash for _PyNone_Type to aid reproducibility (pythonGH-99541)
      pythongh-100039: enhance __signature__ to work with str and callables (pythonGH-100168)
      pythongh-99830: asyncio: Document returns of remove_{reader,writer} (python#100302)
      "Compound statement" docs: Fix with-statement step indexing (python#100286)
      pythonGH-90043: Handle NaNs in COMPARE_OP_FLOAT_JUMP (pythonGH-100278)
      Improve stats presentation for calls. (pythonGH-100274)
      Better stats for `LOAD_ATTR` and `STORE_ATTR` (pythonGH-100295)
      pythongh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (python#100277)
      Document that zipfile's pwd parameter is a `bytes` object (python#100209)
      pythongh-99767: mark `PyTypeObject.tp_watched` as internal use only in table (python#100271)
      Fix typo in introduction.rst (python#100266)
      ...
    ericsnowcurrently added a commit that referenced this issue Mar 9, 2023
    …2505)
    
    distutils was removed in November. However, the c-analyzer relies on it. To solve that here, we vendor the parts the tool needs so it can be run against 3.12+. (Also see gh-92584.)
    
    Note that we may end up removing this code later in favor of a solution in common with the peg_generator tool (which also relies on distutils).  At the least, the copy here makes sure the c-analyzer tool works on 3.12+ in the meantime.
    ericsnowcurrently added a commit that referenced this issue Mar 14, 2023
    …02506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves gh-100237.
    
    #81057
    carljm added a commit to carljm/cpython that referenced this issue Mar 14, 2023
    * main: (50 commits)
      pythongh-102674: Remove _specialization_stats from Lib/opcode.py (python#102685)
      pythongh-102660: Handle m_copy Specially for the sys and builtins Modules (pythongh-102661)
      pythongh-102354: change python3 to python in docs examples (python#102696)
      pythongh-81057: Add a CI Check for New Unsupported C Global Variables (pythongh-102506)
      pythonGH-94851: check unicode consistency of static strings in debug mode (python#102684)
      pythongh-100315: clarification to `__slots__` docs. (python#102621)
      pythonGH-100227: cleanup initialization of global interned dict (python#102682)
      doc: Remove a duplicate 'versionchanged' in library/asyncio-task (pythongh-102677)
      pythongh-102013: Add PyUnstable_GC_VisitObjects (python#102014)
      pythonGH-102670: Use sumprod() to simplify, speed up, and improve accuracy of statistics functions (pythonGH-102649)
      pythongh-102627: Replace address pointing toward malicious web page (python#102630)
      pythongh-98831: Use DECREF_INPUTS() more (python#102409)
      pythongh-101659: Avoid Allocation for Shared Exceptions in the _xxsubinterpreters Module (pythongh-102659)
      pythongh-101524: Fix the ChannelID tp_name (pythongh-102655)
      pythongh-102069: Fix `__weakref__` descriptor generation for custom dataclasses (python#102075)
      pythongh-98169 dataclasses.astuple support DefaultDict (python#98170)
      pythongh-102650: Remove duplicate include directives from multiple source files (python#102651)
      pythonGH-100987: Don't cache references to the names and consts array in `_PyEval_EvalFrameDefault`. (python#102640)
      pythongh-87092: refactor assemble() to a number of separate functions, which do not need the compiler struct (python#102562)
      pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (python#102631)
      ...
    Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this issue Mar 27, 2023
    …pythongh-102506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves pythongh-100237.
    
    python#81057
    warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
    …pythongh-102506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves pythongh-100237.
    
    python#81057
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes topic-subinterpreters type-bug An unexpected behavior, bug, or error
    Projects
    Status: Done
    Development

    No branches or pull requests

    5 participants