Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test test_trigger_firing with ' SQLite objects created in a thread can only be used in that same thread.' #33323

Closed
1 task done
potiuk opened this issue Aug 11, 2023 · 0 comments · Fixed by #34075
Closed
1 task done
Labels
kind:meta High-level information important to the community Quarantine Issues that are occasionally failing and are quarantined

Comments

@potiuk
Copy link
Member

potiuk commented Aug 11, 2023

Body

Observed in https://github.com/apache/airflow/actions/runs/5835505313/job/15827357798?pr=33309

___________________ ERROR at teardown of test_trigger_firing ___________________

self = <sqlalchemy.future.engine.Connection object at 0x7f92d3327910>

    def _rollback_impl(self):
        assert not self.__branch_from
    
        if self._has_events or self.engine._has_events:
            self.dispatch.rollback(self)
    
        if self._still_open_and_dbapi_connection_is_valid:
            if self._echo:
                if self._is_autocommit_isolation():
                    self._log_info(
                        "ROLLBACK using DBAPI connection.rollback(), "
                        "DBAPI should ignore due to autocommit mode"
                    )
                else:
                    self._log_info("ROLLBACK")
            try:
>               self.engine.dialect.do_rollback(self.connection)

/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1062: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sqlalchemy.dialects.sqlite.pysqlite.SQLiteDialect_pysqlite object at 0x7f92d9698650>
dbapi_connection = <sqlalchemy.pool.base._ConnectionFairy object at 0x7f92d33839d0>

    def do_rollback(self, dbapi_connection):
>       dbapi_connection.rollback()
E       sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140269149157120 and this is thread id 140269532822400.

/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py:683: ProgrammingError

The above exception was the direct cause of the following exception:

    @pytest.fixture(autouse=True, scope="function")
    def close_all_sqlalchemy_sessions():
        from sqlalchemy.orm import close_all_sessions
    
        close_all_sessions()
        yield
>       close_all_sessions()

tests/conftest.py:953: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py:4315: in close_all_sessions
    sess.close()
/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py:1816: in close
    self._close_impl(invalidate=False)
/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py:1858: in _close_impl
    transaction.close(invalidate)
/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py:926: in close
    transaction.close()
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2426: in close
    self._do_close()
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2649: in _do_close
    self._close_impl()
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2635: in _close_impl
    self._connection_rollback_impl()
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2627: in _connection_rollback_impl
    self.connection._rollback_impl()
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1064: in _rollback_impl
    self._handle_dbapi_exception(e, None, None, None, None)
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2134: in _handle_dbapi_exception
    util.raise_(
/usr/local/lib/python3.11/site-packages/sqlalchemy/util/compat.py:211: in raise_
    raise exception
/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1062: in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sqlalchemy.dialects.sqlite.pysqlite.SQLiteDialect_pysqlite object at 0x7f92d9698650>
dbapi_connection = <sqlalchemy.pool.base._ConnectionFairy object at 0x7f92d33839d0>

    def do_rollback(self, dbapi_connection):
>       dbapi_connection.rollback()
E       sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140269149157120 and this is thread id 140269532822400.
E       (Background on this error at: https://sqlalche.me/e/14/f405)

/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py:683: ProgrammingError
------------------------------ Captured log call -------------------------------
INFO     airflow.jobs.triggerer_job_runner:triggerer_job_runner.py:171 Setting up TriggererHandlerWrapper with handler <FileTaskHandler (NOTSET)>
INFO     airflow.jobs.triggerer_job_runner:triggerer_job_runner.py:227 Setting up logging queue listener with handlers [<LocalQueueHandler (NOTSET)>, <TriggererHandlerWrapper (NOTSET)>]
INFO     airflow.jobs.triggerer_job_runner.TriggerRunner:triggerer_job_runner.py:596 trigger test_dag/test_run/test_ti/-1/1 (ID 1) starting
INFO     airflow.jobs.triggerer_job_runner.TriggerRunner:triggerer_job_runner.py:600 Trigger test_dag/test_run/test_ti/-1/1 (ID 1) fired: TriggerEvent<True>
Level 100 airflow.triggers.testing.SuccessTrigger:triggerer_job_runner.py:633 trigger end
INFO     airflow.jobs.triggerer_job_runner.TriggerRunner:triggerer_job_runner.py:622 trigger test_dag/test_run/test_ti/-1/1 (ID 1) completed

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@potiuk potiuk added the kind:meta High-level information important to the community label Aug 11, 2023
@potiuk potiuk added the Quarantine Issues that are occasionally failing and are quarantined label Aug 14, 2023
potiuk added a commit to potiuk/airflow that referenced this issue Sep 4, 2023
As documented in apache#33323, we have frequent failures of the flaky
triggerer job tests.

The flaky failures are about some errors when we close all the
sessions in teardown of the test. It turns out that the tests
had side-effect - they have not waited for the TriggererJob thread
to complete, they merely marked them to be stopped, but they have
not waited for those to complete - this is quite plausible
explanation of the flaky test failures - since those threads have
1 second sleep, it's more than likely that the session has been
created and used by the thread while the teardown has been attempting
to close all the sessions.

This side effect could also have an effect for other tests that
were run after - because in a busy test run machine, the side
effects could propagate further than just to the teardown, so it
could also explain why sometimes (very rarely) other job tests
failed with similar errors.

The fix is to join the runner after marking it to be stopped.

Fixes: apache#33323 (Hopefully)
potiuk added a commit that referenced this issue Sep 4, 2023
As documented in #33323, we have frequent failures of the flaky
triggerer job tests.

The flaky failures are about some errors when we close all the
sessions in teardown of the test. It turns out that the tests
had side-effect - they have not waited for the TriggererJob thread
to complete, they merely marked them to be stopped, but they have
not waited for those to complete - this is quite plausible
explanation of the flaky test failures - since those threads have
1 second sleep, it's more than likely that the session has been
created and used by the thread while the teardown has been attempting
to close all the sessions.

This side effect could also have an effect for other tests that
were run after - because in a busy test run machine, the side
effects could propagate further than just to the teardown, so it
could also explain why sometimes (very rarely) other job tests
failed with similar errors.

The fix is to join the runner after marking it to be stopped.

Fixes: #33323 (Hopefully)
ephraimbuddy pushed a commit that referenced this issue Sep 4, 2023
As documented in #33323, we have frequent failures of the flaky
triggerer job tests.

The flaky failures are about some errors when we close all the
sessions in teardown of the test. It turns out that the tests
had side-effect - they have not waited for the TriggererJob thread
to complete, they merely marked them to be stopped, but they have
not waited for those to complete - this is quite plausible
explanation of the flaky test failures - since those threads have
1 second sleep, it's more than likely that the session has been
created and used by the thread while the teardown has been attempting
to close all the sessions.

This side effect could also have an effect for other tests that
were run after - because in a busy test run machine, the side
effects could propagate further than just to the teardown, so it
could also explain why sometimes (very rarely) other job tests
failed with similar errors.

The fix is to join the runner after marking it to be stopped.

Fixes: #33323 (Hopefully)
(cherry picked from commit 47f79b9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:meta High-level information important to the community Quarantine Issues that are occasionally failing and are quarantined
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant