Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction timeout for longer queries #20

Closed
aakashnand opened this issue Nov 21, 2021 · 19 comments
Closed

Transaction timeout for longer queries #20

aakashnand opened this issue Nov 21, 2021 · 19 comments

Comments

@aakashnand
Copy link

aakashnand commented Nov 21, 2021

Background

I have been experiencing this issue for a long time. This issue happens when we try to execute the query which takes more than 5min. To reproduce this issue use dbt seed which can insert some rows which will take more than 5mins.

Reason

The reason for this issue is that, In Trino, the default value of transaction.idle-timeout is 5minutes and dbt executes commands in the following flow which causes this error

  1. START TRANSACTION
  2. Some queries which will take more than 5min
  3. COMMIT <- This will throw an error

Error Log

trino.exceptions.DatabaseError: failed to commit transaction 73dc7443-ec53-4911-95be-a372792743b4: TrinoUserError(type=USER_ERROR, name=UNKNOWN_TRANSACTION, message="Unknown transaction ID: 73dc7443-ec53-4911-95be-a372792743b4. Possibly expired? Commands ignored until end of transaction block", query_id=20211121_145311_00013_7rttf)

Is there any specific reason why transactions are explicitly started in dbt-trino in spite of using auto-commit in python connection?

isolation_level=IsolationLevel.AUTOCOMMIT,

Other reference:

dbt-labs/dbt-presto#75

@findinpath
Copy link
Collaborator

dbt is responsible for performing the changes in a transactional fashion.
If you think about it, this is meaningful in order to avoid having corrupt data after running dbt .

@aakashnand the PR trinodb/trino#9846 may be linked to your issue. It will be available only from Trino version 365.

@aakashnand
Copy link
Author

aakashnand commented Nov 22, 2021

this is meaningful in order to avoid having corrupt data after running dbt

@findinpath even when we are connecting using AUTOCOMMIT?

isolation_level=IsolationLevel.AUTOCOMMIT,

My understanding here is that when we use AUTOCOMMIT, every query is committed as soon as it is finished. In this sense, even if we execute 1000s of queries the end result will not lead to data corruption.

Also, what is the correct solution to solve this issue?

@findinpath
Copy link
Collaborator

dbt can execute more than one statement corresponding to a materialization.
Maybe I am wrong, please double check this on dbt slack, but this is the reason why dbt works with transactions.

@bachng2017
Copy link
Contributor

bachng2017 commented Nov 22, 2021

it might be a trino question than a dbt question, @findinpath, do you know if this kind of transaction works in trino ?

(for example seeding sequence looks like this):
transaction begin
insert 1
insert 2
insert 3
commit or rollback

as far as we've tested, if insert 3 fails, the result of insert 1 and 2 still in the seed table.

Maybe the transaction start/commit in dbt-presto/trino is just a place-holder or remains of a copy from other drivers. (btw, with snowflake, the rollback in previous sample seems to clear up the result of all inserts, as expected)

@findinpath
Copy link
Collaborator

@bachng2017

as far as we've tested, if insert 3 fails, the result of insert 1 and 2 still in the seed table.

Indeed, Trino doesn't support fully ACID semantics.
Trino makes sure that ONE INSERT statement doesn't create corrupt data by inserting data in one or more staging table(s) and then doing on the target database INSERT INTO target_table SELECT * FROM staging_table(s).

See https://www.trinoforum.org/t/insert-performance-on-trino-jdbc-connectors/99/3 for a more detailed explanation about the way that INSERT works on Trino.

@aakashnand
Copy link
Author

@findinpath Do you have any solution in mind for this or should we discuss this on slack channel #dbt-presto-trino?
The temporary solution we have is to increase transaction.idle-timeout=8h in config.properties

@findinpath
Copy link
Collaborator

@aakashnand I have no better solution than what you just posted.

A slightly different approach is to set the sesssion properties on dbt and affect therefore only the dbt project and not all Trino clients with the transaction idle timeout setting.

my-trino-db:
  target: dev
  outputs:
    dev:
      type: trino
      user: commander
      host: 127.0.0.1
      port: 8080
      database: analytics
      schema: public
      threads: 8
      http_scheme: http
      session_properties:
        ....

@hovaesco
Copy link
Contributor

I believe transaction.idle-timeout cannot be set as a session property.

@aakashnand
Copy link
Author

As suggested by @hovaesco this property cannot be set using session property so we need to set it in the config.properties 🙁

@bachng2017
Copy link
Contributor

bachng2017 commented Nov 23, 2021

we might have understood what happens but want to find more evidences.

btw @findinpath , could you run this example ?
https://github.com/trinodb/trino-python-client#transactions

somehow it did not work in our env with TWO inserts. ONE insert is fine.
just want to make sure our env. is setup correctly.

@bachng2017
Copy link
Contributor

bachng2017 commented Nov 23, 2021

I think the reason the transaction does not work is on our side. Anyway, we think the reason the time-out occurred at the 1st place might be this:

dbt makes the connection with IsolationLevel.AUTOCOMMIT as default. So when the sql is executed, it creates a request (#a). And when the start_transaction is execute, #b create the transaction in another request instance.
The query and the transaction happens independently. This answers the question why we had the error at the COMMIT but the query has been succeed. And also a simple test with start_transaction -> wait long enough -> commit yields the same error.

We need more test to see how to improve this but there might be 2 options now:

  • use a different isolation level when dbt-trino/presto initializes the connection
  • change the way trino client implements #a to care about current transaction instance, not only isolation level

@findinpath
Copy link
Collaborator

@hashhar / @ebyhr can you please take a look at the comment #20 (comment) and help out with feedback?

@hashhar
Copy link
Member

hashhar commented Nov 23, 2021

@bachng2017 Good catch. that indeed looks like a bug in how transactions are handled in the Trino python client.

Can you file the issue against the Python client with a simple reproduction code so that we can verify that the fix that gets implemented actually works?

If I understand correctly any query ran within a transaction that takes > transaction timeout will exhibit this issue?

@bachng2017
Copy link
Contributor

@hashhar , thanks for confirmation. Will file a issue for this.

And yes, that is what we are facing

If I understand correctly any query ran within a transaction that takes > transaction timeout will exhibit this issue?

@bachng2017
Copy link
Contributor

@hashhar created an issue for this at trino-python-client, pls check
trinodb/trino-python-client#129

@hovaesco
Copy link
Contributor

PR to address the issue: #30

@hovaesco
Copy link
Contributor

Merged #30

@bachng2017
Copy link
Contributor

hello @hovaesco
I would like to confirm about the #30 merge (sorry that I should contact earlier but did not have time)

the fix seems to eliminate (pass) all start_transaction/commit/rollback in the old codes with this reason

dbt-trino adapter uses connection in auto-commit mode, hence there is no need to separately start a new transaction in each query since it makes no effect. In auto-commit mode, the client will create a new transaction for each query.

with the old codes, the typical sequence happens like this:

start transaction -> create tmp table -> create real table ->  commit

in new codes, it just

create tmp table -> create real table

so, autocommit for each query might not be sufficient . What do you think ?

the affect is not too big but the concept is very different from the old codes.

@hovaesco
Copy link
Contributor

Could you please elaborate more on autocommit for each query might not be sufficient ?
Each query is executed with autocommit so each operation is committed immediately after performing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants