Transaction timeout for longer queries #20
Comments
@aakashnand the PR trinodb/trino#9846 may be linked to your issue. It will be available only from Trino version |
@findinpath even when we are connecting using AUTOCOMMIT? dbt-trino/dbt/adapters/trino/connections.py Line 195 in 8088447
My understanding here is that when we use AUTOCOMMIT, every query is committed as soon as it finishes. In that sense, even if we execute thousands of queries, the end result will not lead to data corruption. Also, what is the correct way to solve this issue? |
dbt can execute more than one statement corresponding to a materialization. |
This might be more of a Trino question than a dbt question. @findinpath, do you know if this kind of transaction works in Trino? (For example, the seeding sequence looks like this): As far as we've tested, if insert 3 fails, the results of inserts 1 and 2 are still in the seed table. Maybe the transaction start/commit in dbt-presto/trino is just a placeholder or a leftover copied from other drivers. (By the way, with Snowflake, the rollback in the previous sample seems to clear the results of all inserts, as expected.) |
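To make the "insert 3 fails, inserts 1 and 2 remain" behaviour concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in, so it says nothing about how Trino or Snowflake actually handle the statements. The `seed` table, the `seed()` helper and the duplicate-key failure are assumptions made up for illustration.

```python
import sqlite3


def seed(conn, row_ids, use_transaction):
    """Insert row_ids one at a time and return how many rows survive."""
    conn.execute("DROP TABLE IF EXISTS seed")
    conn.execute("CREATE TABLE seed (id INTEGER PRIMARY KEY)")
    if use_transaction:
        conn.execute("BEGIN")
    try:
        for row_id in row_ids:
            conn.execute("INSERT INTO seed (id) VALUES (?)", (row_id,))
        if use_transaction:
            conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        # The failing insert aborts the loop. Only an explicit transaction
        # gives us something to roll back.
        if use_transaction:
            conn.execute("ROLLBACK")
    return conn.execute("SELECT COUNT(*) FROM seed").fetchone()[0]


# isolation_level=None puts the sqlite3 module in autocommit mode:
# every statement is committed as soon as it finishes.
conn = sqlite3.connect(":memory:", isolation_level=None)

row_ids = [1, 2, 1]  # the third insert violates the primary key

autocommit_count = seed(conn, row_ids, use_transaction=False)
transactional_count = seed(conn, row_ids, use_transaction=True)
```

In autocommit mode the first two inserts stay in the table after the third one fails, which matches what we observed with the seed. With an explicit BEGIN and ROLLBACK all three are undone, which is the behaviour described for Snowflake above.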
Indeed, Trino doesn't fully support ACID semantics. See https://www.trinoforum.org/t/insert-performance-on-trino-jdbc-connectors/99/3 for a more detailed explanation about the way that |
@findinpath Do you have any solution in mind for this or should we discuss this on slack channel #dbt-presto-trino? |
@aakashnand I have no better solution than what you just posted. A slightly different approach is to set the session properties on
|
I believe |
As suggested by @hovaesco this property cannot be set using session property so we need to set it in the config.properties 🙁 |
We might have understood what happens, but we want to find more evidence. By the way @findinpath, could you run this example? Somehow it did not work in our environment with TWO inserts. ONE insert is fine. |
I think the reason the transaction does not work is on our side. Anyway, we think the reason the timeout occurred in the first place might be this:
dbt makes the connection with IsolationLevel.AUTOCOMMIT by default. So when the SQL is executed, it creates a request (#a). And when start_transaction is executed, #b creates the transaction in another request instance. We need more tests to see how to improve this, but there might be 2 options now:
|
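The hypothesis in the comment above (the query and the transaction live in separate requests, so the transaction sits idle while the query runs) can be illustrated with a toy model. This is not the Trino coordinator or the Python client: `ToyCoordinator`, `TransactionExpired` and the fake clock are all hypothetical names invented for this sketch, and the only value taken from the issue is the 5 minute idle timeout.

```python
IDLE_TIMEOUT_S = 300.0  # 5 minutes, the default mentioned in this issue


class TransactionExpired(Exception):
    """Raised when a transaction was idle for longer than the timeout."""


class ToyCoordinator:
    """Toy stand-in for a server that expires idle transactions."""

    def __init__(self):
        self.now = 0.0          # fake clock, in seconds
        self.last_access = {}   # transaction id -> time of last use
        self._next_id = 0

    def start_transaction(self):
        self._next_id += 1
        self.last_access[self._next_id] = self.now
        return self._next_id

    def _touch(self, txn_id):
        last = self.last_access.get(txn_id)
        if last is None or self.now - last > IDLE_TIMEOUT_S:
            self.last_access.pop(txn_id, None)
            raise TransactionExpired(f"transaction {txn_id} timed out")
        self.last_access[txn_id] = self.now

    def run_query(self, duration_s, txn_id=None):
        # A query only keeps a transaction alive if it is attached to it.
        if txn_id is not None:
            self._touch(txn_id)
        self.now += duration_s
        if txn_id is not None:
            self.last_access[txn_id] = self.now

    def commit(self, txn_id):
        self._touch(txn_id)
        del self.last_access[txn_id]


def dbt_like_flow(query_in_transaction):
    """START TRANSACTION, run a 6 minute query, then COMMIT."""
    coordinator = ToyCoordinator()
    txn_id = coordinator.start_transaction()
    coordinator.run_query(360.0, txn_id if query_in_transaction else None)
    try:
        coordinator.commit(txn_id)
        return "committed"
    except TransactionExpired:
        return "expired"


detached = dbt_like_flow(query_in_transaction=False)
attached = dbt_like_flow(query_in_transaction=True)
```

In this model the detached flow ends with an expired transaction at COMMIT, while the attached flow commits, because the long query refreshes the transaction's last-access time. Whether the real client and server behave this way is exactly what still needs to be verified.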
@hashhar / @ebyhr can you please take a look at the comment #20 (comment) and help out with feedback? |
@bachng2017 Good catch. That indeed looks like a bug in how transactions are handled in the Trino Python client. Can you file the issue against the Python client with simple reproduction code, so that we can verify that the fix that gets implemented actually works? If I understand correctly, any query run within a transaction that takes longer than the transaction timeout will exhibit this issue? |
@hashhar, thanks for the confirmation. Will file an issue for this. And yes, that is what we are facing
|
@hashhar created an issue for this at trino-python-client, pls check |
PR to address the issue: #30 |
Merged #30 |
Hello @hovaesco, the fix seems to eliminate (pass) all start_transaction/commit/rollback calls from the old code, for this reason
With the old code, the typical sequence happens like this:
In the new code, it just
So the effect is not too big, but the concept is very different from the old code. |
Could you please elaborate more on |
Background
I have been experiencing this issue for a long time. It happens when we try to execute a query that takes more than 5 minutes. To reproduce it, use dbt seed to insert enough rows that the operation takes more than 5 minutes.
Reason
The reason for this issue is that, in Trino, the default value of transaction.idle-timeout is 5 minutes, and dbt executes commands in the following flow, which causes this error:
START TRANSACTION
COMMIT <- This will throw an error
Error Log
Is there any specific reason why transactions are explicitly started in dbt-trino in spite of using auto-commit in python connection?
dbt-trino/dbt/adapters/trino/connections.py
Line 195 in 8088447
Other reference:
dbt-labs/dbt-presto#75