Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-3705] Fix PostgresHook get_conn to use conn_name_attr #5841

Merged

Conversation

coopergillan
Copy link
Contributor

@coopergillan coopergillan commented Aug 16, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Update the get_conn method of PostgresHook to ensure that the conn_name_attr specified is used to establish the connection via psycopg2.

This is an issue when subclassing PostgresHook. The postgres_conn_id attribute must be overridden since it is explicitly used in establishing the connection. Now any given conn_name_attr set when subclassing will be used when connecting.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    • Adds a small subclass of PostgresHook with a new conn_name_attr which is used in TestPostgresHookConn. This illustrates the problem with the default behavior for get_conn with three test failures that say:
AttributeError: 'UnitTestPostgresHook' object has no attribute 'postgres_conn_id'

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • Passes flake8

@coopergillan
Copy link
Contributor Author

Build failure says this at the very bottom:

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

Does this seem likely caused by these code changes, or is this a known/transient issue?

@coopergillan coopergillan reopened this Aug 16, 2019
@coopergillan coopergillan force-pushed the AIRFLOW-3705-bug-fix-PostgresHook-get_conn branch from 5ba1722 to d8beac9 Compare August 17, 2019 18:49
@coopergillan
Copy link
Contributor Author

Looks like the first build failure must have been a transient error. Anything else needed for this one to get merged?

@coopergillan
Copy link
Contributor Author

@ashb @kaxil @potiuk - anything else you think is needed here?

@ashb
Copy link
Member

ashb commented Aug 19, 2019

Can you explain exactly why this is needed? Is there a reason you can't set postgres_conn_id inside subclass's constructor? Something like:

        class UnitTestPostgresHook(PostgresHook):
            def __int__(self, test_conn_id, **kwargs)
                super().__init__(postgres_conn_id=test_conn_id, **kwargs)

That way no change is needed to airflow.

@coopergillan
Copy link
Contributor Author

@ashb - I see your point. This looked like a bug to us when we sub-classed the Hook to connect to a Redshift instance. We set conn_name_attr to redshift_conn_id, but this made it impossible to use get_conn.

I'm working on your suggestion in my own code and have not yet gotten my own tests to pass, but I can keep trying.

I think I'm seeing now that this attribute really isn't meant to be changed and is really more for the entire hook class, correct?

@ashb
Copy link
Member

ashb commented Aug 21, 2019

Oh, I hadn't noticed that conn_name_attr is part of the "api" defined from the DbApiHook (I haven't used any of the DB hooks in Airflow), in which case although my change would probably work, the this PR probably makes sense.

@coopergillan coopergillan force-pushed the AIRFLOW-3705-bug-fix-PostgresHook-get_conn branch from d8beac9 to a7918be Compare August 23, 2019 21:03
Update PostgresHook's get_conn method to directly call the specified
conn_name_attr rather that always using self.postgres_conn_id.

Currently subclassing PostgresHook requires overriding the
postgres_conn_id attribute in order to establish a separate connection.

Add an additional unit test for this case checking that the subclassed
PostgresHook's get_conn calls the correct arguments and that the hook
calls the correction connection_id in get_connection.
@coopergillan coopergillan force-pushed the AIRFLOW-3705-bug-fix-PostgresHook-get_conn branch from a7918be to d75880c Compare August 23, 2019 23:12
@coopergillan
Copy link
Contributor Author

@ashb - all right, I incorporated those changes you suggested and got a green build. Any other thoughts or suggestions?

Thanks for the suggestion! I do like how much more explicit this test is than what I had before 👍

@ashb ashb merged commit f823a66 into apache:master Aug 28, 2019
Jerryguo pushed a commit to Jerryguo/airflow that referenced this pull request Sep 2, 2019
…e#5841)

Update PostgresHook's get_conn method to directly call the specified
conn_name_attr rather that always using self.postgres_conn_id.

Currently subclassing PostgresHook requires overriding the
postgres_conn_id attribute in order to establish a separate connection.

Add an additional unit test for this case checking that the subclassed
PostgresHook's get_conn calls the correct arguments and that the hook
calls the correction connection_id in get_connection.
ashb pushed a commit that referenced this pull request Oct 11, 2019
Update PostgresHook's get_conn method to directly call the specified
conn_name_attr rather that always using self.postgres_conn_id.

Currently subclassing PostgresHook requires overriding the
postgres_conn_id attribute in order to establish a separate connection.

Add an additional unit test for this case checking that the subclassed
PostgresHook's get_conn calls the correct arguments and that the hook
calls the correction connection_id in get_connection.

(cherry picked from commit f823a66)
adityav pushed a commit to adityav/airflow that referenced this pull request Oct 14, 2019
…e#5841)

Update PostgresHook's get_conn method to directly call the specified
conn_name_attr rather that always using self.postgres_conn_id.

Currently subclassing PostgresHook requires overriding the
postgres_conn_id attribute in order to establish a separate connection.

Add an additional unit test for this case checking that the subclassed
PostgresHook's get_conn calls the correct arguments and that the hook
calls the correction connection_id in get_connection.

(cherry picked from commit f823a66)
@coopergillan coopergillan deleted the AIRFLOW-3705-bug-fix-PostgresHook-get_conn branch December 6, 2019 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants