[AIRFLOW-5751] add get_uri method to Connection #6426
Codecov Report
@@ Coverage Diff @@
## master #6426 +/- ##
=========================================
Coverage ? 84.09%
=========================================
Files ? 672
Lines ? 38176
Branches ? 0
=========================================
Hits ? 32105
Misses ? 6071
Partials ? 0
airflow/models/connection.py

    user_block = ''
    if self.login is not None:
        user_block += quote(self.login, safe='')
I personally prefer not to build URLs myself. What do you think about using the https://docs.python.org/2/library/urlparse.html#urlparse.urlunparse function?
Ok so one reason is that urlunparse did not seem that helpful to me.
I noticed we use urlunparse in CLI connection add (to build a STDOUT message).
But even there, we construct user / password / host / port with a format string (because it does not handle that for you). And it does not handle quoting for you either.
It seems more useful if you are starting with something that has been parsed with urlparse.
In any case we have to be very careful to align our quoting with the handling in parse_from_uri.
It does not handle urlencoding of the extra field either. So it did not seem of much use.
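To illustrate the point about urlunparse, here is a minimal sketch (the credentials, host, and query values are made up for the example). It shows that urlunparse only joins six pre-built components, so the user / password / host / port block and all quoting still have to be assembled by hand:

```python
from urllib.parse import quote, urlencode, urlunparse

# Illustrative values only; none of these come from the PR.
login, password = "user@example", "p/ss:word"

# urlunparse takes the netloc as one opaque string, so we build and
# percent-encode the credentials ourselves.
netloc = "{}:{}@{}:{}".format(
    quote(login, safe=""),
    quote(password, safe=""),
    "localhost",
    5432,
)

# The query string also has to be encoded by the caller.
query = urlencode({"sslmode": "require"})

# (scheme, netloc, path, params, query, fragment)
uri = urlunparse(("postgres", netloc, "/mydb", "", query, ""))
print(uri)
```

So the function saves little here: the parts that are easy to get wrong (quoting and the optional pieces of the netloc) remain the caller's job.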
One thing I was on the fence about was whether to bother with minifying the formatting. By that I mean if user / password is omitted, should we omit the :@.
For example we could do this:
    def get_uri(self) -> str:
        uri = '{conn_type}://{login}:{password}@{host}:{port}/{schema}?{query}'.format(
            conn_type=self.conn_type.replace('_', '-'),
            login=quote(self.login or '', safe=''),
            password=quote(self.password or ''),
            host=quote(self.host or '', safe=''),
            port=self.port or '',
            schema=quote(self.schema, safe=''),
            query=urlencode(self.extra_dejson),
        )
        return uri
Much more elegant this way. However, if we do this, login will be ''. When login is omitted in the URI, the login attribute is parsed as None. As a result, 3 of the tests fail (the last 3). So the more verbose approach seems to be a more faithful inverse of the parse_from_uri function.
WDYT?
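The None versus '' difference can be seen with the standard library alone. A small sketch, using made-up URIs and plain urlparse rather than parse_from_uri itself:

```python
from urllib.parse import urlparse

# URI with empty ":@" markers, as the single format string would emit
# when login and password are missing.
with_markers = urlparse("postgres://:@localhost:5432/mydb")

# URI with the credentials block omitted entirely.
without_markers = urlparse("postgres://localhost:5432/mydb")

# Empty markers parse to empty strings ...
print(repr(with_markers.username), repr(with_markers.password))

# ... while an omitted block parses to None.
print(repr(without_markers.username), repr(without_markers.password))
```

This is why always emitting the markers cannot round-trip a connection whose login is None.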
airflow/models/connection.py

    if self.port:
        host_block += ':{}'.format(self.port)

    if self.schema:
I think schema doesn't depend on host.
For instance AIRFLOW__CORE__SQLALCHEMY_CONN=postgresql:///airflow
Should the schema include the leading / already? i.e. do we end up with a // here?
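For reference, a short sketch of how the standard library splits a schema-only URI like the one above. It uses plain urlparse, and stripping the leading slash with a slice is an assumption about how a parser might derive the schema:

```python
from urllib.parse import urlparse

parts = urlparse("postgresql:///airflow")

# The host part is empty, and the path keeps its leading slash.
print(repr(parts.netloc), repr(parts.hostname), repr(parts.path))

# Assumed convention: the schema is the path without its leading "/".
schema = parts.path[1:]

# Rebuilding with a single "/" separator and an empty host block gives
# back the original URI, with no doubled separator in the path.
rebuilt = "{}://{}/{}".format(parts.scheme, "", schema)
print(rebuilt)
```

Under that convention the stored schema does not include the slash, so adding exactly one separator when building the URI is enough.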
ok @ashb I have updated this to support (with tests) schema only, login only, password only, and port only
ok lastly (one can hope) I have added another set of tests verifying that when we create a connection from init attributes (rather than a URI), everything still works consistently -- URIs generated still parse to an equivalent connection object.
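The round-trip property being tested can be sketched as follows. SimpleConn is a hypothetical stand-in written for this example, not Airflow's Connection class, and it leaves out the extra field; it only shows the shape of the check (build from attributes, generate a URI, parse it back, compare):

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote, unquote, urlparse


@dataclass
class SimpleConn:
    """Hypothetical, simplified stand-in for a connection object."""
    conn_type: str
    login: Optional[str] = None
    password: Optional[str] = None
    host: Optional[str] = None
    port: Optional[int] = None
    schema: Optional[str] = None

    def get_uri(self) -> str:
        # Emit each block only when the attribute is set, so that
        # omitted attributes parse back as None.
        uri = "{}://".format(self.conn_type)
        auth = ""
        if self.login is not None:
            auth += quote(self.login, safe="")
        if self.password is not None:
            auth += ":" + quote(self.password, safe="")
        if auth:
            uri += auth + "@"
        if self.host:
            uri += quote(self.host, safe="")
        if self.port:
            uri += ":{}".format(self.port)
        if self.schema:
            uri += "/" + quote(self.schema, safe="")
        return uri

    @classmethod
    def from_uri(cls, uri: str) -> "SimpleConn":
        p = urlparse(uri)
        return cls(
            conn_type=p.scheme,
            login=unquote(p.username) if p.username is not None else None,
            password=unquote(p.password) if p.password is not None else None,
            host=unquote(p.hostname) if p.hostname else None,
            port=p.port,
            schema=unquote(p.path[1:]) or None,
        )


cases = [
    SimpleConn("postgres", "user", "p@ss/word", "localhost", 5432, "my db"),
    SimpleConn("postgresql", schema="airflow"),   # schema only
    SimpleConn("postgres", login="user"),         # login only
    SimpleConn("postgres", host="localhost", port=5432),
]
for conn in cases:
    # The generated URI must parse back to an equal object.
    assert SimpleConn.from_uri(conn.get_uri()) == conn
```

The cases deliberately use lowercase hosts, since urlparse lowercases the hostname, and the sketch does not cover a password without a login.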
I would like to point out that some hooks have methods for creating a URI.
Interesting. Yeah I think it's good for hooks to have this. I'd like it if every hook had a static method that took all valid params and produced a correct URI. With a E.g. for GCP, such a method could look like this:
Stepping back, do you think this PR is a worthwhile addition (i.e. adding It's a small thing, but I think it's good to have a standardized method of generating a uri that is guaranteed to parse correctly. So a user could build their connection object with init params, and then generate a URI that will work. No fussing with urllib. No digging through to see how things are parsed.
@dstandish Could you rebase this onto the latest master please? Sorry for leaving this.
If it still passes under pytest (it almost certainly should) then we can merge
* Add a convenience method `get_uri` on `Connection` object to generate the URI for a connection.
rebased; tests pass; thanks :)
Jira
Description
Building a URI for use in a connection env var can be a nuisance. This is a convenience method that will do it for you, given a connection object.
So if you have created one in the UI, you could do BaseHook.get_connection(conn_id).get_uri().
Or build one using init params on the Connection object and call get_uri().
I think it could also be nice if each hook had a get_uri method that would take all relevant params and produce a correctly encoded URI. If that were implemented, it could use this function for that purpose.
I added tests to verify that generated URIs, when parsed again, produce the same connection object. For this I used the same URIs we were already testing. As part of this, it also made sense to refactor the existing from_uri test.
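As a rough sketch of the env var use case, assuming Airflow's AIRFLOW_CONN_<CONN_ID> naming convention for connection environment variables. The helper name, the conn id, and the URI are made up for the example, and the URI string stands in for the result of calling get_uri() on a real connection:

```python
import os


def connection_env_var(conn_id: str) -> str:
    # Hypothetical helper: upper-case the conn id and add the
    # AIRFLOW_CONN_ prefix used for connection env vars.
    return "AIRFLOW_CONN_" + conn_id.upper()


# Placeholder for the string a get_uri() call would return.
uri = "postgres://user:p%40ss@localhost:5432/mydb"

os.environ[connection_env_var("my_postgres")] = uri
print(connection_env_var("my_postgres"))
```

With a generated URI, the percent-encoding of special characters in the password is already done, which is the part that tends to be a nuisance by hand.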