Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Http protocol sensor #99

Closed
wants to merge 16 commits into from
Closed

Http protocol sensor #99

wants to merge 16 commits into from

Conversation

gtoonstra
Copy link
Contributor

This is the approach chosen for the HTTP hook and includes a very simple SimpleHttpOperator. This op only checks if the response status code is 20x or a failure. The sensor calls a simple GET and if the response code is 20x, it flows through. (regex is not implemented there, yet, to check on object status).

The most important bit is the hook, which is a wrapper around HTTP operations and which other, more complex operators can use to interact with other systems over HTTP.

The example http dag demonstrates how this appears to an end user.

Gotcha: for the Simple operator, the code in the dag must ensure that the data is of the right type and the correct content-type header is included, or the request will fail. This is because it's not viable to parse the header and know how to work with the input data to make it sensible.

If this simple interaction doesn't cut it, a specific operator for that system should be built. The simple op is not intended to be more generic nor clever than this, it probably serves more as an example of how to build your own http operators than useful in many use cases.

Obviously, more specific operators would reduce the amount of input parameters in the task instance declaration.

@mistercrunch
Copy link
Member

This is awesome and the more I think about it, the more I think there are things that can be generalized. We should definitely be thinking about how it could be derived to create more specific operators and provide the right hooks for it.

@@ -18,6 +18,7 @@
'samba_hook': ['SambaHook'],
'sqlite_hook': ['SqliteHook'],
'S3_hook': ['S3Hook'],
'http_hook': [ 'HttpHook' ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I like to leave trailing comas

@mistercrunch
Copy link
Member

You have lots of commits here, people I work with tend to rebase their branch as opposed to merging which keeps the commit list saner, but don't worry about it if it's too complex at this point.

@@ -331,3 +331,44 @@ def poke(self, context):
logging.info(
'Checking if the time ({0}) has come'.format(self.target_time))
return datetime.now().time() > self.target_time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 blank line

@gtoonstra
Copy link
Contributor Author

So ugh!, that rebase didn't work out, tried that in a safe branch first because of the potential issues and it got me. But I checked the code and that's ok. probably happened because I mixed it with a regular merge.

regex is still to be done. Lambda sounds like a nice alternative. Today is gone, so looking at this later.

@gtoonstra
Copy link
Contributor Author

Lambda function introduced too. I eventually decided to return the entire response, because how you grab content differs per response type. This way, there's also access to the status_code, which may be slightly different 200=OK is generic, but for a POST it sometimes is 201[CREATED]. I called it response_check because of that. I believe the original request object is also available.

If space is too limited, it's possible to pass in a regular function instead, which will make code easier to read for complex check logic.

The rebase is making this pull request ugly. This branch is complete, but I'm making a new pull request that's clean for integration.

@mistercrunch
Copy link
Member

Closing this, should taking conversation to https://github.com/airbnb/airflow/pull/103/files

mobuchowski pushed a commit to mobuchowski/airflow that referenced this pull request Jan 4, 2022
* refactor

Signed-off-by: Julien Le Dem <[email protected]>

* refactor

Signed-off-by: Julien Le Dem <[email protected]>
mobuchowski pushed a commit to mobuchowski/airflow that referenced this pull request Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants