SnowflakeToSlackOperator: Use Slack API and send result as file #24660

Closed
wants to merge 1 commit into from

Conversation

Taragolis
Contributor

closes: #9145

@boring-cyborg boring-cyborg bot added area:providers provider:snowflake Issues related to Snowflake provider labels Jun 25, 2022
@eladkal
Contributor

eladkal commented Jun 26, 2022

@Taragolis note that we also have PrestoToSlackOperator
#23979
So this change should also be applied there.

@alexkruc FYI since you are working on #24243

template_ext: Sequence[str] = ('.sql', '.jinja', '.j2')
SUPPORTED_FILE_FORMATS = (
    'csv',
    'parquet',

Contributor

I don't think sending Parquet to Slack is a reasonable use case?

Contributor Author

If you ask me personally, the answer is very clear: "No, I don't think sending Parquet to Slack is reasonable."

But users want to do strange things, and the ability to send Parquet costs almost nothing on the development side. That is why I kept it.
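
To illustrate why an extra format is cheap to support, here is a minimal sketch of choosing a serializer from the filename extension. It is not the code from this PR: `SERIALIZERS`, `serialize_for_upload` and the two helper functions are hypothetical names, rows are plain dicts rather than a DataFrame, and JSON stands in for Parquet so that the example runs with the standard library only.

```python
import csv
import io
import json
from pathlib import PurePath
from typing import Callable, Dict, List

Rows = List[Dict[str, object]]


def rows_to_csv(rows: Rows) -> bytes:
    """Serialize rows as CSV with a header taken from the first row."""
    buffer = io.StringIO()
    if rows:
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def rows_to_json(rows: Rows) -> bytes:
    """Serialize rows as a JSON array of objects."""
    return json.dumps(rows).encode("utf-8")


# Hypothetical registry: supporting another format means adding one entry here.
SERIALIZERS: Dict[str, Callable[[Rows], bytes]] = {
    "csv": rows_to_csv,
    "json": rows_to_json,
}


def serialize_for_upload(filename: str, rows: Rows) -> bytes:
    """Pick a serializer from the filename extension, failing early if unsupported."""
    extension = PurePath(filename).suffix.lstrip(".").lower()
    try:
        serializer = SERIALIZERS[extension]
    except KeyError:
        supported = ", ".join(sorted(SERIALIZERS))
        raise ValueError(
            f"Unsupported file format {extension!r}; expected one of: {supported}"
        ) from None
    return serializer(rows)
```

Validating the extension before running the query would let a misconfigured task fail without touching the database.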

@alexkruc
Contributor

alexkruc commented Jun 26, 2022

@Taragolis I've just submitted a PR that will deprecate SnowflakeToSlackOperator and PrestoToSlackOperator in favour of a generic SqlToSlackOperator (#23979), as @eladkal mentioned :).
With it, everything that uses DbApiHook and implements get_pandas_df will be able to send results to Slack.
There were no changes to the logic of the operators, but there are some changes regarding the Slack API that I think can benefit your feature as well :)

Sending a file in a generic way sounds fantastic! But maybe it's worth waiting for my PR to be reviewed and merged first; that way, everything that uses DbApiHook will automatically inherit this feature :)
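
The "anything that implements get_pandas_df" idea can be sketched as a duck-typed check on the hook. This is only an illustration under stated assumptions, not the code from the linked PR: `FakeSqlHook`, `FakeHttpHook` and `fetch_results` are hypothetical, and a list of dicts stands in for a DataFrame so that the example runs without Airflow or pandas.

```python
from typing import Any, Dict, List, Optional

Rows = List[Dict[str, Any]]


class FakeSqlHook:
    """Hypothetical stand-in for a DbApiHook-style hook."""

    def get_pandas_df(self, sql: str, parameters: Optional[dict] = None) -> Rows:
        # A real hook would run the query; this one just echoes its inputs.
        return [{"sql": sql, "parameters": parameters}]


class FakeHttpHook:
    """Hypothetical hook without get_pandas_df, so it cannot be used here."""

    def run(self, endpoint: str) -> str:
        return endpoint


def fetch_results(hook: Any, sql: str, parameters: Optional[dict] = None) -> Rows:
    """Query through any hook that exposes a callable get_pandas_df."""
    if not callable(getattr(hook, "get_pandas_df", None)):
        raise TypeError(f"{type(hook).__name__} does not implement get_pandas_df")
    return hook.get_pandas_df(sql, parameters=parameters)
```

With a check like this, the operator does not need to know which database it talks to; an unsuitable hook is rejected with a clear error instead of an AttributeError later on.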

@potiuk potiuk force-pushed the slack-to-snowflake-by-file branch from c40df15 to 56b07f3 Compare June 26, 2022 10:18
@potiuk
Member

potiuk commented Jun 26, 2022

Rebased to fix selective check problem from #24665 after it's merged.

@Taragolis
Contributor Author

@alexkruc Nice! You could grab whatever you think is useful from this PR. FYI, I've manually tested all 3 options:

  • Slack Incoming Webhook as a text
  • Slack API as text
  • Slack API as a file

And it seems to work fine. The reason this PR is a draft is that there are no unit tests for the new implementation; but if I hadn't opened it, the changes would, as usual, have sat on my laptop for a couple of months.

@eladkal I noticed this at a very late stage, and it actually showed me another way to do this generically.
Rather than creating SnowflakeToSlack, PrestoToSlack and others, it is probably better to create a base class as part of the Slack provider, let's call it PandasDataFrameToSlack, which implements everything except the method that returns the actual pd.DataFrame.

However, creating it as part of slack-provider brings some difficulties with cross-dependencies: first it needs to be implemented in slack-provider, then added as an extra to sql-provider, and after that both of them should be added as dependencies of all the DBApi-based providers.

Also, Slack itself seems a bit abandoned: even the recommended Slack API doesn't have its own connection.

@eladkal
Contributor

eladkal commented Jun 26, 2022

You could grab whatever you think is useful from this PR

We don't have to. Alex's PR is about generalizing the current operator. This PR is about adding more functionality. My suggestion is to wait for Alex's PR to be merged; then you can rebase and apply your changes to the newly added generic operator.

let's call it PandasDataFrameToSlack

But it's not; it's SqlToSlack. Airflow can't transfer an in-memory object between tasks, because operators run on different workers, so you can't split the transfer into sql -> df and df -> slack. But let's keep these comments on #24663, as they're not related to this PR :)

next add as extra to sql-provider

The operator is in the Slack provider. You won't need an SQL provider (at least not as a direct dependency), and currently we don't have an SQL provider. The PR that attempts to add it hasn't been merged, and even once it is, it doesn't move the DbApi hook yet.

@Taragolis
Contributor Author

But it's not; it's SqlToSlack. Airflow can't transfer an in-memory object between tasks, because operators run on different workers, so you can't split the transfer into sql -> df and df -> slack. But let's keep these comments on #24663, as they're not related to this PR :)

I mean creating a base operator that has no final implementation of how to get this DataFrame, something like:

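from typing import TYPE_CHECKING

from airflow.models import BaseOperator

if TYPE_CHECKING:
    from airflow.utils.context import Context
    from pandas import DataFrame


class BasePandasDataFrameToSlackOperator(BaseOperator):
    ...

    def get_pandas_df(self) -> 'DataFrame':
        # Subclasses supply the DataFrame; the base class only knows how to send it.
        raise NotImplementedError("You need to implement this method first")

    ...

    def execute(self, context: 'Context') -> None:
        df = self.get_pandas_df()  # Better to move after some Slack messages validation

        # A configured filename means "upload as a file"; otherwise send as text.
        if self.filename:
            self.send_file(df=df)
        else:
            self.send_text(context=context, df=df)

        self.log.debug('Finished sending data to Slack')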

Finally, everyone who wants to use it as a base just needs to implement this one method, for example:

class SnowflakeToSlackOperator(BasePandasDataFrameToSlackOperator):
    ...

    def get_pandas_df(self):
        self._validate_sql()
        self.log.info('Running SQL query: %s', self.sql)
        return self.snowflake_hook.get_pandas_df(self.sql, parameters=self.parameters)
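
For readers who want to run the pattern without Airflow, here is a self-contained sketch of the same base-class idea. Everything in it is an assumption made for illustration: `BaseRowsToSlack` and `StaticRowsToSlack` are hypothetical names, rows are plain dicts instead of a DataFrame, and the send methods only record what would have been sent instead of calling Slack. It uses `abc` rather than `NotImplementedError`, so a subclass that forgets the method fails at instantiation instead of at execution.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Rows = List[Dict[str, object]]


class BaseRowsToSlack(ABC):
    """Owns the sending logic; subclasses only say where the rows come from."""

    def __init__(self, filename: Optional[str] = None) -> None:
        self.filename = filename
        self.sent: List[str] = []  # records what would have been sent to Slack

    @abstractmethod
    def get_rows(self) -> Rows:
        """Return the query result; the only method a subclass must provide."""

    def send_file(self, rows: Rows) -> None:
        self.sent.append(f"file:{self.filename}:{len(rows)} rows")

    def send_text(self, rows: Rows) -> None:
        self.sent.append(f"text:{len(rows)} rows")

    def execute(self) -> None:
        rows = self.get_rows()
        # A configured filename selects file upload; otherwise send as text.
        if self.filename:
            self.send_file(rows)
        else:
            self.send_text(rows)


class StaticRowsToSlack(BaseRowsToSlack):
    """Hypothetical subclass that returns rows given at construction time."""

    def __init__(self, rows: Rows, filename: Optional[str] = None) -> None:
        super().__init__(filename=filename)
        self._rows = rows

    def get_rows(self) -> Rows:
        return self._rows
```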

@eladkal
Contributor

eladkal commented Jun 26, 2022

The generic operator will handle this. If implemented right, there will be no need for individual classes in each provider. Again, if you have comments on this, please take them to #24663.

@Taragolis
Contributor Author

Closing for now.
#24663 will provide a more generic way of sending SQL output to Slack.

If additional enhancements are required, a new PR for SqlToSlackOperator will be created.

@Taragolis Taragolis closed this Jun 26, 2022
@eladkal
Contributor

eladkal commented Jun 29, 2022

@Taragolis now that #24663 is merged you can revisit it.
The change you need to make is only in SqlToSlackOperator

@Taragolis Taragolis deleted the slack-to-snowflake-by-file branch January 14, 2023 18:32
Labels
area:providers provider:snowflake Issues related to Snowflake provider
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance SqlToSlackOperator to support attachments
4 participants