Improves documentation regarding providers and custom connections #13375

61 changes: 47 additions & 14 deletions docs/apache-airflow-providers/index.rst
@@ -24,12 +24,12 @@ Provider packages
Provider packages context
'''''''''''''''''''''''''

Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
Unlike Apache Airflow 1.10, Airflow 2.0 is delivered in multiple, separate but connected packages.
The core of the Airflow scheduling system is delivered as the ``apache-airflow`` package and there are around
60 provider packages which can be installed separately as so-called "Airflow Provider packages".
Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
etc.) Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
etc.). Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package).

You can install those provider packages separately in order to interface with a given provider. For those
providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
@@ -72,26 +72,25 @@ Separate provider packages provide the possibilities that were not available in
Extending Airflow Connections and Extra links via Providers
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Providers can not only deliver operators, hooks, sensor, transfer operators to communicate with
Providers can not only deliver operators, hooks, sensors, and transfer operators to communicate with a
multitude of external systems, but they can also extend Airflow. Airflow has several extension capabilities
that can be used by providers. Airflow automatically discovers which providers add those additional
capabilities and, once you install a provider package and restart Airflow, those become automatically
available to Airflow users.

The capabilities are:

* Adding Extra Links to operators delivered by the provider.
See :doc:`apache-airflow:howto/define_extra_link`
for description of what extra links are and examples of provider registering an operator with extra links
* Adding Extra Links to operators delivered by the provider. See :doc:`apache-airflow:howto/define_extra_link`
for a description of what extra links are and examples of a provider registering an operator with extra links.

* Adding custom connection types, extending connection form and handling custom form field behaviour for the
connections defined by the provider. See :doc:`apache-airflow:howto/connection` for description of
connections defined by the provider. See :doc:`apache-airflow:howto/connection` for a description of
connections and which custom connection capabilities you can define.

How to create your own provider
"""""""""""""""""""""""""""""""
'''''''''''''''''''''''''''''''

Adding provider to Airflow is just a matter of building a Python package and adding the right meta-data to
Adding a provider to Airflow is just a matter of building a Python package and adding the right meta-data to
the package. We use the standard Python mechanism to define
`entry points <https://docs.python.org/3/library/importlib.metadata.html#entry-points>`_. Your package
needs to define the appropriate entry point ``apache_airflow_provider``, which has to point to a callable
@@ -111,7 +110,7 @@ your own purpose) but the two important fields from the extensibility point of v
:doc:`apache-airflow:howto/connection` for more details.


When your providers are installed you can query the installed providers and their capabilities with
When your providers are installed you can query the installed providers and their capabilities with the
``airflow providers`` command. This way you can verify if your providers are properly recognized and whether
they define the extensions properly. See :doc:`cli-and-env-variables-ref` for details of available CLI
sub-commands.
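
The same check can be approximated from Python. The following is a minimal sketch, assuming only what is stated above: providers register a callable under the ``apache_airflow_provider`` entry point and that callable returns a dictionary. The helper names ``find_provider_entry_points`` and ``collect_provider_info`` are hypothetical and are not part of Airflow; this is not necessarily how Airflow performs discovery internally.

```python
from importlib import metadata

# Entry point group named in the documentation above.
PROVIDER_GROUP = "apache_airflow_provider"


def find_provider_entry_points(group=PROVIDER_GROUP):
    """Return the entry points registered in ``group`` (hypothetical helper)."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        return list(eps.select(group=group))
    # Older Python versions return a plain dict keyed by group name.
    return list(eps.get(group, []))


def collect_provider_info(group=PROVIDER_GROUP):
    """Load every registered callable and collect the dictionaries it returns."""
    infos = []
    for entry_point in find_provider_entry_points(group):
        get_info = entry_point.load()  # imports the module and resolves the callable
        infos.append(get_info())
    return infos


if __name__ == "__main__":
    for info in collect_provider_info():
        print(info.get("package-name"), info.get("hook-class-names", []))
```

If a provider you installed does not show up in this listing, its entry point is probably not registered in the package metadata, which is also why ``airflow providers`` would not report it.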
@@ -178,17 +177,51 @@ Creating your own providers
**When I write my own provider, do I need to do anything special to make it available to others?**

You do not need to do anything special besides creating the ``apache_airflow_provider`` entry point
returning properly formatted meta-data (dictionary with ``extra-links`` and ``hook-class-names`` fields.
returning properly formatted meta-data (dictionary with ``extra-links`` and ``hook-class-names`` fields).

**What do I need to do to turn a package into a provider?**
Review comment (Member): ❤️


You need to do the following to turn an existing Python package into a provider (see below for examples):

* Add the ``apache_airflow_provider`` entry point in the ``setup.cfg`` - this tells Airflow where to get
the required provider metadata
* Create the function that you refer to in the first step as part of your package: this function returns a
dictionary that contains all meta-data about your provider package; see also ``provider.yaml``
files in the community-managed provider packages as examples

Example ``setup.cfg``:

.. code-block:: cfg

    [options.entry_points]
    # the function get_provider_info is defined in myproviderpackage.somemodule
    apache_airflow_provider=
      provider_info=myproviderpackage.somemodule:get_provider_info

Example ``myproviderpackage/somemodule.py``:

.. code-block:: python

    def get_provider_info():
        return {
            "package-name": "my-package-name",
            "name": "name",
            "description": "a description",
            "hook-class-names": [
                "myproviderpackage.hooks.source.SourceHook",
            ],
            "versions": ["1.0.0"],
        }
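
Before publishing the package it can help to sanity-check the returned dictionary. The sketch below is only an illustration: ``check_provider_info`` is a hypothetical helper, and the keys it checks simply mirror the example above rather than Airflow's actual schema (the ``provider.yaml`` files in the community providers are the better reference).

```python
def get_provider_info():
    # Same dictionary as in the example above.
    return {
        "package-name": "my-package-name",
        "name": "name",
        "description": "a description",
        "hook-class-names": [
            "myproviderpackage.hooks.source.SourceHook",
        ],
        "versions": ["1.0.0"],
    }


def check_provider_info(info):
    """Return a list of problems found in a provider-info dictionary.

    Assumption: the keys checked here are the ones used in the example,
    not an authoritative list of what Airflow requires.
    """
    problems = []
    for key in ("package-name", "name", "description", "versions"):
        if key not in info:
            problems.append(f"missing key: {key}")
    for path in info.get("hook-class-names", []):
        # A hook must be given as a full dotted path: package.module.ClassName
        module, _, class_name = path.rpartition(".")
        if not module or not class_name.isidentifier():
            problems.append(f"not a dotted class path: {path}")
    return problems


if __name__ == "__main__":
    print(check_provider_info(get_provider_info()))
```

An empty list means the dictionary matches the shape used in the example; it does not guarantee that Airflow will accept it.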


**Should I named my provider specifically or should it be created in ``airflow.providers`` package?**
**Should I name my provider specifically or should it be created in ``airflow.providers`` package?**

We have quite a number (>70) of providers managed by the community and we are going to maintain them
together with Apache Airflow. All those providers have a well-defined structure and follow the
naming conventions we defined and they are all in the ``airflow.providers`` package. If your intention is
to contribute your provider, then you should follow those conventions and make a PR to Apache Airflow
to contribute to it. But you are free to use any package name as long as there are no conflicts with other
names,so preferably choose package that is in your "domain".
names, so preferably choose a package that is in your "domain".

**Is there a convention for a connection id and type?**

21 changes: 11 additions & 10 deletions docs/apache-airflow/howto/connection.rst
@@ -327,23 +327,24 @@ an secrets backend to retrieve connections. For more details see :doc:`/security
Custom connection types
-----------------------

Airflow allows to define custom connection types - including modification of the add/edit form for the
connections. Custom connection types are defined in community maintained providers, but also you can add
custom providers, that can add their own connection types. See :doc:`apache-airflow-providers:index`
for description on how to add your own connection type via custom providers.
Airflow allows the definition of custom connection types - including modifications of the add/edit form
for the connections. Custom connection types are defined in community maintained providers, but you
can also add a custom provider that adds custom connection types. See :doc:`apache-airflow-providers:index`
for a description of how to add custom providers.

The custom connection types are defined via Hooks delivered by the providers. The Hooks can implement
methods defined in the protocol :class:`~airflow.hooks.base_hook.DiscoverableHook`. Note that your custom
Hook should not derive from the class, the class is merely there to document expectations about class
fields and methods that your Hook might define.
methods defined in the protocol class :class:`~airflow.hooks.base_hook.DiscoverableHook`. Note that your
custom Hook should not derive from this class; it is a dummy example that documents expectations
regarding the class fields and methods that your Hook might define. Another good example is
:py:class:`~airflow.providers.jdbc.hooks.jdbc.JdbcHook`.

By implementing those method in the hooks of yours and exposing them via ``hook-class-names`` array in
By implementing those methods in your hooks and exposing them via the ``hook-class-names`` array in
the provider meta-data you can customize Airflow by:

* Adding custom connection type
* Adding custom connection types
* Adding automated Hook creation from the connection type
* Adding custom form widget to display and edit custom "extra" parameters in your connection URL
* Hiding fields that are not used for your connection
* Adding placeholders showing examples of how fields should be formatted

You can read more about details how to add custom connection type in the :doc:`apache-airflow-providers:index`
You can read more about how to add custom provider packages in the :doc:`apache-airflow-providers:index`
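
As an illustration of the customizations listed above, here is a minimal sketch of what such a hook might look like. It is deliberately written without importing Airflow so it runs on its own; a real hook would normally build on Airflow's base hook class. The attribute names, the method name and the dictionary keys are assumptions recalled from the ``DiscoverableHook`` protocol and ``JdbcHook``, so verify them against those classes. ``SourceHook`` and all of its values are hypothetical.

```python
class SourceHook:
    """Hypothetical hook exposing a custom connection type (sketch only)."""

    # Assumed protocol fields; check DiscoverableHook for the exact names.
    conn_name_attr = "source_conn_id"      # constructor argument holding the connection id
    default_conn_name = "source_default"   # connection id used when none is given
    conn_type = "source"                   # custom connection type
    hook_name = "Source"                   # human-readable name for the connection form

    def __init__(self, source_conn_id=default_conn_name):
        self.source_conn_id = source_conn_id

    @staticmethod
    def get_ui_field_behaviour():
        """Describe how the standard connection form fields should behave."""
        return {
            # Fields that are not used for this connection type.
            "hidden_fields": ["port", "schema"],
            # Alternative labels for standard fields.
            "relabeling": {"host": "Source URL"},
            # Example values shown in empty fields.
            "placeholders": {"host": "https://source.example.com"},
        }


if __name__ == "__main__":
    hook = SourceHook()
    print(hook.source_conn_id, SourceHook.get_ui_field_behaviour())
```

To make Airflow aware of such a class, its full dotted path would be listed in the ``hook-class-names`` array of the provider meta-data.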