Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup SDK to be a Namespace Package so it can be extended #1205

Merged
merged 4 commits into from
Oct 15, 2020

Conversation

NathanielRN
Copy link
Contributor

Description

Sets up SDK package to be a Namespace Package.

A .pyi file (instead of a .py file) allows packages to be registered and imported under the opentelemetry.sdk.* namespace.

This will allow vendors to ship packages under a opentelemetry.sdk.extension package, for example.

Fixes #1204

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • You can create a package with the following structure:
opentelemetry-sdk-extension-my-extension
├── CHANGELOG.md
├── LICENSE
├── README.rst
├── setup.cfg
├── setup.py
├── src
│   ├── opentelemetry
│   │   └── sdk
│   │       └── my_extension
│   │           ├── __init__.py
│   │           └── version.py
│   └── opentelemetry_sdk_extension_my_extension.egg-info
│       ├── PKG-INFO
│       ├── SOURCES.txt
│       ├── dependency_links.txt
│       ├── requires.txt
│       └── top_level.txt
└── tests
    ├── __init__.py
    ├── base_test.py
    ├── test_automatic.py
    └── test_programmatic.py

and see that if you run

pip install -e ./opentelemetry-sdk-extenion

you can now start a python environment and install BOTH the regular SDK and the extension:

$ python
Python 3.8.2 (default, Aug 25 2020, 09:23:57)
[Clang 12.0.0 (clang-1200.0.32.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import opentelemetry.sdk
>>> import opentelemetry.sdk.my_extension

Checklist:

  • Followed the style guidelines of this project
  • [] Changelogs have been updated Not necessary?

@NathanielRN NathanielRN requested a review from a team October 6, 2020 01:18
@owais
Copy link
Contributor

owais commented Oct 6, 2020

Are there any other implications when moving from init.py to init.pyi?

@aabmass
Copy link
Member

aabmass commented Oct 6, 2020

The pyi files aren't used at runtime at all, AFAIK, so this changes the behavior. Unless we are ok to change the behavior, this won't work

from . import metrics, trace, util
__all__ = ["metrics", "trace", "util"]

@NathanielRN
Copy link
Contributor Author

@owais What I understand echoes what @aabmass said: Anything in this file (__init__.pyi) is not runtime code. Instead, it is just meant to give type hints.

PEP 561 would consider this package a "stub"

"stubs" - files containing only type information, empty of runtime code (the filename ends in .pyi).

And according to this explanation from the website for a Python IDE,

a *.pyi Python Interface file is another way to describe the contents of a module ... This file is simply a Python skeleton with the appropriate structure, call signature, and return values to match the functions, attributes, classes, and methods defined in a module.

We could add typing in the future potentially, but the code that was in __init__.py can be safely removed since they just exposed the metrics, trace, and util folder at the opentelemetry.sdk level even though those were already visible at that level since they are the names of the sub-packages to the opentelemetry-sdk/src/opentelemetry/sdk/ folder. (According to my understanding).

@owais
Copy link
Contributor

owais commented Oct 7, 2020

Sounds good. Slight concern that this might force us to never have init.py which is sometimes useful if we want to change internal structure without breaking public APIs or allow to easily expose things via public API that would otherwise require more drastic code structure changes. Also, how do we decide which module get to have init.py and which ones don't? I think we probably need some sort of guidelines on code-structure and packaging to sort this out.

@NathanielRN
Copy link
Contributor Author

@owais I can see what you're saying about the benefits. On the other hand, the current __init__.py isn't doing anything, which means we know no one is accessing things from opentelemetry.sdk at the top level. Removing it now at the beginning prevents the issue where someone ends up adding something to __init__.py which really would make us more locked it to exposing things at the top level. I think the current system of importing, which asks, "from which sub-directory of the OTel SDK do you want to import from? (i.e. traces?, metrics?, util?)" makes sense in terms of developers wanting to find the pieces they're looking for.

I would also say that one of the points of OTel is for it to be a developer's No. 1 choice for everything tracing. That's why in the API package we let it be extended so that we can easily bring in community-implemented instrumentations and exporters while not bundling them in the same API package. That would be the same goal here, since you asked

how do we decide which module get to have init.py and which ones don't?

I would say it would be for well-defined packages (like API and SDK uniquely are in the spec) which can support vendor specific features without including them in the packages themselves (like we do not want vendor specific features in the SDK and API packages).

Because there is a use case to provide implementations of the API defined items (e.g. custom Google/Microsoft/Amazon/Splunk/Datadog Tracers, TracerProviders, Resources, IdsGenerator, Propagators) then it makes sense to me to "extend" the SDK as opentelemetry.sdk.extension. It could make sense to just have opentelemetry.extension but I think that's less intuitive since it's not extending the API as much as it is extending the SDK.

@owais
Copy link
Contributor

owais commented Oct 9, 2020

Makes sense @NathanielRN. No objections from my side on removing it and making this technically possible but I still wonder if 3rd party extensions can just live in their own independent packages instead. For example, opentelemetry_aws.processor instead of opentelemetry.sdk.aws.processor. LightStep already does something similar for example. https://github.com/lightstep/otel-launcher-python and Splunk will have something similar very soon.

I don't see a strong reason for 3rd party packages to re-use opentelemetry import paths. This can still cause conflicts down the line as we as upstream cannot guarantee changing the directory structure in a way that breaks an unknown 3rd party package.

We do use this pattern today to extend some packages such as opentelemetry.instrumentation but the key difference is that all those packages are maintained by us in this project. In other words, they are all "official" OpenTelemetry pacakges. We essentially ship these packages under the root opentelemetry import path. The fact that most of the source is divided into multiple packages is just an implementation detail IMO.

I think only core and contrib packages should use the opentelemetry.* import path. 3rd party packages should use their own import paths. This makes package ownership very obvious. Users can look at the import path and immediately tell if they are using code from official Otel project or some other source.

Also, other OpenTelemetry projects have clear distinction between official and 3rd party packages as overloading import paths like this is not as easy with other languages. So it'd be better if we stay consistent with other languages as well.

I think @aabmass had the same concern in the last SIG call.

Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NathanielRN can you open a separate issue for the discussion on where "extensions" should live, maybe summarize the pro/cons of each approach? We should document the guidelines on code-structure and packaging like Owais mentioned.

As for this PR, I agree removing the opentelemetry/sdk/__init__.py file makes sense so I'm approving. I'd be interested if anyone remembers why its there.

@NathanielRN
Copy link
Contributor Author

@aabmass Sounds good. I've made issue #1234

@lzchen lzchen merged commit 6f514a3 into open-telemetry:master Oct 15, 2020
@NathanielRN NathanielRN deleted the sdk-namespace-package branch October 15, 2020 20:17
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
@ffe4 ffe4 mentioned this pull request Nov 14, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SDK package is setup as a Regular Package when it should be a Namespace Package
4 participants