feat(tags): Export and Import Functionality for Superset Dashboards and Charts #30833

Open
wants to merge 25 commits into
base: master


asher-lab

SUMMARY

This PR is related to Discussion: #30629 by Spens

Motivation

We have adopted the tagging functionality and it is working quite well for us for dashboards and charts. Our use case and process require us to build dashboards on a development server, export the dashboards, commit the YAMLs to source control, and finally import the ZIPs to our production server. We would like the tags we assign on the development server to be captured in the export so they are replicated to our production server.

Proposed Change

UI elements are not affected by this change. All changes are related to updated logic/processing of the existing export and import functionality.

Only tags of type "custom" are exported. Tags of type "type" and "owner" also exist but are not exported.

Export Logic
When an object (Dashboard/Chart) is exported, a new parameter is added to the exported YAML file. The parameter is "tags" and the value is an array of tag names. Here is an example:

slice_name: Vaccine Candidates per Approach & Stage
description: null
tags:
  - CovidTag
  - Items that start with the letter V
  - Superset Example Data
certified_by: null

...
Because tags have additional information (for example, a description), a new file is needed to hold this information. The file is called "tags.yaml" and lives in the top level of the zip file. For example:

dashboard_export_20241017T025146.zip
  charts (folder)
  dashboards (folder)
  databases (folder)
  datasets (folder)
  metadata.yaml (file)
  tags.yaml (file)

The format of the tags.yaml file is an array of tags:

tags:
- tag_name: CovidTag
  description: Everything related to Covid
- tag_name: Items that start with the letter V
  description: Just for fun, tag Vanilla, Vikings, and anything that starts with V
- tag_name: Superset Example Data
  description: Native example data included with Superset
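
The export steps described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Tag` dataclass, the local `TagType` enum, and `build_tag_export` are hypothetical stand-ins, and the result is kept as plain dictionaries rather than being serialized to YAML.

```python
from dataclasses import dataclass
from enum import Enum


class TagType(Enum):
    # Hypothetical stand-in for the tag types named in this description.
    custom = "custom"
    type = "type"
    owner = "owner"


@dataclass(frozen=True)
class Tag:
    # Hypothetical, simplified tag record (not the Superset model).
    name: str
    type: TagType
    description: str = ""


def build_tag_export(tags):
    """Return (tag names for the object's YAML, content for tags.yaml)."""
    # Only custom tags are exported; "type" and "owner" tags are skipped.
    custom = [t for t in tags if t.type == TagType.custom]
    names = [t.name for t in custom]
    entries = [{"tag_name": t.name, "description": t.description} for t in custom]
    return names, {"tags": entries}


chart_tags = [
    Tag("CovidTag", TagType.custom, "Everything related to Covid"),
    Tag("type:chart", TagType.type),
    Tag("owner:1", TagType.owner),
]
names, tags_file = build_tag_export(chart_tags)
```

Here `names` would go under the "tags" key of the chart or dashboard YAML, and `tags_file` corresponds to the top-level tags.yaml.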

Import Logic
When an object is imported, the code checks whether any tags are included in the import file.
If so, each tag is created if it does not already exist and is then assigned to the imported object. If the tag already exists, it is simply assigned, with no changes to the tag itself.

Details

  1. A tag can only be created if the user doing the import has permission to create tags. Otherwise the tagging is ignored.
  2. A created tag picks up the current time as the "created_on" and "changed_on" properties for the tag
  3. A created tag picks up the current user as the "created_by_fk" and "changed_by_fk"
  4. A created tag picks up the description from the tags.yaml file described above
  5. If a tag already exists, the description is not updated (for consistency with other export/import behavior - for example, if you import a dashboard and the chart included in the zip already exists, Superset does not update that existing chart)
    Existing Superset behavior: When importing objects that already exist, the user is prompted whether it is ok to overwrite. If no, the import process aborts. If yes, the old object is replaced with the new object.

To be consistent with this behavior, when an object is imported with "overwrite" specified, old tags that are no longer specified with the object are removed, new tags that did not originally exist are created, and old tags that remain in the new set of tags are retained unchanged (their descriptions, for example, are not updated).

  1. It is important to note that in this proposal, imported tags replace the object's existing tags rather than being added to them
    This is intentional because, for the use case of mirroring the settings and properties from one system to another, you must be able to remove tags from the target system. Otherwise, for example, if you rename a tag, the new name would be added and the old name would stick around, which is not the desired behavior.
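
The import and overwrite rules above could be sketched as below. This is only an illustration under stated assumptions: `reconcile_tags` and its arguments are hypothetical, tags are modeled as plain dictionaries keyed by name, and a missing create permission is assumed to skip only the tags that would need to be created.

```python
from datetime import datetime, timezone


def reconcile_tags(tags_db, current, incoming, descriptions, user_id, can_create):
    """Hypothetical helper: compute the tags of an object imported with overwrite.

    tags_db:      dict of tag name -> tag record on the target system (mutated)
    current:      set of tag names on the object before the import
    incoming:     list of tag names from the imported object's YAML
    descriptions: dict of tag name -> description taken from tags.yaml
    Returns (assigned, removed) as two sets of tag names.
    """
    assigned = set()
    for name in incoming:
        if name not in tags_db:
            if not can_create:
                # Assumption: without permission the missing tag is skipped.
                continue
            now = datetime.now(timezone.utc)
            tags_db[name] = {
                "description": descriptions.get(name),
                "created_on": now,
                "changed_on": now,
                "created_by_fk": user_id,
                "changed_by_fk": user_id,
            }
        # Existing tags are assigned as-is; their description is not updated.
        assigned.add(name)
    # Tags no longer listed on the imported object are dropped from it.
    removed = set(current) - assigned
    return assigned, removed


tags_db = {"CovidTag": {"description": "old"}}
assigned, removed = reconcile_tags(
    tags_db,
    current={"CovidTag", "Stale"},
    incoming=["CovidTag", "New"],
    descriptions={"CovidTag": "changed", "New": "n"},
    user_id=7,
    can_create=True,
)
```

In this run the object ends up with "CovidTag" and "New", "Stale" is removed from it, and the description of the pre-existing "CovidTag" stays "old".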

Thank you to anyone taking the time to read and comment!

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

dosubot bot added the dashboard:export (Related to exporting dashboards) and dashboard:import (Related to importing dashboards) labels on Nov 4, 2024
docker/pythonpath_dev/superset_config.py (outdated, resolved)
superset/commands/chart/export.py (outdated, resolved)
@michael-s-molina
Member

Thank you for the PR @asher-lab. Could you add unit tests for the new feature with the feature flag on and off?

Asher Manangan and others added 2 commits November 5, 2024 19:24
@asher-lab
Author

Good day, @michael-s-molina. We've added logic to handle cases when the Tagging System is disabled in Superset. When disabled, the tags.yaml file and any associated tags within charts.yaml and dashboard.yaml will not be exported, and the same applies to the import operation.

Could you provide guidance on creating a unit test for this new feature? I have limited experience with writing unit tests, so any help would be appreciated.

Thank you!

@michael-s-molina
Member

Could you provide guidance on creating a unit test for this new feature? I have limited experience with writing unit tests, so any help would be appreciated.

Hi @asher-lab. You can check this file for examples.

Add Unit Testing for Tag Export and Import in Superset
@asher-lab
Author

Hi @michael-s-molina. I have added unit tests for the new feature with the feature flag TAGGING_SYSTEM on and off. Thank you.
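
For readers following along, the flag-on and flag-off behavior being tested could be sketched like this. It is a hedged illustration rather than the tests in this PR: `build_export` is a hypothetical helper, the flag is passed in as a plain boolean instead of being read from Superset's feature flags, and file contents are plain dictionaries.

```python
def build_export(objects, tag_entries, tagging_enabled):
    """Hypothetical helper: return a dict of path -> payload for the export ZIP.

    objects:         dict of path -> payload dict for charts and dashboards
    tag_entries:     list of {"tag_name": ..., "description": ...} dicts
    tagging_enabled: stand-in for the TAGGING_SYSTEM feature flag
    """
    files = {}
    for path, payload in objects.items():
        payload = dict(payload)  # copy so the caller's payload is not mutated
        if not tagging_enabled:
            # With tagging disabled, no tags appear in the object's YAML.
            payload.pop("tags", None)
        files[path] = payload
    if tagging_enabled:
        # tags.yaml is only written when tagging is enabled.
        files["tags.yaml"] = {"tags": tag_entries}
    return files


objects = {"charts/example.yaml": {"slice_name": "Example", "tags": ["CovidTag"]}}
entries = [{"tag_name": "CovidTag", "description": "Everything related to Covid"}]

flag_on = build_export(objects, entries, tagging_enabled=True)
flag_off = build_export(objects, entries, tagging_enabled=False)
```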


codecov bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 49.28571% with 71 lines in your changes missing coverage. Please review.

Project coverage is 65.35%. Comparing base (76d897e) to head (99fb8f1).
Report is 950 commits behind head on master.

Files with missing lines                                 Patch %   Lines
superset/commands/tag/export.py                          35.84%    34 Missing ⚠️
superset/commands/chart/export.py                        30.76%    9 Missing ⚠️
...perset/commands/dashboard/importers/v1/__init__.py    27.27%    8 Missing ⚠️
superset/commands/dashboard/export.py                    30.00%    7 Missing ⚠️
superset/commands/importers/v1/utils.py                  82.50%    7 Missing ⚠️
superset/commands/chart/importers/v1/__init__.py         44.44%    5 Missing ⚠️
superset/commands/importers/v1/__init__.py               0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #30833      +/-   ##
==========================================
+ Coverage   60.48%   65.35%   +4.86%     
==========================================
  Files        1931      537    -1394     
  Lines       76236    39053   -37183     
  Branches     8568        0    -8568     
==========================================
- Hits        46114    25522   -20592     
+ Misses      28017    13531   -14486     
+ Partials     2105        0    -2105     
Flag         Coverage Δ
hive         48.79% <28.57%> (-0.37%) ⬇️
javascript   ?
presto       53.25% <28.57%> (-0.55%) ⬇️
python       65.35% <49.28%> (+1.86%) ⬆️
unit         60.84% <49.28%> (+3.22%) ⬆️


@asher-lab
Author

I see a number of checks: 10 are failing and 23 are successful. Could you point me to the tests I need to look at?

@@ -1550,7 +1550,7 @@ class ImportV1ChartSchema(Schema):
dataset_uuid = fields.UUID(required=True)
is_managed_externally = fields.Boolean(allow_none=True, dump_default=False)
external_url = fields.String(allow_none=True)

tag = fields.List(fields.String(), allow_none=True)
Contributor

Could we have it named as tags, since it's a list?

model.tags if hasattr(model, "tags") else []
)
# Filter out any tags that contain "type:" in their name
payload["tag"] = [tag.name for tag in tags if not any(x in tag.name for x in ["type:", "owner:"])]
Contributor

Since users can create custom tags whose names contain those strings, could we filter on the tag type instead? Something like:

payload["tags"] = [tag.name for tag in tags if tag.type == TagType.custom]
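
To illustrate the concern, here is a small self-contained comparison of the two filters. The `Tag` tuple and the sample tag names are hypothetical, and the tag type is modeled as a plain string rather than Superset's `TagType` enum.

```python
from collections import namedtuple

# Hypothetical, simplified tag record; "type" is a plain string here.
Tag = namedtuple("Tag", ["name", "type"])

tags = [
    Tag("type:chart", "type"),
    Tag("owner:1", "owner"),
    Tag("prototype: v2", "custom"),  # user-created name that contains "type:"
    Tag("CovidTag", "custom"),
]

# Substring filter: also drops the custom tag whose name contains "type:".
by_substring = [
    t.name for t in tags if not any(p in t.name for p in ["type:", "owner:"])
]

# Type filter: keeps every custom tag regardless of its name.
by_type = [t.name for t in tags if t.type == "custom"]
```

With this sample data the substring filter keeps only "CovidTag", while the type filter keeps both custom tags.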

@@ -453,7 +453,7 @@ class ImportV1DashboardSchema(Schema):
certified_by = fields.String(allow_none=True)
certification_details = fields.String(allow_none=True)
published = fields.Boolean(allow_none=True)

tag = fields.List(fields.String(), allow_none=True)
Contributor

Same here

tags = (
model.tags if hasattr(model, "tags") else []
)
payload["tag"] = [tag.name for tag in tags if not any(prefix in tag.name for prefix in ["type:", "owner:"])]
Contributor

Same here

@Vitor-Avila
Contributor

Thank you so much @asher-lab! This is super cool and very useful 😍 🙌 I left a few comments, and I plan to give it another pass tomorrow.

@asher-lab
Author

asher-lab commented Nov 7, 2024

Hi @Vitor-Avila, we have fixed the integration test issue, renamed "tag" to "tags", and updated the filter to use TagType. Thank you!

@asher-lab
Author

Hi @Vitor-Avila and @michael-s-molina, if these changes are merged, when would they be expected to be released?

@rusackas
Member

rusackas commented Nov 8, 2024

Running CI 🤞 This would definitely be part of 5.0, which needs a new estimated date. It will have missed the boat for 4.1.0, but could make a 4.2.0, though it's unclear if/when that would happen. It would not make a 4.1.1 release since that would only include fixes. Feel free to join us on slack in the release strategy or release announcement channels, if you want to get involved with the process.

@asher-lab
Author

asher-lab commented Nov 12, 2024

I am currently fixing the failing tests and will push a commit to address them. Thank you. I realized that we need to work with the pre-commit checks and integration tests...

* Added self.contents in database and datasets for import

* Add self.content in query for import

* Run precommit and fix some issues in superset

* Add ExportTagsCommand

* Fix ImportExamplesCommand arguments to make Optional

* Remove should_export_tags parameter to solve pre-commit error

* Fix ExportDashboard command calling ExportTagsCommand from ExportChartsCommand

* Remove protected access for ExportTagsCommand _export

* Fix some Pylint issues

* Ensure Feature Flag for TAGGING_SYSTEM is turned off

---------

Co-authored-by: Asher Manangan <[email protected]>
@asher-lab
Author

Good morning team, @rusackas @Vitor-Avila and @michael-s-molina. Could I get another CI run for this one, please? I ran it on my own fork and all tests passed. 🤞

@Vitor-Avila
Contributor

hey @asher-lab,
Sorry for the delay on my end. Approved the workflows and hoping to check it today

Labels
dashboard:export (Related to exporting dashboards), dashboard:import (Related to importing dashboards), size/XL

4 participants