feat: allow hive sql to be provided as config #312

Merged: 5 commits, Aug 10, 2020

Conversation

Member

@feng-tao feng-tao commented Aug 8, 2020

Summary of Changes

This PR fixes amundsen-io/amundsen#552 by allowing users to provide their own Hive metastore SQL.

Tests

Yes. Added a unit test for the new config.

Documentation

What documentation did you add or modify and why? Add any relevant links then remove this line

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
  • PR passes make test


codecov-commenter commented Aug 8, 2020

Codecov Report

Merging #312 into master will increase coverage by 0.79%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #312      +/-   ##
==========================================
+ Coverage   74.30%   75.10%   +0.79%     
==========================================
  Files         105      105              
  Lines        4492     4997     +505     
  Branches      419      518      +99     
==========================================
+ Hits         3338     3753     +415     
- Misses       1049     1127      +78     
- Partials      105      117      +12     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...builder/extractor/hive_table_metadata_extractor.py | 94.33% <100.00%> (+0.22%) | ⬆️ |
| databuilder/rest_api/rest_api_failure_handlers.py | 90.00% <0.00%> (-3.34%) | ⬇️ |
| databuilder/rest_api/base_rest_api_query.py | 92.59% <0.00%> (-1.53%) | ⬇️ |
| ...ilder/transformer/regex_str_replace_transformer.py | 95.34% <0.00%> (-1.08%) | ⬇️ |
| ...tabuilder/extractor/postgres_metadata_extractor.py | 94.68% <0.00%> (-0.49%) | ⬇️ |
| databuilder/callback/call_back.py | 92.30% <0.00%> (-0.29%) | ⬇️ |
| ...abuilder/extractor/snowflake_metadata_extractor.py | 95.09% <0.00%> (-0.22%) | ⬇️ |
| databuilder/loader/file_system_neo4j_csv_loader.py | 89.23% <0.00%> (-0.06%) | ⬇️ |
| databuilder/extractor/db2_metadata_extractor.py | 0.00% <0.00%> (ø) | |
| databuilder/extractor/mysql_metadata_extractor.py | 0.00% <0.00%> (ø) | |

... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d24cba9...86f5668.

Golodhros previously approved these changes Aug 10, 2020
Member

@Golodhros Golodhros left a comment

LGTM

Contributor

@jinhyukchang jinhyukchang left a comment

Left one comment.

where_clause_suffix=conf.get_string(HiveTableMetadataExtractor.WHERE_CLAUSE_SUFFIX_KEY))

self.sql_stmt = conf.get_string(HiveTableMetadataExtractor.EXTRACT_SQL.format(
Contributor

I think it's missing a closing bracket?

conf.get_string(HiveTableMetadataExtractor.EXTRACT_SQL.format
--> conf.get_string(HiveTableMetadataExtractor.EXTRACT_SQL).format

By the way, we may not need to add the where clause if they provide the SQL statement. WDYT?
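
For readers following the bracket discussion, here is a minimal sketch of why the placement matters. It uses a plain dict as a stand-in for the config object; the `get_string` helper, the key value `extract_sql`, and the SQL template are assumptions made for illustration and are not taken from the databuilder code.

```python
# Hypothetical stand-ins: a dict-backed config and an illustrative key name.
EXTRACT_SQL = "extract_sql"  # assumed config key, for illustration only


def get_string(conf: dict, key: str) -> str:
    """Look up a string value, raising KeyError if the key is absent."""
    return conf[key]


conf = {EXTRACT_SQL: "SELECT * FROM TBLS {where_clause_suffix}"}


def format_key_then_lookup(conf: dict, suffix: str) -> str:
    # Bracket in the wrong place: .format() is applied to the KEY.
    # The key has no placeholders, so it is unchanged, and the stored
    # template comes back with its placeholder still unfilled.
    return get_string(conf, EXTRACT_SQL.format(where_clause_suffix=suffix))


def lookup_then_format(conf: dict, suffix: str) -> str:
    # Suggested placement: look up the template, then format the VALUE.
    return get_string(conf, EXTRACT_SQL).format(where_clause_suffix=suffix)


unformatted = format_key_then_lookup(conf, "WHERE NAME = 'prod'")
formatted = lookup_then_format(conf, "WHERE NAME = 'prod'")
```

In this sketch the first variant fails silently rather than raising, because `str.format` ignores unused keyword arguments.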

Member Author

Actually yeah, we should just let them provide the SQL.
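
The behaviour agreed on in this thread could look roughly like the sketch below: a user-supplied statement is used verbatim, and the where clause suffix is only applied to the built-in default. All names here (`resolve_sql`, `DEFAULT_SQL_TEMPLATE`, the `extract_sql` and `where_clause_suffix` keys, and the default query) are hypothetical, and a plain mapping stands in for the real config object.

```python
from typing import Mapping, Optional

# Hypothetical default template; not the query used by databuilder.
DEFAULT_SQL_TEMPLATE = "SELECT t.TBL_NAME FROM TBLS t {where_clause_suffix}"


def resolve_sql(conf: Mapping[str, str]) -> str:
    """Pick the SQL statement to run, preferring a user-provided one."""
    user_sql: Optional[str] = conf.get("extract_sql")
    if user_sql:
        # User-provided SQL is returned as-is: no where clause is appended.
        return user_sql
    # No custom SQL: fall back to the default and fill in the suffix.
    suffix = conf.get("where_clause_suffix", "")
    return DEFAULT_SQL_TEMPLATE.format(where_clause_suffix=suffix).strip()


custom = resolve_sql({"extract_sql": "SELECT 1"})
default_plain = resolve_sql({})
default_filtered = resolve_sql({"where_clause_suffix": "WHERE t.TBL_TYPE = 'X'"})
```

Skipping the suffix for custom SQL keeps the caller fully in control of the statement, at the cost of ignoring a suffix that is set alongside it.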

@feng-tao feng-tao merged commit 8075a6c into master Aug 10, 2020
@feng-tao feng-tao deleted the tfeng_change_hive_sql branch August 10, 2020 23:03
jerryzhu2007 pushed a commit to kylg/amundsendatabuilder that referenced this pull request Aug 20, 2020
* commit 'e14b33e776929f8b020f1c6fec75d0fb83687693': (23 commits)
  Fix Athena sample DAG (amundsen-io#341)
  fix: Update postgres_sample_dag to set table extract job as upstream for elastic search publisher (amundsen-io#340)
  chore: mypy cleanup (convert last comment types, remove noqa imports) (amundsen-io#338)
  chore: Convert typings to mypy (amundsen-io#311)
  chore: replace all references of Lyft repo with Amundsen (amundsen-io#323)
  feat: add github actions for databuilder (amundsen-io#336)
  build: fix broken tests in Python 3.7, test in CI (amundsen-io#334)
  fix(deps): Unpin attrs (amundsen-io#332)
  ci: add dependabot config (amundsen-io#330)
  Change repo name in travis file (amundsen-io#324)
  tests: add mock for bigquery auth (amundsen-io#313)
  feat: allow hive sql to be provided as config (amundsen-io#312)
  chore: remove python2 (amundsen-io#310)
  chore: update deps for databuilder (amundsen-io#309)
  fix: cypher statement param issue in Neo4jStalenessRemovalTask (amundsen-io#307)
  fix: Added missing job tag key in hive_sample_dag.py (amundsen-io#308)
  feat: enhance glue extractor (amundsen-io#306)
  fix: Fix sql for missing columns and mysql based dialects (#550) (amundsen-io#305)
  docs: Fix broken doc link to dashboard_execution model (amundsen-io#296)
  chore: apply license headers to all the source files (amundsen-io#304)
  ...

# Conflicts:
#	README.md
#	databuilder/extractor/kafka_source_extractor.py
#	databuilder/publisher/neo4j_csv_publisher.py
#	docs/models.md
#	example/scripts/sample_data_loader.py
#	setup.py

Successfully merging this pull request may close these issues.

amundsendatabuilder -> HiveTableMetadataExtractor only works with mysql innodb
5 participants