feat(ingest/vertica): performance improvement and bug fixes #8328

Merged
merged 89 commits into from
Aug 1, 2023
Changes from 66 commits
b80af76
fixed all code quality erros
vishalkSimplify May 26, 2023
b61fd94
final plugin checking for github actions now
vishalkSimplify May 29, 2023
2c903ba
Merge branch 'datahubversion2'
vishalkSimplify May 29, 2023
ee08ad9
removed stdout from vertica test
vishalkSimplify May 29, 2023
9da0c84
lint fixed vertica test
vishalkSimplify May 29, 2023
299633e
fixed linting
vishalkSimplify May 29, 2023
3112b68
fixed import sorting
vishalkSimplify May 30, 2023
50b3bbf
fixed mpypy lint error
vishalkSimplify May 30, 2023
a62dbc1
fixed get_view_properties
vishalkSimplify May 30, 2023
a7ac4b0
updated golden file
vishalkSimplify Jun 1, 2023
f25ca48
updated sources in datahub react frontend
vishalkSimplify Jun 1, 2023
d54bcea
started fixing integration tests
vishalkSimplify Jun 1, 2023
a75fb13
updated integration tests
vishalkSimplify Jun 2, 2023
714d1e5
updated the golden files
vishalkSimplify Jun 5, 2023
3b1747d
fixed linting in vertica.py file
vishalkSimplify Jun 5, 2023
9ded885
updated sqlalchemy vertica dialect version
vishalkSimplify Jun 5, 2023
9dc916e
fixed linting in vertica.py
vishalkSimplify Jun 5, 2023
bcfc242
fixed function argument
vishalkSimplify Jun 5, 2023
e971973
sd
vishalkSimplify Jun 5, 2023
fc17ff1
added ingore path in vertica integration test
vishalkSimplify Jun 6, 2023
463fd67
Merge branch 'dthubv2'
vishalkSimplify Jun 6, 2023
f674a33
removed normalised table names
vishalkSimplify Jun 6, 2023
160f026
updated vertica sqlalchemy dialect version
vishalkSimplify Jun 7, 2023
e8659e2
uncommented vertica from setup.py
vishalkSimplify Jun 7, 2023
43a2a4a
updated golden files
vishalkSimplify Jun 7, 2023
88d1d77
updated udx language in vertica mce file
vishalkSimplify Jun 7, 2023
36ba423
upgrading vertica dialect to 0.0.05 version
vishalkSimplify Jun 9, 2023
b3726d8
changed vertica documentations
vishalkSimplify Jun 9, 2023
f977d6f
Merge branch 'master' into master
vishalkSimplify Jun 9, 2023
2c6fe2f
Update metadata-ingestion/src/datahub/ingestion/source/sql/vertica.py
vishalkSimplify Jun 10, 2023
11cc348
Merge branch 'master' into master
vishalkSimplify Jun 10, 2023
8eb883c
fixing function contracts and code cleaning to remove type: ignore
vishalkSimplify Jun 11, 2023
1584785
removed all type ignores and fixed function contracts
vishalkSimplify Jun 13, 2023
766276b
Merge branch 'master' into master
vishalkSimplify Jun 13, 2023
a0e1d95
fixed funtions contracts and removed all type ignores .
vishalkSimplify Jun 15, 2023
4b81c94
Merge branch 'master' into master
vishalkSimplify Jun 15, 2023
fa703ac
updated vertica golden file
vishalkSimplify Jun 15, 2023
9155e4d
updated vertica golden file
vishalkSimplify Jun 15, 2023
a8b7875
Merge branch 'master' into master
vishalkSimplify Jun 15, 2023
c07926c
updated ignore path for udx langauge and updated mce golden files
vishalkSimplify Jun 15, 2023
e77b2ce
added vertica as base_dev_requirements in setup.py
vishalkSimplify Jun 15, 2023
5a90db3
Merge branch 'master' into master
vishalkSimplify Jun 16, 2023
caaebac
Merge branch 'master' into master
vishalkSimplify Jun 16, 2023
9099d42
Merge branch 'master' into master
vishalkSimplify Jun 17, 2023
e8b542c
Merge branch 'master' into master
vishalkSimplify Jun 20, 2023
9f15fa8
Merge branch 'master' into master
vishalkSimplify Jun 23, 2023
d5ad4d3
Apply suggestions from code review
vishalkSimplify Jun 23, 2023
36d6a24
fixed variable name of table,view, projection
vishalkSimplify Jun 23, 2023
3c99bf1
first version with schema level caching
vishalkSimplify Jun 27, 2023
c3c0d43
fixed integration tests
vishalkSimplify Jun 28, 2023
6b7cbd9
fixed table_columns for projection columns in profiling
vishalkSimplify Jun 28, 2023
b664ee2
reformated vertica.py
vishalkSimplify Jun 28, 2023
c03f121
added vertica in setup.py
vishalkSimplify Jun 28, 2023
f2e253a
Merge branch 'master' into pluginv2
vishalkSimplify Jun 28, 2023
bdf212b
Update vertica.py
vishalkSimplify Jun 28, 2023
fc78c3b
Merge branch 'master' into pluginv2
vishalkSimplify Jun 29, 2023
9380812
fixed linting
vishalkSimplify Jun 29, 2023
7da8156
updated mce file
vishalkSimplify Jun 29, 2023
a8e0a65
fixed changes according to PR comments
vishalkSimplify Jun 30, 2023
2203c13
Merge branch 'master' into pluginv2
vishalkSimplify Jun 30, 2023
b62814f
fixed changes according to PR comments
vishalkSimplify Jun 30, 2023
4c70647
Merge branch 'datahub-project:master' into pluginv2
vishalkSimplify Jul 10, 2023
3f45e3d
cleaned vertica.py file
vishalkSimplify Jul 10, 2023
82e929a
cleaned vertica.py file
vishalkSimplify Jul 10, 2023
88fee38
updated setup.py with new vertica dialect
vishalkSimplify Jul 11, 2023
44ef84d
Merge branch 'master' into pluginv2
anshbansal Jul 13, 2023
cf47978
Update vertica_mces_with_db_golden.json
vishalkSimplify Jul 13, 2023
c6c5864
Merge branch 'datahub-project:master' into pluginv2
vishalkSimplify Jul 14, 2023
382e6b3
updted golden file
vishalkSimplify Jul 14, 2023
cc64bfb
updated ignore paths
vishalkSimplify Jul 14, 2023
6901fe4
Merge branch 'master' into pluginv2
vishalkSimplify Jul 14, 2023
010ee59
Merge branch 'master' into pluginv2
vishalkSimplify Jul 16, 2023
5e6ebde
Merge branch 'master' into pluginv2
vishalkSimplify Jul 17, 2023
934d691
Merge branch 'master' into pluginv2
asikowitz Jul 17, 2023
14cd4de
Merge branch 'master' into pluginv2
vishalkSimplify Jul 18, 2023
0eb36f9
Merge branch 'master' into pluginv2
vishalkSimplify Jul 19, 2023
6290fa6
Merge branch 'master' into pluginv2
vishalkSimplify Jul 20, 2023
cfff6ec
Merge branch 'master' into pluginv2
vishalkSimplify Jul 21, 2023
db8d7bd
Merge branch 'master' into pluginv2
vishalkSimplify Jul 22, 2023
1e1d116
Merge branch 'master' into pluginv2
vishalkSimplify Jul 24, 2023
9e45dfb
Merge branch 'master' into pluginv2
vishalkSimplify Jul 25, 2023
58a63c6
Merge branch 'master' into pluginv2
vishalkSimplify Jul 26, 2023
ae59efc
updated golden files
vishalkSimplify Jul 26, 2023
0b23d7d
Merge branch 'master' into pluginv2
vishalkSimplify Jul 27, 2023
9103824
Merge branch 'master' into pluginv2
vishalkSimplify Jul 29, 2023
34f0bcc
Merge branch 'master' into pluginv2
vishalkSimplify Jul 30, 2023
1a1401b
Merge branch 'master' into pluginv2
vishalkSimplify Jul 31, 2023
b7d2fb2
Merge branch 'master' into pluginv2
vishalkSimplify Aug 1, 2023
7fe509a
Merge branch 'master' into pluginv2
vishalkSimplify Aug 1, 2023
@@ -83,7 +83,7 @@ import {
PROJECT_NAME,
} from './lookml';
import { PRESTO, PRESTO_HOST_PORT, PRESTO_DATABASE, PRESTO_USERNAME, PRESTO_PASSWORD } from './presto';
-import { BIGQUERY_BETA, DBT_CLOUD, MYSQL, POWER_BI, UNITY_CATALOG } from '../constants';
+import { BIGQUERY_BETA, DBT_CLOUD, MYSQL, POWER_BI, UNITY_CATALOG, VERTICA } from '../constants';
import { BIGQUERY_BETA_PROJECT_ID, DATASET_ALLOW, DATASET_DENY, PROJECT_ALLOW, PROJECT_DENY } from './bigqueryBeta';
import { MYSQL_HOST_PORT, MYSQL_PASSWORD, MYSQL_USERNAME } from './mysql';
import { MSSQL, MSSQL_DATABASE, MSSQL_HOST_PORT, MSSQL_PASSWORD, MSSQL_USERNAME } from './mssql';
@@ -130,6 +130,17 @@ import {
WORKSPACE_ID_DENY,
} from './powerbi';

+import {
+VERTICA_HOST_PORT,
+VERTICA_DATABASE,
+VERTICA_USERNAME,
+VERTICA_PASSWORD,
+INCLUDE_PROJECTIONS,
+INCLUDE_MLMODELS,
+INCLUDE_VIEW_LINEAGE,
+INCLUDE_PROJECTIONS_LINEAGE,
+} from './vertica';
+
export enum RecipeSections {
Connection = 0,
Filter = 1,
@@ -428,6 +439,20 @@ export const RECIPE_FIELDS: RecipeFields = {
],
filterSectionTooltip: 'Include or exclude specific PowerBI Workspaces from ingestion.',
},
+[VERTICA]: {
+fields: [VERTICA_HOST_PORT, VERTICA_DATABASE, VERTICA_USERNAME, VERTICA_PASSWORD],
+filterFields: [SCHEMA_ALLOW, SCHEMA_DENY, TABLE_ALLOW, TABLE_DENY, VIEW_ALLOW, VIEW_DENY],
+advancedFields: [
+INCLUDE_TABLES,
+INCLUDE_VIEWS,
+INCLUDE_PROJECTIONS,
+INCLUDE_MLMODELS,
+INCLUDE_VIEW_LINEAGE,
+INCLUDE_PROJECTIONS_LINEAGE,
+TABLE_PROFILING_ENABLED,
+],
+filterSectionTooltip: 'Include or exclude specific Schemas, Tables, Views and Projections from ingestion.',
+},
};

export const CONNECTORS_WITH_FORM = new Set(Object.keys(RECIPE_FIELDS));
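
The hunk above registers a `[VERTICA]` entry in `RECIPE_FIELDS`, and `CONNECTORS_WITH_FORM` is built from that map's keys, so adding the entry is what makes the connector appear in the set. A minimal sketch of that relationship follows. It uses simplified, hypothetical types (`FieldGroup`, plain strings in place of `RecipeField` objects) and assumes the `VERTICA` constant is the string `'vertica'`, which this diff does not show.

```typescript
// Simplified stand-in for the real RecipeFields value type (hypothetical).
interface FieldGroup {
    fields: string[];
    filterFields: string[];
    advancedFields: string[];
    filterSectionTooltip?: string;
}

// Assumption: the real VERTICA constant in '../constants' equals 'vertica'.
const VERTICA = 'vertica';

// Field identifiers are written as strings here purely for illustration.
const verticaGroup: FieldGroup = {
    fields: ['VERTICA_HOST_PORT', 'VERTICA_DATABASE', 'VERTICA_USERNAME', 'VERTICA_PASSWORD'],
    filterFields: ['SCHEMA_ALLOW', 'SCHEMA_DENY', 'TABLE_ALLOW', 'TABLE_DENY', 'VIEW_ALLOW', 'VIEW_DENY'],
    advancedFields: [
        'INCLUDE_TABLES',
        'INCLUDE_VIEWS',
        'INCLUDE_PROJECTIONS',
        'INCLUDE_MLMODELS',
        'INCLUDE_VIEW_LINEAGE',
        'INCLUDE_PROJECTIONS_LINEAGE',
        'TABLE_PROFILING_ENABLED',
    ],
    filterSectionTooltip: 'Include or exclude specific Schemas, Tables, Views and Projections from ingestion.',
};

const RECIPE_FIELDS_SKETCH: Record<string, FieldGroup> = {
    [VERTICA]: verticaGroup,
};

// Same construction as in the diff: the set of connectors is the set of map keys.
const CONNECTORS_WITH_FORM_SKETCH = new Set(Object.keys(RECIPE_FIELDS_SKETCH));

// Hypothetical helper: does this connector have a form-based recipe builder?
function hasForm(connector: string): boolean {
    return CONNECTORS_WITH_FORM_SKETCH.has(connector);
}
```

Because the set is derived from the map, no separate registration step is needed in this sketch; a connector without an entry simply falls outside the set.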
119 changes: 119 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/RecipeForm/vertica.ts
@@ -0,0 +1,119 @@
import { get } from 'lodash';
import { RecipeField, FieldType } from './common';

export const VERTICA_HOST_PORT: RecipeField = {
    name: 'host_port',
    label: 'Host and Port',
    tooltip:
        "The host and port where Vertica is running. For example, 'localhost:5433'. Note: this host must be accessible on the network where DataHub is running (or allowed via an IP Allow List, AWS PrivateLink, etc).",
    type: FieldType.TEXT,
    fieldPath: 'source.config.host_port',
    placeholder: 'localhost:5433',
    required: true,
    rules: null,
};

export const VERTICA_DATABASE: RecipeField = {
    name: 'database',
    label: 'Database',
    tooltip: 'Ingest metadata for a specific Database.',
    type: FieldType.TEXT,
    fieldPath: 'source.config.database',
    placeholder: 'Vertica_Database',
    required: true,
    rules: null,
};

export const VERTICA_USERNAME: RecipeField = {
    name: 'username',
    label: 'Username',
    tooltip: 'The Vertica username used to extract metadata.',
    type: FieldType.TEXT,
    fieldPath: 'source.config.username',
    placeholder: 'Vertica_Username',
    required: true,
    rules: null,
};

export const VERTICA_PASSWORD: RecipeField = {
    name: 'password',
    label: 'Password',
    tooltip: 'The Vertica password for the user.',
    type: FieldType.SECRET,
    fieldPath: 'source.config.password',
    placeholder: 'Vertica_Password',
    required: true,
    rules: null,
};

const includeProjectionPath = 'source.config.include_projections';
export const INCLUDE_PROJECTIONS: RecipeField = {
    name: 'include_projections',
    label: 'Include Projections',
    tooltip: 'Extract Projections from source.',
    type: FieldType.BOOLEAN,
    fieldPath: includeProjectionPath,
    // This is in accordance with what the ingestion sources do.
    getValueFromRecipeOverride: (recipe: any) => {
        const includeProjection = get(recipe, includeProjectionPath);
        if (includeProjection !== undefined && includeProjection !== null) {
            return includeProjection;
        }
        return true;
    },
    rules: null,
};

const includemodelsPath = 'source.config.include_models';
export const INCLUDE_MLMODELS: RecipeField = {
    name: 'include_models',
    label: 'Include ML Models',
    tooltip: 'Extract ML models from source.',
    type: FieldType.BOOLEAN,
    fieldPath: includemodelsPath,
    // This is in accordance with what the ingestion sources do.
    getValueFromRecipeOverride: (recipe: any) => {
        const includeModel = get(recipe, includemodelsPath);
        if (includeModel !== undefined && includeModel !== null) {
            return includeModel;
        }
        return true;
    },
    rules: null,
};

const includeviewlineagePath = 'source.config.include_view_lineage';
export const INCLUDE_VIEW_LINEAGE: RecipeField = {
    name: 'include_view_lineage',
    label: 'Include View Lineage',
    tooltip: 'Extract View Lineage from source.',
    type: FieldType.BOOLEAN,
    fieldPath: includeviewlineagePath,
    // This is in accordance with what the ingestion sources do.
    getValueFromRecipeOverride: (recipe: any) => {
        const includeviewlineage = get(recipe, includeviewlineagePath);
        if (includeviewlineage !== undefined && includeviewlineage !== null) {
            return includeviewlineage;
        }
        return true;
    },
    rules: null,
};

const includeprojectionlineagePath = 'source.config.include_projection_lineage';
export const INCLUDE_PROJECTIONS_LINEAGE: RecipeField = {
    name: 'include_projection_lineage',
    label: 'Include Projection Lineage',
    tooltip: 'Extract Projection Lineage from source.',
    type: FieldType.BOOLEAN,
    fieldPath: includeprojectionlineagePath,
    // This is in accordance with what the ingestion sources do.
    getValueFromRecipeOverride: (recipe: any) => {
        const includeprojectionlineage = get(recipe, includeprojectionlineagePath);
        if (includeprojectionlineage !== undefined && includeprojectionlineage !== null) {
            return includeprojectionlineage;
        }
        return true;
    },
    rules: null,
};
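
The four boolean fields in this file repeat the same `getValueFromRecipeOverride` body: read the value at the field path and fall back to `true` when it is `undefined` or `null`. One way that pattern could be factored is sketched below. This is not part of the PR: `booleanOverride` is a hypothetical helper, and `getPath` is a small dependency-free stand-in for the lodash `get` call that the file actually uses.

```typescript
// Dependency-free dotted-path lookup, standing in for lodash `get` (illustrative only).
function getPath(obj: unknown, path: string): unknown {
    let current: unknown = obj;
    for (const key of path.split('.')) {
        if (current === null || current === undefined || typeof current !== 'object') {
            return undefined;
        }
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

// Hypothetical factory: builds an override that returns the recipe's value at
// `fieldPath`, or `defaultValue` when that value is undefined or null.
function booleanOverride(fieldPath: string, defaultValue: boolean) {
    return (recipe: unknown): unknown => {
        const value = getPath(recipe, fieldPath);
        return value !== undefined && value !== null ? value : defaultValue;
    };
}

const includeProjections = booleanOverride('source.config.include_projections', true);

// Key absent: falls back to the default.
const unset = includeProjections({ source: { config: {} } });
// Explicit false is preserved rather than overwritten by the default.
const explicitFalse = includeProjections({ source: { config: { include_projections: false } } });
// Explicit null is treated like "unset".
const explicitNull = includeProjections({ source: { config: { include_projections: null } } });
```

The explicit `undefined`/`null` check matters because a plain truthiness test would turn a user's `false` back into `true`.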
@@ -200,7 +200,7 @@
"name": "vertica",
"displayName": "Vertica",
"docsUrl": "https://datahubproject.io/docs/generated/ingestion/sources/vertica/",
-"recipe": "source:\n type: vertica\n config:\n # Coordinates\n host_port: localhost:5433\n # The name of the vertica database\n database: Vmart\n # Credentials\n username: dbadmin\n password:null\n include_tables: true\n include_views: true\n include_projections: true\n include_oauth: true\n include_models: true\n include_view_lineage: true\n include_projection_lineage: true\n profiling:\n enabled: true\n stateful_ingestion:\n enabled: true "
+"recipe": "source:\n type: vertica\n config:\n # Coordinates\n host_port: localhost:5433\n # The name of the vertica database\n database: Database_Name\n # Credentials\n username: Vertica_User\n password: Vertica_Password\n\n include_tables: true\n include_views: true\n include_projections: true\n include_models: true\n include_view_lineage: true\n include_projection_lineage: true\n profiling:\n enabled: false\n stateful_ingestion:\n enabled: true "
},
{
"urn": "urn:li:dataPlatform:custom",
1 change: 0 additions & 1 deletion metadata-ingestion/docs/sources/vertica/README.md
@@ -8,7 +8,6 @@ The DataHub Vertica Plugin extracts the following:
* Metadata for databases, schemas, views, tables, and projections
* Table level lineage
* Metadata for ML Models
-* Metadata for Vertica OAuth


### Concept Mapping
2 changes: 1 addition & 1 deletion metadata-ingestion/docs/sources/vertica/vertica_pre.md
@@ -2,5 +2,5 @@

In order to ingest metadata from Vertica, you will need:

-- Vertica Server Version 10.1.1-0 and avobe. It may also work for older versions.
+- Vertica Server Version 10.1.1-0 and above. It may also work with older versions, but has not been tested with them.
- Vertica Credentials (Username/Password)
1 change: 0 additions & 1 deletion metadata-ingestion/docs/sources/vertica/vertica_recipe.yml
@@ -12,7 +12,6 @@ source:
include_tables: true
include_views: true
include_projections: true
-include_oauth: true
include_models: true
include_view_lineage: true
include_projection_lineage: true
9 changes: 6 additions & 3 deletions metadata-ingestion/setup.py
@@ -386,7 +386,9 @@ def get_long_description():
"nifi": {"requests", "packaging", "requests-gssapi"},
"powerbi": microsoft_common | {"lark[regex]==1.1.4", "sqlparse"},
"powerbi-report-server": powerbi_report_server,
-"vertica": sql_common | {"vertica-sqlalchemy-dialect[vertica-python]==0.0.1"},
+
+"vertica": sql_common | {"vertica-sqlalchemy-dialect[vertica-python]==0.0.8"},
+
"unity-catalog": databricks | sqllineage_lib,
}

@@ -498,7 +500,8 @@ def get_long_description():
"powerbi-report-server",
"salesforce",
"unity-catalog",
-"nifi"
+"nifi",
+"vertica"
# airflow is added below
]
if plugin
@@ -532,7 +535,7 @@ def get_long_description():
"mysql",
"mariadb",
"redash",
-# "vertica",
+"vertica",
]
for dependency in plugins[plugin]
),