Lineage Keeper

A lightweight lineage tool based on Spark and Delta Lake

Installation

pip install lineage-keeper

Basic use

from lineage_keeper import load_listener, LineageViewer

# `spark` is an existing SparkSession (for example, the one a notebook provides)
load_listener(spark)

df1 = spark.read.table("db.table_1")
df2 = spark.read.table("db.table_2")

df_join = df1.join(df2, "key")

df_join.write.saveAsTable("db.join_tables")

LineageViewer(spark).viewer()

Functionalities

By default, Lineage Keeper uses "default._service_table_lineage_keeper" as its service table.

A different service table can be used if desired.

Listener function

Manually records lineage information in the service table.

LineageListener: Spark session

listener: source DataFrame, target table

ll = LineageListener(spark)
ll.listener(df, "target_db.target_table")

load listener

Changes df.write.saveAsTable so that lineage information is recorded in the service table automatically.

load_listener(spark)
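
The idea of changing a write method so that it also records lineage can be illustrated without Spark. The following is a minimal sketch of that wrapping technique only, not Lineage Keeper's actual implementation: `FakeWriter`, `load_fake_listener` and the in-memory `lineage_rows` list are hypothetical names invented for this example, and the (source, target) row layout is an assumption about what a service table might hold.

```python
import functools

# Hypothetical in-memory stand-in for the service table:
# each row is a (source_table, target_table) pair.
lineage_rows = []


class FakeWriter:
    """Hypothetical stand-in for a DataFrame writer; this is not a Spark class."""

    def __init__(self, sources):
        self.sources = sources  # tables the data was read from
        self.saved = []         # tables written so far

    def saveAsTable(self, name):
        self.saved.append(name)


def load_fake_listener(writer_cls, rows):
    """Replace writer_cls.saveAsTable with a wrapper that also records lineage."""
    original = writer_cls.saveAsTable

    @functools.wraps(original)
    def wrapped(self, name, *args, **kwargs):
        result = original(self, name, *args, **kwargs)
        # Record lineage only after the original write has returned.
        for source in self.sources:
            rows.append((source, name))
        return result

    writer_cls.saveAsTable = wrapped


load_fake_listener(FakeWriter, lineage_rows)

writer = FakeWriter(["db.table_1", "db.table_2"])
writer.saveAsTable("db.join_tables")
print(lineage_rows)
```

Because the wrapper replaces only one method, writes made through any other method bypass it, which is consistent with the restriction described under Limitations.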

Lineage graph viewer

Generates a static HTML page with the lineage graph.

LineageViewer(spark).viewer()

Lineage graph writer

Saves a static HTML page with the lineage graph to disk.

LineageViewer(spark).save_graph(path)
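
To give a rough idea of what "a static HTML with the lineage graph" can mean, here is a small standard-library sketch that writes a list of lineage edges to an HTML file. It is an illustration under stated assumptions, not how `save_graph` works internally: `render_lineage_html`, `save_lineage_html` and the (source, target) edge format are hypothetical, and the output is a plain list rather than a drawn graph.

```python
import html
from pathlib import Path


def render_lineage_html(edges):
    """Return a static HTML page listing each (source, target) edge."""
    items = "\n".join(
        f"<li>{html.escape(src)} &rarr; {html.escape(dst)}</li>"
        for src, dst in edges
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><body>\n"
        "<h1>Lineage</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>\n"
    )


def save_lineage_html(edges, path):
    """Write the rendered page to `path` and return it as a Path object."""
    out = Path(path)
    out.write_text(render_lineage_html(edges), encoding="utf-8")
    return out


# Hypothetical edges, matching the tables used under Basic use.
example_edges = [
    ("db.table_1", "db.join_tables"),
    ("db.table_2", "db.join_tables"),
]
page = render_lineage_html(example_edges)
```

Table names are passed through `html.escape` so that unusual characters in a name cannot break the generated page.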

Limitations

Data must be read using table syntax:
- spark.read.table("db.table")
- spark.sql("SELECT * FROM db.table")

load_listener only captures writes made with df.write.saveAsTable("db.table"). For any other write method, call LineageListener(spark).listener(df, "db.table") manually.

Demo Notebook

Sample using Lineage Keeper

otacilio-psf/lineage-keeper

Folders and files

Latest commit

History

Repository files navigation

Lineage Keeper

Table of contents

Instalation

Basic use

Functionalities

Listener function

load listener

Lineage graph viewer

Lineage graph writer

Limitations

Demo Notebook

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages