Create consumption-ready data. Aggregate raw application data in an Iceberg data lake, transform with dbt, and enforce compliance requirements. Use with mytiki.com to monetize your new data assets.

Project Lagoon

Project Lagoon is a data productization system, designed to simplify and isolate the process of preparing and delivering large datasets to external applications.

Description

Project Lagoon aims to create a reusable, automated system for transforming raw application data into high-quality assets.

This project leverages a micro lake architecture, with automation, wiring, and abstraction to create a cost-effective, modular solution. Designed to be deployable to any AWS account, a Lagoon includes a data lake (Iceberg), automated ingestion (Spark), data pipelines (dbt), and orchestration (Dagster).

Just drop raw data files into your S3 bucket and create your dbt models to produce high-quality, production-ready datasets in minutes.
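As a rough sketch, a dbt model over an ingested table might look like the following. The `raw.events` source and the column names are hypothetical placeholders, not part of this project; substitute the tables your Lagoon actually ingests.

```sql
-- models/marts/daily_event_counts.sql
-- Hypothetical example: aggregate an ingested "events" table by day.
select
    event_date,
    count(*) as event_count
from {{ source('raw', 'events') }}
group by event_date
```

Once the model is in place, the orchestration layer (Dagster) runs it as part of the pipeline.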

Modules

  • deploy: Automated deployment of pipeline changes to orchestration system.
  • ingest: Spark automation to upsert raw data (JSON, Avro, Parquet, ORC, CSV, XML) into Iceberg tables.
  • initialize: CLI tool for deploying and updating your Lagoon.
  • orchestr: dbt data pipeline orchestration and execution.
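To clarify what the ingest module's "upsert" means: rows whose key matches an existing record replace it, while new keys are appended. Below is a toy, pure-Python illustration of those semantics only; it is not the project's Spark/Iceberg code, and the `upsert` helper and field names are hypothetical.

```python
# Toy illustration of upsert semantics (hypothetical helper,
# not Lagoon's Spark code). The table is keyed by a unique id.
def upsert(table: dict, rows: list, key: str) -> dict:
    """Merge incoming rows into table, matching on `key`."""
    merged = dict(table)
    for row in rows:
        # Update the record if the key exists, insert it otherwise.
        merged[row[key]] = row
    return merged

existing = {1: {"id": 1, "status": "old"}}
incoming = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
result = upsert(existing, incoming, "id")
```

In the real module, Spark performs the equivalent merge into Iceberg tables for each supported raw format.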

Get Started

To run Project Lagoon, download the latest release binary for your platform and run it:

./<binary> --profile <AWS PROFILE>
