- Digital Nomad
etl
🧙 Build, run, and manage data pipelines for integrating and transforming data.
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
🌏🌍🌎 Translators 🌎🌍🌏 is a library that aims to bring free, multiple, enjoyable translations to individuals and students in Python.
Library for exploring and validating machine learning data
Python library of 60+ commonly-used validator functions
A Python library to provide functions to handle, parse and validate standard numbers.
Python Data Validation for Humans™.
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
🚤 Label data at scale. Fun and precision included.
Mirror of the Xapian repository. You're welcome to open pull requests on GitHub (they'll just get merged indirectly).
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
A Python Library for Graph Outlier Detection (Anomaly Detection)
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow DataFusion at its core.
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
Small, dependency-free, fast Python package to infer binary file types by checking their magic number signatures
Truly universal encoding detector in pure Python
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.
Technical Analysis Indicators - Pandas TA is an easy-to-use Python 3 Pandas extension with 150+ indicators
ExifLooter finds geolocation data in image URLs and directories, and also integrates with OpenStreetMap
PostgreSQL change data capture software written in Rust to enable real-time WebSockets
Rasterio reads and writes geospatial raster datasets
A powerful and user-friendly binary analysis platform!
A scalable, event-driven durable execution platform. Run stateful step functions deployed to serverless, servers, or the edge.
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques