ddelange
💥
["translatio", "imitatio", "aemulatio"]

Stars

etl

Extract-Transform-Load, Data Wrangling, Data Mining, ...
244 repositories

Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow

Rust 352 59 Updated Jul 31, 2024

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 29,402 1,865 Updated Sep 24, 2024

Synthetic Data Generation for mixed-type, multivariate time series.

Python 101 14 Updated Sep 23, 2024

Community maintained fork of pdfminer - we fathom PDF

Python 5,828 922 Updated Aug 2, 2024

A very fast Python asyncio http client

Python 149 19 Updated Sep 23, 2024

🍰 Desktop utility to download images/videos/music/text from various websites, and more.

Python 21,768 2,013 Updated Apr 5, 2024

Python wrapper of the RuRe.

Rust 87 13 Updated Oct 30, 2019

An Elasticsearch client exposing DataFrame API

Python 285 45 Updated Apr 1, 2023

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

Python 637 98 Updated Sep 5, 2024

A blazingly fast JSON serializing & deserializing library

Assembly 6,758 333 Updated Sep 23, 2024

Super minimal python S3 cache

Python 4 Updated Sep 6, 2019

A web interface to extract tabular data from PDFs

HTML 1,562 227 Updated May 14, 2024

A validation library for Pandas data frames using user-friendly schemas

Python 189 35 Updated Mar 24, 2023

A toolkit to run Ray applications on Kubernetes

Go 1,146 370 Updated Sep 20, 2024

Python Serverless Microframework for AWS

Python 10,610 1,009 Updated Jul 29, 2024

A cloud-native Pipeline resource.

Go 8,442 1,773 Updated Sep 24, 2024

AintQ Is Not Task Queue - a Python asyncio task queue on PostgreSQL.

Python 51 3 Updated Dec 26, 2022

An open source python library for automated feature engineering

Python 7,208 873 Updated Sep 20, 2024

A novel data lake based on super-structured data

Go 1,375 67 Updated Sep 25, 2024

Normalizing Flows in JAX 🌊

Python 272 19 Updated Jun 18, 2023

Extract data from websites using basic statistical magic

Python 503 45 Updated Oct 2, 2020

Dockerfile for libpostal-service based on the Who's on First implementation

Dockerfile 36 14 Updated Dec 1, 2023

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀

Python 317 105 Updated Sep 3, 2024

The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply moni…

Python 16,997 950 Updated Sep 20, 2024

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Python 692 178 Updated Sep 24, 2024

Keep code, data, containers under control with git and git-annex

Python 525 110 Updated Sep 17, 2024

An interactive PDF reader.

Python 416 52 Updated Jul 19, 2023

Represent, send, store and search multimodal data

Python 2,938 231 Updated Sep 24, 2024

Python tools for geographic data

Python 4,452 925 Updated Sep 24, 2024

Identify bias and measure fairness of your data

Python 90 9 Updated Aug 26, 2024