Releases: Eventual-Inc/Daft
v0.2.14
Changes
✨ New Features
- [FEAT] Add ceil function @NormallyGaussian (#1867)
- [FEAT] show full schema on request @samster25 (#1868)
- [FEAT] Enable Requester Pay for S3 reads @colin-ho (#1856)
- [FEAT] _add_monotonically_increasing_id method for Dataframe @colin-ho (#1827)
🚀 Performance Improvements
👾 Bug Fixes
- [BUG] Protect Global Context With Mutex @samster25 (#1857)
- [BUG] Schema hints not working properly for json reads @colin-ho (#1845)
📖 Documentation
- [DOCS] Change show_optimized kwarg to show_all @jaychia (#1874)
- [DOCS] Drop use of "Complex Data" in favor of multimodal @samster25 (#1875)
- [DOCS] Add docs for AWS S3 IO @colin-ho (#1855)
🧰 Maintenance
v0.2.13
Changes
✨ New Features
- [FEAT] Add group_by.map_groups @colin-ho (#1825)
- [FEAT] [Join Optimizations] Add sort-merge join. @clarkzinzow (#1755)
- [FEAT] is_in expression @colin-ho (#1811)
- [FEAT] Dataframe __contains__ magic method @colin-ho (#1817)
🚀 Performance Improvements
- [PERF] Split parquet scan tasks into individual row groups @kevinzwang (#1799)
👾 Bug Fixes
- [BUG] Scan Operator Fix + Physical Plan Scan Task Summary @samster25 (#1850)
- [BUG] [Parquet] Fix double-await on `JoinHandle`s concurrency bug in Parquet reader. @clarkzinzow (#1841)
- [BUG] Incorrect expression naming for struct get @kevinzwang (#1832)
- [BUG] Fix empty struct fields @colin-ho (#1833)
- [BUG] Fix for Iceberg schema projection @jaychia (#1815)
📖 Documentation
- [DOCS] Add docs for Azure IO @jaychia (#1851)
- [Query Planner] Add physical plan visualization option to `df.explain()`; implement `TreeVisitor` for `LogicalPlan` and `PhysicalPlan`. @clarkzinzow (#1836)
- [DOCS] Add type conversions between iceberg and daft @jaychia (#1835)
- [DOCS] Add dedicated Iceberg page @jaychia (#1830)
- [DOCS] Refactor expressions docs layout @jaychia (#1816)
- [CHORE] Add is_in to docs @colin-ho (#1819)
🧰 Maintenance
v0.2.12
Changes
👾 Bug Fixes
- [BUG] bugfix for empty partitions when writing out empty partitions @samster25 (#1814)
v0.2.11
Changes
✨ New Features
- [FEAT] Support Hive-Style Partitioned Writes for Tabular Writes @samster25 (#1794)
👾 Bug Fixes
- [BUG] Fix scheduler deadlock on concurrent broadcast joins. @clarkzinzow (#1812)
- [BUG] Fix type annotation on UDF @jaychia (#1807)
- [BUG] Materialize Dataframes created from file writes @colin-ho (#1785)
- [BUG] Materialize Dataframes created from in-memory data @colin-ho (#1780)
📖 Documentation
- [DOCS] Add warning during repartition to use into_partitions instead @jaychia (#1808)
- [BUG] Fix type annotation on UDF @jaychia (#1807)
- [DOCS] Update README.rst to remove beta disclaimer @jaychia (#1802)
- [CHORE] Update docs to reflect materialized Dataframes from writes and in-memory reads @colin-ho (#1795)
- [DOCS] Upgrade version of docs sphinx-book-theme dependency @jaychia (#1789)
- [DOCS] Fix notebooks to use new public parquet file @jaychia (#1788)
- [DOCS] Fix docs build for sphinxcontrib-applehelp versioning @jaychia (#1787)
- [DOCS] Update README.rst for broken links @jaychia (#1786)
- [CHORE] Update tutorials to use released version of Daft @jaychia (#1751)
🧰 Maintenance
- [CHORE] Update docs to reflect materialized Dataframes from writes and in-memory reads @colin-ho (#1795)
- [CHORE] Update tutorials to use released version of Daft @jaychia (#1751)
⬆️ Dependencies
- Bump actions/cache from 3 to 4 @dependabot (#1805)
v0.2.10
Changes
✨ New Features
- [FEAT] Add getter for Struct and List expressions @kevinzwang (#1775)
- [FEAT] Iceberg Murmur3 Hash function @samster25 (#1778)
- [FEAT] Not_Null Expression @colin-ho (#1777)
- [FEAT] Add sample function for Dataframe @colin-ho (#1770)
🚀 Performance Improvements
- [PERF] Iceberg Truncate Transform @samster25 (#1783)
- [PERF] Iceberg Hash Bucket Transform @samster25 (#1779)
👾 Bug Fixes
- [BUG] Invalidate PartitionSpec when we run Explode on it @samster25 (#1772)
📖 Documentation
- [CHORE] Add sample to docs @colin-ho (#1781)
- [CHORE] Add not_null to docs @colin-ho (#1782)
- [FEAT] Add getter for Struct and List expressions @kevinzwang (#1775)
- [DOCS] Fix broken links on readme @jaychia (#1774)
- [DOCS] Add documentation for read_iceberg @jaychia (#1769)
- [DOCS] Documentation reorganization @jaychia (#1762)
🧰 Maintenance
v0.2.9
v0.2.8
Changes
✨ New Features
- [PERF] Iceberg Partition Pruning @samster25 (#1688)
- [FEAT] annotate ray tasks with name of instructions @samster25 (#1729)
🚀 Performance Improvements
- [PERF] Iceberg Partition Pruning @samster25 (#1688)
- [PERF] Speed up CSV Reader with SIMD and reduced allocations @samster25 (#1749)
- [PERF] Greatly speed up Variable Length Concat @samster25 (#1748)
- [PERF] Predicate Pushdown into Scan Operator @samster25 (#1730)
- [PERF] Json Predicate Pushdown while reading @samster25 (#1727)
- [PERF] Predicate Pushdown for CSV Reader @samster25 (#1724)
👾 Bug Fixes
- [BUG] Concat Fix when Variable Length Array is sliced @samster25 (#1750)
- [BUG] bugfix when cluster has no workers and key error happens when fetching num cores @samster25 (#1745)
- [BUG] Fix comparing date and timestamps @samster25 (#1735)
- [BUG] Apply the default IOConfig in daft.from_glob_path @jaychia (#1731)
- [BUG] [Hotfix] Fix limit pushdown test. @clarkzinzow (#1728)
📖 Documentation
- Revert "[DOCS] Add proper robots.txt and sitemap.xml to index only latest and stable" @jaychia (#1753)
- [DOCS] Add proper robots.txt and sitemap.xml to index only latest and stable @jaychia (#1752)
- [DOCS] Add documentation on memory @jaychia (#1736)
- [DOCS] Add anonymous io_config for notebook @jaychia (#1721)
🧰 Maintenance
- [CHORE] kernel override for notebook checker @samster25 (#1746)
- [CHORE] Clean up Repr for GlobScanOperator and Explain @samster25 (#1734)
- [CHORE] Generate S3 manifests @samster25 (#1732)
- [CHORE] update dev version to 0.2.0 dev @samster25 (#1723)
v0.2.7
Changes
✨ New Features
- [FEAT] Add ability to set global IOConfig @jaychia (#1710)
- [FEAT] [Join Optimizations] Add broadcast join. @clarkzinzow (#1706)
- [FEAT] Propagate configs to Ray remote functions @jaychia (#1707)
- [FEAT] [JSON Reader] Add native streaming + parallel JSON reader. @clarkzinzow (#1679)
🚀 Performance Improvements
- [PERF] Enable Predicates in Parquet Reader @samster25 (#1702)
👾 Bug Fixes
- [BUG] [Hotfix] [Join Optimization] Fix pre-partitioned check for larger side of join. @clarkzinzow (#1718)
- [BUG] Fix set_config logic so it can be called after call to set runner @jaychia (#1709)
- [BUG] Propagate URL download expressions max_connections to S3Config @jaychia (#1708)
📖 Documentation
v0.2.6
Changes
✨ New Features
- [FEAT] Add smart planning of ScanTasks starting with merging by filesizes @jaychia (#1692)
- [FEAT] Enable Comparison between timestamp / dates @samster25 (#1689)
- [FEAT] Enable MicroPartitions by default @jaychia (#1684)
- [FEAT] Temporal Literals for Date and Timestamp @samster25 (#1683)
- [FEAT] Partitioning exprs for Iceberg @samster25 (#1680)
👾 Bug Fixes
- [BUG] Use schema_hints as hints instead of definitive schema @colin-ho (#1636)
- [BUG] Allow for use of Ray jobs for benchmarking @jaychia (#1690)
- [BUG] fix off by 1 for retries for cred provider @samster25 (#1681)
🧰 Maintenance
- [CHORE] bump gcs and s3fs @samster25 (#1699)
- [CHORE] Add warmup step for remote tpch benchmarking @jaychia (#1691)
- [CHORE] drop s3 compat mode for gcs for anonymous mode @samster25 (#1682)
- [CHORE] Remove usage of credentials in workflows @jaychia (#1686)
- [CHORE] Iceberg Image Caching @samster25 (#1687)
- [CHORE] Bump Iceberg Version and V1 of caching @samster25 (#1685)
⬆️ Dependencies
- Bump globset from 0.4.13 to 0.4.14 @dependabot (#1694)
- Bump libc from 0.2.149 to 0.2.150 @dependabot (#1693)
- Bump google-github-actions/auth from 1 to 2 @dependabot (#1698)
v0.2.5
Changes
👾 Bug Fixes
- [BUG] Check queue state while waiting to place inside @samster25 (#1678)
- [BUG] Parametrize dataframe unit-tests with Parquet data @jaychia (#1610)
🧰 Maintenance
- [CHORE] Favor traversal over visitors @samster25 (#1677)
- [CHORE] Bring in TreeNode and Refactor Expression Traversal to use TreeNode @samster25 (#1676)
⬆️ Dependencies
- Bump indexmap from 2.0.2 to 2.1.0 @dependabot (#1669)