[CHORE] Defer Register of Super Type #2836

Closed
wants to merge 1 commit into main from sammy/defer_register_super_type

samster25
Member

No description provided.

@github-actions github-actions bot added the chore label Sep 11, 2024

codspeed-hq bot commented Sep 11, 2024

CodSpeed Performance Report

Merging #2836 will degrade performance by 56.61%

Comparing sammy/defer_register_super_type (310e3fc) with main (c2d7d08)

Summary

⚡ 1 improvement
❌ 1 regression
✅ 14 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | sammy/defer_register_super_type | Change |
|---|---|---|---|
| test_count[1 Small File] | 10.6 ms | 24.3 ms | -56.61% |
| test_show[100 Small Files] | 121.8 ms | 49.9 ms | ×2.4 |

desmondcheongzx added a commit that referenced this pull request Sep 19, 2024
Introduce lazy imports for heavy modules that are not needed as
top-level imports. For example, `ray` does not need to be a top-level
import (it should only be imported when using the ray runner or when
specific ray data extension types are needed). Another example is
`UnityCatalogTable`, which is a relatively heavy import despite only
being needed when using delta lake.

Modules to import lazily were determined by the proportion of import
time as shown by `importtime-output-wrapper -c 'import daft' --format
waterfall --depth 25`.
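
A similar measurement can be sketched with the interpreter's built-in `-X importtime` flag instead of the wrapper tool named above. The helper names `parse_importtime` and `slowest_imports` are made up for this example, and `json` is used only as a module that is always available.

```python
import subprocess
import sys
from typing import List, Tuple


def parse_importtime(stderr_text: str) -> List[Tuple[str, int, int]]:
    """Parse `-X importtime` output into (module, self_us, cumulative_us) rows."""
    rows = []
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_us, cumulative_us = int(fields[0]), int(fields[1])
        except ValueError:
            continue  # header row has column names instead of numbers
        rows.append((fields[2].strip(), self_us, cumulative_us))
    return rows


def slowest_imports(module: str, top: int = 5) -> List[Tuple[str, int, int]]:
    """Import `module` in a fresh interpreter and return its slowest entries."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True, text=True, check=True,
    )
    rows = parse_importtime(result.stderr)
    # Sort by cumulative time, which includes nested imports.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


if __name__ == "__main__":
    for name, self_us, cumulative_us in slowest_imports("json"):
        print(f"{cumulative_us:>8} us  {name}")
```

Running the import in a fresh subprocess matters here: modules already loaded in the current process would otherwise report near-zero import time.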

The newly lazily imported modules are:
- `daft.unity_catalog`
- `fsspec`
- `numpy`
- `pandas`
- `PIL.Image`
- `pyarrow`
- `pyarrow.csv` 
- `pyarrow.dataset`
- `pyarrow.fs`
- `pyarrow.json`
- `pyarrow.parquet` 
- `ray`
- `ray.data.extensions`
- `xml.etree.ElementTree` 

Uses #2836 in order to defer
the import of `pyarrow`.

Additionally, we move all type-checking-only module imports into type
checking blocks.
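
A minimal sketch of the type-checking-block pattern, assuming the standard `typing.TYPE_CHECKING` guard; `decimal.Decimal` is a stand-in for a heavy type such as a `pyarrow` class, and `to_text` is a hypothetical function.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs at runtime,
    # so the module is not imported when the file is loaded.
    from decimal import Decimal


def to_text(value: "Decimal") -> str:
    # The annotation is a string, so it is not evaluated at definition time
    # and does not need `Decimal` to exist in this module's namespace.
    return str(value)
```

The quoted annotation is what keeps this working at runtime; an unquoted `Decimal` would raise `NameError` when the function is defined unless annotations are otherwise deferred.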

With these changes, import times go from roughly 0.6-0.7s to ~0.045s
(~13-15x faster).

---------

Co-authored-by: Sammy Sidhu <[email protected]>
@samster25 samster25 closed this Oct 30, 2024