feat(python): gpu based ivf partition training #1361

eddyxu · 2023-10-05T16:51:22Z

Use pytorch to train IVF partitions on GPU

ds.create_index(..., accelerator="cuda")
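For context, training IVF partitions amounts to running k-means over a sample of the vectors and keeping the centroids. The following is a minimal sketch of that idea in PyTorch, not the code from this PR: the function name `train_kmeans`, its parameters, and the L2-only distance are assumptions made for illustration, and it falls back to CPU when no CUDA device is present.

```python
import torch


def train_kmeans(data: torch.Tensor, k: int, iters: int = 10, seed: int = 0) -> torch.Tensor:
    """Hypothetical helper: Lloyd's k-means with L2 distance.

    Returns a (k, dim) tensor of centroids on the same device as `data`.
    """
    gen = torch.Generator().manual_seed(seed)
    # Pick k distinct rows as the initial centroids (indices drawn on CPU).
    init = torch.randperm(data.shape[0], generator=gen)[:k].to(data.device)
    centroids = data[init].clone()
    for _ in range(iters):
        # Assign every vector to its nearest centroid.
        assignments = torch.cdist(data, centroids).argmin(dim=1)
        for c in range(k):
            members = data[assignments == c]
            if members.shape[0] > 0:  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(dim=0)
    return centroids


# Use CUDA when available, otherwise run the same code on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
vectors = torch.randn(1024, 16, generator=torch.Generator().manual_seed(1)).to(device)
centroids = train_kmeans(vectors, k=8)
```

The distance and assignment steps are batched tensor operations, which is where a GPU helps; the per-cluster update loop is kept simple here for readability.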

chebbyChefNEQ · 2023-10-05T17:49:36Z

python/python/lance/vector.py

+    else:
+        samples = dataset.sample(k * sample_rate)[column]
+
+    if accelerator in ["gpu", "cuda"]:


nit: "gpu" could be MPS or AMD as well; I don't think we should map "gpu" to "cuda".

Yeah, I was thinking of having "gpu" call preferred_device() later to auto-detect the GPU on the machine

lance/python/python/lance/torch/__init__.py

Line 23 in fe4b179

def preferred_device(device: Optional[str] = None):

But it's fair that we don't need to do it now, as MPS performance is not good at the moment.
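The auto-detection idea discussed here could look roughly like the sketch below. It is not the body of `preferred_device()`, which is not shown in this thread; `resolve_accelerator` is a hypothetical name, and the preference order (CUDA, then MPS, then CPU) is an assumption.

```python
def resolve_accelerator(accelerator: str) -> str:
    """Hypothetical helper: map a user-facing accelerator name to a torch device string."""
    if accelerator != "gpu":
        # Explicit choices such as "cuda" or "mps" pass through unchanged.
        return accelerator
    try:
        import torch
    except ImportError:
        return "cpu"  # nothing to detect without PyTorch
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch releases, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = resolve_accelerator("gpu")
```

Keeping "gpu" as a generic alias and "cuda" as an explicit choice avoids hard-wiring one vendor into the generic name.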

chebbyChefNEQ · 2023-10-05T17:50:16Z

python/python/lance/vector.py

+    column: str,
+    k: int,
+    metric_type: str,
+    accelerator: str,


nit: allow a device id here, like `cuda:0`.
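One way to accept a device id is to split the accelerator string before validating it. This is a sketch only: `split_device` and the set of accepted device kinds are assumptions for illustration, not part of this PR.

```python
from typing import Optional, Tuple

# Assumed set of device kinds; the PR itself only discusses cuda and mps.
KNOWN_KINDS = ("cuda", "mps", "cpu")


def split_device(accelerator: str) -> Tuple[str, Optional[int]]:
    """Hypothetical helper: split "cuda:0" into ("cuda", 0) and "cuda" into ("cuda", None)."""
    kind, sep, index = accelerator.partition(":")
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unsupported accelerator: {accelerator!r}")
    if not sep:
        return kind, None  # no device id given
    if not index.isdecimal():
        raise ValueError(f"invalid device index in {accelerator!r}")
    return kind, int(index)
```

A check such as `accelerator in ["gpu", "cuda"]` would reject `cuda:0`, so comparing only the part before the colon keeps both forms working.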

chebbyChefNEQ

just two device handling nits

westonpace

Minor comment tweaks but this looks good

python/python/lance/dataset.py

westonpace · 2023-10-05T17:51:47Z

python/python/lance/vector.py

+    *,
+    sample_rate: int = 256,
+) -> np.ndarray:
+    """Use accelerator (GPU or MPS) to train kmeans."""


Is MPS actually supported currently?

Yes, we can run on MPS today; it is just not as fast as we would like.

chebbyChefNEQ · 2023-10-05T17:53:40Z

python/python/lance/vector.py

+    sample_rate: int = 256,
+) -> np.ndarray:
+    """Use accelerator (GPU or MPS) to train kmeans."""
+


nit: should we check that torch is installed before doing all the sampling?
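A cheap way to do this check is `importlib.util.find_spec`, which reports whether a module can be found without importing it. The sketch below assumes hypothetical names (`module_available`, `sample_for_training`) and takes the sampler as a plain callable so the ordering is easy to see.

```python
import importlib.util
from typing import Callable, Sequence


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def sample_for_training(
    sampler: Callable[[], Sequence[float]], required: str = "torch"
) -> Sequence[float]:
    """Hypothetical helper: fail fast on a missing dependency, then sample."""
    if not module_available(required):
        raise ImportError(f"{required} is required for accelerator training")
    # Only pay for the (potentially expensive) sampling once the check has passed.
    return sampler()
```

With this ordering, a missing dependency is reported before any data is read from the dataset.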

Co-authored-by: Weston Pace <[email protected]>

eddyxu added 2 commits October 5, 2023 09:37

train with gpu

ae186d6

gpu training

63e68af

eddyxu marked this pull request as draft October 5, 2023 16:51

eddyxu added 2 commits October 5, 2023 09:52

simplfiy a bit

9efb842

add tests

44f3986

eddyxu requested review from wjones127, changhiskhan and westonpace and removed request for wjones127 and changhiskhan October 5, 2023 17:13

eddyxu marked this pull request as ready for review October 5, 2023 17:13

fix fmt

e4bdff4

eddyxu self-assigned this Oct 5, 2023

eddyxu added the vector Vector Search label Oct 5, 2023

eddyxu requested review from changhiskhan, QianZhu and chebbyChefNEQ October 5, 2023 17:20

no cuda test in CI

909bd5d

chebbyChefNEQ reviewed Oct 5, 2023

chebbyChefNEQ approved these changes Oct 5, 2023

westonpace approved these changes Oct 5, 2023

chebbyChefNEQ reviewed Oct 5, 2023

eddyxu and others added 5 commits October 5, 2023 11:00

accept cuda

33c2601

Update python/python/lance/dataset.py

1f0104c

Co-authored-by: Weston Pace <[email protected]>

Update python/python/lance/dataset.py

03f19f5

Co-authored-by: Weston Pace <[email protected]>

Update python/python/lance/dataset.py

1fe97d6

Co-authored-by: Weston Pace <[email protected]>

check cuda and mps

377e08e

wjones127 approved these changes Oct 5, 2023

eddyxu added 2 commits October 5, 2023 11:21

change marker

79ce9cb

fix marker

2e5a358

eddyxu merged commit 0bed6a1 into main Oct 5, 2023
10 checks passed

eddyxu deleted the lei/create_idx_gpu branch October 5, 2023 18:54

westonpace mentioned this pull request Oct 9, 2023

Field_by_name is deprecated in pyarrow 13 lancedb/lancedb#548

Closed
