
Is there GPU support for box-prompted SAM? #58

Open
silinsi opened this issue Mar 6, 2024 · 7 comments

silinsi commented Mar 6, 2024

It seems that the provided EfficientSAM only works on CPU. I tried using cuda() to move the model and data to the GPU, but it doesn't help much.

I also tried the seg-everything-to-cuda method from the pull requests, which doesn't help much either.

Maybe box-prompted SAM needs a function like predictor.set_image() in SAM and Mobile-SAM, to save time when running multiple prompts on the same image.
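A minimal sketch of that kind of embedding caching, assuming the EfficientSam module exposes its encoder and decoder separately as get_image_embeddings() and predict_masks() (the method names and arguments here mirror what the model's forward() appears to do, and should be treated as assumptions rather than a documented API; `prompts` is a hypothetical iterable):

```python
import torch
from efficient_sam.build_efficient_sam import build_efficient_sam_vitt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = build_efficient_sam_vitt().to(device).eval()

batched_images = sample_image_tensor[None, ...].to(device)  # image tensor as in EfficientSAM_example.py
_, _, input_h, input_w = batched_images.shape

with torch.no_grad():
    # Encode the image once ...
    image_embeddings = model.get_image_embeddings(batched_images)
    # ... then reuse the embedding for every box/point prompt on that image.
    for input_points, input_labels in prompts:  # hypothetical list of (points, labels) prompts
        predicted_logits, predicted_iou = model.predict_masks(
            image_embeddings,
            input_points.to(device),
            input_labels.to(device),
            multimask_output=True,
            input_h=input_h,
            input_w=input_w,
            output_h=input_h,
            output_w=input_w,
        )
```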


liutongkun commented Mar 13, 2024

> It seems that the provided EfficientSAM only works on CPU. I tried using cuda() to move the model and data to the GPU, but it doesn't help much.
>
> I also tried the seg-everything-to-cuda method from the pull requests, which doesn't help much either.
>
> Maybe box-prompted SAM needs a function like predictor.set_image() in SAM and Mobile-SAM, to save time when running multiple prompts on the same image.

I have found that the GPU does accelerate inference effectively, at roughly 30-40 ms per image on a 3090 Ti. The problem is that the first inference is much slower, and I'm not sure why.


yformer commented Mar 18, 2024

@silinsi, it should be very fast to run EfficientSAM on GPU. Can you share more information?
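One quick thing to check in this situation is whether the model and tensors actually ended up on the GPU. A generic PyTorch sanity check (the variable names follow EfficientSAM_example.py and are assumptions, not part of the repo's API):

```python
import torch

print(torch.cuda.is_available())        # True if a CUDA device is visible to PyTorch
print(next(model.parameters()).device)  # should be cuda:0 after model.to(device)
print(sample_image_tensor.device)       # inputs must live on the same device as the model
```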


yformer commented Mar 18, 2024

@liutongkun, for the first inference, loading the model to GPU and moving the data to GPU may take time. Can you share the latency for the first inference?

@liutongkun

> @liutongkun, for the first inference, loading the model to GPU and moving the data to GPU may take time. Can you share the latency for the first inference?

Thanks for your reply. I moved the data and model to the GPU before starting the timing. Here is my code, based on EfficientSAM_example.py:

```python
import time

import torch

# model, model_name, sample_image_tensor, input_points and input_labels
# are set up as in EfficientSAM_example.py.
for i in range(10):
    # Note: the device moves happen inside the loop here.
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)
    a = sample_image_tensor[None, ...]
    a = a.to(device)
    input_points = input_points.to(device)
    input_labels = input_labels.to(device)
    t1 = time.time()
    print('Running inference using ', model_name)
    predicted_logits, predicted_iou = model(
        a,
        input_points,
        input_labels,
    )
    t2 = time.time()
    print(f'timecost{t2-t1}')
```

and it shows:

```
Running inference using efficientsam_ti
timecost0.5455219745635986
Running inference using efficientsam_ti
timecost0.035993099212646484
Running inference using efficientsam_ti
timecost0.03607749938964844
Running inference using efficientsam_ti
timecost0.03591561317443848
Running inference using efficientsam_ti
timecost0.0360107421875
......
```


yformer commented Mar 20, 2024

@liutongkun, can you move the model/data before the loop?

liutongkun commented Mar 21, 2024

> @liutongkun, can you move the model/data before the loop?

I modified the code to:

```python
import time

import torch

# models, sample_image_tensor, input_points and input_labels
# are set up as in EfficientSAM_example.py; device moves now happen once.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = models['efficientsam_ti'].to(device)
a = sample_image_tensor[None, ...]
a = a.to(device)
input_points = input_points.to(device)
input_labels = input_labels.to(device)
for i in range(10):
    t1 = time.time()
    print('Running inference using ', 'efficientsam_ti')
    predicted_logits, predicted_iou = model(
        a,
        input_points,
        input_labels,
    )
    t2 = time.time()
    print(f'timecost{t2-t1}')
```

and it shows:

```
Running inference using efficientsam_ti
timecost0.56357741355896
Running inference using efficientsam_ti
timecost0.035813331604003906
Running inference using efficientsam_ti
timecost0.035944223403930664
Running inference using efficientsam_ti
timecost0.03553032875061035
Running inference using efficientsam_ti
timecost0.03602123260498047
......
```
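Even with the model and data moved to the GPU before the loop, a first-iteration cost like this usually comes from one-time CUDA setup (context creation, kernel selection and compilation) rather than the model itself. Note also that CUDA kernels launch asynchronously, so time.time() without a torch.cuda.synchronize() may measure kernel launch rather than completion. A minimal sketch of a warmed-up, synchronized timing loop, reusing the variable names from the snippet above:

```python
import time

import torch

# Warm-up: run a few untimed inferences so one-time CUDA setup
# (context creation, kernel compilation/autotuning) is excluded.
with torch.no_grad():
    for _ in range(3):
        model(a, input_points, input_labels)
torch.cuda.synchronize()  # wait for the warm-up kernels to finish

with torch.no_grad():
    for i in range(10):
        t1 = time.perf_counter()
        predicted_logits, predicted_iou = model(
            a,
            input_points,
            input_labels,
        )
        torch.cuda.synchronize()  # make sure the GPU work is done before stopping the clock
        t2 = time.perf_counter()
        print(f'timecost {t2 - t1:.4f}s')
```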


EudicL commented Apr 24, 2024

> @liutongkun, for the first inference, loading the model to GPU and moving the data to GPU may take time. Can you share the latency for the first inference?
>
> Thanks for your reply. I moved the data and model to the GPU before starting the timing. Here is my code, based on EfficientSAM_example.py: […]

Thanks, this is useful.
