Support loading quantized models from quanto directly #9058

sayakpaul · 2024-08-02T10:26:51Z

With diffusion transformer models growing in both number and size, I think it makes sense to support loading quantized models from quanto directly.

Currently, the workflow to load a quantized model with quanto is simple:

from diffusers import DiffusionPipeline
from optimum.quanto import QuantizedPixArtTransformer2DModel
import torch

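# Load the pre-quantized transformer on its own, then move it to the GPU.
transformer = QuantizedPixArtTransformer2DModel.from_pretrained("pixart-sigma-fp8").to("cuda", torch.float16)
# Build the pipeline without a transformer so the unquantized weights are not loaded.
pipe = DiffusionPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    transformer=None,
    torch_dtype=torch.float16,
).to("cuda")
# Attach the quantized transformer after the fact.
pipe.transformer = transformer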

We should be able to just do:

pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", 
-    transformer=None,
+    transformer=transformer,
    torch_dtype=torch.float16,
).to("cuda")

Currently, transformer=transformer would fail because there's no mapping here:

diffusers/src/diffusers/pipelines/pipeline_loading_utils.py, line 67 in b1f43d7:

LOADABLE_CLASSES = {
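
To illustrate why a missing mapping entry can make a passed-in component fail, here is a toy sketch of a registry-style lookup. It is not diffusers' actual loading code: `TOY_LOADABLE_CLASSES`, `resolve_load_methods`, and the `Toy*` classes are hypothetical names, and the registry contents are invented for illustration only.

```python
# Toy registry: library name -> {base class name: [save method, load method]}.
# Hypothetical contents; this does not mirror the real LOADABLE_CLASSES.
TOY_LOADABLE_CLASSES = {
    "toy_diffusers": {"ToyModelMixin": ["save_pretrained", "from_pretrained"]},
}


class ToyModelMixin:
    """Stand-in for a base class the registry knows about."""


class ToyTransformer(ToyModelMixin):
    """A regular model that inherits from a registered base class."""


class ToyQuantizedTransformer:
    """Stand-in for a quantized wrapper from a library with no registry entry."""


def resolve_load_methods(library_name, component, registry=TOY_LOADABLE_CLASSES):
    """Return the [save, load] method names for a component, or raise ValueError."""
    known = registry.get(library_name)
    if known is None:
        raise ValueError(f"no loadable-class mapping for library {library_name!r}")
    # Walk the class hierarchy and look for a registered base class name.
    for cls in type(component).__mro__:
        if cls.__name__ in known:
            return known[cls.__name__]
    raise ValueError(
        f"{type(component).__name__} has no registered base class in {library_name!r}"
    )
```

In this toy model, `resolve_load_methods("toy_diffusers", ToyTransformer())` succeeds, while `resolve_load_methods("toy_quanto", ToyQuantizedTransformer())` raises `ValueError` because the library has no entry. Supporting quantized models directly would, under this simplified picture, amount to registering the new library and its base class.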

Thoughts @yiyixuxu @DN6?


github-actions · 2024-09-14T15:04:18Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul · 2024-09-26T04:24:03Z

Closing this issue because we will add a quanto quantization backend after #9213 is done.

github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024

sayakpaul closed this as completed Sep 26, 2024
