[FEATURE] Consolidate 6+ related use/disable: bool args in from_quantized into single backend: Backend #59

Closed
Qubitium opened this issue Jun 25, 2024 · 0 comments · Fixed by #68
Assignees
Labels
enhancement New feature or request

Qubitium commented Jun 25, 2024

The following args will be merged into a single backend: Backend = Backend.AUTO argument:

  • use_triton: bool,
  • disable_exllama: bool = False,
  • disable_exllamav2: bool = False,
  • use_marlin: bool = False,
  • use_bitblas: bool = True,

Reason: These passive binary toggles are very confusing for users to combine correctly (a matrix of toggle conditions), and even project developers have run into multiple bugs because of them. We can't keep adding another binary toggle every time we add a backend/kernel/runtime. The API is becoming unmaintainable and unusable for both end-users and project devs.

Prelim design:

from enum import Enum

class Backend(Enum):
    AUTO = 0  # choose the fastest one based on quant model compatibility
    CUDA_OLD = 1
    CUDA = 2
    TRITON_V2 = 3
    EXLLAMA = 4
    EXLLAMA_V2 = 5
    MARLIN = 6
    BITBLAS = 7
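
To illustrate how the old toggles could collapse into the single argument, here is a minimal sketch of a compatibility shim. It is not the project's implementation: the helper name backend_from_legacy_flags, the string enum values, the all-False defaults, and the fallback rules (nothing set means AUTO, both exllama kernels disabled means CUDA) are all assumptions made for illustration.

```python
from enum import Enum


class Backend(Enum):
    # Member names follow the prelim design; the string values are assumed.
    AUTO = "auto"
    CUDA_OLD = "cuda_old"
    CUDA = "cuda"
    TRITON_V2 = "triton_v2"
    EXLLAMA = "exllama"
    EXLLAMA_V2 = "exllama_v2"
    MARLIN = "marlin"
    BITBLAS = "bitblas"


def backend_from_legacy_flags(
    use_triton=False,
    disable_exllama=False,
    disable_exllamav2=False,
    use_marlin=False,
    use_bitblas=False,
):
    """Hypothetical shim: translate the old bool toggles into one Backend."""
    # Explicit opt-in flags each name exactly one backend.
    requested = [
        backend
        for flag, backend in (
            (use_triton, Backend.TRITON_V2),
            (use_marlin, Backend.MARLIN),
            (use_bitblas, Backend.BITBLAS),
        )
        if flag
    ]
    if len(requested) > 1:
        names = ", ".join(b.name for b in requested)
        raise ValueError(f"conflicting backend flags: {names}")
    if requested:
        return requested[0]

    # No opt-in: infer from the opt-out flags (assumed rules, see lead-in).
    if disable_exllama and disable_exllamav2:
        return Backend.CUDA
    if disable_exllamav2:
        return Backend.EXLLAMA
    if disable_exllama:
        return Backend.EXLLAMA_V2
    return Backend.AUTO


if __name__ == "__main__":
    print(backend_from_legacy_flags(use_marlin=True))
    print(backend_from_legacy_flags(disable_exllamav2=True))
```

With an enum, a conflicting request such as use_triton=True together with use_marlin=True cannot be expressed at all; the shim has to reject it explicitly, which is the kind of toggle-matrix handling the single backend argument is meant to remove.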

This is the final to-do item for the 0.9.1 release.

@Qubitium Qubitium added the enhancement New feature or request label Jun 25, 2024
@Qubitium Qubitium changed the title [FEATURE] Consolidate 6+ related use\disable args in from_quantized into 1 backend [FEATURE] Consolidate 6+ related use/disable args in from_quantized into 1 backend Jun 25, 2024
@Qubitium Qubitium changed the title [FEATURE] Consolidate 6+ related use/disable args in from_quantized into 1 backend [FEATURE] Consolidate 6+ related use/disable: bool args in from_quantized into single backend: Backend Jun 25, 2024
DeJoker pushed a commit to DeJoker/GPTQModel that referenced this issue Jul 19, 2024