-
Or is MLX the same as llama.cpp?
-
For the most part there is no real difference between MLX Community models and the original Hugging Face models when the precision is fp16, bf16, or fp32. In some cases the model may have a slightly different format, but in many cases they are identical. The main distinction is that the MLX Community is where we keep the quantized models (4-bit and 8-bit); the quantization format is quite specific to MLX. But there is no rule that quantized models must live in the MLX Community. That's just a convenient place to put them if the original model creator didn't publish MLX quantized models.
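For context, here is a minimal sketch of how such a quantized MLX model can be produced with the `mlx_lm` Python API. The repo id and output directory below are illustrative placeholders, not a specific recommendation:

```python
# Sketch: convert a Hugging Face model to MLX format and quantize it to 4-bit.
# Assumes `mlx-lm` is installed (pip install mlx-lm); paths are illustrative.
from mlx_lm import convert

convert(
    hf_path="Qwen/Qwen2-7B",       # original Hugging Face repo
    mlx_path="Qwen2-7B-4bit-mlx",  # local output directory for the MLX weights
    quantize=True,                 # apply MLX's quantization (4-bit by default)
)
```

The same thing can be done from the command line with `python -m mlx_lm.convert`, and the result can optionally be uploaded to a Hub repo such as one under mlx-community.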
-
So the flow is Qwen2/Qwen2-7B => `python -m mlx_lm.convert` => mlx-community/Qwen2-7B, is that right?
But then why can I also run `python -m mlx_lm.generate --model huggingface/llm/Qwen/Qwen2-7B/ --prompt "hello"` directly on the original model?
Also, when I tested mlx-examples/stable-diffusion, it downloaded sdxl-turbo, and the model format looked like Ollama's, which surprised me.
So, if I want to run Qwen2-7B, should I run the original model, or convert it to an MLX version like the ones in mlx-community/* and then run that?
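Both options work; here is a minimal sketch of each, assuming the `mlx_lm` Python API (`load`/`generate`). The repo ids are illustrative, and `load` will fetch from the Hugging Face Hub when given a repo id instead of a local path:

```python
# Sketch: two ways to run Qwen2-7B with mlx_lm (repo ids are illustrative).
from mlx_lm import load, generate

# Option 1: load the original (fp16/bf16) Hugging Face model directly;
# mlx_lm reads the safetensors weights without a separate conversion step.
model, tokenizer = load("Qwen/Qwen2-7B")
print(generate(model, tokenizer, prompt="hello", max_tokens=64))

# Option 2: load a pre-quantized MLX model (smaller download, less memory).
model_q, tokenizer_q = load("mlx-community/Qwen2-7B-4bit")
print(generate(model_q, tokenizer_q, prompt="hello", max_tokens=64))
```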