diff --git a/README.md b/README.md
index b6e3b4a..a2aaca7 100644
--- a/README.md
+++ b/README.md
@@ -16,17 +16,17 @@ Python bindings for the Transformer models implemented in C/C++ using [GGML](htt
 
 ## Supported Models
 
-| Models                | Model Type  |
-| :-------------------- | ----------- |
-| GPT-2                 | `gpt2`      |
-| GPT-J, GPT4All-J      | `gptj`      |
-| GPT-NeoX, StableLM    | `gpt_neox`  |
-| LLaMA, LLaMA 2        | `llama`     |
-| MPT                   | `mpt`       |
-| Dolly V2              | `dolly-v2`  |
-| Replit                | `replit`    |
-| StarCoder, StarChat   | `starcoder` |
-| Falcon (Experimental) | `falcon`    |
+| Models              | Model Type    | CUDA | Metal |
+| :------------------ | ------------- | :--: | :---: |
+| GPT-2               | `gpt2`        |      |       |
+| GPT-J, GPT4All-J    | `gptj`        |      |       |
+| GPT-NeoX, StableLM  | `gpt_neox`    |      |       |
+| Falcon              | `falcon`      |  ✅  |       |
+| LLaMA, LLaMA 2      | `llama`       |  ✅  |  ✅   |
+| MPT                 | `mpt`         |  ✅  |       |
+| StarCoder, StarChat | `gpt_bigcode` |  ✅  |       |
+| Dolly V2            | `dolly-v2`    |      |       |
+| Replit              | `replit`      |      |       |
 
 ## Installation
 
@@ -108,8 +108,6 @@ It is integrated into LangChain. See [LangChain docs](https://python.langchain.c
 
 ### GPU
 
-> **Note:** Currently only LLaMA, MPT and Falcon models have GPU support.
-
 To run some of the model layers on GPU, set the `gpu_layers` parameter:
 
 ```py
@@ -179,7 +177,7 @@ It can also be used with LangChain. Low-level APIs are not fully supported.
 | `context_length` | `int` | The maximum context length to use. | `-1` |
 | `gpu_layers` | `int` | The number of layers to run on GPU. | `0` |
 
-> **Note:** Currently only LLaMA, MPT and Falcon models support the `context_length` and `gpu_layers` parameters.
+> **Note:** Currently only LLaMA, MPT and Falcon models support the `context_length` parameter.
 
 ### class `AutoModelForCausalLM`
 
diff --git a/scripts/docs.py b/scripts/docs.py
index 61ac2e2..e287ce3 100755
--- a/scripts/docs.py
+++ b/scripts/docs.py
@@ -29,7 +29,7 @@
     default = getattr(Config, param)
     docs += f"| `{param}` | `{type_}` | {description} | `{default}` |\n"
 
 docs += """
-> **Note:** Currently only LLaMA, MPT and Falcon models support the `context_length` and `gpu_layers` parameters.
+> **Note:** Currently only LLaMA, MPT and Falcon models support the `context_length` parameter.
 """
 # Class Docs
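
For reviewers, this is roughly how the `gpu_layers` parameter documented by these hunks is used. A minimal sketch based on the ctransformers `AutoModelForCausalLM.from_pretrained` API shown elsewhere in the README; the `TheBloke/Llama-2-7B-GGML` repo name is illustrative only, not taken from this diff:

```py
from ctransformers import AutoModelForCausalLM

# Illustrative model repo (hypothetical choice for this example). Per the
# support table above, LLaMA models can offload layers via CUDA or Metal.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GGML",
    model_type="llama",
    gpu_layers=50,  # layers to run on GPU; the default 0 keeps everything on CPU
)

print(llm("AI is going to"))
```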