Release v2.20.1 · mudler/LocalAI

It's that time again—I’m excited (and honestly, a bit proud) to announce the release of LocalAI v2.20! This one’s a biggie, with some of the most requested features and enhancements, all designed to make your self-hosted AI journey even smoother and more powerful.

TL;DR

🌍 Explorer & Community: Explore global community pools at explorer.localai.io
👀 Demo instance available: Test out LocalAI at demo.localai.io
🤗 Integration: Hugging Face Local apps now include LocalAI
🐛 Bug Fixes: Diffusers and hipblas issues resolved
🎨 New Feature: FLUX-1 image generation support
🏎️ Strict Mode: Stay compliant with OpenAI’s latest API changes
💪 Multiple P2P Clusters: Run multiple clusters within the same network
🧪 Deprecation Notice: gpt4all.cpp and petals backends deprecated

🌍 Explorer and Global Community Pools

Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.

How It Works

Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.

👀 Demo Instance Now Available

Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.

🤗 Hugging Face Integration

I am excited to announce that LocalAI is now integrated within Hugging Face’s local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!

This integration was made possible through this PR.

🎨 FLUX-1 Image Generation Support

FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.

Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!

🐛 Diffusers and hipblas Fixes

Great news for AMD users! If you’ve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. We’ve transitioned to uv for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.

🏎️ Strict Mode for API Compliance

To stay up to date with OpenAI’s latest changes, now LocalAI have support as well for Strict Mode ( https://openai.com/index/introducing-structured-outputs-in-the-api/ ). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.

To activate, simply set strict: true in your API calls, even if it’s disabled in your configuration.

Key Notes:

Setting strict: true enables grammar enforcement, even if disabled in your config.
If format_type is set to json_schema, BNF grammars will be automatically generated from the schema.

🛑 Disable Gallery

Need to streamline your setup? You can now disable the gallery endpoint using LOCALAI_DISABLE_GALLERY_ENDPOINT. For more options, check out the full list of commands with --help.

🌞 P2P and Federation Enhancements

Several enhancements have been made to improve your experience with P2P and federated clusters:

Load Balancing by Default: This feature is now enabled by default (disable it with LOCALAI_RANDOM_WORKER if needed).
Target Specific Workers: Directly target workers in federated mode using LOCALAI_TARGET_WORKER.

💪 Run Multiple P2P Clusters in the Same Network

You can now run multiple clusters within the same network by specifying a network ID via CLI. This allows you to logically separate clusters while using the same shared token. Just set LOCALAI_P2P_NETWORK_ID to a UUID that matches across instances.

Please note, while this offers segmentation, it’s not fully secure—anyone with the network token can view available services within the network.

🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends

As we continue to evolve, we are officially deprecating the gpt4all.cpp and petals backends. The newer llama.cpp offers a superior set of features and better performance, making it the preferred choice moving forward.

From this release onward, gpt4all models in ggml format are no longer compatible. Additionally, the petals backend has been deprecated as well. LocalAI’s new P2P capabilities now offer a comprehensive replacement for these features.

What's Changed

Breaking Changes 🛠

chore: drop gpt4all.cpp by @mudler in #3106
chore: drop petals by @mudler in #3316

Bug fixes 🐛

fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
fix: ensure correct version of torch is always installed based on BUI… by @cryptk in #2890
fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
fix(vall-e-x): pin hipblas deps by @mudler in #3201
fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
fix(explorer): reset counter when network is active by @mudler in #3213
fix(p2p): allocate tunnels only when needed by @mudler in #3259
fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
fix(parler-tts): bump and require after build type deps by @mudler in #3272
fix: add llvm to extra images by @mudler in #3321
fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
fix(ci): pin to llvmlite 0.43 by @mudler in #3342
fix(p2p): avoid starting the node twice by @mudler in #3349
fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359

Exciting New Features 🎉

feat(guesser): add gemma2 by @sozercan in #3118
feat(venv): shared env by @mudler in #3195
feat(openai): add json_schema format type and strict mode by @mudler in #3193
feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
feat(p2p): add network explorer and community pools by @mudler in #3125
feat(explorer): relax token deletion with error threshold by @mudler in #3211
feat(diffusers): support flux models by @mudler in #3129
feat(explorer): make possible to run sync in a separate process by @mudler in #3224
feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
feat(explorer): visual improvements by @mudler in #3247
feat(gallery): lazy load images by @mudler in #3246
chore(explorer): add join instructions by @mudler in #3255
chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
chore(p2p): make commands easier to copy-paste by @mudler in #3273
chore(ux): allow to create and drag dots in the animation by @mudler in #3287
feat(federation): do not allocate local services for load balancing by @mudler in #3337
feat(p2p): allow to set intervals by @mudler in #3353

🧠 Models

models(gallery): add meta-llama-3.1-instruct-9.99b-brainstorm-10x-form-3 by @mudler in #3103
models(gallery): add mn-12b-celeste-v1.9 by @mudler in #3104
models(gallery): add shieldgemma by @mudler in #3105
models(gallery): add llama-3.1-techne-rp-8b-v1 by @mudler in #3112
models(gallery): add llama-spark by @mudler in #3116
models(gallery): add glitz by @mudler in #3119
models(gallery): add gemmasutra-mini by @mudler in #3120
models(gallery): add kumiho-v1-rp-uwu-8b by @mudler in #3121
models(gallery): add humanish-roleplay-llama-3.1-8b-i1 by @mudler in #3126
chore(model-gallery): ⬆️ update checksum by @localai-bot in #3167
models(gallery): add calme-2.2-qwen2-72b by @mudler in #3185
models(gallery): add calme-2.3-legalkit-8b by @mudler in #3200
chore(model-gallery): ⬆️ update checksum by @localai-bot in #3210
models(gallery): add flux.1-dev and flux.1-schnell by @mudler in #3215
models(gallery): add infinity-instruct-7m-gen-llama3_1-70b by @mudler in #3220
models(gallery): add cathallama-70b by @mudler in #3221
models(gallery): add edgerunner-tactical-7b by @mudler in #3249
models(gallery): add hermes-3 by @mudler in #3252
models(gallery): add SmolLM by @mudler in #3265
models(gallery): add mahou-1.3-llama3.1-8b by @mudler in #3266
models(gallery): add fireball-llama-3.11-8b-v1orpo by @mudler in #3267
models(gallery): add rocinante-12b-v1.1 by @mudler in #3268
models(gallery): add pantheon-rp-1.6-12b-nemo by @mudler in #3269
models(gallery): add llama-3.1-storm-8b-q4_k_m by @mudler in #3270

📖 Documentation and examples

docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3109
fix(docs): Refer to the OpenAI documentation to update the openai-functions docu… by @jermeyhu in #3317
chore(docs): update p2p env var documentation by @mudler in #3350

👒 Dependencies

chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3110
chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3115
chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3117
chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3123
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/autogptq by @dependabot in #3130
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/common/template by @dependabot in #3131
chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/functions by @dependabot in #3132
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/openvoice by @dependabot in #3137
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/coqui by @dependabot in #3138
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers-musicgen by @dependabot in #3140
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/diffusers by @dependabot in #3141
chore(deps): Bump llama-index from 0.10.56 to 0.10.59 in /examples/chainlit by @dependabot in #3143
chore(deps): Bump docs/themes/hugo-theme-relearn from 7aec99b to 8b14837 by @dependabot in #3142
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/exllama2 by @dependabot in #3146
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/bark by @dependabot in #3144
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/rerankers by @dependabot in #3147
chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/langchain-chroma by @dependabot in #3148
chore(deps): Bump streamlit from 1.37.0 to 1.37.1 in /examples/streamlit-bot by @dependabot in #3151
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vllm by @dependabot in #3152
chore(deps): Bump langchain from 0.2.11 to 0.2.12 in /examples/langchain/langchainpy-localai-example by @dependabot in #3155
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers by @dependabot in #3161
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vall-e-x by @dependabot in #3156
chore(deps): Bump sqlalchemy from 2.0.31 to 2.0.32 in /examples/langchain/langchainpy-localai-example by @dependabot in #3157
chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #3164
chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/functions by @dependabot in #3134
chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/langchain-chroma by @dependabot in #3149
chore(deps): Bump openai from 1.37.1 to 1.39.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3158
chore: ⬆️ Update ggerganov/llama.cpp by @mudler in #3166
chore(deps): Bump tqdm from 4.66.4 to 4.66.5 in /examples/langchain/langchainpy-localai-example by @dependabot in #3159
chore(deps): Bump llama-index from 0.10.56 to 0.10.61 in /examples/langchain-chroma by @dependabot in #3168
chore: ⬆️ Update ggerganov/llama.cpp to 1e6f6554aa11fa10160a5fda689e736c3c34169f by @mudler in #3189
chore: ⬆️ Update ggerganov/llama.cpp to 15fa07a5c564d3ed7e7eb64b73272cedb27e73ec by @localai-bot in #3197
chore: ⬆️ Update ggerganov/whisper.cpp to 6eac06759b87b50132a01be019e9250a3ffc8969 by @localai-bot in #3203
chore: ⬆️ Update ggerganov/llama.cpp to 3a14e00366399040a139c67dd5951177a8cb5695 by @localai-bot in #3204
chore(deps): Bump aiohttp from 3.9.5 to 3.10.2 in /examples/langchain/langchainpy-localai-example in the pip group by @dependabot in #3207
chore: ⬆️ Update ggerganov/llama.cpp to b72942fac998672a79a1ae3c03b340f7e629980b by @localai-bot in #3208
chore: ⬆️ Update ggerganov/whisper.cpp to 81c999fe0a25c4ebbfef10ed8a1a96df9cfc10fd by @localai-bot in #3209
chore: ⬆️ Update ggerganov/llama.cpp to 6e02327e8b7837358e0406bf90a4632e18e27846 by @localai-bot in #3212
chore(deps): update edgevpn by @mudler in #3214
chore: ⬆️ Update ggerganov/llama.cpp to 4134999e01f31256b15342b41c4de9e2477c4a6c by @localai-bot in #3218
chore(deps): Bump llama-index from 0.10.61 to 0.10.65 in /examples/langchain-chroma by @dependabot in #3225
chore(deps): Bump langchain-community from 0.2.9 to 0.2.11 in /examples/langchain/langchainpy-localai-example by @dependabot in #3230
chore(deps): Bump attrs from 23.2.0 to 24.2.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3232
chore(deps): Bump pyyaml from 6.0.1 to 6.0.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3231
chore(deps): Bump llama-index from 0.10.59 to 0.10.65 in /examples/chainlit by @dependabot in #3238
chore: ⬆️ Update ggerganov/llama.cpp to fc4ca27b25464a11b3b86c9dbb5b6ed6065965c2 by @localai-bot in #3240
chore(deps): Bump openai from 1.39.0 to 1.40.5 in /examples/langchain-chroma by @dependabot in #3241
chore: ⬆️ Update ggerganov/whisper.cpp to 22fcd5fd110ba1ff592b4e23013d870831756259 by @localai-bot in #3239
chore(deps): Bump aiohttp from 3.10.2 to 3.10.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3234
chore(deps): Bump openai from 1.39.0 to 1.40.6 in /examples/langchain/langchainpy-localai-example by @dependabot in #3244
chore: ⬆️ Update ggerganov/llama.cpp to 06943a69f678fb32829ff06d9c18367b17d4b361 by @localai-bot in #3245
chore(deps): Bump openai from 1.39.0 to 1.40.4 in /examples/functions by @dependabot in #3235
chore: ⬆️ Update ggerganov/llama.cpp to 5fd89a70ead34d1a17015ddecad05aaa2490ca46 by @localai-bot in #3248
chore(deps): bump llama.cpp, rename llama_add_bos_token by @mudler in #3253
chore: ⬆️ Update ggerganov/llama.cpp to 8b3befc0e2ed8fb18b903735831496b8b0c80949 by @localai-bot in #3257
chore: ⬆️ Update ggerganov/llama.cpp to 2fb9267887d24a431892ce4dccc75c7095b0d54d by @localai-bot in #3260
chore: ⬆️ Update ggerganov/llama.cpp to 554b049068de24201d19dde2fa83e35389d4585d by @localai-bot in #3263
chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/langchain-chroma by @dependabot in #3275
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/openvoice by @dependabot in #3282
chore(deps): Bump docs/themes/hugo-theme-relearn from 8b14837 to 82a5e98 by @dependabot in #3274
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/bark by @dependabot in #3285
chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/parler-tts by @dependabot in #3283
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/common/template by @dependabot in #3291
chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/sentencetransformers by @dependabot in #3292
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/vall-e-x by @dependabot in #3294
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/transformers by @dependabot in #3296
chore(deps): Bump grpcio from 1.65.0 to 1.65.5 in /backend/python/exllama by @dependabot in #3299
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/vllm by @dependabot in #3301
chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/functions by @dependabot in #3304
chore(deps): Bump numpy from 2.0.1 to 2.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3310
chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/mamba by @dependabot in #3313
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/coqui by @dependabot in #3306
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/transformers-musicgen by @dependabot in #3308
chore(deps): Bump langchain-community from 0.2.11 to 0.2.12 in /examples/langchain/langchainpy-localai-example by @dependabot in #3311
chore: ⬆️ Update ggerganov/llama.cpp to cfac111e2b3953cdb6b0126e67a2487687646971 by @localai-bot in #3315
chore(deps): Bump openai from 1.40.4 to 1.41.1 in /examples/functions by @dependabot in #3319
chore(deps): Bump openai from 1.40.6 to 1.41.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3320
chore(deps): Bump llama-index from 0.10.65 to 0.10.67.post1 in /examples/langchain-chroma by @dependabot in #3335
chore(deps): update edgevpn by @mudler in #3340
chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/langchain/langchainpy-localai-example by @dependabot in #3307
chore(deps): update edgevpn by @mudler in #3346
chore: ⬆️ Update ggerganov/whisper.cpp to d65786ea540a5aef21f67cacfa6f134097727780 by @localai-bot in #3344
chore: ⬆️ Update ggerganov/llama.cpp to 2f3c1466ff46a2413b0e363a5005c46538186ee6 by @localai-bot in #3345
chore: ⬆️ Update ggerganov/llama.cpp to fc54ef0d1c138133a01933296d50a36a1ab64735 by @localai-bot in #3356
chore: ⬆️ Update ggerganov/whisper.cpp to 9e3c5345cd46ea718209db53464e426c3fe7a25e by @localai-bot in #3357

Other Changes

feat(swagger): update swagger by @localai-bot in #3196
fix: devcontainer part 1 by @dave-gray101 in #3254
fix: devcontainer pt 2 by @dave-gray101 in #3258
feat: devcontainer part 3 by @dave-gray101 in #3318
feat: devcontainer part 4 by @dave-gray101 in #3339
feat(swagger): update swagger by @localai-bot in #3343
chore(anime.js): drop unused by @mudler in #3351
chore(p2p): single-node when sharing federated instance by @mudler in #3354

New Contributors

@jermeyhu made their first contribution in #3317

Full Changelog: v2.19.4...v2.20.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v2.20.1

TL;DR

🌍 Explorer and Global Community Pools

How It Works

👀 Demo Instance Now Available

🤗 Hugging Face Integration

🎨 FLUX-1 Image Generation Support

🐛 Diffusers and hipblas Fixes

🏎️ Strict Mode for API Compliance

Key Notes:

🛑 Disable Gallery

🌞 P2P and Federation Enhancements

💪 Run Multiple P2P Clusters in the Same Network

🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

Exciting New Features 🎉

🧠 Models

📖 Documentation and examples

👒 Dependencies

Other Changes

New Contributors

Contributors

v2.20.1

TL;DR

🌍 Explorer and Global Community Pools

How It Works

👀 Demo Instance Now Available

🤗 Hugging Face Integration

🎨 FLUX-1 Image Generation Support

🐛 Diffusers and hipblas Fixes

🏎️ Strict Mode for API Compliance

Key Notes:

🛑 Disable Gallery

🌞 P2P and Federation Enhancements

💪 Run Multiple P2P Clusters in the Same Network

🧪 Deprecation Notice: gpt4all.cpp and petals Backends

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

Exciting New Features 🎉

🧠 Models

📖 Documentation and examples

👒 Dependencies

Other Changes

New Contributors

Contributors

🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends