
Thread Safety in llama.cpp #596

Open
martindevans opened this issue Mar 12, 2024 · 1 comment
Labels: Upstream (Tracking an issue in llama.cpp)

Comments

@martindevans (Member)

Tracking issue for thread safety in llama.cpp. The global inference lock in LLamaSharp can be removed once this is resolved.

ggerganov/llama.cpp#3960
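For context, a minimal sketch of the kind of process-wide lock this refers to (not LLamaSharp's actual code; `guarded_decode` is a hypothetical wrapper around llama.cpp's C API): because llama.cpp could not safely be called from multiple threads, every inference call is serialized behind a single mutex.

```cpp
#include <mutex>
#include "llama.h"

// Process-wide lock serializing all inference calls (the "global inference lock").
static std::mutex g_inference_lock;

// Hypothetical wrapper: callers go through here instead of calling
// llama_decode directly, so only one decode can run at a time,
// regardless of how many contexts or threads exist.
int guarded_decode(llama_context * ctx, llama_batch batch) {
    std::lock_guard<std::mutex> guard(g_inference_lock);
    return llama_decode(ctx, batch);
}
```

Once upstream thread safety lands, callers could invoke `llama_decode` concurrently on separate contexts and this serialization point could be dropped.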

@martindevans added the Upstream label on Mar 12, 2024
@zsogitbe (Contributor)

llama.cpp : add pipeline parallelism support #6017. Good news: it seems to be high priority and will probably be ready soon. Once this and the CUDA memory release bug fix are ready, please make a quick intermediate LLamaSharp release that integrates them. This is important.

ggerganov/llama.cpp#6017
