
Stopping the Model #15

Open
will-lumley opened this issue Oct 11, 2024 · 1 comment

Comments

@will-lumley
Contributor

Is it possible to stop the model? We have start(), but we don’t have a stop() equivalent.

In certain scenarios, it would be useful to have the ability to gracefully stop or terminate a running model inference process, especially when it’s being used in environments where resource management is crucial.

A stop() function could help with:
• Freeing up resources like memory or compute when the model is no longer needed.
• Handling cases where the inference is taking too long and needs to be interrupted.
• Ensuring that models can be started and stopped dynamically without having to release and reinitialise the whole model object.

Is this something that could be added, or is there already a workaround for this use case?

Thanks!

@ShenghaiWang
Owner

SwiftLlama is very lightweight, so you can free the model-related resources simply by releasing the SwiftLlama object instance. If stop() freed the memory the model uses, the system would have to reload the model before the next call, so reinitialisation would be unavoidable anyway.
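As a minimal sketch of that workaround, assuming the instance is held in an optional property (the initialiser shown is indicative of the README usage):

```swift
import SwiftLlama

final class ChatService {
    private var llama: SwiftLlama?

    func load(modelPath: String) throws {
        // Creating the instance loads the model and allocates the llama.cpp context.
        llama = try SwiftLlama(modelPath: modelPath)
    }

    func unload() {
        // Dropping the last reference lets ARC deinitialise the instance,
        // which releases the model-related resources.
        llama = nil
    }
}
```

Reloading afterwards means calling load(modelPath:) again, which is exactly the reinitialisation cost mentioned above.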

I understand you might be thinking of freeing up resources only partially; I haven't dug into the llama.cpp code for this yet. I'm also not sure whether partially freeing memory is meaningful, as this type of system is usually designed to run with exclusive resources.

Regarding stopping long-running generations, the maxTokenCount parameter in the configuration is there for this purpose.
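For example, something like the following bounds generation length (a sketch only; the exact Configuration and SwiftLlama initialiser labels may differ from the released API):

```swift
import SwiftLlama

// Sketch: limit generation to 256 tokens so a long-running inference stops itself.
// Parameter labels here are indicative; check the Configuration type for the exact API.
let configuration = Configuration(maxTokenCount: 256)
let swiftLlama = try SwiftLlama(modelPath: "/path/to/model.gguf",
                                modelConfiguration: configuration)
```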
