How to Deploy Ollama LLM on Cloud-Managed Kubernetes (OCI) (pvc/init container) #90
Comments
Hello @brokedba, Here's an explanation:

First, if `persistentVolume.enabled` is true, the deployment template mounts a PVC for the `ollama-data` volume; otherwise it falls back to an `emptyDir`:

```yaml
volumes:
  - name: ollama-data
    {{- if .Values.persistentVolume.enabled }}
    persistentVolumeClaim:
      claimName: {{ .Values.persistentVolume.existingClaim | default (printf "%s" (include "ollama.fullname" .)) }}
    {{- else }}
    emptyDir: { }
    {{- end }}
```
You can specify a StorageClass that is already configured in your infrastructure to automatically create a PVC. Alternatively, if you already have a PVC configured, you can set the `persistentVolume.existingClaim` value. Example using longhorn as provisioner:

```yaml
# Enable persistence using Persistent Volume Claims
# ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
persistentVolume:
  # -- Enable persistence using PVC
  enabled: true
  # -- Ollama server data Persistent Volume Storage Class
  # If defined, storageClassName: <storageClass>
  # If set to "-", storageClassName: "", which disables dynamic provisioning
  # If undefined (the default) or set to null, no storageClassName spec is
  # set, choosing the default provisioner. (gp2 on AWS, standard on
  # GKE, AWS & OpenStack)
  storageClass: "longhorn"
```
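For the second approach, a minimal sketch of the values, assuming a PVC that was already created out of band (the claim name `ollama-pvc` is hypothetical):

```yaml
persistentVolume:
  enabled: true
  # -- Reuse an existing PVC instead of letting the chart provision one;
  # this feeds the existingClaim branch of the template above.
  existingClaim: "ollama-pvc"
```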
To preload models at startup, simply populate the `ollama.models` (or `ollama.defaultModel`) values; the chart then pulls them in a `postStart` lifecycle hook:

```yaml
{{- if or .Values.ollama.models .Values.ollama.defaultModel }}
lifecycle:
  postStart:
    exec:
      command: [ "/bin/sh", "-c", "{{- printf "echo %s | xargs -n1 /bin/ollama pull %s" (include "ollama.modelList" .) (ternary "--insecure" "" .Values.ollama.insecure)}}" ]
{{- end }}
```

Let me know if you need more details!
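Rendered, that hook pipes the model list through `xargs`, which runs one pull per model. A minimal sketch of the expansion, with `echo pull` standing in for `/bin/ollama pull` (no Ollama binary assumed) and two hypothetical model names:

```shell
# Each whitespace-separated model becomes its own pull invocation:
echo "llama3 mistral" | xargs -n1 echo pull
# prints:
# pull llama3
# pull mistral
```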
@jdetroyes thank you so much for the answers!!
If I understand well, if the StorageClass doesn't exist, option 2 will not really work, am I right?
EDIT: it only worked after I hardcoded `storageClassName: "oci-bv"` (the default in Oracle Cloud).
Here's the thing: my Kubernetes cluster is CPU-only, so I need GGUF models to be loaded, not GPU builds. Hence my question about the purpose of the init container.
Hello @clouddude Based on your scenario, here's an example with `initContainers`:

```yaml
initContainers:
  - name: install-and-setup-model
    image: python:3.9 # Use an image with Python and pip pre-installed
    command: [sh, -c]
    args:
      - |
        pip install -U "huggingface_hub[cli]";
        mkdir -p /root/.ollama/download;
        huggingface-cli download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf \
          --local-dir /root/.ollama/download \
          --local-dir-use-symlinks False;
        echo 'FROM /root/.ollama/download/Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf' > /root/.ollama/download/llama3.loc;
    volumeMounts:
      - name: ollama-data # Use the same name defined in the volumes section in deployment.yaml
        mountPath: /root/.ollama # Use same as default

# -- Lifecycle for pod assignment (overrides ollama.models startup pulling)
lifecycle:
  postStart:
    exec:
      command: [ "/bin/sh", "-c", "ollama create llama3 -f /root/.ollama/download/llama3.loc" ]

persistentVolume:
  # Enable PVC for Ollama
  enabled: true
  # Use default storage class
  storageClass: ""
```
I hit Docker Hub rate limits, so I needed to add a Docker Hub secret, but it's complaining.
Am I missing something? Is it included in the template?
Hey @brokedba Docker secrets are shared with all containers in the deployment. You don't have to add a line in the `initContainers`; you just have to populate this in values.yaml:

```yaml
# -- Docker registry secret names as an array
imagePullSecrets: []
```
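For reference, a sketch of the values side, assuming a registry secret already created in the release namespace with `kubectl create secret docker-registry` (the name `regcred` is hypothetical):

```yaml
# values.yaml -- reference the registry secret by name; every container
# in the pod, including init containers, will use it for image pulls.
imagePullSecrets:
  - name: regcred
```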
My bad, I had corrected it before I could update the post.
Edit: the init container phase worked, but only partially.
What do you think could have caused the partial failure?
I also noted that the resulting Modelfile is truncated: any line between curly brackets `{{ }}` was ignored. The below is the final version after I logged in to the container. I think it might be why the `create` command behaves this way, who knows.
I found online that
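One plausible cause, stated as an assumption rather than a confirmed diagnosis: Helm treats anything inside `{{ ... }}` as template syntax, so literal braces in a Modelfile that passes through chart rendering are consumed unless they are escaped, for example with a backquoted raw string:

```
# Helm renders {{ ... }} as template actions; wrap literal braces in a
# backquoted raw string so they survive rendering:
TEMPLATE """{{ `{{ .Prompt }}` }}"""
# renders as: TEMPLATE """{{ .Prompt }}"""
```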
How to Deploy Ollama LLM on Cloud-Managed Kubernetes (OCI)
I'm looking for guidance on deploying Ollama LLM using Helm charts on a cloud-managed Kubernetes service, specifically Oracle Cloud Infrastructure (OCI). I have a few questions regarding the deployment process:
Persistent Volume and Data Volume Mounting:
- How does the `ollama-data` volume `mountPath: ""` match the `persistentVolume` if it's enabled? It's unclear how these values are connected.
- What do the `persistentVolume` values need to be to be effective? There isn't much clarity on this in the documentation, and it would be helpful to have an example.

Loading Models with Init Containers:
- Is there a way to use an init container sharing the same `mountPath` before the main pod is spun up? This feature would be useful for preloading models and ensuring they're ready when the main container starts.

The documentation seems limited, making it challenging to proceed. Any examples or additional guidance would be greatly appreciated.
Thank you