forbidden sysctl: "vm.max_map_count" not whitelisted #87

Closed
dyipon opened this issue Oct 7, 2021 · 19 comments

@dyipon

dyipon commented Oct 7, 2021

Hi,

I tried to install OpenSearch via Helm, but the pods are stuck in Pending with this message:
forbidden sysctl: "vm.max_map_count" not whitelisted

I tried to override the security settings, but that did not help:

helm upgrade --install opensearch opensearch/opensearch --set sysctl.enabled=true --set podSecurityContext.runAsUser=0 --set securityContext.runAsNonRoot=false >test.yml

I'm trying to migrate from ES, but their Helm chart uses a slightly different method:
a fully privileged init container sets vm.max_map_count:

      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.14.0"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:

What would be the best practice? Has nobody else run into this?

thanks

@sdwerwed

sdwerwed commented Oct 8, 2021

Maybe try this init container as a workaround

# # Enable on the first time
# sysctl:
#   enabled: false


extraInitContainers:
  ## Image that performs the sysctl operation to modify Kernel settings (needed sometimes to avoid boot errors)
  - name: sysctl
    image: docker.io/bitnami/bitnami-shell:10-debian-10-r199
    imagePullPolicy: "IfNotPresent"
    command:
      - /bin/bash
      - -ec
      - |
        CURRENT=`sysctl -n vm.max_map_count`;
        DESIRED="262144";
        if [ "$DESIRED" -gt "$CURRENT" ]; then
            sysctl -w vm.max_map_count=262144;
        fi;
        CURRENT=`sysctl -n fs.file-max`;
        DESIRED="65536";
        if [ "$DESIRED" -gt "$CURRENT" ]; then
            sysctl -w fs.file-max=65536;
        fi;
    securityContext:
      #runAsUser: 0
      privileged: true
      

@dyipon
Author

dyipon commented Oct 8, 2021

I had to add one extra line to the securityContext part of your code:
"runAsUser: 0"
and now it works fine.

Thanks for your help @sdwerwed !
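
For readers who want the combined result, the override can be kept in a values file rather than in `--set` flags. The following is a minimal sketch, not the chart's official method: the function name `write_sysctl_values` and the file name `sysctl-values.yaml` are made up for illustration, the image tag is copied from the comment above and may be outdated, and the `fs.file-max` part is left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch: print a values override containing the privileged sysctl init
# container from this thread, with runAsUser: 0 enabled as reported above.
# write_sysctl_values is a hypothetical helper name, not part of the chart.
write_sysctl_values() {
  cat <<'EOF'
# Keep the pod-level sysctl off; requesting it is what the kubelet rejects.
sysctl:
  enabled: false
extraInitContainers:
  - name: sysctl
    image: docker.io/bitnami/bitnami-shell:10-debian-10-r199
    imagePullPolicy: "IfNotPresent"
    command:
      - /bin/bash
      - -ec
      - |
        CURRENT=$(sysctl -n vm.max_map_count)
        DESIRED="262144"
        if [ "$DESIRED" -gt "$CURRENT" ]; then
          sysctl -w vm.max_map_count="$DESIRED"
        fi
    securityContext:
      runAsUser: 0
      privileged: true
EOF
}

# Usage (not run here; the file name is an assumption):
#   write_sysctl_values > sysctl-values.yaml
#   helm upgrade --install opensearch opensearch/opensearch -f sysctl-values.yaml
```

The quoted heredoc delimiter keeps `$CURRENT` and `$DESIRED` literal, so they are expanded inside the init container and not on the machine running Helm.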

@dyipon dyipon closed this as completed Oct 8, 2021
@sdwerwed

Glad you made it work! However, I would leave this issue open: this workaround is not the solution, and the chart should work with the default values.

To maintainers:
Should we add something similar to the default values?

@dyipon dyipon reopened this Oct 11, 2021
@DandyDeveloper
Collaborator

DandyDeveloper commented Oct 13, 2021

@dyipon @sdwerwed By default, setting sysctl.enabled results in the following being enabled on the sts:

        {{- if .Values.sysctl.enabled }}
        sysctls:
        - name: vm.max_map_count
          value: {{ .Values.sysctlVmMaxMapCount | quote }}
        {{- end }}

As explained in the comments in those values, your kubelet / node needs to be allowed to access these settings:
https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/

It's not a bug. As the error message points out, Kubernetes/CRI can't access the system option because it isn't whitelisted.

Recommending privileged: true is not good advice, and it is dangerous. You'll need to follow the instructions in the k8s docs I linked to make sure this works consistently and safely.

@dyipon
Author

dyipon commented Oct 13, 2021

@DandyDeveloper The ES Helm chart uses the
privileged: true option too, and it works fine.
https://github.com/elastic/helm-charts/blob/e077086c74d3b9e121cec1d93f04d8418599bae4/elasticsearch/templates/statefulset.yaml#L166

@DandyDeveloper
Collaborator

@dyipon That doesn't make it safe ;)

@sdwerwed

sdwerwed commented Oct 13, 2021

Yes, I agree it is not safe, but what is safer?
An init container that runs as root for a few seconds on startup?
Or whitelisting the sysctl on each node?

The current Helm chart will not work unless the user prepares the environment, which is not common in Kubernetes. Modifying the kubelet will also be more complicated in live environments (not sure whether it requires recreating the VM). An init container, by contrast, is more compatible, and admins will not have to modify kubelet arguments.

If the current approach is much more secure and reliable, we can keep it as it is and add a note to the README describing what admins are expected to do before installing this chart.

@dyipon
Author

dyipon commented Oct 13, 2021

Yes, it's uncommon in Kubernetes (for me) to have to add extra arguments to the kubelet in order to whitelist the corresponding sysctl.

@dyipon
Author

dyipon commented Oct 13, 2021

Well, I tried to follow these instructions:
https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/

The article says 'Only namespaced sysctls can be enabled this way', but vm.max_map_count is not a namespaced sysctl, and the suggested kubelet args produce this error:
Oct 13 23:25:26 k3s-ha-master-1 k3s[474998]: E1013 23:25:26.608622 474998 server.go:288] "Failed to run kubelet" err="failed to run Kubelet: failed to create kubelet: the sysctl \"vm.max_map_count\" are not known to be namespaced"

I'm open to any new suggestions, thanks :)
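
The kubelet error above follows from how the Kubernetes sysctl docs group parameters: only namespaced ones can be allowed per pod. A rough sketch of that grouping is below. The helper name `is_namespaced_sysctl` is made up for illustration, and the pattern list is an approximation of the documented groups (`kernel.shm*`, `kernel.msg*`, `kernel.sem`, `fs.mqueue.*`, and `net.*`); not every `net.*` parameter is actually namespaced.

```shell
#!/usr/bin/env bash
# Rough check: does a sysctl name fall into one of the groups the Kubernetes
# docs list as namespaced? Approximation only; some net.* entries are exceptions.
# is_namespaced_sysctl is a hypothetical helper, not a kubelet or kubectl feature.
is_namespaced_sysctl() {
  case "$1" in
    kernel.shm*|kernel.msg*|kernel.sem|fs.mqueue.*|net.*) return 0 ;;
    *) return 1 ;;
  esac
}

for name in vm.max_map_count net.core.somaxconn kernel.shmmax; do
  if is_namespaced_sysctl "$name"; then
    echo "$name: looks namespaced, may be allowed per pod via the kubelet"
  else
    echo "$name: node-level, has to be set on the host or by a privileged container"
  fi
done
```

Under this grouping `vm.max_map_count` is node-level, which matches the "not known to be namespaced" message in the log line above.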

@acjohnson

Sadly, on GKE, adding vm.max_map_count to the kubelet --allowed-unsafe-sysctls arg is not a supported option, as vm.max_map_count is an "unsupported" kernel parameter...

This is the error you'll get if you try to set it using gcloud beta container node-pools create

ERROR: (gcloud.beta.container.node-pools.create) ResponseError: code=400, message=Unsupported kernel parameter vm.max_map_count.

So it looks like the privileged sysctl init container is the only decent option currently for opensearch...

@sastorsl
Contributor

I would argue that setting kernel parameters should be left outside.
Either you have nodes set aside for OpenSearch - and can get admins to set these values directly on the nodes, or you run other pods on the same nodes requiring some thought and coordination.

When I researched this I ended up setting it directly on the host, and leaving OpenSearch / helm unprivileged.

Not arguing against automating though, we do this with other orchestration tools, but in this ecosystem (k8s) I would say it's best left outside.
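
Setting the value directly on the host, as described in this comment, usually means a one-off `sysctl -w` plus a persistent entry. A minimal sketch under stated assumptions: the helper names and the drop-in file name `99-opensearch.conf` are hypothetical, and the commands in the usage comment must be run as root on each node.

```shell
#!/usr/bin/env bash
# Sketch: helpers for preparing a node by hand. Both function names are
# illustrative only.

# Print one sysctl.d-style line: "<key> = <value>".
render_sysctl_dropin() {
  printf '%s = %s\n' "$1" "$2"
}

# Succeed when the desired value ($2) is greater than the current one ($1).
needs_raise() {
  [ "$2" -gt "$1" ]
}

# Usage on the node itself, as root (not run here; the file name is an assumption):
#   render_sysctl_dropin vm.max_map_count 262144 > /etc/sysctl.d/99-opensearch.conf
#   sysctl --system
#   if needs_raise "$(sysctl -n vm.max_map_count)" 262144; then echo "still too low"; fi
```

With the node prepared this way, the chart can stay unprivileged and the pod-level sysctl can stay disabled.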

@DandyDeveloper
Collaborator

Unfortunately, regarding GKE, there just aren't many other options. I don't like the idea of privileged: true on an initContainer, but it does seem to be the only option here.

Per the above comment though, there definitely needs to be some understanding that either:

  • Nodes have to be prepared beforehand with the vm.max_map_count value specifically for OpenSearch
  • Kubelet needs to have explicit access to the necessary sysctls (even if it is considered unsafe)

Otherwise, the best bet is privileged: true which isn't good advice but we don't really have many other options.

@smlx
Contributor

smlx commented Dec 1, 2021

At least scoping it to an init container is a reasonable compromise for now - as long as we can enable the init container via a flag.

@peterzhuamazon
Member

Closing this for now, as it seems to have been resolved by the community.
Please feel free to re-open if you still have questions.

Thanks.

@acjohnson

It would be nice if the init container method in this issue could become an official part of this helm chart and could be toggled with a bool value.

@peterzhuamazon
Member

> It would be nice if the init container method in this issue could become an official part of this helm chart and could be toggled with a bool value.

Hi @acjohnson feel free to contribute the changes if you are available for that :)
We welcome community contributions to help improve helm charts every day.

Thanks.

@DandyDeveloper
Collaborator

I'm indifferent about adding it. If there is an overwhelming (somewhat) need for it, it should be done. It's community driven, after all.

@nitinjagjivan

With k8s v1.25 and PodSecurity "baseline:latest" or PodSecurity "restricted:latest", the initContainer solution provided above doesn't work, as it needs securityContext.privileged: true, which is dangerous as well.

Also helm-charts/charts/opensearch/values.yaml doesn't work as it does the same thing.

error:
would violate PodSecurity "baseline:latest": forbidden sysctls (vm.max_map_count)
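
The error above comes from the Pod Security Standards, where the baseline profile forbids privileged containers and only permits a short list of "safe" sysctls. A small sketch of that check follows. The helper name is made up, and the list reflects the baseline safe list as understood for the v1.25 era; newer Kubernetes releases allow a few more entries, so treat it as an assumption and check the docs for your version.

```shell
#!/usr/bin/env bash
# Sketch: is a sysctl on the Pod Security Standards "baseline" safe list?
# The list is an assumption based on the v1.25-era standard; later releases
# extend it. is_baseline_allowed_sysctl is a hypothetical helper name.
is_baseline_allowed_sysctl() {
  case "$1" in
    kernel.shm_rmid_forced|net.ipv4.ip_local_port_range|net.ipv4.ip_unprivileged_port_start|net.ipv4.tcp_syncookies|net.ipv4.ping_group_range)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# To see which profile a namespace enforces (not run here):
#   kubectl get namespace <namespace> --show-labels
# and look for the pod-security.kubernetes.io/enforce label.
```

Since `vm.max_map_count` is not on that list and privileged containers are rejected as well, a namespace enforcing baseline or restricted leaves preparing the node itself as the remaining route discussed in this thread.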

@AriBerisha

> Maybe try this init container as a workaround

# # Enable on the first time
# sysctl:
#   enabled: false


extraInitContainers:
  ## Image that performs the sysctl operation to modify Kernel settings (needed sometimes to avoid boot errors)
  - name: sysctl
    image: docker.io/bitnami/bitnami-shell:10-debian-10-r199
    imagePullPolicy: "IfNotPresent"
    command:
      - /bin/bash
      - -ec
      - |
        CURRENT=`sysctl -n vm.max_map_count`;
        DESIRED="262144";
        if [ "$DESIRED" -gt "$CURRENT" ]; then
            sysctl -w vm.max_map_count=262144;
        fi;
        CURRENT=`sysctl -n fs.file-max`;
        DESIRED="65536";
        if [ "$DESIRED" -gt "$CURRENT" ]; then
            sysctl -w fs.file-max=65536;
        fi;
    securityContext:
      #runAsUser: 0
      privileged: true
      

Thank you, this works very well with the deprecated Kubernetes version 1.15 and version 2.11.0 of the official OpenSearch Helm chart.


9 participants