[FEATURES] Add the PodAffinity to the Dataset CRD #3496

Closed
dashanji opened this issue Oct 19, 2023 · 3 comments
Labels: features

Comments

@dashanji
Contributor

What feature you'd like to add:

Add the PodAffinity to the Dataset Spec.

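type DatasetSpec struct {
	// ... existing fields ...

	// PodAffinity constrains scheduling relative to other Pods.
	PodAffinity corev1.PodAffinity

	// ... existing fields ...
}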

Why is this feature needed:

When the dataset is not cached by Node, but cached by Pod, the PodAffinity field is more appropriate.

BTW, I can submit a PR to address it if the feature is approved.

@dashanji added the features label on Oct 19, 2023
@TrafalgarZZZ
Member

@dashanji Thanks for the feature request! I think that's a great feature, but I'm not sure what you mean by "not cached by Node, but cached by Pod". Would you mind explaining it in more detail? An example would be appreciated, to make sure we're on the same page about the feature.

@dashanji
Contributor Author

dashanji commented Oct 20, 2023

Hi @TrafalgarZZZ, I'm working on an integration and want to use Fluid as the launcher for our storage system, Vineyard.

Vineyard is a storage engine that uses a socket as its access interface. Every application using Vineyard must therefore run on the same node as Vineyard, because it has to connect to the same socket. Inspired by this example, we can leverage Fluid's Dataset and ThinRuntime to implement socket mounting. The main steps are as follows.

I used kind to create a k8s cluster. Assume there are 1 master node and 3 worker nodes.

  1. Deploy Fluid.

  2. Deploy a Vineyard Deployment containing only one replica.

  3. Create the configuration script configure-vineyard-socket.py.

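import json

# Read the runtime configuration from /etc/fluid/config.json (first line only).
with open("/etc/fluid/config.json", "r") as f:
    raw_str = f.readline()
print(raw_str)

# Body of the generated script: bind-mount the socket directory onto the
# target path, and mount again whenever the socket is missing.
script = """
set -ex

mkdir -p "$targetPath"
while true; do
    if [ ! -S "$targetPath/vineyard.sock" ]; then
        mount --bind "$socketPath" "$targetPath"
    fi
    sleep 10
done
"""

obj = json.loads(raw_str)

# Strip the local:// scheme, if present, so only the host path remains.
socket_path = obj["mounts"][0]["mountPoint"]
if socket_path.startswith("local://"):
    socket_path = socket_path[len("local://"):]

with open("mount-vineyard-socket.sh", "w") as f:
    # The shebang only takes effect when it is the first line of the file.
    f.write("#!/bin/sh\n")
    f.write('targetPath="%s"\n' % obj["targetPath"])
    f.write('socketPath="%s"\n' % socket_path)
    f.write(script)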

  4. Create the following Profile.

apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntimeProfile
metadata:
  name: vineyard-profile
spec:
  fileSystemType: fuse
  volumes:
  - name: vineyard-socket
    hostPath:
      # This path should be the same as the vineyard socket path in the vineyard deployment
      path: /var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
      type: DirectoryOrCreate
  fuse:
    image: configure-vineyard-socket
    imageTag: latest
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: vineyard-socket
      mountPath: /var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
    command:
    - sh
    - -c
    - "python3 /configure-vineyard-socket.py && chmod u+x ./mount-vineyard-socket.sh && ./mount-vineyard-socket.sh"

  5. Create the dataset with PodAffinity, so that the generated vineyard-fuse-pod is scheduled onto the same node as the previously deployed Vineyard Pod.

apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: vineyard
spec:
  mounts:
  # This directory should be the same as the vineyard socket directory in the vineyard deployment
  - mountPoint: local:///var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
    name: vineyard
  accessModes:
  - ReadWriteMany
  ################## Added #################
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        ###### label_that_match_vineyard_pod ######
        - key: app.kubernetes.io/instance
          operator: In
          values:
          - vineyard-system-vineyardd-sample
      topologyKey: kubernetes.io/hostname
  #########################################
---
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntime
metadata:
  name: vineyard
spec:
  profileName: vineyard-profile

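The configuration script from step 3 can be checked without a cluster. Below is a minimal sketch that factors the same logic into a function and feeds it a sample config. The function name `render_mount_script` and the sample `targetPath` value are hypothetical, and the JSON shape used here (`targetPath`, `mounts[0].mountPoint`) is only what the script reads, not a full description of Fluid's config file.

```python
import json

# Loop that the generated script runs inside the fuse container.
LOOP_BODY = """set -ex

mkdir -p "$targetPath"
while true; do
    if [ ! -S "$targetPath/vineyard.sock" ]; then
        mount --bind "$socketPath" "$targetPath"
    fi
    sleep 10
done
"""


def render_mount_script(raw_config: str) -> str:
    """Hypothetical helper: build the mount script text from the JSON config."""
    obj = json.loads(raw_config)
    socket_path = obj["mounts"][0]["mountPoint"]
    # Drop the local:// scheme so only the host path remains.
    if socket_path.startswith("local://"):
        socket_path = socket_path[len("local://"):]
    header = '#!/bin/sh\ntargetPath="%s"\nsocketPath="%s"\n' % (
        obj["targetPath"],
        socket_path,
    )
    return header + LOOP_BODY


# Sample input; the targetPath value is made up for illustration.
sample = json.dumps({
    "targetPath": "/tmp/vineyard-target",
    "mounts": [{
        "mountPoint": "local:///var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample",
    }],
})
generated = render_mount_script(sample)
print(generated)
```

The sketch emits the shebang as the first line, followed by the two variable assignments, so the variables are defined before the loop uses them.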
> I think that's a great feature, but I'm not sure what you mean by "not cached by Node, but cached by Pod".

In the example above, the dataset depends on a specific Pod (the Vineyard Pod) rather than on a Node.
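To make the intended semantics concrete, here is a simplified sketch of how a required pod-affinity term with `topologyKey: kubernetes.io/hostname` narrows the candidate nodes: only nodes that already run a Pod matching the label selector stay eligible. This is a toy model, not the Kubernetes scheduler or Fluid code. It ignores namespaces and handles only the `In` operator, and the function names, pod names, and the node the Vineyard Pod runs on are assumptions for illustration.

```python
from typing import Dict, List


def matches(labels: Dict[str, str], expressions: List[dict]) -> bool:
    """True if the labels satisfy every matchExpression (only `In` is modelled)."""
    for expr in expressions:
        if expr["operator"] != "In":
            raise NotImplementedError(expr["operator"])
        if labels.get(expr["key"]) not in expr["values"]:
            return False
    return True


def eligible_nodes(nodes: Dict[str, Dict[str, str]], pods: List[dict], term: dict) -> List[str]:
    """Nodes that share the topology value with at least one matching Pod."""
    key = term["topologyKey"]
    exprs = term["labelSelector"]["matchExpressions"]
    # Topology values (here: hostnames) of the nodes hosting a matching Pod.
    domains = {
        nodes[pod["node"]][key]
        for pod in pods
        if matches(pod["labels"], exprs)
    }
    return sorted(name for name, labels in nodes.items() if labels.get(key) in domains)


# Assumed cluster layout: three kind workers, Vineyard running on kind-worker2.
nodes = {
    name: {"kubernetes.io/hostname": name}
    for name in ["kind-worker", "kind-worker2", "kind-worker3"]
}
pods = [
    {
        "name": "vineyardd-sample",
        "node": "kind-worker2",
        "labels": {"app.kubernetes.io/instance": "vineyard-system-vineyardd-sample"},
    },
    {"name": "unrelated", "node": "kind-worker", "labels": {"app": "nginx"}},
]
term = {
    "labelSelector": {
        "matchExpressions": [{
            "key": "app.kubernetes.io/instance",
            "operator": "In",
            "values": ["vineyard-system-vineyardd-sample"],
        }],
    },
    "topologyKey": "kubernetes.io/hostname",
}

print(eligible_nodes(nodes, pods, term))  # ['kind-worker2']
```

With a node-based constraint, the target hostname has to be known up front. With the pod-based term, the eligible node follows wherever the Vineyard Pod happens to run.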

However, in practice I ran into a problem: with the following nodeAffinity, the fuse pod is not scheduled to the specified node (kind-worker3).

apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: vineyard
spec:
  mounts:
  # This directory should be the same as the vineyard socket directory in the vineyard deployment
  - mountPoint: local:///var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
    options:
      vineyard-socket-directory: /var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
    name: vineyard
  accessModes:
  - ReadWriteMany
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
            - kind-worker3
---
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntime
metadata:
  name: vineyard
spec:
  profileName: vineyard-profile

apiVersion: v1
kind: Pod
metadata:
  name: vineyard-test
  labels:
    fuse.serverful.fluid.io/inject: "true"
    fluid.io/dataset.vineyard.sched: required
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - mountPath: /data
          name: vineyard-data
  volumes:
    - name: vineyard-data
      persistentVolumeClaim:
        claimName: vineyard

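For reference, the `nodeAffinity` block in the Dataset above is an ordinary node selector: the terms are ORed, and the expressions inside one term are ANDed. The sketch below evaluates that selector against node labels to show which nodes it is expected to admit. It is a simplified model (only `In` and `NotIn`), not Fluid's or the scheduler's code, and it does not explain the behaviour reported here. The helper names and the node list are assumptions.

```python
from typing import Dict, List


def expr_matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate one matchExpression against a node's labels."""
    value = labels.get(expr["key"])
    if expr["operator"] == "In":
        return value in expr["values"]
    if expr["operator"] == "NotIn":
        return value not in expr["values"]
    raise NotImplementedError(expr["operator"])


def node_admitted(labels: Dict[str, str], selector_terms: List[dict]) -> bool:
    """Terms are ORed; expressions within a term are ANDed."""
    for term in selector_terms:
        exprs = term.get("matchExpressions", [])
        # An empty term matches nothing, so require at least one expression.
        if exprs and all(expr_matches(labels, e) for e in exprs):
            return True
    return False


# The selector from the Dataset above.
terms = [{
    "matchExpressions": [{
        "key": "kubernetes.io/hostname",
        "operator": "In",
        "values": ["kind-worker3"],
    }],
}]

# Assumed node names for a kind cluster with 1 master and 3 workers.
nodes = {
    name: {"kubernetes.io/hostname": name}
    for name in ["kind-control-plane", "kind-worker", "kind-worker2", "kind-worker3"]
}

admitted = [name for name, labels in nodes.items() if node_admitted(labels, terms)]
print(admitted)  # ['kind-worker3']
```

Comparing this expected result with the node shown by `kubectl get pods -o wide` for the fuse pod is one way to confirm whether the constraint is being applied.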
Is there something I missed? Looking forward to your reply. Thanks.

@dashanji
Contributor Author

dashanji commented Feb 5, 2024

Closed via #3528.

@dashanji closed this as completed on Feb 5, 2024