
kind: job need toleration #1541

Open · julianobarbosa opened this issue Sep 4, 2024 · 2 comments

@julianobarbosa

Describe the bug
There is a toleration problem with job scheduling in the Kubernetes cluster: the job is expected to tolerate specific taints on the nodes, but it does not, so its pod cannot be scheduled and the job is stuck in the Pending state.
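
For reference, a minimal sketch of the kind of toleration the job's pod template would need, written with the official Kubernetes Python client for illustration; the taint key, value, and image are placeholders, not the cluster's actual values:

from kubernetes import client

# Toleration for a hypothetical NoSchedule taint on the node.
toleration = client.V1Toleration(
    key="dedicated",        # placeholder taint key
    operator="Equal",
    value="batch",          # placeholder taint value
    effect="NoSchedule",
)

# Job whose pod template carries that toleration.
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="example-job"),  # placeholder name
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                tolerations=[toleration],
                containers=[client.V1Container(name="main", image="busybox")],
                restart_policy="Never",
            )
        )
    ),
)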

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the job with the specified toleration in the YAML file.
  2. Apply the job configuration to the Kubernetes cluster.
  3. Check the status of the job using kubectl get jobs.
  4. Observe that the job remains in a pending state due to node taints.

Expected behavior
The job should tolerate the node taints via the tolerations specified in its configuration and be scheduled on the appropriate node, moving from the Pending to the Running state.


Additional context
This issue may be related to the configuration of the taints and tolerations in the job's YAML file. Please verify that the taint keys and values match the node's taints exactly.
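
To make the matching rule concrete, here is a sketch with hypothetical values, again using the Kubernetes Python client: a taint as it would appear on the Node object, and the pod-side toleration that matches it on key, value, and effect.

from kubernetes import client

# Taint on the node, e.g. applied with:
#   kubectl taint nodes <node-name> dedicated=batch:NoSchedule
taint = client.V1Taint(key="dedicated", value="batch", effect="NoSchedule")

# Matching toleration for the pod spec: with operator "Equal", the key,
# value, and effect must all equal the taint's fields for the pod to schedule.
toleration = client.V1Toleration(
    key=taint.key, operator="Equal", value=taint.value, effect=taint.effect
)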


github-actions bot commented Sep 4, 2024

Hi 👋, thanks for opening an issue! Please note, it may take some time for us to respond, but we'll get back to you as soon as we can!

  • 💬 Slack Community: Join the Robusta team and other contributors on Slack here.
  • 📖 Docs: Find our documentation here.
  • 🎥 YouTube Channel: Watch our videos here.

@julianobarbosa (Author)

File location: playbooks/robusta_playbooks/disk_benchmark.py

This is the relevant snippet; note that none of the parameters shown provides a toleration for the benchmark job.

# PodRunningParams and INSTALLATION_NAMESPACE are provided by the Robusta
# playbook API (imported elsewhere in the playbook file).
class DiskBenchmarkParams(PodRunningParams):
    """
    :var pvc_name: Name of the PVC created for the benchmark.
    :var test_seconds: The benchmark duration, in seconds.
    :var namespace: Namespace used for the benchmark.
    :var disk_size: The size of the PVC used for the benchmark.
    :var storage_class_name: PVC storage class, chosen from the cluster's available storage classes (standard, fast, etc.).
    """

    pvc_name: str = "robusta-disk-benchmark"
    test_seconds: int = 20
    namespace: str = INSTALLATION_NAMESPACE
    disk_size: str = "10Gi"
    storage_class_name: str
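
As a purely hypothetical sketch of the change this issue asks for, the params class could expose a tolerations field that the playbook then copies onto the benchmark job's pod spec; the field name, type, and plumbing below are illustrative, not Robusta's actual API:

from typing import Dict, List, Optional

# Hypothetical extension only. PodRunningParams, INSTALLATION_NAMESPACE, and
# the existing fields are as in the snippet above.
class DiskBenchmarkParams(PodRunningParams):
    pvc_name: str = "robusta-disk-benchmark"
    test_seconds: int = 20
    namespace: str = INSTALLATION_NAMESPACE
    disk_size: str = "10Gi"
    storage_class_name: str
    # New: tolerations to copy onto the benchmark job's pod spec, e.g.
    # [{"key": "dedicated", "operator": "Equal",
    #   "value": "batch", "effect": "NoSchedule"}]
    tolerations: Optional[List[Dict[str, str]]] = None

Whatever the exact shape, the key point is that the toleration values must reach the job's pod template so the scheduler can match them against the node taints.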
