SRIOV lane: Mount in memory directory for backing in-memory etcd data #709

Closed
wants to merge 1 commit into from

ormergi
Contributor

@ormergi ormergi commented Nov 15, 2020

Currently the KIND cluster performs poorly on the DinD setup:
SRIOV lane jobs hit 'etcdserver: timeout' errors, which often cause them to fail.

In such cases it is recommended to run in-memory etcd
kubernetes-sigs/kind#1922
kubernetes-sigs/kind#845
Running etcd in memory should improve performance and make the SRIOV lane more stable.
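
For context, one commonly described way to run etcd in memory on a KIND cluster is to point the etcd data directory at a tmpfs-backed path inside the node. The following is a minimal sketch under that assumption, not the configuration used by this PR; the `/tmp/etcd` path is an assumption and relies on `/tmp` being a tmpfs mount inside the node container.

```yaml
# Sketch of a kind cluster config (assumption: /tmp inside the node is tmpfs).
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: ClusterConfiguration
  etcd:
    local:
      # Assumed path; etcd data written here lives in memory and is lost on restart.
      dataDir: /tmp/etcd
```

With such a config, etcd no longer waits on slow disk syncs, at the cost of losing cluster state whenever the node container restarts, which is usually acceptable for short-lived CI clusters.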

This PR provides an in-memory directory to back the etcd data directory.
kubevirt/kubevirtci#478
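
As an illustration of what an in-memory backing directory can look like in a job's pod spec, here is a hedged sketch using a memory-backed `emptyDir` volume. The volume name `etcd-data` follows the reviewed diff; the container name and `mountPath` are hypothetical and are not taken from this PR.

```yaml
# Hypothetical fragment of a Prow job pod spec; not the PR's actual change.
spec:
  containers:
  - name: test                      # hypothetical container name
    volumeMounts:
    - name: etcd-data
      mountPath: /mnt/etcd-data     # assumed path, for illustration only
  volumes:
  - name: etcd-data
    emptyDir:
      medium: Memory                # back the volume with tmpfs (RAM)
```

Note that a memory-backed `emptyDir` counts against the container's memory limit, so the job's memory request may need to grow accordingly.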

@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Nov 15, 2020
@ormergi
Contributor Author

ormergi commented Nov 15, 2020

@@ -98,6 +98,8 @@ presubmits:
    mountPath: /sys/fs/cgroup
  - name: vfio
    mountPath: /dev/vfio/
  - name: etcd-data
Contributor

This is etcd-data-dir

Contributor Author

Fixed

Contributor

@oshoval oshoval left a comment

In my opinion, a retry or another ad hoc solution would be better until we have faster disks (SSD), because ramfs might be too sterile
and could mask problems or opportunities (the latest SRIOV changes are an example).

@ormergi ormergi requested a review from qinqon November 16, 2020 09:57
@ormergi ormergi force-pushed the sriov_lane_in_memory_etcd branch 4 times, most recently from 4af25d8 to 9f318a6 Compare November 16, 2020 10:51
@ormergi
Contributor Author

ormergi commented Nov 16, 2020

/rehearse check-up-kind-1.17-sriov

@danielBelenky
Contributor

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@kubevirt-bot
Contributor

@ormergi: GitHub didn't allow me to request PR reviews from the following users: cancel.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @EdDev cancel

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/uncc @EdDev @oshoval @qinqon @omeryahud @dhiller @danielBelenky @cynepco3hahue

Not ready for review yet

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi ormergi force-pushed the sriov_lane_in_memory_etcd branch 2 times, most recently from 1b2edd4 to bf05053 Compare November 17, 2020 14:53
@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

@qinqon
Running the check-up job with kubevirt/kubevirtci#478 on top, using the rehearsal plugin.
https://prow.apps.ovirt.org/view/gcs/kubevirt-prow/pr-logs/pull/kubevirt_project-infra/709/rehearsal-check-up-kind-1.17-sriov/1328713493427261440

EDIT:
The changes from cluster-up were not reflected.
This job runs the kubevirt tests with kubevirt/kubevirtci#478 on top, using the rehearsal plugin.
It shows that running the tests with etcd in memory works.
https://prow.apps.ovirt.org/view/gcs/kubevirt-prow/pr-logs/pull/kubevirt_project-infra/709/rehearsal-pull-kubevirt-e2e-kind-1.17-sriov/1328713493427261441

@ormergi
Contributor Author

ormergi commented Nov 17, 2020

/rehearse

@ormergi
Contributor Author

ormergi commented Nov 18, 2020

/rehearse

@qinqon
Contributor

qinqon commented Nov 18, 2020

/hold
This is a testing PR right now; it has to be cleaned up before it can be merged.

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 18, 2020
@ormergi
Contributor Author

ormergi commented Nov 18, 2020

/hold
This is a testing PR right now; it has to be cleaned up before it can be merged.

Done
I also added another commit that ensures cluster teardown, to prevent resource leaks.

@ormergi ormergi force-pushed the sriov_lane_in_memory_etcd branch 2 times, most recently from 4588af4 to 0cdb49a Compare November 20, 2020 11:23
@ormergi
Contributor Author

ormergi commented Nov 20, 2020

Rebased

Currently the KIND cluster performs poorly on the
DinD setup; we get 'etcdserver: timeout' errors
that often cause jobs to fail.
Running etcd in memory should improve performance and
make SRIOV jobs more stable.

This commit creates an in-memory directory to back the etcd
data directory.

Signed-off-by: Or Mergi <[email protected]>
@ormergi
Contributor Author

ormergi commented Nov 22, 2020

/hold

Looks like we won't need it

@ormergi
Contributor Author

ormergi commented Nov 23, 2020

/close

No need to create a mount for etcd data anymore

@kubevirt-bot
Contributor

@ormergi: Closed this PR.

In response to this:

/close

No need to create a mount for etcd data anymore


Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/S

5 participants