Allow multiple firecracker test shards to run concurrently on a single machine #7439
// Set up a symlink in PATH so that 'iptables' points to 'iptables-legacy'.
// Our Firecracker setup does not yet have nftables enabled and can't use
// the newer iptables.
iptablesLegacyPath, err := exec.LookPath("iptables-legacy")
require.NoError(t, err)
overrideBinDir := testfs.MakeTempDir(t)
err = os.Symlink(iptablesLegacyPath, filepath.Join(overrideBinDir, "iptables"))
require.NoError(t, err)
err = os.Setenv("PATH", overrideBinDir+":"+os.Getenv("PATH"))
require.NoError(t, err)
I got rid of this hack because it broke networking on my local machine (because the firewall rules were only being added to the legacy tables rather than nftables). It seems like this is no longer needed for our networking tests to pass, maybe because we updated our firecracker kernel config.
	return nil, fmt.Errorf("make lock dir %q: %s", *networkLockDir, err)
}
path := filepath.Join(*networkLockDir, fmt.Sprintf("ip_range.%d.lock", netIdx))
f, err := os.Create(path)
Sanity check - if path already exists (because another process already holds the lock), will os.Create return the same file descriptor?
It'll return a different file descriptor (file descriptors are local to a process), but the returned descriptor will point to the same underlying file (inode), so flock should work here. (I also checked this with some debug logging.)
LGTM - but maybe check with Iain because I remember he had some thoughts/alternate ideas for other ways to implement this
Sure - for reference, I dug up the thread: https://buildbuddy-corp.slack.com/archives/C04R9V7H28N/p1697734759336109 A couple of thoughts after reading through it:
Follow-up from #7429
Adds flock-based IP range locking to prevent concurrent test shards from trying to use the same IP range concurrently.
Sets shard_count on firecracker_test. (This won't actually buy us a ton of concurrency until we disable exclusive task scheduling on the bare pool.)
Related issues: https://github.com/buildbuddy-io/buildbuddy-internal/issues/3758