-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
packetbeat: add option to allow sniffer to change device when default route changes #32681
Conversation
42b301f
to
767ac40
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
767ac40
to
3c5e822
Compare
3c5e822
to
c04f430
Compare
Pinging @elastic/security-external-integrations (Team:Security-External Integrations) |
c04f430
to
fdaff9b
Compare
const timeSuffixFormat = "20060102150405" | ||
filename := fmt.Sprintf("%s-%s.pcap", s.config.Dumpfile, time.Now().Format(timeSuffixFormat)) | ||
logp.Info("creating new dump file %s", filename) | ||
f, err := os.Create(filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will result in file clobbering if more than one sniffer is run by the config manager. This is not worse than what we currently have, but worth being aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some testing by switching over to a Wireguard VPN on my macbook. It switches the default route over to a utun devices. The poller did not observe the change. I was expected this to result in a failure of some kind since Packetbeat cannot sniff "raw" devices.
# No VPN.
% route -n get default
route to: default
destination: default
mask: default
gateway: 10.100.11.1
interface: en0
flags: <UP,GATEWAY,DONE,STATIC,PRCLONING,GLOBAL>
recvpipe sendpipe ssthresh rtt,msec rttvar hopcount mtu expire
0 0 0 0 0 0 1500 0
# Wireguard enabled.
% route -n get default
route to: default
destination: default
mask: default
interface: utun5
flags: <UP,DONE,CLONING,STATIC,GLOBAL>
recvpipe sendpipe ssthresh rtt,msec rttvar hopcount mtu expire
0 0 0 0 0 0 1420 0
var err error | ||
last, worker, err = s.sniffOneDynamic(device, last, worker) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering about an edge case that we probably want to handle. If the default route moves to an unsupported link type temporarily, do we want Packetbeat to continue watching for changes even if it cannot sniff the current interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that should have a timeout or just continue forever? I'm leaning towards forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm thinking forever too. If I leave home and turn on my VPN, which cannot be monitored due to it being a raw device, I still want packetbeat to start monitoring when the VPN goes off.
side note: I think it would be useful to have a monitoring metric that reports the current interface name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will work for the default route case, but when there are multiple interfaces this won't work AFAICS. So I think the best thing to do is to publish it as the default route.
I tested two other scenarios:
|
ISTM that if we never fail out due to an error in reading the network, we cover both of #32681 (comment) and #32681 (comment). There is no reason to fail out for a read error (or probably any gopacket error) AFAICS. |
This pull request is now in conflicts. Could you fix it? 🙏
|
f3a656c
to
df66ff6
Compare
// If we are following a default route, request an interface | ||
// refresh and log the error. | ||
if s.followDefault { | ||
select { | ||
case refresh <- struct{}{}: | ||
default: | ||
// Don't request to refresh if already requested. | ||
} | ||
logp.Warn("error during packet capture: %v", err) | ||
continue | ||
} | ||
|
||
s.state.Store(snifferInactive) | ||
return fmt.Errorf("sniffing error: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This raises the question of how we treat errors on non-default route interfaces. I think it's arguable that they should also just keep trying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I can make arguments for it both ways.
If the interfaces goes way temporarily for some reason then I still want Packetbeat to resume when it comes back. I could let systemd/launchd/agent handle restarting the process continuously.
On the other hand, it is useful to have hard failure when testing out my config. I can quickly catch that I typo'ed the interface name if Packetbeat exits immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so an option would be to keep a state of whether there was ever a connection. If there was, keep trying, but if there wasn't, fail out. I would not want to add this in this change, but it is an option to consider in the future.
This pull request is now in conflicts. Could you fix it? 🙏
|
1b884dc
to
469ff29
Compare
… route changes Rather than attempting to append new sniffer device data to an established dump, file, dump files are rolled when a new device is used. The alternative would require keeping a track of link type and *pcapgo.Writer between calls to sniff to ensure that the dump file is not truncated and that the header is appropriately written. If the user needs to have a single pcap file for the session, they can be merged using either wireshark or mergecap.
Use this when there have been too many timeouts, or when there is an error condition during packet capture.
469ff29
to
41403c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The error handling in the sniffer w.r.t. unhandled interface types is something we should improve upon separately.
Absolutely. |
… route changes (#32681) * packetbeat: add option to allow sniffer to change device when default route changes Rather than attempting to append new sniffer device data to an established dump, file, dump files are rolled when a new device is used. The alternative would require keeping a track of link type and *pcapgo.Writer between calls to sniff to ensure that the dump file is not truncated and that the header is appropriately written. If the user needs to have a single pcap file for the session, they can be merged using either wireshark or mergecap. * packetbeat/sniffer: add logic to allow requests for interface refresh Use this when there have been too many timeouts, or when there is an error condition during packet capture. * register a default_route metric when following default route
What does this PR do?
Rather than attempting to append new sniffer device data to an established dump,
file, dump files are rolled when a new device is used. The alternative would
require keeping a track of link type and *pcapgo.Writer between calls to sniff
to ensure that the dump file is not truncated and that the header is
appropriately written.
If the user needs to have a single pcap file for the session, they can be merged
using either wireshark or mergecap.
Why is it important?
This makes packetbeat more useful on non-linux platforms that do not allow sniffing on the "any" device.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
Make the following change and run packetbeat locally with
packetbeat run -e
.You should see a set of files in the form
dump-<timestamp>.pcap
. Opening these with wireshark should show you the network activity on the default route. Changing network interface should not stop capture, though there may be interruptions up to 20s long.Related issues
Use cases
Screenshots
Logs