Use the Zeek is_orig field to set files tx_host/rx_host #3004
A recent Zeek community Slack thread educated me about the `is_orig` field of Zeek `files` logs. As those docs explain, this field indicates whether the file is being sent by the originator of the connection or by the responder.

In the changes I pushed in #2981 I wasn't yet aware of this, so for the result in the Correlation view I'd just set the `tx_host` to be the `id.resp_h` of the `files` event and likewise set the `rx_host` to be the `id.orig_h`. This all makes sense for the common case of a file download, but it's incorrect for file uploads.

To illustrate the effect I've attached some test data, query-aws.pcapng.gz, which is a capture of my laptop (IP
`199.83.220.169`) performing a query over the lake API to a Zed service running on an AWS EC2 instance (IP `3.138.203.14`).

When this pcap is imported into Zui, Zeek ends up finding two
`files` events within this single connection. The first is a log of the query payload (`{"query":"from inventory@main | count() by warehouse"}`) and the second is the query response (`{warehouse:"chicago",count:2(uint64)} {warehouse:"miami",count:1(uint64)}`).

With Zui commit cf615ef, which is the current tip of
`main` before this PR's branch, both `files` events show the `tx_host` to be the same value: that of the AWS instance.

Now at commit ae287aa using the branch for this PR, the first
`files` event shows my laptop as the `tx_host`, which is more in line with expectations since my laptop is what "originated" the sending of the query payload.

Meanwhile, the second `files` event shows the AWS instance as the `tx_host`, which also makes sense since it's what "originated" the sending of the query response.
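
The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the actual Zui code: the `FilesEvent` and `FileHosts` types and the `fileHosts` function are hypothetical names, and falling back to the download-style mapping when `is_orig` is absent is an assumption.

```typescript
// Hypothetical subset of the fields read from a Zeek files event.
interface FilesEvent {
  id: { orig_h: string; resp_h: string };
  is_orig?: boolean; // assumed optional, so it may be missing
}

// Hypothetical result shape for the Correlation view.
interface FileHosts {
  tx_host: string;
  rx_host: string;
}

function fileHosts(event: FilesEvent): FileHosts {
  if (event.is_orig === true) {
    // The connection originator sent the file (an upload).
    return { tx_host: event.id.orig_h, rx_host: event.id.resp_h };
  }
  // The responder sent the file (a download). This branch also covers a
  // missing is_orig, which is an assumption made for this sketch.
  return { tx_host: event.id.resp_h, rx_host: event.id.orig_h };
}

// The two files events from the attached pcap share one connection:
// the laptop is the originator and the AWS instance is the responder.
const conn = { orig_h: "199.83.220.169", resp_h: "3.138.203.14" };
const queryPayload = fileHosts({ id: conn, is_orig: true });
const queryResponse = fileHosts({ id: conn, is_orig: false });
console.log(queryPayload.tx_host, queryResponse.tx_host);
```

With these inputs the query payload gets the laptop as its `tx_host` and the query response gets the AWS instance, matching the behavior seen at commit ae287aa.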