[packetbeat] Expire source port mappings. #41581

Merged
merged 2 commits into from
Nov 12, 2024

Commits on Nov 12, 2024

  1. [packetbeat] Expire source port mappings.

    port->pid mappings were only ever overwritten, never expired, and the
    overwriting mechanism has several issues:
     - It only overwrites if it manages to find the new pid, so it misses
    short-lived processes.
     - It only refreshes the mapping of a given port if a packet arriving on
    _another_ port misses the lookup (otherwise the original port is found and
    returned). This means that once all ports have been used at least once, the
    cache is full and never mutated again.
    
    The observable effect is that the user will see wrong process correlations _to_
    older/long-lived processes. Imagine the following:
     - A long-lived process makes a _short_-lived TCP connection from src_port S.
     - Years later, a _short_-lived process makes a TCP connection to somewhere
    else, but from the same src_port S. It hits the cache, since there is a mapping
    for S, so packetbeat incorrectly correlates the new short-lived process's
    connection with the old long-lived process.
    
    Related to a very long SDH; a more in-depth explanation of the bug, with a
    program to reproduce it, can be found here:
     - elastic/sdh-beats#4604 (comment)
     - elastic/sdh-beats#4604 (comment)
    
    The solution is to discard mappings that are "old enough", using a hardcoded
    window of 10 seconds. As long as the port is not reused within this window,
    we are fine.
    
    This also makes sure the cache never becomes "immutable", since mappings will
    invariably get old, forcing a refresh.
    
    It's a very conservative approach, as I don't want to introduce other bugs by
    redesigning it; work is under way to change how the cache works on Linux
    anyway.
    
    While here, I noticed the locking was also wrong: we were doing the lookup
    unlocked, and then had to relock whenever the mapping needed updating.
    Change this to grab the lock once and only once; interleaving is bad.
    haesbaert committed Nov 12, 2024
    Commit 4ed9319
  2. Commit 60e6a5f