Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event #2082

Merged
merged 4 commits into from
Aug 26, 2022

Conversation

jasondellaluce
Copy link
Contributor

@jasondellaluce jasondellaluce commented Jun 21, 2022

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area engine

What this PR does / why we need it:

This PR is focused on guaranteeing (and documenting) that concurrent access to falco_engine::process_event() is safe for different event sources.

Since the new plugin system got implemented, and due to the recent refactoring of the Falco engine for Falco v0.32, the engine is not internally split by event sources. This means that the engine contains multiple rulesets, each mapped 1-1 with an event source. Accordingly, it is possible to assume that falco_engine::process_event() is now thread-safe with some assumptions:

  1. Each concurrent invoker must use a distinct and unique source_idx
  2. Each concurrent invoker must not switch values of source_idx across subsequent calls to falco_engine::process_event()
  3. Each concurrent invoker must use pass a different event (which is an implicit consequence of point 1)
  4. Filterchecks and formatters used to populate the conditions for a given event-source ruleset are not reused across rulesets of other event sources

Considering the above, the only synchronization point is stats_manager::on_event(), which has been modified to be thread-safe and lock-free.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Thread safety is not a requirement for Falco, but future developments such as the ones described in #2074 could require it. Among the many things that can be useful to make thread safe, the most crucial one is the rule-event matching routine, of which falco_engine::process_event() is entrypoint. With this implementation, it is possible to consider processing events coming from different event sources in different parallel threads.

Assuming a single-threaded scenario, the changes involved in stats_manager::on_event() are a no-op in both expected behavior and performance. This is true because stats_manager::on_event() is invoked when an event matches a given rule exclusively, which happens with less frequency than the production of events by some orders of magnitude. As such, we can consider these changes as non-noticeable for single-thread scenarios (which is what we have right now). Moreover, when events are matched with a rule we then have to queue them in the Falco output engine, which is still an asynchronous sequential synchronization point.

Does this PR introduce a user-facing change?:

refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event

@poiana poiana requested review from leodido and mfdii June 21, 2022 17:24
@jasondellaluce jasondellaluce changed the title refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event wip: refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event Jun 21, 2022
@jasondellaluce
Copy link
Contributor Author

/milestone 0.33.0

@poiana poiana added this to the 0.33.0 milestone Jun 21, 2022
std::vector<uint64_t> m_by_priority;
std::vector<uint64_t> m_by_rule_id;
atomic<uint64_t> m_total;
std::vector<std::unique_ptr<atomic<uint64_t>>> m_by_priority;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the usage of unique_ptr is required because atomic<> does not have copy nor move constructors, which makes them unusable in vectors as simple values.

@jasondellaluce jasondellaluce changed the title wip: refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event refactor(userspace/engine): strengthen and document thread-safety guarantees of falco_engine::process_event Jun 22, 2022
@jasondellaluce jasondellaluce force-pushed the ref/statsmanager-tsafe branch 2 times, most recently from ff30d60 to 343ceac Compare June 23, 2022 16:53
Copy link
Member

@leogr leogr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@poiana
Copy link
Contributor

poiana commented Aug 26, 2022

LGTM label has been added.

Git tree hash: 722c6bad23695db9ab69de98bb4de599ab91fbdc

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Aug 26, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP, jasondellaluce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [FedeDP,jasondellaluce]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit c2a8efc into master Aug 26, 2022
@poiana poiana deleted the ref/statsmanager-tsafe branch August 26, 2022 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants