Validation Error when using a macro as definition of an overwritten macro. #706

Closed
natalysheinin opened this issue Jul 4, 2019 · 41 comments · Fixed by #1966

Comments

@natalysheinin
Contributor

What happened / Replication Steps:

Take a macro that is defined in the default ruleset. To be concrete, let's use allowed_clear_log_files, which the Clear Log Activities rule references:

- macro: allowed_clear_log_files
  condition: (never_true)
1. Now, if you overwrite this macro in falco_rules.local.yaml, for example:
- macro: allowed_clear_log_files
  condition: >
    (container.name = "I am testing something")

When I run my validation script:

falco -c /etc/falco/falco.yml.erb \
  --validate /etc/falco/falco_rules.yml.erb \
  --validate /etc/falco/falco_macros.local.yml.erb

Everything succeeds 🎉

2. However, if I overwrite this macro in falco_rules.local.yaml using another macro, for example:
- macro: allowed_clear_log_files
  condition: >
    consider_all_chmods

I get the following error message:
Runtime error: Error loading rules: /usr/share/falco/lua/compiler.lua:65: Undefined macro 'consider_all_chmods' used in filter.. Exiting.

(It is important to note that consider_all_chmods is defined in the default ruleset, so I am not sure why it is failing here.)

Anything else we need to know?:

Environment:

  • Falco version (use falco --version): falco 0.14.0
  • System info
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools (e.g. in kubernetes, rpm, deb, from source):
  • Others:
@fntlnz
Contributor

fntlnz commented Jul 5, 2019

@leodido and I debugged this for a while this morning.

We were able to reproduce this. Thanks for reporting it, @natalysheinin!

This does not work as expected because, in the default falco_rules.yaml, consider_all_chmods is defined after allowed_clear_log_files, the macro that now uses it.

We crafted a minimum reproducible example to test it:

falco_rules.yaml

- macro: never_true
  condition: (evt.num=0)
- macro: open_write
  condition: (evt.type=open or evt.type=openat) and evt.is_open_write=true and fd.typechar='f' and fd.num>=0

- macro: allowed_clear_log_files
  condition: (never_true)

- macro: consider_all_chmods
  condition: (never_true)

- rule: Clear Log Activities
  desc: Detect clearing of critical log files
  condition: >
    open_write and
    not allowed_clear_log_files
  output: >
    Log files were tampered (user=%user.name command=%proc.cmdline file=%fd.name container_id=%container.id image=%container.image.repository)
  priority:
    WARNING
  tags: [file, mitre_defense_evasion]

falco_rules.local.yaml

- macro: allowed_clear_log_files
  condition: (consider_all_chmods)

That set of rules, in that order, fails validation just as in your report (the exact message differs on our development build):

./build/userspace/falco/falco -c falco.yaml --validate rules/falco_rules.yaml --validate rules/falco_rules.local.yaml
Fri Jul  5 12:22:39 2019: Validating rules file(s):
Fri Jul  5 12:22:39 2019:    rules/falco_rules.yaml
Fri Jul  5 12:22:39 2019:    rules/falco_rules.local.yaml
Fri Jul  5 12:22:39 2019: Runtime error: Error loading rules: ...ts/falcosecurity/falco/userspace/engine/lua/compiler.lua:74: attempt to index a nil value. Exiting.

If we then swap the order of allowed_clear_log_files and consider_all_chmods in falco_rules.yaml, validation passes.

We observed this with the reduced example, but the same holds if you just swap the order in the upstream falco_rules.yaml.

Here's the falco_rules.yaml with the macros swapped:

- macro: never_true
  condition: (evt.num=0)
- macro: open_write
  condition: (evt.type=open or evt.type=openat) and evt.is_open_write=true and fd.typechar='f' and fd.num>=0

- macro: consider_all_chmods
  condition: (never_true)
  
- macro: allowed_clear_log_files
  condition: (never_true)

- rule: Clear Log Activities
  desc: Detect clearing of critical log files
  condition: >
    open_write and
    not allowed_clear_log_files
  output: >
    Log files were tampered (user=%user.name command=%proc.cmdline file=%fd.name container_id=%container.id image=%container.image.repository)
  priority:
    WARNING
  tags: [file, mitre_defense_evasion]

Here's the result:

./build/userspace/falco/falco -c falco.yaml --validate rules/falco_rules.yaml --validate rules/falco_rules.local.yaml
Fri Jul  5 12:25:22 2019: Validating rules file(s):
Fri Jul  5 12:25:22 2019:    rules/falco_rules.yaml
Fri Jul  5 12:25:22 2019:    rules/falco_rules.local.yaml
Fri Jul  5 12:25:22 2019: Ok
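
To make the order dependence concrete, here is a minimal Python sketch of a single-pass, order-sensitive resolver. It is a toy model, not Falco's Lua loader: each macro is reduced to a name plus the macro names it references, and the helper first_undefined is hypothetical. It also assumes that the override from falco_rules.local.yaml takes the slot of the original definition, which matches the explanation given further down in this thread.

```python
def first_undefined(ordered_macros):
    """Walk macros in processing order.

    Returns (macro, missing_reference) for the first reference to a macro
    that has not been defined yet, or None if every reference resolves.
    """
    defined = set()
    for name, refs in ordered_macros:
        for ref in refs:
            if ref not in defined:
                return (name, ref)
        defined.add(name)
    return None


# Effective order with the original falco_rules.yaml: the overridden
# allowed_clear_log_files is processed before consider_all_chmods.
original_order = [
    ("never_true", []),
    ("open_write", []),
    ("allowed_clear_log_files", ["consider_all_chmods"]),  # overridden
    ("consider_all_chmods", ["never_true"]),
]

# Effective order after swapping the two macros in falco_rules.yaml.
swapped_order = [
    ("never_true", []),
    ("open_write", []),
    ("consider_all_chmods", ["never_true"]),
    ("allowed_clear_log_files", ["consider_all_chmods"]),  # overridden
]

print(first_undefined(original_order))
print(first_undefined(swapped_order))
```

In this model the first call reports allowed_clear_log_files as referencing the not-yet-defined consider_all_chmods, and the second call finds nothing missing.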

@fntlnz
Contributor

fntlnz commented Jul 5, 2019

/assign @leodido
/assign @fntlnz

@fntlnz
Contributor

fntlnz commented Jul 5, 2019

/area rules

@leodido
Member

leodido commented Jul 5, 2019

While swapping the macro definitions in the rules file works, it cannot be considered a fix for this bug, for 2 reasons:

  1. changing the order in the upstream rules would fix this issue for only half of the users, because the other half would then hit the reverse problem
  2. we cannot rely on ordering based on the original position of the overridden macro definition.

So we think that, in order to truly fix this bug, the way the rule_loader.lua processes the macro definitions needs to be changed to be agnostic with respect to the position of the overridden macro definition.

@natalysheinin
Contributor Author

This is great! I'm glad you were able to easily replicate this scenario. So now I have another (similar) scenario.

Take for instance the exact same default rule file as you mentioned:

falco_rules.yaml

- macro: never_true
  condition: (evt.num=0)
- macro: open_write
  condition: (evt.type=open or evt.type=openat) and evt.is_open_write=true and fd.typechar='f' and fd.num>=0

- macro: allowed_clear_log_files
  condition: (never_true)

- macro: consider_all_chmods
  condition: (never_true)

- rule: Clear Log Activities
  desc: Detect clearing of critical log files
  condition: >
    open_write and
    not allowed_clear_log_files
  output: >
    Log files were tampered (user=%user.name command=%proc.cmdline file=%fd.name container_id=%container.id image=%container.image.repository)
  priority:
    WARNING
  tags: [file, mitre_defense_evasion]

However, now allowed_clear_log_files references a macro that is only defined in the custom rules file:

falco_rules.local.yaml

- macro: test_macro
  condition: >
    (container.name = "I am testing something")

- macro: allowed_clear_log_files
  condition: >
    (test_macro)

When I run my validation, I once again get an error (similar to the one above):

falco -c /etc/falco/falco.yml.erb \
  --validate /etc/falco/falco_rules.yml.erb \
  --validate /etc/falco/falco_macros.local.yml.erb
Runtime error: Error loading rules: /usr/share/falco/lua/compiler.lua:65: Undefined macro 'test_macro' used in filter.. Exiting.

@leodido
Member

leodido commented Jul 5, 2019

Yes @natalysheinin, unfortunately this happens because the test_macro macro does not exist in the default rules file (i.e., falco_rules.yaml).

This is closely related to the original problem you reported. It also happens because all the processing that rule_loader.lua does (particularly the creation of the macros tree) is based solely on the default rules file.

@natalysheinin
Contributor Author

@leodido Thanks for clarifying that, it is extremely useful to know when the macro tree processing occurs.

Do you have any recommendations for how to approach solving this from the user-end for now?
And, do you believe that your earlier comment will address solving this?

the way the rule_loader.lua processes the macro definitions needs to be changed to be agnostic with respect to the position of the overridden macro definition.

@leodido
Member

leodido commented Jul 5, 2019

Do you have any recommendations for how to approach solving this from the user-end for now?

The workaround I'd use right now would be putting everything in one file (making sure that test_macro is above allowed_clear_log_files macro).

Clearly, the changes to how the macros tree is created and processed need to fix this case as well.

@fntlnz
Contributor

fntlnz commented Jul 5, 2019

For the sake of completeness, this is the AST generated for the last set of rules that @natalysheinin sent in the comment above.

AST dump:
{
  allowed_clear_log_files = {
    ast = {
      type = "Macro",
      value = "never_true"
    },
    used = false
  },
  consider_all_chmods = {
    ast = {
      type = "Macro",
      value = "never_true"
    },
    used = false
  },
  never_true = {
    ast = {
      left = {
        type = "FieldName",
        value = "evt.num"
      },
      operator = "=",
      right = {
        type = "Number",
        value = 0
      },
      type = "BinaryRelOp"
    },
    used = true
  },
  open_write = {
    ast = {
      left = {
        left = {
          left = {
            left = {
              left = {
                type = "FieldName",
                value = "evt.type"
              },
              operator = "=",
              right = {
                type = "BareString",
                value = "open"
              },
              type = "BinaryRelOp"
            },
            operator = "or",
            right = {
              left = {
                type = "FieldName",
                value = "evt.type"
              },
              operator = "=",
              right = {
                type = "BareString",
                value = "openat"
              },
              type = "BinaryRelOp"
            },
            type = "BinaryBoolOp"
          },
          operator = "and",
          right = {
            left = {
              type = "FieldName",
              value = "evt.is_open_write"
            },
            operator = "=",
            right = {
              type = "BareString",
              value = "true"
            },
            type = "BinaryRelOp"
          },
          type = "BinaryBoolOp"
        },
        operator = "and",
        right = {
          left = {
            type = "FieldName",
            value = "fd.typechar"
          },
          operator = "=",
          right = {
            type = "String",
            value = "f"
          },
          type = "BinaryRelOp"
        },
        type = "BinaryBoolOp"
      },
      operator = "and",
      right = {
        left = {
          type = "FieldName",
          value = "fd.num"
        },
        operator = ">=",
        right = {
          type = "Number",
          value = 0
        },
        type = "BinaryRelOp"
      },
      type = "BinaryBoolOp"
    },
    used = false
  }
}

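test_macro is clearly missing from the dump, and allowed_clear_log_files still points at never_true, which shows that the definitions in the local file were not considered.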

@mstemm
Contributor

mstemm commented Jul 11, 2019

Here's some background about how the objects are maintained and how that can cause this bug. When the objects are read, they're maintained in an ordered array. The ordered array is necessary to ensure that appends are applied in the right order.

I think the problem is that the overriding definition of allowed_clear_log_files replaces the original instance of allowed_clear_log_files in that array, in place (in this case, at the third position). Since it's processed third, the macro it references, test_macro, doesn't exist yet.

I think the fix is to be more liberal about using macros defined later in the file when expanding macro references. We'll have to make sure that the ordering semantics still apply with respect to appends and overrides.
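
The in-place replacement described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual rule_loader.lua logic: macros are reduced to (name, referenced macro names) pairs, and merge_files and first_undefined are hypothetical helpers introduced only for illustration.

```python
def merge_files(*files):
    """Merge macro definitions from several files into one ordered list.

    Assumed behavior: a redefinition overwrites the earlier entry but keeps
    its original position; a previously unseen name is appended at the end.
    """
    order = []
    refs_by_name = {}
    for macros in files:
        for name, refs in macros:
            if name not in refs_by_name:
                order.append(name)
            refs_by_name[name] = refs
    return [(name, refs_by_name[name]) for name in order]


def first_undefined(ordered_macros):
    """Return (macro, missing_reference) for the first unresolved reference."""
    defined = set()
    for name, refs in ordered_macros:
        for ref in refs:
            if ref not in defined:
                return (name, ref)
        defined.add(name)
    return None


default_rules = [
    ("never_true", []),
    ("open_write", []),
    ("allowed_clear_log_files", ["never_true"]),
    ("consider_all_chmods", ["never_true"]),
]
local_rules = [
    ("test_macro", []),
    ("allowed_clear_log_files", ["test_macro"]),
]

merged = merge_files(default_rules, local_rules)
print([name for name, _ in merged])
print(first_undefined(merged))

# The single-file workaround suggested earlier: test_macro sits above
# allowed_clear_log_files, so the same in-order walk succeeds.
single_file = [
    ("never_true", []),
    ("open_write", []),
    ("test_macro", []),
    ("allowed_clear_log_files", ["test_macro"]),
    ("consider_all_chmods", ["never_true"]),
]
print(first_undefined(single_file))
```

In this model the merged order keeps allowed_clear_log_files third and puts test_macro last, so the override is expanded before the macro it depends on exists.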

@stale

stale bot commented Sep 9, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Sep 9, 2019
@mstemm
Contributor

mstemm commented Sep 10, 2019

We still want to fix this.

@stale stale bot removed the wontfix label Sep 10, 2019
@mfdii mfdii assigned krisnova and unassigned leodido and fntlnz Sep 10, 2019
@mfdii
Member

mfdii commented Sep 10, 2019

@kris-nova is going to start looking at this. It would be good to discuss on the upcoming repo planning call to get an idea of how long it might take to fix.

@fntlnz
Contributor

fntlnz commented Sep 17, 2019

/assign @leodido
/assign @fntlnz

@fntlnz fntlnz added this to the 0.18.0 milestone Sep 17, 2019
@leodido
Member

leodido commented Sep 25, 2019

/priority high

@leodido
Member

leodido commented Jan 28, 2021 via email

@poiana
Contributor

poiana commented Jan 28, 2021

@leodido: Reopened this issue.

In response to this:

/reopen
L.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@poiana poiana reopened this Jan 28, 2021
@poiana
Contributor

poiana commented Feb 27, 2021

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

@poiana
Contributor

poiana commented Feb 27, 2021

@poiana: Closing this issue.


@poiana poiana closed this as completed Feb 27, 2021
@leogr
Member

leogr commented Mar 1, 2021

/reopen
/remove-lifecycle rotten

/cc @leodido

@poiana
Contributor

poiana commented Mar 1, 2021

@leogr: Reopened this issue.


@poiana poiana reopened this Mar 1, 2021
@poiana
Contributor

poiana commented May 30, 2021

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Jun 1, 2021

/remove-lifecycle stale

@poiana
Contributor

poiana commented Aug 30, 2021

/lifecycle stale

@leogr
Member

leogr commented Aug 30, 2021

/remove-lifecycle stale

@fntlnz fntlnz removed their assignment Aug 31, 2021
@wdoekes

wdoekes commented Nov 17, 2021

Here are some hacks that appear to work.

I would suggest refactoring this (and the rest of load_rules, as it has gotten way too big) before inclusion.

--- a/userspace/engine/lua/rule_loader.lua
+++ b/userspace/engine/lua/rule_loader.lua
@@ -917,17 +917,35 @@ function load_rules(rules_content,
       state.lists[v['list']] = {["items"] = items, ["used"] = false}
    end
 
-   for _, name in ipairs(state.ordered_macro_names) do
-
-      local v = state.macros_by_name[name]
-
-      local status, ast = compiler.compile_macro(v['condition'], state.macros, state.lists)
-
-      if status == false then
-        return false, nil, nil, build_error_with_context(v['context'], ast), warnings
+   -- Ugly hackery so we push macros that do not compile to the back for
+   -- later reprocessing. Sometimes macros aren't expanded yet, so we'll
+   -- retry them later. Stop/error once we've failed for an entire
+   -- iteration.
+   local v, ast, status
+   local changed
+   local macros_to_try = {}
+   local macros_to_retry = {}
+   for idx = 1, #state.ordered_macro_names do
+      macros_to_try[idx] = state.ordered_macro_names[idx]
+   end
+   repeat
+      changed = false
+      for idx = 1, #macros_to_try do
+         local name = macros_to_try[idx]
+         v = state.macros_by_name[name]
+         status, ast = compiler.compile_macro(v['condition'], state.macros, state.lists)
+         if status == false then
+            macros_to_retry[#macros_to_retry + 1] = name
+         else
+            state.macros[v['macro']] = {["ast"] = ast.filter.value, ["used"] = false}
+            changed = true
+         end
       end
-
-      state.macros[v['macro']] = {["ast"] = ast.filter.value, ["used"] = false}
+      macros_to_try = macros_to_retry
+      macros_to_retry = {}
+   until #macros_to_try == 0 or changed == false
+   if #macros_to_try ~= 0 then
+      return false, nil, nil, build_error_with_context(v['context'], ast), warnings
    end
 
    for _, name in ipairs(state.ordered_rule_names) do

Replaces:

    for _, name in ipairs(state.ordered_macro_names) do
   
      local v = state.macros_by_name[name]
   
      local status, ast = compiler.compile_macro(v['condition'], state.macros, state.lists)
   
      if status == false then
         return false, nil, nil, build_error_with_context(v['context'], ast), warnings
      end

      state.macros[v['macro']] = {["ast"] = ast.filter.value, ["used"] = false}
   end

with:

   -- Ugly hackery so we push macros that do not compile to the back for
   -- later reprocessing. Sometimes macros aren't expanded yet, so we'll
   -- retry them later. Stop/error once we've failed for an entire
   -- iteration.
   local v, ast, status
   local changed
   local macros_to_try = {}
   local macros_to_retry = {}
   for idx = 1, #state.ordered_macro_names do
      macros_to_try[idx] = state.ordered_macro_names[idx]
   end
   repeat
      changed = false
      for idx = 1, #macros_to_try do
         local name = macros_to_try[idx]
         v = state.macros_by_name[name]
         status, ast = compiler.compile_macro(v['condition'], state.macros, state.lists)
         if status == false then
            macros_to_retry[#macros_to_retry + 1] = name
         else
            state.macros[v['macro']] = {["ast"] = ast.filter.value, ["used"] = false}
            changed = true
         end
      end
      macros_to_try = macros_to_retry
      macros_to_retry = {}
   until #macros_to_try == 0 or changed == false
   if #macros_to_try ~= 0 then
      return false, nil, nil, build_error_with_context(v['context'], ast), warnings
   end
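
For readers who do not want to trace the Lua, the retry loop in the patch above can be modeled in Python roughly as follows. This is only a sketch of the idea (keep retrying macros that failed to compile until a full pass makes no progress), under the assumption that a macro "compiles" once all the macros it references are already compiled. The function compile_all and the (name, references) representation are hypothetical, not part of Falco.

```python
def compile_all(ordered_macros):
    """Compile macros, deferring those whose references are not ready yet.

    Returns (compiled_order, unresolved_names). The loop stops either when
    nothing is left to try or when a whole pass compiles nothing new.
    """
    compiled_order = []
    defined = set()
    to_try = list(ordered_macros)
    while to_try:
        to_retry = []
        progressed = False
        for name, refs in to_try:
            if all(ref in defined for ref in refs):
                defined.add(name)
                compiled_order.append(name)
                progressed = True
            else:
                # Not expandable yet: push to the back for a later pass.
                to_retry.append((name, refs))
        to_try = to_retry
        if not progressed:
            break
    return compiled_order, [name for name, _ in to_try]


# Processing order from the test_macro scenario: the override sits third,
# while test_macro was appended at the end.
merged = [
    ("never_true", []),
    ("open_write", []),
    ("allowed_clear_log_files", ["test_macro"]),
    ("consider_all_chmods", ["never_true"]),
    ("test_macro", []),
]
print(compile_all(merged))

# A reference to a macro that exists nowhere is still reported as an error.
print(compile_all([("broken", ["does_not_exist"])]))
```

In this model a macro that is truly undefined, or two macros that reference each other, end up in the unresolved list after one pass without progress, which corresponds to the error return at the end of the patch. The worst case is quadratic in the number of macros, since each pass may resolve only one of them.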

@leogr
Member

leogr commented Nov 18, 2021

Hey @wdoekes
Would you like to open a PR? 🤩
Maybe @jasondellaluce is interested too.

@poiana
Contributor

poiana commented Feb 16, 2022

/lifecycle stale

@jasondellaluce
Contributor

/remove-lifecycle stale
