Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Filter Processor #1220

Merged
merged 17 commits into from
Sep 16, 2022
Merged

Conversation

AlgoStephenAkiki
Copy link
Contributor

Resolves #1100

Adds dynamic filter processor and initial tests

Summary

Explain the goal of this change and what problem it is solving. Format this cleanly so that it may be used for a commit message, as your changes will be squash-merged.

Test Plan

How did you test these changes? Please provide the exact scenarios you tested in as much detail as possible including commands, output and rationale.

Resolves #1100

Adds dynamic filter processor and initial tests
@AlgoStephenAkiki AlgoStephenAkiki self-assigned this Sep 13, 2022
@codecov
Copy link

codecov bot commented Sep 13, 2022

Codecov Report

❗ No coverage uploaded for pull request base (conduit@e8fa891). Click here to learn what that means.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             conduit    #1220   +/-   ##
==========================================
  Coverage           ?   62.08%           
==========================================
  Files              ?       69           
  Lines              ?     9383           
  Branches           ?        0           
==========================================
  Hits               ?     5825           
  Misses             ?     3059           
  Partials           ?      499           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@Eric-Warehime Eric-Warehime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool stuff! I left some small comments on various things. I think the big thing for me is adding more tests, and splitting the library functions out and getting individual tests for each of those interfaces/utility functions.

processors/config.go Outdated Show resolved Hide resolved
processors/config.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor_test.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor_test.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor_test.go Outdated Show resolved Hide resolved
processors/filterprocessor/filter_processor_test.go Outdated Show resolved Hide resolved
}

// TestFilterProcessor_Init tests initialization of the filter processor
func TestFilterProcessor_Init(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need more tests here. I don't see all being tested, there's not error tests for example supplying a bad tag, or a field which doesn't have the correct operations defined such as a MicroAlgos field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added more unit tests. The test that detects either a bad tag or a field that doesnt have the correct function is located in searcher_test.go

expression: "` + sampleAddr2.String() + `"
- tag: SignedTxnWithAD.SignedTxn.Txn.AssetTransferTxnFields.AssetReceiver
expression-type: regex
expression: "` + sampleAddr2.String() + `"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if all and any have overlapping tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filters are evaluated in a top down order. Meaning that the first filter that gets encountered gets executed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a feature isn't it? i.e. if you want to match two different accounts:

- any:
    - tag:  SignedTxnWithAD.SignedTxn.Txn.Sender
      expression-type: const
      expression: "AAAAAAA..."
    - tag:  SignedTxnWithAD.SignedTxn.Txn.Sender
      expression-type: const
      expression: "BBBBBBB..."

expression: "` + sampleAddr1.String() + `"
- tag: SignedTxnWithAD.SignedTxn.Txn.PaymentTxnFields.Receiver
expression-type: exact
expression: "` + sampleAddr2.String() + `"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if I duplicate one of the tags? Will I get duplicate blocks in result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on the operator. If it is "any" then if any of the duplicate tags expressions are true then it passes. If it is all, then all the tag expressions must be true. A use case for the latter would be to have a regex that specifies that an address begins and ends with some characters (although you could do that in one regex if you wanted).

Copy link
Contributor

@Eric-Warehime Eric-Warehime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few places that I think we should cover--noted them in the comments. I'm working on fixing the codecov report, but I've used the data uploaded from the most recent commit if you're curious https://app.codecov.io/github/algorand/indexer/commit/0fdcf5035e651fd933fe4321914fea7815afb204

Sorry for being paranoid about tests here, but I think because some of this code has potential to cause panics it's better to be extra careful with testing here.

Comment on lines +71 to +72
return fmt.Errorf("filter processor Init(): illegal filter tag formation. tag length was: %d", len(configMaps))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a test for this. I think it's something we should test.

for key, subConfigs := range configMaps {

if !fields.ValidFieldOperation(key) {
return fmt.Errorf("filter processor Init(): filter key was not a valid value: %s", key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is apparently uncovered--seems like a good test to assert which operations we support.


exp, err := expression.MakeExpression(subConfig.ExpressionType, subConfig.Expression)
if err != nil {
return fmt.Errorf("filter processor Init(): could not make expression with string %s for filter tag %s - %w", subConfig.Expression, subConfig.FilterTag, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this and the error cases of MakeExpression aren't covered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a filter processor table test that addresses this but I didn't add regex expression specific tests since it would only fail on mal-formed regex.

type SubConfig struct {
// The tag of the struct to analyze
FilterTag string `yaml:"tag"`
// The type of expression to search for "const" or "regex"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like const this was renamed. Also, since you have comments anyway, could you add the variable names so that they matches the godoc format?

- tag: SignedTxnWithAD.SignedTxn.AuthAddr
expression-type: exact
expression: "` + sampleAddr1.String() + `"
- tag: SignedTxnWithAD.SignedTxn.Txn.AssetTransferTxnFields.AssetSender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important to use the msgpack path here. Have you looked into how that would be implemented?

For example, this would be txn.asnd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take a look for sure. I think its just different logic in the reflection package.

Copy link
Contributor

@winder winder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely needs to specify txn fields by their canonical names rather than the go types. I'm not sure that change needs to block the PR.

Otherwise, LGTM.

@AlgoStephenAkiki AlgoStephenAkiki merged commit 7446007 into conduit Sep 16, 2022
@AlgoStephenAkiki AlgoStephenAkiki deleted the 1100-implement-filter-processor branch September 16, 2022 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants