Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TraceQL: Error when searching for span attributes that start with resource. #2605

Closed
LasseHels opened this issue Jul 3, 2023 · 13 comments · Fixed by #3284
Closed

TraceQL: Error when searching for span attributes that start with resource. #2605

LasseHels opened this issue Jul 3, 2023 · 13 comments · Fixed by #3284
Assignees
Labels
keepalive Label to exempt Issues / PRs from stale workflow traceql type/bug Something isn't working

Comments

@LasseHels
Copy link
Contributor

LasseHels commented Jul 3, 2023

Describe the bug
When executing this query:

{ resource.service.name = "scribe" && span.resource.id = "/subscriptions/05828da8-9a93-4b5c-ae81-11e62e5c8024/resourceGroups/rg-eu-west-iom-pp/providers/Microsoft.DBforPostgreSQL/flexibleServers/iom-db-pp" }

We get this error:

invalid TraceQL query: parse error at line 0, col 44: syntax error: unexpected resource., expecting IDENTIFIER

Screenshot 2023-07-03 at 10 13 25

To Reproduce
Steps to reproduce the behavior:

  1. Start Tempo. We are running grafana/tempo:2.1.1; other versions may be affected.
  2. Ingest a span with a span attribute called resource.id. The value of the attribute (probably) doesn't matter.
  3. Attempt to search for the span: { span.resource.id = "" }

Expected behavior
No error is emitted.

Observed behaviour
An error is emitted:
Screenshot 2023-07-03 at 10 16 56

Environment:

  • Infrastructure: We run Tempo in a large Kubernetes cluster on AKS.
  • Deployment tool: Our cluster is continually reconciled with Flux. We use Tempo's jsonnet templates to generate the YAML files that make up our cluster (or at least, the Tempo part of it).

Additional Context
I believe this issue is isolated to span attributes prefixed with resource.. I'm guessing this kind of span attribute name tricks the TraceQL parser into believing that we are trying to search for a resource attribute.

This query works:

{ resource.service.name = "scribe" && span.identifier = "" }

Screenshot 2023-07-03 at 10 21 43

This query fails:

{ resource.service.name = "scribe" && span.resource.id = "" }

Screenshot 2023-07-03 at 10 21 57

Our application only emits a resource.id span attribute, so I've not tested with other resource.* attributes, but I would assume that the error is the same.

This bug is not critical to us; we are likely going to rename the span attribute to resolve the issue. In hindsight, naming a span attribute resource.id was asking for trouble and not my brighest move 😅.

@joe-elliott
Copy link
Member

Thanks for the report. This is also failing with a similar issue:

{ resource.span.foo = "bar" }

This seems to work but it is not as performant:

{ .resource.id = "" }

@joe-elliott joe-elliott added type/bug Something isn't working traceql labels Jul 5, 2023
@joe-elliott
Copy link
Member

The below diff fixes the issue as stated but still fails on weird constructions like:

{ parent.span.resource.id = "" }

I think this is just going to require some better attribute scope parsing.

diff --git a/pkg/traceql/lexer.go b/pkg/traceql/lexer.go
index 830c30912..19266c75d 100644
--- a/pkg/traceql/lexer.go
+++ b/pkg/traceql/lexer.go
@@ -77,15 +77,15 @@ type lexer struct {
        parser *yyParserImpl
        errs   []ParseError
 
-       parsingAttribute bool
+       currentAttribute int
 }
 
 func (l *lexer) Lex(lval *yySymType) int {
        // if we are currently parsing an attribute and the next rune suggests that
        //  this attribute will end, then return a special token indicating that the attribute is
        //  done parsing
-       if l.parsingAttribute && !isAttributeRune(l.Peek()) {
-               l.parsingAttribute = false
+       if l.parsingAttribute() && !isAttributeRune(l.Peek()) {
+               l.resetAttribute()
                return END_ATTRIBUTE
        }
 
@@ -93,13 +93,16 @@ func (l *lexer) Lex(lval *yySymType) int {
 
        // if we are currently parsing an attribute then just grab everything until we find a character that ends the attribute.
        // we will handle parsing this out in ast.go
-       if l.parsingAttribute {
+       if l.parsingAttribute() {
                str := l.TokenText()
-               // parse out any scopes here
-               tok := tokens[str+string(l.Peek())]
-               if tok == RESOURCE_DOT || tok == SPAN_DOT {
-                       l.Next()
-                       return tok
+
+               // parse out any scopes here, but only attempt if we are parsing a parent attribute
+               if l.currentAttribute == PARENT_DOT {
+                       tok := tokens[str+string(l.Peek())]
+                       if tok == RESOURCE_DOT || tok == SPAN_DOT {
+                               l.Next()
+                               return tok
+                       }
                }
 
                // go forward until we find the end of the attribute
@@ -168,12 +171,12 @@ func (l *lexer) Lex(lval *yySymType) int {
        tokStrNext := l.TokenText() + string(l.Peek())
        if tok, ok := tokens[tokStrNext]; ok {
                l.Next()
-               l.parsingAttribute = startsAttribute(tok)
+               l.setAttribute(tok)
                return tok
        }
 
        if tok, ok := tokens[l.TokenText()]; ok {
-               l.parsingAttribute = startsAttribute(tok)
+               l.setAttribute(tok)
                return tok
        }
 
@@ -185,6 +188,20 @@ func (l *lexer) Error(msg string) {
        l.errs = append(l.errs, newParseError(msg, l.Line, l.Column))
 }
 
+func (l *lexer) parsingAttribute() bool {
+       return l.currentAttribute != 0
+}
+
+func (l *lexer) setAttribute(tok int) {
+       if startsAttribute(tok) {
+               l.currentAttribute = tok
+       }
+}
+
+func (l *lexer) resetAttribute() {
+       l.currentAttribute = 0
+}
+
 func tryScanDuration(number string, l *scanner.Scanner) (time.Duration, bool) {
        var sb strings.Builder
        sb.WriteString(number)
diff --git a/pkg/traceql/lexer_test.go b/pkg/traceql/lexer_test.go
index 401fa1229..8d1dd3d78 100644
--- a/pkg/traceql/lexer_test.go
+++ b/pkg/traceql/lexer_test.go
@@ -34,12 +34,14 @@ func TestLexerAttributes(t *testing.T) {
                {`span.foo3`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`span.foo+bar`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`span.foo-bar`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
+               {`span.resource.foo`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                // resource attributes
                {`resource.foo`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.count`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo3`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo+bar`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo-bar`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
+               {`resource.span.foo`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                // parent span attributes
                {`parent.span.foo`, []int{PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`parent.span.count`, []int{PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},

@github-actions
Copy link
Contributor

github-actions bot commented Sep 4, 2023

This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.

@github-actions github-actions bot added the stale Used for stale issues / PRs label Sep 4, 2023
@joe-elliott joe-elliott added keepalive Label to exempt Issues / PRs from stale workflow and removed stale Used for stale issues / PRs labels Sep 5, 2023
@mghildiy
Copy link
Contributor

I am working on it.

@mghildiy
Copy link
Contributor

To confirm my understanding:

span.xxx.yyy represents a span attribute xxx.yyy, and when we have a span attribute starting with resource. (like, span.resource.foo), this problem arises.

@mghildiy
Copy link
Contributor

tok := tokens[str+string(l.Peek())]

The below diff fixes the issue as stated but still fails on weird constructions like:

{ parent.span.resource.id = "" }

I think this is just going to require some better attribute scope parsing.

diff --git a/pkg/traceql/lexer.go b/pkg/traceql/lexer.go
index 830c30912..19266c75d 100644
--- a/pkg/traceql/lexer.go
+++ b/pkg/traceql/lexer.go
@@ -77,15 +77,15 @@ type lexer struct {
        parser *yyParserImpl
        errs   []ParseError
 
-       parsingAttribute bool
+       currentAttribute int
 }
 
 func (l *lexer) Lex(lval *yySymType) int {
        // if we are currently parsing an attribute and the next rune suggests that
        //  this attribute will end, then return a special token indicating that the attribute is
        //  done parsing
-       if l.parsingAttribute && !isAttributeRune(l.Peek()) {
-               l.parsingAttribute = false
+       if l.parsingAttribute() && !isAttributeRune(l.Peek()) {
+               l.resetAttribute()
                return END_ATTRIBUTE
        }
 
@@ -93,13 +93,16 @@ func (l *lexer) Lex(lval *yySymType) int {
 
        // if we are currently parsing an attribute then just grab everything until we find a character that ends the attribute.
        // we will handle parsing this out in ast.go
-       if l.parsingAttribute {
+       if l.parsingAttribute() {
                str := l.TokenText()
-               // parse out any scopes here
-               tok := tokens[str+string(l.Peek())]
-               if tok == RESOURCE_DOT || tok == SPAN_DOT {
-                       l.Next()
-                       return tok
+
+               // parse out any scopes here, but only attempt if we are parsing a parent attribute
+               if l.currentAttribute == PARENT_DOT {
+                       tok := tokens[str+string(l.Peek())]
+                       if tok == RESOURCE_DOT || tok == SPAN_DOT {
+                               l.Next()
+                               return tok
+                       }
                }
 
                // go forward until we find the end of the attribute
@@ -168,12 +171,12 @@ func (l *lexer) Lex(lval *yySymType) int {
        tokStrNext := l.TokenText() + string(l.Peek())
        if tok, ok := tokens[tokStrNext]; ok {
                l.Next()
-               l.parsingAttribute = startsAttribute(tok)
+               l.setAttribute(tok)
                return tok
        }
 
        if tok, ok := tokens[l.TokenText()]; ok {
-               l.parsingAttribute = startsAttribute(tok)
+               l.setAttribute(tok)
                return tok
        }
 
@@ -185,6 +188,20 @@ func (l *lexer) Error(msg string) {
        l.errs = append(l.errs, newParseError(msg, l.Line, l.Column))
 }
 
+func (l *lexer) parsingAttribute() bool {
+       return l.currentAttribute != 0
+}
+
+func (l *lexer) setAttribute(tok int) {
+       if startsAttribute(tok) {
+               l.currentAttribute = tok
+       }
+}
+
+func (l *lexer) resetAttribute() {
+       l.currentAttribute = 0
+}
+
 func tryScanDuration(number string, l *scanner.Scanner) (time.Duration, bool) {
        var sb strings.Builder
        sb.WriteString(number)
diff --git a/pkg/traceql/lexer_test.go b/pkg/traceql/lexer_test.go
index 401fa1229..8d1dd3d78 100644
--- a/pkg/traceql/lexer_test.go
+++ b/pkg/traceql/lexer_test.go
@@ -34,12 +34,14 @@ func TestLexerAttributes(t *testing.T) {
                {`span.foo3`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`span.foo+bar`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`span.foo-bar`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
+               {`span.resource.foo`, []int{SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                // resource attributes
                {`resource.foo`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.count`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo3`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo+bar`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`resource.foo-bar`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
+               {`resource.span.foo`, []int{RESOURCE_DOT, IDENTIFIER, END_ATTRIBUTE}},
                // parent span attributes
                {`parent.span.foo`, []int{PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},
                {`parent.span.count`, []int{PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE}},

I pulled latest code, and it seems code has undergone changes since this set of change.

@mapno
Copy link
Member

mapno commented Jan 2, 2024

To confirm my understanding:

span.xxx.yyy represents a span attribute xxx.yyy, and when we have a span attribute starting with resource. (like, span.resource.foo), this problem arises.

Exactly this, yes!

@mghildiy
Copy link
Contributor

mghildiy commented Jan 6, 2024

parent.resource.xxx.
I understand above one is a wrong combination.

We can have following though:
parent.span.xxx
parent.span.resource.xxx

Correct me if I am wrong.

@joe-elliott
Copy link
Member

technically parent.resource.xxx is allowed just like parent.span.xxx. We don't actually support this scope in code though so it's ok if this PR doesn't fix that.

@mghildiy
Copy link
Contributor

mghildiy commented Jan 8, 2024

I think we don't need check for resource scope,
And only resolving span scope when starting scope is parent.
This has made all tests pass.

@mghildiy
Copy link
Contributor

mghildiy commented Jan 9, 2024

What is expected for:
parent.span.resource.id

PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE
OR
PARENT_DOT, SPAN_DOT, RESOURCE_DOT, ATTRIBUTE, END_ATTRIBUTE

@joe-elliott
Copy link
Member

According to the spec this would be most correct:

PARENT_DOT, SPAN_DOT, IDENTIFIER, END_ATTRIBUTE

@mghildiy
Copy link
Contributor

PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
keepalive Label to exempt Issues / PRs from stale workflow traceql type/bug Something isn't working
Projects
None yet
4 participants