Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EQL: ambiguous syntax for chained predicates #61654

Closed
rw-access opened this issue Aug 27, 2020 · 3 comments · Fixed by #62567
Closed

EQL: ambiguous syntax for chained predicates #61654

rw-access opened this issue Aug 27, 2020 · 3 comments · Fixed by #62567
Assignees
Labels
:Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team team-discuss

Comments

@rw-access
Copy link
Contributor

rw-access commented Aug 27, 2020

Existing implementations of EQL have peg-based grammars, and as part of the design can't support chaining predicates.

For example, 1 == 1 == 1 raises a syntax error. This seems like good behavior, forcing the user to do (1 == 1) == 1.

$ eql query 'any where 1 == 1 == 1'
Error at line:1,column:18
Invalid syntax
any where 1 == 1 == 1
                 ^

Then you get this

$ eql query 'any where (1 == 1) == 1'
Error at line:1,column:11
Invalid comparison of boolean to number
any where (1 == 1) == 1
          ^^^^^^^^^^^^^

However, for Elasticsearch, we accept this syntax. But it's not clear what it means.
(I think there's another issue here with data type validation isn't detecting a type mismatch with (bool) == long

GET logs-endpoint.alerts-default/_eql/search
{
  "query": """
    any where 1 == 1 == 1
  """
  ,
  "size": 1
}
GET logs-endpoint.alerts-default/_eql/search
{
  "query": """
    any where 1 < 2 < 3
  """
  ,
  "size": 1
}
@rw-access rw-access added >bug :Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team labels Aug 27, 2020
@rw-access rw-access self-assigned this Aug 27, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (:Query Languages/EQL)

@matriv matriv self-assigned this Sep 10, 2020
@rw-access
Copy link
Contributor Author

rw-access commented Sep 10, 2020

Update: I think this is actually okay in its current implementation. It's not a bug then, more of a specification question.

I looked into this a little bit more, and I think that the expectation for many languages is for the equality operator to be left-associative. Thus a == b == c is equivalent to (a == b) == c

I think it comes to a matter of specification, and what we think is desired behavior for the language. Do we want this operator to be left-associative or do we want to forbid this syntax? We can decide the ideal specification, and from there we can work out the implementation details, if any need to be changed. I added the label team-discuss to ask in the next meeting.

@matriv
Copy link
Contributor

matriv commented Sep 17, 2020

Just listing the behaviour of SQL in some popular DBs:
PostgreSQL:

postgres=# select 1 < 2 < 3;
ERROR:  syntax error at or near "<"
LINE 1: select 1 < 2 < 3;
                     ^
postgres=# select 1 = 2 = 3;
ERROR:  syntax error at or near "="
LINE 1: select 1 = 2 = 3;
                     ^
postgres=# select (1 = 2) = true;
 ?column?
----------
 f
(1 row)

postgres=# select 1 = 1 = 1;
ERROR:  syntax error at or near "="
LINE 1: select 1 = 1 = 1;
                     ^

MySQL:

select 1 = 1 = 1;

1 = 1 = 1
--
1

select 1 = 2 = 3

1 = 2 = 3
--
0

select 1 < 2 < 3

1 < 2 < 3
--
1

select 1 > 2 > 3

1 > 2 > 3
--
0

Oracle gives a syntax error for those expressions.

I have no real personal preference, I'd say to keep it as is (left-associative) rather than forbid it.

matriv added a commit to matriv/elasticsearch that referenced this issue Sep 17, 2020
Expressions like `1 = 2 = 3 = 4` or `1 < 2 = 3 >= 4` were treated with
leftmost priority: ((1 = 2) = 3) = 4 which can lead to confusing
results. Since such expressions don't make so much change for EQL
filters we disallow them in the parser to prevent unexpected results
from their bad usage.

Major DBs like PostgreSQL and Oracle also disallow them in their SQL
syntax. (counter example would be MySQL which interprets them as we did
before with leftmost priority).

Fixes: elastic#61654
matriv added a commit that referenced this issue Sep 18, 2020
Expressions like `1 = 2 = 3 = 4` or `1 < 2 = 3 >= 4` were treated with
leftmost priority: ((1 = 2) = 3) = 4 which can lead to confusing
results. Since such expressions don't make so much change for EQL
filters we disallow them in the parser to prevent unexpected results
from their bad usage.

Major DBs like PostgreSQL and Oracle also disallow them in their SQL
syntax. (counter example would be MySQL which interprets them as we did
before with leftmost priority).

Fixes: #61654
matriv added a commit that referenced this issue Sep 18, 2020
Expressions like `1 = 2 = 3 = 4` or `1 < 2 = 3 >= 4` were treated with
leftmost priority: ((1 = 2) = 3) = 4 which can lead to confusing
results. Since such expressions don't make so much change for EQL
filters we disallow them in the parser to prevent unexpected results
from their bad usage.

Major DBs like PostgreSQL and Oracle also disallow them in their SQL
syntax. (counter example would be MySQL which interprets them as we did
before with leftmost priority).

Fixes: #61654
(cherry picked from commit 8f94981)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team team-discuss
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants