Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/watcher java time migration #6

Open
wants to merge 191 commits into
base: master
Choose a base branch
from

Conversation

pgomulka
Copy link
Owner

  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS that we support?
  • If you are submitting this code for a class then read our policy for that.

This commit moves the aggregation and mapping code from joda time to
java time.

This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.
pgomulka pushed a commit that referenced this pull request Mar 16, 2021
…astic#69765)

Previously we did not resolve the attributes recursively which meant that if a field or expression was re-aliased multiple times (through multiple levels of subqueries), the aliases were only resolved one level down. This led to failed query translation because `ReferenceAttribute`s were pointing to non-existing attributes during query translation.

For example the query

```sql
SELECT i AS j FROM ( SELECT int AS i FROM test) ORDER BY j
```

failed during translation because the `OrderBy` resolved the `j` ReferenceAttribute to another `i` ReferenceAttribute that was later removed by an Optimization:

```
OrderBy[[Order[j{r}#4,ASC,LAST]]]                                             ! OrderBy[[Order[i{r}#2,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
```

By resolving the `Attributes` recursively both `j{r}` and `i{r}` will resolve to `test.int{f}` above:

```
OrderBy[[Order[test.int{f}#22,ASC,LAST]]]                                     = OrderBy[[Order[test.int{f}#22,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
 ```

The scope of recursive resolution depends on how the `AttributeMap` is constructed and populated.

Fixes elastic#67237
pgomulka pushed a commit that referenced this pull request Mar 31, 2021
…astic#69765) (elastic#70322)

Previously we did not resolve the attributes recursively which meant that if a field or expression was re-aliased multiple times (through multiple levels of subqueries), the aliases were only resolved one level down. This led to failed query translation because `ReferenceAttribute`s were pointing to non-existing attributes during query translation.

For example the query

```sql
SELECT i AS j FROM ( SELECT int AS i FROM test) ORDER BY j
```

failed during translation because the `OrderBy` resolved the `j` ReferenceAttribute to another `i` ReferenceAttribute that was later removed by an Optimization:

```
OrderBy[[Order[j{r}#4,ASC,LAST]]]                                             ! OrderBy[[Order[i{r}#2,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
```

By resolving the `Attributes` recursively both `j{r}` and `i{r}` will resolve to `test.int{f}` above:

```
OrderBy[[Order[test.int{f}#22,ASC,LAST]]]                                     = OrderBy[[Order[test.int{f}#22,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
 ```

The scope of recursive resolution depends on how the `AttributeMap` is constructed and populated.

Fixes elastic#67237
pgomulka pushed a commit that referenced this pull request Mar 31, 2021
…astic#69765) (elastic#70325)

Previously we did not resolve the attributes recursively which meant that if a field or expression was re-aliased multiple times (through multiple levels of subqueries), the aliases were only resolved one level down. This led to failed query translation because `ReferenceAttribute`s were pointing to non-existing attributes during query translation.

For example the query

```sql
SELECT i AS j FROM ( SELECT int AS i FROM test) ORDER BY j
```

failed during translation because the `OrderBy` resolved the `j` ReferenceAttribute to another `i` ReferenceAttribute that was later removed by an Optimization:

```
OrderBy[[Order[j{r}#4,ASC,LAST]]]                                             ! OrderBy[[Order[i{r}#2,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
```

By resolving the `Attributes` recursively both `j{r}` and `i{r}` will resolve to `test.int{f}` above:

```
OrderBy[[Order[test.int{f}#22,ASC,LAST]]]                                     = OrderBy[[Order[test.int{f}#22,ASC,LAST]]]
\_Project[[j]]                                                                = \_Project[[j]]
  \_Project[[i]]                                                              !   \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..]
    \_EsRelation[test][date{f}#6, some{f}#7, some.string{f}#8, some.string..] ! 
 ```

The scope of recursive resolution depends on how the `AttributeMap` is constructed and populated.

Fixes elastic#67237
pgomulka pushed a commit that referenced this pull request Nov 19, 2021
pgomulka pushed a commit that referenced this pull request Sep 6, 2023
Fixes elastic/elasticsearch-internal#497
Fixes ESQL-560

A query like `from test | sort data | limit 2 | project count` fails
because `LocalToGlobalLimitAndTopNExec` planning rule adds a collecting
`TopNExec` after last GATHER exchange, to perform last reduce, see

```
TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
\_ExchangeExec[GATHER,SINGLE_DISTRIBUTION]
  \_ProjectExec[[count{f}#4]]      // <- `data` is projected away but still used by the TopN node above
    \_FieldExtractExec[count{f}#4]
      \_TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
        \_FieldExtractExec[data{f}#6]
          \_ExchangeExec[REPARTITION,FIXED_ARBITRARY_DISTRIBUTION]
            \_EsQueryExec[test], query[][_doc_id{f}#9, _segment_id{f}#10, _shard_id{f}#11]
```

Unfortunately, at that stage the inputs needed by the TopNExec could
have been projected away by a ProjectExec, so they could be no longer
available.

This PR adapts the plan as follows:
- add all the projections used by the `TopNExec` to the existing
`ProjectExec`, so that they are available when needed
- add another ProjectExec on top of the plan, to project away the
originally removed projections and preserve the query semantics


This approach is a bit dangerous, because it bypasses the mechanism of
input/output resolution and validation that happens on the logical plan.
The alternative would be to do this manipulation on the logical plan,
but it's probably hard to do, because there is no concept of Exchange at
that level.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.