Fix analysis/planning when lambda arguments clash with relation's columns #9099

findepi · 2017-10-04T19:25:11Z

Fix lambda arguments handling in ExpressionAnalyzer and during query planning (TranslationMap).
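For readers unfamiliar with the underlying problem, the shadowing that this PR deals with can be illustrated by a minimal sketch. It is not code from this PR: `SketchScope` and `ShadowingDemo` are hypothetical stand-ins, and the sketch assumes only that a lambda body is resolved against an inner scope whose declarations take precedence over the enclosing relation's columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a name-resolution scope; not Presto's Scope class.
final class SketchScope {
    private final Optional<SketchScope> parent;
    private final Map<String, String> fields = new LinkedHashMap<>();

    SketchScope(Optional<SketchScope> parent) {
        this.parent = parent;
    }

    // Declares a name in this scope and records where it came from.
    SketchScope declare(String name, String origin) {
        fields.put(name, origin);
        return this;
    }

    // The innermost declaration wins; otherwise fall back to the enclosing scope.
    Optional<String> resolve(String name) {
        String local = fields.get(name);
        if (local != null) {
            return Optional.of(local);
        }
        return parent.flatMap(scope -> scope.resolve(name));
    }
}

final class ShadowingDemo {
    // Resolves a name inside a lambda body whose argument x clashes with column x.
    static String describe(String name) {
        SketchScope relation = new SketchScope(Optional.empty())
                .declare("x", "column")
                .declare("y", "column");
        SketchScope lambda = new SketchScope(Optional.of(relation))
                .declare("x", "lambda argument");
        return lambda.resolve(name).orElse("unresolved");
    }

    public static void main(String[] args) {
        System.out.println("x -> " + describe("x"));
        System.out.println("y -> " + describe("y"));
    }
}
```

If planning later looks up `x` in the relation scope alone, it finds the column instead of the lambda argument, which is the kind of mismatch discussed in the review below.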

findepi · 2017-10-04T19:28:11Z

I like #9090 more.

sopel39 · 2017-10-04T19:30:02Z

presto-main/src/main/java/com/facebook/presto/sql/planner/TranslationMap.java

    {
+        if (!analysis.isColumnReference(expression)) {


When is this needed? Is there some query that fails, or is this just a fail-safe?
Maybe just add a checkState that the expression is a column reference.

This is needed e.g. when r is a column reference of type row(a: integer, ...) and we try to get symbol for r.a.

When would this happen, and why wasn't it a problem before?
Scope shouldn't resolve fields that are not column references with .tryResolveField(expression).

I guess now non-lambda scope is applied for lambda arg expression.
Maybe instead check that the expression is not a lambda argument reference (since this is what we actually want to avoid).

It would be redoing analysis in the query planning. Why would I want that?

Well, using scope again here is redoing analysis, but we have to live with this for now (this method will be gone eventually). However, as I mentioned, scope resolves only column references. That is why this change is questionable. In fact, the intention was to prevent using an incorrect (non-lambda) scope for a lambda argument reference. Because of that, it would be more natural to check that the expression is not a lambda reference.

I agree with you. Except that it's not possible to just "check if expression is not a lambda reference" without redoing analysis, so I'd argue that the presented solution is more natural. Are we bike shedding?

Except that it's not possible to just "check if expression is not a lambda reference"

What about adding method:

boolean Analysis::isLambdaArgumentReference(Expression expression) { return expression instanceof Identifier && lambdaArgumentReferences.contains(NodeRef.of((Identifier) expression)); }

which would be similar to isColumnReference(Expression expression)?

Indeed, this would work. But it would also make the logic more complex. Instead of "is column reference so resolvable with scope" it would be "is not lambda reference, so probably column reference so resolvable with scope".
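The difference between the two guards can be made concrete with a small sketch. It is an illustration only: `SketchIdentifier`, `SketchAnalysis`, and `GuardDemo` are hypothetical names, and the identity-based sets merely assume that analysis records each reference per node occurrence, as the NodeRef usage in the snippet above suggests.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for an AST identifier. It does not override
// equals/hashCode, so two nodes with the same name stay distinct occurrences.
final class SketchIdentifier {
    final String name;

    SketchIdentifier(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the analysis result; not Presto's Analysis class.
final class SketchAnalysis {
    private final Set<SketchIdentifier> columnReferences =
            Collections.newSetFromMap(new IdentityHashMap<>());
    private final Set<SketchIdentifier> lambdaArgumentReferences =
            Collections.newSetFromMap(new IdentityHashMap<>());

    void recordColumnReference(SketchIdentifier node) {
        columnReferences.add(node);
    }

    void recordLambdaArgumentReference(SketchIdentifier node) {
        lambdaArgumentReferences.add(node);
    }

    // Positive guard: "is column reference, so resolvable with scope".
    boolean isColumnReference(SketchIdentifier node) {
        return columnReferences.contains(node);
    }

    // Negative guard: "is not lambda reference, so probably column reference".
    boolean isNotLambdaArgumentReference(SketchIdentifier node) {
        return !lambdaArgumentReferences.contains(node);
    }
}

final class GuardDemo {
    // Returns {positive guard, negative guard} for a node of the given kind.
    static boolean[] guardsFor(String kind) {
        SketchAnalysis analysis = new SketchAnalysis();
        SketchIdentifier node = new SketchIdentifier("x");
        if (kind.equals("column")) {
            analysis.recordColumnReference(node);
        }
        else if (kind.equals("lambda")) {
            analysis.recordLambdaArgumentReference(node);
        }
        return new boolean[] {
                analysis.isColumnReference(node),
                analysis.isNotLambdaArgumentReference(node)};
    }
}
```

In this sketch the two guards agree for recorded column and lambda argument references. They disagree only for a node that analysis recorded as neither, which the negative guard would still send to the scope.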

Instead of "is column reference so resolvable with scope"

This is actually: "is expression that is column reference resolvable with scope (that only resolves column references)", but then I think: "hey, if it's not a column reference then it shouldn't be resolved by scope anyway, so why this additional check?"

The entire planner and analyzer are a very complex piece of code. The less WTF code the better (you experienced that too). The code here deserves a comment at least.

The code here deserves a comment at least.

Done.

sopel39 · 2017-10-04T19:33:43Z

I like #9090 more.

Nah. #9090 reapplies part of the analysis in TranslationMap, while you already have the information about which Node is a column reference and which is a lambda argument reference. In a way, #9090 applies redundant logic and contains an intermediate (unnecessary) step: creating a scope for lambda analysis in the translation map. This one is cleaner since it relies (in planning) on the already performed analysis.

sopel39 · 2017-10-12T21:46:37Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

@@ -1269,43 +1269,43 @@ private Context(
            this.nameToLambdaArgumentDeclarationMap = nameToLambdaArgumentDeclarationMap;
        }

-        public static Context notInLambda()
+        static Context notInLambda()


It can be private.

sopel39 · 2017-10-12T21:47:44Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

        {
            return new Context(null, null);
        }

-        public static Context inLambda(Map<String, LambdaArgumentDeclaration> nameToLambdaArgumentDeclarationMap)
+        static Context inLambda(Map<String, LambdaArgumentDeclaration> nameToLambdaArgumentDeclarationMap)


Ditto. Check the rest of the methods.

sopel39 · 2017-10-12T21:53:51Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

@@ -282,11 +282,11 @@ private Type analyze(Expression expression, Scope scope, Context context)
    private class Visitor
            extends StackableAstVisitor<Type, Context>
    {
-        private final Scope scope;
+        private final Scope baseScope;


I don't think it is needed anymore, you can just use scope from Context. Otherwise it's confusing which scope to use.

This Scope is used to resolve FieldReference instances. Scope from context cannot be used for that.

From a semantic point of view, FieldReference expressions should only be created for top-level expressions (e.g. not in lambdas). Therefore the context scope should be OK instead of using some tricky double scoping.

I guess the problem might be with analyzeExpressionsWithInputs which is used in LocalExecutionPlanner after SymbolToInputRewriter. However, I think that there should be another kind of expression (e.g: ChannelReference) for which type should be returned without using Scope at all (see #7398, ChannelReference would only be produced by SymbolToInputRewriter.)

Alternatively, ExpressionAnalyzer might have an optional Map<Integer, Type> inputMapping that would be used for FieldReference resolving (if present). The idea is that original (AST) FieldReferences would be translated to symbols, so FieldReferences in LocalExecutionPlanner are only created by SymbolToInputRewriter. I think this is a nicer temporary solution.

I think you're right about where the problematic FieldReference instances come from. What you suggest sounds like a good direction for the future.

At a minimum, you should fix ExpressionAnalyzer with:

Alternatively, ExpressionAnalyzer might have an optional Map<Integer, Type> inputMapping that would be used for FieldReference resolving (if present). The idea is that original (AST) FieldReferences would be translated to symbols, so FieldReferences in LocalExecutionPlanner are only created by SymbolToInputRewriter. I think this is a nicer temporary solution.

This will allow us to avoid double scoping in ExpressionAnalyzer now and would be an overall cleaner approach.
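A rough sketch of the optional input mapping idea follows. It is only an illustration of the suggestion, not the change that was made: `FieldReferenceTypeSketch` and its method are hypothetical, types are modeled as plain strings, and the fallback to scope field types is an assumption about how the two paths would coexist.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; not part of ExpressionAnalyzer.
final class FieldReferenceTypeSketch {
    // Picks the type of a field reference by index. When an explicit input
    // mapping is supplied, it is the only source consulted, so no second
    // ("base") scope is needed. Otherwise the scope's field types are used.
    static String typeOfFieldReference(
            int fieldIndex,
            Optional<Map<Integer, String>> inputMapping,
            List<String> scopeFieldTypes) {
        if (inputMapping.isPresent()) {
            String type = inputMapping.get().get(fieldIndex);
            if (type == null) {
                throw new IllegalArgumentException("No input type for field " + fieldIndex);
            }
            return type;
        }
        return scopeFieldTypes.get(fieldIndex);
    }
}
```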

I added a comment explaining the purpose of the ExpressionAnalyzer.Visitor#baseScope field.

sopel39 · 2017-10-12T22:06:07Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

            }
-            Type returnType = process(node.getBody(), new StackableAstVisitorContext<>(Context.inLambda(context.getContext().getScope(), nameToLambdaArgumentDeclarationMap)));
+            for (LambdaArgumentDeclaration lambdaArgument : lambdaArguments) {
+                ResolvedField resolvedField = lambdaScope.resolveField(lambdaArgument, QualifiedName.of(lambdaArgument.getName().getValue()));


You can create the field id directly here from RelationId.of(node) and the index of the LambdaArgumentDeclaration.
You can even use an IntStream for that, e.g.:

IntStream.range(0, lambdaArguments.size()).forEach(index -> fieldToLambdaArgumentDeclaration.put(new FieldId(RelationId.of(node), index), lambdaArguments.get(index)));

but a for loop is also enough.

sopel39 · 2017-10-12T22:40:21Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

+                Scope scope = context.getContext().getScope();
+                Optional<ResolvedField> resolvedField = scope.tryResolveField(node, qualifiedName);
+                if (resolvedField.isPresent()) {
+                    if (!context.getContext().isInLambda() || !context.getContext().getFieldToLambdaArgumentDeclaration().containsKey(FieldId.from(resolvedField.get()))) {


I don't think this special case is required, as the lambda scope won't resolve an entire dereference expression.

comments answered/applied

findepi · 2017-10-23T11:38:35Z

rebased

Fixes #9025

sopel39 · 2017-10-31T22:22:06Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/ExpressionAnalyzer.java

                                        innerExpressionAnalyzer.setExpressionType(argument, getExpressionType(argument));
                                    }
                                }
-                                return innerExpressionAnalyzer.analyze(expression, scope, context.getContext().expectingLambda(types)).getTypeSignature();
+                                return innerExpressionAnalyzer.analyze(expression, baseScope, context.getContext().expectingLambda(types)).getTypeSignature();


context.getContext().getScope()? Definitely expectingLambda preserves this.nameToLambdaArgumentDeclarationMap in current master. Add test for this case (some kind of nested lambdas with reference to outer lambda argument)?

I guess the problem is with this Map<Integer, Type> map that is used to resolve FieldReferences (this information needs to be passed to innerExpressionAnalyzer). #9099 (comment) might work here. It's strange though that no test failed (lack of proper test?)

The right Scope is passed via context.getContext().expectingLambda(types) expression, so this is correct.

sopel39 · 2017-10-31T22:26:22Z

presto-main/src/main/java/com/facebook/presto/sql/planner/TranslationMap.java

-                                .orElseThrow(() -> new IllegalStateException("No symbol mapping for node " + node));
+                if (analysis.isColumnReference(node)) {
+                    Optional<ResolvedField> resolvedField = rewriteBase.getScope().tryResolveField(node);
+                    if (resolvedField.isPresent()) {


This should just be check state: checkState(resolvedField.isPresent()) as column references should be resolvable by corresponding scope

I think they don't need to be. The check was here before, and I think I tried to remove it.

It would be strange if the if (resolvedField.isPresent()) { is needed for tests to pass. If the expression is a column reference, then the scope here should be the one that resolved it during analysis. Subquery expressions should be translated already, and the lambda scope doesn't resolve column references at all.

If the scope here is different than the one that resolved column reference during analysis then it's potentially semantically incorrect. Such different scope might not resolve field, but it might also (incorrectly) resolve field to a wrong field index.

In any case, we need to investigate if this issue is introduced by this PR or was already in the code. It could be that the if (resolvedField.isPresent()) { is required and correct because it's just some edge case (e.g: part of ORDER BY scoping or some subquery is not supported and wasn't rewritten). Then we need to at least document it.

There is an edge case related to ORDER BY planning in QueryPlanner, so the check needs to stay.
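The two options weighed in this thread, a hard checkState versus tolerating an unresolved field, can be contrasted in a small sketch. It is not the TranslationMap code: `ColumnTranslationSketch` is hypothetical, the scope is modeled as a plain name-to-index map, and leaving the name untouched in the lenient variant is an assumption standing in for whatever handles the ORDER BY edge case.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; the scope is modeled as a map from name to field index.
final class ColumnTranslationSketch {
    static Optional<Integer> tryResolveField(Map<String, Integer> scope, String name) {
        return Optional.ofNullable(scope.get(name));
    }

    // Strict variant: a column reference that the scope cannot resolve is a bug.
    static String translateStrict(Map<String, Integer> scope, String name) {
        int index = tryResolveField(scope, name)
                .orElseThrow(() -> new IllegalStateException("Column reference not resolvable: " + name));
        return "field#" + index;
    }

    // Lenient variant: an unresolved reference is left for another mechanism.
    static String translateLenient(Map<String, Integer> scope, String name) {
        return tryResolveField(scope, name)
                .map(index -> "field#" + index)
                .orElse(name);
    }
}
```

The strict variant surfaces a scope mismatch immediately. The lenient variant keeps edge cases working, but by itself it cannot detect a scope that resolves the name to a wrong index.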

sopel39 · 2017-11-02T21:01:38Z

This will be merged again after the release is closed.

findepi · 2017-11-02T21:42:33Z

@sopel39, to facilitate (testing and) merging once the release is out, I created a new PR: #9269

facebook-github-bot added the CLA Signed label Oct 4, 2017

sopel39 reviewed Oct 4, 2017

findepi mentioned this pull request Oct 4, 2017

Use proper Scope to resolve lambda arguments in TranslationMap #9090

Closed

findepi mentioned this pull request Oct 4, 2017

Lambda name resolution should use Scope mechanism #7790

Closed

findepi assigned martint Oct 6, 2017

findepi requested a review from martint October 6, 2017 17:15

findepi mentioned this pull request Oct 11, 2017

Use Scopes to resolve lambda argument references in ExpressionAnalyzer #9026

Closed

findepi changed the title ~~Fix planning when lambda argument shadows relation column~~ Fix analysis/planning when lambda arguments clash with relation's columns Oct 11, 2017

sopel39 previously requested changes Oct 12, 2017

findepi added 5 commits October 31, 2017 01:41

Move ExpressionAnalyzer's Scope to Context

2d3f57f

Use Scope to resolve lambda arguments in ExpressionAnalyzer

9a9beef

Support lambda captures using dereference expressions

53e863d

Introduce Analysis.isColumnReference shorthand

9e223f2

Fix planning when lambda argument shadows relation column

e518c48

Fixes #9025

sopel39 reviewed Oct 31, 2017

sopel39 merged commit 9721731 into prestodb:master Nov 2, 2017

This was referenced Nov 2, 2017

Analysis error for table alias in lambda #9023

Closed

Lambda captures do not work with qualified column names #7784

Closed

Wrong query results when lambda scope shadows column name #9025

Closed

findepi mentioned this pull request Nov 2, 2017

Fix analysis/planning when lambda arguments clash with relation's columns (v2) #9269

Merged
