
Upgrade to 0.161 #63

Merged
218 commits merged into twitter-forks:twitter-master on Jan 6, 2017
Conversation


@dabaitu dabaitu commented Jan 5, 2017

No description provided.

martint and others added 30 commits November 10, 2016 19:10
flushCache only makes sense in CachingHiveMetastore.
The original purpose of this function was to provide an exception-free
alternative to array and map subscript operators.

This change makes the array version consistent with the function that
operates on maps.
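The consistent behavior can be sketched with a couple of queries (an illustrative sketch, not taken from the PR itself):

```sql
-- The subscript operator fails for out-of-range indices:
SELECT ARRAY[1, 2, 3][5];                            -- error: out of bounds
-- element_at returns NULL instead, now for arrays as well as maps:
SELECT element_at(ARRAY[1, 2, 3], 5);                -- NULL
SELECT element_at(MAP(ARRAY['a'], ARRAY[1]), 'b');   -- NULL
```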
Converting the type to uppercase breaks type equality for row types as
the field names get uppercased.
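For example (a hedged sketch, not from the commit itself), field-name case participates in row type equality, so a declared type must keep its lowercase field names:

```sql
-- Uppercasing the declared type would yield ROW(X INT, Y VARCHAR),
-- which is not equal to ROW(x INT, y VARCHAR):
SELECT CAST(ROW(1, 'a') AS ROW(x INT, y VARCHAR));
```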
This allows Presto to start regardless of host resolution at startup. Previously, Presto would fail to start when any entry in cassandra.contact-points was a hostname (rather than an IP address) that could not be resolved.

This change postpones host resolution until the first query.
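An illustrative catalog configuration that benefits from this change (the hostnames are hypothetical):

```properties
# etc/catalog/cassandra.properties
connector.name=cassandra
# These hostnames are no longer resolved at server startup;
# resolution is deferred until the first query against the catalog.
cassandra.contact-points=cassandra-1.example.com,cassandra-2.example.com
```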
Rename symbols to match the actual columns instead of using the
alphabet. Alphabetic names are error prone and make it hard to merge
patches that add new columns.
Symbol unaliasing for ExchangeNode canonicalizes
symbols that are aliased in source nodes.

Plan after optimization:
presto:default> explain SELECT c.custkey FROM customer c, orders o WHERE c.custkey = o.custkey AND o.orderdate >= DATE '1994-01-01';
                                                                                  Query Plan
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 - Output[custkey] => [custkey:bigint]
     - RemoteExchange[GATHER] => custkey:bigint
         - Project => [custkey:bigint]
             - InnerJoin[("custkey" = "custkey_0")] => [custkey:bigint, $hashvalue:bigint, custkey_0:bigint, $hashvalue_15:bigint]
                 - RemoteExchange[REPARTITION] => custkey:bigint, $hashvalue:bigint
                     - Project => [custkey:bigint, $hashvalue_14:bigint]
                             $hashvalue_14 := "combine_hash"(BIGINT '0', COALESCE("$operator$hash_code"("custkey"), 0))
                         - TableScan[hive:hive:default:customer, originalConstraint = true] => [custkey:bigint]
                                 LAYOUT: hive
                                 custkey := HiveColumnHandle{clientId=hive, name=custkey, hiveType=bigint, hiveColumnIndex=0, columnType=REGULAR}
                 - RemoteExchange[REPARTITION] => custkey_0:bigint, $hashvalue_15:bigint
                     - Project => [$hashvalue_16:bigint, custkey_0:bigint]
                             $hashvalue_16 := "combine_hash"(BIGINT '0', COALESCE("$operator$hash_code"("custkey_0"), 0))
                         - Filter[("orderdate" >= "$literal$date"(BIGINT '8766'))] => [custkey_0:bigint, orderdate:date]
                             - TableScan[hive:hive:default:orders, originalConstraint = ("orderdate" >= "$literal$date"(BIGINT '8766'))] => [custkey_0:bigint, orderdate:date]
                                     LAYOUT: hive
                                     custkey_0 := HiveColumnHandle{clientId=hive, name=custkey, hiveType=bigint, hiveColumnIndex=1, columnType=REGULAR}
                                     orderdate := HiveColumnHandle{clientId=hive, name=orderdate, hiveType=date, hiveColumnIndex=4, columnType=REGULAR}

Plan before optimization:
presto:default> explain SELECT c.custkey FROM customer c, orders o WHERE c.custkey = o.custkey AND o.orderdate >= DATE '1994-01-01';
                                                                                    Query Plan
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 - Output[custkey] => [custkey:bigint]
     - RemoteExchange[GATHER] => custkey:bigint
         - Project => [custkey:bigint]
             - InnerJoin[("custkey_8" = "custkey_9")] => [custkey:bigint, custkey_8:bigint, $hashvalue:bigint, custkey_0:bigint, custkey_9:bigint, $hashvalue_16:bigint]
                 - Project => [custkey:bigint, custkey_8:bigint, $hashvalue:bigint]
                     - RemoteExchange[REPARTITION] => custkey:bigint, custkey_8:bigint, $hashvalue:bigint, $hashvalue_14:bigint
                         - Project => [custkey:bigint, $hashvalue_15:bigint]
                                 $hashvalue_15 := "combine_hash"(BIGINT '0', COALESCE("$operator$hash_code"("custkey"), 0))
                             - TableScan[hive:hive:default:customer, originalConstraint = true] => [custkey:bigint]
                                     LAYOUT: hive
                                     custkey := HiveColumnHandle{clientId=hive, name=custkey, hiveType=bigint, hiveColumnIndex=0, columnType=REGULAR}
                 - Project => [custkey_0:bigint, custkey_9:bigint, $hashvalue_16:bigint]
                     - RemoteExchange[REPARTITION] => custkey_0:bigint, custkey_9:bigint, $hashvalue_16:bigint, $hashvalue_17:bigint
                         - Project => [$hashvalue_18:bigint, custkey_0:bigint]
                                 $hashvalue_18 := "combine_hash"(BIGINT '0', COALESCE("$operator$hash_code"("custkey_0"), 0))
                             - Filter[("orderdate" >= "$literal$date"(BIGINT '8766'))] => [custkey_0:bigint, orderdate:date]
                                 - TableScan[hive:hive:default:orders, originalConstraint = ("orderdate" >= "$literal$date"(BIGINT '8766'))] => [custkey_0:bigint, orderdate:date]
                                         LAYOUT: hive
                                         custkey_0 := HiveColumnHandle{clientId=hive, name=custkey, hiveType=bigint, hiveColumnIndex=1, columnType=REGULAR}
                                         orderdate := HiveColumnHandle{clientId=hive, name=orderdate, hiveType=date, hiveColumnIndex=4, columnType=REGULAR}
This reduces the number of unique symbols in the query plan
and allows other optimizations to be applied
(e.g., running multiple joins that operate on the same
partitions, after canonicalization, in the same stage).
The PredicatePushdown optimizer created unnecessary symbols for join
clauses.
Raghav Sethi and others added 27 commits December 12, 2016 12:45
This allows trivial queries to run even when the node is "out of
memory".
Previously they were uploaded from the 'PRODUCT_TESTS' job.
This allows us to keep the logs for restarted jobs.
We recently made a change to the column resolution rules for ORDER BY to
make them compliant with ANSI SQL. In order to ease the transition from
the old semantics, we now add a config option and session property that
controls the behavior.

The session property is "legacy_order_by". The config option is
"deprecated.legacy-order-by".
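For example, the legacy behavior can be re-enabled per session (a sketch based on the property names stated above):

```sql
SET SESSION legacy_order_by = true;
```

or cluster-wide by setting deprecated.legacy-order-by=true in the server's config.properties.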
The arguments were in the wrong order, so the value from
FeaturesConfig was not being used to control the default value.
A recent commit (ec2e897) changed
the way ORDER BY expressions are handled in a way that causes certain
expressions not to be "analyzed" and their types not to be recorded in the
Analysis object.

extractAggregates() looks for aggregations in node.getOrderBy()
and records them for later use by the planner. The new ORDER BY
analyzer processes the rewritten expressions, which have a different
object identity. As a result, the aggregates don't have associated
type and implicit coercion information for the planner to use.

This change makes it so that the aggregates are extracted from
the rewritten expressions.
Due to ec2e897, when analysis fails for certain
expressions, the error is misreported as happening in the SELECT clause
instead of in the ORDER BY clause. This is because the analyzer processes
the rewritten expressions, which contain inlined SELECT expressions and their
original locations.

This change fixes the issue by analyzing the original unmodified expressions
with a synthetic scope built from the output of the SELECT clause
that can delegate resolution to the source scope for missing names (essentially,
it implements the resolution rules per the SQL spec).

One side effect of this change is that queries whose ORDER BY clause references
columns that appear multiple times in the SELECT clause are now considered
invalid due to ambiguous references -- this matches the expected behavior
according to the ANSI spec.
This query shape is no longer valid due to ambiguous
column references.
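An illustrative example of such a query shape, using the tables from the plans above (the aliases are hypothetical):

```sql
-- Previously accepted; now rejected because "k" is ambiguous:
SELECT orderkey AS k, custkey AS k FROM orders ORDER BY k;
-- Fix: use distinct aliases (or ordinal positions):
SELECT orderkey AS k1, custkey AS k2 FROM orders ORDER BY k1;
```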
@billonahill (Collaborator) commented:

👍

@dabaitu dabaitu merged commit 0cf760d into twitter-forks:twitter-master Jan 6, 2017
Yaliang added a commit to Yaliang/presto that referenced this pull request Feb 3, 2017