upgrade to 0.157 #57

Merged
merged 1,219 commits
Dec 2, 2016

Conversation


@dabaitu dabaitu commented Dec 1, 2016

upgrade to 0.157 part 2 of 3

part 1 - remove old twitter event scriber impl
part 2 - upgrade to oss 0.157
part 3 - add new twitter event scriber

martint and others added 30 commits October 16, 2016 11:17
When one side of a join has an effective predicate expression in terms of the field
used in the join criteria (e.g., v = f(k1), with a join criteria of k1 = k2), and
that expression can produce null on non-null input (e.g., nullif, case, if, most of
the array/map functions, etc.), queries can produce incorrect results.

In that scenario, predicate pushdown derives another join condition v = f(k2). Since
f() can produce null on non-null input, it is possible that for some value of k1 that
is equal to k2, f(k2) or f(k1) is null. This causes the join criteria to
evaluate to null instead of true.

A correct derivation, although less useful for predicate pushdown, would be

    k1 = k2 AND ((f(k1) IS NULL AND f(k2) IS NULL) OR f(k1) = f(k2)).

This change prevents the equality inference logic from considering expressions
that may return null on non-null input.
CAST(JSON 'null' AS ...) will also return null
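
For illustration, a minimal SQL sketch of the problematic shape (the tables lhs(k1) and rhs(k2) are hypothetical, and nullif stands in for f):

    -- nullif(x, 0) returns NULL for the non-NULL input 0, so it is one of the
    -- expressions the equality inference logic must now ignore.
    SELECT *
    FROM (SELECT k1, nullif(k1, 0) AS v FROM lhs) l
    JOIN rhs r ON l.k1 = r.k2;

    -- The unsafe derivation adds the join condition v = nullif(k2, 0). For rows
    -- where k1 = k2 = 0, that condition evaluates to NULL rather than TRUE, so
    -- rows the original join would have kept are dropped.
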
This version avoids allocating arrays that are beyond the JVM limit.
Currently only the ordering column is being printed
in the Explain plan output for Window nodes. It is
also desirable to know what ordering is used for each
of those columns.
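
For example, in an illustrative query against the built-in TPC-H connector, the Window node orders by two columns in different directions, so printing only the column names loses information:

    EXPLAIN
    SELECT orderkey,
           rank() OVER (PARTITION BY orderstatus
                        ORDER BY totalprice DESC, orderdate ASC) AS rnk
    FROM tpch.tiny.orders;
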
Other queries could time out because they were abandoned, causing the test
to fail.
Rename fields in ORC dictionary reader to make it clear if the field is
used for the stripe dictionary or row group dictionary.
Always create dictionary blocks in DRWF for columns using a row group
dictionary. This prevents expansion of the dictionary which can create
a very large block.
Simplify the materialization of connectors in ConnectorManager.
Acquire transaction handle in SystemConnector lazily to avoid accessing
the transaction manager during begin transaction.
ArturGajowy and others added 27 commits November 9, 2016 08:12
This fixes a regression from the previous commit.
Test and test utility methods were declared to throw Exception even though
no exception could be thrown.
It is odd that a method receives an already rewritten node (rewrittenNode).
Adding a test for b19d3df
("Fix base for counter in AssignUniqueIdOperator"). Without the mentioned
commit, the added test fails.
This is a rewrite of the partial aggregation pushdown
optimizer to make the code easier to follow and reason
about.

The approach is as follows:
1. Determine whether the optimization is applicable.
   At a minimum, there must be an aggregation on top
   of an exchange.
2. If the aggregation is SINGLE, split it into a FINAL
   on top of a PARTIAL and reprocess the resulting plan.
3. If the aggregation is a PARTIAL, push it underneath
   each branch of the exchange.

We use a couple of tricks to avoid having to juggle
and rename field names as the nodes are rewired:

1. When pushing the partial aggregation through the exchange,
   the names of the outputs of the aggregation are preserved.
2. If the input->output mappings in the exchange are not
   simple identity projections without rename, we introduce
   a projection under the partial aggregation. This helps
   avoid having to rewrite all the aggregation functions
   to refer to new names.

It also fixes a planning issue under certain scenarios
involving aggregation subqueries and partitioned tables.

E.g.,

    SELECT *
    FROM (
        SELECT count(*)
        FROM tpch.tiny.orders
        HAVING count(DISTINCT custkey) > 1
    )
    CROSS JOIN t

where "t" is a partitioned Hive table.
82620d9 caused a regression
when scheduling non-remotely accessible splits, bucketed splits, or
splits when network-aware scheduling was used.
Now that we use a low watermark to trigger scheduling, we don't want to
reserve too much space for splits with network affinity; otherwise the
scheduler may have to run too frequently when splits have little to no affinity.
These connectors use non-canonical types for varchar columns
in TPC-H, so the output doesn't match. Disable the tests for now.
Don't wait for deletion executor if there are no rows to delete.
@billonahill
Collaborator

👍 assuming all tests pass.

@dabaitu
Author

dabaitu commented Dec 2, 2016

all tests pass

@dabaitu dabaitu merged commit 16db1d7 into twitter-forks:twitter-master Dec 2, 2016