/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
Pig Change Log
Release 0.15.0 - Unreleased
INCOMPATIBLE CHANGES
IMPROVEMENTS
PIG-4560: Pig 0.15.0 Documentation (daijy)
PIG-4429: Add Pig alias information and Pig script to the DAG view in Tez UI (daijy)
PIG-3994: Implement getting backend exception for Tez (rohini)
PIG-4563: Upgrade to released Tez 0.7.0 (daijy)
PIG-4525: Clarify "Scalar has more than one row in the output." (Niels Basjes via gates)
PIG-4511: Add columns to prune from PluckTuple (jbabcock via cheolsoo)
PIG-4434: Improve auto-parallelism for tez (daijy)
PIG-4495: Better multi-query planning in case of multiple edges (rohini)
PIG-3294: Allow Pig to use Hive UDFs (daijy)
PIG-4476: Fix logging in AvroStorage* classes and SchemaTuple class (rdsr via rohini)
PIG-4458: Support UDFs in a FOREACH Before a Merge Join (wattsinabox via daijy)
PIG-4454: Upgrade tez to 0.6.0 (daijy)
PIG-4451: Log partition and predicate filter pushdown information and fix optimizer looping (rohini)
PIG-4430: Pig should support reading log4j.properties file from classpath as well (rdsr via daijy)
PIG-4407: Allow specifying a replication factor for jarcache (jira.shegalov via rohini)
PIG-4401: Add pattern matching to PluckTuple (cheolsoo)
PIG-2692: Make the Pig unit facilities more generalizable and update javadocs (razsapps via daijy)
PIG-4379: Make RoundRobinPartitioner public (daijy)
PIG-4378: Better way to fix tez local mode test hanging (daijy)
PIG-4358: Add test cases for utf8 chinese in Pig (nmaheshwari via daijy)
PIG-4370: HBaseStorage should support delete markers (bridiver via daijy)
PIG-4360: HBaseStorage should support setting the timestamp field (bridiver via daijy)
PIG-4337: Split Types and MultiQuery e2e tests into multiple groups (rohini)
PIG-4333: Split BigData tests into multiple groups (rohini)
BUG FIXES
PIG-4592: Pig 0.15 stopped working with Hadoop 1.x (daijy)
PIG-4580: Fix TestTezAutoParallelism.testSkewedJoinIncreaseParallelism test failure (daijy)
PIG-4571: TestPigRunner.testGetHadoopCounters fail on Windows (daijy)
PIG-4541: Skewed full outer join does not return records if any relation is empty. Outer join does not return any record if left relation is empty (daijy)
PIG-4564: Pig can deadlock in POPartialAgg if there is a bag (rohini via daijy)
PIG-4569: Fix e2e test Rank_1 failure (rohini)
PIG-4490: MIN/MAX builtin UDFs return wrong results when accumulating for strings (xplenty via rohini)
PIG-4418: NullPointerException in JVMReuseImpl (rohini)
PIG-4562: Typo in DataType.toDateTime (daijy)
PIG-4559: Fix several new tez e2e test failures (daijy)
PIG-4506: binstorage fails to write biginteger (ssavvides via daijy)
PIG-4556: Local mode is broken in some case by PIG-4247 (daijy)
PIG-4523: Tez engine should use tez config rather than mr config whenever possible (daijy)
PIG-4452: Embedded SQL using "SQL" instead of "sql" fails with string index out of range: -1 error (daijy)
PIG-4543: TestEvalPipelineLocal.testRankWithEmptyReduce fail on Hadoop 1 (daijy)
PIG-4544: Upgrade Hbase to 0.98.12 (daijy)
PIG-4481: e2e tests ComputeSpec_1, ComputeSpec_2 and StreamingPerformance_3 produce different result on Windows (daijy)
PIG-4496: Fix CBZip2InputStream to close underlying stream (petersla via daijy)
PIG-4528: Fix a typo in src/docs/src/documentation/content/xdocs/basic.xml (namusyaka via daijy)
PIG-4532: Pig Documentation contains typo for AvroStorage (fredericschmaljohann via daijy)
PIG-4377: Skewed outer join produce wrong result in some cases (daijy)
PIG-4538: Pig script fail with CNF in follow up MR job (daijy)
PIG-4537: Fix unit test failure introduced by TEZ-2392: TestCollectedGroup, TestLimitVariable, TestMapSideCogroup, etc (daijy)
PIG-4530: StackOverflow in TestMultiQueryLocal running under hadoop20 (nielsbasjes via rohini)
PIG-4529: Pig on tez hit counter limit imposed by MR (daijy)
PIG-4524: Pig Minicluster unit tests broken by TEZ-2333 (daijy)
PIG-4527: NON-ASCII Characters in Javadoc break 'ant docs' (nielsbasjes via daijy)
PIG-4494: Pig's htrace version conflicts with that of hadoop 2.6.0 (daijy)
PIG-4519: Correct link to Contribute page (gliptak via daijy)
PIG-4514: pig trunk compilation is broken - VertexManagerPluginContext.reconfigureVertex change (thejas)
PIG-4503: [Pig on Tez] NPE in UnionOptimizer with multiple levels of union (rohini)
PIG-4509: [Pig on Tez] Unassigned applications not killed on shutdown (rohini)
PIG-4508: [Pig on Tez] PigProcessor check for commit only on MROutput (rohini)
PIG-4505: [Pig on Tez] Auto adjust AM memory can hit OOM with 3.5GXmx (rohini)
PIG-4502: E2E tests build fail with udfs compile (nmaheshwari via daijy)
PIG-4498: AvroStorage in Piggybank does not handle bad records and fails (viraj via rohini)
PIG-4499: mvn-build miss tez classes in pig-h2.jar (daijy)
PIG-4488: Pig on tez mask tez.queue.name (daijy)
PIG-4497: [Pig on Tez] NPE for null scalar (rohini)
PIG-4493: Pig on Tez gives wrong results if Union is followed by Split (rohini)
PIG-4491: Streaming Python Bytearray Bugs (jeremykarn via daijy)
PIG-4487: Pig on Tez gives wrong success message on failure in case of multiple outputs (rohini)
PIG-4483: Pig on Tez output statistics shows storing to same directory twice for union (rohini)
PIG-4480: Pig script failure on Tez with split and order by due to missing sample collection (rohini)
PIG-4484: Ant pull jetty-6.1.26.zip on some platform (daijy)
PIG-4479: Pig script with union within nested splits followed by join failed on Tez (rohini)
PIG-4457: Error is thrown by JobStats.getOutputSize() when storing to a MySql table (rohini)
PIG-4475: Keys in AvroMapWrapper are not proper Pig types (rdsr via daijy)
PIG-4478: TestCSVExcelStorage fails with jdk8 (rohini)
PIG-4474: Increasing intermediate parallelism has issue with default parallelism (rohini)
PIG-4465: Pig streaming ship fails for relative paths on Tez (rohini)
PIG-4461: Use benchmarks for Windows Pig e2e tests (nmaheshwari via daijy)
PIG-4463: AvroMapWrapper still leaks Avro data types and AvroStorageDataConversionUtilities do not handle Pig maps (rdsr via daijy)
PIG-4460: TestBuiltIn testValueListOutputSchemaComplexType and testValueSetOutputSchemaComplexType tests create bags whose inner schema is not a tuple (erwaman via daijy)
PIG-4448: AvroMapWrapper leaks Avro data types when the map values are complex avro records (rdsr via daijy)
PIG-4453: Remove test-tez-local target (daijy)
PIG-4443: Write inputsplits in Tez to disk if the size is huge and option to compress pig input splits (rohini)
PIG-4447: Pig Cannot handle nullable values (arrays and records) in avro records (rdsr via daijy)
PIG-4444: Fix unit test failure TestTezAutoParallelism (daijy)
PIG-4445: VALUELIST and VALUESET outputSchema does not match actual schema of data returned when map value schema is complex (erwaman via daijy)
PIG-4442: Eliminate redundant RPC call to get file information in HPath (cnauroth via daijy)
PIG-4440: Some code samples in documentation use Unicode left/right single quotes, which cause a parse failure (cnauroth via daijy)
PIG-4264: Port TestAvroStorage to tez local mode (daijy)
PIG-4437: Fix tez unit test failure TestJoinSmoke, TestSkewedJoin (daijy)
PIG-4432: Built-in VALUELIST and VALUESET UDFs do not preserve the schema when the map value type is a complex type (erwaman via daijy)
PIG-4408: Merge join should support replicated join as a predecessor (bridiver via daijy)
PIG-4389: Flag to run selected test suites in e2e tests (daijy)
PIG-4385: testDefaultBootup fails because it cannot find "pig.properties" (mkudlej via daijy)
PIG-4397: CSVExcelStorage incorrect output if last field value is null (daijy)
PIG-4431: ReadToEndLoader does not close the record reader for the last input split (rdsr via daijy)
PIG-4426: RowNumber(simple) Rank not producing correct results (knoguchi)
PIG-4433: Loading bigdecimal in nested tuple does not work (kpriceyahoo via daijy)
PIG-4410: Fix testRankWithEmptyReduce in tez mode (daijy)
PIG-4392: RANK BY fails when default_parallel is greater than cardinality of field being ranked by (daijy)
PIG-4403: Combining -Dpig.additional.jars.uris with -useHCatalog breaks due to combination with colon instead of comma (ovlaere via daijy)
PIG-4402: JavaScript UDF example in the doc is broken (cheolsoo)
PIG-4394: Fix Split_9 and Union_5 e2e failures (rohini)
PIG-4391: Fix TestPigStats test failure (rohini)
PIG-4387: Honor yarn settings in tez-site.xml and optimize dag status fetch (rohini)
PIG-4352: Port local mode tests to Tez - TestUnionOnSchema (daijy)
PIG-4359: Port local mode tests to Tez - part4 (daijy)
PIG-4340: PigStorage fails parsing empty map (daijy)
PIG-4366: Port local mode tests to Tez - part5 (daijy)
PIG-4381: PIG grunt shell DEFINE commands fails when it spans multiple lines (daijy)
PIG-4384: TezLauncher thread should be daemon thread (zjffdu via daijy)
PIG-4376: NullPointerException accessing a field of an invalid bag from a nested foreach (kspringborn via daijy)
PIG-4355: Piggybank: XPath can't handle namespace in xpath, nor can it return more than one match (cavanaug via daijy)
PIG-4371: Duplicate snappy.version in libraries.properties (daijy)
PIG-4368: Port local mode tests to Tez - TestLoadStoreFuncLifeCycle (daijy)
PIG-4367: Port local mode tests to Tez - TestMultiQueryBasic (daijy)
PIG-4339: e2e test framework assumes default exectype as mapred (rohini)
PIG-2949: JsonLoader only reads arrays of objects (eyal via daijy)
PIG-4213: CSVExcelStorage not quoting texts containing \r (CR) when storing (alfonso.nishikawa via daijy)
PIG-2647: Split Combining drops splits with empty getLocations() (tmwoodruff via daijy)
PIG-4294: Enable unit test "TestNestedForeach" for spark (kellyzly via rohini)
PIG-4282: Enable unit test "TestForEachNestedPlan" for spark (kellyzly via rohini)
PIG-4361: Fix perl script problem in TestStreaming.java (kellyzly via xuefu)
PIG-4354: Port local mode tests to Tez - part3 (daijy)
PIG-4338: Fix test failures with JDK8 (rohini)
PIG-4351: TestPigRunner.simpleTest2 fail on trunk (daijy)
PIG-4350: Port local mode tests to Tez - part2 (daijy)
PIG-4326: AvroStorageSchemaConversionUtilities does not properly convert schema for maps of arrays of records (mprim via daijy)
PIG-4345: e2e test "RubyUDFs_13" fails because of the different result of "group a all" in different engines like "spark", "mapreduce" (kellyzly via rohini)
PIG-4332: Remove redundant jars packaged into pig-withouthadoop.jar for hadoop 2 (rohini)
PIG-4331: update README, '-x' option in usage to include tez (thejas via daijy)
PIG-4327: Schema of map with value that has an alias can't be parsed again (mprim via daijy)
PIG-4330: Regression test for PIG-3584 - AvroStorage does not correctly translate arrays of strings (brocknoland via daijy)
PIG-3615: Update the way that JsonLoader/JsonStorage deal with BigDecimal (tyro89 via daijy)
PIG-4329: Fetch optimization should be disabled when limit is not pushed up (lbendig via cheolsoo)
PIG-3413: JsonLoader fails the pig job in case of malformed json input (eyal via daijy)
PIG-4247: S3 properties are not picked up from core-site.xml in local mode (cheolsoo)
PIG-4242: For indented xmls with multiline content (e.g. wikipedia) XMLLoader cuts out the beginning of every line (holdfenytolvaj via daijy)
Release 0.14.1 - Unreleased
INCOMPATIBLE CHANGES
IMPROVEMENTS
BUG FIXES
PIG-4409: fs.defaultFS is overwritten in JobConf by replicated join at runtime (cheolsoo)
PIG-4404: LOAD with HBaseStorage on secure cluster is broken in Tez (rohini)
PIG-4375: ObjectCache should use ProcessorContext.getObjectRegistry() (rohini)
PIG-4334: PigProcessor does not set pig.datetime.default.tz (rohini)
PIG-4342: Pig 0.14 cannot identify the uppercase of DECLARE and DEFAULT (daijy)
Release 0.14.0
INCOMPATIBLE CHANGES
IMPROVEMENTS
PIG-4321: Documentation for 0.14 (daijy)
PIG-4328: Upgrade Hive to 0.14 (daijy)
PIG-4318: Make PigConfiguration naming consistent (rohini)
PIG-4316: Port TestHBaseStorage to tez local mode (rohini)
PIG-4224: Upload Tez payload history string to timeline server (daijy)
PIG-3977: Get TezStats working for Oozie (rohini)
PIG-3979: group all performance, garbage collection, and incremental aggregation (rohini)
PIG-4253: Add a UniqueID UDF (daijy)
PIG-4160: Provide a way to pass local jars in pig.additional.jars when using a remote url for a script (acoliver via daijy)
PIG-4246: HBaseStorage should implement getShipFiles (rohini)
PIG-3456: Reduce threadlocal conf access in backend for each record (rohini)
PIG-3861: duplicate jars get added to distributed cache (chitnis via rohini)
PIG-4039: New interface for resetting static variables for jvm reuse (rohini)
PIG-3870: STRSPLITTOBAG UDF (cryptoe via daijy)
PIG-4080: Add Preprocessor commands and more to the black/whitelisting feature (prkommireddi via daijy)
PIG-4162: Intermediate reducer parallelism in Tez should be higher (rohini)
PIG-4186: Fix e2e run against new build of pig and some enhancements (rohini)
PIG-3838: Organize tez code into subpackages (rohini)
PIG-4069: Limit reduce task should start as soon as one map task finishes (rohini)
PIG-4141: Ship UDF/LoadFunc/StoreFunc dependent jar automatically (daijy)
PIG-4146: Create a target to run mr and tez unit test in one shot (daijy)
PIG-4144: Make pigunit.PigTest work in tez mode (daijy)
PIG-4128: New logical optimizer rule: ConstantCalculator (daijy)
PIG-4124: Command for Python streaming udf should be configurable (cheolsoo)
PIG-4114: Add Native operator to tez (daijy)
PIG-4117: Implement merge cogroup in Tez (daijy)
PIG-4119: Add message at end of each testcase with timestamp in Pig system tests (nmaheshwari via daijy)
PIG-4008: Pig code change to enable Tez Local mode (airbots via daijy)
PIG-4091: Predicate pushdown for ORC (rohini via daijy)
PIG-4077: Some fixes and e2e test for OrcStorage (rohini)
PIG-4054: Do not create job.jar when submitting job (daijy)
PIG-4047: Break up pig withouthadoop and fat jar (daijy)
PIG-4062: Add ascending order option to builtin TOP function (raj171 via cheolsoo)
PIG-3558: ORC support for Pig (daijy)
PIG-2122: Parameter Substitution doesn't work in the Grunt shell (daijy)
PIG-4031: Provide Counter aggregation for Tez (daijy)
PIG-4028: add a flag to control the ivy resolve/retrieve output (gkesavan via daijy)
PIG-4015: Provide a way to disable auto-parallelism in tez (daijy)
PIG-3846: Implement automatic reducer parallelism (daijy)
PIG-3939: SPRINTF function to format strings using a printf-style template (mrflip via cheolsoo)
PIG-3970: Merge Tez branch into trunk (daijy)
OPTIMIZATIONS
BUG FIXES
PIG-4335: Pig release tarball miss tez classes (daijy)
PIG-4325: StackOverflow when spilling InternalCachedBag (daijy)
PIG-4324: Remove jsch-LICENSE.txt (daijy)
PIG-4267: ToDate has incorrect timezone offsets (bridiver via daijy)
PIG-4319: Make LoadPredicatePushdown InterfaceAudience.Private till PIG-4093 (rohini)
PIG-4312: TestStreamingUDF tez mode leave orphan process on Windows (daijy)
PIG-4314: BigData_5 hang on some machine (daijy)
PIG-4299: SpillableMemoryManager assumes tenured heap incorrectly (prkommireddi via daijy)
PIG-4298: Descending order-by is broken in some cases when key is bytearrays (cheolsoo)
PIG-4263: Move tez local mode unit tests to a separate target (daijy)
PIG-4257: Fix several e2e tests on secure cluster (daijy)
PIG-4261: Skip shipping local resources in tez local mode (daijy)
PIG-4182: e2e tests Scripting_[1-12] fail on Windows (daijy)
PIG-4259: Fix few issues related to Union, CROSS and auto parallelism in Tez (rohini)
PIG-4250: Fix Security Risks found by Coverity (daijy)
PIG-4258: Fix several e2e tests on Windows (daijy)
PIG-4256: Fix StreamingPythonUDFs e2e test failure on Windows (daijy)
PIG-4166: Collected group drops last record when combined with merge join (bridiver via daijy)
PIG-2495: Using merge JOIN from a HBaseStorage produces an error (bridiver via daijy)
PIG-4235: Fix unit test failures on Windows (daijy)
PIG-4245: 1-1 edge vertices should use same jvm opts (rohini)
PIG-4252: Tez container reuse fail when using script udf (daijy)
PIG-4241: Auto local mode mistakenly converts large jobs to local mode when using with Hive tables (cheolsoo)
PIG-4184: UDF backward compatibility issue after POStatus.STATUS_NULL refactoring (daijy)
PIG-4238: Property 'pig.job.converted.fetch' should be unset when fetch finishes (lbendig)
PIG-4151: Pig Cannot Write Empty Maps to HBase (daijy)
PIG-4181: Cannot launch tez e2e test on Windows (daijy)
PIG-2834: MultiStorage requires unused constructor argument (daijy)
PIG-4230: Documentation fix: first nested foreach example is incomplete (lbendig via daijy)
PIG-4199: Mapreduce ACLs should be translated to Tez ACLs (rohini)
PIG-4227: Streaming Python UDF handles bag outputs incorrectly (cheolsoo)
PIG-4219: When parsing a schema, pig drops tuple inside of Bag if it contains only one field (lbendig via daijy)
PIG-4226: Upgrade Tez to 0.5.1 (daijy)
PIG-4220: MapReduce-based Rank failing with NPE due to missing Counters (knoguchi)
PIG-3985: Multiquery execution of RANK with RANK BY causes NPE (rohini)
PIG-4218: Pig OrcStorage fail to load a map with null key (daijy)
PIG-4164: After a Pig job finishes, the Pig client spends too much time retrying to connect to the AM (daijy)
PIG-4212: Allow LIMIT of 0 for variableLimit (constant 0 is already allowed) (knoguchi)
PIG-4196: Auto ship udf jar is broken (daijy)
PIG-4214: Fix unit test fail TestMRJobStats (daijy)
PIG-4217: Fix documentation in BuildBloom (praveenr019 via daijy)
PIG-4215: Fix unit test failure TestParamSubPreproc and TestMacroExpansion (daijy)
PIG-4175: PIG CROSS operation follow by STORE produces non-deterministic results each run (daijy)
PIG-4202: Reset UDFContext state before OutputCommitter invocations in Tez (rohini)
PIG-4205: e2e test property-check does not check all prerequisites (kellyzly via daijy)
PIG-4180: e2e test Native_3 fail on Hadoop 2 (daijy)
PIG-4178: HCatDDL_[1-3] fail on Windows (daijy)
PIG-4046: PiggyBank DBStorage DATETIME should use setTimestamp with java.sql.Timestamp (sinchii via daijy)
PIG-4050: HadoopShims.getTaskReports() can cause OOM with Hadoop 2 (rohini)
PIG-4176: Fix tez e2e test Bloom_[1-3] (daijy)
PIG-4195: Support loading char/varchar data in OrcStorage (daijy)
PIG-4201: Native e2e tests fail when run against old version of pig (rohini)
PIG-4197: Fix typo in Job Stats header: MinMapTIme => MinMapTime (jmartell7 via daijy)
PIG-4194: ReadToEndLoader does not call setConf on pigSplit in initializeReader (shadanan via rohini)
PIG-4187: Fix Orc e2e tests (daijy)
PIG-4177: BigData_1 fail after PIG-4149 (daijy)
PIG-3507: Pig fails to run in local mode on a Kerberos enabled Hadoop cluster (kellyzly via rohini)
PIG-4171: Streaming UDF fails when direct fetch optimization is enabled (cheolsoo)
PIG-4170: Multiquery with different type of key gives wrong result (daijy)
PIG-4104: Accumulator UDF throws OOM in Tez (rohini)
PIG-4169: NPE in ConstantCalculator (cheolsoo)
PIG-4161: check for latest Hive snapshot dependencies (daijy)
PIG-4102: Adding e2e tests and several improvements for Orc predicate pushdown (daijy)
PIG-4156: [PATCH] fix NPE when running scripts stored on hdfs:// (acoliver via daijy)
PIG-4159: TestGroupConstParallelTez and TestJobSubmissionTez should be excluded in Hadoop 20 unit tests (cheolsoo)
PIG-4154: ScriptState#setScript(File) does not close resources (lars_francke via daijy)
PIG-4155: Quitting grunt shell using CTRL-D character throws exception (abhishek.agarwal via daijy)
PIG-4157: Pig compilation failure due to HIVE-7208 (daijy)
PIG-4158: TestAssert is broken in trunk (cheolsoo)
PIG-4143: Port more mini cluster tests to Tez - part 7 (daijy)
PIG-4149: Rounding issue in FindQuantiles (daijy)
PIG-4145: Port local mode tests to Tez - part1 (daijy)
PIG-4076: Fix pom file (daijy)
PIG-4140: VertexManagerEvent.getUserPayload returns ReadOnlyBuffer after TEZ-1449 (daijy)
PIG-4136: No special handling jythonjar/jrubyjar in e2e tests after PIG-4047 (daijy)
PIG-4137: Fix hadoopversion 23 compilation due to TEZ-1469 (daijy)
PIG-4135: Fetch optimization should be disabled if plan contains no limit (cheolsoo)
PIG-4061: Make Streaming UDF work in Tez (hotfix PIG-4061-3.patch)
PIG-4134: TEZ-1449 broke the build (knoguchi)
PIG-4132: TEZ-1246 and TEZ-1390 broke a build (knoguchi)
PIG-4129: Pig -Dhadoopversion=23 compile fail after TEZ-1426 (daijy)
PIG-4127: Build failure due to TEZ-1132 and TEZ-1416 (lbendig)
PIG-4125: TEZ-1347 broke the build
PIG-4123: Increase memory for TezMiniCluster (daijy)
PIG-4122: Fix hadoopversion 23 compilation due to TEZ-1194 (daijy)
PIG-4061: Make Streaming UDF work in Tez (daijy)
PIG-4118: Fix hadoopversion 23 compilation due to TEZ-1237/TEZ-1407 (daijy)
PIG-4109: register local jar fail on Windows when Pig script is remote (daijy)
PIG-4116: Update Pig doc about Hadoop 2 Streaming Python UDF support (cheolsoo)
PIG-4112: NPE in packager when union + group-by followed by replicated join in Tez (rohini via cheolsoo)
PIG-4113: TEZ-1386 breaks hadoop 2 compilation in trunk (cheolsoo)
PIG-4110: TEZ-1382 breaks Hadoop 2 compilation (cheolsoo)
PIG-4105: Fix TestAvroStorage with ibm jdk (fang fang chen via daijy)
PIG-4108: Pig -Dhadoopversion=23 compile fail after TEZ-1317 (daijy)
PIG-4086: Fix Orc e2e tests for tez (daijy)
PIG-4101: Lower tez.am.task.max.failed.attempts to 2 from 4 in Tez mini cluster (cheolsoo)
PIG-4099: "ant copypom" failed with "could not find file $PIG_HOME/ivy/pig.pom to copy" (fang fang chen via cheolsoo)
PIG-4098: Vertex Location Hint api update after TEZ-1041 (jeagles via cheolsoo)
PIG-4088: TEZ-1346 breaks hadoop 2 compilation in trunk (cheolsoo)
PIG-4089: TestMultiQuery.testMultiQueryJiraPig1169 fails in trunk after PIG-4079 in Hadoop 1 (cheolsoo)
PIG-4085: TEZ-1303 broke hadoop 2 compilation in trunk (cheolsoo)
PIG-4082: TEZ-1278 broke hadoop 2 compilation in trunk (cheolsoo)
PIG-4079: Parallel clause is not honored in local mode (cheolsoo)
PIG-4078: Port more mini cluster tests to Tez - part 6 (rohini)
PIG-4071: Fix TestStore.testSetStoreSchema, TestParamSubPreproc.testGruntWithParamSub, TestJobSubmission.testReducerNumEstimation (daijy)
PIG-4074: mapreduce.client.submit.file.replication is not honored in cached files (cheolsoo)
PIG-4052: TestJobControlSleep, TestInvokerSpeed are unreliable (daijy)
PIG-4053: TestMRCompiler succeeded with sun jdk 1.6 while failed with sun jdk 1.7 (daijy)
PIG-3982: ant target test-tez should depend on jackson-pig-3039-test-download (daijy)
PIG-4064: Fix tez auto parallelism test failures (daijy)
PIG-4075: TEZ-1311 broke Hadoop2 compilation (cheolsoo)
PIG-4070: Change from TezJobConfig to TezRuntimeConfiguration (rohini)
PIG-4068: ObjectCache causes ClassCastException (cheolsoo)
PIG-4067: TestAllLoader in piggybank fails with new hive version (rohini)
PIG-4065: Fix failing unit tests in Tez (rohini)
PIG-4060: Refactor TezJob and TezLauncher (cheolsoo)
PIG-2689: JsonStorage fails to find schema when LimitAdjuster runs (rohini)
PIG-4056: Remove PhysicalOperator.setAlias (rohini)
PIG-4058: Use single config in Tez for input and output (rohini)
PIG-3886: UdfDistributedCache_1 fails in tez branch (cheolsoo)
PIG-4055: Build broke after TEZ-1130 API rename (knoguchi)
PIG-3935: Port more mini cluster tests to Tez - part 5 (rohini)
PIG-3984: PigServer.shutdown removes the tez resource folder (daijy via rohini)
PIG-4048: TEZ-692 has an incompatible API change removing TezSession (rohini)
PIG-4044: Pig should use avro-mapred-hadoop2.jar instead of avro-mapred.jar when compile with hadoop 2 (daijy)
PIG-4043: JobClient.getMap/ReduceTaskReports() causes OOM for jobs with a large number of tasks (cheolsoo)
PIG-4036: Fix e2e failures - JobManagement_3, CmdErrors_3 and BigData_4 (daijy)
PIG-4041: org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper compiling error (jeagles via cheolsoo)
PIG-4038: SPRINTF should return NULL on any NULL input (mrflip via daijy)
PIG-4025: TestLoadFuncWrapper, TestLoadFuncMetaDataWrapper, TestStoreFuncWrapper and TestStoreFuncMetadataWrapper fail on IBM JDK (ahireanup via daijy)
PIG-4024: TestPigStreamingUDF and TestPigStreaming fail on IBM JDK (ahireanup via daijy)
PIG-4023: BigDec/Int sort is broken (ahireanup via daijy)
PIG-4003: Error is thrown by JobStats.getOutputSize() when storing to a Hive table (cheolsoo)
PIG-4035: Fix CollectedGroup e2e tests for tez (daijy)
PIG-4034: Exclude TestTezAutoParallelism when -Dhadoopversion=20 (cheolsoo)
PIG-4033: Fix MergeSparseJoin e2e tests on tez (daijy)
PIG-3478: Make StreamingUDF work for Hadoop 2 (lbendig via daijy)
PIG-4032: BloomFilter fails with s3 path in Hadoop 2.4 (cheolsoo)
PIG-4018: Schema validation fails with UNION ONSCHEMA (daijy)
PIG-4022: Fix tez e2e test SkewedJoin_6 (daijy)
PIG-4001: POPartialAgg aggregates too aggressively when multiple values aggregated (tmwoodruff via cheolsoo)
PIG-4027: Always check for latest Tez snapshot dependencies (lbendig via cheolsoo)
PIG-4020: Fix tez e2e tests MapPartialAgg_[2-4], StreamingPerformance_[6-7] (daijy)
PIG-4019: Compilation broken after TEZ-1169 (daijy)
PIG-4014: Fix Rank e2e test failures on tez (daijy)
PIG-4013: Order by multiple column fail on Tez (daijy)
PIG-3983: TestGrunt.testKeepGoigFailed fail on tez mode (daijy)
PIG-3959: Skewed join followed by replicated join fails in Tez (cheolsoo)
PIG-3995: Tez unit tests shouldn't run when -Dhadoopversion=20 (cheolsoo)
PIG-3986: PigSplit to support multiple split class (tongjie via cheolsoo)
PIG-3988: PigStorage: CommandLineParser is not thread safe (tmwoodruff via cheolsoo)
PIG-2409: Pig show wrong tracking URL for hadoop 2 (lbendig via rohini)
PIG-3978: Container reuse does not work across PigServer (daijy)
PIG-3974: E2E test data generation fails in cluster mode (lbendig via cheolsoo)
PIG-3969: Javascript UDF fails if no output schema is defined (lbendig via cheolsoo)
PIG-3971: Pig on tez fails to run in Oozie in secure cluster (rohini)
PIG-3968: OperatorPlan.serialVersionUID is not defined (daijy)
Release 0.13.1 - Unreleased
INCOMPATIBLE CHANGES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
PIG-4139: pig query throws error java.lang.NoSuchFieldException: jobsInProgress on MRv1 (satish via cheolsoo)
PIG-4133: Need to update the default $HCAT_HOME dir in the PIG script (mnarayan via cheolsoo)
PIG-4106: Describe shouldn't trigger execution in batch mode (cheolsoo)
Release 0.13.0
INCOMPATIBLE CHANGES
PIG-3996: Delete zebra from svn (cheolsoo)
PIG-3898: Refactor PPNL for non-MR execution engine (cheolsoo)
PIG-3485: Remove CastUtils.bytesToMap(byte[] b) method from LoadCaster interface (cheolsoo)
PIG-3419: Pluggable Execution Engine (achalsoni81 via cheolsoo)
PIG-2207: Support custom counters for aggregating warnings from different udfs (aniket486)
IMPROVEMENTS
PIG-3892: Pig distribution for hadoop 2 (daijy)
PIG-4006: Make the interval of DAGStatus report configurable (cheolsoo)
PIG-3999: Document PIG-3388 (lbendig via cheolsoo)
PIG-3954: Document use of user level jar cache (aniket486)
PIG-3752: Fix e2e Parallel test for Windows (daijy)
PIG-3966: Document variable input arguments of UDFs (lbendig via aniket486)
PIG-3963: Documentation for BagToString UDF (mrflip via daijy)
PIG-3929: pig.temp.dir should allow to substitute vars as hadoop configuration does (aniket486)
PIG-3913: Pig should use job's jobClient wherever possible (fixes local mode counters) (aniket486)
PIG-3941: Piggybank's Over UDF returns an output schema with named fields (mrflip via cheolsoo)
PIG-3545: Separate validation rules from optimizer (daijy)
PIG-3745: Document auto local mode for pig (aniket486)
PIG-3932: Document ROUND_TO builtin UDF (mrflip via cheolsoo)
PIG-3926: ROUND_TO function: rounds double/float to fixed number of decimal places (mrflip via cheolsoo)
PIG-3901: Organize the Pig properties file and document all properties (mrflip via cheolsoo)
PIG-3867: Add hadoop home to build classpath for building pig with unit tests on Windows (Sergey Svinarchuk via gates)
PIG-3914: Change TaskContext to abstract class (cheolsoo)
PIG-3672: Pig should not check for hardcoded file system implementations (rohini)
PIG-3860: Refactor PigStatusReporter and PigLogger for non-MR execution engine (cheolsoo)
PIG-3865: Remodel the XMLLoader to be faster and more maintainable (aseldawy via daijy)
PIG-3737: Bundle dependent jars in distribution in %PIG_HOME%/lib folder (daijy)
PIG-3771: Piggybank Avrostorage makes a lot of namenode calls in the backend (rohini)
PIG-3851: Upgrade jline to 2.11 (daijy)
PIG-3884: Move multi store counters to PigStatsUtil from MRPigStatsUtil (rohini)
PIG-3591: Refactor POPackage to separate MR specific code from packaging (mwagner via cheolsoo)
PIG-3449: Move JobCreationException to org.apache.pig.backend.hadoop.executionengine (cheolsoo)
PIG-3765: Ability to disable Pig commands and operators (prkommireddi)
PIG-3731: Ability to specify local-mode specific configuration (useful for local/auto-local mode) (aniket486)
PIG-3793: Provide info on number of LogicalRelationalOperator(s) used in the script through LogicalPlanData (prkommireddi)
PIG-3778: Log list of running jobs along with progress (rohini)
PIG-3675: Documentation for AccumuloStorage (elserj via daijy)
PIG-3648: Make the sample size for RandomSampleLoader configurable (cheolsoo)
PIG-259: allow store to overwrite existing directory (nezihyigitbasi via daijy)
PIG-2672: Optimize the use of DistributedCache (aniket486)
PIG-3238: Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters
and inserts another set of characters at a specified starting point (nezihyigitbasi via daijy)
PIG-3299: Provide support for LazyOutputFormat to avoid creating empty files (lbendig via daijy)
PIG-3642: Direct HDFS access for small jobs (fetch) (lbendig via cheolsoo)
PIG-3730: Performance issue in SelfSpillBag (rajesh.balamohan via rohini)
PIG-3654: Add class cache to PigContext (tmwoodruff via daijy)
PIG-3463: Pig should use hadoop local mode for small jobs (aniket486)
PIG-3573: Provide StoreFunc and LoadFunc for Accumulo (elserj via daijy)
PIG-3653: Add support for pre-deployed jars (tmwoodruff via daijy)
PIG-3645: Move FileLocalizer.setR() calls to unit tests (cheolsoo)
PIG-3637: PigCombiner creating log spam (rohini)
PIG-3632: Add option to configure cacheBlocks in HBaseStorage (rohini)
PIG-3619: Provide XPath function (Saad Patel via gates)
PIG-3590: remove PartitionFilterOptimizer from trunk (aniket486)
PIG-3580: MIN, MAX and AVG functions for BigDecimal and BigInteger (harichinnan via cheolsoo)
PIG-3569: SUM function for BigDecimal and BigInteger (harichinnan via rohini)
PIG-3505: Make AvroStorage sync interval take default from io.file.buffer.size (rohini)
PIG-3563: support adding archives to the distributed cache (jdonofrio via cheolsoo)
PIG-3388: No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage (lbendig via cheolsoo)
PIG-3522: Remove shock from pig (daijy)
PIG-3295: Casting from bytearray failing after Union even when each field is from a single Loader (knoguchi)
PIG-3444: CONCAT with 2+ input parameters fails (lbendig via daijy)
PIG-3117: A debug mode in which pig does not delete temporary files (ihadanny via cheolsoo)
PIG-3484: Make the size of pig.script property configurable (cheolsoo)
OPTIMIZATIONS
PIG-3882: Multiquery off mode execution is not done in batch and very inefficient (rohini)
BUG FIXES
PIG-4037: TestHBaseStorage, TestAccumuloPigCluster have failures with hadoopversion=23 (daijy)
PIG-4005: depend on hbase-hadoop2-compat rather than hbase-hadoop1-compat when hbaseversion is 95 (daijy)
PIG-4021: Fix TestHBaseStorage failure after auto local mode change (PIG-3463) (daijy)
PIG-4029: TestMRCompiler is broken after PIG-3874 (daijy)
PIG-4030: TestGrunt, TestPigRunner fail after PIG-3892 (daijy)
PIG-3975: Multiple Scalar reference calls leading to missing records (knoguchi via rohini)
PIG-4017: NPE thrown from JobControlCompiler.shipToHdfs (cheolsoo)
PIG-3997: Issue on Pig docs: Testing and Diagnostics (zjffdu via cheolsoo)
PIG-3998: Documentation fix: invalid page links, wrong Groovy udf example (lbendig via cheolsoo)
PIG-4000: Minor documentation fix for PIG-3642 (lbendig via cheolsoo)
PIG-3991: TestErrorHandling.tesNegative7 is broken in trunk/branch-0.13 (cheolsoo)
PIG-3990: ant docs is broken in trunk/branch-0.13 (cheolsoo)
PIG-3989: PIG_OPTS does not work with some versions of Hadoop (daijy)
PIG-3739: The Warning_4 e2e test is broken in trunk (aniket486)
PIG-3976: Typo correction in JobStats breaks Oozie (rohini)
PIG-3874: FileLocalizer temp path can sometimes be non-unique (chitnis via cheolsoo)
PIG-3967: Grunt fails if we run more statements after the first store (daijy)
PIG-3915: MapReduce queries in Pigmix output different results than Pig's (keren3000 via daijy)
PIG-3955: Remove url.openStream() file descriptor leak from JCC (aniket486)
PIG-3958: TestMRJobStats is broken in 0.13 and trunk (aniket486)
PIG-3949: HiveColumnarStorage compile failure with Hive 0.14.0 (daijy)
PIG-3960: Compile fails against Hadoop 2.4.0 after PIG-3913 (daijy)
PIG-3956: UDF profile is often misleading (cheolsoo)
PIG-3950: Removing empty file PColFilterExtractor.java speeds up rebuilds (mrflip via cheolsoo)
PIG-3940: NullPointerException writing .pig_header for field with null name in JsonMetadata.java (mrflip via cheolsoo)
PIG-3944: PigNullableWritable toString method throws NPE on null value (mauzhang via cheolsoo)
PIG-3936: DBStorage fails on storing nulls for non varchar columns (jeremykarn via cheolsoo)
PIG-3945: Ant not sending hadoopversion to piggybank sub-ant (mrflip via cheolsoo)
PIG-3942: Util.buildPp() is incompatible with Non-MR execution engine (cheolsoo)
PIG-3902: PigServer creates cycle (thedatachef via cheolsoo)
PIG-3930: "java.io.IOException: Cannot initialize Cluster" in local mode with hadoopversion=23 dependencies (jira.shegalov via cheolsoo)
PIG-3921: Obsolete entries in piggybank javadoc build script (mrflip via cheolsoo)
PIG-3923: Gitignore file should ignore all generated artifacts (mrflip via cheolsoo)
PIG-3922: Increase Forrest heap size to avoid OutOfMemoryError building docs (mrflip via cheolsoo)
PIG-3916: isEmpty should not be early terminating (rohini)
PIG-3859: auto local mode should not modify reducer configuration (aniket486)
PIG-3909: Type Casting issue (daijy)
PIG-3905: 0.12.1 release can't be built for Hadoop2 (daijy)
PIG-3894: Datetime functions AddDuration, SubtractDuration and all Between functions don't check for null values in the input tuple (jennythompson via cheolsoo)
PIG-3889: Direct fetch doesn't set job submission timestamps (cheolsoo)
PIG-3895: Pigmix run script has compilation error (rohini)
PIG-3885: AccumuloStorage incompatible with Accumulo 1.6.0 (elserj via daijy)
PIG-3888: Direct fetch doesn't differentiate between frontend and backend sides (lbendig via daijy)
PIG-3887: TestMRJobStats is broken in trunk (cheolsoo)
PIG-3868: Fix Iterator_1 e2e test on windows (ssvinarchukhorton via rohini)
PIG-3871: Replace org.python.google.* with com.google.* in imports (cheolsoo)
PIG-3858: PigLogger/PigStatusReporter is not set for fetch tasks (lbendig via cheolsoo)
PIG-3798: Registered jars in pig script are appended to the classpath multiple times (cheolsoo)
PIG-3844: Make ScriptState InheritableThreadLocal for threads that need it (amatsukawa via cheolsoo)
PIG-3837: ant pigperf target is broken in trunk (cheolsoo)
PIG-3836: Pig signature has guava version dependency (amatsukawa via cheolsoo)
PIG-3832: Fix piggybank test compilation failure after PIG-3449 (rohini)
PIG-3807: Pig creates wrong schema after dereferencing nested tuple fields with sorts (daijy)
PIG-3802: Fix TestBlackAndWhitelistValidator failures (prkommireddi)