forked from snowplow/snowplow-rdb-loader
Version 5.0.0 (2022-10-18)
--------------------------
GCP variant of Snowflake Loader and Stream Transformer (#1091)
Parquet with different schemes fail in databricks loader (#1085)
Loader: add telemetry (#617)
Transformer Kinesis: recover from IllegalArgumentException when checkpointing near end of shard (#1088)
Version 4.3.0 (2022-09-23)
--------------------------
Loader: fix inserting timestamps with wrong timezone to manifest table (#1069)
Snowflake Loader: prioritize transformedStage over loadAuthMethod in config (#1061)
Redshift loader: Pre-transaction migrations did not run (#1051)
Redshift Loader: Failed migration for event batches with multiple versions of the same upgraded schema (#1058)
Redshift Loader: Column resizing committed for failed migrations, causing errors during retries (#1057)
Improve stacktraces on JDBC exceptions (#1045)
Version 4.2.2 (2022-09-06)
--------------------------
Loader: feature flag to disable adding load_tstamp column (#1041)
Loader: catch and ignore errors when adding load_tstamp column (#1039)
Loader: improvements to initialization logs and monitoring (#1040)
Databricks loader: make catalog prepended to queries optional (#992)
Version 4.2.1 (2022-07-25)
--------------------------
Snowflake Loader: fix loading path used with temp creds auth method (#1002)
Snowflake Loader: drop temp table in the first step of the load operation (#1001)
Version 4.2.0 (2022-07-19)
--------------------------
Snowflake loader: make on_error continue when type of the incoming data is parquet (#970)
Loader: make the part appended to folder monitoring staging path configurable (#969)
Snowflake loader: make the path used with stage adjustable (#968)
Loader: retry on target initialization (#964)
Snowflake loader: use STS tokens for copying from S3 (#955)
Transformer kinesis: Recover from losing lease to a new worker (#962)
Databricks loader: Generate STS tokens for copying from S3 (#954)
Snowflake loader: Specify file format in the load statement (#957)
Loader: Trim alert message payloads to 4096 characters (#956)
Version 4.1.0 (2022-07-04)
--------------------------
Databricks loader: Support for generated columns (#951)
Loader: Use explicit schema name everywhere (#952)
Loader: Jars cannot load jsch (#942)
Snowflake loader: region and account configuration fields should be optional (#947)
Loader: Include the SQLState when logging a SQLException (#941)
Loader: Handle run directories with UUID suffix in folder monitoring (#949)
Add UUID to streaming transformer directory structure (#945)
Version 4.0.4 (2022-06-20)
--------------------------
Transformer kinesis: make Kinesis consumer more configurable (#865)
Transformer: split batch and streaming configs (#937)
Version 4.0.3 (2022-06-16)
--------------------------
Transformer kinesis: version 4.0.2 throws java.lang.InterruptedException: sleep interrupted (#938)
Version 4.0.2 (2022-06-14)
--------------------------
Transformer kinesis: Bump parquet-hadoop to 1.12.3 (#933)
Transformer kinesis: exclude hadoop transitive dependencies (#932)
Batch Transformer: add fileFormat field to formats section of example hocon (#848)
Common: set region in the SQS client builder (#587)
Common: Snyk action should only run on push to master (#929)
Snowflake loader: Bump snowflake-jdbc to 3.13.9 (#928)
Loader: use forked version of jsch lib for ssh (#927)
Loader: recover from exceptions on alerting webhook (#925)
Transformer kinesis: always end up in consistent state (#873)
Transformer kinesis: no checkpointing until after SQS message is sent (#917)
Loader: Add logging around using SSH tunnel (#923)
Transformer-kinesis: add missing hadoop-aws dependency for s3 parquet files upload (#920)
Loader: Timeouts on JDBC statements (#914)
Snowflake Loader: make ON_ERROR copy option configurable (#912)
Version 4.0.1 (2022-06-03)
--------------------------
Common: change http4s client backend to blaze-client (#905)
RDB Loader: Fix sqs visibility extensions when processing retries (#908)
Databricks Loader: bump Databricks JDBC driver to 2.6.25 (#910)
Version 4.0.0 (2022-05-26)
--------------------------
Common: Change http4s client backend to async-http-client (#903)
Common: bump http4s to 0.21.33 (#902)
Transformer kinesis: support Parquet output option (#900)
Loader: check if target is ready before submitting the statement (#846)
Add Databricks as a destination (#860)
Batch Transformer: add Parquet output option (#896)
Transformer kinesis: add telemetry (#863)
Transformer kinesis: write shredding_complete.json to S3 (#867)
Snowflake Loader: add load_tstamp (#815)
Redshift Loader: add load_tstamp (#571)
Transformer kinesis: report metrics (#862)
RDB Loader: emit latency statistics on constant intervals (#795)
Transformer kinesis: use output of transformation in updating global state (#824)
Transformer kinesis: fix updating total and bad number of events counter in global state (#823)
Transformer kinesis: add tests for whole processing pipeline (#835)
Transformer kinesis: fix passing checkpoint action during creation of windowed records (#762)
Version 3.0.3 (2022-05-18)
--------------------------
Loader: bump version of load_succeeded schema to 3.0.0 (#889)
Common: bump schema-ddl to 0.15.0 (#894)
Version 3.0.2 (2022-05-12)
--------------------------
Common: bump snowplow-scala-analytics-sdk to 3.0.1 (#872)
Common: publish arm64 and amd64 docker images (#875)
Common: publish distroless docker image (#877)
Common: bump jackson-databind to 2.13.2.2 (#879)
Version 3.0.1 (2022-04-28)
--------------------------
Snowflake Loader: fix folder monitoring copy statement (#851)
Snowflake Loader: make default 'storage.type' Snowflake (#828)
Snowflake Loader: resume warehouse for each loading (#843)
Version 3.0.0 (2022-04-01)
--------------------------
RDB Loader: add Snowflake support (#792)
RDB Loader: support loading wide row (#791)
RDB Loader: extract redshift loader into a separate module (#790)
RDB Loader: modularize configuration to support multiple destinations (#789)
Batch Shredder: add invalid timestamp check (#652)
Batch Shredder: transform events to wide row (#649)
Stream Shredder: add invalid timestamp check (#659)
Stream Shredder: transform events to wide row (#650)
Common: rename shredders to transformers (#793)
Transformer Batch: make it possible to disable spark caching via config (#808)
Transformer Batch: remove event validation (#805)
Version 2.2.0 (2022-02-24)
--------------------------
RDB Loader: stop consuming SQS messages when Loader is busy (#746)
RDB Loader: don't enqueue folders into retry queue if the queue is non-empty (#744)
RDB Loader: mention amount of retries in retry logic (#745)
RDB Loader: make retry behaviour configurable (#742)
RDB Loader: expose an actual AWS exception in readKey (#740)
RDB Loader: fix not respecting no-op schedule if app starts in a window (#724)
RDB Loader: send health check data to statsd (#700)
RDB Loader: add webhook setting to reference config file (#713)
RDB Loader: make all timeouts configurable (#624)
RDB Loader: add loading timeout (#668)
RDB Loader: send total attempts in load_succeeded (#717)
RDB Loader: fix Retry Queue dropping after first attempt (#716)
RDB Loader: make sure health check sends only one alarm (#733)
RDB Shredder: introduce since and until options (#570)
Common: bump aws-java-sdk to 1.12.161 (#736)
Common: bump jackson-dataformat-cbor to 2.12.6 (#550)
Common: bump sbt to 1.6.2 (#735)
Stream Shredder: bump kafka-clients to 2.7.2 (#732)
Stream Shredder: bump commons-io to 2.7 (#731)
Stream Shredder: bump protobuf-java to 3.16.1 (#730)
Version 2.1.0 (2022-01-19)
--------------------------
RDB Loader: track when a load succeeded (#574)
RDB Loader: add DB health monitoring (#656)
RDB Loader: add retry queue (#655)
RDB Loader: switch to HikariCP (#654)
RDB Loader: handle the whole loading within the same transaction (#646)
RDB Loader: use statsd counter instead of gauge for event counts (#523)
RDB Loader: remove 'steps' setting (#626)
RDB Loader: add no-op schedule (#599)
RDB Loader: unify monitoring (#576)
RDB Shredder: allow configuring deduplication (#583)
RDB Shredder: optimize DAG by excluding count (#582)
Version 2.0.0 (2021-12-01)
--------------------------
Batch Shredder: send shredding_complete to SNS (#595)
Stream Shredder: send shredding_complete to SNS (#616)
Common: split shredder and loader config (#596)
RDB Loader: deprecate steps (#625)
Common: use sbt-dynver plugin (#610)
Common: bump aws-java-sdk from 1.11.1019 to 1.12.31 (#613)
Common: bump jackson-scala-module to 2.12.3 (#566)
Common: bump jackson-databind to 2.12.3 (#614)
Common: bump aws-java-sdk-v2 from 2.16.23 to 2.17.59 (#615)
Version 1.2.3 (2021-11-22)
--------------------------
Common: bump schema-ddl to 0.14.3 (#632)
RDB Loader: notify about failure outside of loading (#636)
RDB Loader: don't allow folder monitoring to crash the loader (#628)
RDB Loader: make sure folder monitoring cannot execute concurrently (#627)
Version 1.2.2 (2021-11-12)
--------------------------
RDB Loader: remove logging class from a message (#623)
RDB Loader: add until option to folder monitoring (#620)
RDB Loader: extend SQS messages visibility timeout during loading (#608)
Version 1.2.1 (2021-10-20)
--------------------------
RDB Loader: add since option to folder monitoring (#600)
RDB Loader: drop a root folder from folders monitoring (#602)
Version 1.2.0 (2021-09-03)
--------------------------
Common: bump sbt-scoverage from 1.6.1 to 1.8.2 (#487)
Common: bump scala-library from 2.12.12 to 2.12.14 (#486)
Common: bump sbt-coveralls from 1.2.7 to 1.3.1 (#516)
Common: bump sbt-tpolecat from 0.1.14 to 0.1.20 (#490)
Common: bump decline from 1.4.0 to 2.1.0 (#529)
Common: bump sbt from 1.5.2 to 1.5.5 (#534)
Common: bump http4s-blaze-client from 0.21.21 to 0.21.25 (#538)
Common: bump slf4j-simple from 1.7.30 to 1.7.32 (#540)
Common: bump iglu-scala-client to 1.1.1 (#541)
Common: bump schema-ddl to 0.14.1 (#536)
RDB Shredder: bump spark-core to 3.1.1 (#544)
RDB Shredder: integrate Sentry (#510)
RDB Shredder: skip CrossBatchDeduplicationSpec (#462)
RDB Loader: split migration into pre-transaction and in-transaction statements (#548)
RDB Loader: change docker base image to adoptopenjdk:11-jre-hotspot-focal (#543)
RDB Loader: bump doobie-core from 0.12.1 to 0.13.4 (#474)
RDB Loader: bump redshift-jdbc42-no-awssdk from 1.2.54.1082 to 1.2.55.1083 (#535)
RDB Loader: clarify error message when connection acquisition has failed (#525)
RDB Loader: add monitoring for unloaded and corrupted runs (#457)
RDB Loader: add webhook-based alarming (#458)
RDB Loader: manage several parallel connections (#537)
Version 1.1.1 (2021-07-29)
--------------------------
RDB Loader: don't raise duplicate SQS message as an error (#542)
Version 1.1.0 (2021-06-08)
--------------------------
Automate the creation of release PR (#498)
Attach jar files to github releases (#447)
Common: use Base64.getDecoder instead of Base64.getUrlDecoder to decode shredder config (#495)
RDB Loader: report metrics to StatsD or Stdout (#384)
Version 1.0.1 (2021-05-19)
--------------------------
RDB Loader: bump redshift-jdbc42-no-awssdk from 1.2.51.1078 to 1.2.54.1082 (#393)
RDB Loader: downgrade sentry-java to 1.7.30 (#452)
RDB Loader: crash loader on discovery failures (#461)
RDB Loader: remove unused messages buffer (#459)
RDB Loader: fix connection acquisition messages (#431)
RDB Loader: fix false cache invalidation in JSONPath resolution (#451)
Common: add jsonpaths to config example (#336)
Common: point to Snowplow roadmap in the README (#450)
Common: bump schema-ddl from 0.12.0 to 0.13.0 (#408)
Common: bump fs2 from 2.5.1 to 2.5.6 (#449)
Common: bump aws-java-sdk from 1.11.990 to 1.11.1019 (#453)
Common: bump kind-projector from 0.11.3 to 0.13.0 (#454)
Common: bump sbt from 1.4.9 to 1.5.2 (#442)
Version 1.0.0 (2021-04-14)
--------------------------
Common: bump base-debian to 0.2.2 (#380)
Common: bump pureconfig from 0.14.0 to 0.14.1 (#329)
Common: bump aws-java-sdk from 1.11.916 to 1.11.990 (#370)
Common: bump decline from 1.3.0 to 1.4.0 (#333)
Common: bump snowplow-scala-tracker to 1.0.0 (#372)
Common: bump kind-projector to 0.11.3 (#371)
Common: bump sbt from 1.4.4 to 1.4.9 (#344)
Common: use a single tag to publish all assets (#379)
Common: remove compRows configuration parameter (#390)
RDB Loader: add 2nd gen load manifest table (#366)
RDB Loader: migrate to doobie (#367)
RDB Loader: skip empty folders (#357)
RDB Shredder: check for incomplete folders asynchronously (#385)
RDB Shredder: shade cats in sbt-assembly (#382)
Common: merge good and bad output folders (#358)
Common: extract shredding logic into common module (#355)
Common: add events count to the shredding complete message (#376)
Stream Shredder: add (#354)
Release 35 (2021-01-27)
-----------------------
RDB Shredder: add min and max timestamps of events for the batch to the SQS message with shredded types (#275)
RDB Shredder: add archive discovery (#263)
RDB Shredder: send shredding info to SQS when it's done (#200)
RDB Shredder: switch to a HOCON config (#256)
RDB Loader: switch to SQS-only loading (#262)
RDB Loader: switch to a HOCON config (#250)
RDB Loader: remove EmrEtlRunner support (#251)
RDB Loader: build Docker image (#247)
RDB Loader: integrate Sentry (#235)
RDB Loader: add discovery through SQS stream (#234)
Common: extend copyright notice to 2021 (#287)
Common: get rid of atomic-events folder (#183)
Bump decline from 0.6.2 to 1.3.0 (#223)
Bump aws-java-sdk from 1.11.319 to 1.11.916 (#244)
Bump scalacheck from 1.14.0 to 1.14.3 (#243)
Bump circe-yaml from 0.9.0 to 0.13.1 (#226)
Bump snowplow-events-manifest from 0.2.0 to 0.3.0 (#224)
Bump fs2-core from 2.4.4 to 2.4.6 (#222)
Bump redshift-jdbc42-no-awssdk from 1.2.36.1060 to 1.2.51.1078 (#208)
Bump kind-projector from 0.9.6 to 0.9.10 (#217)
Release 34 (2020-12-09)
-----------------------
RDB Shredder: escape newlines and tabs (#238)
Release 33 (2020-12-01)
-----------------------
Common: integrate coveralls (#220)
Common: integrate Snyk (#188)
Common: integrate Scala Steward (#206)
Common: remove Processing Manifest (#186)
Common: disable test in assembly (#202)
Common: fix ambiguous empty string (#171)
Common: create a separate modules root (#195)
Common: switch to Github Actions (#189)
Common: bump snowplow-scala-analytics-sdk to 2.1.0 (#201)
Common: bump snowplow-badrows to 2.1.0 (#199)
Common: bump specs2-core to 4.10.5 (#197)
Common: bump schema-ddl to 0.12.0 (#192)
Common: bump sbt to 1.4.4 (#196)
Common: bump Scala to 2.12 (#127)
RDB Shredder: optimize DAG (#181)
RDB Shredder: use single etl_tstamp for cross-batch deduplication (#180)
RDB Shredder: bump Spark to 3.0.1 (#101)
RDB Loader: remove Postgres support (#191)
RDB Loader: bump jsch to 0.1.55 (#177)
Loader: mitigate exhausted input error (#203)
Loader: drop legacy S3 paths (#198)
Loader: switch from Free monad to cats-effect IO (#184)
Loader: fix ignored schema in metadata queries (#178)
Release 32 (2020-03-06)
-----------------------
Common: remove VERSION file (#164)
Common: force storage target ids to be valid UUID (#168)
Common: bump iglu-scala-client to 0.6.2 (#167)
Common: add Local Iglu Server to CI/CD process (#157)
Common: allow tabular blacklisting (#156)
RDB Shredder: use snowplow-badrows (#169)
RDB Shredder: fix false positive cross-batch deduplication for RT pipelines (#173)
RDB Shredder: add tabular data output (#151)
RDB Shredder: bump to 0.16.0 (#154)
RDB Loader: bump redshift-jdbc to 1.2.36 (#61)
RDB Loader: add tabular output loading (#152)
RDB Loader: bump to 0.17.0 (#153)
Release 31 (2019-08-14)
-----------------------
Common: add Snowplow Maven resolver (#147)
Common: extend copyright notice to 2019 (#144)
Common: bump sbt-assembly to 1.4.9 (#135)
Common: bump SBT to 1.2.8 (#134)
Common: bump Travis oraclejdk version to 11 (#146)
Common: bump Scala to 2.11.12 (#136)
Common: switch to OpenJDK (#159)
Common: bump release-manager to 0.4.1 (#132)
RDB Shredder: bump to 0.15.0 (#139)
RDB Shredder: factor out scala-common-enrich (#138)
RDB Shredder: persist synthetic duplicates on disk (#142)
RDB Shredder: replace real DynamoDB with local version in CrossBatchDeduplicationSpec (#149)
RDB Shredder: truncate string values exceeding column's limits (#143)
RDB Shredder: bump Spark to 2.3.2 (#162)
RDB Loader: bump to 0.16.0 (#140)
RDB Loader: bump iglu-scala-client to 0.6.0 (#141)
RDB Loader: bump snowplow-scala-tracker to 0.6.1 (#148)
RDB Loader: retry on refused connection (#8)
RDB Loader: fix non-exhaustive match in data discovery (#117)
Release 30 (2018-08-23)
-----------------------
Common: add VERSION file (#82)
Common: bump SBT to 1.1.6 (#89)
Common: use processing manifest (#81)
Common: fix README links (#109)
Common: extend copyright notice to 2018 (#91)
RDB Shredder: bump to 0.14.0 (#96)
RDB Shredder: use RDB Loader AWS SDK (#94)
RDB Shredder: remove auto-creation of event manifests table (#62)
RDB Loader: bump to 0.15.0 (#100)
RDB Loader: improve log output (#23)
RDB Loader: escape input for sanitize function (#87)
RDB Loader: remove unused configuration properties (#95)
RDB Loader: add ability to skip all load manifest interactions (#97)
RDB Loader: make SSL configuration compatible with native JDBC settings (#73)
RDB Loader: improve "no data discovered" error message (#69)
RDB Loader: tolerate deleted folder artifacts when consistency_check is skipped (#80)
RDB Loader: fix manifest population assumes load is most recent (#70)
RDB Loader: fail load if entry exists in manifest (#14)
RDB Loader: bump AWS SDK to 1.11.319 (#79)
RDB Loader: bump snowplow-scala-tracker to 0.5.0 (#57)
RDB Loader: bump specs2 to 4.0.4 (#90)
RDB Loader: bump cats to 1.1.0 (#77)
RDB Loader: fix typo in load_succeeded schema (#108)
Release 29 (2018-06-11)
-----------------------
RDB Shredder: bump to 0.13.1 (#105)
RDB Shredder: bump scala-common-enrich to 0.32.0 (#99)
RDB Shredder: align PostgresConstraints with atomic.events 0.10.0 (#103)
Release 28 (2017-11-13)
-----------------------
Common: add CI/CD (#55)
Common: remove AWS Java SDK shading (#54)
RDB Shredder: add Snowplow and Clojars resolvers (#56)
RDB Shredder: bump Spark to 2.2.0 (#52)
RDB Shredder: bump to 0.13.0 (#49)
RDB Shredder: bump scala-common-enrich to 0.27.0 (#39)
RDB Shredder: overwrite output datasets (#41)
RDB Loader: bump sbt-assembly to 0.14.5 (#51)
RDB Loader: bump SBT to 0.13.16 (#50)
RDB Loader: allow JDBC credentials to be stored in EC2 parameter store (#19)
RDB Loader: add support for SSH tunnels (#22)
RDB Loader: bump AWS SDK to 1.11.208 (#48)
RDB Loader: bump redshift-jdbc to 1.2.8.1005 (#40)
RDB Loader: make loading shredded data always required (#29)
RDB Loader: remove tracking from dry run (#42)
RDB Loader: execute manifest insert in same transaction as load (#36)
RDB Loader: make logkey optional (#35)
Version 0.13.0 (2017-09-06)
---------------------------
Common: migrate CHANGELOG from snowplow/snowplow (#24)
Common: add AWS staging credentials to .travis.yml (#37)
Common: add AWS Credentials to .travis.yml (#28)
RDB Shredder: turn into SBT submodule (#27)
RDB Loader: bump to 0.13.0 (#38)
RDB Loader: fix JSONPath cache resolution bug (#3)
RDB Loader: add step to skip consistency check (#34)
RDB Loader: add --dry-run option (#31)
RDB Loader: add CLI argument to load specific folder (#9)
RDB Loader: add CI/CD (#30)
Snowplow Release 90 Lascaux (2017-07-26)
----------------------------------------
RDB Loader: fix eventual consistency problem (snowplow/snowplow#3113)
RDB Loader: load all runs from shredded, not just the first run found (snowplow/snowplow#2962)
RDB Loader: remove compupdate step (snowplow/snowplow#3178)
RDB Loader: add logging around database load, analyze and vacuum (snowplow/snowplow#2935)
RDB Loader: use Redshift-specific driver to connect to Redshift (snowplow/snowplow#1830)
RDB Loader: remove StorageLoader (snowplow/snowplow#3026)
RDB Loader: accept storage target JSONs on command-line (snowplow/snowplow#3022)
RDB Loader: rewrite StorageLoader in Scala, removing file archiving step (snowplow/snowplow#3023)
Snowplow Release 89 Plain of Jars (2017-06-12)
----------------------------------------------
RDB Shredder: bump to 0.12.0 (snowplow/snowplow#3042)
RDB Shredder: rename from Scala Hadoop Shred (snowplow/snowplow#3031)
RDB Shredder: move from 3-enrich to 4-storage (snowplow/snowplow#3032)
RDB Shredder: change the package to storage from enrich (snowplow/snowplow#3036)
RDB Shredder: port from Scalding to Spark (snowplow/snowplow#3034)
RDB Shredder: bump scala-common-enrich to 0.25 (snowplow/snowplow#3091)
RDB Shredder: bump iglu-scala-client to 0.5.0 (snowplow/snowplow#3090)
RDB Shredder: bump specs2-core to 2.3.13 (snowplow/snowplow#3093)
RDB Shredder: bump Scala version to 2.11 (snowplow/snowplow#3071)
RDB Shredder: upgrade to Java 8 (snowplow/snowplow#3213)
RDB Shredder: run the unit tests systematically in Travis (snowplow/snowplow#3229)
StorageLoader: bump to 0.11.0 (snowplow/snowplow#3214)
StorageLoader: add support for Spark-based Shredder's directory structure (snowplow/snowplow#3044)
EmrEtlRunner: replace hadoop_shred in config.yml.sample with rdb_shredder (snowplow/snowplow#3035)
Snowplow Release 88 Angkor Wat (2017-04-27)
-------------------------------------------
Scala Hadoop Shred: bump to 0.11.0 (snowplow/snowplow#3041)
Scala Hadoop Shred: bump sbt-assembly to 0.14.4 (snowplow/snowplow#3140)
Scala Hadoop Shred: bump SBT to 0.13.13 (snowplow/snowplow#2972)
Scala Hadoop Shred: remove explicit jackson-databind dependency (snowplow/snowplow#3138)
Scala Hadoop Shred: add cross-batch natural deduplication (snowplow/snowplow#2999)
StorageLoader: bump to 0.10.0 (snowplow/snowplow#3109)
StorageLoader: remove Northern Virginia endpoint for Postgres load (snowplow/snowplow#3143)
StorageLoader: handle return code of 4 for EmrEtlRunner in snowplow-runner-and-loader.sh (snowplow/snowplow#3139)
StorageLoader: use storage target JSONs instead of targets section in config.yml (snowplow/snowplow#2992)
StorageLoader: replace table configuration property with schema (snowplow/snowplow#2458)
Common: update READMEs markdown in accordance with CommonMark (snowplow/snowplow#3157)
Common: add CI/CD for EmrEtlRunner and StorageLoader (snowplow/snowplow#3102)
Snowplow Release 87 Chichen Itza (2017-02-21)
---------------------------------------------
StorageLoader: bump to 0.9.0 (snowplow/snowplow#2961)
StorageLoader: bump JRuby version to 9.1.6.0 (snowplow/snowplow#3051)
StorageLoader: fix typo in S3Tasks.download_events (snowplow/snowplow#2888)
StorageLoader: update manifest table as part of Redshift load transaction (snowplow/snowplow#2280)
Snowplow Release 86 Petra (2016-12-20)
--------------------------------------
Scala Hadoop Shred: bump to 0.10.0 (snowplow/snowplow#2979)
Scala Hadoop Shred: add general top-level exception handling (snowplow/snowplow#2071)
Scala Hadoop Shred: get the CustomPartitionSourceTest working with Hadoop 2.4 (snowplow/snowplow#1960)
Scala Hadoop Shred: fix omitted string interpolation (snowplow/snowplow#2562)
Scala Hadoop Shred: deduplicate event_ids with different event_fingerprints (synthetic duplicates) (snowplow/snowplow#24)
Scala Hadoop Shred: stop catching fatal errors (snowplow/snowplow#1456)
Snowplow Release 83 Bald Eagle (2016-09-06)
-------------------------------------------
StorageLoader: bump to 0.8.0 (snowplow/snowplow#2785)
StorageLoader: bump Ruby version to 2.2.3 (snowplow/snowplow#2870)
StorageLoader: bump Sluice to 0.4.0 (snowplow/snowplow#2786)
StorageLoader: bump Contracts to 0.9 (snowplow/snowplow#2790)
StorageLoader: add explicit mime-types dependency (snowplow/snowplow#2805)
StorageLoader: rebuild Gemfile.lock (snowplow/snowplow#2871)
StorageLoader: use Northern Virginia endpoint not global endpoint for us-east-1 (snowplow/snowplow#2748)
StorageLoader: replace module_function everywhere with self (snowplow/snowplow#2801)
StorageLoader: fix broken contracts (snowplow/snowplow#2461)
Snowplow Release 79 Black Swan (2016-05-12)
-------------------------------------------
Scala Hadoop Shred: bumped to 0.9.0 (snowplow/snowplow#2480)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.23.0 (snowplow/snowplow#2481)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.4.0 (snowplow/snowplow#2449)
Snowplow Release 77 Great Auk (2016-02-28)
------------------------------------------
Scala Hadoop Shred: bumped to 0.8.0
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.2 (snowplow/snowplow#2319)
StorageLoader: bumped to 0.7.0
StorageLoader: added support for supplying config file as Base64-encoded string (snowplow/snowplow#2227)
StorageLoader: added ability to retrieve AWS credentials from EC2 role (snowplow/snowplow#2226)
StorageLoader: excluded previously-built executables from the build (snowplow/snowplow#2164)
StorageLoader: started printing stack trace for failures not caused by bad configuration (snowplow/snowplow#2160)
StorageLoader: bumped Ruby Tracker to 0.5.2 (snowplow/snowplow#2144)
StorageLoader: moved ANALYZE statements after VACUUM statements (snowplow/snowplow#1361)
StorageLoader: added resolver config option to snowplow-runner-and-loader.sh (snowplow/snowplow#2170)
StorageLoader: updated snowplow-runner-and-loader.sh to use JRuby binaries (snowplow/snowplow#2233)
StorageLoader: removed snowplow-storage-loader.sh (snowplow/snowplow#2444)
Snowplow Release 76 Changeable Hawk-Eagle (2016-01-26)
------------------------------------------------------
Scala Hadoop Shred: bumped to 0.7.0
Scala Hadoop Shred: fixed good tests' checks for empty paths (snowplow/snowplow#2278)
Scala Hadoop Shred: now deduplicating event_id and event_fingerprint pairs (snowplow/snowplow#2246)
Scala Hadoop Shred: fixed incorrect event in SchemaValidationFailed1Spec (snowplow/snowplow#2355)
Scala Hadoop Shred: updated tests to check atomic-events output (snowplow/snowplow#2264)
Scala Hadoop Shred: now only writes atomic-events if JSONs shred successfully (snowplow/snowplow#2245)
Scala Hadoop Shred: removed empty SchemaValidationFailed2Spec (snowplow/snowplow#2271)
Scala Hadoop Shred: fixed test suite issue with multiple input lines (snowplow/snowplow#2270)
Snowplow Release 73 Cuban Macaw (2015-12-04)
--------------------------------------------
Scala Hadoop Shred: bumped to 0.6.0
Scala Hadoop Shred: added .forceToDisk to common to speed up run (snowplow/snowplow#2039)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.1 (snowplow/snowplow#2081)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.18.0 (snowplow/snowplow#2016)
Scala Hadoop Shred: applied truncation logic to atomic-events TSV (snowplow/snowplow#2042)
Scala Hadoop Shred: processed enriched events for atomic.events removing JSON fields (snowplow/snowplow#1731)
Scala Hadoop Shred: started using Scala Common Enrich's version of ScalazArgs (snowplow/snowplow#2014)
StorageLoader: bumped to 0.6.0
StorageLoader: added tcpKeepAlive=true to JDBC for long-running COPYs via NAT (snowplow/snowplow#2145)
StorageLoader: fixed setup guide link in README, thanks @diamondo25! (snowplow/snowplow#2025)
StorageLoader: loaded atomic.events from shredded folder (snowplow/snowplow#1795)
Snowplow Release 71 Stork-Billed Kingfisher (2015-10-02)
--------------------------------------------------------
Scala Hadoop Shred: bumped to 0.5.0
Scala Hadoop Shred: updated tests to expect bad row JSONs with timestamps and processing messages (snowplow/snowplow#1953)
Scala Hadoop Shred: added clojars.org as a resolver (snowplow/snowplow#1952)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.16.0 (snowplow/snowplow#1935)
Scala Hadoop Shred: started using BadRow case class from Scala Common Enrich (snowplow/snowplow#1914)
Scala Hadoop Shred: upgraded to Hadoop 2.4 (snowplow/snowplow#1720)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.0 (snowplow/snowplow#1221)
StorageLoader: bumped to 0.5.0
StorageLoader: exposed sslmode connection option for loading Postgres and Redshift, thanks @dennisatspaceape! (snowplow/snowplow#1980)
StorageLoader: updated wd_access_log_1.json with 4 new fields (snowplow/snowplow#1941)
Snowplow Release 70 Bornean Green Magpie (2015-08-19)
-----------------------------------------------------
StorageLoader: bumped to 0.4.0
StorageLoader: allowed config to be passed in via stdin (snowplow/snowplow#1773)
StorageLoader: added ability to bundle as a JRuby fat jar (snowplow/snowplow#675)
StorageLoader: started loading Postgres via stdin, thanks @mrwalker! (snowplow/snowplow#624)
StorageLoader: added Snowplow event tracking (snowplow/snowplow#679)
StorageLoader: updated to use EmrEtlRunner's expanded config.yml (snowplow/snowplow#1191)
StorageLoader: pinned Contracts to 0.7 (snowplow/snowplow#1497)
StorageLoader: moved "include Contracts" (snowplow/snowplow#1499)
StorageLoader: renamed archive step to archive_enrich (snowplow/snowplow#1544)
StorageLoader: bumped Sluice to 0.2.2 (snowplow/snowplow#1567)
StorageLoader: removed use of symbols for properties in YAML configuration (snowplow/snowplow#1573)
StorageLoader: added Rake task to build app (snowplow/snowplow#1787)
StorageLoader: scrubbed credentials from stderr (snowplow/snowplow#1918)
StorageLoader: added test suite (snowplow/snowplow#1919)
StorageLoader: ensured that _SUCCESS file is written last for enriched events archived to S3 (snowplow/snowplow#1814)
StorageLoader: started automatically converting "s3n" to "s3" in copy statements (snowplow/snowplow#1937)
EmrEtlRunner & StorageLoader: unified the config file format (snowplow/snowplow#878)
EmrEtlRunner & StorageLoader: added support for compressing enriched events, thanks @danisola! (snowplow/snowplow#1265)
EmrEtlRunner & StorageLoader: added support for environment variables in YML config files, thanks @epantera! (snowplow/snowplow#1215)
Snowplow Release 63 Red-Cheeked Cordon-Bleu (2015-04-02)
--------------------------------------------------------
Scala Hadoop Shred: bumped to 0.4.0
Scala Hadoop Shred: bumped Scala Common Enrich to 0.13.0 (snowplow/snowplow#1343)
Scala Hadoop Shred: bumped json4sJackson to 3.2.11 (snowplow/snowplow#1344)
Scala Hadoop Shred: extracted JSONs from derived_contexts field (snowplow/snowplow#786)
Scala Hadoop Shred: updated to reflect new enriched event format (snowplow/snowplow#1332)
Snowplow Version 0.9.14 (2014-12-31)
------------------------------------
Scala Hadoop Shred: bumped to 0.3.0
Scala Hadoop Shred: bumped Scala Common Enrich to 0.10.0 (snowplow/snowplow#1236)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.2.0 (snowplow/snowplow#1230)
Scala Hadoop Shred: loosened match criteria for unstructured events and contexts (snowplow/snowplow#1231)
Snowplow Version 0.9.9 (2014-10-27)
-----------------------------------
StorageLoader: bumped to 0.3.3
StorageLoader: selecting Snowplow's hosted-assets bucket based on region (snowplow/snowplow#1012)
Snowplow Version 0.9.7 (2014-09-02)
-----------------------------------
Scala Hadoop Shred: bumped to version 0.2.1
Scala Hadoop Shred: fixed multiple JSONs not being shredded for a single row (snowplow/snowplow#968)
Scala Hadoop Shred: strengthened test suite (snowplow/snowplow#967)
StorageLoader: bumped to 0.3.2
StorageLoader: removed EMPTYASNULL for loading JSONs (snowplow/snowplow#942)
StorageLoader: made providing jsonpath_assets optional (snowplow/snowplow#958)
StorageLoader: added support for cross-region Redshift COPY (snowplow/snowplow#971)
Snowplow Version 0.9.6 (2014-07-26)
-----------------------------------
Scala Hadoop Shred: bumped to 0.2.0
Scala Hadoop Shred: bumped Scala Common Enrich to 0.5.0 (snowplow/snowplow#918)
Scala Hadoop Shred: trailing empty fields no longer cause shredding for that row to fail (snowplow/snowplow#921)
Scala Hadoop Shred: updated column offsets for enriched events TSV (snowplow/snowplow#915)
StorageLoader: bumped to 0.3.1
StorageLoader: now looking in eu-west-1 region for s3://snowplow-hosted-assets (snowplow/snowplow#895)
StorageLoader: updated combined Bash script to support enrichments path (snowplow/snowplow#917)
Snowplow Version 0.9.5 (2014-07-09)
-----------------------------------
Scala Hadoop Shred: added. Version 0.1.0
StorageLoader: bumped to 0.3.0
StorageLoader: bumped Sluice to 0.2.1 (snowplow/snowplow#881)
StorageLoader: added initial Ruby.contracts support (snowplow/snowplow#391)
StorageLoader: updated config.yml to support shredding (snowplow/snowplow#897)
StorageLoader: added ACCEPTINVCHARS to StorageLoader (snowplow/snowplow#411)
StorageLoader: added :jsonpath_assets: setting for StorageLoader (snowplow/snowplow#606)
StorageLoader: added ability to load custom tables using JSON Paths (snowplow/snowplow#607)
StorageLoader: added --skip shred option (snowplow/snowplow#660)
StorageLoader: added :in: hint on StorageLoader configuration, thanks @joaolcorreia! (snowplow/snowplow#755)
StorageLoader: made sure _SUCCESS flag file is written last for enriched events archived to S3 (snowplow/snowplow#1814) [Fred Blundun]
StorageLoader: pinned Contracts to 0.7 (snowplow/snowplow#1497) [Fred Blundun]
EmrEtlRunner & StorageLoader: validated output_compression configuration using contract (snowplow/snowplow#1820) [Fred Blundun]
EmrEtlRunner & StorageLoader: supported environment variables in YAML config files (snowplow/snowplow#1215) [Fred Blundun]
Snowplow Version 0.9.1 (2014-04-11)
-----------------------------------
StorageLoader: bumped to 0.2.0
StorageLoader: added TIMEFORMAT 'auto' to StorageLoader to handle outlier dvce_timestamps (snowplow/snowplow#427)
Snowplow Version 0.8.11 (2013-10-22)
------------------------------------
StorageLoader: bumped to 0.1.1
StorageLoader: bumped Sluice to 0.1.5 (snowplow/snowplow#96)
StorageLoader: fixed "\" in fields acting as an escape character for Postgres, thanks @kingo55! (snowplow/snowplow#329)
StorageLoader: added ability to --skip analyze (snowplow/snowplow#335)
StorageLoader: moved VACUUM SORT ONLY to a --include step (snowplow/snowplow#321)
StorageLoader: added COMPROWS to config and --include compupdate option (snowplow/snowplow#344)
StorageLoader: changed Postgres VACUUM FULL to VACUUM (snowplow/snowplow#357)
StorageLoader: added TRUNCATECOLUMNS for Redshift load (snowplow/snowplow#360)
StorageLoader: added FILLRECORD to our Redshift COPY command (snowplow/snowplow#380)
Snowplow Version 0.8.8 (2013-08-04)
-----------------------------------
StorageLoader: bumped to 0.1.0
StorageLoader: bumped Sluice to 0.0.7 (snowplow/snowplow#300)
StorageLoader: removed code to delete Hive ETL's empty event files (snowplow/snowplow#306)
StorageLoader: fixed bug where download path has to be set (even when using Redshift) (snowplow/snowplow#280)
StorageLoader: optimized ANALYZE and VACUUM commands (snowplow/snowplow#283)
StorageLoader: added MAXERROR as StorageLoader configuration value for Redshift (snowplow/snowplow#273)
StorageLoader: added support for loading Postgres (snowplow/snowplow#161)
StorageLoader: removed Infobright loading capability (snowplow/snowplow#307)
StorageLoader: added support for loading into multiple storage targets (snowplow/snowplow#311)
Snowplow Version 0.7.6 (2013-03-03)
-----------------------------------
StorageLoader: bumped to 0.0.5
StorageLoader: added Redshift-specific fields to config.yml (part of #159)
StorageLoader: added Redshift load support into StorageLoader (part of #159)
StorageLoader: added missing /Gemfile to BUNDLE_GEMFILE in Bash scripts
Snowplow Version 0.7.1 (2013-01-22)
-----------------------------------
StorageLoader: bumped to 0.0.4
StorageLoader: updated copyright notices
StorageLoader: added .rvmrc file (part of #121, #84)
StorageLoader: removed .gemspec file
StorageLoader: added dependencies to Gemfile and re-generated Gemfile.lock
Snowplow Version 0.7.0 (2013-01-04)
-----------------------------------
StorageLoader: bumped to 0.0.3
StorageLoader: bumped Sluice to 0.0.6
StorageLoader: added "Complete" message at end of run (part of #97)
StorageLoader: --skip argument now supports a list (snowplow/snowplow#81)
Snowplow Version 0.6.1 (2012-11-28)
-----------------------------------
StorageLoader: bumped to 0.0.2
StorageLoader: changed the data file encloser to NULL (snowplow/snowplow#88)
Snowplow Version 0.6.0 (2012-11-12)
-----------------------------------
StorageLoader: initial release