Coming Soon

Starting in Q2 2021, the snowplow/snowplow repository will be moving to Time Based Releases. This will allow us to bundle our recommended Snowplow stack at intervals across the year. We believe this will help us better inform our users of what we are recommending to use right now and what features are available in each release. We want to offer a balance of the latest Snowplow features with full integration and support across the stack. You'll likely see 2 to 4 Time Based Releases per year as we will perform quarterly assessments of whether we want to bundle a new release.

We will continue to release our components individually in their respective repositories, so if you wish to be on the bleeding edge you can run the latest versions of the Snowplow stack outside of the latest Time Based Release.

With each snowplow/snowplow release, you can expect this CHANGELOG to be updated with the main features and fixes since the last release, along with the recommended version of each component. Blog and Discourse posts will announce the latest features and explain how to upgrade from the previous release to the new one.

Release 119 Tycho Magnetic Anomaly Two (2020-04-30)

  • Cloudfront Collector: deprecate (#4319)
  • Clojure Collector: deprecate (#4264)
  • Spark Enrich: deprecate (#4263)
  • Beam Enrich: extract into separate repo (#4282)
  • Scala Stream Collector: bump to 1.0.1 (#4338)
  • Scala Stream Collector: add Snowplow Bintray to resolvers (#4326)
  • Scala Stream Collector: publish Docker image for stdout via Travis (#4333)
  • Scala Stream Collector: fix config example (#4332)
  • Scala Stream Collector: fix incompatible jackson dependencies to enable CBOR (#4266)
  • EmrEtlRunner: bump to 0.37.0 (#4297)
  • EmrEtlRunner: switch AMI bootstrap scripts to HTTPS (#4256)
  • EmrEtlRunner: set ig.count (#4285)
  • EmrEtlRunner: catch and retry EMR connection exceptions (#4290)
  • Stream Enrich: bump to 1.1.0 (#4296)
  • Stream Enrich: add the possibility to override the DynamoDB endpoint (#3942)
  • Stream Enrich: use info generated by BuildInfo for Processor in Source (#4335)
  • Stream Enrich: remove sbt version from README (#4303)
  • Stream Enrich: bump Scala Common Enrich to 1.1.0 (#4295)
  • Stream Enrich: allow to download data from private S3 or GCS (#4269)
  • Scala Common Enrich: bump to 1.1.0 (#4294)
  • Scala Common Enrich: use recent timestamp for unit tests in WeatherEnrichmentSpec (#4339)
  • Scala Common Enrich: change maxColumns to 140 in .scalafmt.conf (#4314)
  • Scala Common Enrich: add Snowplow Bintray to resolvers (#4325)
  • Scala Common Enrich: return capitalised Unknown value in YAUAA enrichment (#4114)
  • Scala Common Enrich: fix abandon assertions in PiiPseudonymizerEnrichmentSpec (#4322)
  • Scala Common Enrich: handle query string parameters that don't have a value in IgluAdapter (#4330)
  • Scala Common Enrich: bump snowplow-badrows to 1.0.0 (#4292)
  • Scala Common Enrich: fix EtlPipeline short-circuiting on a first bad row (#4320)
  • Scala Common Enrich: add validation for contexts added by enrichments (#3795)
  • Common: fix dead links in collectors' README (#4310)
  • Enrich: update example config for referer parser enrichment to 2-0-0 (#4309)

Release 118 Morgantina (2020-01-16)

  • Spark Enrich: extend copyright notice to 2020 (#4260)
  • Spark Enrich: bump to 2.0.0 (#4236)
  • Spark Enrich: bump scala-common-enrich to 1.0.0 (#4235)
  • Spark Enrich: replace scopt by decline (#4245)
  • Hadoop Event Recovery: remove (#3908)
  • Common: bump Travis Scala version to 2.12.10 (#4150)
  • Scala Stream Collector: extend copyright notice to 2020 (#4261)
  • Scala Stream Collector: bump to 1.0.0 (#4193)
  • Scala Stream Collector: introduce sbt-scalafmt (#4192)
  • Scala Stream Collector: bump sbt-buildinfo to 0.9.0 (#4191)
  • Scala Stream Collector: use sbt-tpolecat (#4190)
  • Scala Stream Collector: bump sbt-assembly to 0.14.9 (#4189)
  • Scala Stream Collector: bump specs2 to 4.5.1 (#4188)
  • Scala Stream Collector: bump pureconfig to 0.11.1 (#4187)
  • Scala Stream Collector: bump akka to 2.5.19 (#4186)
  • Scala Stream Collector: bump prometheus-simpleclient to 0.6.0 (#4184)
  • Scala Stream Collector: bump config to 1.3.4 (#4183)
  • Scala Stream Collector: bump slf4j to 1.7.26 (#4182)
  • Scala Stream Collector: bump joda-time to 2.10.2 (#4181)
  • Scala Stream Collector: bump kafka-clients to 2.2.1 (#4180)
  • Scala Stream Collector: bump google-cloud-pubsub to 1.78.0 (#4179)
  • Scala Stream Collector: bump aws-java-sdk to 1.11.573 (#4178)
  • Scala Stream Collector: integrate the size violation bad row type (#4177)
  • Scala Stream Collector: bump SBT to 1.3.3 (#4176)
  • Scala Stream Collector: bump Scala to 2.12.10 (#4175)
  • Beam Enrich: extend copyright notice to 2020 (#4259)
  • Beam Enrich: bump to 1.0.0 (#4194)
  • Beam Enrich: migrate enrichment specs from Spark Enrich (#4200)
  • Beam Enrich: migrate miscellaneous specs from Spark Enrich (#4199)
  • Beam Enrich: migrate adapters specs from Spark Enrich (#4198)
  • Beam Enrich: bump SBT to 1.3.3 (#4174)
  • Beam Enrich: use sbt-tpolecat (#4172)
  • Beam Enrich: use scalafmt (#4171)
  • Beam Enrich: bump sbt-buildinfo to 0.9.0 (#4170)
  • Beam Enrich: bump scalatest to 3.0.8 (#4169)
  • Beam Enrich: bump paradise to 2.1.1 (#4168)
  • Beam Enrich: bump beam to 2.11.0 (#4167)
  • Beam Enrich: bump scio version to 0.7.4 (#4166)
  • Beam Enrich: bump scala-common-enrich to 1.0.0 (#4165)
  • Beam Enrich: bump Scala to 2.12.10 (#4164)
  • Beam Enrich: bump sbt-native-packager to 1.3.22 (#4163)
  • Stream Enrich: extend copyright notice to 2020 (#4258)
  • Stream Enrich: bump to 1.0.0 (#4162)
  • Stream Enrich: bump snowplow-scala-tracker to 0.6.1 (#4145)
  • Stream Enrich: use sbt-tpolecat (#4173)
  • Stream Enrich: bump sbt-buildinfo to 0.9.0 (#4161)
  • Stream Enrich: bump sbt-assembly to 0.14.9 (#4160)
  • Stream Enrich: bump jinjava to 2.5.0 (#4159)
  • Stream Enrich: bump pureconfig to 0.11.0 (#4158)
  • Stream Enrich: bump scopt to 3.7.1 (#4157)
  • Stream Enrich: bump slf4j to 1.7.26 (#4156)
  • Stream Enrich: bump jackson to 2.9.9 (#4155)
  • Stream Enrich: bump config to 1.3.4 (#4154)
  • Stream Enrich: bump kafka-clients to 2.2.1 (#4153)
  • Stream Enrich: bump amazon-kinesis-client to 1.10.0 (#4152)
  • Stream Enrich: bump aws-java-sdk to 1.11.566 (#4151)
  • Stream Enrich: add custom .jvmopts file (#4104)
  • Stream Enrich: bump specs2 to 4.5.1 (#4149)
  • Stream Enrich: bump scala-common-enrich to 1.0.0 (#4148)
  • Stream Enrich: bump Scala to 2.12.10 (#4147)
  • Stream Enrich: bump scalacheck to 1.14.0 (#4144)
  • Stream Enrich: bump SBT to 1.3.3 (#4098)
  • Stream Enrich: replace sbt-scalafmt-coursier with sbt-scalafmt (#4097)
  • Scala Common Enrich: extend copyright notice to 2020 (#4257)
  • Scala Common Enrich: bump to 1.0.0 (#4026)
  • Scala Common Enrich: wrap all contexts and unstruct events into SelfDescribingData (#4241)
  • Scala Common Enrich: remove shred function (#4233)
  • Scala Common Enrich: make test specifications formatting consistent (#4234)
  • Scala Common Enrich: use snowplow-badrows (#4106)
  • Scala Common Enrich: add custom .jvmopts file (#4103)
  • Scala Common Enrich: bump SBT to 1.2.8 (#4096)
  • Scala Common Enrich: update WeatherEnrichmentSpec (#4073)
  • Scala Common Enrich: bump scala-uri to 1.4.5 (#4072)
  • Scala Common Enrich: bump scalaj-http to 2.4.1 (#4071)
  • Scala Common Enrich: bump mysql-connector-java to 8.0.16 (#4070)
  • Scala Common Enrich: bump postgresql to 42.2.5 (#4069)
  • Scala Common Enrich: bump uap-java to 1.4.3 (#4068)
  • Scala Common Enrich: bump jackson-databind to 2.9.8 (#4067)
  • Scala Common Enrich: bump joda-time to 2.10.1 (#4066)
  • Scala Common Enrich: bump commons-codec to 1.12 (#4064)
  • Scala Common Enrich: clean up dependencies (#4052)
  • Scala Common Enrich: parameterize ApiRequestEnrichment over the effect type (#4046)
  • Scala Common Enrich: parameterize SqlQueryEnrichment over the effect type (#4045)
  • Scala Common Enrich: bump scala-weather to 0.5.0 (#4044)
  • Scala Common Enrich: bump scala-maxmind-iplookups to 0.6.1 (#4043)
  • Scala Common Enrich: use sbt-scalafmt (#4040)
  • Scala Common Enrich: separate EnrichmentRegistry parsing from its construction (#4033)
  • Scala Common Enrich: externalize referer-parser yml file (#3830)
  • Scala Common Enrich: bump scala-forex to 0.7.0 (#4031)
  • Scala Common Enrich: bump scala-referer-parser to 1.0.0 (#4030)
  • Scala Common Enrich: bump iglu-scala-client to 0.6.1 (#4029)
  • Scala Common Enrich: bump jsonpath to 0.6.14 (#4028)
  • Scala Common Enrich: bump Scala to 2.12.10 (#4027)
  • Scala Common Enrich: bump specs2 to 4.5.1 (#4024)
  • Scala Common Enrich: replace scalaz by cats (#4018)
  • Scala Common Enrich: replace json4s with circe (#3602)
  • Scala Common Enrich: use sbt-tpolecat (#4010)

Release 117 Biskupin (2019-12-03)

  • EmrEtlRunner: add support to the spot market for core instances (#3487)
  • EmrEtlRunner: bump to 0.36.0 (#4143)
  • Scala Common Enrich: bump referer-parser to 0.3.1 (#4135)
  • Scala Common Enrich: add anonymization for IPv6 (#4222)
  • Scala Common Enrich: add additional event fingerprint hashing methods (#4226)
  • Scala Common Enrich: bump to 0.38.0 (#4136)
  • Beam Enrich: bump scala-common-enrich to 0.38.0 (#4137)
  • Beam Enrich: fix unit tests failing after update of MaxMind database (#4230)
  • Beam Enrich: fix docker deployment authorization (#4231)
  • Beam Enrich: bump to 0.4.0 (#4138)
  • Stream Enrich: bump scala-common-enrich to 0.38.0 (#4139)
  • Stream Enrich: bump to 0.22.0 (#4140)
  • Spark Enrich: bump scala-common-enrich to 0.38.0 (#4141)
  • Spark Enrich: use hadoop-lzo 0.4.20 from Snowplow Bintray maven (#4238)
  • Spark Enrich: bump to 1.19.0 (#4142)
  • Common: change Travis distribution to Trusty (#4214)
  • Common: publish docker images for scala-stream-collector, beam-enrich and stream-enrich to DockerHub (#4237)
  • Scala Stream Collector: allow users to disable the default redirect endpoint (#4211)
  • Scala Stream Collector: bump Scala version to 2.11.12 (#4206)
  • Scala Stream Collector: bump akka-http to 10.1.10 (#4185)
  • Scala Stream Collector: add support for TLS port binding and certificate (#4085)
  • Scala Stream Collector: remove duplicate section in example hocon config file (#4210)
  • Scala Stream Collector: bump to 0.17.0 (#4208)

Release 116 Madara Rider (2019-09-12)

  • Beam Enrich: fix unit tests failing after update of MaxMind database (#4129)
  • EmrEtlRunner: add support for shredded TSV data (#4074)
  • EmrEtlRunner: update spark_enrich version in config samples to 1.18.0 (#4091)
  • EmrEtlRunner: bump to 0.35.0 (#4112)
  • Scala Stream Collector: add options to configure secure, same-site and http-only for the cookie (#3753)
  • Scala Stream Collector: allow multiple cookie domains to be used (#3994)
  • Scala Stream Collector: provide a way to specify custom path mappings (#4087)
  • Scala Stream Collector: send back a Cache-Control header (#4017)
  • Scala Stream Collector: add sbt-native-packager and Docker config (#4128)
  • Scala Stream Collector: bump Akka HTTP to 10.0.15 (#4131)
  • Scala Stream Collector: bump version to 0.16.0 (#4134)

Release 115 Sigiriya (2019-07-17)

  • EmrEtlRunner: update Contracts for get_failure_details (#4088)
  • EmrEtlRunner: make sure all steps are successfully submitted in case of a transient cluster (#4092)
  • EmrEtlRunner: bump to 0.34.3 (#4089)
  • Event Manifest Populator: bump to 0.1.2 (#4082)
  • Event Manifest Populator: remove part-* pattern (#4081)

Release 114 Polonnaruwa (2019-05-17)

  • Beam Enrich: bump to 0.3.0 (#4061)
  • Beam Enrich: bump scala-common-enrich to 0.37.0 (#4060)
  • Beam Enrich: fix unit tests failing after update of MaxMind database (#4037)
  • Common: mention the contributing guide in the readme (#4007)
  • Common: bump release-manager to 0.4.1 (#4005)
  • Scala Common Enrich: bump to 0.37.0 (#4057)
  • Scala Common Enrich: make IpAddressExtractor fall back to the Forwarded: for= header as a last resort (#4014)
  • Scala Common Enrich: update Sendgrid integration (#4002)
  • Scala Common Enrich: add HTTP remote adapter (#3760)
  • Scala Common Enrich: add YAUAA enrichment (#4009)
  • Scala Common Enrich: create tutorial for adding an enrichment (#4039)
  • Scala Common Enrich: update WeatherEnrichmentSpec (#4073)
  • Scala Common Enrich: explore more relaxed URL parsing (#3880)
  • Scala Common Enrich: add support to IPs (v4) with port to IP lookup enrichment (#4048)
  • Scala Common Enrich: fix incompatibility between IAB enrichment and Iglu Webhook (#3952)
  • Scala Common Enrich: skip IAB enrichment for IPs v6 addresses (#4078)
  • Spark Enrich: bump to 1.18.0 (#4063)
  • Spark Enrich: bump scala-common-enrich to 0.37.0 (#4062)
  • Stream Enrich: bump to 0.21.0 (#4059)
  • Stream Enrich: bump scala-common-enrich to 0.37.0 (#4058)
  • EmrEtlRunner: extend backoff periods (#4049)
  • EmrEtlRunner: limit requests to the EMR API (#4056)
  • EmrEtlRunner: initialize S3::Aws constant (#4036)
  • EmrEtlRunner: bump to 0.34.2 (#4050)

Release 113 Filitosa (2019-02-27)

  • Scala Stream Collector: bump to 0.15.0 (#3983)
  • Scala Stream Collector: expose Prometheus metrics (#3421)
  • Scala Stream Collector: bump kafka client to 2.1.1 (#3981)
  • Scala Stream Collector: provide a way to add arbitrary Kafka configuration settings (#3968)
  • Scala Stream Collector: add support for an Access-Control-Max-Age header (#3904)
  • Scala Stream Collector: allow for the do not track cookie value configuration to be a regex (#3966)
  • Scala Stream Collector: showcase the usage of env variables in the configuration example (#3971)
  • Scala Stream Collector: extend copyright notice to 2019 (#3997)
  • Scala Common Enrich: bump to 0.36.0 (#3984)
  • Scala Common Enrich: add adapter to pre-process Hubspot webhooks (#3282)
  • Scala Common Enrich: change MarketoAdapter's last_interesting_moment_date type to date-time (#3967)
  • Scala Common Enrich: bump CallRail's call_complete to 1-0-2 (#2501)
  • Scala Common Enrich: support POST requests in API Request enrichment (#3857)
  • Scala Common Enrich: warn users of the user-agent-utils enrichment (#3964)
  • Scala Common Enrich: disable parallel test execution (#3970)
  • Scala Common Enrich: extend copyright notice to 2019 (#3998)
  • Beam Enrich: bump to 0.2.0 (#3990)
  • Beam Enrich: bump scala-common-enrich to 0.36.0 (#3989)
  • Beam Enrich: extend copyright notice to 2019 (#4001)
  • Stream Enrich: bump to 0.20.0 (#3986)
  • Stream Enrich: bump scala-common-enrich to 0.36.0 (#3985)
  • Stream Enrich: bump kafka client to 2.1.1 (#3992)
  • Stream Enrich: provide a way to add arbitrary Kafka configuration settings (#3969)
  • Stream Enrich: showcase the usage of env variables in the configuration example (#3972)
  • Stream Enrich: extend copyright notice to 2019 (#4000)
  • Spark Enrich: bump to 1.17.0 (#3988)
  • Spark Enrich: bump scala-common-enrich to 0.36.0 (#3987)
  • Spark Enrich: add test for the Hubspot adapter (#3977)
  • Spark Enrich: add test for the Marketo adapter (#3976)
  • Spark Enrich: use sbt-buildinfo (#3628)
  • Spark Enrich: extend copyright notice to 2019 (#3999)
  • EmrEtlRunner: bump to 0.34.1 (#3996)
  • EmrEtlRunner: add exponential backoff when getting cluster statuses (#3995)
  • EmrEtlRunner: update spark_enrich version in config sample to 1.17.0 (#3991)

Release 112 Baalbek (2019-02-19)

  • EmrEtlRunner: bump to 0.34.0 (#3935)
  • EmrEtlRunner: add support for running steps on a persistent EMR cluster (#3930)
  • EmrEtlRunner: recover from request timeout (#3943)
  • EmrEtlRunner: set timeout to 120 seconds (#3949)
  • EmrEtlRunner: leverage compaction steps when copying to S3 (#3940)
  • EmrEtlRunner: recover from S3 internal errors (#3950)
  • EmrEtlRunner: rename EMR steps (#3925)
  • EmrEtlRunner: scrub credentials from stderr (#2815)
  • EmrEtlRunner: bump elasticity to 6.0.14 (#3948)
  • EmrEtlRunner: extend copyright notice to 2019 (#3978)
  • Clojure Collector: bump to 2.1.3 (#3947)
  • Clojure Collector: add ebextension to increase the number of file descriptors to 65536 (#3876)
  • Clojure Collector: extend copyright notice to 2019 (#3979)
  • Redshift: widen geo_region to 3 characters (#3822)
  • Postgres: widen geo_region to 3 characters (#3946)
  • Event Manifest Populator: support s3a uri scheme (#3870)

Release 111 Selinunte (2018-10-02)

  • Clojure Collector: bump to 2.1.2 (#3900)
  • Clojure Collector: extend access control headers to all types of requests (#3899)
  • Common: remove Stream Enrich PubSub deployment (#3893)
  • Common: fix Spark Enrich version in R109 CHANGELOG entry (#3895)

Release 110 Valle dei Templi (2018-09-07)

  • Beam Enrich: implement a barebone port of Stream Enrich (#3735)
  • Beam Enrich: support enrichments relying on local files (#3736)
  • Beam Enrich: add support for the PII enrichment (#3888)
  • Beam Enrich: add metrics (#3737)
  • Beam Enrich: build Docker image (#3815)
  • Beam Enrich: add README (#3773)
  • Beam Enrich: add CI/CD (#3757)
  • Stream Enrich: bump to 0.19.1 (#3889)
  • Stream Enrich: remove GCP module (#3865)
  • Stream Enrich: fix parent event context in PII events (#3886)
  • Clojure Collector: bump to 2.1.1 (#3879)
  • Clojure Collector: update CORS configuration (#3875)
  • Clojure Collector: extend copyright notice to 2018 (#3891)
  • Common: add Bintray Docker registry credentials to .travis.yml (#3814)
  • Common: remove Vagrantfile (#3877)

Release 109 Lambaesis (2018-08-21)

  • Scala Stream Collector: bump to 0.14.0 (#3862)
  • Scala Stream Collector: respect a do not track cookie (#3825)
  • Scala Stream Collector: add a way to customize the response from the root path (#3670)
  • Scala Stream Collector: support HEAD requests (#3827)
  • Scala Stream Collector: allow for multiple domains in crossdomain.xml (#3740)
  • Scala Stream Collector: allow overriding of the kinesis endpoint url in the configuration (#3846)
  • Scala Stream Collector: turn BufferConfig's byteLimit and recordLimit into longs (#3807)
  • Scala Common Enrich: bump to 0.35.0 (#3861)
  • Scala Common Enrich: externalize ua-parser rule file (#3793)
  • Scala Common Enrich: bump ua-parser to 1.4.0 (#3811)
  • Scala Common Enrich: make the list of files to cache available from the EnrichmentRegistry (#3789)
  • Scala Common Enrich: change Iglu adapter to consider arrays as multiple events (#3858)
  • Scala Common Enrich: update CloudfrontAccessLogAdapter to support newer 26 field format (#3816)
  • Scala Common Enrich: leverage the x-forwarded-for field in CloudfrontLoader (#2859)
  • Scala Common Enrich: handle comma-separated list of ips (#3771)
  • Scala Common Enrich: bump SBT Bintray to 0.5.4 (#3840)
  • Scala Common Enrich: bump SBT to 1.1.6 (#3839)
  • Stream Enrich: bump to 0.19.0 (#3864)
  • Stream Enrich: bump scala-common-enrich to 0.35.0 (#3863)
  • Stream Enrich: allow overriding of the kinesis endpoint url in the configuration (#3775)
  • Stream Enrich: decorrelate the need for a pii stream and the pii enrichment (#3828)
  • Stream Enrich: bump SBT to 1.1.6 (#3841)
  • Stream Enrich: fix configuration example (#3820)
  • Spark Enrich: bump to 1.16.0 (#3874)
  • Spark Enrich: bump scala-common-enrich to 0.35.0 (#3869)
  • Spark Enrich: bump SBT Assembly to 0.14.7 (#3844)
  • Spark Enrich: bump SBT to 1.1.6 (#3843)
  • EmrEtlRunner: bump to 0.33.1 (#3867)
  • EmrEtlRunner: replace recursive functions by their iterative versions (#3866)
  • EmrEtlRunner: retrieve correct latest run id when using s3a (#3871)
  • EmrEtlRunner: update spark_enrich version in config samples to 1.16.0 (#3859)
  • Common: add pull request template (#3818)
  • Common: add issue template (#3819)
  • Common: update CONTRIBUTING.md (#3530)
  • Common: update to new logo in README (#3855)
  • Common: remove Vagrant setup (#3851)
  • Common: add Gitter badge (#3838)
  • Common: bump Travis Scala version to 2.11.12 (#3837)
  • Config: update example config for the UA parser enrichment to version 1-0-1 (#3868)
  • Event Manifest Populator: set visible_to_all_users flag to true (#3201)

Release 108 Val Camonica (2018-07-24)

  • EmrEtlRunner: bump to 0.33.0 (#3800)
  • EmrEtlRunner: add ability to specify an EMR security configuration (#3798)
  • EmrEtlRunner: handle SSE-S3 encrypted S3 buckets (#3456)
  • EmrEtlRunner: replace Sluice by aws-sdk-s3 (#3524)
  • EmrEtlRunner: add --ignore-lock-on-start option (#3537)
  • EmrEtlRunner: check the processing folder for emptiness when resuming from enrich (#3803)
  • EmrEtlRunner: make port in Snowplow monitoring configurable (#3236)
  • EmrEtlRunner: make protocol in Snowplow monitoring configurable (#3791)
  • Clojure Collector: bump to 2.1.0 (#3801)
  • Clojure Collector: make cookie path configurable (#2739)
  • Clojure Collector: do not allow dependencies requiring an HTTP repository (#3559)
  • Clojure Collector: bump lein-ring to 0.12.4 (#3783)
  • Clojure Collector: bump commons-codec to 1.11 (#3782)
  • Clojure Collector: bump metrics-clojure to 2.10.0 (#3781)
  • Clojure Collector: bump compojure to 1.6.1 (#3780)
  • Clojure Collector: bump clojure to 1.9.0 (#3779)
  • Clojure Collector: bump ring to 1.6.3 (#3778)
  • Clojure Collector: remove lein-beanstalk (#3784)

Release 107 Trypillia (2018-07-17)

  • Enrich: update example config for PII to version 2-0-0 (#3812)
  • Scala Common Enrich: add adapter to pre-process Marketo webhooks (#2616)
  • Scala Common Enrich: add adapter to pre-process Vero webhooks (#2757)
  • Scala Common Enrich: propagate the currency code to all the contexts which need it in the GA adapter (#3733)
  • Scala Common Enrich: add IAB Spiders & Robots Enrichment (#937)
  • Scala Common Enrich: bump to 0.34.0 (#3758)
  • Spark Enrich: bump scala-common-enrich to 0.34.0 (#3729)
  • Spark Enrich: add support for the IAB Enrichment (#3772)
  • Spark Enrich: bump to 1.15.0 (#3728)
  • Stream Enrich: bump scala-common-enrich to 0.34.0 (#3730)
  • Stream Enrich: add support for the IAB enrichment (#3797)
  • Stream Enrich: force jackson-databind to 2.9.3 for all projects (#3767)
  • Stream Enrich: rename force-ip-lookups-download to force-cached-files-download (#3809)
  • Stream Enrich: bump to 0.18.0 (#3727)
  • Common: fix travis deployment test (#3805)

Release 106 Acropolis (2018-06-14)

  • EmrEtlRunner: update rdb_shredder version in config samples to 0.13.1 (#3790)
  • EmrEtlRunner: update spark_enrich version in config samples to 1.14.0 (#3804)
  • Scala Common Enrich: add formats as ScalazJson4sUtils.extract as implicit parameter (#3668)
  • Scala Common Enrich: add salt to PII Enrichment (#3648)
  • Scala Common Enrich: bump to 0.33.0 (#3763)
  • Scala Common Enrich: bump user-agent-utils to 1.21 (#3656)
  • Scala Common Enrich: extend PII Enrichment to include identification events in EnrichedEvent (#3580)
  • Scala Common Enrich: fix platform specific error checking in IpLookupsEnrichmentSpec (#3762)
  • Scala Common Enrich: fix unnecessarily-created JSON object as a result of the PII Enrichment (#3636)
  • Spark Enrich: apply automated code formatting (#3655)
  • Spark Enrich: bump scala-common-enrich to 0.33.0 (#3764)
  • Spark Enrich: bump to 1.14.0 (#3765)
  • Spark Enrich: ignore PII identification events from Scala Common Enrich (#3582)
  • Spark Enrich: use automated code formatting (#3654)
  • Stream Enrich: add context for parent event when generating PII event (#3724)
  • Stream Enrich: add end-to-end test using mock streaming (#3639)
  • Stream Enrich: apply automated code formatting (#3651)
  • Stream Enrich: bump scala-common-enrich to 0.33.0 (#3607)
  • Stream Enrich: bump to 0.17.0 (#3608)
  • Stream Enrich: extend PII Enrichment to output a stream of PII identification events (#3581)
  • Stream Enrich: update config.hocon.sample to include a PII output stream (#3579)
  • Stream Enrich: use automated code formatting (#3644)

Release 105 Pompeii (2018-05-07)

  • Stream Enrich: bump to 0.16.1 (#3748)
  • Stream Enrich: ensure a one-to-one relationship between sink and record processor (#3745)
  • Stream Enrich: force jackson-databind to 2.9.3 (#3744)
  • Scala Common Enrich: update WeatherEnrichmentSpec (#3749)

Release 104 Stoplesteinan (2018-04-30)

  • Common: remove trailing hyphen from CHANGELOG entry for R103 (#3731)
  • EmrEtlRunner: fail fast when trying to skip staging or enrich in stream enrich mode (#3726)
  • EmrEtlRunner: factor out steps-generating function (#3718)
  • EmrEtlRunner: uncompress enriched files when copying to HDFS (#3719)
  • EmrEtlRunner: bump to 0.32.0 (#3723)
  • EmrEtlRunner: fix srcPattern for copying stream enriched data to HDFS (#3722)
  • EmrEtlRunner: check if whole enriched.good is non-empty in stream enrich mode (#3717)

Release 103 Paestum (2018-04-17)

  • Scala Common Enrich: bump to 0.32.0 (#3673)
  • Scala Common Enrich: bump scala-maxmind-iplookups to 0.4.0 (#3675)
  • Scala Common Enrich: update IP Lookups Enrichment to support non-legacy database (#3672)
  • Scala Common Enrich: support extraction of IP addresses in the Forwarded header (#3475)
  • Scala Common Enrich: support IPv6 addresses in the IpAddressExtractor (#3474)
  • Scala Common Enrich: bump mandrill event versions to 1-0-1 (#3372)
  • Stream Enrich: bump to 0.16.0 (#3698)
  • Stream Enrich: bump scala-common-enrich to 0.32.0 (#3676)
  • Stream Enrich: force jackson-dataformat-cbor to 2.9.3 (#3701)
  • Spark Enrich: bump to 1.13.0 (#3705)
  • Spark Enrich: bump scala-common-enrich to 0.32.0 (#3674)
  • Spark Enrich: downgrade geoip2 to 2.5.0 (#3702)
  • Clojure Collector: bump to 2.0.0 (#3708)
  • Clojure Collector: make Flash access domains and secure configurable (#2914)
  • Clojure Collector: retrieve configuration only through JVM properties (#3709)
  • Clojure Collector: allow HTTP repositories (#3707)
  • Clojure Collector: add CI/CD (#3712)
  • Config: update database value in config/enrichments/ip_lookups.json (#3671)
  • EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.13.0 (#3710)
  • Common: rename Caravel to Superset (#3595)
  • Common: redirect support request to discourse in CONTRIBUTING.md (#3478)

Release 102 Afontova Gora (2018-04-03)

  • EmrEtlRunner: bump to 0.31.0 (#3679)
  • EmrEtlRunner: add ability to skip load_manifest_check (#3680)
  • EmrEtlRunner: add CI/CD to update AMI bootstrap scripts (#3683)
  • EmrEtlRunner: add stream_config.yml.sample (#3685)
  • EmrEtlRunner: add support for shredding from Kinesis S3 Loader's enriched event output (#3606)
  • EmrEtlRunner: add bootstrap action to prepare AMI 5.x for Snowplow (#3601)
  • EmrEtlRunner: recover from RestClient::ServiceUnavailable when making status checks (#3539)
  • EmrEtlRunner: recover from RestClient::RequestTimeout when making status checks (#3468)
  • EmrEtlRunner: launch bootstrap action for AMI 5.x (#3609)
  • EmrEtlRunner: pass processing manifest config to RDB Shredder (#3619)
  • EmrEtlRunner: fail fast in build script (#3684)
  • EmrEtlRunner: fail fast on duplicated storage target id (#3652)
  • EmrEtlRunner: do not rescue on Exception (#3577)
  • Redshift: remove duplicate create events table comment (#3643)

Release 101 Neapolis (2018-03-21)

  • Scala Stream Collector: bump to 0.13.0 (#3682)
  • Scala Stream Collector: add Google Cloud PubSub sink (#3047)
  • Scala Stream Collector: split into multiple artifacts according to targeted platform (#3621)
  • Scala Stream Collector: expose number of requests over JMX (#3637)
  • Scala Stream Collector: move cross domain configuration to enabled-style (#3556)
  • Scala Stream Collector: truncate events exceeding the configured maximum size into a BadRow (#3587)
  • Scala Stream Collector: remove string interpolation false positive warnings (#3623)
  • Scala Stream Collector: update config.hocon.sample to support Google Cloud PubSub (#3049)
  • Scala Stream Collector: customize useragent for GCP API calls (#3658)
  • Scala Stream Collector: bump kafka-clients to 1.0.1 (#3660)
  • Scala Stream Collector: bump aws-java-sdk to 1.11.290 (#3665)
  • Scala Stream Collector: bump scala-common-enrich to 0.31.0 (#3666)
  • Scala Stream Collector: bump SBT to 1.1.1 (#3629)
  • Scala Stream Collector: bump sbt-assembly to 0.14.6 (#3667)
  • Scala Stream Collector: use sbt-buildinfo (#3626)
  • Scala Stream Collector: extend copyright notice to 2018 (#3687)
  • Stream Enrich: bump to 0.15.0 (#3681)
  • Stream Enrich: add Google Cloud PubSub source (#3150)
  • Stream Enrich: add Google Cloud PubSub sink (#3149)
  • Stream Enrich: split into multiple artifacts according to targeted platform (#3645)
  • Stream Enrich: rename etl version from kinesis to stream-enrich (#3642)
  • Stream Enrich: make source / sink configuration a coproduct (#3555)
  • Stream Enrich: add ability to retrieve resolver and enrichments from Google Cloud Datastore (#3152)
  • Stream Enrich: update config.hocon.sample to support Google Cloud PubSub (#3151)
  • Stream Enrich: customize useragent for GCP API calls (#3193)
  • Stream Enrich: bump kafka-clients to 1.0.1 (#3661)
  • Stream Enrich: bump amazon-kinesis-client to 1.9.0 (#3663)
  • Stream Enrich: bump aws-java-sdk to 1.11.290 (#3662)
  • Stream Enrich: bump SBT to 1.1.1 (#3657)
  • Stream Enrich: bump sbt-assembly to 0.14.6 (#3664)
  • Stream Enrich: use sbt-buildinfo (#3627)
  • Stream Enrich: extend copyright notice to 2018 (#3686)
  • Common: install Ruby 2.4.3 before deploy (#3689)
  • Common: fix CHANGELOG entry for R97 (#3630)

Release 100 Epidaurus (2018-02-26)

  • Redshift: widen se_label to 4,096 to support URLs etc (#196)
  • Redshift: widen sensitive columns in atomic.events to support pseudonymization (#3528)
  • Scala Common Enrich: add PII Enrichment (#3472)
  • Scala Common Enrich: apply automated code formatting (#3532)
  • Scala Common Enrich: bump commons-codec to 1.11 (#3638)
  • Scala Common Enrich: bump to 0.31.0 (#3598)
  • Scala Common Enrich: remove unused version member in Enrichment trait (#3541)
  • Scala Common Enrich: use automated code formatting (#3496)
  • Stream Enrich: bump scala-common-enrich to 0.31.0 (#3597)
  • Stream Enrich: bump to 0.14.0 (#3596)
  • Stream Enrich: use generated Settings for version in test (#3604)

Release 99 Carnac (2018-01-25)

  • Scala Common Enrich: bump to 0.30.0 (#3562)
  • Scala Common Enrich: add adapter for Google Analytics (#3560)
  • Scala Common Enrich: extend copyright notice to 2018 (#3574)
  • Spark Enrich: bump to 1.12.0 (#3565)
  • Spark Enrich: bump scala-common-enrich to 0.30.0 (#3563)
  • Spark Enrich: add tests for the Google Analytics adapter (#3561)
  • Spark Enrich: extend copyright notice to 2018 (#3573)
  • Spark Enrich: change Twitter repository url to https (#3593)
  • EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.12.0 (#3566)
  • Common: extend copyright notice to 2018 in READMEs (#3575)

Release 98 Argentomagus (2018-01-05)

  • Scala Stream Collector: bump to 0.12.0 (#3548)
  • Scala Stream Collector: make Flash access domains and secure configurable (#2915)
  • Scala Stream Collector: add URL redirect replacement macro (#3491)
  • Scala Stream Collector: allow use of the originating scheme during cookie bounce (#3512)
  • Scala Stream Collector: replace Location header with RawHeader to preserve double encoding (#3546)
  • Scala Stream Collector: bump nsq-java-client to 1.2.0 (#3519)
  • Scala Stream Collector: document the stdout sink better (#3515)
  • Scala Stream Collector: fix stdout sink configuration (#3550)
  • Scala Stream Collector: fix scaladoc for 'ipAndPartitionKey' (#3513)
  • Stream Enrich: bump to 0.13.0 (#3549)
  • Stream Enrich: bump scala-common-enrich to 0.29.0 (#3553)
  • Stream Enrich: bump nsq-java-client to 1.2.0 (#3520)
  • Scala Common Enrich: bump to 0.29.0 (#3552)
  • Scala Common Enrich: add validation of tracker-sent timestamps (#336)
  • Scala Common Enrich: add validation of collector_tstamp (#3416)
  • Redshift: update version of atomic.events to 0.9.0 (#3517)
  • Common: trigger the publishing of Stream Enrich when it is under test (#3557)

Release 97 Knossos (2017-12-18)

  • Common: reenable publishLocal in travis for spark enrich tests to pass (#3516)
  • Common: rename AWS deployment credentials in .travis.yml (#3115)
  • EmrEtlRunner: add ability to skip RDB Loader consistency check (#3529)
  • EmrEtlRunner: bump to 0.30.0 (#3526)
  • EmrEtlRunner: uncompress gzipped raw files when copying to HDFS (#3525)
  • EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.11.0 (#3002)
  • Scala Common Enrich: add Adapter to pre-process Olark events (#1014)
  • Scala Common Enrich: add adapter to pre-process Mailgun webhooks (#2734)
  • Scala Common Enrich: add adapter to pre-process StatusGator webhooks (#2169)
  • Scala Common Enrich: add adapter to pre-process Unbounce webhooks (#2615)
  • Scala Common Enrich: add function to camelCase all JSON fields in Adapter (#3538)
  • Scala Common Enrich: bump user-agent-utils to 1.20 (#2930)
  • Scala Common Enrich: default port to 443 if scheme is https (#3483)
  • Scala Common Enrich: make enrichments.ExtractEventTypeSpec timezone-safe (#3481)
  • Scala Common Enrich: remove toSecond parameter in Adapter (#3534)
  • Scala Common Enrich: tolerate content type for GET requests sent to Clojure Collector (#2743)
  • Scala Common Enrich: bump to 0.28.0 (#2725)
  • Spark Enrich: add test for Mailgun Adapter (#2763)
  • Spark Enrich: add test for Olark Adapter (#2792)
  • Spark Enrich: add test for StatusGator Adapter (#2722)
  • Spark Enrich: add test for Unbounce Adapter (#2745)
  • Spark Enrich: bump to 1.11.0 (#3533)
  • Spark Enrich: fix tests that fail when running on an alternative iglu service (#3503)
  • Spark Enrich: fix tests that fail with error when running on a platform that doesn't have native-lzo (#3508)
  • Spark Enrich: improve error message in test to show index line (#3494)
  • Spark Enrich: bump scala-common-enrich to 0.28.0 (#2724)

Release 96 Zeugma (2017-11-21)

  • Scala Stream Collector: bump to 0.11.0 (#3433)
  • Scala Stream Collector: update config.hocon.sample to support NSQ (#3294)
  • Scala Stream Collector: add NSQ sink (#2093)
  • Scala Stream Collector: make Kinesis, Kafka and NSQ config a coproduct (#3449)
  • Scala Stream Collector: keep sending records when the Kinesis stream is resharding (#3453)
  • Stream Enrich: bump to 0.12.0 (#3432)
  • Stream Enrich: update config.hocon.sample to support NSQ (#3339)
  • Stream Enrich: add NSQ sink (#3337)
  • Stream Enrich: add NSQ source (#3336)
  • Common: decorrelate CI/CD for Scala Stream Collector and Stream Enrich (#3441)

Release 95 Ellora (2017-11-13)

  • Redshift: add migration script for 0.8.0 to 0.9.0 (#3440)
  • Redshift: widen domain_sessionidx column in atomic.events from smallint to integer (#1788)
  • Redshift: update atomic.events to use ZSTD compression (#3435)
  • EmrEtlRunner: bump to 0.29.0 (#3469)
  • EmrEtlRunner: reintroduce processing directory not empty no-op (#3458)
  • EmrEtlRunner: retrieve the correct latest run ID during archive_shredded step (#3436)
  • EmrEtlRunner: fix pagination issue when retrieving latest run id (#3434)
  • EmrEtlRunner: update rdb_loader version in config.yml.sample to 0.14.0 (#3418)
  • EmrEtlRunner: update rdb_shredder version in config.yml.sample to 0.13.0 (#3460)
  • EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.10.0 (#3461)
  • EmrEtlRunner: bump AMI version in example config to 5.9.0 (#3465)
  • EmrEtlRunner: force bundler 1.15.4 during CI/CD (#3493)
  • Spark Enrich: overwrite output datasets (#3443)
  • Spark Enrich: bump to 1.10.0 (#3428)
  • Spark Enrich: add test for Cloudfront Sep 2016 (#3000)
  • Spark Enrich: bump scala-common-enrich to 0.27.0 (#3427)
  • Spark Enrich: bump Spark to 2.2.0 (#3466)
  • Scala Common Enrich: bump to 0.27.0 (#3429)
  • Scala Common Enrich: add support for new field in CloudFront access logs (#2933)
  • Config: add GCP mirror into config/iglu_resolver.json (#3430)
  • Storage: replace example Postgres storage target configuration with 1-1-0 (#3463)
  • Storage: replace example Redshift storage target configuration with 2-1-0 (#3462)
  • Data modeling: remove web model (#3471)

Release 94 Hill of Tara (2017-10-10)

  • Stream Enrich: bump to 0.11.1 (#3454)
  • Stream Enrich: keep sending records when the Kinesis stream is resharding (#3452)

Release 93 Virunum (2017-10-03)

  • Scala Stream Collector: bump to 0.10.0 (#3424)
  • Scala Stream Collector: replace spray by akka-http (#3299)
  • Scala Stream Collector: replace argot by scopt (#3298)
  • Scala Stream Collector: add support for cookie bounce (#2697)
  • Scala Stream Collector: allow raw query params (#3273)
  • Scala Stream Collector: add support for the Chinese Kinesis endpoint (#3335)
  • Scala Stream Collector: use the DefaultAWSCredentialsProviderChain for Kinesis Sink (#3245)
  • Scala Stream Collector: use Kafka callback based API to detect failures to send messages (#3317)
  • Scala Stream Collector: make Kafka sink more fault tolerant by allowing retries (#3367)
  • Scala Stream Collector: fix incorrect property used for kafkaProducer.batch.size (#3173)
  • Scala Stream Collector: configuration decoding with pureconfig (#3318)
  • Scala Stream Collector: stop making the assembly jar executable (#3410)
  • Scala Stream Collector: add config dependency (#3326)
  • Scala Stream Collector: upgrade to Java 8 (#3328)
  • Scala Stream Collector: bump Scala version to 2.11 (#3311)
  • Scala Stream Collector: bump SBT to 0.13.16 (#3312)
  • Scala Stream Collector: bump sbt-assembly to 0.14.5 (#3329)
  • Scala Stream Collector: bump aws-java-sdk-kinesis to 1.11 (#3310)
  • Scala Stream Collector: bump kafka-clients to 0.10.2.1 (#3325)
  • Scala Stream Collector: bump scala-common-enrich to 0.26.0 (#3305)
  • Scala Stream Collector: bump iglu-scala-client to 0.5.0 (#3309)
  • Scala Stream Collector: bump specs2-core to 3.9.4 (#3308)
  • Scala Stream Collector: bump scalaz-core to 7.0.9 (#3307)
  • Scala Stream Collector: bump joda-time to 2.9 (#3323)
  • Scala Stream Collector: remove commons-codec dependency (#3324)
  • Scala Stream Collector: remove snowplow-thrift-raw-event dependency (#3306)
  • Scala Stream Collector: remove joda-convert dependency (#3304)
  • Scala Stream Collector: remove mimepull dependency (#3302)
  • Scala Stream Collector: remove scalazon dependency (#3300)
  • Scala Stream Collector: run the unit tests systematically in Travis (#3409)
  • Stream Enrich: bump to 0.11.0 (#3425)
  • Stream Enrich: support AT_TIMESTAMP as initial position (#3360)
  • Stream Enrich: add ability to force re-download IP lookup databases on reboot (#3159)
  • Stream Enrich: add support for the Chinese Kinesis and DynamoDB endpoints (#3344)
  • Stream Enrich: replace argot by scopt (#3345)
  • Stream Enrich: use Kafka callback based API to detect failures to send messages (#2974)
  • Stream Enrich: make Kafka sink more fault tolerant by allowing retries (#2973)
  • Stream Enrich: make partition key for enriched event stream user-configurable (#1924)
  • Stream Enrich: fix incorrect property used for kafkaProducer.batch.size (#3380)
  • Stream Enrich: flush Kafka producer (#3342)
  • Stream Enrich: configuration decoding with pureconfig (#3394)
  • Stream Enrich: stop catching fatal errors (#1455)
  • Stream Enrich: stop making the assembly jar executable (#3411)
  • Stream Enrich: change package name (#3340)
  • Stream Enrich: add commons-codec dependency (#3349)
  • Stream Enrich: add json4s dependency (#3348)
  • Stream Enrich: upgrade to Java 8 (#3392)
  • Stream Enrich: bump Scala version to 2.11 (#3388)
  • Stream Enrich: bump SBT to 0.13.16 (#3382)
  • Stream Enrich: bump sbt-assembly to 0.14.5 (#3391)
  • Stream Enrich: bump kafka-clients to 0.10.2.1 (#3413)
  • Stream Enrich: bump config to 1.3.1 (#3412)
  • Stream Enrich: bump iglu-scala-client to 0.5.0 (#3387)
  • Stream Enrich: bump scalacheck to 1.11.3 (#3386)
  • Stream Enrich: bump scala-common-enrich to 0.26.0 (#3385)
  • Stream Enrich: bump specs2 to 2.3.13 (#3383)
  • Stream Enrich: bump scalaz-core to 7.0.9 (#3381)
  • Stream Enrich: bump amazon-kinesis-client to 1.8.1 (#3379)
  • Stream Enrich: bump aws-java-sdk to 1.11 (#3377)
  • Stream Enrich: remove scalaz-specs2 dependency (#3347)
  • Stream Enrich: remove scalazon dependency (#3341)
  • Stream Enrich: remove unused dependencies (#3346)
  • Stream Enrich: run the unit tests systematically in Travis (#3408)
  • Scala Common Enrich: bump to 0.26.0 (#3333)
  • Scala Common Enrich: drop Scala 2.10 (#3285)
  • Scala Common Enrich: replace akka-http with scalaj (#3330)
  • Scala Common Enrich: bump scala-uri to 0.5.0 (#2893)
  • Scala Common Enrich: bump scala-weather to 0.3.0 (#3334)
  • Kinesis Elasticsearch Sink: remove (#3275)

Release 92 Maiden Castle (2017-09-11)

  • EmrEtlRunner: release lock in case of no-op (#3396)
  • EmrEtlRunner: treat archive_enriched and archive_shredded as separate steps (#3401)
  • EmrEtlRunner: do not pass --skip shred to RDB Loader when skipping RDB Shredder (#3403)
  • EmrEtlRunner: if RDB Loader step hangs and is cancelled, logs are not retrieved (#3399)
  • EmrEtlRunner: ensure appropriate log level for RDB logs (#3369)
  • EmrEtlRunner: unlink downloaded RDB logs (#3363)
  • EmrEtlRunner: do not try to download non-existent RDB loader log files (#3405)
  • EmrEtlRunner: rescue the intermittent RestClient::SSLCertificateNotVerified error (#2572)
  • EmrEtlRunner: pass GZIP compression argument to S3DistCp as "gz" not "gzip" (#3415)
  • EmrEtlRunner: update rdb_loader version in config.yml.sample to 0.13.0 (#3418)
  • EmrEtlRunner: bump to 0.28.0 (#3404)
  • Documentation: fix broken links in storage/postgres's README.md (#3390)
  • RDB Shredder: remove (#3398)
  • RDB Loader: remove (#3393)

Release 91 Stonehenge (2017-08-17)

  • EmrEtlRunner: use S3DistCp not Sluice for staging step (#276)
  • EmrEtlRunner: add an S3DistCp step for the _SUCCESS file produced by RDB Shredder (#3137)
  • EmrEtlRunner: add step to delete raw events from HDFS before shredding (#2545)
  • EmrEtlRunner: use S3DistCp to move raw files from S3 to HDFS for all collector formats (#3136)
  • EmrEtlRunner: add file- and Consul-based locking mechanism (#3352)
  • EmrEtlRunner: move current behavior into a run command (#3104)
  • EmrEtlRunner: add lint command which validates Iglu resolver and enrichments (#1946)
  • EmrEtlRunner: add backend for a generate command (#3105)
  • EmrEtlRunner: add --resume-from option (#3128)
  • EmrEtlRunner: remove support for --start and --end flags (#3132)
  • EmrEtlRunner: remove support for --process-enrich and --process-shred flags (#3365)
  • EmrEtlRunner: handle run= sub-folders if resuming from shred (#2693)
  • EmrEtlRunner: add "ongoing run" message on exit with return code 4 (#3129)
  • EmrEtlRunner: add "no logs to process" message on exit with return code 3 (#2644)
  • EmrEtlRunner: retrieve RDB loader logs only when it failed or the entire run was successful (#3361)
  • EmrEtlRunner: bump rspec to 3.5.0 (#3116)
  • EmrEtlRunner: bump to 0.27.0 (#3358)

Release 90 Lascaux (2017-07-26)

  • Common: update CI/CD to push S3 artifacts to all regional Hosted Assets buckets (#3242)
  • Common: add CI/CD to deploy RDB Loader to Snowplow Hosted Assets (#3025)
  • Common: no longer bundle StorageLoader in Bintray download (#3024)
  • Storage: replace example Redshift storage target configuration with 2-0-0 (#3281)
  • Event Manifest Populator: bump to 0.1.1 (#3295)
  • Event Manifest Populator: support pre-R83 enriched events (#3293)
  • EmrEtlRunner: make targets loading consistent with enrichments (#3268)
  • EmrEtlRunner: expose arbitrary EMR configuration options (#3255)
  • EmrEtlRunner: add maximizeResourceAllocation option to EMR cluster configuration (#3253)
  • EmrEtlRunner: move max attempts configuration to EMR cluster configuration (#3246)
  • EmrEtlRunner: use Elasticity to specify Thrift-specific configuration (#3252)
  • EmrEtlRunner: bump elasticity version to 6.0.12 (#3249)
  • EmrEtlRunner: remove storage.download from config.yml.sample (#3265)
  • EmrEtlRunner: add rdb_loader to config.yml.sample (#3266)
  • EmrEtlRunner: add S3DistCp step to move enriched and shredded files to archive (#1777)
  • EmrEtlRunner: add RDB Loader step for each target (#3121)
  • EmrEtlRunner: bump to 0.26.0 (#3254)
  • RDB Loader: fix eventual consistency problem (#3113)
  • RDB Loader: load all runs from shredded, not just the first run found (#2962)
  • RDB Loader: remove compupdate step (#3178)
  • RDB Loader: add logging around database load, analyze and vacuum (#2935)
  • RDB Loader: use Redshift-specific driver to connect to Redshift (#1830)
  • RDB Loader: remove StorageLoader (#3026)
  • RDB Loader: accept storage target JSONs on command-line (#3022)
  • RDB Loader: rewrite StorageLoader in Scala, removing file archiving step (#3023)
  • Java Tracker: bump git submodule to 0.8.2 (#3260)
  • Ruby Tracker: bump git submodule to 0.6.1 (#3264)
  • .NET Tracker: bump git submodule to 1.0.2 (#3258)
  • Python Tracker: bump git submodule to 0.8.0 (#3263)
  • Golang Tracker: bump git submodule to 1.1.0 (#3259)
  • Node.js Tracker: bump git submodule to 0.3.0 (#3262)
  • Android Tracker: bump git submodule to 0.6.2 (#3257)
  • JavaScript Tracker: bump git submodule to 2.8.0 (#3261)

Release 89 Plain of Jars (2017-06-12)

  • Documentation: fix incorrect hyphen underlining for R88 (#3198)
  • Common: refactor CI/CD deploy scripts into one (#3100)
  • Common: update CI/CD to deploy Spark Enrich (#3069)
  • Common: refactor CI/CD is release tag scripts into one (#3101)
  • Common: update CI/CD to deploy RDB Shredder (#3038)
  • Common: fix travis build due to the changes to the precise image (#3210)
  • Common: build local Scala Common Enrich before publishing Kinesis-related artifacts (#3220)
  • Common: add Sonatype credentials to .travis.yml (#3217)
  • Common: bump Scala to 2.11 in .travis.yml (#3227)
  • Scala Common Enrich: bump to 0.25.0 (#3089)
  • Scala Common Enrich: bump iglu-scala-client to 0.5.0 (#3092)
  • Scala Common Enrich: remove scala-util (#3054)
  • Scala Common Enrich: get rid of deprecated erasure method calls (#3008)
  • Scala Common Enrich: bump scalaz to 7.0.9 (#3055)
  • Scala Common Enrich: bump scalding-args to 0.13.0 (#3058)
  • Scala Common Enrich: bump specs2 to 2.3.13 (#3059)
  • Scala Common Enrich: bump scalaz-specs2 to 0.2 (#3060)
  • Scala Common Enrich: bump scala-forex to 0.5.0 (#3057)
  • Scala Common Enrich: bump sbt to 0.13.13 (#3056)
  • Scala Common Enrich: bump Scala to 2.11.11 (#3007)
  • Scala Common Enrich: add Scala 2.11 cross-building (#3061)
  • Scala Common Enrich: make EnrichedEvent Serializable (#3081)
  • Scala Common Enrich: fix failing WeatherEnrichmentSpec expectation (#3205)
  • Scala Common Enrich: remove ScalazArgs (#3209)
  • Scala Common Enrich: upgrade to Java 8 (#3212)
  • Scala Common Enrich: add CI/CD (#3216)
  • Spark Enrich: bump to 1.9.0 (#3072)
  • Spark Enrich: rename from Scala Hadoop Enrich (#3064)
  • Spark Enrich: change the package from hadoop to spark (#3076)
  • Spark Enrich: bump sbt-assembly to 0.14.3 (#3078)
  • Spark Enrich: bump SBT to 0.13.13 (#3065)
  • Spark Enrich: port from Scalding to Spark (#3067)
  • Spark Enrich: bump scala-common-enrich to 0.25 (#3096)
  • Spark Enrich: bump Scalaz to 7.0.9 (#3097)
  • Spark Enrich: bump iglu-scala-client to 0.5.0 (#3098)
  • Spark Enrich: bump specs2-core to 2.3.13 (#3099)
  • Spark Enrich: bump Scala version to 2.11 (#3070)
  • Spark Enrich: upgrade to Java 8 (#2381)
  • Spark Enrich: fix SqlQueryEnrichmentCfLinesSpec (#3224)
  • Spark Enrich: fix CurrencyConversionTransactionSpec (#3225)
  • Spark Enrich: run the unit tests systematically in Travis (#3228)
  • EmrEtlRunner: bump to 0.25.0 (#3039)
  • EmrEtlRunner: update to run Spark Enrich instead of Scala Hadoop Enrich (#3066)
  • EmrEtlRunner: update to run RDB Shredder instead of Scala Hadoop Shred (#3033)
  • EmrEtlRunner: add ability to run Spark jobs (#641)
  • EmrEtlRunner: replace hadoop_shred in config.yml.sample with rdb_shredder (#3035)
  • EmrEtlRunner: bump elasticity version to 6.0.11 (#3053)
  • EmrEtlRunner: use the Scalding step provided by Elasticity (#3052)
  • EmrEtlRunner: replace hadoop_enrich in config.yml.sample with spark_enrich (#3068)
  • EmrEtlRunner: bump AMI version in example config to 5.5.0 (#3207)
  • RDB Shredder: bump to 0.12.0 (#3042)
  • RDB Shredder: rename from Scala Hadoop Shred (#3031)
  • RDB Shredder: move from 3-enrich to 4-storage (#3032)
  • RDB Shredder: change the package to storage from enrich (#3036)
  • RDB Shredder: port from Scalding to Spark (#3034)
  • RDB Shredder: bump scala-common-enrich to 0.25 (#3091)
  • RDB Shredder: bump iglu-scala-client to 0.5.0 (#3090)
  • RDB Shredder: bump specs2-core to 2.3.13 (#3093)
  • RDB Shredder: bump Scala version to 2.11 (#3071)
  • RDB Shredder: upgrade to Java 8 (#3213)
  • RDB Shredder: run the unit tests systematically in Travis (#3229)
  • StorageLoader: bump to 0.11.0 (#3214)
  • StorageLoader: add support for Spark-based Shredder's directory structure (#3044)

Release 88 Angkor Wat (2017-04-27)

  • Documentation: fix incorrect release date for R87 (#3126)
  • Common: update copyright years in README (#3148)
  • Common: add CI/CD for EmrEtlRunner and StorageLoader (#3102)
  • Common: add CI/CD for Event Manifest Populator (#3170)
  • Common: add AWS staging credentials to .travis.yml (#3114)
  • Common: update script to sync ap-northeast-2 (Seoul) Snowplow Hosted Assets bucket (#3160)
  • Common: update READMEs markdown in accordance with CommonMark (#3157)
  • Event Manifest Populator: add Spark job to backpopulate DynamoDB duplicate storage (#3158)
  • Scala Common Enrich: fix failing WeatherEnrichmentSpec expectation (#3154)
  • Scala Common Enrich: bump to 0.24.1 (#3155)
  • Scala Hadoop Shred: bump sbt-assembly to 0.14.4 (#3140)
  • Scala Hadoop Shred: bump SBT to 0.13.13 (#2972)
  • Scala Hadoop Shred: bump to 0.11.0 (#3041)
  • Scala Hadoop Shred: remove explicit jackson-databind dependency (#3138)
  • Scala Hadoop Shred: add cross-batch natural deduplication (#2999)
  • Storage: add example storage target configuration JSONs (#2990)
  • StorageLoader: bump to 0.10.0 (#3109)
  • StorageLoader: remove Northern Virginia endpoint for Postgres load (#3143)
  • StorageLoader: handle return code of 4 for EmrEtlRunner in snowplow-runner-and-loader.sh (#3139)
  • StorageLoader: use storage target JSONs instead of targets section in config.yml (#2992)
  • StorageLoader: replace table configuration property with schema (#2458)
  • EmrEtlRunner: bump to 0.24.0 (#3040)
  • EmrEtlRunner: update hadoop_shred version in config.yml.sample to 0.11.0 (#3197)
  • EmrEtlRunner: add script to convert config.yml targets section into JSON format (#3135)
  • EmrEtlRunner: remove targets section from config.yml.sample (#2989)
  • EmrEtlRunner: no longer use sources property when loading Elasticsearch (#2993)
  • EmrEtlRunner: use storage target JSONs instead of targets section in config.yml (#2991)

Release 87 Chichen Itza (2017-02-21)

  • EmrEtlRunner: bump to 0.23.0 (#2960)
  • EmrEtlRunner: bump JRuby version to 9.1.6.0 (#3050)
  • EmrEtlRunner: bump Elasticity to 6.0.10 (#3013)
  • EmrEtlRunner: remove AnonIpHash from contracts.rb (#2523)
  • EmrEtlRunner: remove UnmatchedLzoFilesError check (#2740)
  • EmrEtlRunner: use S3DistCp not Sluice for archive_raw step (#1977)
  • EmrEtlRunner: add warning about the array of in buckets in config.yml (#2462)
  • EmrEtlRunner: add dedicated return code of 4 for DirectoryNotEmptyError (#2546)
  • EmrEtlRunner: add support for specifying EBS for Hadoop workers (#2950)
  • EmrEtlRunner: add example EBS configuration to config.yml.sample (#3012)
  • EmrEtlRunner: catch Elasticity ThrottlingExceptions while waiting for EMR (#3028)
  • EmrEtlRunner: catch Elasticity ArgumentErrors while waiting for EMR (#3027)
  • StorageLoader: bump to 0.9.0 (#2961)
  • StorageLoader: bump JRuby version to 9.1.6.0 (#3051)
  • StorageLoader: fix typo in S3Tasks.download_events (#2888)
  • StorageLoader: update manifest table as part of Redshift load transaction (#2280)
  • Redshift: added manifest table (#2265)

Release 86 Petra (2016-12-20)

  • Common: add AWS credentials to .travis.yml (#2963)
  • Common: add CI/CD for Scala Hadoop Enrich (#2982)
  • Common: add CI/CD for Scala Hadoop Shred (#2928)
  • Common: migrate Hadoop Event Recovery deployment to Release Manager (#2983)
  • Common: remove short-hostname addon from travis.yml (#2674)
  • Common: update script to sync us-east-2 (Ohio) Snowplow Hosted Assets bucket (#2986)
  • Common: update script to sync ca-central-1 (Montreal) Snowplow Hosted Assets bucket (#3004)
  • Common: update script to sync eu-west-2 (London) Snowplow Hosted Assets bucket (#3005)
  • Common: use AWS environment variables to sync Snowplow Hosted Assets buckets (#2985)
  • Scala Hadoop Shred: bump to 0.10.0 (#2979)
  • Scala Hadoop Shred: add general top-level exception handling (#2071)
  • Scala Hadoop Shred: get the CustomPartitionSourceTest working with Hadoop 2.4 (#1960)
  • Scala Hadoop Shred: fix omitted string interpolation (#2562)
  • Scala Hadoop Shred: deduplicate event_ids with different event_fingerprints (synthetic duplicates) (#24)
  • Scala Hadoop Shred: stop catching fatal errors (#1456)
  • EmrEtlRunner: update hadoop_shred version in config.yml.sample to 0.10.0 (#3003)
  • Data modeling: add drill fields to web block (#2956)
  • Data modeling: resolve issues with web model (#2954)
  • Data modeling: restrict table scan on deduplication queries (#2929)
  • Data modeling: add web model (#2925)
  • Data modeling: delete example models (#2836)
  • Data modeling: remove outdated recipes (#2626)

Release 85 Metamorphosis (2016-11-15)

  • Scala Stream Collector: bump to 0.9.0 (#2936)
  • Scala Stream Collector: add Kafka sink (#2937)
  • Scala Stream Collector: update config.hocon.sample to support Kafka (#2943)
  • Scala Stream Collector: move sink.kinesis.buffer to sink.buffer in config.hocon.sample (#2938)
  • Stream Enrich: bump to 0.10.0 (#2942)
  • Stream Enrich: add Kafka sink (#2939)
  • Stream Enrich: add Kafka source (#2941)
  • Stream Enrich: update config.hocon.sample to support Kafka (#2940)
  • Stream Enrich: fix incorrect parsing of S3 urls (#2921)

Release 84 Steller's Sea Eagle (2016-10-07)

  • Common: standardise sbt-assembly settings (#2900)
  • Common: refactor Kinesis release CI/CD (#2887)
  • Common: update script to sync ap-south-1 (Mumbai) Snowplow Hosted Assets bucket (#2903)
  • Scala Stream Collector: bump to 0.8.0 (#2886)
  • Scala Stream Collector: add scala_ into artifact filename in Bintray (#2843)
  • Scala Stream Collector: use nuid query parameter value to set the 3rd party network id cookie (#2512)
  • Scala Stream Collector: configurable cookie path (#2528)
  • Scala Stream Collector: call Config.resolve() to resolve environment variables in hocon (#2879)
  • Stream Enrich: bump to 0.9.0 (#2728)
  • Stream Enrich: bump Scala Tracker to 0.3.0 (#2898)
  • Stream Enrich: bump Scala Common Enrich to 0.24.0 (#2729)
  • Stream Enrich: tolerate trailing slashes for paths in IP Lookups Enrichment configuration (#2744)
  • Stream Enrich: call Config.resolve() to resolve environment variables in hocon (#2878)
  • Kinesis Elasticsearch Sink: bump to 0.8.0 (#2885)
  • Kinesis Elasticsearch Sink: bump Scala Tracker to 0.3.0 (#2899)
  • Kinesis Elasticsearch Sink: allow parametrized timeouts for jest client (#2897)
  • Kinesis Elasticsearch Sink: does not take into account buffer configurations (#2895)
  • Kinesis Elasticsearch Sink: error messages are not helpful (#2896)
  • Kinesis Elasticsearch Sink: ensure field names do not contain any dots (#2894)
  • Kinesis Elasticsearch Sink: add support for Elasticsearch 2.x (#2525)
  • Kinesis Elasticsearch Sink: call Config.resolve() to resolve environment variables in hocon (#2880)
  • StorageLoader: remove all JSON Path files (#2905)
  • Redshift: remove all Redshift DDL for Iglu Central schemas (#2904)

Release 83 Bald Eagle (2016-09-06)

  • Scala Tracker: bump git submodule to 0.3.0 (#2726)
  • ActionScript 3.0 Tracker: bump git submodule to 0.3.0 (#2727)
  • Scala Common Enrich: bump to 0.24.0 (#2715)
  • Scala Common Enrich: add SQL Query Enrichment (#2321)
  • Scala Common Enrich: add POST support to IgluAdapter (#1184)
  • Scala Hadoop Enrich: bump to 1.8.0 (#2716)
  • Scala Hadoop Enrich: bump Scala Common Enrich to 0.24.0 (#2717)
  • Scala Hadoop Enrich: add test for SQL Query Enrichment (#2718)
  • Scala Hadoop Enrich: make resolver config in JobSpecHelpers injectable (#2825)
  • EmrEtlRunner: bump to 0.22.0 (#2784)
  • EmrEtlRunner: bump Ruby version to 2.2.3 (#2869)
  • EmrEtlRunner: bump Sluice to 0.4.0 (#1708)
  • EmrEtlRunner: bump Contracts to 0.9 (#2789)
  • EmrEtlRunner: rebuild Gemfile.lock (#2872)
  • EmrEtlRunner: add version recognition of currently installed commons-codec (#2735)
  • EmrEtlRunner: update snowplow-ami4-bootstrap.sh to take optional commons-codec version argument (#2713)
  • EmrEtlRunner: fix bug with double compression in shred step if enrich skipped (#2586)
  • EmrEtlRunner: pass GZIP compression argument to S3DistCp as "gz" not "gzip" (#2679)
  • EmrEtlRunner: update hadoop_enrich version in config.yml.sample to 1.8.0 (#2756)
  • EmrEtlRunner: replace deprecated Dir.exists? with Dir.exist? (#2799)
  • EmrEtlRunner: fix contract for fatal_with (#2810)
  • EmrEtlRunner: use region-specific Snowplow Hosted Assets buckets (#2813)
  • EmrEtlRunner: disable contract on build_fix_filenames due to Contracts issue #238 (#2828)
  • Storage: add Kinesis S3 git submodule (#2706)
  • StorageLoader: bump to 0.8.0 (#2785)
  • StorageLoader: bump Ruby version to 2.2.3 (#2870)
  • StorageLoader: bump Sluice to 0.4.0 (#2786)
  • StorageLoader: bump Contracts to 0.9 (#2790)
  • StorageLoader: add explicit mime-types dependency (#2805)
  • StorageLoader: rebuild Gemfile.lock (#2871)
  • StorageLoader: use Northern Virginia endpoint not global endpoint for us-east-1 (#2748)
  • StorageLoader: replace module_function everywhere with self (#2801)
  • StorageLoader: fix broken contracts (#2461)
  • StorageLoader: write JSON path for com.amazon.aws.lambda/s3_notification_event (#2590)
  • StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_foreground/jsonschema/1-0-0 (#2857)
  • StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_background/jsonschema/1-0-0 (#2856)
  • StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_error/jsonschema/1-0-0 (#2855)
  • Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_foreground/jsonschema/1-0-0 (#2854)
  • Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_background/jsonschema/1-0-0 (#2853)
  • Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_error/jsonschema/1-0-0 (#2852)
  • Redshift: add Redshift DDL for com.amazon.aws.lambda/s3_notification_event/jsonschema/1-0-0 (#2589)

Release 82 Tawny Eagle (2016-08-08)

  • Common: publish each Kinesis app individually to Bintray (#2492)
  • Kinesis Elasticsearch Sink: bump to 0.7.0 (#2816)
  • Kinesis Elasticsearch Sink: configure transport port (#2102)
  • Kinesis Elasticsearch Sink: add support for HTTP protocol (#2092)
  • Kinesis Elasticsearch Sink: unify logger configuration (#1699)

Release 81 Kangaroo Island Emu (2016-06-16)

  • Documentation: fix broken link in Thrift Schemas' README.md (#2498)
  • Common: add encrypted S3 credentials to .travis.yml (#2673)
  • Common: delete publish-kinesis-release.bash (#2711)
  • Android Tracker: bump git submodule to 0.5.4 (#2710)
  • JavaScript Tracker: bump git submodule to 2.6.1 (#2708)
  • Objective-C Tracker: bump git submodule to 0.6.1 (#2709)
  • Golang Tracker: add git submodule (#2619)
  • Scala Common Enrich: bump to 0.23.1 (#2699)
  • Scala Common Enrich: bump commons-codec to 1.10 (#2691)
  • Stream Enrich: bump to 0.8.1 (#2701)
  • Stream Enrich: bump Scala Common Enrich to 0.23.1 (#2700)
  • Hadoop Event Recovery: update README instructions (#2348)
  • Hadoop Event Recovery: add continuous deployment (#2692)
  • Hadoop Event Recovery: rename from Scala Hadoop Bad Rows (#2694)
  • Hadoop Event Recovery: allow source row to be transformed with JavaScript (#2223)
  • Hadoop Event Recovery: capitalize Snowplow correctly in copyright notices (#2641)
  • StorageLoader: write JSON path for com.clearbit/person (#2631)
  • StorageLoader: write JSON path for com.clearbit/company (#2632)
  • StorageLoader: write JSON path for com.amazon.aws.lambda/java_context (#2560)
  • Redshift: add Redshift DDL for com.clearbit/person/jsonschema/1-0-0 (#2633)
  • Redshift: add Redshift DDL for com.clearbit/company/jsonschema/1-0-0 (#2634)
  • Redshift: add Redshift DDL for com.amazon.aws.lambda/java_context/jsonschema/1-0-0 (#2559)

Release 80 Southern Cassowary (2016-05-30)

  • Common: add CI/CD for Kinesis apps (#2621)
  • Common: add Bintray credentials to .travis.yml (#2618)
  • Common: change Kinesis pipeline status from "Beta" to "Production-ready" in READMEs (#2629)
  • Config: update config/iglu_resolver.json version to 1-0-1 (#2479)
  • Scala Stream Collector: bump to 0.7.0 (#2595)
  • Scala Stream Collector: increase tolerance of timings in tests (#2614)
  • Scala Stream Collector: send nonempty response to POST requests (#2606)
  • Scala Stream Collector: crash when unable to find stream instead of hanging (#2583)
  • Scala Stream Collector: stop using deprecated Config.getMilliseconds method (#2570)
  • Scala Stream Collector: move example configuration file to examples folder (#2566)
  • Scala Stream Collector: upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2384)
  • Scala Stream Collector: crash rather than hanging when unable to bind to the supplied port (#2551)
  • Scala Stream Collector: bump Spray version to 1.3.3 (#2522)
  • Scala Stream Collector: bump Scala version to 2.10.5 (#2565)
  • Scala Stream Collector: fix omitted string interpolation (#2561)
  • Stream Enrich: bump to 0.8.0 (#2596)
  • Stream Enrich: bump Common Enrich to 0.23.0 (#2612)
  • Stream Enrich: bump Iglu Scala Client to 0.4.0 (#2688)
  • Stream Enrich: add configuration setting for MaxRecords (#2610)
  • Stream Enrich: use nonEmpty method to check whether lists are empty (#2608)
  • Stream Enrich: refactor functions to avoid return keyword (#2607)
  • Stream Enrich: upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2598)
  • Stream Enrich: crash when unable to find stream instead of hanging (#2584)
  • Stream Enrich: add standard copyright notice to AbstractSourceSpec.scala (#2580)
  • Stream Enrich: make logging more succinct in case of failure (#1723)
  • Stream Enrich: move example configuration file to examples folder (#2567)
  • Stream Enrich: remove src/main/resolver.json.sample (#1932)
  • Stream Enrich: use json4s to combine the enrichment configuration JSONs (#2259)
  • Kinesis Elasticsearch Sink: bump to 0.6.0 (#2597)
  • Kinesis Elasticsearch Sink: add configuration setting for MaxRecords (#2611)
  • Kinesis Elasticsearch Sink: crash when unable to find stream instead of hanging (#2585)
  • Kinesis Elasticsearch Sink: move example configuration file to examples folder (#2568)

Release 79 Black Swan (2016-05-12)

  • Documentation: removed closes from CHANGELOG tickets for R78 (#2534)
  • Common: changed Vagrantfile to use NFS and extra CPU cores by default (#2482)
  • Config: removed duplicated enabled property in ua_parser_config.json (#2424)
  • Config: enabled switched to false in currency_conversion_config.json (#2327)
  • Config: enabled switched to false in weather_enrichment_config.json (#2326)
  • EmrEtlRunner: bumped AMI version in example config to 4.5.0 (#2604)
  • EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.7.0 (#2661)
  • EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.9.0 (#2662)
  • Scala Common Enrich: bumped user-agent-utils version to latest (#2516)
  • Scala Common Enrich: transaction item quantity type changed to JInteger (#2157)
  • Scala Common Enrich: bumped to 0.23.0 (#2486)
  • Scala Common Enrich: improved OWM error if user doesn't have historical weather (#2325)
  • Scala Common Enrich: added API Request Enrichment (#2051)
  • Scala Common Enrich: bumped Iglu Scala Client to 0.4.0 (#2333)
  • Scala Common Enrich: added HTTP Header Extractor Enrichment (#1373)
  • Scala Hadoop Enrich: bumped to 1.7.0 (#2446)
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.23.0 (#2485)
  • Scala Hadoop Enrich: bumped Iglu Scala Client to 0.4.0 (#2478)
  • Scala Hadoop Enrich: added test for API Request Enrichment (#2603)
  • Scala Hadoop Shred: bumped to 0.9.0 (#2480)
  • Scala Hadoop Shred: bumped Scala Common Enrich to 0.23.0 (#2481)
  • Scala Hadoop Shred: bumped Iglu Scala Client to 0.4.0 (#2449)

Release 78 Great Hornbill (2016-03-15)

  • Common: removed openjdk7 from .travis.yml (#2533)
  • Scala Common Enrich: bumped to 0.22.0
  • Scala Common Enrich: added handling for bad rows which are too long to print in full (#2419)
  • Kinesis: updated publish-kinesis-release.bash (#2477)
  • Scala Stream Collector: bumped to 0.6.0
  • Scala Stream Collector: added Scala Common Enrich as a library dependency (#2153)
  • Scala Stream Collector: added click redirect mode (#549)
  • Scala Stream Collector: configured the ability to use IP address as partition key (#2331)
  • Scala Stream Collector: converted bad rows to new format (#2006)
  • Scala Stream Collector: shared a single thread pool for all writes to Kinesis (#2369)
  • Scala Stream Collector: specified UTF-8 encoding everywhere (#2147)
  • Scala Stream Collector: made cookie name customizable, thanks @kazjote! (#2474)
  • Scala Stream Collector: added boolean collector.cookie.enabled setting (#2488)
  • Scala Stream Collector: made backoffPolicy fields macros (#2518)
  • Scala Stream Collector: updated AWS credentials to support iam/env/default not cpf (#1518)
  • Scala Kinesis Enrich: bumped to 0.7.0
  • Scala Kinesis Enrich: renamed to Stream Enrich (#2418)
  • Scala Kinesis Enrich: bumped Kinesis Client Library to 1.6.1 (#1823)
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.21.0 (#2033)
  • Scala Kinesis Enrich: bumped Iglu Scala Client to 0.3.1 (#2080)
  • Scala Kinesis Enrich: configured the ability to use IP address as partition key (#2332)
  • Scala Kinesis Enrich: started emitting KCL metrics to CloudWatch (#2357)
  • Scala Kinesis Enrich: converted bad rows to new format (#1207)
  • Scala Kinesis Enrich: removed outdated comment about ClasspathPropertiesFileCredentialsProvider from sample config file (#1519)
  • Scala Kinesis Enrich: removed redundant documentation from README (#2032)
  • Scala Kinesis Enrich: updated test suite with valid self-describing JSONs (#2151)
  • Scala Kinesis Enrich: updated Scala Tracker to 0.2.0 and enabled EC2 context (#2109)
  • Scala Kinesis Enrich: updated to use new EtlPipeline (#1933)
  • Scala Kinesis Enrich: specified UTF-8 encoding everywhere (#2148)
  • Kinesis Elasticsearch Sink: bumped to 0.5.0
  • Kinesis Elasticsearch Sink: bumped Kinesis Client Library to 1.6.1 (#1824)
  • Kinesis Elasticsearch Sink: bumped Scala Common Enrich to 0.22.0 (#2152)
  • Kinesis Elasticsearch Sink: added mixed output mode (#2412)
  • Kinesis Elasticsearch Sink: added new canonical event fields (#2089)
  • Kinesis Elasticsearch Sink: moved the stream-type setting into the main sink configuration object (#2490)
  • Kinesis Elasticsearch Sink: made source and sink fields macros (#2519)
  • Kinesis Elasticsearch Sink: renamed Build object to match project (#2002)
  • Kinesis Elasticsearch Sink: converted bad rows to new format (#1208)
  • Kinesis Elasticsearch Sink: updated schema regular expression in line with Iglu Central (#1998)
  • Kinesis Elasticsearch Sink: cached the mapping of field name to field type (#2090)
  • Kinesis Elasticsearch Sink: specified UTF-8 encoding everywhere (#2149)
  • Kinesis Elasticsearch Sink: stopped sending timestamp instead of failure count (#1951)
  • Kinesis Elasticsearch Sink: made performance of conversion from TSV to JSON linear (#1847)
  • Kinesis Elasticsearch Sink: updated to latest version of EnrichedEvent (#2089)

Release 77 Great Auk (2016-02-28)

  • Documentation: updated tracker status table (#1999)
  • Documentation: fixed incorrect entries in CHANGELOG (#2443)
  • Common: made optionality of Lingual and HBase in config.yml clearer (#2206)
  • Common: fixed OpenJDK build in Travis CI (#2447)
  • Scala Hadoop Enrich: bumped to 1.6.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.21.0 (#2442)
  • Scala Common Enrich: bumped to 0.21.0
  • Scala Common Enrich: fixed exception for invalid API key in currency conversion (#2441)
  • Scala Common Enrich: fixed exception on same currency conversion (#2437)
  • Scala Common Enrich: switched from javax.script to org.mozilla.javascript for JavaScriptEnrichment (#2453)
  • Scala Hadoop Shred: bumped to 0.8.0
  • Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.2 (#2319)
  • EmrEtlRunner: bumped to 0.21.0
  • EmrEtlRunner: attached monitoring tags to jobflow (#425)
  • EmrEtlRunner: now throwing exception if processing thrift with --skip s3distcp or AMI 2.x.x (#1648)
  • EmrEtlRunner: added bootstrap action to prepare AMI >= 3.8.0 (#2320)
  • EmrEtlRunner: bumped Elasticity to 6.0.7 (#2400)
  • EmrEtlRunner: added support for Amazon EMR 4.x.x series (#1926)
  • EmrEtlRunner: prevented bad CLI options from throwing stack trace (#1930)
  • EmrEtlRunner: made error for nonempty processing bucket collector-agnostic (#1961)
  • EmrEtlRunner: bumped Ruby Tracker to 0.5.2 (#2143)
  • EmrEtlRunner: improved retry logic for EMR bootstrap timeouts (#2150)
  • EmrEtlRunner: excluded previously-built executables from the build (#2163)
  • EmrEtlRunner: added support for additional_info in EMR section of configuration (#2211)
  • EmrEtlRunner: added Elasticsearch stage to help message (#2323)
  • EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.6.0 (#2459)
  • EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.8.0 (#2370)
  • EmrEtlRunner: removed snowplow-emr-etl-runner.sh (#2445)
  • StorageLoader: bumped to 0.7.0
  • StorageLoader: added support for supplying config file as Base64-encoded string (#2227)
  • StorageLoader: added ability to retrieve AWS credentials from EC2 role (#2226)
  • StorageLoader: excluded previously-built executables from the build (#2164)
  • StorageLoader: started printing stack trace for failures not caused by bad configuration (#2160)
  • StorageLoader: bumped Ruby Tracker to 0.5.2 (#2144)
  • StorageLoader: moved ANALYZE statements after VACUUM statements (#1361)
  • StorageLoader: added resolver config option to snowplow-runner-and-loader.sh (#2170)
  • StorageLoader: updated snowplow-runner-and-loader.sh to use JRuby binaries (#2233)
  • StorageLoader: removed snowplow-storage-loader.sh (#2444)
  • StorageLoader: wrote JSON Path file for com.optimizely/visitor_dimension event (#2436)
  • StorageLoader: wrote JSON Path file for com.optimizely/visitor_audience event (#2435)
  • StorageLoader: wrote JSON Path file for com.optimizely/visitor event (#2434)
  • StorageLoader: wrote JSON Path file for com.optimizely/variation event (#2433)
  • StorageLoader: wrote JSON Path file for com.optimizely/state event (#2432)
  • StorageLoader: wrote JSON Path file for com.optimizely/experiment event (#2431)
  • StorageLoader: wrote JSON Path file for io.augur.snowplow/identity_lite (#1958)
  • Redshift: wrote Redshift DDL for com.optimizely/visitor_dimension event (#2430)
  • Redshift: wrote Redshift DDL for com.optimizely/visitor_audience event (#2429)
  • Redshift: wrote Redshift DDL for com.optimizely/visitor event (#2428)
  • Redshift: wrote Redshift DDL for com.optimizely/variation event (#2427)
  • Redshift: wrote Redshift DDL for com.optimizely/state event (#2426)
  • Redshift: wrote Redshift DDL for com.optimizely/experiment event (#2425)
  • Redshift: added Redshift DDL for io.augur.snowplow/identity_lite (#1957)

Release 76 Changeable Hawk-Eagle (2016-01-26)

  • Scala Hadoop Enrich: bumped to 1.5.1
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.20.1 (#2338)
  • Scala Common Enrich: bumped to 0.20.1
  • Scala Common Enrich: now using only base MIME type in content-type check for SendGrid Adapter (#2328)
  • Scala Hadoop Shred: bumped to 0.7.0
  • Scala Hadoop Shred: fixed good tests' checks for empty paths (#2278)
  • Scala Hadoop Shred: now deduplicating event_id and event_fingerprint pairs (#2246)
  • Scala Hadoop Shred: fixed incorrect event in SchemaValidationFailed1Spec (#2355)
  • Scala Hadoop Shred: updated tests to check atomic-events output (#2264)
  • Scala Hadoop Shred: now only writes atomic-events if JSONs shred successfully (#2245)
  • Scala Hadoop Shred: removed empty SchemaValidationFailed2Spec (#2271)
  • Scala Hadoop Shred: fixed test suite issue with multiple input lines (#2270)
  • EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.5.1 (#2339)
  • EmrEtlRunner: changed in bucket example in config.yml.sample to s3://my-in-bucket (#2358)
  • EmrEtlRunner: updated archive bucket examples in config.yml (#2368)
  • EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.7.0 (#2360)
  • StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/action (#2136)
  • StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2135)
  • StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2134)
  • StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/productFieldObject (#2133)
  • StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2132)
  • Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2131)
  • Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/productFieldObject (#2130)
  • Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2129)
  • Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2128)
  • Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/action (#2127)

Release 75 Long-Legged Buzzard (2016-01-02)

  • Scala Hadoop Enrich: bumped to 1.5.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.20.0 (#2200)
  • Scala Hadoop Enrich: added test for loading Urban Airship Connect ndjson files (#2168)
  • Scala Hadoop Enrich: added test for SendGrid Adapter (#2194)
  • Scala Common Enrich: bumped to 0.20.0
  • Scala Common Enrich: added JsonLoader for Urban Airship, Mixpanel et al (#2210)
  • Scala Common Enrich: added Adapter to pre-process Urban Airship events (#2167)
  • Scala Common Enrich: abstracted Mandrill reformatParameters function into Adapter (#2171)
  • Scala Common Enrich: added Adapter to pre-process SendGrid events (#1161)
  • EmrEtlRunner: bumped to 0.20.0
  • EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.5.0 (#2282)
  • EmrEtlRunner: added raw s3 -> hdfs step with group by (#2253)
  • EmrEtlRunner: added directory flattening code (#2232)
  • EmrEtlRunner: added support for ndjson loader format (#2251)
  • EmrEtlRunner: improved test coverage of runner.rb (#2250)
  • Redshift: added Redshift DDL for a com.sendgrid/processed event (#2172)
  • Redshift: added Redshift DDL for a com.sendgrid/dropped event (#2173)
  • Redshift: added Redshift DDL for a com.sendgrid/delivered event (#2174)
  • Redshift: added Redshift DDL for a com.sendgrid/deferred event (#2175)
  • Redshift: added Redshift DDL for a com.sendgrid/bounce event (#2176)
  • Redshift: added Redshift DDL for a com.sendgrid/open event (#2177)
  • Redshift: added Redshift DDL for a com.sendgrid/click event (#2178)
  • Redshift: added Redshift DDL for a com.sendgrid/spamreport event (#2179)
  • Redshift: added Redshift DDL for a com.sendgrid/unsubscribe event (#2180)
  • Redshift: added Redshift DDL for a com.sendgrid/group_unsubscribe event (#2181)
  • Redshift: added Redshift DDL for a com.sendgrid/group_resubscribe event (#2182)
  • Redshift: added Redshift DDL for com.urbanairship.connect/UNINSTALL event (#2283)
  • Redshift: added Redshift DDL for com.urbanairship.connect/TAG_CHANGE event (#2284)
  • Redshift: added Redshift DDL for com.urbanairship.connect/SEND event (#2285)
  • Redshift: added Redshift DDL for com.urbanairship.connect/RICH_READ event (#2286)
  • Redshift: added Redshift DDL for com.urbanairship.connect/RICH_DELIVERY event (#2287)
  • Redshift: added Redshift DDL for com.urbanairship.connect/RICH_DELETE event (#2288)
  • Redshift: added Redshift DDL for com.urbanairship.connect/REGION event (#2289)
  • Redshift: added Redshift DDL for com.urbanairship.connect/PUSH_BODY event (#2290)
  • Redshift: added Redshift DDL for com.urbanairship.connect/OPEN event (#2291)
  • Redshift: added Redshift DDL for com.urbanairship.connect/LOCATION event (#2292)
  • Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2293)
  • Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2294)
  • Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2295)
  • Redshift: added Redshift DDL for com.urbanairship.connect/FIRST_OPEN event (#2296)
  • Redshift: added Redshift DDL for com.urbanairship.connect/CUSTOM event (#2297)
  • Redshift: added Redshift DDL for com.urbanairship.connect/CLOSE event (#2298)
  • StorageLoader: added JSON Path file for com.sendgrid/processed event (#2183)
  • StorageLoader: added JSON Path file for com.sendgrid/dropped event (#2184)
  • StorageLoader: added JSON Path file for com.sendgrid/delivered event (#2185)
  • StorageLoader: added JSON Path file for com.sendgrid/deferred event (#2186)
  • StorageLoader: added JSON Path file for com.sendgrid/bounce event (#2187)
  • StorageLoader: added JSON Path file for com.sendgrid/open event (#2188)
  • StorageLoader: added JSON Path file for com.sendgrid/click event (#2189)
  • StorageLoader: added JSON Path file for com.sendgrid/spamreport event (#2190)
  • StorageLoader: added JSON Path file for com.sendgrid/unsubscribe event (#2191)
  • StorageLoader: added JSON Path file for com.sendgrid/group_unsubscribe event (#2192)
  • StorageLoader: added JSON Path file for com.sendgrid/group_resubscribe event (#2193)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/UNINSTALL event (#2299)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/TAG_CHANGE event (#2300)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/SEND event (#2301)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_READ event (#2302)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_DELIVERY event (#2303)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_DELETE event (#2304)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/REGION event (#2305)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/PUSH_BODY event (#2306)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/OPEN event (#2307)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/LOCATION event (#2308)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2309)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2310)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2311)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/FIRST_OPEN event (#2312)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/CUSTOM event (#2313)
  • StorageLoader: added JSON Path file for com.urbanairship.connect/CLOSE event (#2314)
  • Data modeling: removed events enriched from web-recalculate (#2275)
  • Data modeling: added cookie-to-user-id map to web-recalculate (#2274)

Release 74 European Honey Buzzard (2015-12-22)

  • Common: added encrypted OWM API key to .travis.yml (#2243)
  • Scala Hadoop Enrich: bumped to 1.4.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.19.0 (#2255)
  • Scala Common Enrich: bumped to 0.19.0
  • Scala Common Enrich: added weather enrichment (#456)
  • Scala Common Enrich: fixed issue with BC timestamp in ExtractEventTypeSpec (#2257)
  • Scala Common Enrich: fixed currency conversion enrichment's test for invalid API key (#2258)
  • StorageLoader: wrote JSON path file for org.openweathermap/weather (#2240)
  • Redshift: added Redshift DDL for org.openweathermap/weather (#2241)

Release 73 Cuban Macaw (2015-12-04)

  • EmrEtlRunner: bumped to 0.19.0
  • EmrEtlRunner: added hadoop_elasticsearch to config.yml.sample (#2124)
  • EmrEtlRunner: added support for Elasticsearch in targets section of config (#826)
  • EmrEtlRunner: bumped Elasticity to 6.0.5 (#2026)
  • EmrEtlRunner: stopped skipping the whole job just because enrich and shred are being skipped (#2049)
  • Scala Common Enrich: bumped Iglu Scala Client to 0.3.1 (#2079)
  • Scala Common Enrich: bumped version to 0.18.0
  • Scala Common Enrich: moved ScalazArgs into shared library (#2010)
  • Scala Common Enrich: removed executable bit from Scala source files (#2022)
  • Scala Common Enrich: removed JSON length checks (#2041)
  • Scala Common Enrich: removed truncation code (#2044)
  • Scala Common Enrich: stopped attempting to catch fatal errors (#2045)
  • Scala Hadoop Enrich: bumped to 1.3.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.18.0 (#2015)
  • Scala Hadoop Enrich: added Iglu Scala Client as an explicit dependency (#2115)
  • Scala Hadoop Enrich: added .forceToDisk to speed up run (#859)
  • Scala Hadoop Enrich: started using Scala Common Enrich's version of ScalazArgs (#2013)
  • Scala Hadoop Shred: bumped to 0.6.0
  • Scala Hadoop Shred: added .forceToDisk to common to speed up run (#2039)
  • Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.1 (#2081)
  • Scala Hadoop Shred: bumped Scala Common Enrich to 0.18.0 (#2016)
  • Scala Hadoop Shred: applied truncation logic to atomic-events TSV (#2042)
  • Scala Hadoop Shred: processed enriched events for atomic.events removing JSON fields (#1731)
  • Scala Hadoop Shred: started using Scala Common Enrich's version of ScalazArgs (#2014)
  • Storage: fixed README's link to architecture image, thanks @miike! (#2156)
  • Hadoop Elasticsearch Sink: added (#824)
  • StorageLoader: bumped to 0.6.0
  • StorageLoader: added tcpKeepAlive=true to JDBC for long-running COPYs via NAT (#2145)
  • StorageLoader: fixed setup guide link in README, thanks @diamondo25! (#2025)
  • StorageLoader: loaded atomic.events from shredded folder (#1795)
  • Postgres: bumped atomic.events to 0.7.0
  • Postgres: added migration script for 0.6.0 to 0.7.0 (#2047)
  • Postgres: removed JSON fields from atomic.events (#1949)
  • Redshift: bumped atomic.events to 0.8.0
  • Redshift: added migration script for 0.4.0 to 0.8.0 (#2155)
  • Redshift: added migration script for 0.5.0 to 0.8.0 (#2119)
  • Redshift: added migration script for 0.6.0 to 0.8.0 (#2120)
  • Redshift: added migration script for 0.7.0 to 0.8.0 (#2048)
  • Redshift: removed JSON fields from atomic.events (#1849)
  • Data Modeling: added separators to custom fingerprint in deduplication queries (#2198)
  • Data Modeling: renamed dvce_tstamp to dvce_created_tstamp in basic recipes (#2166)
  • Data Modeling: removed JSON fields from deduplication queries (#2197)

Release 72 Great Spotted Kiwi (2015-10-15)

  • Documentation: added Scala Tracker to 1-trackers/README.md (#2114)
  • Common: added forwarding of port 3000 to Vagrantfile for Clojure Collector (#2011)
  • Unity Tracker: added git submodule (#2113)
  • Clojure Collector: bumped to 1.1.0
  • Clojure Collector: added URI redirect ability (#1102)
  • Clojure Collector: added basic README for the java-servlet (#2012)
  • Scala Common Enrich: bumped to 0.17.0
  • Scala Common Enrich: added cookie extractor enrichment, thanks @kazjote! (#2072)
  • Scala Common Enrich: converted SnowplowAdapter from object to package (#2040)
  • Scala Common Enrich: added Adapter to pre-process URI redirect events (#1103)
  • Scala Hadoop Enrich: bumped to 1.2.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.17.0 (#2027)
  • Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/uri_redirect event (#1104)
  • Redshift: added Redshift DDL for com.amazon.aws.ec2/instance_identity_document (#2086)
  • Redshift: added Redshift DDL for org.ietf/http_cookie (#2096)
  • StorageLoader: wrote JSON Path file for com.snowplowanalytics.snowplow/uri_redirect event (#1105)
  • StorageLoader: wrote JSON path file for org.ietf/http_cookie (#2097)
  • StorageLoader: wrote JSON path file for com.amazon.aws.ec2/instance_identity_document (#2085)
  • Deduplication: added SQL queries to deduplicate without event fingerprint (#2110)
  • Deduplication: updated SQL queries to use event fingerprint (#2091)

Release 71 Stork-Billed Kingfisher (2015-10-02)

  • Enrich: added example event fingerprint enrichment configuration JSON (#1990)
  • EmrEtlRunner: bumped to 0.18.0
  • EmrEtlRunner: updated AMI version in config.yml.sample to 3.7.0 (#1959)
  • EmrEtlRunner: updated combine_configurations.rb to add ssl_mode: disable (#1996)
  • Scala Common Enrich: bumped to 0.16.0
  • Scala Common Enrich: added derived_tstamp enrichment (#1550)
  • Scala Common Enrich: added validation that v_collector is set (#1600)
  • Scala Common Enrich: added validation that collector_tstamp is set and valid (#1611)
  • Scala Common Enrich: added event_vendor/name/format/version to enriched event, thanks @danisola! (#1800)
  • Scala Common Enrich: ported JSON schema from Scala Hadoop Shred, thanks @danisola! (#1637)
  • Scala Common Enrich: bumped referer-parser to 0.3.0 (#1839)
  • Scala Common Enrich: changed etl_tstamp in EnrichmentManager from String to Joda DateTime (#1841)
  • Scala Common Enrich: added support for four new fields in CloudFront access logs (#1865)
  • Scala Common Enrich: bumped user-agent-utils to 1.16 (#1905)
  • Scala Common Enrich: changed BadRow class to use ProcessingMessages (#1936)
  • Scala Common Enrich: ensured that all timestamp fields are nonnegative (#1938)
  • Scala Common Enrich: started catching all exceptions in EtlPipeline (#1954)
  • Scala Common Enrich: added event_fingerprint enrichment (#1965)
  • Scala Common Enrich: bumped Iglu Scala Client to 0.3.0 (#1989)
  • Scala Common Enrich: renamed dvce_tstamp to dvce_created_tstamp (#1995)
  • Scala Common Enrich: started extracting true_tstamp from querystring (#1968)
  • Scala Hadoop Enrich: bumped to 1.1.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.16.0 (#1807)
  • Scala Hadoop Enrich: updated tests to expect bad row JSONs with timestamps and processing messages (#1751)
  • Scala Hadoop Enrich: updated to use new EtlPipeline (#1931)
  • Scala Hadoop Enrich: bad rows for Thrift payloads now contain the original Thrift record (#1950)
  • Scala Hadoop Enrich: simplified validation projection code (#1986)
  • Scala Hadoop Shred: bumped to 0.5.0
  • Scala Hadoop Shred: updated tests to expect bad row JSONs with timestamps and processing messages (#1953)
  • Scala Hadoop Shred: added clojars.org as a resolver (#1952)
  • Scala Hadoop Shred: bumped Scala Common Enrich to 0.16.0 (#1935)
  • Scala Hadoop Shred: started using BadRow case class from Scala Common Enrich (#1914)
  • Scala Hadoop Shred: upgraded to Hadoop 2.4 (#1720)
  • Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.0 (#1221)
  • Redshift: bumped atomic.events to 0.7.0
  • Redshift: added migration script for 0.6.0 to 0.7.0 (#1988)
  • Redshift: added migration script for 0.5.0 to 0.7.0 (#2058)
  • Redshift: added event_vendor/name/format/version to atomic.events (#1801)
  • Redshift: updated wd_access_log_1.sql with 4 new fields and renamed "x_edge_request_type" to "x_edge_request_id" (#1940)
  • Redshift: added event_fingerprint to atomic.events (#1971)
  • Redshift: added true_tstamp to atomic.events (#1984)
  • Redshift: renamed dvce_tstamp to dvce_created_tstamp (#1993)
  • Redshift: added comment containing table version to atomic.events (#2020)
  • Redshift: added migration script for wd_access_log_1.sql 1-0-3 to 1-0-4 (#2029)
  • Postgres: bumped atomic.events to 0.6.0
  • Postgres: added migration script for 0.5.0 to 0.6.0 (#1987)
  • Postgres: added event_vendor/name/format/version to atomic.events (#1802)
  • Postgres: added event_fingerprint to atomic.events (#1970)
  • Postgres: added true_tstamp to atomic.events (#1985)
  • Postgres: renamed dvce_tstamp to dvce_created_tstamp (#1994)
  • Postgres: added comment containing table version to atomic.events (#2021)
  • StorageLoader: bumped to 0.5.0
  • StorageLoader: exposed sslmode connection option for loading Postgres and Redshift, thanks @dennisatspaceape! (#1980)
  • StorageLoader: updated wd_access_log_1.json with 4 new fields (#1941)
  • Data Modeling: updated web-incremental so failure is recoverable (#1974)
  • Data Modeling: renamed dvce_tstamp to dvce_created_tstamp (#2024)

Release 70 Bornean Green Magpie (2015-08-19)

  • Common: added Ruby script to generate unified config.yml and iglu-resolver.json from runner.yml and loader.yml (#1774)
  • Common: added postgres.yml to up.playbooks (#1767)
  • Common: added Vagrant push script to publish Ruby apps (#1784)
  • Enrich: moved enrichments folder out of EmrEtlRunner (#1574)
  • Enrich: changed campaign_attribution.json configuration to true (#1608)
  • EmrEtlRunner & StorageLoader: unified the config file format (#878)
  • EmrEtlRunner & StorageLoader: added support for compressing enriched events, thanks @danisola! (#1265)
  • EmrEtlRunner & StorageLoader: now supports environment variables in YML config files, thanks @epantera! (#1215)
  • EmrEtlRunner: bumped to 0.17.0
  • EmrEtlRunner: added retry logic for EMR bootstrap timeouts (#354)
  • EmrEtlRunner: added Snowplow event tracking (#678)
  • EmrEtlRunner: added tags for monitoring to config.yml (#1163)
  • EmrEtlRunner: improved hierarchy in config.yml (#1447)
  • EmrEtlRunner: added Snowplow tracking to config.yml (#1448)
  • EmrEtlRunner: moved Iglu resolver into dedicated CLI argument (#1542)
  • EmrEtlRunner: renamed archive step to archive_raw (#1543)
  • EmrEtlRunner: bumped Sluice to 0.2.2 (#1566)
  • EmrEtlRunner: removed use of symbols for properties in YAML configuration (#1572)
  • EmrEtlRunner: allowed nil for config.yml's bootstrap field (#1575)
  • EmrEtlRunner: simplified trailing slash code now that nils are supported (#1588)
  • EmrEtlRunner: pinned Contracts to 0.7 (#1590)
  • EmrEtlRunner: now fails job if odd number of lzo files in processing (#1728)
  • EmrEtlRunner: added an early check that shredded is empty (#1749)
  • EmrEtlRunner: allowed config to be passed in via stdin (#1772)
  • EmrEtlRunner: added Rake task to build app (#1786)
  • EmrEtlRunner: moved Logging module into new Monitoring module (#1797)
  • EmrEtlRunner: ensured that _SUCCESS file is written last for enriched events in S3 (#1808)
  • EmrEtlRunner: replaced m1.small with m1.medium in config.yml, thanks @danrama! (#1826)
  • EmrEtlRunner: recovered from 500 error while checking job status (#1828)
  • EmrEtlRunner: recovered from IOError while checking job status (#1881)
  • EmrEtlRunner: changed .ruby-version to "jruby" (#1888)
  • EmrEtlRunner: now only accepts an array of in buckets (#1910)
  • EmrEtlRunner: validated output_compression configuration using contract (#1820)
  • EmrEtlRunner: handled exception when the connection times out when checking the cluster, thanks @danisola! (#1599)
  • EmrEtlRunner: bumped Elasticity to 6.0.3 (#1939)
  • Deduplication: added timetracking and updated schema name (#1962)
  • StorageLoader: bumped to 0.4.0
  • StorageLoader: allowed config to be passed in via stdin (#1773)
  • StorageLoader: added ability to bundle as a JRuby fat jar (#675)
  • StorageLoader: started loading Postgres via stdin, thanks @mrwalker! (#624)
  • StorageLoader: added Snowplow event tracking (#679)
  • StorageLoader: updated to use EmrEtlRunner's expanded config.yml (#1191)
  • StorageLoader: pinned Contracts to 0.7 (#1497)
  • StorageLoader: moved "include Contracts" (#1499)
  • StorageLoader: renamed archive step to archive_enrich (#1544)
  • StorageLoader: bumped Sluice to 0.2.2 (#1567)
  • StorageLoader: removed use of symbols for properties in YAML configuration (#1573)
  • StorageLoader: added Rake task to build app (#1787)
  • StorageLoader: scrubbed credentials from stderr (#1918)
  • StorageLoader: added test suite (#1919)
  • StorageLoader: ensured that _SUCCESS file is written last for enriched events archived to S3 (#1814)
  • StorageLoader: started automatically converting "s3n" to "s3" in copy statements (#1937)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_started (#1875)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_succeeded (#1876)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_failed (#1877)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_status (#1878)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/jobflow_step_status (#1879)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/load_succeeded (#1884)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/load_failed (#1885)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/application_context (#1942)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_started (#1870)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_succeeded (#1871)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_failed (#1872)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_status (#1873)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/jobflow_step_status (#1874)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/load_succeeded (#1882)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/load_failed (#1883)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/application_context (#1943)

Release 69 Blue-Bellied Roller (2015-07-24)

  • Incremental SQL Model: added the new incremental queries (#1857)
  • Incremental SQL Model: changed how query performance is tracked (#1855)
  • Incremental SQL Model: added new setup queries (#1853)
  • Incremental SQL Model: added migration queries (#1852)
  • Incremental SQL Model: updated the SQL runner playbook (#1851)
  • Incremental SQL Model: updated diagram (#1850)
  • Deduplication: added a step that deduplicates events (#1866)
  • Incremental SQL Model: replaced RANK with ROW_NUMBER (#1867)
  • Mobile SQL Model: added sessionization and DAU queries (#1891)
  • SQL Models: renamed full and incremental to allow for more models (#1892)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.snowplow/client_session (#1922)
  • Redshift: added Redshift DDL for com.snowplowanalytics.snowplow/client_session (#1921)

Release 68 Turquoise Jay (2015-07-23)

  • EmrEtlRunner: bumped to 0.16.0
  • EmrEtlRunner: bumped Elasticity to 6.0.2 (#1903)
  • EmrEtlRunner: named the processing bucket in its associated "is not empty" error (#1911)
  • EmrEtlRunner: made in bucket an array (#1750)
  • EmrEtlRunner: determined path to Hadoop enrich based on its version (#1789)
  • EmrEtlRunner: added unit test for add_trailing_slashes function (#1904)

Release 67 Bohemian Waxwing (2015-07-13)

  • Common: added NFS and CORE configuration to Vagrantfile to enhance performance (#1831)
  • Scala Stream Collector: bumped to 0.5.0
  • Scala Stream Collector: stdout bad sink now prints to stderr (#1799)
  • Scala Stream Collector: added splitter for large event arrays (#941)
  • Scala Stream Collector: increased maximum record size from 50kB to 1MB (#1753)
  • Scala Stream Collector: added tests for splitting large requests (#1683)
  • Scala Stream Collector: updated bad rows to include timestamp (#1681)
  • Scala Stream Collector: handled case where IP is not present (#1680)
  • Scala Stream Collector: did some reorganisation and refactoring of the project (#1678)
  • Scala Stream Collector: added json4s dependency (#1673)
  • Scala Stream Collector: added bad stream (#1502)
  • Scala Common Enrich: bumped to 0.15.0
  • Scala Common Enrich: fixed JavascriptScriptEnrichmentSpec test to pass openjdk7 (#1793)
  • Scala Common Enrich: bumped scala-maxmind-iplookups to 0.3.0 (#1771)
  • Scala Common Enrich: bumped Scala Forex to 0.3.0 (#1770)
  • Scala Common Enrich: updated bad rows to include timestamp (#1577)
  • Scala S3 Sink: removed project from repo (#1672)
  • Scala Kinesis Enrich: bumped to 0.6.0
  • Scala Kinesis Enrich: bumped to Scala Common Enrich 0.15.0 (#1685)
  • Scala Kinesis Enrich: tries to send 503 records (#1756)
  • Scala Kinesis Enrich: made back-off fields macros (#1745)
  • Scala Kinesis Enrich: increased maximum record size to 1MB (#1736)
  • Scala Kinesis Enrich: logging all bad rows (#1722)
  • Scala Kinesis Enrich: exception installing MaxMind file must terminate (#1711)
  • Scala Kinesis Enrich: sending Snowplow heartbeat (#1406)
  • Scala Kinesis Enrich: allowed records of over 1MB when running in local mode (#1663)
  • Scala Kinesis Enrich: fixed error when fetching MaxMind file from s3:// URI (#1645)
  • Scala Kinesis Enrich: sending a warning via Snowplow if no enrichment JSONs are retrieved from DynamoDB (#1621)
  • Scala Kinesis Enrich: sending failure to sink event to Kinesis to Snowplow (#1798)
  • Scala Kinesis Enrich: etl_tstamp should be Redshift Formatted not raw (#1842)
  • Kinesis Elasticsearch Sink: bumped to 0.4.0
  • Kinesis Elasticsearch Sink: removed Scala Common Enrich as an assembly dependency (#1819)
  • Kinesis Elasticsearch Sink: bumped to Scala Common Enrich 0.15.0 (#1811)
  • Kinesis Elasticsearch Sink: allowed use of AWS creds instead of DefaultAWSCredentialsProviderChain (#1803)
  • Kinesis Elasticsearch Sink: app no longer hangs without shutting down (#1743)
  • Kinesis Elasticsearch Sink: updated the Elasticsearch version (#1734)
  • Kinesis Elasticsearch Sink: sent event to Snowplow on heartbeat (#1706)
  • Kinesis Elasticsearch Sink: added Scala Tracker dependency (#1705)
  • Kinesis Elasticsearch Sink: sending event to Snowplow when unable to write to Elasticsearch (#1704)
  • Kinesis Elasticsearch Sink: sending event to Snowplow on shutdown (#1703)
  • Kinesis Elasticsearch Sink: sending event to Snowplow on initialization (#1702)
  • Kinesis Elasticsearch Sink: initialized bad stream eagerly rather than lazily (#1677)
  • Kinesis Elasticsearch Sink: updated amazon-kinesis-connectors to 1.1.2 (#1675)
  • Kinesis Elasticsearch Sink: specifying character encoding in SnowplowElasticsearchTransformer (#1654)
  • Kinesis Elasticsearch Sink: updated bad rows to include timestamp (#1578)
  • Kinesis Elasticsearch Sink: moved location fields into elasticsearch section (#1517)
  • Kinesis Elasticsearch Sink: corrected shredding example in comment (#1276)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_warning (#1809)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/heartbeat (#1764)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/sink_write_failed (#1763)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_initialized (#1762)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_shutdown (#1761)
  • Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/stream_write_failed (#1844)
  • Redshift: added Redshift DDL for com.snowplowanalytics.snowplow/web_page (#1835)
  • Redshift: added migration script for 0.3.0 to 0.6.0 (#1832)
  • Redshift: added migration script for 0.4.0 to 0.6.0 (#1833)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_warning (#1810)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/heartbeat (#1760)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/sink_write_failed (#1759)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_initialized (#1758)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_shutdown (#1757)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/stream_write_failed (#1843)
  • StorageLoader: wrote JSON path file for com.snowplowanalytics.snowplow/web_page (#1836)

Release 66 Oriental Skylark (2015-06-16)

  • Documentation: replaced Hive ETL references with Kinesis Enrich in Scala Hadoop Enrich's README (#1671)
  • Documentation: fixed links in Scala Common Enrich's README.md, thanks @bigsnarfdude! (#1669)
  • Scala Tracker: added git submodule (#1724)
  • Scala Hadoop Enrich: bumped to 1.0.0
  • Scala Hadoop Enrich: renamed build to snowplow-hadoop-enrich (#1718)
  • Scala Hadoop Enrich: updated dependencies to Hadoop 2.4 (#1716)
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.14.0 (#1700)
  • Scala Hadoop Enrich: updated Core2015RefreshSpec to include JavascriptScriptEnrichment (#1746)
  • Scala Common Enrich: bumped to 0.14.0
  • Scala Common Enrich: added JavaScript scripting enrichment (#378)
  • Scala Common Enrich: made IpLookupsEnrichment error message more informative (#1426)
  • Scala Common Enrich: commons-codec dependency is no longer test-only (#1712)
  • Scala Common Enrich: bumped commons-lang3 to 3.4 (#1713)
  • Scala Common Enrich: made mkt_ and refr_ fields TSV safe, thanks @jasonbosco! (#1643)
  • Scala Common Enrich: updated JodaTime dependency to 2.2 (#1748)
  • Scala Common Enrich: now handles null message in stripInstanceEtc (#1622)
  • EmrEtlRunner: bumped to 0.15.0
  • EmrEtlRunner: now using new scala-hadoop-enrich jar path in Hosted Assets (#1719)
  • EmrEtlRunner: updated ami_version in config.yml to 3.6.0 (#1651)
  • EmrEtlRunner: added bootstrap action to prepare AMI 3.x for Snowplow (#1714)
  • EmrEtlRunner: now setting buffer for processing thrift in core-site.xml (#1715)
  • EmrEtlRunner: added S3DistCp step for thrift files in processing (#1647)
  • EmrEtlRunner: added example javascript_script_config to enrichments folder (#1755)
  • StorageLoader: wrote JSON Path file for com.mparticle.snowplow/app_event (#1688)
  • StorageLoader: wrote JSON Path file for com.mparticle.snowplow/social_event (#1690)
  • StorageLoader: wrote JSON Path file for com.mparticle.snowplow/transaction_event (#1692)
  • StorageLoader: wrote JSON Path file for a com.mparticle.snowplow/session_context (#1694)
  • Redshift: added Redshift DDL for a com.mparticle.snowplow/app_event (#1686)
  • Redshift: added Redshift DDL for a com.mparticle.snowplow/social_event (#1689)
  • Redshift: added Redshift DDL for a com.mparticle.snowplow/transaction_event (#1691)
  • Redshift: added Redshift DDL for a com.mparticle.snowplow/session_context (#1693)
  • Data Modeling: removed restrictions in sessions and visitors-source (#1725)

Release 65 Scarlet Rosefinch (2015-05-08)

  • Scala Stream Collector: bumped to 0.4.0
  • Scala Stream Collector: bumped Scalazon to 0.11 (#1504)
  • Scala Stream Collector: added support for PutRecords API (#1227)
  • Scala Stream Collector: added CORS support (#1165)
  • Scala Stream Collector: added CORS-style support for ActionScript3 Tracker (#1331)
  • Scala Stream Collector: added ability to disable third-party cookies (#1363)
  • Scala Stream Collector: removed automatic creation of stream (#1464)
  • Scala Stream Collector: added macros to config.hocon.sample (#1471)
  • Scala Stream Collector: logged the name of the stream to which records are written (#1503)
  • Scala Stream Collector: added shutdown hook to send stored events (#1535)
  • Scala Stream Collector: added configurable exponential backoff with jitter (#1592)
  • Scala Kinesis Enrich: bumped to 0.5.0
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.13.1 (#1618)
  • Scala Kinesis Enrich: bumped Scalazon to 0.11 (#1492)
  • Scala Kinesis Enrich: bumped Kinesis Client Library to 1.2.1 (#1580)
  • Scala Kinesis Enrich: added ability to retrieve resolver and enrichments from DynamoDB (#1289)
  • Scala Kinesis Enrich: added support for PutRecords API (#1418)
  • Scala Kinesis Enrich: removed automatic creation of streams (#1465)
  • Scala Kinesis Enrich: fixed checkpointing (#1467)
  • Scala Kinesis Enrich: logged the name of the stream to which records are written (#1493)
  • Scala Kinesis Enrich: added macros to config.hocon.sample (#1513)
  • Scala Kinesis Enrich: moved Iglu resolver to dedicated CLI argument (#1534)
  • Scala Kinesis Enrich: updated README examples with new configuration (#1549)
  • Scala Kinesis Enrich: stopped retrying in the case of a ShutdownException or InvalidStateException (#1552)
  • Scala Kinesis Enrich: stopped ignoring region setting for DynamoDB table (#1576)
  • Scala Kinesis Enrich: updated test suite to accommodate changes (#1581)
  • Scala Kinesis Enrich: added Clojars as a resolver (#1586)
  • Scala Kinesis Enrich: added configurable exponential backoff with jitter (#1591)
  • Scala Kinesis Enrich: randomized partition keys for bad events (#1631)
  • Scala Kinesis Enrich: stopped sending records of over 50kB (#1649)
  • Kinesis Elasticsearch Sink: bumped to 0.3.0
  • Kinesis Elasticsearch Sink: made DynamoDB region configurable (#1583)
  • Kinesis Elasticsearch Sink: added macros to config.hocon.sample (#1515)
  • Kinesis Elasticsearch Sink: changed "connector" to "sink" in config (#1474)
  • Kinesis Elasticsearch Sink: stopped failing silently for inputs with fewer than 24 tab-separated fields (#1584)
  • Kinesis Elasticsearch Sink: stopped analyzing text fields by default (#1624)
  • Kinesis Elasticsearch Sink: removed automatic creation of bad stream (#1626)
  • Kinesis Elasticsearch Sink: randomized partition keys for failed records (#1633)
  • Kinesis LZO S3 Sink: bumped to 0.2.0
  • Kinesis LZO S3 Sink: removed automatic creation of stream (#1529)
  • Kinesis LZO S3 Sink: changed "connector" to "sink" in config (#1473)
  • Kinesis LZO S3 Sink: made DynamoDB region configurable (#1582)
  • Kinesis LZO S3 Sink: added macros to config.hocon.sample (#1472)
  • Kinesis LZO S3 Sink: changed the configuration to use the S3 region instead of the full endpoint URI (#1327)

Release 64 Palila (2015-04-16)

  • Common: added top-level data modeling folder (#1523)
  • Common: updated root README to include data modeling (#1612)
  • ActionScript 3.0 Tracker: added git submodule (#1546)
  • EmrEtlRunner: bumped to 0.14.0
  • EmrEtlRunner: bumped Elasticity to 4.0.5 (#758)
  • EmrEtlRunner: added support for specifying EMR service role (#1595)
  • EmrEtlRunner: added support for specifying EMR jobflow role (#1232)
  • Scala Common Enrich: bumped to 0.13.1
  • Scala Common Enrich: prevented UaParserEnrichment from creating a new Parser on every event (#1616)
  • Scala Hadoop Enrich: bumped to 0.14.1
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.13.1 (#1617)
  • Redshift: bumped atomic.events to 0.6.0
  • Redshift: added migration script for 0.5.0 to 0.6.0 (#1606)
  • Redshift: increased mkt_clickid to varchar(128) (#1605)
  • Redshift: removed legacy cubes (#1613)
  • Postgres: bumped atomic.events to 0.5.0
  • Postgres: added migration script for 0.4.0 to 0.5.0 (#1604)
  • Postgres: increased mkt_clickid to varchar(128) (#1603)
  • Postgres: removed legacy cubes (#1614)
  • Postgres: added user_id field to migration script for 0.3.0 to 0.4.0 (#1620)
  • Data Modeling: updated reference data.iso_country_codes so DISTSTYLE is ALL (#1393)
  • SQL Runner: added basic sessions / visits / page views model that can be pivoted on directly from any BI tool (#1273)
  • Looker: simplified LookML model and made it consistent with Redshift data models (#1522)

Release 63 Red-Cheeked Cordon-Bleu (2015-04-02)

  • Common: updated kinesis push to remove sub-folders from zipfile (#1378)
  • EmrEtlRunner: added example configuration JSONs for new enrichments (#1545)
  • Scala Common Enrich: bumped to 0.13.0
  • Scala Common Enrich: bumped referer-parser to 0.2.3 (#670)
  • Scala Common Enrich: converted transactions from given currency to base currency (#370)
  • Scala Common Enrich: bumped CampaignAttributionEnrichment version to 0.2.0 (#1338)
  • Scala Common Enrich: added mkt_clickid and mkt_network fields to POJO (#1073)
  • Scala Common Enrich: added derived_contexts field to POJO (#787)
  • Scala Common Enrich: added geo_timezone field to POJO (#787)
  • Scala Common Enrich: added etl_tags field to POJO (#1247)
  • Scala Common Enrich: added currency fields to POJO (#1316)
  • Scala Common Enrich: changed enrichment configuration to use SchemaCriterion rather than SchemaKey (#1353)
  • Scala Common Enrich: extracted original IP address from CollectorPayload headers (#1372)
  • Scala Common Enrich: extracted dvce_sent_tstamp from stm field (#1383)
  • Scala Common Enrich: added dvce_sent_tstamp to POJO (#1384)
  • Scala Common Enrich: added refr_domain_userid and refr_dvce_sent_tstamp to POJO (#1449)
  • Scala Common Enrich: added domain_sessionid field to POJO (#1538)
  • Scala Common Enrich: added derived_tstamp field to POJO (#1557)
  • Scala Common Enrich: populated refr_ fields based on page_url querystring (#1461)
  • Scala Common Enrich: populated domain_sessionid field based on "sid" parameter (#1541)
  • Scala Common Enrich: parsed the page URI in the EnrichmentManager (#1463)
  • Scala Common Enrich: added ua-parser enrichment (#62)
  • Scala Common Enrich: added ability to disable user-agent-utils enrichment (#792)
  • Scala Common Enrich: used Netaporter to parse querystrings if httpclient fails, thanks @danisola! (#1429)
  • Scala Hadoop Enrich: bumped to 0.14.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.13.0 (#1340)
  • Scala Hadoop Enrich: added integration tests for currency conversion enrichment (#1430)
  • Scala Hadoop Enrich: added tests for other new EnrichedEvent fields (#1337)
  • Scala Hadoop Shred: bumped to 0.4.0
  • Scala Hadoop Shred: bumped Scala Common Enrich to 0.13.0 (#1343)
  • Scala Hadoop Shred: bumped json4sJackson to 3.2.11 (#1344)
  • Scala Hadoop Shred: extracted JSONs from derived_contexts field (#786)
  • Scala Hadoop Shred: updated to reflect new enriched event format (#1332)
  • Scala Kinesis Enrich: bumped to 0.4.0
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.13.0 (#1369)
  • Scala Kinesis Enrich: emitted updated EnrichedEvent (#1368)
  • Scala Kinesis Enrich: unified logger configuration, thanks @kazjote! (#1367)
  • Redshift: bumped atomic.events to 0.5.0
  • Redshift: added migration script for 0.4.0 to 0.5.0 (#1335)
  • Redshift: added refr_domain_userid and refr_dvce_tstamp to atomic.events (#1450)
  • Redshift: added dvce_sent_tstamp column (#1385)
  • Redshift: added foreign key constraint to all Redshift shredded tables (#1365)
  • Redshift: changed JSON field encodings to lzo (#1350)
  • Redshift: added etl_tags column (#1245)
  • Redshift: added column for mkt_clickid and mkt_network (#1093)
  • Redshift: widened domain_userid column to hold UUID (#1090)
  • Redshift: added Redshift DDL for ua_parser_context (#789)
  • Redshift: added new derived_contexts field (#784)
  • Redshift: updated ip_address to support IPv6 addresses (#656)
  • Redshift: added new currency fields (#366)
  • Redshift: added domain_sessionid column (#1539)
  • Redshift: widened structured event, URL, and referer fields (#1553)
  • Redshift: added derived_tstamp column (#1558)
  • Postgres: bumped atomic.events to 0.4.0
  • Postgres: added migration script for 0.3.0 to 0.4.0 (#1347)
  • Postgres: added refr_domain_userid and refr_dvce_tstamp to atomic.events (#1451)
  • Postgres: added dvce_sent_tstamp column (#1386)
  • Postgres: added column for geo_timezone (#1336)
  • Postgres: added etl_tags column (#1246)
  • Postgres: removed primary key constraint on event_id (#1187)
  • Postgres: added column for mkt_clickid and mkt_network (#1092)
  • Postgres: widened domain_userid column to hold UUID (#1091)
  • Postgres: added new derived_contexts field (#785)
  • Postgres: updated ip_address to support IPv6 addresses (#655)
  • Postgres: added new currency fields (#365)
  • Postgres: added domain_sessionid column (#1540)
  • Postgres: widened structured event, URL, and referer fields (#1554)
  • Postgres: added derived_tstamp column (#1559)
  • StorageLoader: wrote JSON Path file for ua_parser_context (#790)
  • Kinesis Elasticsearch Sink: bumped to 0.2.0
  • Kinesis Elasticsearch Sink: added new EnrichedEvent fields (#1345)
  • Kinesis Elasticsearch Sink: stopped verifying number of fields in enriched event (#1333)
  • Kinesis Elasticsearch Sink: changed organization to com.snowplowanalytics in BuildSettings (#1279)
  • Kinesis Elasticsearch Sink: renamed application.conf.example to config.hocon.sample (#1244)

Release 62 Tropical Parula (2015-03-17)

  • Common: updated vagrant up to work with latest Peru version (#1475)
  • Ruby Tracker: bumped git submodule to 0.4.1 (#1488)
  • Python Tracker: bumped git submodule to 0.6.0 (#1487)
  • PHP Tracker: bumped git submodule to 0.2.1 (#1486)
  • JavaScript Tracker: bumped git submodule to 2.3.0 (#1485)
  • Java Tracker: bumped git submodule to 0.7.0 (#1484)
  • Objective-C Tracker: renamed from iOS Tracker and bumped git submodule to 0.3.2 (#1483)
  • EmrEtlRunner: bumped to 0.13.0
  • EmrEtlRunner: fixed copy to staging for Tomcat7 logs with hyphen after .txt (#1480)
  • EmrEtlRunner: added missing :archive: in BucketHash (#1475)
  • EmrEtlRunner: added support for custom bootstrap actions, thanks @danisola! (#1405)
  • EmrEtlRunner: removed time_diff as a dependency (#1352)
  • EmrEtlRunner: fixed breaking get_assets spec (#1287)
  • EmrEtlRunner: now tolerating more exception types in EmrJob's wait_for (#358)
  • EmrEtlRunner: bumped Contracts to 0.7 (#1498)
  • EmrEtlRunner: moved include Contracts into classes and modules (#1438)

Release 61 Pygmy Parrot (2015-03-02)

  • Common: bumped VERSION file to r61-pygmy-parrot
  • Common: added Gradle to up.playbooks (#1270)
  • Common: added .travis.yml file and Travis button to repo (#1359)
  • Common: added Release button to README (#1428)
  • Common: added License button to README (#1427)
  • Clojure Collector: bumped to 1.0.0
  • Clojure Collector: updated access-valve to depend on Tomcat 8 classes (#1203)
  • Clojure Collector: updated .ebextensions to depend on Tomcat 8 (#1202)
  • Clojure Collector: added ability to disable third-party cookies (#1362)
  • Clojure Collector: added CORS support (#1146)
  • Clojure Collector: added CORS-style support for ActionScript3 Tracker (#1330)
  • Clojure Collector: added support for /:vendor/:version to HEAD (#1166)
  • Clojure Collector: now using UTF-8 for character encoding throughout (#1354)
  • Scala Common Enrich: bumped to 0.12.0
  • Scala Common Enrich: updated SnowplowAdapter to accept "charset=UTF-8" (#1424)
  • Scala Common Enrich: Base64 decoding does not specify UTF-8 charset (#1403)
  • Scala Common Enrich: removed incorrect extra layer of URL decoding from non-Base64-encoded JSONs (#1396)
  • Scala Common Enrich: added support for ti_nm for transaction item name as well as ti_na (#1401)
  • Scala Common Enrich: added CloudfrontAccessLogAdapter (#1282)
  • Scala Common Enrich: made timestamp field of CollectorPayload an Option (#1417)
  • Scala Hadoop Enrich: bumped to 0.13.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.12.0 (#1395)
  • Scala Hadoop Enrich: added test for non-Base64-encoded JSON (#1394)
  • Scala Hadoop Enrich: updated tests to include Unicode (#1390)
  • Scala Hadoop Enrich: added integration test for CloudfrontAccessLogAdapter (#1423)
  • Scala Hadoop Bad Rows: removed .travis.yml (#1382)
  • EmrEtlRunner: bumped to 0.12.0
  • EmrEtlRunner: now appending region name to Clojure Collector log files (#1379)
  • EmrEtlRunner: added support for moving and archiving timestamped Clojure Collector log files (#1400)
  • EmrEtlRunner: now appending rather than prepending instance names to Clojure Collector log files (#1404)
  • EmrEtlRunner: changed Clojure Collector log timestamp format to match CloudFront logs (#1398)
  • EmrEtlRunner: added dedicated return code for no files to process (#1397)
  • EmrEtlRunner: now allowing tsv// and json// as :etl:collector_format (#1284)
  • EmrEtlRunner: now performing S3DistCp from processing for tsv/com.amazon.aws.cloudfront/* (#1431)
  • EmrEtlRunner: added output directory empty check prior to staging step (#1151)
  • StorageLoader: updated shell script to only run StorageLoader if EmrEtlRunner found files (#1399)
  • StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/flash_context (#1305)
  • StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/timing event (#1388)
  • StorageLoader: wrote JSON Path file for a com.amazon.aws.cloudfront/wd_access_log event (#1285)
  • StorageLoader: wrote JSON Path file for a com.google.analytics/cookies context (#1409)
  • StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/desktop_context (#1421)
  • Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/timing event (#1387)
  • Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/flash_context (#1304)
  • Redshift: added Redshift DDL for a com.amazon.aws.cloudfront/wd_access_log event (#1286)
  • Redshift: added Redshift DDL for a com.google.analytics/cookies context (#1408)
  • Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/desktop_context (#1420)

Release 60 Bee Hummingbird (2015-02-03)

  • Common: added VERSION file in root to assist vagrant push (#1293)
  • Common: added vagrant push scripting to publish Kinesis apps (#1288)
  • Common: added lzo.yml to up.playbooks (#1325)
  • Thrift Raw Event: bumped Thrift version to 0.9.1 (#1225)
  • Thrift Raw Event: added collector-payload-1 and schema-sniffer-1 (#1322)
  • Thrift Raw Event: created a subproject for each Thrift class (#1298)
  • Thrift Raw Event: updated README and project description to reflect new structure (#1300)
  • Thrift Raw Event: renamed to thrift-schemas (#1299)
  • Scala Stream Collector: bumped to 0.3.0
  • Scala Stream Collector: started sending CollectorPayloads instead of SnowplowRawEvents (#1226)
  • Scala Stream Collector: added support for POST requests (#187)
  • Scala Stream Collector: added support for any {api-vendor}/{api-version} for GET and POST (#652)
  • Scala Stream Collector: stopped decoding URLs (#1217)
  • Scala Stream Collector: changed 1x1 pixel response to use a stable GIF (#1260)
  • Scala Stream Collector: renamed default.conf to config.hocon.sample (#1243)
  • Scala Stream Collector: started using ThreadLocal to handle Thrift serialization, thanks @denismo and @pkallos! (#1254)
  • Scala Stream Collector: added healthcheck for load balancers, thanks @duncan! (#1360)
  • EmrEtlRunner: bumped to 0.11.0
  • EmrEtlRunner: added "thrift" collector format (#1301)
  • EmrEtlRunner: implemented time_diff manually (#1310)
  • EmrEtlRunner: fixed failure reporting when jobflow step(s) created_at is nil (#1351)
  • Scala Common Enrich: bumped to 0.11.0
  • Scala Common Enrich: added schema-sniffer-1 and collector-payload-1 dependencies (#1296)
  • Scala Common Enrich: bumped user-agent-utils version to 1.14 (#1224)
  • Scala Common Enrich: changed EnrichedEvent field name to ip_organization (#1145)
  • Scala Common Enrich: changed "thrift" to "thrift-raw" in Loader object (#1302)
  • Scala Common Enrich: added tests for getLoader function (#558)
  • Scala Hadoop Enrich: bumped to 0.12.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.11.0 (#1294)
  • Scala Hadoop Enrich: added collector-payload-1 and snowplow-thrift-raw-event as test dependencies (#1248)
  • Scala Hadoop Enrich: added support for processing Thrift raw events, thanks @pkallos! (#538)
  • Scala Hadoop Enrich: added tests to Hadoop Enrich for processing Thrift raw events (#559)
  • Scala Kinesis Enrich: bumped to 0.3.0
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.11.0 (#1295)
  • Scala Kinesis Enrich: renamed default.conf to config.hocon.sample (#1242)
  • Kinesis Elasticsearch Sink: added LICENSE-2.0.txt (#1329)
  • Kinesis LZO S3 Sink: added. Version 0.1.0, thanks @pkallos! (#1016)

Version 0.9.14 (2014-12-31)

  • Common: added dedicated Vagrant setup (#1266)
  • Common: added Quickstart section to README (#1268)
  • Common: added script to sync region-specific Snowplow Hosted Assets buckets (#1269)
  • CloudFront Collector: replaced 1x1 pixel with stable GIF (#1259)
  • Clojure Collector: bumped to 0.9.1
  • Clojure Collector: increased Tomcat's HTTP header tolerance to 64kB (#1249)
  • Clojure Collector: changed 1x1 pixel response to use a stable GIF (#1258)
  • EmrEtlRunner: bumped to 0.10.0
  • EmrEtlRunner: removed hyphen from the pattern match for Clojure Collector logs (#1194)
  • EmrEtlRunner: on job failure, log overall jobflow and individual step statuses (#1153)
  • Scala Common Enrich: bumped to 0.10.0
  • Scala Common Enrich: bumped Scala Iglu Client to 0.2.0 (#1222)
  • Scala Common Enrich: updated SnowplowAdapter to accept payload_data versions above 1-0-0 (#1220)
  • Scala Common Enrich: updated SnowplowAdapter to make charset=utf-8 optional (#1257)
  • Scala Common Enrich: added Adapter to pre-process Pingdom events (#1164)
  • Scala Common Enrich: added Adapter to pre-process PagerDuty events (#1158)
  • Scala Common Enrich: added Adapter to pre-process Mandrill events (#1061)
  • Scala Hadoop Enrich: bumped to 0.11.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.10.0 (#1223)
  • Scala Hadoop Enrich: added test job for PingdomAdapter (#1176)
  • Scala Hadoop Enrich: added test job for PagerdutyAdapter (#1175)
  • Scala Hadoop Enrich: added test job for MandrillAdapter (#1171)
  • Scala Hadoop Enrich: added test job for more relaxed payload_data schema matching (#1235)
  • Scala Hadoop Shred: bumped to 0.3.0
  • Scala Hadoop Shred: bumped Scala Common Enrich to 0.10.0 (#1236)
  • Scala Hadoop Shred: bumped Iglu Scala Client to 0.2.0 (#1230)
  • Scala Hadoop Shred: loosened match criteria for unstructured events and contexts (#1231)
  • StorageLoader: wrote JSON Path file for com.pingdom/incident_notify_of_close event (#1182)
  • StorageLoader: wrote JSON Path file for com.pingdom/incident_assign event (#1181)
  • StorageLoader: wrote JSON Path file for com.pingdom/incident_notify_user event (#1251)
  • StorageLoader: wrote JSON Path file for com.pagerduty/incident event (#1177)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_sent event (#1059)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_bounced event (#1058)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_opened event (#1057)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_marked_as_spam event (#1056)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_delayed event (#1055)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_soft_bounced event (#1054)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_clicked event (#1053)
  • StorageLoader: wrote JSON Path file for com.mandrill/message_rejected event (#1052)
  • StorageLoader: wrote JSON Path file for com.mandrill/recipient_unsubscribed event (#1051)
  • Redshift: added Redshift DDL for a com.pingdom/incident_notify_of_close event (#1180)
  • Redshift: added Redshift DDL for a com.pingdom/incident_assign event (#1179)
  • Redshift: added Redshift DDL for a com.pingdom/incident_notify_user (#1252)
  • Redshift: added Redshift DDL for a com.pagerduty/incident event (#1178)
  • Redshift: added Redshift DDL for a com.mandrill/message_sent event (#1050)
  • Redshift: added Redshift DDL for a com.mandrill/message_bounced event (#1049)
  • Redshift: added Redshift DDL for a com.mandrill/message_opened event (#1048)
  • Redshift: added Redshift DDL for a com.mandrill/message_marked_as_spam event (#1047)
  • Redshift: added Redshift DDL for a com.mandrill/message_delayed event (#1046)
  • Redshift: added Redshift DDL for a com.mandrill/message_soft_bounced event (#1045)
  • Redshift: added Redshift DDL for a com.mandrill/message_clicked event (#1044)
  • Redshift: added Redshift DDL for a com.mandrill/message_rejected event (#1043)
  • Redshift: added Redshift DDL for a com.mandrill/recipient_unsubscribed event (#1042)
  • Redshift: removed trailing commas from com.mailchimp SQL table definitions (#1174)

Version 0.9.13 (2014-12-01)

  • Scala Common Enrich: bumped to 0.9.1
  • Scala Common Enrich: added error handling for Netaporter URI parsing (#1216)
  • Scala Kinesis Enrich: bumped to 0.2.1
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.9.1
  • Scala Kinesis Enrich: fixed conflict with Specs2 version, thanks @knservis! (#1213)
  • Scala Hadoop Enrich: bumped to 0.10.1
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.9.1
  • Deleted test-file in repository root (#1219)

Version 0.9.12 (2014-11-26)

  • Scala Stream Collector: bumped to 0.2.0
  • Scala Stream Collector: changed organization to "com.snowplowanalytics" (#1168)
  • Scala Stream Collector: made the --config option mandatory (#1128)
  • Scala Stream Collector: added ability to set AWS credentials from environment variables (#1116)
  • Scala Stream Collector: now enforcing Java 7 for compilation (#1068)
  • Scala Stream Collector: increased request character limit to 32768 (#987)
  • Scala Stream Collector: improved performance by using Future, thanks @pkallos! (#580)
  • Scala Stream Collector, Scala Kinesis Enrich: made endpoint configurable, thanks @sambo1972! (#978)
  • Scala Stream Collector, Scala Kinesis Enrich: added support for IAM roles, thanks @pkallos! (#534)
  • Scala Stream Collector, Scala Kinesis Enrich: replaced stream list with describe to tighten permissions, thanks @pkallos! (#535)
  • Scala Kinesis Enrich: bumped to 0.2.0
  • Scala Kinesis Enrich: bumped Scala Common Enrich to 0.9.0
  • Scala Kinesis Enrich: changed organization to "com.snowplowanalytics" (#1167)
  • Scala Kinesis Enrich: made the --config option mandatory (#1126)
  • Scala Kinesis Enrich: updated instructions in README (#1125)
  • Scala Kinesis Enrich: added ability to set AWS credentials from environment variables (#1117)
  • Scala Kinesis Enrich: now enforcing Java 7 for compilation (#1067)
  • Scala Kinesis Enrich: replaced printlns with Java Logger (#521)
  • Scala Kinesis Enrich: started sending bad records to a separate stream (#463)
  • Scala Kinesis Enrich: added page_url and page_referrer back into enrichment output (#686)
  • Scala Kinesis Enrich: stopped opening a new file for each enriched event, thanks @pkallos! (#714)
  • Scala Common Enrich: bumped to 0.9.0
  • Scala Common Enrich: added BadRow from Scala Hadoop Enrich (#1118)
  • Scala Common Enrich: added ability to override collector-set nuid with tracker-set tnuid (#1095)
  • Scala Common Enrich: made URI parsing more permissive using NetAPorter's URI library, thanks @rupeshmane! (#1172)
  • Scala Hadoop Enrich: bumped to 0.10.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.9.0
  • Scala Hadoop Enrich: moved BadRow into Scala Common Enrich (#1119)
  • Scala Hadoop Enrich: updated README with new Snowplow capitalization (#1127)
  • Kinesis Elasticsearch Sink: added. Version 0.1.0

Version 0.9.11 (2014-11-10)

  • Clojure Collector: bumped to 0.9.0
  • Clojure Collector: added support for /:vendor/:version to GET (#1131)
  • Scala Common Enrich: bumped to 0.8.0
  • Scala Common Enrich: bumped json4s to 3.2.11 (#1141)
  • Scala Common Enrich: bumped Scala Iglu Client to 0.1.1 (#1140)
  • Scala Common Enrich: removed check that POST request has body and content-type (#1132)
  • Scala Common Enrich: moved payload API detection into CollectorApi.parse (#1113)
  • Scala Common Enrich: fixed bug in CljTomcatLoader expecting request body to be "_" instead of "-" (#1112)
  • Scala Common Enrich: added Adapter to pre-process CallRail events (#1108)
  • Scala Common Enrich: added Adapter to pre-process MailChimp events (#1086)
  • Scala Common Enrich: added Adapter to pre-process Iglu-compatible events (#1060)
  • Scala Hadoop Enrich: bumped to 0.9.0
  • Scala Hadoop Enrich: added job test for unrecognized api name/version (#1115)
  • Scala Hadoop Enrich: updated DiscardableCfLinesSpec given /not-ice.png is no longer discarded (#1114)
  • Scala Hadoop Enrich: added test job for MailchimpAdapter (#1159)
  • Scala Hadoop Enrich: added test job for CallrailAdapter (#1160)
  • Redshift: removed not null constraint on change_form's value column (#1162)
  • Redshift: added Redshift DDL for a com.callrail/call_complete event (#1110)
  • Redshift: added Redshift DDL for a com.mailchimp/campaign_sending_status event (#1085)
  • Redshift: added Redshift DDL for a com.mailchimp/cleaned_email event (#1084)
  • Redshift: added Redshift DDL for a com.mailchimp/email_address_change event (#1083)
  • Redshift: added Redshift DDL for a com.mailchimp/profile_update event (#1082)
  • Redshift: added Redshift DDL for a com.mailchimp/unsubscribe event (#1081)
  • Redshift: added Redshift DDL for a com.mailchimp/subscribe event (#1080)
  • StorageLoader: wrote JSON Path file for com.callrail/call_complete event (#1109)
  • StorageLoader: wrote JSON Path file for com.mailchimp/campaign_sending_status event (#1079)
  • StorageLoader: wrote JSON Path file for com.mailchimp/cleaned_email event (#1078)
  • StorageLoader: wrote JSON Path file for com.mailchimp/email_address_change event (#1077)
  • StorageLoader: wrote JSON Path file for com.mailchimp/profile_update event (#1076)
  • StorageLoader: wrote JSON Path file for com.mailchimp/unsubscribe event (#1075)
  • StorageLoader: wrote JSON Path file for com.mailchimp/subscribe event (#1074)

Version 0.9.10 (2014-11-06)

  • StorageLoader: wrote JSON Path file for PerformanceTiming (#1147)
  • StorageLoader: wrote JSON Path file for social_interaction (#1029)
  • StorageLoader: wrote JSON Path file for site_search (#1027)
  • StorageLoader: wrote JSON Path file for change_form (#1025)
  • StorageLoader: wrote JSON Path file for submit_form (#1023)
  • StorageLoader: wrote JSON Path file for remove_from_cart (#1021)
  • StorageLoader: wrote JSON Path file for add_to_cart (#1019)
  • Redshift: converted all Redshift DDLs to use tabs (#1034)
  • Redshift: added Redshift DDL for PerformanceTiming (#1032)
  • Redshift: added Redshift DDL for social_interaction (#1030)
  • Redshift: added Redshift DDL for site_search (#1028)
  • Redshift: added Redshift DDL for change_form (#1026)
  • Redshift: added Redshift DDL for submit_form (#1024)
  • Redshift: added Redshift DDL for remove_from_cart (#1022)
  • Redshift: added Redshift DDL for add_to_cart (#1020)

Version 0.9.9 (2014-10-27)

  • .NET Tracker: added git submodule. Version 0.1.0 (#1000)
  • PHP Tracker: added git submodule. Version 0.1.0 (#1013)
  • Clojure Collector: bumped to 0.8.0
  • Clojure Collector: fixed regression in log record format caused by #854 (#992)
  • Clojure Collector: correctly handles multiple IPs in X-Forwarded-For (#970)
  • StorageLoader: bumped to 0.3.3
  • StorageLoader: selecting Snowplow's hosted-assets bucket based on region (#1012)
  • EmrEtlRunner: bumped to 0.9.2
  • EmrEtlRunner: no rows to process now returns 0, not 1 (#1018)
  • EmrEtlRunner: fixed bug where --process-enrich doesn't work, thanks @kingo55! (#1089)
  • EmrEtlRunner: now checking that output directories are empty before running (#1124)
  • Scala Common Enrich: bumped to 0.7.0
  • Scala Common Enrich: bumped scala-maxmind-iplookups to 0.2.0 (#1002)
  • Scala Common Enrich: added support for non-GA campaign attribution: phase 1 (#402)
  • Scala Common Enrich: rewrote AttributionEnrichments tests as RefererParserEnrichment tests (#974)
  • Scala Common Enrich: allow but downcase a-f characters in incoming event_id (#1006)
  • Scala Common Enrich: extract useragent from ua parameter (#1011)
  • Scala Common Enrich: fixed issue where unset integer fields throw an NPE (#570)
  • Scala Common Enrich: fixed issue where unset double fields throw an NPE (#1062)
  • Scala Common Enrich: added tests for ConversionUtils.stringToJInteger (#1064)
  • Scala Common Enrich: now enforcing Java 7 for compilation (#1065)
  • Scala Hadoop Enrich: bumped to 0.8.0
  • Scala Hadoop Enrich: bumped Scala Common Enrich to 0.7.0 (#995)
  • Scala Hadoop Enrich: added test for empty integer and double fields to ensure no NPE thrown (#1063)
  • Scala Hadoop Enrich: now enforcing Java 7 for compilation (#1066)
  • Scala Hadoop Enrich: updated test jobs to reflect updated useragent parsing (#1070)

Version 0.9.8 (2014-09-18)

  • iOS Tracker: added git submodule. Version 0.1.1 (#982)
  • Android Tracker: added git submodule. Version 0.1.1 (#983)
  • Clojure Collector: bumped to 0.7.0
  • Clojure Collector: merged snowplow/tomcat-cf-access-log-valve into Snowplow as clojure-collector/access-valve (#898)
  • Clojure Collector: bumped access-valve to 0.1.0
  • Clojure Collector: changed access-valve's package path to com.snowplowanalytics.snowplow.collectors.clojure.accessvalve (#924)
  • Clojure Collector: changed access-valve to use Gradle (#899)
  • Clojure Collector: changed access-valve to publish to war-resources/.ebextensions (#900)
  • Clojure Collector: updated access-valve and added web.xml to log request body and content type (#901)
  • Clojure Collector: fixed empty querystring in access-valve (#938)
  • Clojure Collector: fixed IP address forwarding for VPC-based environments (#854)
  • Clojure Collector: added support for API vendor and version in routing (#925)
  • Clojure Collector: added support for POST as well as GET (#654)
  • Scala Stream Collector: fixed broken link to thrift-raw-event, thanks @bamos! (#955)
  • Scala Common Enrich: bumped to 0.6.0
  • Scala Common Enrich: split out Clojure and CloudFront Collector event processing (#943)
  • Scala Common Enrich: added CljTomcatLoaderSpec tests (#963)
  • Scala Common Enrich: filtering non-GETs from CloudfrontLoader (#944)
  • Scala Common Enrich: replaced all Argonaut code with json4s (#945)
  • Scala Common Enrich: renamed CanonicalOutput to EnrichedEvent (#964)
  • Scala Common Enrich: replaced CanonicalInput and TrackerPayload with CollectorPayload and RawEvent (#946)
  • Scala Common Enrich: updated EnrichmentManager to process RawEvent not CanonicalInput (#903)
  • Scala Common Enrich: added Snowplow Tp2 Adapter to convert event JSON to NEL of RawEvents (#904)
  • Scala Common Enrich: geo-IP lookup now supports ip parameter on querystring (#961)
  • Scala Common Enrich: IP address anonymization now works with ip parameter on querystring (#960)
  • Scala Hadoop Enrich: bumped to 0.7.0
  • Scala Hadoop Enrich: bumped to Scala Common Enrich 0.6.0 (#940)
  • Scala Hadoop Enrich: updated to support generating multiple enriched events from one raw payload (#902)
  • StorageLoader: wrote JSON Path file for mobile_context (#776)
  • StorageLoader: wrote JSON Path file for geolocation_context (#962)
  • Redshift: added Redshift DDL for mobile_context (#542)
  • Redshift: added Redshift DDL for geolocation_context (#950)

Version 0.9.7 (2014-09-02)

  • Ruby Tracker: bumped git submodule to 0.3.0 (#939)
  • Java Tracker: bumped git submodule to 0.5.1 (#948)
  • Node.js Tracker: added git submodule. Version 0.1.0 (#949)
  • Trackers: fixed broken git submodule links, thanks @OAGr! (#957)
  • EmrEtlRunner: bumped to 0.9.1
  • EmrEtlRunner: fixed @jobflow.ec2_subnet_id not being set due to incorrect guard, thanks @rslifka! (#956)
  • EmrEtlRunner: fixed bugs in --process-bucket (#973)
  • EmrEtlRunner: renamed --process-bucket option to --process-enrich (#972)
  • EmrEtlRunner: changed -s option for --skip to -x to prevent clash with -s for --start (#975)
  • EmrEtlRunner: now allows shredding without prior enrichment (#927)
  • StorageLoader: bumped to 0.3.2
  • StorageLoader: removed EMPTYASNULL for loading JSONs (#942)
  • StorageLoader: added missing targetUrl field to ad_impression JSON Path file, thanks @gisripa! (#951)
  • StorageLoader: made providing jsonpath_assets optional (#958)
  • StorageLoader: added support for cross-region Redshift COPY (#971)
  • Hive Storage: bumped table-def.q to 0.2.0
  • Hive Storage: added and removed fields to synchronize with 0.9.6's enriched event format (#965)
  • Scala Hadoop Shred: bumped to 0.2.1
  • Scala Hadoop Shred: fixed multiple JSONs not being shredded for a single row (#968)
  • Scala Hadoop Shred: strengthened test suite (#967)

Version 0.9.6 (2014-07-26)

  • Java Tracker: bumped git submodule to 0.4.0 (#892)
  • EmrEtlRunner: bumped to 0.9.0
  • EmrEtlRunner: passed etl_tstamp into Hadoop Enrich as an argument (#396)
  • EmrEtlRunner: removed enrichment-specific code (#811)
  • EmrEtlRunner: removed enrichment-specific parameters from config.yml.sample (#809)
  • EmrEtlRunner: replaced enrichment-specific arguments from EmrEtlRunner (#808)
  • EmrEtlRunner: removed %3D code following Scalding upgrade (#849)
  • EmrEtlRunner: fixed contract on partition_by_run (#894)
  • EmrEtlRunner: updated Bash script to support enrichments path (#916)
  • StorageLoader: bumped to 0.3.1
  • StorageLoader: now looking in eu-west-1 region for s3://snowplow-hosted-assets (#895)
  • StorageLoader: updated combined Bash script to support enrichments path (#917)
  • Scala Hadoop Enrich: bumped to 0.6.0
  • Scala Hadoop Enrich: bumped Scala to 2.10.4 (#912)
  • Scala Hadoop Enrich: bumped Scalding to 0.11.1 (#911)
  • Scala Hadoop Enrich: bumped Hadoop to 1.2.1 (#913)
  • Scala Hadoop Enrich: bumped to Scala Common Enrich 0.5.0 (#788)
  • Scala Hadoop Enrich: passed etl_tstamp into Scala Common Enrich (#817)
  • Scala Hadoop Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#835)
  • Scala Hadoop Enrich: removed %3D handling for compatibility with old Scalding Args (#850)
  • Scala Hadoop Enrich: added ability to download additional MaxMind databases (#885)
  • Scala Hadoop Enrich: added runHadoop and Tool.main tests (#914)
  • Scala Common Enrich: bumped to 0.5.0
  • Scala Common Enrich: bumped user-agent-utils version, thanks @pkallos! (#662)
  • Scala Common Enrich: bumped referer-parser to 0.2.2 (#864)
  • Scala Common Enrich: bumped httpclient to 4.3.3 (#897)
  • Scala Common Enrich: bumped scala-maxmind-geoip to scala-maxmind-iplookups 0.1.0 (#882)
  • Scala Common Enrich: stored etl_tstamp in new field in CanonicalOutput (#818)
  • Scala Common Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#836)
  • Scala Common Enrich: made referer parsing configurable with list of internal domains (#857)
  • Scala Common Enrich: migrated configurable enrichments to new EnrichmentRegistry (#858)
  • Scala Common Enrich: added validation of enrichments JSON (#807)
  • Scala Common Enrich: replaced "anon_ip_quartets" with "anon_ip_octets" everywhere (#547)
  • Scala Common Enrich: added ability to extract event_id from querystring (#723)
  • Scala Common Enrich: extracted CanonicalInput's userId as network_userid, thanks @pkallos! (#855)
  • Scala Common Enrich: added MaxMind region_name field (#873)
  • Scala Common Enrich: added IP -> ISP lookup (#861)
  • Scala Common Enrich: added IP -> organization lookup (#887)
  • Scala Common Enrich: added IP -> domain lookup (#886)
  • Scala Common Enrich: added IP -> net speed lookup (#889)
  • Scala Common Enrich: added validation for transaction ID (#428)
  • Scala Common Enrich: renamed Tests to Specs for consistency (#618)
  • Scala Hadoop Shred: bumped to 0.2.0
  • Scala Hadoop Shred: bumped to Scala Common Enrich 0.5.0 (#918)
  • Scala Hadoop Shred: trailing empty fields no longer cause shredding for that row to fail (#921)
  • Scala Hadoop Shred: updated column offsets for enriched events TSV (#915)
  • Redshift: bumped atomic.events to 0.4.0
  • Redshift: added migration script for 0.3.0 to 0.4.0
  • Redshift: added etl_tstamp to atomic.events (#819)
  • Redshift: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#834)
  • Redshift: added new MaxMind fields (#871)
  • Redshift: applied runlength encoding to all fields keyed off IP address (#883)
  • Redshift: migration script added for 0.3.0 to 0.4.0 (#838)
  • Postgres: bumped atomic.events to 0.3.0
  • Postgres: added migration script for 0.2.0 to 0.3.0
  • Postgres: added etl_tstamp to atomic.events (#820)
  • Postgres: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#833)
  • Postgres: added new MaxMind fields (#871)
  • Postgres: migration script added for 0.2.0 to 0.3.0 (#837)

Version 0.9.5 (2014-07-09)

  • Ruby Tracker: added git submodule. Version 0.1.0 (#645)
  • Java Tracker: added git submodule. Version 0.2.0 (#843)
  • JavaScript Tracker: bumped git submodule to 2.0.0 (#635)
  • Python Tracker: bumped git submodule to 0.4.0 (#634)
  • Scala Hadoop Shred: added. Version 0.1.0
  • EmrEtlRunner: bumped to 0.8.0
  • EmrEtlRunner: updated S3DistCp steps to use new S3DistCpStep from Elasticity (#629)
  • EmrEtlRunner: added --skip s3distcp option (#313)
  • EmrEtlRunner: added ability to start Lingual in EmrEtlRunner (#623)
  • EmrEtlRunner: added ability to start HBase in EmrEtlRunner (#622)
  • EmrEtlRunner: improved load performance by switching ETL to write out to HDFS (#278)
  • EmrEtlRunner: now invoking Scala Hadoop Shredder after main job (#644)
  • EmrEtlRunner: added :iglu: section to config.yml for Scala Hadoop Shred (#814)
  • EmrEtlRunner: updated to run Scala Hadoop Shred following Hadoop Enrich (#815)
  • EmrEtlRunner: added --skip shred option (#659)
  • StorageLoader: bumped to 0.3.0
  • StorageLoader: bumped Sluice to 0.2.1 (#881)
  • StorageLoader: added initial Ruby.contracts support (#391)
  • StorageLoader: updated config.yml to support shredding (#897)
  • StorageLoader: added ACCEPTINVCHARS to StorageLoader (#411)
  • StorageLoader: wrote JSON Path files for ad_* events (#642)
  • StorageLoader: wrote JSON Path file for link_click (#599)
  • StorageLoader: wrote JSON Path file for screen_view (#643)
  • StorageLoader: wrote JSON Path file for schema.org's WebPage (#772)
  • StorageLoader: added :jsonpath_assets: setting for StorageLoader (#606)
  • StorageLoader: added ability to load custom tables using JSON Paths (#607)
  • StorageLoader: added --skip shred option (#660)
  • StorageLoader: added :in: hint on StorageLoader configuration, thanks @joaolcorreia! (#755)
  • Redshift: added Redshift DDL for ad_* events (#639)
  • Redshift: added Redshift DDL for link_click events (#600)
  • Redshift: added Redshift DDL for screen_view events (#640)
  • Redshift: added Redshift DDL for schema.org's WebPage (#771)
  • Looker Analytics: wrote LookML for ad_* events (#605)
  • Looker Analytics: wrote LookML for screen_view events (#637)
  • Looker Analytics: wrote LookML for link_click events (#636)
  • Looker Analytics: wrote LookML for schema.org's WebPage (#770)
  • Looker Analytics: updated LookML to use liquid templating (#851)

Version 0.9.4 (2014-05-30)

  • Redshift: added reference_data.country_codes (#779)
  • Postgres: added reference_data.country_codes (#781)
  • Looker Analytics: New 'traffic_pulse' dashboard with globally configurable drill-down variables (#765)
  • Looker Analytics: Snowplow website specific dimensions and metrics removed: base model is now company-generic (#764)
  • Looker Analytics: cleaner joining of data sets in Looker model (#763)
  • Looker Analytics: dimensions and metrics renamed to make it clearer for an analyst getting started with the data (#761)
  • Looker Analytics: added distkeys and sortkeys to derived tables to speed up query times (#696)
  • Looker Analytics: derived tables now auto-generated when new data is loaded into atomic.events (#688)
  • Looker Analytics: 'visits' renamed to 'sessions' (#762)
  • Looker Analytics: LookML models versioned using SchemaVer (#766)

Version 0.9.3 (2014-05-21)

  • EmrEtlRunner: bumped to 0.7.0
  • EmrEtlRunner: bumped Sluice to 0.2.1 (#405)
  • EmrEtlRunner: bumped Elasticity to 3.0.4 (#665)
  • EmrEtlRunner: replaced hadoop_version setting with ami_version setting (#701)
  • EmrEtlRunner: fixed handling of region, placement and ec2_subnet_id (#754)
  • EmrEtlRunner: fixed regression where 0 files staged still kicks off EMR (#409)
  • EmrEtlRunner: stopped Sluice file operation threads being killed by folders (#401)
  • EmrEtlRunner: fixed disabling of Cascading error catching (#721)
  • EmrEtlRunner: renamed Clojure Collector log files in processing bucket to support multiple instances (#717)
  • EmrEtlRunner: added initial Ruby.contracts support into EmrEtlRunner (#392)
  • EmrEtlRunner: updated to use the Ruby Logger (#194)
  • EmrEtlRunner: updated so it's embeddable in other applications (#128)
  • EmrEtlRunner: added ability to bundle as a JRuby fat jar (#674)
  • EmrEtlRunner: added initial unit tests (#672)
  • Clojure Collector: bumped to 0.6.0
  • Clojure Collector: fixed load balancer IP address getting stored in logs (#719)
  • Documentation: removed all Snowplow tracking from READMEs, thanks @acinader! (#720)
  • Documentation: fixed (slightly) inconsistent EmrEtlRunner documentation, thanks @pvdb! (#749)

Version 0.9.2 (2014-04-30)

  • Scala Hadoop Enrich: bumped to 0.5.0
  • Scala Hadoop Enrich: bumped to Scala Common Enrich 0.4.0 (#699)
  • Scala Hadoop Enrich: bumped SBT to 0.13.2 (#702)
  • Scala Hadoop Enrich: bumped to using sbt-assembly 0.11.2 (#704)
  • Scala Common Enrich: bumped to 0.4.0
  • Scala Common Enrich: upgraded to support new and future CloudFront file formats (#698)
  • Scala Common Enrich: bumped SBT to 0.13.2 (#703)
  • Scala Hadoop Bad Rows: added. Version 0.1.0
  • Hive Storage: bumped table-def.q to 0.1.0
  • Hive Storage: added new unstructured fields to Hive table definition (#709)
  • Hive Storage: added raw page_url and page_referrer into Hive table (#710)
  • Hive Storage: added name_tracker field to Hive table (#711)

Version 0.9.1 (2014-04-11)

  • Scala Hadoop Enrich: bumped to 0.4.0
  • Scala Hadoop Enrich: bumped to Scala Common Enrich 0.3.0 (#497)
  • Scala Hadoop Enrich: renamed AnonQuartets to AnonOctets (#498)
  • Scala Hadoop Enrich: renamed all Snowplow Hadoop Tests to Specs (#515)
  • Scala Hadoop Enrich: added page_url and page_referrer back into ETL's output (#483)
  • Scala Common Enrich: bumped to 0.3.0
  • Scala Common Enrich: bumped Argonaut to 6.0.3 (#620)
  • Scala Common Enrich: added app and mob as valid platform codes, thanks @kinabalu! (#524)
  • Scala Common Enrich: added support for remaining platform codes (#516)
  • Scala Common Enrich: updated POJO in Scalding ETL to include new unstructured fields (#362)
  • Scala Common Enrich: updated POJO in Scalding ETL to include name_tracker field (#595)
  • Scala Common Enrich: extract evn from Tracker Protocol (#604)
  • Scala Common Enrich: extract tna from Tracker Protocol (#616)
  • Scala Common Enrich: extract and validate unstructured events (#142)
  • Scala Common Enrich: extract and validate custom contexts (#426)
  • Scala Common Enrich: reformat incoming event and context JSONs (#589)
  • Scala Common Enrich: make sure to error a JSON if > length (#567)
  • EmrEtlRunner: bumped to 0.6.0
  • EmrEtlRunner: bumped Elasticity to 3.0.2 (#587)
  • EmrEtlRunner: allowed AWS VPC selection in EmrEtlRunner (#581)
  • EmrEtlRunner: set :visible_to_all_users to true for EMR jobs, thanks @smugryan! (#560)
  • Redshift: atomic-def script bumped to 0.3.0
  • Redshift: migration script added for 0.2.2 to 0.3.0
  • Redshift: added new unstructured fields to Redshift table definition (#361)
  • Redshift: changed distkey to be event_id, not domain_userid (#584)
  • Redshift: added raw page_url and page_referrer into Redshift table (#591)
  • Redshift: added name_tracker field to Redshift table (#594)
  • Redshift: converted Redshift varchar(38) for event IDs to char(36) (#282)
  • Postgres: atomic-def script bumped to 0.2.0
  • Postgres: migration script added for 0.1.x to 0.2.0
  • Postgres: added new unstructured fields to Postgres table definition (#359)
  • Postgres: added raw page_url and page_referrer into Postgres table (#592)
  • Postgres: added name_tracker field to Postgres table (#593)
  • Postgres: converted varchar(36) for event IDs to char(36) (#596)
  • StorageLoader: bumped to 0.2.0
  • StorageLoader: added TIMEFORMAT 'auto' to StorageLoader to handle outlier dvce_timestamps (#427)
  • JavaScript Tracker: bumped git submodule to 1.0.1 (#585)
  • Python Tracker: added git submodule pointing to 0.1.0 (#586)

Version 0.9.0 (2014-02-04)

  • Thrift Raw Event: added. Version 0.1.0
  • Thrift Raw Event: specified Thrift IDL for new raw event schema (#430)
  • Scala Stream Collector: added. Version 0.1.0
  • Scala Stream Collector: implemented new spray-can (Akka Http) Scala stream collector (#432)
  • Scala Kinesis Enrich: added. Version 0.1.0
  • Scala Kinesis Enrich: implemented initial Kinesis-based enrichment (#460)
  • Scala Common Enrich: bumped to 0.2.0
  • Scala Common Enrich: added Thrift SnowplowRawEvent as a dependency to common-enrich (#475)
  • Scala Common Enrich: added ability to read Thrift SnowplowRawEvent (#462)
  • Scala Common Enrich: renamed CloudFront to Cloudfront in code (#495)
  • Scala Common Enrich: renamed AnonQuartets to AnonOctets (#491)
  • Scala Common Enrich: added raw -> CanonicalInput tests (#484)
  • Scala Common Enrich: updated GET payload extraction to handle empty payloads (#502)
  • Git submodules: changed git:// protocol in .gitmodules to https:// (#512)
  • NodeJS Collector: removed contrib-nodejs-collector from 2-collectors (#474)
  • JavaScript Tracker: bumped JS Tracker submodule to 0.13.1 release (#511)

Version 0.8.13 (2014-01-08)

  • Looker Analytics: added. Version 0.1.0
  • Looker Analytics: created Snowplow metadata model for Looker BI (www.looker.com) (#472)

Version 0.8.12 (2014-01-07)

  • Hadoop ETL: bumped to 0.3.6
  • Hadoop ETL: bumped to SBT 0.13.0 (#404)
  • Hadoop ETL: bumped to using sbt-assembly 0.10.1 (#421)
  • Hadoop ETL: bumped to Scala 2.10.3 (#423)
  • Hadoop ETL: bumped to Scalding 0.8.11 (#422)
  • Hadoop ETL: upgraded useragent utils to 1.11 & moved to Maven dependency (#416)
  • Hadoop ETL: added test running back into sbt-assembly step (#420)
  • Hadoop ETL: updated copyright messages to be Snowplow not SnowPlow, and to 2014 not 2013 (#419)
  • Hadoop ETL: added ValidatedString as a type to package.scala (#328)
  • Hadoop ETL: added missing validation to stringToJByte (#408)
  • Hadoop ETL: missing page URI no longer interpreted as bad row (#399)
  • Hadoop ETL: updated CfRegex to reflect Cfcs(Cookie) can be empty (#410)
  • Hadoop ETL: numeric fields in tr_ and ti_ now parsed to doubles, not madeTsvSafe strings (#400)
  • Hadoop ETL: moved ETL core into separate project scala-enrich-common (#417)
  • Scala Common Enrich: updated ETL versioning to include host and common versions (#448)
  • Postgres: bumped cube-pages.sql to 0.1.1
  • Postgres: minor fix: cube_pages.complete referenced non-existent table cube_pages.basic, thanks @mrwalker! (#414)

Version 0.8.11 (2013-10-22)

  • Hadoop ETL: bumped to 0.3.5
  • Hadoop ETL: added Argonaut 6.0 as a dependency (#342)
  • Hadoop ETL: added fromTimestamp to EventEnrichments (#340)
  • Hadoop ETL: added makeTsvSafe to ConversionUtils (#338)
  • Hadoop ETL: added JsonUtils (#323)
  • Hadoop ETL: added support for 3 and 4 return values from MapTransformer (#324)
  • Hadoop ETL: updated GetJsonPayload to use Argonaut and renamed to JsonPayload (#339)
  • Hadoop ETL: added ability to mask IP addresses in ETL (#309)
  • Hadoop ETL: refr_ and page_ fields now stored raw (#374)
  • Hadoop ETL: defensively fixed raw spaces in page and referer URLs (#346)
  • Hadoop ETL: fixed regression, single-encoded %s logic didn't account for % itself (#347)
  • Hadoop ETL: added unit tests for fixTabsNewlines (#332)
  • Hadoop ETL: tests now report the failing CanonicalOutput field (#325)
  • Hadoop ETL: now handling all fields double-encoded as per CloudFront post-14-September (#348)
  • Hadoop ETL: added support for 21 Oct CloudFront access log format (#384)
  • Hadoop ETL: added truncation to refr_term (#379)
  • Hadoop ETL: added truncation to se_label (#394)
  • Hadoop ETL: made all prior ME.identity fields TSV-safe (#395)
  • EmrEtlRunner: bumped to 0.5.0
  • EmrEtlRunner: bumped Sluice to 0.1.5 (#96)
  • EmrEtlRunner: bumped Elasticity to 2.6 (#345)
  • EmrEtlRunner: enabled EMR Job Flow debugging for easier access to logs (#279)
  • EmrEtlRunner: ETL job no longer fails if there's no data for last run period (#296)
  • EmrEtlRunner: empty processing dir check now works if dir contains 1 file (#326)
  • EmrEtlRunner: added ability to mask IP addresses in ETL (#309)
  • EmrEtlRunner: made the examples match what you get from git out of the box, thanks @shermozle (#331)
  • StorageLoader: bumped to 0.1.1
  • StorageLoader: bumped Sluice to 0.1.5 (#96)
  • StorageLoader: fixed "" in fields acts as an escape character for Postgres, thanks @kingo55 (#329)
  • StorageLoader: added ability to --skip analyze (#335)
  • StorageLoader: moved VACUUM SORT ONLY to a --include step (#321)
  • StorageLoader: added COMPROWS to config and --include compupdate option (#344)
  • StorageLoader: changed Postgres VACUUM FULL to VACUUM (#357)
  • StorageLoader: added TRUNCATECOLUMNS for Redshift load (#360)
  • StorageLoader: added FILLRECORD to our Redshift COPY command (#380)
  • Postgres: fixed error in recipes_basic.technology_mobile recipe (#397)

Version 0.8.10 (2013-10-18)

  • Redshift: bumped atomic.events to 0.2.2
  • Redshift: added migration script for 0.2.1 to 0.2.2
  • Redshift: moved events table to a new atomic schema in atomic-def.sql (#301)
  • Redshift: added SQL DDL to define Redshift recipes (#297)
  • Redshift: added SQL DDL to define Redshift cubes (#298)
  • Postgres: bumped atomic.events to 0.1.1
  • Postgres: added migration script for 0.1.0 to 0.1.1
  • Postgres: renamed table-def file to atomic-def.sql
  • Postgres: moved NOT NULL constraint on event field to event_vendor field (#318)
  • Postgres: added SQL DDL to define Postgres recipes (#303)
  • Postgres: added SQL DDL to define Postgres cubes (#302)
  • Documentation: fixed wrong path to no-js-tracker subdirectory, thanks @gregakespret (#343)
  • Documentation: improved "Find out more" table in README, thanks @dideler (#353)

Version 0.8.9 (2013-09-05)

  • Hadoop ETL: bumped to 0.3.4
  • Hadoop ETL: updated to handle singly-encoded %s in CloudFront querystring field (#333)

Version 0.8.8 (2013-08-04)

  • JavaScript Tracker: moved into own repo (#277)
  • Hadoop ETL: bumped to 0.3.3
  • Hadoop ETL: URL-decodes "%3D" to "=" to allow Hive-style directory names as arguments (#305)
  • Hadoop ETL: bumped referer-parser to 0.1.1 to fix java.lang.NullPointerException (#314)
  • EmrEtlRunner: bumped to 0.4.0
  • EmrEtlRunner: bumped Sluice to 0.0.7 (#299)
  • EmrEtlRunner: removed :snowplow: section from config.yml.sample (#289)
  • EmrEtlRunner: simplified EmrEtlRunner and its config (#287)
  • EmrEtlRunner: added run= to timestamped ETL folder names (#294)
  • EmrEtlRunner: updated "Jobflow started" stdout message to include jobflow ID (#315)
  • Hive ETL: removed folder 3-enrich/hive-etl as no longer supported (#286)
  • Hive storage: updated hive-storage scripts to work with current Redshift-format flatfile (#290)
  • Infobright: removed folder 4-storage/infobright as not currently supported (#285)
  • Postgres: added Postgres table definition in atomic schema (#160)
  • StorageLoader: bumped to 0.1.0
  • StorageLoader: bumped Sluice to 0.0.7 (#300)
  • StorageLoader: removed code to delete Hive ETL's empty event files (#306)
  • StorageLoader: fixed bug where download path has to be set (even when using Redshift) (#280)
  • StorageLoader: optimized ANALYZE and VACUUM commands (#283)
  • StorageLoader: added MAXERROR as StorageLoader configuration value for Redshift (#273)
  • StorageLoader: added support for loading Postgres (#161)
  • StorageLoader: removed Infobright loading capability (#307)
  • StorageLoader: added support for loading into multiple storage targets (#311)

Version 0.8.7 (2013-07-07)

  • JavaScript Tracker: bumped to 0.12.0
  • JavaScript Tracker: fixed document reference to use documentAlias (#247)
  • JavaScript Tracker: fixed bug with setCustomUrl (#267)
  • JavaScript Tracker: changed ev_ to se_ for structured events (#197)
  • JavaScript Tracker: fixed Firefox failure when "Always ask" set for cookies (#163)
  • JavaScript Tracker: fixed bug in page ping functionality detected in IE 8 (#260)
  • JavaScript Tracker: replaced forEach as not supported in IE 6-8 (#295)
  • EmrEtlRunner: fixed bug in config.yml.sample (#291)
  • Arduino tracker: added git submodule link (#292)

Version 0.8.6 (2013-06-03)

  • Hadoop ETL: bumped to 0.3.2
  • Hadoop ETL: bumped Scalding to 0.8.5
  • Hadoop ETL: bumped Scala version to 2.10.0
  • Hadoop ETL: bumped scala-maxmind-geoip to 0.0.5 to work with Scala 2.10.0
  • Hadoop ETL: bumped SBT from 0.12.1 to 0.12.3
  • Hadoop ETL: bumped Specs2 to 1.14
  • Hadoop ETL: replaced Bytes in CanonicalOutput with JBytes (#254)
  • Hadoop ETL: disabled "corruption" detection in ETL overriding custom URLs with longer collector referer URLs (#268)
  • EmrEtlRunner: bumped to 0.3.0
  • EmrEtlRunner: updated config.yml.sample to support spot task instances
  • EmrEtlRunner: let EmrEtlRunner use spot task instances (#193)
  • EmrEtlRunner: now consolidates small files prior to running ETL job (#207)

Version 0.8.5 (2013-05-24)

  • Hadoop ETL: bumped to 0.3.1
  • Hadoop ETL: now supports downloading GeoLiteCity.dat from public S3 URL if needed, thanks @petervanwesep (part of #258)
  • Hadoop ETL: added Twitter Maven Repo as a resolution repo, thanks @rgabo (#239)
  • Hadoop ETL: stripping control characters in addition to tabs and newlines (#259)
  • Hadoop ETL: fixed issue with large values for se_value (#263)
  • Hadoop ETL: renamed ev_ fields in CanonicalOutput to se_
  • Hadoop ETL: extractResolution renamed and fails gracefully if view dimensions exceed Integer max size (#264)
  • EmrEtlRunner: bumped to 0.2.1
  • EmrEtlRunner: returns public S3 URL to GeoLiteCity.dat file if hosted by Snowplow, thanks @petervanwesep (part of #258)
  • Redshift: table-def script bumped to 0.2.1
  • Redshift: migration script added for 0.2.0 to 0.2.1
  • Redshift: bumped se_value from a float to a double
  • Redshift: increased size of _urlport fields, thanks @petervanwesep (#266)
  • Infobright: bumped setup_ and verify_infobright.sql to 0.0.9
  • Infobright: added migration script 0.0.8->0.0.9
  • Infobright: increased size of _urlport fields, thanks @petervanwesep (#266)

Version 0.8.4 (2013-05-16)

  • Hadoop ETL: bumped to 0.3.0
  • Hadoop ETL: added geo-ip lookup to Scalding ETL
  • Hadoop ETL: bumped referer-parser from 0.1.0-M6 to 0.1.0
  • Hadoop ETL: removed truncation of page_referrer (#236)
  • Hadoop ETL: added truncation of referer path/qs/fragment (#235)
  • Hadoop ETL: removing tabs found in referer search terms (#234)
  • Hadoop ETL: fixed client timestamp so it's not incorrectly localised - thanks @rgabo (#238)
  • Hadoop ETL: added parsing of collector version cv (#243)
  • Hadoop ETL: bumped Scalaz from 7.0.0-M9 to 7.0.0
  • Hadoop ETL: removed .gets from extractPageUri (#249)
  • EmrEtlRunner: bumped to 0.2.0
  • EmrEtlRunner: now passes MaxMind .dat file into Scalding ETL (#213)
  • EmrEtlRunner: improve messages when ETL job starts and fails (#230)
  • Redshift: table-def script bumped to 0.2.0
  • Redshift: migration script added for 0.1.0 to 0.2.0
  • Redshift: added geo-ip fields to Redshift table definition (#226)
  • Redshift: rename ev_ fields to se_ for structured events (#227)

Version 0.8.3 (2013-05-14)

  • JavaScript Tracker: bumped to 0.11.2
  • JavaScript Tracker: added unstructured events, thanks @rgabo, @tarsolya, @lackac (#198)
  • JavaScript Tracker: remove leading ampersand in querystring (#188)
  • Clojure Collector: bumped to 0.5.0
  • Clojure Collector: upgraded to use Tomcat AccessLogValve 0.0.4 (#240)
  • Clojure Collector: now logging Clojure Collector and Tomcat AccessLogValve versions (#239)
  • Common: completed splitting custom event type into: unstructured and structured events (#133)

Version 0.8.2 (2013-05-08)

  • Clojure Collector: bumped to 0.4.0
  • Clojure Collector: remove duplicate of wrap-request-logging in middleware.clj (#221)
  • Clojure Collector: check/potentially bump lein-ring dependency in project.clj (#222)
  • Clojure Collector: simplify building Clojure Collector, thanks @butlermh (#223, #225)
  • Clojure Collector: fix Tomcat log bug of missing cs(Referer) (#220)

Version 0.8.1 (2013-04-12)

  • Hadoop ETL: bumped to 0.2.0
  • Hadoop ETL: break referer_url into constituent parts (part of #175)
  • Hadoop ETL: remove raw referrer_url (as no space in Redshift table defn) (part of #175)
  • Hadoop ETL: added referer parsing (#176)
  • Redshift: table-def script bumped to 0.1.0
  • Redshift: migration script added for 0.0.1 to 0.1.0
  • Redshift: add/update referer fields in Redshift table definition (#204)
  • Redshift: fix bug where mkt_source and mkt_medium are getting swapped around (#215)
  • Common: replaced embedded architecture images with CloudFront-hosted images
  • Common: completed rename of 3-etl to 3-enrich (#99)
  • Common: "SnowPlow" -> "Snowplow" in 1st and 2nd level READMEs

Version 0.8.0 (2013-04-03)

  • Hadoop ETL: added. Version 0.1.0 (#177)
  • Hadoop ETL: truncate 6 "high risk" fields for Redshift (raw useragent, page title etc) (#192)
  • Hadoop ETL: ev_value now extracted as a float (#201)
  • EmrEtlRunner: bumped to 0.1.0
  • EmrEtlRunner: updated to work with new config.yml fields (part of #178)
  • EmrEtlRunner: added support for Hadoop ETL (part of #178)
  • EmrEtlRunner: added run ID and human-friendly job name (#100)
  • EmrEtlRunner: added run IDs to output folders (Hadoop ETL only) (#79)
  • EmrEtlRunner: changed .rvmrc to .ruby-version, thanks @richo (part of #190)
  • StorageLoader: changed .rvmrc to .ruby-version, thanks @richo (part of #190)
  • StorageLoader: added final missing /Gemfile to BUNDLE_GEMFILE in Bash script, thanks @frutik (#206)
  • Common: started rename of 3-etl to 3-enrich (part of #99)

Version 0.7.6 (2013-03-03)

  • HiveQL: redshift-etl.q added. Version 0.0.1 (#174)
  • HiveQL: hive-rolling-etl.q renamed to hive-etl.q and bumped to 0.5.7
  • HiveQL: non-hive-rolling-etl.q renamed to mysql-infobright-etl.q and bumped to 0.0.8 (part of #172)
  • EmrEtlRunner: bumped to 0.0.9
  • EmrEtlRunner: renamed :snowplow: variable names and added new Redshift one in config.yml (part of #172)
  • EmrEtlRunner: updated to support Redshift as a storage format (#173)
  • EmrEtlRunner: added missing /Gemfile to BUNDLE_GEMFILE in Bash script
  • StorageLoader: bumped to 0.0.5
  • StorageLoader: added Redshift-specific fields to config.yml (part of #159)
  • StorageLoader: added Redshift load support into StorageLoader (part of #159)
  • StorageLoader: added missing /Gemfile to BUNDLE_GEMFILE in Bash scripts
  • Redshift: table-def.sql script added. Version 0.0.1 (#158)
  • Infobright: bumped setup_ and verify_infobright.sql to 0.0.8
  • Infobright: widened useragent field (#184)
  • Infobright: added migration script 0.0.7->0.0.8
  • Serde: fixed and enabled broken tests (#14). Version unchanged

Version 0.7.5 (2013-02-25)

  • JavaScript Tracker: bumped to 0.11.1
  • JavaScript Tracker: fixed bug with cookie secure flag killing user ID cookies (#181)

Version 0.7.4 (2013-02-22)

  • JavaScript Tracker: bumped to 0.11.0
  • JavaScript Tracker: introduced setAppId() and deprecated setSiteId() (#168)
  • JavaScript Tracker: 1st party user ID now transmitted as duid (domain uid) (part of #150)
  • JavaScript Tracker: now sends dtm - the client timestamp (#149)
  • JavaScript Tracker: deprecated and disabled attachUserId()
  • JavaScript Tracker: deprecated getVisitorId() and getVisitorInfo() - use getDomainUserId() and getDomainUserInfo() instead
  • JavaScript Tracker: add setUserId which sets the uid field (#167)
  • JavaScript Tracker: SnowPlow cookies no longer tied to site ID (#148)
  • Clojure Collector: bumped to 0.3.0
  • Clojure Collector: now append nuid (network aka 3rd party) user ID, not uid (#150)
  • Serde: bumped to 0.5.5
  • Serde: renamed tstamp field to dtm
  • Serde: dt and tm split into dvce_x and collector_x (#149)
  • Serde: extract new nuid and duid fields (#150)
  • Serde: renamed visit_id to domain_sessionidx (#171)
  • HiveQL: hive-rolling-etl.q bumped to 0.5.6
  • HiveQL: non-hive-rolling-etl.q bumped to 0.0.7
  • HiveQL: dt and tm split into dvce_x and collector_x (#149)
  • HiveQL: now extracts uid, nuid and duid (#150)
  • HiveQL: renamed visit_id to domain_sessionidx (#171)
  • Infobright: bumped setup_infobright.sql to 0.0.7
  • Infobright: renamed dt and tm to dvce_x and collector_x (#149)
  • Infobright: now supports uid, nuid and duid (#150)
  • Infobright: renamed visit_id to domain_sessionidx (#171)
  • Infobright: added migration script 0.0.6 CloudFront collector -> 0.0.7
  • Infobright: added migration script 0.0.6 Clojure collector -> 0.0.7

Version 0.7.3 (2013-02-15)

  • JavaScript Tracker: bumped to 0.10.0
  • JavaScript Tracker: updated copyright notices
  • JavaScript Tracker: removed deprecated setAccount(), setTracker(), setHeartBeatTimer() - BREAKING CHANGE (#86)
  • JavaScript Tracker: added document charset to querystring (#138)
  • JavaScript Tracker: page ping no longer killed by 1 heartbeat w/o activity (#132)
  • JavaScript Tracker: added document & viewport dimensions (#94)
  • JavaScript Tracker: introduced trackStructEvent and deprecated trackEvent (#143)
  • JavaScript Tracker: cleaned up getRequest code to use improved requestStringBuilder
  • JavaScript Tracker: fixed logImpression (was using wrong argument names) (#162)
  • JavaScript Tracker: added scroll offsets to page ping (#127)
  • Serde: bumped to 0.5.4
  • Serde: updated copyright notices
  • Serde: structured events now logged as "struct" not "custom" - DATA CHANGE
  • Serde: added setting of new event_vendor field (to com.snowplowanalytics) (#144)
  • Serde: added extraction of doc charset (#138)
  • Serde: added extraction of document & viewport dimensions (#94)
  • Serde: added extraction of scroll offsets for enhanced page ping (#127)
  • Serde: added extraction of URL components (#105)
  • HiveQL: hive-rolling-etl.q bumped to 0.5.5
  • HiveQL: non-hive-rolling-etl.q bumped to 0.0.6
  • HiveQL: updated copyright notices
  • HiveQL: now supports charset, document & viewport, URL components, event_vendor and enhanced page ping
  • Infobright: bumped setup_infobright.sql to 0.0.6
  • Infobright: updated copyright notices
  • Infobright: added migration scripts (0.0.4->.6; 0.0.5->.6)
  • Infobright: added charset, document & viewport, URL components, event_vendor and enhanced page ping

Version 0.7.2 (2013-01-29)

  • No-JavaScript Tracker: added. Version 0.1.0
  • JavaScript Tracker: bumped to 0.9.1
  • JavaScript Tracker: fixed bug where secure flag not being set on cookies sent via HTTPS
  • Clojure Collector: bumped to 0.2.0
  • Clojure Collector: fixed Tomcat config issue of times being recorded in 12-hour clock
  • Serde: added NoJsTrackerTest
  • Serde: fixed CljTomcatFormatTest

Version 0.7.1 (2013-01-22)

  • EmrEtlRunner: bumped to 0.0.8
  • EmrEtlRunner: updated copyright notices
  • EmrEtlRunner: added .rvmrc file (part of #121, #84)
  • EmrEtlRunner: removed .gemspec file
  • EmrEtlRunner: added dependencies to Gemfile and re-generated Gemfile.lock
  • StorageLoader: bumped to 0.0.4
  • StorageLoader: updated copyright notices
  • StorageLoader: added .rvmrc file (part of #121, #84)
  • StorageLoader: removed .gemspec file
  • StorageLoader: added dependencies to Gemfile and re-generated Gemfile.lock
  • Documentation: updated to use bundle install (#122)

Version 0.7.0 (2013-01-04)

  • Clojure Collector: added. Version 0.1.0
  • HiveQL: hive-rolling-etl.q bumped to 0.5.4
  • HiveQL: non-hive-rolling-etl.q bumped to 0.0.5
  • HiveQL: v_collector now set via Hive variable, not Serde (#118)
  • EmrEtlRunner: bumped to 0.0.7
  • EmrEtlRunner: bumped to using Sluice 0.0.6
  • EmrEtlRunner: added "Complete" message at end of run (part of #97)
  • EmrEtlRunner: validates "clj-tomcat" as collector format (#119)
  • EmrEtlRunner: passes collector format through to HiveQL (#119)
  • EmrEtlRunner: support for log files generated by Clojure Collector on Tomcat (#117)
  • Serde: added broken CljTomcatFormatTest
  • StorageLoader: bumped to 0.0.3
  • StorageLoader: bumped to using Sluice 0.0.6
  • StorageLoader: added "Complete" message at end of run (part of #97)
  • StorageLoader: --skip argument now supports a list (#81)
  • Infobright: bumped setup_infobright.sql to 0.0.5
  • Infobright: added migration script (0.0.4 -> 0.0.5)
  • Infobright: user_id field widened to 38 chars to support UUID

Version 0.6.5 (2012-12-26)

  • JavaScript Tracker: bumped to 0.9.0
  • JavaScript Tracker: each event now sent with an event type e (#63)
  • JavaScript Tracker: refactoring of event definition code
  • JavaScript Tracker: added attachUserId(boolean) method (#92)
  • JavaScript Tracker: removed configCustomData from logImpression (#115)
  • JavaScript Tracker: cleaned up activity tracking (page pings)
  • JavaScript Tracker: added a combine only option to snowpak.sh
  • Serde: bumped to 0.5.3
  • Serde: now extracts event type (e) from querystring (#63)
  • Serde: now attaches UUID event_id to each event (#89)
  • Serde: added support for IP address override in querystring (#90)
  • Serde: no longer dies on corrupted querystring (#114)
  • HiveQL: hive-rolling-etl.q bumped to 0.5.3
  • HiveQL: non-hive-rolling-etl.q bumped to 0.0.4
  • HiveQL: event and event_id now extracted from Serde (#63, #89)
  • EmrEtlRunner: updated config file template

Version 0.6.4 (2012-12-20)

  • HiveQL: renamed table-def.q to non-hive-format-table-def.q
  • HiveQL: added hive-format-table-def.q (#111)
  • Infobright: bumped setup_infobright.sql to 0.0.4
  • Infobright: added migration script (0.0.3 -> 0.0.4)
  • Infobright: now supports long br_langs and urls (#107)
  • Infobright: removed lookup from fields which slow a large load (#107)

Version 0.6.3 (2012-12-18)

  • JavaScript Tracker: bumped to 0.8.2
  • JavaScript Tracker: fixed regressions from splitting JS into multiple files (#103)
  • HiveQL: hive-rolling-etl.q bumped to 0.5.2
  • HiveQL: added missing comma in hive-rolling-etl.q (#112)

Version 0.6.2 (2012-11-29)

  • JavaScript Tracker: bumped to 0.8.1
  • JavaScript Tracker: fixed bug with trailing comma (#102)
  • JavaScript Tracker: removed console.log when not debugging (#101)
  • JavaScript Tracker: removed minified sp.js from version control (added .gitignore to keep it out)
  • SnowCannon: bumped submodule to latest shermozle/SnowCannon commit

Version 0.6.1 (2012-11-28)

  • JavaScript Tracker: bumped to 0.8.0
  • JavaScript Tracker: rename ice.png to i - BREAKING CHANGE (#29)
  • JavaScript Tracker: added setCollectorCf() and deprecated setAccount() (#32)
  • JavaScript Tracker: Tracker constructor now supports Cf or Url (part of #44)
  • JavaScript Tracker: getTrackerCf() and -Url() added, getTracker() deprecated (part of #44)
  • JavaScript Tracker: added tracker version (tv) to querystring (#41)
  • JavaScript Tracker: added color depth tracking (part of #69)
  • JavaScript Tracker: added timezone tracking (part of #69)
  • JavaScript Tracker: added user fingerprinting (#70)
  • JavaScript Tracker: broke out .js into multiple files (#55)
  • EmrEtlRunner: bumped to 0.0.6
  • EmrEtlRunner: --skip takes multiple args (part of #83, supersedes #80)
  • EmrEtlRunner: add --process-bucket to process a bucket directly (part of #83)
  • StorageLoader: bumped to 0.0.2
  • StorageLoader: changed the data file encloser to NULL (#88)
  • Serde: bumped to 0.5.2
  • Serde: now extracts color depth, timezone and fingerprint fields
  • Serde: added useragent into ETL (#68)
  • Serde: now extracts platform field
  • HiveQL: hive-rolling-etl.q bumped to 0.5.1
  • HiveQL: non-hive-rolling-etl.q bumped to 0.0.3
  • HiveQL: now extracts color depth, timezone and fingerprint fields
  • HiveQL: now includes raw useragent as a separate field (#68)
  • HiveQL: platform field no longer a placeholder
  • HiveQL: event_name field renamed to event (prep for #89)
  • HiveQL: added event_id as a placeholder
  • Infobright: bumped setup_infobright.sql to 0.0.3
  • Infobright: added migration script (0.0.1/2 -> 0.0.3)
  • Infobright: now includes color depth, timezone and fingerprint fields
  • Infobright: now includes raw useragent (#68)
  • Infobright: event_name field renamed to event
  • Infobright: added event_id as a placeholder (prep for #89)

Version 0.6.0 (2012-11-12)

  • EmrEtlRunner: bumped to 0.0.5
  • EmrEtlRunner: bumped gem dependencies to match StorageLoader (including Sluice 0.0.4)
  • EmrEtlRunner: renamed snowplow-emr-etl.sh to snowplow-emr-etl-runner.sh
  • StorageLoader: added. Ruby app to load SnowPlow events into local databases etc
  • Serde: bumped to 0.5.1
  • Serde: changed all Booleans to Bytes for non-Hive output
  • HiveQL: bumped non-hive-rolling-etl.q to 0.0.2
  • HiveQL: changed non-hive-rolling-etl.q to use the two _bt Byte fields
  • Infobright: bumped setup_infobright.sql to 0.0.2
  • Infobright: changed booleans to tinyint(1)s (non-breaking change)

Version 0.5.2 (2012-11-05)

  • EmrEtlRunner: bump to 0.0.4
  • EmrEtlRunner: fixed reference to old version of Hive deserializer in config.yml (fixes #71)
  • EmrEtlRunner: fixed bug using sub-folders with the Processing Bucket (fixes #72)
  • EmrEtlRunner: can now skip move-files-to-Processing-Bucket or EMR stages (fixes #58)
  • EmrEtlRunner: S3 filecopy code now moved to Sluice, an external Ruby gem

Version 0.5.1 (2012-10-31)

  • Data model: stubbed new event_name and platform fields
  • Infobright: added setup scripts and docs into 4-storage/infobright (fixes #57)
  • Infobright: added version handling (v_tracker, v_collector, v_etl)
  • HiveQL: removed hive-exact-etl.q as no longer supported
  • HiveQL: added non-hive-rolling-etl.q for Infobright- (and other db-)friendly event file format
  • HiveQL: added version handling (v_tracker, v_collector, v_etl) (fixes #42)
  • Serde: bumped to 0.5.0
  • Serde: updated to avoid throwing exceptions on a bad field, fixes #52 (thanks @mtibben!)
  • Serde: moved some exception handling closer to the throw point, pull req #66 (thanks @mtibben!)
  • Serde: added continue_on_unexpected_error (thanks @mtibben!)
  • Serde: tabs are changed to 4 spaces, fixes #61
  • Serde: browser features are now also available as individual fields, for non-hive-rolling-etl.q to use
  • Serde: added version handling (v_tracker, v_collector, v_etl)
  • EmrEtlRunner: bumped to 0.0.3
  • EmrEtlRunner: moved 3 .rb files in lib/ into lib/snowplow-emr-etl-runner
  • EmrEtlRunner: added/updated configuration options (:etl: section and hiveql versioning params)

Version 0.5.0 (2012-10-24)

  • Tidied up folder structure inside 3-etl/
  • Serde: assembles to /target, not to /upload any more (and jars won't be committed to Git)
  • EmrEtlRunner: added. Ruby application to run Hive ETL process on Amazon EMR

Version 0.4.10 (2012-10-10)

  • SnowCannon: bumped submodule to latest shermozle/SnowCannon commit
  • HiveQL: moved app_id to end of table for backwards compatibility
  • HiveQL: fixed bug where HiveQL was pointing to serde 0.4.8, not the new serde 0.4.9

Version 0.4.9 (2012-10-01)

  • Serde: fixed bug where row not nulled if a critical field un-parseable
  • Serde: added support for new application ID (#33)
  • Serde: added deserialization of ecommerce fields, plus tests (#34, #51)
  • Serde: test suite enhancements (adding Scala helper objects)
  • Serde: added tests including #13 and #10
  • snowplow.js: bumped to 0.7.0
  • snowplow.js: renamed said to aid for application ID

Version 0.4.8 (2012-09-14)

  • Serde: added support for /i as well as /ice.png (issue #35)
  • Serde: added support for new (2012-09-12) CloudFront format
  • Serde: handles Cf bucket with Forward Query String = yes (issue #39)
  • Serde: made marketing attribution parsing more robust

Version 0.4.7 (2012-09-05)

  • snowplow.js: bumped to version 0.6
  • snowplow.js: added setSiteId functionality
  • snowplow.js: added ecommerce tracking

Version 0.4.6 (2012-08-18)

  • snowplow.js: added setCollectorUrl functionality

Version 0.4.5 (2012-08-03)

  • Serde: upgraded httpclient and tweaked URL code (issue #15)
  • Serde: now extracting our 5 marketing fields (issue #12)
  • Serde: added support for client-timestamp (issue #18)
  • Serde: now stripping line breaks (issue #23)

Version 0.4.4 (2012-07-28)

  • Restructured into 5 sub-systems
  • Updated README to explain sub-systems

Version 0.4.3 (2012-07-02)

  • Removed status code checks from Serde
  • Serde now outputs into /upload folder (to be uploaded by SnowPlow::Etl Ruby gem)

Version 0.4.2 (2012-06-19)

  • Moved serde into /hive from own repo

Version 0.4.1 (2012-06-16)

  • Updated serde to 0.4.4
  • Moved documentation to wiki

Version 0.4.0 (2012-05-30)

  • Improved names of querystring params
  • Added page-url to QS as fallback
  • Added Hive Deserializer as submodule
  • Documentation updates

Version 0.3.0 (2012-05-18)

  • Mostly documentation

Version 0.2.0

  • Formalised minification process

Version 0.1.0

  • Initial release of SnowPlow.js