All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- If S3 Inventory is enabled in `apiary-data-lake`, create Hive `s3_inventory` database on startup.
- Add script `/s3_inventory_repair.sh` which can be used as the entrypoint of this Docker image to create and repair S3 inventory tables in the inventory database (if S3 inventory is enabled). The intent is to run the image this way on a scheduled basis in Kubernetes after AWS creates new inventory partition files in S3 each day.
- Updated `apiary-metastore-listener` and `kafka-metastore-listener` versions to `6.0.0` (was `5.0.2`).
- Enable Prometheus exporter when running on Kubernetes instead of sending metrics to CloudWatch.
- Added an optional Apiary metastore listener which can be used to send Hive metadata events to a Kafka topic.
- Updated `apiary-metastore-listener` version to `5.0.2` (was `4.2.0`).
- Set EKS hostname to ECS_TASK_ID required for enabling metastore metrics.
- Use HTTPS for the Maven Central repository, as it no longer supports insecure communication over plain HTTP.
- Fix Ranger Solr auditing by upgrading `apiary-extensions` version to `5.0.1` (was `5.0.0`).
- Atlas cluster name is set from the Apiary `ATLAS_CLUSTER_NAME` env variable when using the Atlas plugin. If not set, it defaults to the `INSTANCE_NAME` var.
- Update Ranger version to `2.0.0` (was `1.1.0`).
- Update Ranger metastore plugin to `5.0.0` (was `4.2.0`).
- Support Ranger audit-only mode for read-only HMS endpoint when audit destination is Solr.
- Add Atlas hive-bridge metastore listener, to send metadata events to Kafka.
- Set DefaultAWSCredentialsProviderChain as default hadoop-aws credential provider.
- Updated `emr-apps.repo` to `5.24.0` (was `5.15.0`).
- Updated `emr-platform.repo` to `1.17.0` (was `1.6.0`).
- Upgrade Hive to `2.3.4` (was `2.3.3`) in order to fix https://issues.apache.org/jira/browse/HIVE-18767 - see #59 (Hive version is controlled by the version of `emr-apps.repo`).
- If Ranger is configured on the metastore, the read-only instance of the metastore will be configured for audit-only by using `ApiaryRangerAuthAllAccessPolicyProvider` in apiary-metastore-ranger-plugin.
- ReadOnlyAuth Pre Event Listener, from apiary-metastore-extensions, to manage the Hive database whitelist in read-only metastores.
- Support for `_` in `HIVE_DB_NAMES` variable. Fixes #5 (ExpediaGroup/apiary#5).
- Updated apiary-metastore-listener to 4.0.0 (was 1.1.0).
- Updated apiary-gluesync-listener to 4.0.0 (was 1.1.0).
- Updated apiary-ranger-plugin to 4.0.0 (was 1.1.0).
- Updated apiary-metastore-metrics to 4.0.0 (was 1.1.0).
- Updated apiary-metastore-auth to 4.0.0 (was 1.1.0).
- Auto configure Hive metastore heapsize when running on ECS.
- Replace EMRFS with hadoop-aws S3A libraries.
- Option to send metastore metrics to CloudWatch - see #4.
- Refactor environment variable names.
- Migrate secrets from Hashicorp Vault to AWS SecretsManager.
- Update startup script to configure Log4j, to fix sending Hive Metastore logs to CloudWatch.
- Deploy RangerAuth Pre Event Listener from apiary-metastore-extensions.
- Deploy GlueSync Listener from apiary-metastore-extensions.
- Deploy SNS Listener from apiary-metastore-extensions.
- Additional check to support external MySQL database for Hive Metastore, required to implement #48.
- Fix to update cacerts for Java.
- Fix Hive Metastore logging.
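
For the `ATLAS_CLUSTER_NAME` entry above, the fallback to `INSTANCE_NAME` can be pictured with a minimal Python sketch. It is an illustration only: the image's actual startup logic is not shown in this changelog, and the function name `resolve_atlas_cluster_name` and the choice to treat an empty value as unset are assumptions made for this example.

```python
import os
from typing import Mapping, Optional


def resolve_atlas_cluster_name(env: Mapping[str, str]) -> Optional[str]:
    """Return ATLAS_CLUSTER_NAME if set and non-empty, else INSTANCE_NAME.

    Hypothetical helper: it mirrors the behaviour described in the
    changelog entry, not the real startup script.
    """
    # Assumption: an empty string counts as "not set".
    return env.get("ATLAS_CLUSTER_NAME") or env.get("INSTANCE_NAME")


if __name__ == "__main__":
    # Resolve against the real process environment when run directly.
    print(resolve_atlas_cluster_name(os.environ))
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the fallback easy to check with plain dictionaries.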