Name | Description | Type | Default | Required |
apiary_assume_roles | Cross account AWS IAM roles allowed write access to managed Apiary S3 buckets using assume policy. | list(any) | [] | no |
apiary_consumer_iamroles | AWS IAM roles allowed unrestricted (not subject to apiary_customer_condition) read access to all data in managed Apiary S3 buckets. | list(string) | [] | no |
apiary_consumer_prefix_iamroles | AWS IAM roles allowed unrestricted (not subject to apiary_customer_condition) read access to certain prefixes in managed Apiary S3 buckets. See the section below for more information and format. | map(map(list(string))) | {} | no |
apiary_customer_accounts | AWS account IDs for clients of this Metastore. | list(string) | [] | no |
apiary_customer_condition | IAM policy condition applied to customer account S3 object access. | string | "" | no |
apiary_database_name | Database name to create in RDS for Apiary. | string | "apiary" | no |
apiary_deny_iamrole_actions | List of S3 actions that 'apiary_deny_iamroles' are not allowed to perform. | list(string) | [ | no |
apiary_deny_iamroles | AWS IAM roles denied access to Apiary managed S3 buckets. | list(string) | [] | no |
apiary_domain_name | Apiary domain name for Route 53. | string | "" | no |
apiary_domain_private_zone | Whether the Apiary Route 53 zone is private. | bool | true | no |
apiary_governance_iamroles | AWS IAM governance roles allowed read and tagging access to managed Apiary S3 buckets. | list(string) | [] | no |
apiary_log_bucket | Bucket for Apiary logs. If this is blank, module will create a bucket. | string | "" | no |
apiary_log_prefix | Prefix for Apiary logs. | string | "" | no |
apiary_managed_schemas | List of maps, each map contains schema name from which S3 bucket names will be derived, and various properties. The corresponding S3 bucket will be named as apiary_instance-aws_account-aws_region-schema_name. | list(map(string)) | [] | no |
apiary_producer_iamroles | AWS IAM roles allowed write access to managed Apiary S3 buckets. | map(any) | {} | no |
apiary_rds_additional_sg | List of additional security groups to attach to RDS. | list(any) | [] | no |
apiary_shared_schemas | Schema names which are accessible from read-only metastore, default is all schemas. | list(any) | [] | no |
apiary_tags | Common tags that are added to all resources. | map(any) | n/a | yes |
apiary_extra_tags_s3 | Extra tags that are added to apiary_s3_logs_bucket. | map(any) | n/a | no |
atlas_cluster_name | Name of the Atlas cluster where metastore plugin will send DDL events. Defaults to var.instance_name if not set. | string | "" | no |
atlas_kafka_bootstrap_servers | Kafka instance url. | string | "" | no |
aws_region | AWS region. | string | n/a | yes |
apiary_common_producer_iamroles | AWS IAM roles allowed general (not tied to schema) write access to managed Apiary S3 buckets. | list(string) | [] | no |
dashboard_namespace | k8s namespace to deploy grafana dashboard. | string | "monitoring" | no |
db_apply_immediately | Specifies whether any cluster modifications are applied immediately, or during the next maintenance window. | bool | false | no |
db_backup_retention | The number of days to retain backups for the RDS Metastore DB. | string | "7" | yes |
db_backup_window | Preferred backup window for the RDS Metastore DB in UTC. | string | "02:00-03:00" | no |
db_copy_tags_to_snapshot | Copy all Cluster tags to snapshots. | bool | true | no |
db_enable_performance_insights | Enable RDS Performance Insights. | bool | false | no |
db_enhanced_monitoring_interval | RDS monitoring interval (in seconds) for enhanced monitoring. Valid values are 0, 1, 5, 10, 15, 30, 60. Default is 0. | number | 0 | no |
db_instance_class | Instance type for the RDS Metastore DB. | string | "db.t4g.medium" | yes |
db_instance_count | Desired count of database cluster instances. | string | "2" | no |
db_maintenance_window | Preferred maintenance window for the RDS Metastore DB in UTC. | string | "wed:03:00-wed:04:00" | no |
db_master_username | Aurora cluster MySQL master user name. | string | "apiary" | no |
db_ro_secret_name | Aurora cluster MySQL read-only user SecretsManager secret name. | string | "" | no |
db_rw_secret_name | Aurora cluster MySQL read/write user SecretsManager secret name. | string | "" | no |
disallow_incompatible_col_type_changes | Hive Metastore setting that disallows incompatible column type changes. | bool | true | no |
docker_registry_auth_secret_name | Docker Registry authentication SecretsManager secret name. | string | "" | no |
ecs_domain_extension | Domain name to use for hosted zone created by ECS service discovery. | string | "lcl" | no |
elb_timeout | Idle timeout for Apiary ELB. | string | "1800" | no |
enable_apiary_s3_log_hive | Create hive database to archive s3 logs in parquet format. Only applicable when module manages logs S3 bucket. | bool | true | no |
enable_autoscaling | Enable read only Hive Metastore k8s horizontal pod autoscaling. | bool | true | no |
enable_data_events | Enable managed buckets S3 event notifications. | bool | false | no |
enable_gluesync | Enable metadata sync from Hive to the Glue catalog. | bool | false | no |
enable_hive_metastore_metrics | Enable sending Hive Metastore metrics to CloudWatch. | bool | false | no |
enable_metadata_events | Enable Hive Metastore SNS listener. | bool | false | no |
enable_s3_paid_metrics | Enable managed S3 buckets request and data transfer metrics. | bool | false | no |
enable_vpc_endpoint_services | Enable metastore NLB, Route 53 entries, and VPC endpoint services for cross-account VPC access. | bool | true | no |
encrypt_db | Specifies whether the DB cluster is encrypted. | bool | false | no |
external_data_buckets | Buckets that are not managed by Apiary but added to Hive Metastore IAM role access. | list(any) | [] | no |
external_database_host | External Metastore database host to support legacy installations, MySQL database won't be created by Apiary when this option is specified. | string | "" | no |
external_database_host_readonly | External Metastore database host to support legacy installations. | string | "" | no |
hive_metastore_port | Port on which both Hive Metastore readwrite and readonly will run. | number | 9083 | no |
hms_additional_environment_variables | Additional environment variables for the Hive Metastore. | map(any) | {} | no |
hms_housekeeper_additional_environment_variables | Additional environment variables for Hive Housekeeper. | map(any) | {} | no |
hms_autogather_stats | Read-write Hive metastore setting to enable/disable statistics auto-gather on table/partition creation. | bool | true | no |
hms_docker_image | Docker image ID for the Hive Metastore. | string | n/a | yes |
hms_docker_version | Version of the Docker image for the Hive Metastore. | string | n/a | yes |
hms_instance_type | Hive Metastore instance type, possible values: ecs,k8s. | string | "ecs" | no |
hms_log_level | Log level for the Hive Metastore. | string | "INFO" | no |
hms_nofile_ulimit | Ulimit for the Hive Metastore container. | string | "32768" | no |
hms_ro_cpu | CPU for the read only Hive Metastore ECS task. Valid values can be 256, 512, 1024, 2048 and 4096. Reference: | string | "512" | no |
hms_ro_db_connection_pool_size | Read-only Hive metastore setting for size of the MySQL connection pool. Default is 10. | number | 10 | no |
hms_ro_ecs_task_count | Desired ECS task count of the read only Hive Metastore service. | string | "3" | no |
hms_ro_heapsize | Heapsize for the read only Hive Metastore. Valid values: | string | "2048" | no |
hms_ro_k8s_replica_count | Initial Number of read only Hive Metastore k8s pod replicas to create. | number | "2048" | no |
hms_ro_k8s_max_replica_count | Max Number of read only Hive Metastore k8s pod replicas to create. | number | "2048" | no |
hms_ro_target_cpu_percentage | Read only Hive Metastore autoscaling threshold for CPU target usage. | number | "2048" | no |
hms_ro_request_partition_limit | Read only Hive Metastore limits of request partitions. | string | n/a | no |
hms_ro_node_affinity | Add node affinities to the Hive metastore pods. | list(object) | n/a | no |
hms_ro_tolerations | Add tolerations to the Hive metastore pods. | list(object) | n/a | no |
hms_rw_cpu | CPU for the read/write Hive Metastore ECS task. Valid values can be 256, 512, 1024, 2048 and 4096. Reference: | string | "512" | no |
hms_rw_db_connection_pool_size | Read-write Hive metastore setting for size of the MySQL connection pool. Default is 10. | number | 10 | no |
hms_rw_ecs_task_count | Desired ECS task count of the read/write Hive Metastore service. | string | "3" | no |
hms_rw_heapsize | Heapsize for the read/write Hive Metastore. Valid values: | string | "2048" | no |
hms_rw_k8s_replica_count | Initial Number of read/write Hive Metastore k8s pod replicas to create. | number | "2048" | no |
hms_rw_k8s_pdb_settings | Add PodDisruptionBudget to the HMS rw pods. | object | max_unavailable = 1 | no |
hms_rw_k8s_rolling_update_strategy | Configure HMS RW deployment rolling strategy. | object | max_unavailable = 1 | no |
hms_rw_request_partition_limit | Read Write Hive Metastore limits of request partitions. | string | n/a | no |
hms_rw_node_affinity | Add node affinities to the Hive metastore pods. | list(object) | n/a | no |
hms_rw_tolerations | Add tolerations to the Hive metastore pods. | list(object) | n/a | no |
iam_name_root | Name to identify Hive Metastore IAM roles. | string | "hms" | no |
ingress_cidr | Generally allowed ingress CIDR list. | list(string) | n/a | yes |
instance_name | Apiary instance name to identify resources in multi-instance deployments. | string | "" | no |
k8s_docker_registry_secret | Docker Registry authentication K8s secret name. | string | "" | no |
kafka_bootstrap_servers | Kafka bootstrap servers to send metastore events, setting this enables Hive Metastore Kafka listener. | string | "" | no |
kafka_topic_name | Kafka topic to send metastore events. | string | "" | no |
kiam_arn | Kiam server IAM role ARN. | string | "" | no |
ldap_base | Active directory LDAP base DN to search users and groups. | string | "" | no |
ldap_ca_cert | Base64 encoded Certificate Authority bundle to validate LDAPS connections. | string | "" | no |
ldap_secret_name | Active directory LDAP bind DN SecretsManager secret name. | string | "" | no |
ldap_url | Active directory LDAP URL to configure Hadoop LDAP group mapping. | string | "" | no |
metastore_namespace | k8s namespace to deploy metastore containers. | string | "metastore" | no |
oidc_provider | EKS cluster OIDC provider name, required for configuring IAM using IRSA. | string | "" | no |
private_subnets | Private subnets. | list(any) | n/a | yes |
ranger_audit_db_url | Ranger DB audit provider configuration. | string | "" | no |
ranger_audit_secret_name | Ranger DB audit secret name. | string | "" | no |
ranger_audit_solr_url | Ranger Solr audit provider configuration. | string | "" | no |
ranger_policy_manager_url | Ranger admin URL to synchronize policies. | string | "" | no |
rds_max_allowed_packet | RDS/MySQL setting for parameter 'max_allowed_packet' in bytes. Default is 128MB (Note that MySQL default is 4MB). | number | 134217728 | no |
rw_ingress_cidr | Read-Write metastore ingress CIDR list. If not set, defaults to var.ingress_cidr. | list(string) | [] | no |
s3_enable_inventory | Enable S3 inventory configuration. | bool | false | no |
s3_inventory_customer_accounts | AWS account IDs allowed to access s3 inventory database. | list(string) | [] | no |
s3_inventory_format | Output format for S3 inventory results. Can be Parquet, ORC, CSV. | string | "ORC" | no |
s3_inventory_update_schedule | Cron schedule to update S3 inventory tables (if enabled). Defaults to every 12 hours. | string | "0 */12 * * *" | no |
s3_lifecycle_abort_incomplete_multipart_upload_days | Number of days after which incomplete multipart uploads will be deleted. | string | "7" | no |
s3_lifecycle_policy_transition_period | S3 Lifecycle Policy number of days for Transition rule. | string | "30" | no |
s3_log_expiry | Number of days after which Apiary S3 bucket logs expire. | string | "365" | no |
s3_logs_sqs_delay_seconds | The time in seconds that the delivery of all messages in the queue will be delayed. | number | 300 | no |
s3_logs_sqs_message_retention_seconds | Time in seconds after which message will be deleted from the queue. | number | 345600 | no |
s3_logs_sqs_receive_wait_time_seconds | The time for which a ReceiveMessage call will wait for a message to arrive (long polling) before returning. | number | 10 | no |
s3_logs_sqs_visibility_timeout_seconds | Time in seconds after which message will be returned to the queue if it is not deleted. | number | 3600 | no |
s3_storage_class | S3 storage class after transition using lifecycle policy. | string | | no |
secondary_vpcs | List of VPCs to associate with Service Discovery namespace. | list(any) | [] | no |
system_schema_customer_accounts | AWS account IDs allowed to access system database. | list(string) | [] | no |
system_schema_name | Name for the internal system database. | string | "apiary_system" | no |
table_param_filter | A regular expression for selecting necessary table parameters for the SNS listener. If the value isn't set, then no table parameters are selected. | string | "" | no |
vpc_id | VPC ID. | string | n/a | yes |
enable_dashboard | Make the EKS and ECS dashboards optional. | bool | true | no |
rds_family | RDS family. | string | aurora5.6 | no |
datadog_metrics_enabled | Enable Datadog metrics for HMS. | bool | false | no |
datadog_metrics_hms_readwrite_readonly | Prometheus metrics sent to Datadog. | list(string) | ["metrics_classloading_loaded_value","metrics_threads_count_value","metrics_memory_heap_max_value","metrics_init_total_count_tables_value","metrics_init_total_count_dbs_value","metrics_memory_heap_used_value","metrics_init_total_count_partitions_value"] | no |
datadog_metrics_port | Port on which metrics will be sent to Datadog. | string | 8080 | no |
datadog_key_secret_name | Name of the secret containing the Datadog API key. This needs to be created manually in AWS Secrets Manager. This is only applicable to ECS deployments. | string | null | no |
datadog_agent_version | Version of the Datadog Agent running in the ECS cluster. This is only applicable to ECS deployments. | string | 7.50.3-jmx | no |
datadog_agent_enabled | Whether to include the datadog-agent container. This is only applicable to ECS deployments. | string | false | no |
enable_tcp_keepalive | TCP keepalive settings on HMS pods. To use this, you need to enable the ability to change sysctl settings on your Kubernetes cluster. For EKS, you need to allow this on your cluster (check EKS version for details). If your EKS version is below 1.24, you need to create a PodSecurityPolicy allowing the following sysctls: "net.ipv4.tcp_keepalive_time", "net.ipv4.tcp_keepalive_intvl", "net.ipv4.tcp_keepalive_probes", and a ClusterRole + RoleBinding for the service account running the HMS pods, or for all service accounts in the namespace where Apiary is running, so that Kubernetes can apply the tcp_keepalive configuration. For EKS 1.25 and above check this. Also see tcp_keepalive_* variables. | bool | false | no |
tcp_keepalive_time | Sets net.ipv4.tcp_keepalive_time (seconds). | number | 200 | no |
tcp_keepalive_intvl | Sets net.ipv4.tcp_keepalive_intvl (seconds). | number | 30 | no |
tcp_keepalive_probes | Sets net.ipv4.tcp_keepalive_probes (number of probes). | number | 2 | no |
ecs_platform_version | ECS Service Platform Version. | string | | no |
ecs_requires_compatibilities | ECS task definition requires compatibilities. | list(string) | ["EC2", "FARGATE"] | no |
hms_ecs_metrics_readonly_namespace | ECS readonly metrics namespace. | string | hmsreadonlylegacy | no |
hms_ecs_metrics_readwrite_namespace | ECS readwrite metrics namespace. | string | hmsreadwritelegacy | no |
hms_k8s_metrics_readonly_namespace | K8s readonly metrics namespace. | string | hms_readonly | no |
s3_versioning_expiration_days | Number of days (TTL) before objects are expired. Bucket needs to have versioning enabled. | number | 7 | no |
enable_splunk_logging | Enable sending logs to Splunk. When enabling, splunk_hec_token, splunk_hec_host and splunk_hec_index are also needed. | bool | false | no |
splunk_hec_token | The token used for authentication with the Splunk HTTP Event Collector (HEC). This is required for sending logs to Splunk. Compatible with both EC2 and FARGATE ECS task definitions. | string | | no |
splunk_hec_host | The hostname or URL of the Splunk HTTP Event Collector (HEC) endpoint to which logs will be sent. | string | | no |
splunk_hec_index | The index in Splunk where logs will be stored. This is used to organize and manage logs within Splunk. | string | | no |
A list of maps. Each map entry describes a role that is created in this account, and a list of principals (IAM ARNs) in other accounts that are allowed to assume this role. Each entry also specifies a list of Apiary schemas that this role is allowed to write to.
An example entry looks like:
apiary_assume_roles = [
  {
    name = "client_name"
    principals = [ "arn:aws:iam::account_number:role/cross-account-role" ]
    schema_names = [ "dm","lz","test_1" ]
    max_role_session_duration_seconds = "7200"
    allow_cross_region_access = true
  }
]
map entry fields:
Name | Description | Type | Default | Required |
name | Short name of the IAM role to be created. Full name will be apiary-<name>-<region>. | string | - | yes |
principals | List of IAM role ARNs from other accounts that can assume this role. | list(string) | - | yes |
schema_names | List of Apiary schemas that this role can read/write. | list(string) | - | yes |
max_role_session_duration_seconds | Number of seconds that the assumed credentials are valid for. | string | "3600" | no |
allow_cross_region_access | If true, will allow this role to write these Apiary schemas in all AWS regions that these schemas exist in (in this account). If false, can only write in this region. | bool | false | no |
A list of maps. Schema names from which S3 bucket names will be derived, corresponding S3 bucket will be named as apiary_instance-aws_account-aws_region-schema_name, along with S3 storage properties like storage class and number of days for transitions.
An example entry looks like:
apiary_managed_schemas = [
  {
    schema_name = "sandbox"
    s3_lifecycle_policy_transition_period = "30"
    s3_storage_class = "INTELLIGENT_TIERING"
    s3_object_expiration_days = 60
    tags = jsonencode({ Domain = "search", ComponentInfo = "1234" })
    enable_data_events_sqs = "1"
    encryption = "aws:kms" // supported values for encryption are AES256, aws:kms
    admin_roles = "role1_arn,role2_arn" // KMS key management will be restricted to these roles.
    client_roles = "role3_arn,role4_arn" // S3 bucket read/write and KMS key usage will be restricted to these roles.
    customer_accounts = "account_id1,account_id2" // this will override module level apiary_customer_accounts
  }
]
map entry fields:
Name | Description | Type | Default | Required |
schema_name | Name of the S3 bucket. Full name will be apiary_instance-aws_account-aws_region-schema_name. | string | - | yes |
enable_data_events_sqs | If set to "1", S3 data event notifications for ObjectCreated and ObjectRemoved will be sent to an SQS queue for processing by external systems. | string | - | no |
s3_lifecycle_policy_transition_period | Number of days for transition to a different storage class using lifecycle policy. | string | "30" | no |
s3_storage_class | Destination S3 storage class for transition in the lifecycle policy. For valid values for S3 Storage classes, reference: | string | "INTELLIGENT_TIERING" | no |
s3_object_expiration_days | Number of days after which objects in Apiary managed schema buckets expire. | number | null | no |
tags | Additional tags added to the S3 data bucket. The map of tags must be encoded as a string using jsonencode (see sample above). If the var.apiary_tags collection and the tags passed to apiary_managed_schemas both contain the same tag name, the tag value passed to apiary_managed_schemas will be used. | string | null | no |
encryption | S3 objects encryption type, supported values are AES256, aws:kms. | string | null | no |
admin_roles | IAM roles configured with admin access on corresponding KMS keys, required when encryption type is aws:kms. | string | null | no |
client_roles | IAM roles configured with usage access on corresponding KMS keys, required when encryption type is aws:kms. | string | null | no |
A list of cross-account IAM role ARNs that are allowed to read all data in all Apiary managed schemas. These roles are not subject to any restrictions imposed by apiary_customer_condition.
An example entry looks like:
apiary_consumer_iamroles = [
A map of maps of lists of IAM roles. Each top-level map entry is the name of an Apiary managed schema. Each entry in that map is an S3 prefix in that schema. The value of that map entry is a list of IAM roles that have unrestricted read access to objects under that S3 prefix. These roles are not subject to any restrictions imposed by apiary_customer_condition.
An example entry looks like:
apiary_consumer_prefix_iamroles = {
sandbox = {
"prefix1/with/several/levels" = [
prefix2 = [
test = {
prefixroletest = [
"prefixroletest2" = [
A string that defines a list of conditions that restrict which objects in an Apiary schema's S3 bucket may be read cross-account by accounts in the customer_accounts list. The string is a semicolon-delimited list of comma-delimited strings that specify conditions that are valid in AWS S3 bucket policy Condition sections. This condition is applied to every Apiary schema's S3 bucket policy.
An example entry to limit access to:
- Only requests from certain VPC CIDR blocks
- And only to objects that have:
  - Either an S3 tag of data-sensitivity = false
  - Or an S3 tag of data-type = image*

looks like:
apiary_customer_condition = <<EOF
"IpAddress": {"aws:VpcSourceIp": ["",""]},
"StringEquals": {"s3:ExistingObjectTag/data-sensitivity": "false" };
"IpAddress": {"aws:VpcSourceIp": ["",""]},
"StringLike": {"s3:ExistingObjectTag/data-type": "image*" }
EOF
Each semicolon-delimited section will create a new statement entry in the bucket policy's Statement array. Each comma-delimited section will create an entry in the Condition section of the Statement entry. For the above example, the Statement and Condition entries would be:
"Statement": [
  {
    "Sid": "Apiary customer account object permissions",
    "Effect": "Allow",
    "Principal": {
      "AWS": [ ... ]
    },
    "Action": [ ... ],
    "Resource": "arn:aws:s3:::apiary-<account_num>-<region>-<schema_name>/*",
    "Condition": {
      "StringEquals": {
        "s3:ExistingObjectTag/data-sensitivity": "false"
      },
      "IpAddress": {
        "aws:VpcSourceIp": [ ... ]
      }
    }
  },
  {
    "Sid": "Apiary customer account object permissions",
    "Effect": "Allow",
    "Principal": {
      "AWS": [ ... ]
    },
    "Action": [ ... ],
    "Resource": "arn:aws:s3:::apiary-<account_num>-<region>-<schema_name>/*",
    "Condition": {
      "StringLike": {
        "s3:ExistingObjectTag/data-type": "image*"
      },
      "IpAddress": {
        "aws:VpcSourceIp": [ ... ]
      }
    }
  }
]
- Note that any IAM roles in apiary_consumer_iamroles would not be subject to the restrictions from apiary_customer_condition, and so could read any S3 object, even if the object doesn't have a data-sensitivity tag, or if the data-sensitivity tag is true, or if there is no data-type tag of image*.
- Note that any IAM roles in apiary_consumer_prefix_iamroles would not be subject to the restrictions from apiary_customer_condition for the schemas and prefixes specified in the map, and so could read any S3 object under those prefixes, even if the object doesn't have a data-sensitivity tag, or if the data-sensitivity tag is true, or if there is no data-type tag of image*.
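
To make the interaction between these variables concrete, here is a hedged sketch of module inputs that combine a customer condition with an exempt consumer role. The account ID, CIDR block, and role ARN are hypothetical placeholders; only the variable names and the condition string format come from the descriptions above.

```hcl
# Sketch only: these inputs go inside the module block for this Apiary instance.
# The account ID, CIDR block, and role ARN are made-up placeholders.

# Accounts whose cross-account reads are filtered by the condition below.
apiary_customer_accounts = ["111111111111"]

# One statement (no semicolon) containing two comma-delimited conditions.
apiary_customer_condition = <<EOF
  "IpAddress": {"aws:VpcSourceIp": ["10.0.0.0/8"]},
  "StringEquals": {"s3:ExistingObjectTag/data-sensitivity": "false" }
EOF

# This role is not subject to the condition and can read every object.
apiary_consumer_iamroles = [
  "arn:aws:iam::111111111111:role/example-unrestricted-reader",
]
```

With inputs like these, the customer account would only be able to read objects tagged data-sensitivity = false from the given CIDR block, while the listed consumer role would keep unrestricted read access.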
A list of cross-account IAM role ARNs that are allowed to read and write data in all Apiary managed schemas.
An example entry looks like:
apiary_common_producer_iamroles = [
Write access is granted by default for roles within the same AWS account. If you would like to protect the bucket so only certain roles can write, you can use deny_global_write_access and producer_roles. If you would like to protect all buckets, you can set the default variable deny_global_write_access to true. However, enabling only one bucket looks like this:
apiary_managed_schemas = [
  {
    schema_name = "sandbox"
    deny_global_write_access = true
    producer_roles = "arn:aws:iam::000000000:role/role-1,arn:aws:iam::000000000:role/role-2"
  }
]