Study Apache Iceberg ecosystems in AWS
[WIP] Study note about Apache Iceberg ecosystems in AWS.
S3 Tables
S3 Tables supports IAM-based and resource-based access control, plus automatic maintenance operations, for Iceberg tables stored in table buckets.
- S3 Tables is available in S3 table buckets.
- Unreferenced file removal is enabled for all tables by default
  - It can be configured per table bucket
- Compaction and snapshot management are enabled for all tables by default
  - They can be configured per table
- Resource mapping between S3 Tables and AWS Glue (left is the S3 Tables resource, right is the Glue resource)
  - Table bucket = Catalog
  - Namespace = Database
  - Table = Table
- Client configuration to use the AWS Glue Iceberg REST endpoint
  - SigV4 properties: SigV4 must be enabled, and the signing name is glue
  - Warehouse location: <accountid>:s3tablescatalog/<table-bucket-name>
  - Endpoint URI: refer to the AWS Glue service endpoints reference guide for the Region-specific endpoint
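
The client configuration above can be sketched as a small Python helper that assembles the catalog properties. This is only an illustration under several assumptions: the property keys follow PyIceberg's REST catalog naming (`rest.sigv4-enabled`, `rest.signing-name`, `rest.signing-region`), the account ID, bucket name, and Region are placeholders, and the endpoint URI shown is an assumed pattern that should be checked against the AWS Glue service endpoints reference. The helper names `glue_iceberg_catalog_properties` and `load_s3_tables_catalog` are hypothetical.

```python
def glue_iceberg_catalog_properties(account_id, table_bucket_name, region, endpoint_uri):
    """Build REST catalog properties for the AWS Glue Iceberg endpoint.

    Property keys are assumed to follow PyIceberg's REST catalog options.
    """
    return {
        "type": "rest",
        # Region-specific endpoint; look it up in the Glue endpoints reference.
        "uri": endpoint_uri,
        # Warehouse location: <accountid>:s3tablescatalog/<table-bucket-name>
        "warehouse": f"{account_id}:s3tablescatalog/{table_bucket_name}",
        # SigV4 must be enabled, and the signing name is glue.
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "glue",
        "rest.signing-region": region,
    }


def load_s3_tables_catalog(props):
    """Open the catalog with PyIceberg (needs pyiceberg and AWS credentials)."""
    from pyiceberg.catalog import load_catalog  # imported lazily on purpose

    return load_catalog("s3tablescatalog", **props)


# Placeholder values; the endpoint URI pattern is an assumption.
props = glue_iceberg_catalog_properties(
    account_id="111122223333",
    table_bucket_name="example-table-bucket",
    region="us-east-1",
    endpoint_uri="https://glue.us-east-1.amazonaws.com/iceberg",
)
print(props["warehouse"])
```

Building the properties separately from opening the catalog keeps the first part checkable without network access. Calling `load_s3_tables_catalog(props)` would then expose namespaces and tables, which correspond to Glue databases and tables per the mapping above.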
Quotas
- Table buckets per region in an AWS account = 10
- Namespaces in a table bucket = 10,000
- Tables in a table bucket = 10,000
Limitations
- Presigned URLs to access objects associated with a table are not supported.
- Tags are not supported for table buckets and tables. Therefore, attribute-based access control and tag-based cost allocation are unavailable.