Delta Lake

Protocol

Integrations / ecosystem

Observations / thoughts / questions

Links


Iceberg

Update, June 4th, 2024: Databricks acquired Tabular. Delta Lake and Iceberg will probably converge gradually in the near future.

Spec

Integrations / ecosystem

Observations / thoughts / questions

Atomic data commit

From "File System Operations" in the spec:

Tables do not require rename, except for tables that use atomic rename to implement the commit operation for new metadata files.
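To make the rename-based commit concrete, here is a minimal sketch on a local file system. It is an illustration under assumptions, not Iceberg's implementation: the function name `commit_metadata` and the `v<N>.metadata.json` naming are illustrative, and `os.link` stands in for a rename that fails when the destination already exists (plain `os.rename` silently overwrites on POSIX, which would let two writers both "win").

```python
import json
import os
import tempfile


def commit_metadata(table_dir: str, version: int, metadata: dict) -> bool:
    """Try to publish `metadata` as version `version`.

    Returns True if this writer won the commit, False if another writer
    already published the same version. (Hypothetical helper for illustration.)
    """
    final_path = os.path.join(table_dir, f"v{version}.metadata.json")

    # Write the complete file under a temporary name first, so readers
    # never observe a partially written metadata file.
    fd, tmp_path = tempfile.mkstemp(dir=table_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(metadata, f)

    try:
        # Assumption: a hard link models "rename without overwrite".
        # It raises FileExistsError if final_path already exists.
        os.link(tmp_path, final_path)
        return True
    except FileExistsError:
        return False
    finally:
        # The temporary name is no longer needed either way.
        os.remove(tmp_path)
```

A writer that gets `False` back would normally re-read the latest metadata, reapply its changes on top, and try again with the next version number.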

From "Metastore Tables" in the spec:

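The atomic swap needed to commit new versions of table metadata can be implemented by storing a pointer in a metastore or database that is updated with a check-and-put operation.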
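The atomic swap for metastore tables can be pictured as a compare-and-set on a single "current metadata" pointer. The sketch below is only a model under assumptions: `PointerStore`, `check_and_put`, and `commit` are hypothetical names, and an in-memory dict guarded by a lock stands in for a real metastore or database.

```python
import threading


class PointerStore:
    """In-memory stand-in for a metastore holding one metadata pointer per table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pointers = {}

    def get(self, table):
        with self._lock:
            return self._pointers.get(table)

    def check_and_put(self, table, expected, new):
        """Swap the pointer to `new` only if it still equals `expected`."""
        with self._lock:
            if self._pointers.get(table) != expected:
                return False  # another writer committed first
            self._pointers[table] = new
            return True


def commit(store, table, make_new_location, max_retries=3):
    """Optimistic commit loop: read the base pointer, build new metadata, swap."""
    for _ in range(max_retries):
        base = store.get(table)
        new = make_new_location(base)  # caller writes the new metadata file
        if store.check_and_put(table, base, new):
            return new
    raise RuntimeError("commit failed after retries")
```

The point of the design is that writers never lock the table: each one prepares its new metadata independently, and only the final pointer swap is serialized.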

Delete format

Maintenance


Hudi

Spec

Integrations / ecosystem

Observations / thoughts / questions

Concurrency Control

Hudi implements a file level, log based concurrency control protocol on the Hudi timeline, which in turn relies on bare minimum atomic puts to cloud storage. (from: Lakehouse Concurrency Control: Are we too optimistic?)

Hudi guarantees that the actions performed on the timeline are atomic & timeline consistent based on the instant time. Atomicity is achieved by relying on the atomic puts to the underlying storage to move the write operations through various states in the timeline. (from: Timeline)
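A toy model of how atomic puts can move an action through states on the timeline. This is a sketch under assumptions, not Hudi's code: the `Timeline` class, the `<instant>.<action>.<state>` marker naming, and the three state names are illustrative, and a Python set stands in for cloud storage where each put either fully appears or does not appear at all.

```python
class Timeline:
    """Toy timeline: each state change is one atomic put of a marker object."""

    STATES = ("requested", "inflight", "completed")

    def __init__(self):
        self._objects = set()  # stand-in for object names in cloud storage

    def _put_if_absent(self, name):
        # Assumption: storage offers an atomic create that fails if the name exists.
        if name in self._objects:
            return False
        self._objects.add(name)
        return True

    def transition(self, instant, action, state):
        """Move (instant, action) into `state`; the previous state must exist."""
        idx = self.STATES.index(state)
        if idx > 0:
            previous = f"{instant}.{action}.{self.STATES[idx - 1]}"
            if previous not in self._objects:
                raise ValueError(f"cannot enter {state} before {self.STATES[idx - 1]}")
        return self._put_if_absent(f"{instant}.{action}.{state}")

    def state_of(self, instant, action):
        """Return the latest state reached, or None if the action is unknown."""
        latest = None
        for state in self.STATES:
            if f"{instant}.{action}.{state}" in self._objects:
                latest = state
        return latest

    def completed(self):
        """Completed instants ordered by instant time (timeline order)."""
        suffix = ".completed"
        done = [n.split(".")[0] for n in self._objects if n.endswith(suffix)]
        return sorted(done)
```

Readers that only trust actions with a completed marker see either the whole write or none of it, which is the atomicity property described above; ordering by instant time gives the timeline-consistent view.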


Kudu

Schema design

Integrations / ecosystem

Tightly integrated with Impala. Also integrates with NiFi and Spark.

Observations / thoughts / questions


Links