Delta 2.0 vs Iceberg 0.14.0: TPC-DS benchmark

DataBeans
Aug 1, 2022

Delta 2.0 was announced at the Data + AI Summit, where Databricks fully open-sourced Delta Lake, and Iceberg 0.14.0 was released with performance improvements for scan planning and Spark queries. Both releases have raised the community’s interest in how they affect performance. So, by popular demand, we ran the TPC-DS benchmark on Delta 2.0 and Iceberg 0.14.0.

What is TPC-DS?

TPC-DS is a data warehousing benchmark defined by the Transaction Processing Performance Council (TPC). TPC is a non-profit organization founded by the database community in the late 1980s with the goal of developing benchmarks that objectively test database system performance by simulating real-world scenarios. TPC has had a significant impact on the database industry.

“Decision support” is what the “DS” in TPC-DS stands for. There are 99 queries in total, ranging from simple aggregations to advanced pattern analysis.
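To make the idea of a query benchmark concrete, here is a minimal sketch of a timing loop in Python. It is not the harness used for this benchmark: the function name `time_queries`, the `run_query` callable, and the two placeholder queries are hypothetical, chosen only for illustration. With Spark, `run_query` could be something like `lambda q: spark.sql(q).collect()`.

```python
import time
from typing import Callable, Dict


def time_queries(queries: Dict[str, str],
                 run_query: Callable[[str], object]) -> Dict[str, float]:
    """Run each query once and return wall-clock seconds keyed by query name."""
    timings = {}
    for name, sql in queries.items():
        start = time.perf_counter()
        run_query(sql)  # hypothetical runner, e.g. lambda q: spark.sql(q).collect()
        timings[name] = time.perf_counter() - start
    return timings


# Placeholder queries standing in for the 99 TPC-DS queries (not real TPC-DS SQL).
queries = {"q1": "SELECT 1", "q2": "SELECT 2"}

# A no-op runner keeps the sketch runnable without a Spark session.
timings = time_queries(queries, run_query=lambda sql: None)

# The total query time is the sum of the per-query timings.
total_hours = sum(timings.values()) / 3600
```

Summing the per-query timings gives a total query time that can be compared between table formats.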

Environment setup:

In this benchmark we used Iceberg 0.14.0 and Delta 2.0 with the environment components listed in the table below:

Benchmark results:

1. Overall performance

Delta was 1.4X faster than Iceberg overall. Delta took 1.78 hours to load the data and run the queries, while Iceberg took 2.5 hours, a difference of 43 minutes.[chart-1]

Chart-1: load and query performance
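For readers who want to check the arithmetic, the overall speedup and time gap follow directly from the two totals reported above. The short Python sketch below uses the rounded hours from the text, so the results are approximate; the helper names `speedup` and `gap_minutes` are ours.

```python
def speedup(slower_hours: float, faster_hours: float) -> float:
    """How many times faster the quicker run is, as a ratio of durations."""
    return slower_hours / faster_hours


def gap_minutes(slower_hours: float, faster_hours: float) -> float:
    """Difference between two durations, converted from hours to minutes."""
    return (slower_hours - faster_hours) * 60


# Rounded totals reported in the article (load + queries), in hours.
delta_total_h = 1.78
iceberg_total_h = 2.5

overall_speedup = speedup(iceberg_total_h, delta_total_h)
overall_gap_min = gap_minutes(iceberg_total_h, delta_total_h)

print(f"{overall_speedup:.2f}X faster, {overall_gap_min:.0f} minutes apart")
```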

2. Load performance

Delta was 1.1X faster than Iceberg in load performance.[chart-2]

Delta’s load performance remained the same compared to the 1.2.0 version.

The same applies to Iceberg; load performance remained the same compared to its 0.13.1 version.

Chart-2: load performance

3. Query performance

Delta was 1.57X faster than Iceberg in query performance.[chart-3]

Delta took 1.13 hours to run the queries, while Iceberg took 1.79 hours, a difference of 39 minutes.

Delta 2.0 query performance remained the same compared to the 1.2.0 version.

Iceberg 0.14.0’s query performance improved by 20% compared to the 0.13.1 version.

Chart-3: query performance
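The load and query figures can be cross-checked against the overall totals. The Python sketch below assumes that the overall time is simply load time plus query time, which the article implies but does not state explicitly. It works from the rounded hours in the text, so the query ratio comes out near 1.58 rather than the reported 1.57X, presumably because the reported ratio was computed from unrounded timings.

```python
# Rounded figures reported in the article, in hours.
delta = {"total": 1.78, "query": 1.13}
iceberg = {"total": 2.5, "query": 1.79}

# Assumption: total = load + query, so load time is the remainder.
delta_load_h = delta["total"] - delta["query"]
iceberg_load_h = iceberg["total"] - iceberg["query"]

# Speedup is the ratio of the slower duration to the faster one.
load_speedup = iceberg_load_h / delta_load_h
query_speedup = iceberg["query"] / delta["query"]

# Query time gap, converted from hours to minutes.
query_gap_min = (iceberg["query"] - delta["query"]) * 60

print(f"load: {load_speedup:.2f}X, query: {query_speedup:.2f}X, "
      f"query gap: {query_gap_min:.1f} min")
```

Under that assumption, the derived load speedup is consistent with the reported 1.1X.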

Conclusion:

To further analyse and extract your own insights from this benchmark, you can download the full benchmark reports here.
