Apache Iceberg Quickly Becoming Large-Scale Analytics Format Of Choice - CDInsights

Apache Iceberg enables multiple applications to process the same dataset and to understand the metadata inside each table.

Written By
David Curry
Jan 10, 2023

Apache Iceberg, which was born out of the love/hate relationship many Netflix engineers had with the data warehouse software Apache Hive, has in the space of five years become the go-to choice for developers working on large-scale analytics tables.

In 2015, Netflix engineers Ryan Blue and Daniel Weeks began work on Iceberg as a solution to many of the issues developers had with Apache Hive, which was heavily integrated into Netflix's infrastructure. The problems had become so commonplace that engineers routinely avoided Hive services and input data manually instead, which slowed productivity.

With Iceberg, Netflix aimed to ensure the correctness and validity of data transactions regardless of errors, power failures, and other issues that can occur at the processing stage. It also wanted to improve table performance through finer-grained operations, allowing analysis to be done at the file level, and to simplify the operation and maintenance of tables.
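One common way to get this kind of transactional guarantee is to treat every table version as an immutable snapshot listing its data files, and to publish a new version with a single atomic pointer swap. The Python sketch below is a toy model of that idea only. The names `Snapshot`, `ToyCatalog`, and `append_files` are hypothetical and are not part of Iceberg's API, and the real format's metadata and commit protocol are considerably more involved.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Snapshot:
    """An immutable table version: the exact set of data files it contains."""
    snapshot_id: int
    files: FrozenSet[str]


class ToyCatalog:
    """Hypothetical catalog holding one 'current snapshot' pointer for a table.

    Assumption: this toy is single-threaded. A real catalog would make the
    compare-and-swap in commit() atomic across processes.
    """

    def __init__(self) -> None:
        self._current = Snapshot(0, frozenset())

    def current(self) -> Snapshot:
        return self._current

    def commit(self, base: Snapshot, new: Snapshot) -> bool:
        # Succeed only if nobody else committed since `base` was read.
        if self._current.snapshot_id != base.snapshot_id:
            return False
        self._current = new
        return True


def append_files(catalog: ToyCatalog, new_files: Iterable[str],
                 max_retries: int = 3) -> Snapshot:
    """Add files by building a new snapshot, retrying if the commit loses a race."""
    for _ in range(max_retries):
        base = catalog.current()
        candidate = Snapshot(base.snapshot_id + 1,
                             base.files | frozenset(new_files))
        if catalog.commit(base, candidate):
            return candidate
    raise RuntimeError("commit failed after retries")


catalog = ToyCatalog()
reader_view = catalog.current()            # a reader pins the empty version
append_files(catalog, ["data/a.parquet"])
append_files(catalog, ["data/b.parquet"])
print(sorted(catalog.current().files))     # files visible to new readers
print(len(reader_view.files))              # the pinned reader is unaffected
```

Because a writer that fails partway never swaps the pointer, readers in this model see either the old version or the new one, never a half-written table, and each version tracks its contents at the file level.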

The team succeeded, and Netflix shifted much of its operations to Iceberg. A year after publishing Iceberg, the team donated the project to the Apache Software Foundation and launched its own data automation platform for data warehouse storage, called Tabular.

In the years since Iceberg was donated to Apache, it has been adopted by a long list of major tech companies, including Adobe, Airbnb, Apple, Google, LinkedIn, Snapchat, and Snowflake. Many of them are prioritizing Apache Iceberg over other formats. Google, for example, has heard from many of its Cloud customers that Iceberg should take priority over Databricks' Delta and Hudi, two alternative big data analytics formats.

Google has retained the availability of all three formats for the time being, although Sudhir Hasbe, a senior director of product management at Google Cloud, confirmed to The Register that Apache Iceberg was becoming the “primary format”. Cloudera and Snowflake have also announced support for Iceberg in the past two years, with signs that they may move away from other formats in the future.

There are plenty of benefits to Apache Iceberg. Many developers of analytics applications value its vendor- and platform-agnostic approach, backed by the Apache Software Foundation, over Databricks' Delta and other formats, which are not as widely supported. As for the format itself, Apache Iceberg enables multiple applications to process the same dataset and to understand the metadata inside each table, and updates to massive data lake tables are processed at a much faster rate than with other formats. Apache Iceberg also improves data management and reliability, with better identification and resolution of issues inside the tables.

David is a technology writer with several years' experience covering all aspects of IoT, from technology to networks to security.
