Apache Iceberg Quickly Becoming Large-Scale Analytics Format Of Choice

Apache Iceberg enables multiple applications to process the same dataset and to understand the metadata inside each table.

Written By
David Curry
Jan 10, 2023

Apache Iceberg, which was born out of a love/hate relationship that many Netflix engineers had with data warehouse software Apache Hive, has in the space of five years become the go-to choice for developers working on large-scale analytics tables. 

In 2015, Netflix engineers Ryan Blue and Daniel Weeks began work on Iceberg as a solution to many of the issues developers had with Apache Hive, which was heavily integrated into Netflix infrastructure. The problems had become so commonplace that engineers routinely avoided Hive services and entered data manually instead, which slowed productivity.

With Iceberg, Netflix aimed to ensure the correctness and validity of data transactions, regardless of errors, power failures, and other issues that can occur at the processing stage. The company also wanted to improve table performance through finer-grained operations that allow analysis to be done at the file level, and to simplify the operation and maintenance of tables.

The team succeeded in this task, with Netflix shifting much of its operations to Iceberg. A year after publishing Iceberg, the team donated the project to the Apache Software Foundation, and launched its own data automation platform for data warehouse storage, called Tabular.

In the years since its donation to Apache, Iceberg has been adopted by a long list of major tech companies, including Adobe, Airbnb, Apple, Google, LinkedIn, Snapchat, and Snowflake. Many of them are prioritizing Apache Iceberg over other formats. Google, for example, has heard from many of its Cloud customers that Iceberg should take priority over Databricks Delta and Hudi, two alternative big data analytics formats.

Google has retained the availability of all three formats for the time being, although Sudhir Hasbe, a senior director of product management at Google Cloud, confirmed to The Register that Apache Iceberg was becoming the “primary format”. Cloudera and Snowflake also announced support for Iceberg in the past two years, with signs of moving away from other formats in the future.

There are plenty of benefits to Apache Iceberg. Many developers of analytics applications cite its vendor- and platform-agnostic approach, which comes with being an Apache Software Foundation project, as an advantage over Databricks Delta and other formats that are not as widely supported. As for the format itself, Apache Iceberg enables multiple applications to process the same dataset and to understand the metadata inside each table, and updates to massive data lake tables are processed much faster than with other formats. Apache Iceberg also improves data management and reliability, with better identification and resolution of issues inside the tables.

David is a technology writer with several years' experience covering all aspects of IoT, from technology to networks to security.
