
What is a Data Lakehouse? 

A lakehouse is a new, open architecture that combines the best elements of data lakes and data warehouses.

Written By
David Curry
Aug 5, 2022

A data lakehouse might be the next step in data storage and processing, combining the best of data warehouse and data lake architecture into a new system that is built for the next decade of technological development. 

When Pentaho CTO James Dixon first introduced data lakes, experts in the field were split. Some saw potential value in lakes as a fix for some of the issues with standard data warehouse solutions; others saw what appeared to be simply a marketing term for a set of products built around the Hadoop system.

Some also took issue with the potential for data silos, caused by a data lake's ability to store and process all types of data, whether structured, semi-structured, or unstructured. That concern was warranted, with an entire industry springing up over the last decade to accommodate the huge influx of unstructured data.

Data lakes have improved in value and sophistication over the past few years, which some consider a comeback for the architecture. Others believe data lakes have evolved into what Databricks and Snowflake both claim to have coined: the data lakehouse.

See also: How the Data ‘Lakehouse’ Might Usurp the Warehouse and the Lake

“The lakehouse is a new data management architecture that radically simplifies enterprise data infrastructure and accelerates innovation in an age when machine learning is poised to disrupt every industry,” said Ali Ghodsi, CEO of Databricks. “In the past most of the data that went into a company’s products or decision making was structured data from operational systems, whereas today, many products incorporate AI in the form of computer vision and speech models, text mining, and others. Why use a lakehouse instead of a data lake for AI? A lakehouse gives you data versioning, governance, security and ACID properties that are needed even for unstructured data.”

Databricks’ overview of the topic illustrates how data lakehouse architecture embeds a metadata and governance layer during data processing. This means that data from a diverse set of sources can be processed and stored in a single system, which improves accessibility for everyone in an organization.
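To make the idea of a metadata layer concrete, here is a minimal sketch in Python. It is not Databricks' implementation or any real table format: the `ToyTable` class, its `commit` and `read` methods, and the file names are all hypothetical. It only illustrates the general pattern of keeping data files immutable and publishing each change as a new entry in a metadata log, which is one way a system can offer the data versioning mentioned in the quote above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical table: immutable data files plus a metadata log of versions."""
    files: dict = field(default_factory=dict)  # file name -> list of row dicts
    log: list = field(default_factory=list)    # entry i = tuple of file names visible at version i

    def commit(self, file_name, rows):
        """Store a new immutable file, then publish it with a single log append."""
        if file_name in self.files:
            raise ValueError("data files are immutable in this sketch")
        self.files[file_name] = list(rows)
        previous = self.log[-1] if self.log else ()
        self.log.append(previous + (file_name,))
        return len(self.log) - 1  # version number of the new snapshot

    def read(self, version=None):
        """Return the rows visible at a given version (latest when version is None)."""
        if not self.log:
            return []
        snapshot = self.log[-1] if version is None else self.log[version]
        return [row for name in snapshot for row in self.files[name]]


table = ToyTable()
v0 = table.commit("part-0", [{"id": 1, "kind": "image"}])
v1 = table.commit("part-1", [{"id": 2, "kind": "text"}])

latest_rows = table.read()      # both rows
earlier_rows = table.read(v0)   # only the row from the first commit
```

Because readers resolve a version through the log rather than by listing files, an older snapshot stays readable after later commits. Real systems add concurrency control, schema tracking, and durable storage, none of which this sketch attempts.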

Accessibility is important, as limited access was one of the key issues with previous-generation data storage and processing solutions. With a data lakehouse, different departments in an organization can get access to datasets without having to go through the engineering department, which can improve productivity and enable deeper analysis of the data.

Another benefit of the data lakehouse is additional security, as organizations can limit access to data without worrying about additional copies being made. This level of control, down to the column or row level, is very difficult to achieve once data is offloaded to a data warehouse or stored in multiple places.
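The column- and row-level control described above can be sketched as a policy applied at read time against a single copy of the data. The example below is illustrative only: the `Policy` class, the role names, the sample rows, and the `read_as` function are assumptions made for this sketch, not the API of any lakehouse product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy for one role."""
    columns: frozenset                  # columns the role may see
    row_filter: Callable[[dict], bool]  # predicate selecting the rows the role may see


# A single shared copy of the data (made-up sample rows).
ROWS = [
    {"customer": "a", "region": "EU", "revenue": 100},
    {"customer": "b", "region": "US", "revenue": 250},
]

# Hypothetical roles: one restricted by both row and column, one unrestricted.
POLICIES = {
    "eu_analyst": Policy(frozenset({"customer", "region"}), lambda r: r["region"] == "EU"),
    "finance": Policy(frozenset({"customer", "region", "revenue"}), lambda r: True),
}


def read_as(role, rows=ROWS):
    """Apply the role's row filter, then project only the permitted columns."""
    policy = POLICIES[role]
    return [
        {key: value for key, value in row.items() if key in policy.columns}
        for row in rows
        if policy.row_filter(row)
    ]


eu_view = read_as("eu_analyst")
finance_view = read_as("finance")
```

Both roles read from the same `ROWS`; nothing is copied into a per-team extract, so there is only one place where the policy has to be enforced.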

With an open, unified system, organizations can also connect third-party analytics, visualization, and other tools directly to the data source, which lets businesses see analysis and visualizations as close to real time as possible.


David is a technology writer with several years' experience covering all aspects of IoT, from technology to networks to security.

