Modern companies know that analytics needs to be at the heart of their decision-making and business strategy. That’s why many have invested in pathways to access vast amounts of data. But now they are encountering a new challenge: more than 70% of the data generated today is unstructured and is not easy to manage, find, or analyze. This challenge raises the question: What can be done to make this data meaningful to the business?
Unfortunately, a “one size fits all” approach to data architecture and management doesn’t exist. Below are the approaches that have historically been used, followed by the lesser-known, emerging model of data mesh.
Breaking Down the Role of Traditional Data Architecture
Data Lakes: A data lake is a giant, central storage repository that holds a vast amount of raw data in its native format until it is needed. Data lakes help data scientists and analysts who are tasked with determining whether raw data from an organization’s datasets can be turned into actionable insights. Through its flat architecture, the data lake provides more flexibility, storage, and usage at a lower cost, so that the data from the business systems can be replicated into a single repository.
Data lakes can be beneficial for industries like oil and gas that accumulate large, complex data sets. On average, an oil company generates 1.5 terabytes of Internet of Things data daily. By leveraging data lakes for exploration, this industry can optimize directional drilling, lower operating expenses, improve safety, and stay compliant with regulatory requirements.
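To make the "raw data in its native format" idea concrete, here is a minimal sketch in Python. It assumes a local temporary directory stands in for the lake; the function names (`land_raw`, `read_sensor_json`), the file-naming scheme, and the sensor values are all hypothetical and invented for illustration. The point it shows is that data lands untouched in a flat layout, and a schema is applied only when someone reads it.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_dir: Path, source: str, name: str, payload: bytes) -> Path:
    """Store a payload untouched, in its native format, under a flat key."""
    # Flat layout: one directory, with the source system encoded in the file name.
    target = lake_dir / f"{source}__{name}"
    target.write_bytes(payload)
    return target


def read_sensor_json(path: Path) -> list:
    """Apply a schema only at read time (one JSON object per line)."""
    return [json.loads(line) for line in path.read_text().splitlines() if line.strip()]


# Hypothetical drilling-sensor data, invented purely for illustration.
raw_json = b'{"well": "A-1", "pressure_psi": 3120}\n{"well": "A-1", "pressure_psi": 3185}\n'
raw_csv = b"well,depth_ft\nA-1,10450\n"

lake = Path(tempfile.mkdtemp(prefix="lake_"))
json_path = land_raw(lake, "iot", "pressure.jsonl", raw_json)
csv_path = land_raw(lake, "drilling", "depth.csv", raw_csv)

# Exploration happens later, against the raw files.
readings = read_sensor_json(json_path)
max_pressure = max(r["pressure_psi"] for r in readings)
stored_names = sorted(p.name for p in lake.iterdir())
print(stored_names)
print(max_pressure)
```

A real lake would more likely sit on object storage than on a local disk, but the division of labor is the same: ingestion is cheap and format-agnostic, and interpretation is deferred to the analyst.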
Data Warehouses: Once data scientists or analysts find value in the various datasets within these data lakes, that refined data and intelligence can be brought into a data warehouse. A data warehouse is a structured storage architecture used to hold cleansed and transformed data from various sources for historical reporting and large-scale decision support.
Data warehouses tend to be quite large and central to business success. They require significant engineering and operational effort to build and maintain. They can be hosted on on-premises systems, in cloud deployments, or through data-warehouse-as-a-service offerings.
The refined data and intelligence housed in a data warehouse are commonly aggregated and shaped to be more “business-friendly” to inform better reuse and decision-making for enterprises. This approach can be helpful for organizations that need to make repeatable business decisions and drive operational efficiencies.
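As a rough illustration of "cleansed, transformed, and aggregated," the sketch below uses Python's built-in `sqlite3` module as a stand-in for a warehouse. The table name `fact_sales`, the cleansing rules, and the sample rows are assumptions made up for this example, not a description of any particular product or company.

```python
import sqlite3

# Hypothetical raw sales rows (store, region, amount); values are invented.
raw_rows = [
    ("store-001", " us ", "120.50"),
    ("store-002", "CA", "80.00"),
    ("store-001", "US", "not-a-number"),  # bad record, dropped during cleansing
    ("store-003", "ca", "40.25"),
]


def cleanse(rows):
    """Normalize region codes and drop rows whose amount is not numeric."""
    for store, region, amount in rows:
        try:
            yield store, region.strip().upper(), float(amount)
        except ValueError:
            continue


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (store TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", cleanse(raw_rows))

# Aggregate into a business-friendly shape: revenue per region.
revenue_by_region = dict(
    conn.execute(
        "SELECT region, ROUND(SUM(amount), 2) FROM fact_sales GROUP BY region"
    ).fetchall()
)
print(revenue_by_region)
```

The structure is imposed before the data is stored, which is what makes repeated reporting against it cheap and consistent.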
For instance, Walmart used its data warehouses to test inventory management methods in its U.S. and Canadian stores. As a result, Walmart could make more informed decisions to open new locations in Canada and close certain U.S. stores in an effort to accommodate its customers’ needs.
Operational Data Store (ODS): Because the data warehouse is massive and has many moving parts, it can be difficult to update that data frequently to support fast-moving decisions.
An ODS integrates and transforms the minimal cross-system data required to provide real-time decision support. This separates the high-compute transformations for fast intelligence from the large, regular needs of the data warehouse. Data, decisions, and alerts from the ODS are often moved into the data warehouse or data lake for archival use.
Any organization that makes minute-to-minute decisions, such as in patient care, manufacturing lines, or energy management, could benefit from an ODS approach.
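The ODS pattern can be sketched as a small in-memory store that keeps only the latest values needed for one decision. Everything here is hypothetical: the class name `OperationalStore`, the two source systems ("sensors" and "scheduling"), and the alert thresholds are assumptions chosen to illustrate the idea, not a reference design.

```python
from dataclasses import dataclass, field


@dataclass
class OperationalStore:
    """Holds only the latest values needed for one decision, keyed by line."""
    latest: dict = field(default_factory=dict)
    alerts: list = field(default_factory=list)  # later archived to warehouse or lake

    def ingest(self, system: str, line_id: str, value: float) -> None:
        # Merge the newest reading from each source system into one record.
        record = self.latest.setdefault(line_id, {})
        record[system] = value
        self._decide(line_id, record)

    def _decide(self, line_id: str, record: dict) -> None:
        # Hypothetical rule: alert when temperature is high while throughput is low.
        temp = record.get("sensors")
        rate = record.get("scheduling")
        if temp is not None and rate is not None and temp > 90.0 and rate < 50.0:
            self.alerts.append((line_id, temp, rate))


ods = OperationalStore()
ods.ingest("sensors", "line-7", 95.0)     # only one system has reported so far
ods.ingest("scheduling", "line-7", 42.0)  # both values present, so the rule is evaluated
ods.ingest("sensors", "line-8", 70.0)
print(ods.alerts)
```

Because the store tracks a handful of fields rather than full history, each update is cheap enough to evaluate on arrival; the accumulated alerts are what would later be moved into the warehouse or lake.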
How Data Mesh Fits
Alongside these three classic modes of data storage, the data mesh concept is a growing architectural design principle that allows data scientists and analysts to examine data anywhere in a system: across the data lake, data warehouse, and ODS, as well as source systems. This creates a “virtual data hub” that enables robust, enterprise-wide exploration without the cost and overhead of replicating data of unknown value into the data lake.
Once value is discovered, engineering effort is used to pipe the data and intelligence into the data lake, data warehouse, and/or operational data store for consumption.
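One way to picture the "virtual data hub" described above is a thin registry that queries each system in place rather than copying its data. The sketch below is a simplified assumption of how that might look; the class `VirtualDataHub`, its methods, and the sample rows are hypothetical, and production implementations typically rely on a federated query engine instead of plain Python callables.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = dict
Source = Callable[[], Iterable[Row]]


class VirtualDataHub:
    """Registers data sources and queries them in place, without copying."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def query(self, predicate: Callable[[Row], bool]) -> Iterator[Row]:
        # Rows are pulled lazily from each system; nothing is replicated.
        for name, source in self._sources.items():
            for row in source():
                if predicate(row):
                    yield {"source": name, **row}


# Hypothetical stand-ins for a lake, a warehouse, and an ODS.
hub = VirtualDataHub()
hub.register("lake", lambda: [{"customer": "c1", "event": "click"}])
hub.register("warehouse", lambda: [{"customer": "c1", "lifetime_value": 900},
                                   {"customer": "c2", "lifetime_value": 150}])
hub.register("ods", lambda: [{"customer": "c2", "open_ticket": True}])

matches = list(hub.query(lambda row: row["customer"] == "c1"))
print([m["source"] for m in matches])
```

Only after an exploration like this turns up something valuable would a team invest in a permanent pipeline into the lake, warehouse, or ODS.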
Numerous industries can benefit from data mesh. Many financial services companies are grappling with how to modernize outdated technology. The systems they typically use have been in place for 50+ years, and any attempt at updating them slows down system processes and incurs risk. Because a data mesh lets analysts examine data in those source systems where it already sits, it can surface value without requiring that overhaul first.
Healthcare is another industry that greatly benefits from the data mesh concept. It allows providers to maintain security postures tailored to HIPAA while also using patient data to improve the care experience and overall outcomes.
Key Considerations when Migrating to Data Management Models
While the value proposition for any of these data management models is compelling, it’s important to recognize that organizations can face several challenges with migration.
These infrastructures require a sizable investment (think large on-prem appliances like Netezza). Additionally, the labor and skills required to maintain a system are very different from those needed to build it, so organizations must adapt accordingly.
In order to lay the foundation for well-governed data management, companies must:
- Understand the long-term strategy: If a company knows what long-term analytics success looks like, it’s possible to bring data into the decision-making process and provide data scientists and analysts with the tools they need to be successful.
- Be nimble and flexible: Ideally, organizations should put flexibility front and center when migrating to a data management solution. This way, the data architect’s designs will meet the organization’s needs today and grow with it based on future needs.
- Invest in the right people and skillsets: Because some of these data architectures are relatively new, there is often a knowledge gap. While the data warehouse has been around for decades, newer approaches like the data lake or data mesh are often less understood. Business and technology leaders must be in lockstep when it comes to making critical investments and having the right expertise in place to drive these solutions. In today’s talent-constrained environment, bringing in a third-party solution provider can often make sense for both implementation and day-to-day management.
There’s a business case to be made for investing in and implementing a well-governed, measured approach to data management. Taking a proactive and strategic approach to data management can save time, money, and resources while unlocking even more powerful insights that lead to business outcomes.
Ken Seier is chief architect of data and artificial intelligence at Insight Enterprises, a Fortune 500 solutions integrator helping organizations accelerate their digital journey to modernize their business and maximize the value of technology. Ken and his team have been responsible for billions of dollars of revenue and savings through responsible analytics initiatives and innovation.