記事

What Is Enterprise Data Warehousing (EDW)?

Learn about data warehousing architecture, benefits, and best practices.

概要

This article covers how a data warehouse stores data, the architecture underlying them, how they enable various business units to leverage and analyze information, and the benefits of data warehousing, including data consistency, stable data storage, and ease of data accessibility.

It also examines the differences among data warehouses, data lakes, and lakehouses, how they can complement each other, and key issues organizations should consider when designing and implementing a data warehouse.

What is data warehousing?

A data warehouse is a centralized repository designed to support business intelligence (BI), reporting, and advanced analytics. It consolidates data from many sources across an organization, including transactional databases, operational systems, spreadsheets, and external datasets.

Once integrated into the warehouse, this information becomes easier for analysts, engineers, and data scientists to query and analyze. Teams can generate dashboards, reports, and visualizations that reveal patterns in sales, operations, customer behavior, and other business activities. Because a data warehouse integrates and standardizes information from multiple systems, it enables organizations to analyze both historical and current data in one place.

In practice, this consolidation helps organizations reduce data silos and maintain consistent definitions for metrics and business data. As a result, many enterprises rely on a data warehouse as a shared analytical foundation for data-driven decision-making.

Key benefits of data warehousing

The table below summarizes the core advantages organizations gain from implementing a data warehouse.

Benefit	Description	Business Impact
Data consistency	Standardized data formats and definitions across systems	Ensures teams work from accurate and reliable information
Nonvolatile storage	Data stored in the warehouse remains stable after loading	Supports long-term historical analysis and auditing
Subject-oriented structure	Data organized around business domains such as finance, sales, or operations	Makes information easier for departments to access and analyze
Historical analysis	Data changes tracked over time	Enables trend analysis, forecasting, and long-term business insights
Improved performance	Analytical workloads are separated from operational systems	Prevents reporting queries from slowing transactional applications

Together, these characteristics allow organizations to deliver large volumes of historical and operational data quickly to the people and tools that need it.

How is a data warehouse architected?

Data warehouse architecture organizes data processing and analytics into layers that manage how information is collected, stored, and accessed. These layers ensure that data can be integrated from many sources, transformed into consistent formats, and delivered efficiently to analytics tools.

Data typically enters the warehouse through an extract, transform, and load (ETL) pipeline that cleans and prepares information for analysis. After transformation, the data is stored in a system optimized for analytical queries and reporting. Users and applications then interact with the warehouse through analytical engines and visualization tools.

The layered structure below shows how these components work together.

**Data warehouse architecture layers**
Architecture Layer	Function	Typical Technologies
Data ingestion and storage layer	Collects and stores data from multiple internal and external sources after ETL processing	Databases, data storage systems, SSD-based storage
Processing and analytics layer	Executes queries, performs aggregations, and manages security and optimization	OLAP engines, query optimizers, data processing engines
Access and presentation layer	Allows users and applications to analyze and visualize data	BI platforms, dashboards, SQL tools, data visualization software

This layered architecture helps organizations scale analytics workloads while maintaining performance, security, and accessibility for business users.

Data warehouse vs. data lake vs. data lakehouse

Often, organizations will implement a combination of data warehouses and data lakes to complement each other and their distinct data needs. Let’s take a look at the differences between them and how they stack up to a data lakehouse.

The primary purpose of a data warehouse is to analyze data. It can intake vast amounts of data from various sources and store it for extended periods of time. Once stored, that data is formatted so it can be readily accessed and analyzed to deliver business insights and inform better decision-making. A data warehouse is a good option for organizations looking for a structured data solution focused on business intelligence, business reporting, and data analytics.

A data lake can also store and capture data in real time from a wide range of sources, including business applications, mobile apps, internet of things (IoT) devices, and more. But the data does not need to be formatted or defined first. This is referred to as “schema on read.” As a result, data lakes can cost-effectively hold vast amounts of raw data and easily scale as an organization’s data requirements change. In addition, users can access the data in various ways without having to know special languages like SQL. Data lakes are suitable for enterprises that need a flexible, affordable data solution to drive advanced analytics and machine learning (ML) models.

A data lakehouse is a relatively new architecture that combines the cost-effectiveness and scalability of a data lake with the robust information management of a data warehouse. A lakehouse can integrate various data sources automatically without having to make them fit formats or schemas. And it can store and process data in a single platform accessible to users across the enterprise. That allows enterprises to have a “single source of truth” for their data but still leverage advanced analytics and machine learning capabilities of a data lake to maximize business value—all in one solution.

	Data warehouse	Data lake	Lakehouse
Data type	Relational and structured	Relational and non-relational; can be structured, semi-structured, and unstructured	Structured, semi-structured, and unstructured
Schema	On write	On read	On read and on write
Data quality	Consistent data that serves a single source of truth across the organization	Raw data that has not been organized or curated	A single source of truth for curated data and storage for raw data
Use cases	Core/batch reporting, data visualizations, business intelligence	Machine learning, advanced analytics, forecasting, real-time analysis	Real-time analytics, machine learning, immediate business insights, forecasting, reporting
Primary users	Business analysts	Data scientists, data, developers, engineers	Business analysts, data scientists, data, developers, engineers
Scalability	Difficult and expensive to scale	Can scale easily and cost effectively	Can automatically scale on demand
Cost	Storage and data management costs more than with a data lake	Storage and data management costs less than with a data warehouse	Storage is cost-effective and fast

Designing and implementing a data warehouse

Designing and implementing a data warehouse requires coordination between IT teams, data professionals, and business stakeholders. The goal is to build an environment that supports analytics needs today while remaining flexible enough to accommodate new data sources, analytical models, and growing data volumes.

A well-designed warehouse balances several technical and organizational considerations. These include identifying the business questions the warehouse will support, determining where data originates, structuring that data for efficient analysis, and selecting the systems required to manage and access it. Performance, governance, and operational processes must also be planned to ensure the system remains reliable as usage grows.

The following table summarizes the primary factors organizations evaluate when designing and implementing a data warehouse.

**Data warehouse design and implementation considerations**
Design Consideration	What It Involves	Why It Matters
Business requirements	Identifying analytics goals, reporting needs, and key metrics	Ensures the warehouse supports real business decisions
Data sources	Determining where data originates, including operational systems and external feeds	Enables accurate and comprehensive data integration
Data modeling	Structuring data using schemas such as star or snowflake models	Improves query performance and analytical usability
System architecture	Selecting database platforms, ETL pipelines, and analytics tools	Determines scalability, reliability, and integration capabilities
Performance optimization	Applying indexing, partitioning, and caching strategies	Maintains fast query performance as data volumes grow
Process integration	Aligning the warehouse with business workflows and decision processes	Ensures analytics outputs are actionable
Refresh and recovery	Establishing schedules for data updates, backup, and restoration	Maintains data reliability and operational resilience

By addressing these considerations early in the design process, organizations can build a data warehouse environment that supports reliable analytics, scalable growth, and long-term data strategy.

Data warehousing solutions

Your unique business requirements should be top of mind when considering a data warehouse solution for your organization. How do you intend to use it to draw meaningful insights and improve business decision-making?

It’s also important to consider:

Connectivity and advanced capabilities. Look for a data warehouse architecture that offers connectivity to various data sources within your organization so you can intake and store data seamlessly. That may require a solution with data modeling and mapping, quality assurance, ETL capabilities, connectivity to BI tools, and more. This is critical to ensuring you can load and analyze data quickly and efficiently.
Performance. The quicker a data warehouse can process and export information, the faster your users can make critical business decisions and drive value. Look for tools that improve performance like parallel processing, advanced processing power, and maximum uptime that allow you to process data more rapidly and efficiently.
Scalability. An effective data warehouse solution should also be future-ready with the ability to scale as your organization’s data needs change. That includes the ability to handle larger amounts of data over extended periods.
Cloud vs. on-premises. An on-premises data warehouse will reside on your network, which can offer maximum control over things like latency. But it can also require much more investment in both hardware and IT support personnel. Cloud solutions don’t require a physical environment for your data warehouse, so you don’t need to independently house or maintain it. That means it’s easy to deploy, more affordable, and can scale more easily.

Why choose Teradata for your data warehouse needs?

With Teradata VantageCloud, our complete cloud analytics and data platform, you can deploy a data warehouse at an enterprise level that offers flexibility, performance, and analytics to a wide range of users.

Our next-generation cloud-native architecture can seamlessly bring disparate sources of data together to create a single source of truth for your organization. It easily manages mixed workloads. And it ensures data integrity with real-time updates.

That means greater flexibility, scalability, and security for your organization so you can access business intelligence more quickly and efficiently leverage more informed decision-making.

Data warehousing FAQ

What is an example of a data warehouse?

An enterprise data warehouse (EDW) is one example of this data storage technology. EDWs help businesses improve customer insights, optimize supply chain management, and increase productivity. Walmart was one of the first large organizations to employ an EDW, which it uses to guide suppliers, inform point-of-sales analytics, and more deeply understand customer buying habits. Today’s EDW solutions commonly involve a combination of environments designed to address a range of enterprise needs. For example, business users may require access to curated, structured data that can easily be applied to business intelligence applications. The EDW may also involve data lakes that store unstructured data with the potential to be used in business applications but that are not assigned to any particular function.

What are the key components of a data warehouse?

A data warehouse generally includes:

A database that stores and manages data
An ETL process responsible for extracting data from source systems and transforming it into a format for it to be analyzed
Query and analysis tools so end users can create reports, build dashboards, and conduct data analysis
A data access layer that manages data access, security, and authentication
Data governance, security, backup, and recovery tools

What schemas are often used in data warehouses?

The two most common schemas used in data warehouses are star schemas and snowflake schemas. A star schema simplifies queries and provides robust query performance because it minimizes the effort needed to retrieve data. So, they’re commonly used in scenarios where query performance is a top priority. A snowflake schema often requires more complex query processing due to an increased number of joins needed to retrieve data, but they use less space. Therefore, they can be suitable in scenarios where storage space is a concern.