Explain components of Data Warehouse Architecture.
Components of Data Warehouse Architecture
A typical data warehouse has four main components: a central database, ETL (extract, transform, and load) tools, metadata, and access tools. All of these components are engineered for speed so that we can get results quickly and analyze data on the fly.
Figure: Components of data warehouse
The figure shows the essential elements of a typical warehouse. The ETL component is shown on the left. The data staging element serves as the next building block. In the middle is the data storage component, which holds the data warehouse's data. This element not only stores and manages the data; it also keeps track of the data using the metadata repository. The information delivery component, shown on the right, consists of all the different ways of making the information in the data warehouse available to users. The four major components of a data warehouse are listed below:
1. Central database
A database serves as the foundation of your data warehouse. Traditionally, these have been standard relational databases running on-premise or in the cloud. But because of Big Data, the need for true real-time performance, and a drastic reduction in the cost of RAM, in-memory databases are rapidly gaining in popularity.
2. Data integration
Data is pulled from source systems and modified to align the information for rapid analytical consumption. This is done with a variety of data integration approaches, such as ETL (extract, transform, load) and ELT, as well as real-time data replication, bulk-load processing, data transformation, and data quality and enrichment services.
3. Metadata
Metadata is data about your data. It specifies the source, usage, values, and other features of the datasets in your data warehouse. There is business metadata, which adds context to your data, and technical metadata, which describes how to access the data, including where it resides and how it is structured.
4. Data warehouse access tools
Access tools allow users to interact with the data in your data warehouse. Examples of access tools include query and reporting tools, application development tools, data mining tools, and OLAP tools.
OR IN LONG,
A data warehouse (DWH) design consists of six main components:
1. Data Warehouse Database
The central component of a data warehousing architecture is a database that stores all enterprise data and makes it manageable for reporting. This means you need to choose which kind of database you’ll use to store data in your warehouse.
The following are the four database types that you can use:
• Typical relational databases are the row-oriented databases you probably use every day. Examples include Microsoft SQL Server, SAP, Oracle, and IBM DB2.
• Analytics databases are developed specifically for data storage that sustains and manages analytics. Examples include Teradata and Greenplum.
• Data warehouse appliances aren’t exactly a kind of storage database, but several vendors now offer appliances that combine software for data management with hardware for storing data. Examples include SAP Hana, Oracle Exadata, and IBM Netezza.
• Cloud-based databases are hosted and accessed in the cloud, so you don’t have to procure any hardware to set up your data warehouse. Examples include Amazon Redshift, Google BigQuery, and Microsoft Azure SQL.
2. Extraction, Transformation, and Loading Tools (ETL)
ETL tools are central components of an enterprise data warehouse architecture. These tools help with extracting data from different sources, transforming it into a suitable arrangement, and loading it into a data warehouse.
The ETL tool you choose will determine:
• The time expended in data extraction
• Approaches to extracting data
• The kinds of transformations applied and how easy they are to apply
• Business rule definition for data validation and cleansing to improve end-product analytics
• How missing data is filled in
• How information is distributed from the central repository to your BI applications
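The extract, transform, and load steps above can be illustrated with a minimal sketch. It is not how any particular ETL tool works; the source rows, the `fact_orders` table, the rule that rows without a customer are rejected, and the default used for missing amounts are all assumptions made for the example. An in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows; in practice these would come from operational systems.
SOURCE_ROWS = [
    {"order_id": 1, "customer": " Alice ", "amount": "120.50", "region": "north"},
    {"order_id": 2, "customer": "Bob", "amount": None, "region": "SOUTH"},
    {"order_id": 3, "customer": "", "amount": "75.00", "region": "east"},
]


def extract():
    """Extract: return the raw rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows, default_amount=0.0):
    """Transform: validate, cleanse, and fill in missing values."""
    clean = []
    for row in rows:
        customer = row["customer"].strip()
        if not customer:
            # Assumed business rule: rows without a customer are rejected.
            continue
        # Assumed rule: a missing amount is filled with a default value.
        amount = float(row["amount"]) if row["amount"] is not None else default_amount
        clean.append((row["order_id"], customer, amount, row["region"].lower()))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps against an in-memory SQLite 'warehouse'."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn


if __name__ == "__main__":
    warehouse = run_etl()
    for record in warehouse.execute("SELECT * FROM fact_orders ORDER BY order_id"):
        print(record)
```

Keeping each step in its own function mirrors the way ETL tools separate extraction approaches, transformation rules, and load targets, so one step can change without touching the others.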
3. Metadata
In the data warehouse architecture, metadata describes the data warehouse and offers a framework for data. It helps in constructing, preserving, handling, and making use of the data warehouse.
It can be categorized into two types:
• Technical Metadata, which comprises information that can be used by developers and managers when executing warehouse development and administration tasks.
• Business Metadata, which comprises information that offers an easily understandable view of the data stored in the warehouse.
Metadata plays an important role in helping business and technical teams understand the data in the warehouse and convert it into information.
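To make the two metadata types concrete, here is a small sketch of a metadata catalog entry. The class names, fields, and the single `fact_orders.amount` entry are hypothetical and chosen only for illustration; real metadata repositories store far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TechnicalMetadata:
    """Where the data lives and how it is structured (for developers and admins)."""
    table: str
    column: str
    data_type: str
    source_system: str


@dataclass(frozen=True)
class BusinessMetadata:
    """What the data means in business terms (for analysts and end users)."""
    business_name: str
    definition: str
    owner: str


@dataclass(frozen=True)
class CatalogEntry:
    technical: TechnicalMetadata
    business: BusinessMetadata


# Hypothetical catalog with one entry, keyed by (table, column).
CATALOG = {
    ("fact_orders", "amount"): CatalogEntry(
        technical=TechnicalMetadata("fact_orders", "amount", "REAL", "order_system"),
        business=BusinessMetadata(
            "Order value", "Total value of an order before tax", "Sales operations"
        ),
    ),
}


def describe(table, column):
    """Combine both metadata views into one human-readable line."""
    entry = CATALOG[(table, column)]
    tech, biz = entry.technical, entry.business
    return (
        f"{biz.business_name} ({tech.table}.{tech.column}, {tech.data_type}, "
        f"from {tech.source_system}): {biz.definition}"
    )


if __name__ == "__main__":
    print(describe("fact_orders", "amount"))
```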
4. Data Warehouse Access Tools
A data warehouse uses a database or group of databases as a foundation. Business users generally cannot work with these databases directly unless database administrators are available to help, and that is not the case in every business unit. This is why they rely on several no-code data warehousing tools, such as:
• Query and reporting tools, which help users produce corporate reports for analysis that can be in the form of spreadsheets, calculations, or interactive visuals.
• Application development tools, which help create tailored reports and present them in formats intended for particular reporting purposes.
• Data mining tools for data warehousing, which automate the process of identifying patterns and relationships in huge quantities of data using advanced statistical modeling methods.
• OLAP tools, which help construct a multi-dimensional data warehouse and allow the analysis of enterprise data from numerous viewpoints.
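The multi-dimensional analysis that OLAP tools provide can be sketched with two basic operations: a roll-up (aggregating a measure along chosen dimensions) and a slice (fixing one dimension to a value). The sales facts and the function names below are assumptions made for the example; real OLAP engines precompute and index these aggregates rather than scanning rows.

```python
from collections import defaultdict

# Hypothetical fact rows with three dimensions and one measure.
SALES = [
    {"region": "north", "product": "laptop", "quarter": "Q1", "revenue": 100},
    {"region": "north", "product": "phone", "quarter": "Q1", "revenue": 50},
    {"region": "south", "product": "laptop", "quarter": "Q1", "revenue": 70},
    {"region": "south", "product": "laptop", "quarter": "Q2", "revenue": 30},
]


def roll_up(facts, dimensions, measure="revenue"):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


def slice_cube(facts, **fixed):
    """Keep only the facts whose dimensions match the fixed values."""
    return [f for f in facts if all(f[k] == v for k, v in fixed.items())]


if __name__ == "__main__":
    print(roll_up(SALES, ["region"]))
    print(roll_up(slice_cube(SALES, quarter="Q1"), ["product"]))
```

Viewing the same facts by region, by product, or by quarter is what "analysis from numerous viewpoints" means in practice: the data stays the same and only the grouping dimensions change.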
5. Data Warehouse Bus
It defines the data flow within a data warehousing bus architecture and includes a data mart. A data mart is an access layer used to deliver data to users. It is used for partitioning data that is produced for a particular user group.
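As a rough illustration of a data mart as a partition of warehouse data for one user group, the sketch below copies one department's rows from a central fact table into its own table. The `fact_sales` table, the department names, and the `mart_<department>` naming scheme are hypothetical; an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3


def build_warehouse():
    """Create a tiny in-memory warehouse with one hypothetical fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales ("
        "sale_id INTEGER PRIMARY KEY, department TEXT, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (1, "marketing", "north", 200.0),
            (2, "finance", "north", 500.0),
            (3, "marketing", "south", 150.0),
        ],
    )
    return conn


def create_data_mart(conn, department):
    """Copy one department's rows into its own mart table and return its name."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not department.isidentifier():
        raise ValueError(f"invalid department name: {department!r}")
    mart = f"mart_{department}"
    conn.execute(
        f"CREATE TABLE {mart} (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    rows = conn.execute(
        "SELECT sale_id, region, amount FROM fact_sales WHERE department = ?",
        (department,),
    ).fetchall()
    conn.executemany(f"INSERT INTO {mart} VALUES (?, ?, ?)", rows)
    conn.commit()
    return mart


if __name__ == "__main__":
    warehouse = build_warehouse()
    name = create_data_mart(warehouse, "marketing")
    print(warehouse.execute(f"SELECT * FROM {name} ORDER BY sale_id").fetchall())
```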
6. Data Warehouse Reporting Layer
The reporting layer in the data warehouse allows end users to access the BI interface or BI database architecture. The purpose of the reporting layer is to act as a dashboard for data visualization, create reports, and extract any required information.
OR IN SHORT,
1. Data Warehouse Database: The central component of a data warehouse architecture is a database that stores all enterprise data and makes it manageable for reporting.
2. Load Manager: This component performs the operations associated with the extraction and load of data into the warehouse. These tasks include the simple transformation of data to prepare data for entry into the warehouse.
3. Warehouse Manager: A warehouse manager is responsible for the warehouse management process. The operations it performs include the analysis, aggregation, backup, and collection of data, as well as de-normalization of the data.
4. Query Manager: It performs all the operations related to the management of user queries. The query manager is responsible for directing the queries to suitable tables. By directing the queries to appropriate tables, it speeds up the query request and response process. In addition, the query manager is responsible for scheduling the execution of the queries posted by the user.
5. Metadata: Metadata is data about data; here, it is the data that describes the data warehouse. For example, the index of a book serves as metadata for the contents of the book. In other words, metadata is the summarized data that leads us to the detailed data. There are two types of metadata in data warehousing:
- Technical Metadata comprises information that can be used by developers and managers when executing warehouse development and administration tasks.
- Business Metadata includes information that offers an easily understandable standpoint of the data stored in the warehouse.
6. Data Warehouse Access Tools: Access tools allow users to interact with the data in the data warehouse. These tools fall into four different categories: query and reporting tools, application development tools, data mining tools, and OLAP tools.
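The query manager's job of directing queries to suitable tables (item 4 above) can be sketched as a simple routing function: given the columns a query needs, pick the smallest table that contains all of them. The table names and column sets below are hypothetical, and the sketch assumes the catalog is ordered from the most aggregated table to the most detailed one.

```python
# Hypothetical catalog, ordered from most aggregated (smallest) to most detailed.
TABLES = [
    ("sales_by_region", {"region"}),
    ("sales_by_region_month", {"region", "month"}),
    ("fact_sales", {"region", "month", "product", "customer"}),
]


def route_query(requested_columns):
    """Return the first (smallest) table that covers every requested column."""
    requested = set(requested_columns)
    for name, columns in TABLES:
        if requested <= columns:
            return name
    raise LookupError(f"no table covers columns: {sorted(requested)}")


if __name__ == "__main__":
    print(route_query(["region"]))
    print(route_query(["region", "product"]))
```

Answering a query from a small summary table instead of the detailed fact table is one way such routing can speed up the request and response process.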