What is a data warehouse? How is it different from a database?

  Data warehouse

  •  Data warehousing is a collection of tools and techniques for extracting more knowledge from a large amount of data. This supports the decision-making process and helps improve information resources.
  •  A data warehouse is essentially a database with a specialized data structure that lets complex queries over a large amount of data run relatively quickly and easily. It is built from multiple heterogeneous sources.
  •  A data warehouse is built to store a huge amount of historical data and supports fast queries across all of that data, typically using Online Analytical Processing (OLAP).
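To make the "multiple heterogeneous sources" point concrete, here is a minimal sketch in Python. The two sources, their field names, and the helper functions are all hypothetical and chosen only for illustration. It shows records arriving in different formats being mapped onto one common layout so that a single analytical question can be asked of all of them.

```python
import csv
import io
import json

# Hypothetical source 1: a CSV export from a sales system.
CSV_SOURCE = "order_id,customer,amount\n1,alice,120.50\n2,bob,80.00\n"

# Hypothetical source 2: JSON records from a web shop that uses different field names.
JSON_SOURCE = '[{"id": 3, "buyer": "alice", "total": 40.0}]'


def load_csv_orders(text):
    """Map CSV rows onto the common (order_id, customer, amount, source) layout."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "order_id": int(row["order_id"]),
            "customer": row["customer"],
            "amount": float(row["amount"]),
            "source": "csv",
        }
        for row in rows
    ]


def load_json_orders(text):
    """Map JSON records onto the same common layout."""
    return [
        {
            "order_id": rec["id"],
            "customer": rec["buyer"],
            "amount": float(rec["total"]),
            "source": "json",
        }
        for rec in json.loads(text)
    ]


def total_by_customer(table):
    """A simple analytical query over the combined data: total amount per customer."""
    totals = {}
    for row in table:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


# The "warehouse" here is just one list holding rows from both sources.
warehouse = load_csv_orders(CSV_SOURCE) + load_json_orders(JSON_SOURCE)
totals = total_by_customer(warehouse)
print(totals)
```

A real warehouse would load data through a proper extract, transform, load (ETL) pipeline into dedicated storage. The list of dictionaries above only stands in for that unified store.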

Data Warehousing vs. Databases
Data warehouses and databases are both relational data systems, but they are built to serve different purposes. A data warehouse is built to store a huge amount of historical data and supports fast queries across all of that data, typically using Online Analytical Processing (OLAP). A database is built to store current transactions and allow quick access to specific transactions for ongoing business processes, commonly known as Online Transaction Processing (OLTP).
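The difference between the two workloads can be sketched with Python's built-in sqlite3 module. The table name, columns, and sample rows are assumptions made for this example, and SQLite is used only to show the shape of each query, not as a suggestion for a warehouse engine. The OLTP-style function fetches one specific transaction, while the OLAP-style function aggregates over all stored rows.

```python
import sqlite3

# In-memory database with a hypothetical sales table (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 2022, 100.0),
        (2, "bob", 2022, 50.0),
        (3, "alice", 2023, 70.0),
        (4, "bob", 2023, 30.0),
    ],
)


def get_order(order_id):
    """OLTP-style access: quick lookup of one specific transaction by its key."""
    return conn.execute(
        "SELECT order_id, customer, year, amount FROM sales WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def revenue_by_year():
    """OLAP-style access: aggregate over the whole history, grouped by year."""
    return conn.execute(
        "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
    ).fetchall()


print(get_order(3))
print(revenue_by_year())
```

In practice the two workloads usually run on separate systems, each tuned for its own access pattern. Both queries share one table here only to keep the sketch short.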

                      OR,


Data Warehousing vs. Databases
A data warehouse is not the same as a database:
A database is a transactional system that monitors and updates real-time data so that only the most recent data is available.
A data warehouse is programmed to aggregate structured data over time.
For example, a database might only have the most recent address of a customer, while a data warehouse might have all the addresses for the customer for the past 10 years.
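The address example can be sketched in a few lines of Python. The class names, the customer id, and the addresses are hypothetical and exist only for illustration. The database-style store overwrites the old value on every update, while the warehouse-style store appends each change and so keeps the full history.

```python
from datetime import date


class CustomerDatabase:
    """Transactional view: keeps only the most recent address per customer."""

    def __init__(self):
        self.current_address = {}

    def update_address(self, customer_id, address):
        # Overwrites the previous value, so the old address is lost.
        self.current_address[customer_id] = address


class CustomerWarehouse:
    """Historical view: every address change is appended and never overwritten."""

    def __init__(self):
        self.address_history = []

    def record_address(self, customer_id, address, valid_from):
        self.address_history.append((customer_id, address, valid_from))

    def addresses_for(self, customer_id):
        """Return all recorded (address, valid_from) pairs for one customer."""
        return [
            (address, valid_from)
            for cid, address, valid_from in self.address_history
            if cid == customer_id
        ]


db = CustomerDatabase()
warehouse = CustomerWarehouse()

# Hypothetical moves for customer 42.
moves = [
    ("12 Old Street", date(2015, 3, 1)),
    ("7 New Road", date(2021, 9, 15)),
]
for address, valid_from in moves:
    db.update_address(42, address)
    warehouse.record_address(42, address, valid_from)

print(db.current_address[42])
print(warehouse.addresses_for(42))
```

After both moves, the database-style store holds one address for customer 42, and the warehouse-style store holds both addresses along with the dates they took effect.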
