Define the terms data warehousing and data mining. Discuss key areas that demand the use of data warehousing and data mining.

Data warehouse

  • Data warehousing is a collection of tools and techniques for deriving more knowledge from a large amount of data. This supports the decision-making process and improves information resources.
  • A data warehouse is essentially a database with specialized data structures that allow complex queries over a large amount of data to run relatively quickly and easily. It is built from multiple heterogeneous sources.
  • A data warehouse is built to store a huge amount of historical data and supports fast queries over all of it, typically using Online Analytical Processing (OLAP).
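As a rough illustration of the points above, the following minimal sketch loads rows from two sources into a single fact table and then runs an OLAP-style roll-up from quarter to year. The table name `sales`, its columns, and the sample rows are all hypothetical, and an in-memory SQLite database merely stands in for a real warehouse.

```python
import sqlite3

# Hypothetical sales facts from two made-up source systems (illustrative data only).
store_rows = [("2023", "Q1", "North", 120.0), ("2023", "Q2", "North", 80.0)]
web_rows = [("2023", "Q1", "South", 50.0), ("2024", "Q1", "North", 200.0)]


def build_warehouse(*sources):
    """Load rows from several sources into one in-memory fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (year TEXT, quarter TEXT, region TEXT, amount REAL)"
    )
    for rows in sources:
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    return conn


def roll_up_by_year(conn):
    """Roll up from (year, quarter, region) to year: total amount per year."""
    cur = conn.execute(
        "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
    )
    return cur.fetchall()


warehouse = build_warehouse(store_rows, web_rows)
yearly_totals = roll_up_by_year(warehouse)
print(yearly_totals)  # [('2023', 250.0), ('2024', 200.0)]
```

A production warehouse would add dimension tables, an ETL process, and a dedicated OLAP engine; the sketch only shows the idea of integrating sources and aggregating along a dimension.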

Data mining

  • Data mining refers to extracting knowledge from large amounts of data. The data sources can include databases, data warehouses, the web, etc.
  • Data mining is the computer-supported analysis of huge data sets that have either been compiled by computer systems or loaded into the computer.
  • In the data mining process, the computer analyzes the data and extracts useful information from it. It looks for hidden patterns within the data set and tries to predict future behavior. Data mining is primarily used to discover and indicate relationships among data sets.
  • Data mining aims to let business organizations see business behaviors, trends, and relationships, so that they can make data-driven decisions.
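To make "looking for hidden patterns" concrete, here is a minimal sketch that counts which pairs of items occur together in a set of transactions, a simplified form of frequent itemset mining. The baskets, the function name `frequent_pairs`, and the `min_support` threshold are assumptions made up for illustration, not part of any particular tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(transactions, min_support=2)
print(pairs)  # {('bread', 'milk'): 3, ('butter', 'milk'): 2}
```

Real data mining tools apply far more scalable algorithms and also cover tasks such as classification, clustering, and prediction; this only shows how a relationship among items can be surfaced from raw records.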


Key areas that demand the use of data warehousing and data mining

Agriculture

The agricultural census performed by the Ministry of Agriculture, Government of India, compiles a large number of agricultural parameters at the national level. District-wise agricultural production, area, and yield of crops are compiled; this data can be built into a data warehouse for analysis, mining, and forecasting. Statistics on the consumption of fertilizers can be turned into a data mart. Data on agricultural inputs such as seeds and fertilizers can also be effectively analyzed in a warehouse. Data from the livestock census can be turned into a data warehouse. Land-use pattern statistics can also be analyzed in a warehousing environment. Thus, there is substantial scope for applying data warehousing and data mining techniques in the agricultural sector.

Rural Development

Data on individuals below the poverty line can be built into a data warehouse. Drinking water census data (from the drinking water missions) can be effectively analyzed with OLAP and data mining technologies. Progress on the implementation of rural development programs can also be monitored and analyzed using OLAP and data mining technologies.

Health

Community needs assessment data, immunization data, and data from national programs on controlling blindness, leprosy, and malaria can all be used for data warehousing implementation, OLAP, and data mining applications. Healthcare organizations can also generate patient, employee, and financial records and share data with other entities, such as insurance companies, NGOs, and medical aid services; use data mining to identify patient trends; and provide feedback to physicians on procedures and tests.

Planning

At the Planning Commission, data warehouses can be built for state plan data on all sectors: labor, energy, education, trade and industry, the five-year plans, etc.

Commerce and Trade

Data on trade can be analyzed and converted into a data warehouse. World price monitoring systems can be made to perform better by using data warehousing and data mining technologies.

Education

The Sixth All India Educational Survey data has been converted into a data warehouse, against which various types of analytical queries and reports can be run. Educational institutions can also store and analyze information about faculty and students, maintain student portals to facilitate student activities, extract information for research grants, assess student demographics, and integrate information from different sources into a single repository for analysis and strategic decision-making.

