Difference between data mining and web mining

What is data mining?

Data mining refers to extracting useful information or knowledge from large data sets. It is typically carried out by an analyst on a particular data set with a specific objective in mind. The data can take many forms, such as files, videos, photos, or text.

Process of data mining

  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

What is web mining?

Web mining refers to the process of using data mining techniques to extract useful patterns, trends, and information from the web, drawing on web-based documents and services, server logs, and hyperlinks. The main objective of web mining is to discover patterns in web data by collecting and analyzing it in order to gain useful insights.

Web mining is further divided into three types:

  • Web content mining

Web content mining refers to the process of extracting data from the content of web pages in order to find patterns and trends that give useful insights. Various techniques can be used to extract this data, such as web scraping.
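
As a rough illustration of content mining, the sketch below pulls the visible text out of a page and counts how often each word appears. It uses only Python's standard library; the `TextExtractor` class, the `term_frequencies` helper, and the sample HTML are all made up for this example, and a real scraper would first have to download the page.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def term_frequencies(html):
    """Return a Counter of lowercase words found in the page's visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))


# Hypothetical page content used only for this example.
SAMPLE_HTML = """
<html><head><title>Mining news</title><style>p {color: red}</style></head>
<body><h1>Web mining</h1>
<p>Web mining applies data mining to web data.</p></body></html>
"""

freqs = term_frequencies(SAMPLE_HTML)
print(freqs.most_common(3))
```

Word counts are only one simple pattern; the same extracted text could feed a classifier or a clustering step instead.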

  • Web structure mining

Web structure mining refers to the process of gathering and preparing data from the hyperlinks that connect pages in order to search for new patterns and trends.
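
One way to picture structure mining is to build a small link graph and count how many pages point to each page. The sketch below does this for three hypothetical pages held in a dictionary; the names `LinkExtractor`, `build_link_graph`, `in_link_counts`, and `PAGES` are assumptions made for this example, and it uses only the standard library.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_link_graph(pages):
    """Map each page URL to the list of known pages it links to."""
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        parser.close()
        # Keep only links to pages in our collection, ignoring self-links.
        graph[url] = [link for link in parser.links if link in pages and link != url]
    return graph


def in_link_counts(graph):
    """Count how many distinct pages link to each page."""
    counts = {url: 0 for url in graph}
    for source, targets in graph.items():
        for target in set(targets):
            counts[target] += 1
    return counts


# Hypothetical site used only for this example.
PAGES = {
    "/home": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<a href="/home">Home</a>',
    "/blog": '<a href="/home">Home</a> <a href="/about">About</a>',
}

graph = build_link_graph(PAGES)
counts = in_link_counts(graph)
for url, count in sorted(counts.items()):
    print(url, count)
```

Pages with many incoming links are often treated as more important, which is the basic idea behind link-based ranking methods.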

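  • Web usage mining

When a web application is hosted, its web servers generate logs that record each user's activity on the application. Web usage mining refers to the process of analyzing these logs to discover patterns in how users navigate and use the application.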
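
To make this concrete, the sketch below parses a few server log lines and counts successful page views and distinct visitors. It assumes the logs follow the Common Log Format used by many web servers; the log lines, the `LOG_PATTERN` regular expression, and the `parse_log` helper are invented for this example.

```python
import re
from collections import Counter

# Matches one line in Common Log Format (an assumption about the log layout).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical log entries used only for this example.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 5120
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2023:13:58:40 +0000] "GET /missing.html HTTP/1.1" 404 512
"""


def parse_log(text):
    """Return one dict per log line that matches the expected format."""
    entries = []
    for line in text.splitlines():
        match = LOG_PATTERN.match(line)
        if match:
            entries.append(match.groupdict())
    return entries


entries = parse_log(SAMPLE_LOG)

# Count successful requests per page and collect the distinct client hosts.
page_hits = Counter(e["path"] for e in entries if e["status"] == "200")
visitors = {e["host"] for e in entries}

print(page_hits.most_common())
print(len(visitors), "distinct visitors")
```

From the same parsed entries one could also group requests by host and time to reconstruct sessions, which is a common next step in usage analysis.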


