Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit.

a) Draw a schema diagram for the above data warehouse using one of the following schemas: star, snowflake, or fact constellation.

b) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004?

c) To obtain the same list, write an SQL query assuming the data are stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge).

Solution:

a) A star schema is used (the figure is not reproduced here).
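As a rough textual stand-in for the diagram, one possible star schema can be sketched as SQLite DDL executed from Python. Only the dimensions time, doctor, and patient and the measures count and charge come from the exercise; the table names (`time_dim`, `doctor_dim`, `patient_dim`, `visit_fact`), the surrogate keys, and the descriptive attributes are assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical star schema: one central fact table referencing three
# dimension tables. Table names, key columns, and descriptive attributes
# are assumptions; only time/doctor/patient and count/charge are given.
STAR_SCHEMA_DDL = """
CREATE TABLE time_dim (
    time_key INTEGER PRIMARY KEY,
    day      INTEGER,
    month    INTEGER,
    year     INTEGER
);
CREATE TABLE doctor_dim (
    doctor_key  INTEGER PRIMARY KEY,
    doctor_name TEXT,
    hospital    TEXT
);
CREATE TABLE patient_dim (
    patient_key  INTEGER PRIMARY KEY,
    patient_name TEXT
);
CREATE TABLE visit_fact (
    time_key    INTEGER REFERENCES time_dim(time_key),
    doctor_key  INTEGER REFERENCES doctor_dim(doctor_key),
    patient_key INTEGER REFERENCES patient_dim(patient_key),
    count       INTEGER,  -- measure: number of visits
    charge      REAL      -- measure: fee charged for the visit
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(STAR_SCHEMA_DDL)

# List the tables that were created, in alphabetical order.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The fact table holds only keys and measures, while each dimension table is denormalized, which is what distinguishes a star schema from a snowflake schema.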



b) First, roll up the time dimension from day to month to year, then slice on year = 2004. Next, roll up the patient dimension from individual patient to all, then slice on patient = all. What remains is the total fee collected by each doctor in 2004. In summary:

1. roll up from day to month to year

2. slice for year = “2004”

3. roll up on patient from the individual patient to all

4. slice for patient = “all”

5. get the list of the total fees collected by each doctor in 2004
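The steps above can be mimicked on a toy cube in plain Python. This is only a sketch: the sample cells, the ISO date strings, and the helper `roll_up` are hypothetical and are not part of the exercise.

```python
from collections import defaultdict

# Hypothetical base-cuboid cells: (day, doctor, patient) -> charge.
# Days are ISO date strings; all values are made-up sample data.
base_cuboid = {
    ("2004-01-15", "Dr. A", "p1"): 100.0,
    ("2004-03-02", "Dr. A", "p2"): 50.0,
    ("2004-07-20", "Dr. B", "p1"): 80.0,
    ("2005-02-11", "Dr. A", "p1"): 70.0,
}

def roll_up(cells, key_fn):
    """Sum the charge of all cells that map to the same coarser key."""
    out = defaultdict(float)
    for key, charge in cells.items():
        out[key_fn(key)] += charge
    return dict(out)

# Step 1: roll up time from day to year (month is the intermediate level).
by_year = roll_up(base_cuboid, lambda k: (k[0][:4], k[1], k[2]))

# Step 2: slice for year = "2004", leaving cells keyed by (doctor, patient).
year_2004 = {(d, p): c for (y, d, p), c in by_year.items() if y == "2004"}

# Steps 3-4: roll up patient to "all", leaving one cell per doctor.
fee_per_doctor = roll_up(year_2004, lambda k: k[0])

# Step 5: the list of total fees collected by each doctor in 2004.
print(fee_per_doctor)
```

With this sample data the 2005 visit is discarded by the slice, so Dr. A's total covers only the two 2004 visits.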


c) SELECT doctor, SUM(charge)

FROM fee

WHERE year = 2004

GROUP BY doctor;
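To check the query, it can be run against a small in-memory SQLite table using Python's `sqlite3` module. The column types and sample rows below are assumptions made for illustration, and an `ORDER BY doctor` clause is added only so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The fee relation from the exercise; the column types are assumptions.
conn.execute(
    "CREATE TABLE fee (day INTEGER, month INTEGER, year INTEGER, "
    "doctor TEXT, hospital TEXT, patient TEXT, count INTEGER, charge REAL)"
)

# Made-up sample rows: three visits in 2004 and one in 2005.
rows = [
    (15, 1, 2004, "Dr. A", "H1", "p1", 1, 100.0),
    (2, 3, 2004, "Dr. A", "H1", "p2", 1, 50.0),
    (20, 7, 2004, "Dr. B", "H2", "p1", 1, 80.0),
    (11, 2, 2005, "Dr. A", "H1", "p1", 1, 70.0),
]
conn.executemany("INSERT INTO fee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# The query from the solution, plus ORDER BY for a stable row order.
result = conn.execute(
    "SELECT doctor, SUM(charge) FROM fee "
    "WHERE year = 2004 GROUP BY doctor ORDER BY doctor"
).fetchall()
print(result)
```

The `WHERE` clause plays the role of the slice on year, and `GROUP BY doctor` with `SUM(charge)` plays the role of rolling patient up to all.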
