Explain query processing in detail with example.

Query Processing 

Energy efficiency is an important feature in designing and executing databases. The aims of query processing are to transform a query written in a high-level language, typically SQL, into a correct and efficient execution strategy expressed in a low-level language (implementing the relational algebra), and to execute the strategy to retrieve the required data. Thus, Query Processing is the activities involved in parsing, validating, optimizing, and executing a query.

The steps involved in processing a query processing is shown in the figure and they are:

1. Parsing and translation
2. Optimization
3. Evaluation




1. Parsing and translation

The main work of a query processor is to convert a query string into query objects i.e., converting the query submitted by the user, into a form understood by the query processing engine. It converts the search string into definite instructions. The query parser must analyze the query language i.e., recognizing and interpreting operators (AND, OR, NOT, +, etc.), placing the operators into groups etc. The basic job of the parser is to extract the tokens (e.g., keywords, operators, operands, literal strings, etc.) into their corresponding internal data elements (i.e., relational algebra operations and operands) and structures (i.e., query tree, query graph). The parser also verifies the validity and syntax of the query string.

Optimizing the Query

In this stage, the Query optimizer tries to find the most efficient way of executing a given query by considering the possible plans. It maximizes the performance of a query. It portrays the query plans as a tree, results flowing from bottom to top. Query processor applies rules to the internal data structures of the query to transform these structures into their equivalent but more efficient representations. Rules may be based on various mathematical models and heuristics.

Evaluating the Query

The final step in processing a query is the evaluation phase. An evaluation plan tells precisely the algorithm for each operation along with the coordination among the operations. The best evaluation plan that a user generates by the optimization engine is selected and then executed (There may exist various methods for executing the same query). The evaluation plan comprises of a relational algebra tree, providing information at each node (for each table) along with the implementation methods to be employed for each relational operator.

Example  Consider the following SQL query respectively:

SELECT Stu_name, Stu_address

FROM Student

WHERE age ≤ 25;

Solution:- This query can be translated into either of the following relational-algebra expressions:


After parsing and translation into a relational algebra expression, the query is then transformed into a form, usually a query tree or query graph, that can be handled by the optimization engine.

The optimization engine then performs various analyses on the query data, generating a number of valid evaluation plans. From there, it determines the most appropriate evaluation plan to execute.

After the evaluation plan has been selected, it is passed into the DMBS' query-execution engine (also referred to as the runtime database processor), where the plan is executed and the results are returned.

OR IN SHORT

1. Parsing and translation

  • Translate the query into its internal form. This is then translated into relational algebra.
  • The parser checks syntax and verifies relation.

2. Optimization

SQL is a very high-level language:

  • The users specify what to search for not how the search is actually done
  • The algorithms are chosen automatically by the DBMS.
  • For a given SQL query there may be many possible execution plans.
  • Amongst all equivalent plans choose the one with the lowest cost.
  • Cost is estimated using statistical information from the database catalog.


3. Evaluation

The query evaluation engine takes a query evaluation plan, executes that plan, and returns the answer to that query.

Comments

Popular posts from this blog

Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg_grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg_grade measure stores the actual course grade of the student. At higher conceptual levels, avg_grade stores the average grade for the given combination. a) Draw a snowflake schema diagram for the data warehouse. b) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each BigUniversity student. c) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?

Suppose that a data warehouse consists of the four dimensions; date, spectator, location, and game, and the two measures, count and charge, where charge is the fee that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate. a) Draw a star schema diagram for the data b) Starting with the base cuboid [date; spectator; location; game], what specific OLAP operations should perform in order to list the total charge paid by student spectators at GM Place in 2004?

Explain market-Oriented Cloud computing architecture.