How beam search and logic programming is used to mine graph? Explain.

 Beam search  

A beam search is a heuristic search technique that combines elements of breadth-first and best-first searches. Like a breadth-first search, the beam search maintains a list of nodes that represent a frontier in the search space. Whereas the breadth-first adds all neighbors to the list, the beam search orders the neighboring nodes according to some heuristic and only keeps the n (or k or 3) best, where is the beam size. This can significantly reduce the processing and storage requirements for the search. Some applications are speech recognition, vision, planning, machine learning, graph mining, and so on.


Beam Search Algorithm

1.Let beam width = n.

2 Create two empty lists: OPEN and CLOSED.

3. Start from the initial node (say N) and put it in the 'ordered' OPEN list.

4. Repeat steps 5 to 9 until the GOAL node is reached.

5. If the OPEN list is empty, then EXIT the loop returning 'False'. 

6. Select the first/top node (N) in the OPEN list and move it to the CLOSED list. Also capture the information of the parent node. 

7. If N is a GOAL node, then move the node to the CLOSED list and exit the loop returning 'True'. The solution can be found by backtracking the path.

8. Else if, expand node N to generate the 'immediate' next nodes linked to node N and add all those to the OPEN list then sort all elements of an OPEN list by heuristic function. 

9. Nodes to be expanded should not be greater than beam width 1. Prune and update the OPEN list by keeping the best n successors.

2nd part

Inductive logic programming

Inductive logic programming, or ILP for short, is traditionally concerned with detecting patterns in data in form of logical programs (such as Prolog code). Initially, ILP was used for the automatic program synthesis and is based on inductive inference in a subset of first-order logic. However, ILP has been extended over time to cover a wider range of tasks, such as classification & regression, clustering, and pattern mining to name a few. ILP can be used to find patterns and hidden regularities in relational data. In contrast to traditional pattern mining assuming that the data is given in form of attribute-value based on a single relational table, relational data mining is able to analyze data in a multi-relational form. That means that examples can possess a variable number of attributes, which can include other objects with properties. Within ILP these examples or objects can be described as Prolog programs. More specialized approaches use a more domain-specific description of objects. This includes for examples approached for sequence and graph mining. Furthermore, research topics around inductive databases (link) build on parts on relational data mining.

Comments

Popular posts from this blog

Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg_grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg_grade measure stores the actual course grade of the student. At higher conceptual levels, avg_grade stores the average grade for the given combination. a) Draw a snowflake schema diagram for the data warehouse. b) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each BigUniversity student. c) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?

Suppose that a data warehouse consists of the four dimensions; date, spectator, location, and game, and the two measures, count and charge, where charge is the fee that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate. a) Draw a star schema diagram for the data b) Starting with the base cuboid [date; spectator; location; game], what specific OLAP operations should perform in order to list the total charge paid by student spectators at GM Place in 2004?

Discuss classification or taxonomy of virtualization at different levels.