Explain SUPPORT VECTOR MACHINE and the SVM algorithm with its advantages and disadvantages.

 SUPPORT VECTOR MACHINE

Support Vector Machine, or SVM, is one of the most popular Supervised Learning algorithms and is used for Classification as well as Regression problems. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that a new data point can easily be put in the correct category in the future. This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine.

A Support Vector Machine (SVM) performs classification by finding the hyperplane (classifier) that maximizes the margin between the two classes, subject to the constraint that all the training tuples are correctly classified (the hard-margin case, which assumes the classes are linearly separable). The hyperplane is defined by the data points that are closest to the boundary. These points are called support vectors, and the classifier built on this decision boundary is called a support vector machine. The main advantage of the SVM classifier is that maximizing the margin keeps the training set error low while also tending to reduce the error on unseen test data.
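The margin and the support vectors described above can be illustrated with a small sketch. It assumes a toy 2-D data set and a hyperplane (w, b) picked by hand for illustration; the function names are hypothetical and not taken from any library. The geometric margin of a point is y * (w·x + b) / ||w||, and the points with the smallest margin are the support vectors.

```python
import math

def decision_value(w, b, x):
    # Signed score w.x + b; its sign tells which side of the hyperplane x lies on.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    # Label +1 on or above the hyperplane, -1 below it.
    return 1 if decision_value(w, b, x) >= 0 else -1

def geometric_margins(w, b, points, labels):
    # Distance of each point from the hyperplane, positive when correctly classified.
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [y * decision_value(w, b, x) / norm for x, y in zip(points, labels)]

def closest_points(w, b, points, labels, tol=1e-9):
    # Points whose margin equals the smallest margin: the support vectors.
    margins = geometric_margins(w, b, points, labels)
    smallest = min(margins)
    return [x for x, m in zip(points, margins) if abs(m - smallest) <= tol]

# Toy data (assumed for illustration): three points per class.
points = [(3.0, 1.0), (4.0, 2.0), (5.0, 0.0), (1.0, 1.0), (0.0, 3.0), (-1.0, 0.0)]
labels = [1, 1, 1, -1, -1, -1]

# Hand-picked hyperplane x1 = 2, i.e. w = (1, 0) and b = -2.
w, b = (1.0, 0.0), -2.0

margin = min(geometric_margins(w, b, points, labels))
support = closest_points(w, b, points, labels)
print("margin:", margin)
print("support vectors:", support)
```

For this data the closest points are (3, 1) and (1, 1), one from each class, and the line x1 = 2 lies halfway between them, so no other separating line can give a wider margin.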


To obtain an SVM classifier with the best generalization performance, appropriate training is required. The most commonly used algorithm for training an SVM is the sequential minimal optimization (SMO) algorithm. The main advantage of the SMO algorithm is that it decomposes the large training problem into a series of fixed-size working sets (two Lagrange multipliers at a time), each of which can be solved analytically. As a result, it can cope with large data sets and performs well on a wide range of training data sets.
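To give a feel for what training does, here is a minimal sketch of fitting a linear SVM. Note that this is not SMO: for brevity it swaps in plain full-batch subgradient descent on the regularized hinge loss, which targets the same kind of maximum-margin solution on separable data. The toy data, the function names, and the values of lam, lr and epochs are all assumptions chosen for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=5000):
    # Minimizes (lam / 2) * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    # with full-batch subgradient descent (NOT the SMO algorithm).
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # point is inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    # Classify by the side of the learned hyperplane.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable data (assumed for illustration).
points = [(3.0, 1.0), (4.0, 2.0), (5.0, 0.0), (1.0, 1.0), (0.0, 3.0), (-1.0, 0.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
print("weights:", w, "bias:", b)
print("training predictions:", predictions)
```

Production SVM libraries typically rely on SMO-style or other specialized solvers and also support kernels; the sketch above only covers the linear case.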



Advantages of SVM
1. SVM works relatively well when there is a clear margin of separation between classes.
2. SVM is effective in high-dimensional spaces.
3. SVM is effective in cases where the number of dimensions is greater than the number of samples. 
4. SVM is relatively memory efficient. 
5. SVM can be used for both regression and classification problems.
6. SVM can work well with image data as well.

Disadvantages of SVM
1. SVM is not well suited to very large data sets, because training time grows quickly with the number of samples.
2. SVM does not perform very well when the data set is noisy, i.e. when the target classes overlap.
3. In cases where the number of features for each data point greatly exceeds the number of training samples, the SVM can overfit unless the kernel and regularization are chosen carefully.
4. As the support vector classifier works by placing data points above or below the classifying hyperplane, there is no direct probabilistic explanation for the classification.
5. The SVM model is more complex than a decision tree, so it is more difficult to understand and interpret.
