Lecture 11: Support Vector Machines
2004
Abstract
11.1 Overview

In this lecture we describe methods for performing a binary classification task on linearly non-separable data by means of linear classification. We first explore the linear problem and the mathematical methods used to solve it. We then generalize so that we can deal with more complex, noisy, or non-linear situations by embedding the input data into a higher-dimensional feature space in which the data is separable (the concept is demonstrated in Figure 11.1). This will be accomplished using a “kernel trick”.
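To illustrate the idea, here is a minimal sketch (assuming scikit-learn, which the lecture itself does not use): a linear SVM cannot separate concentric-circle data in the input space, while an RBF-kernel SVM, which implicitly embeds the data in a higher-dimensional feature space, separates it easily.

```python
# Minimal illustration of the kernel trick (assumes scikit-learn is installed).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the input space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print("linear kernel training accuracy:", linear_svm.score(X, y))  # near chance level
print("RBF kernel training accuracy:", rbf_svm.score(X, y))        # close to 1.0
```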
Related papers
We explain the support vector machine algorithm, and its extension, the kernel method, for machine learning with small datasets. We also briefly discuss the Vapnik-Chervonenkis theory, which forms the theoretical foundation of machine learning. This review is based on lectures given by the second author.
Wiley Interdisciplinary Reviews: Computational Statistics, 2009
Support vector machines (SVMs) are a family of machine learning methods, originally introduced for the problem of classification and later generalized to various other situations. They are based on principles of statistical learning theory and convex optimization, and are currently used in various domains of application, including bioinformatics, text categorization, and computer vision.
IEEE Transactions on Neural Networks and Learning Systems, 2013
The paper considers the classification problem using Support Vector Machines, and investigates how to maximally reduce the size of the training set without losing information. Under separable-dataset assumptions, we derive the exact conditions stating which observations can be discarded without diminishing the overall information content. For this purpose, we introduce the concept of Potential Support Vectors, i.e., those data points that can become Support Vectors when future data become available. Complementarily, we also characterize the set of Discardable Vectors, i.e., those data points that, given the current dataset, can never become Support Vectors. These vectors are thus useless for future training purposes and can be removed without loss of information. We then provide an efficient algorithm based on linear programming which returns the potential and discardable vectors by constructing a simplex tableau. Finally, we compare it with alternative algorithms available in the literature on some synthetic data as well as on datasets from standard repositories.
Neurocomputing, 2003
Support vector machines (SVMs) are currently a very active research area within machine learning. Motivated by statistical learning theory, SVMs have been successfully applied to numerous tasks, among others in data mining, computer vision, and bioinformatics. SVMs are examples of a broader category of learning approaches which utilize the concept of kernel substitution, which makes the task of learning more tractable by exploiting an implicit mapping into a high-dimensional space. SVMs have many appealing properties for machine learning. For example, the classic SVM learning task involves convex quadratic programming, a problem that does not suffer from the 'local minima' problem and whose solution may easily be found by using one of the many especially efficient algorithms developed for it in optimization theory. Furthermore, recently developed model selection strategies can be applied, so that few, if any, learning parameters need to be set by the operator. Above all, they have been found to work very well in practice.
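To make the quadratic-programming and kernel-substitution remarks concrete, the standard soft-margin SVM dual can be sketched as follows (textbook form, included here for illustration rather than taken from the cited paper):

\[
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i \;-\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\,y_i y_j\,K(x_i,x_j)
\quad \text{subject to}\quad 0 \le \alpha_i \le C,\ \ \sum_{i=1}^{n}\alpha_i y_i = 0 .
\]

The objective is concave and the constraints are linear, so the problem is a convex quadratic program with no local-minima issue; the kernel K(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩ replaces every inner product, so the high-dimensional mapping φ never has to be computed explicitly.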
Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.
Classifying biological data is a common task in the biomedical context. Predicting the class of new, unknown information allows researchers to gain insight and make decisions based on the available data. Also, using classification methods often implies choosing the best parameters to obtain optimal class separation, and the number of parameters might be large in biological datasets. Support Vector Machines provide a well-established and powerful classification method to analyse data and find the minimal-risk separation between different classes. Finding that separation strongly depends on the available feature set and the tuning of hyper-parameters. Techniques for feature selection and SVM parameter optimization are known to improve classification accuracy, and the literature on them is extensive. In this paper we review the strategies that are used to improve the classification performance of SVMs and perform our own experimentation to study the influence of features and hyper-parameters in the optimization process, using several known kernels.
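As a small illustration of the hyper-parameter tuning discussed above (a hedged sketch assuming scikit-learn; the dataset and parameter grid are illustrative and not taken from the paper):

```python
# Tune the C and gamma hyper-parameters of an RBF-kernel SVM by
# cross-validated grid search (assumes scikit-learn is installed).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Standardize features first, since SVMs are sensitive to feature scale.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1, 10, 100],
              "svc__gamma": [1e-3, 1e-2, 1e-1, 1]}

search = GridSearchCV(model, param_grid, cv=5).fit(X, y)
print("best hyper-parameters:", search.best_params_)
print("cross-validated accuracy: %.3f" % search.best_score_)
```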
1999
In this report we show some simple properties of SVM for regression (SVMR). In particular, we show that for ε close to zero, minimizing the norm of w is equivalent to maximizing the distance between the optimal approximating hyperplane solution of SVMR and the closest points in the data set. So, in this case, there exists a complete analogy between SVM for regression and classification, and the ε-tube plays the same role as the margin between classes. Moreover, we show that for every ε the set of support vectors found by SVMR is linearly separable in the feature space and the optimal approximating hyperplane is a separator for this set. As a consequence, we show that for every regression problem there exists a classification problem which is linearly separable in the feature space. This is due to the fact that the solution of SVMR separates the set of support vectors into two classes: the support vectors living above and the ones living below the optimal approximating hyperplane solution of SVMR. The position of the support vectors with respect to the hyperplane is given by the sign of (α_i − α_i*). Finally, we present a simple algorithm for obtaining a sparser representation of the optimal approximating hyperplane by using SVM for classification.
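For context, the role of the sign of (α_i − α_i*) can be read off the standard SVR expansion (textbook form, stated here for clarity and not quoted from the report):

\[
f(x) \;=\; \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\,K(x_i, x) \;+\; b,
\qquad \alpha_i,\ \alpha_i^{*} \ge 0,\ \ \alpha_i\,\alpha_i^{*} = 0 .
\]

A support vector with α_i − α_i* > 0 lies on or above the upper edge of the ε-tube, while one with α_i − α_i* < 0 lies on or below the lower edge, which is exactly the two-class split described above.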
IEEE Transactions on Cybernetics, 2014
We propose a novel nonparallel classifier, named the nonparallel support vector machine (NPSVM), for binary classification. Totally different from existing nonparallel classifiers, such as the generalized eigenvalue proximal support vector machine (GEPSVM) and the twin support vector machine (TWSVM), our NPSVM has several distinct advantages: (1) two primal problems are constructed implementing the structural risk minimization principle; (2) the dual problems of these two primal problems have the same advantages as those of standard SVMs, so the kernel trick can be applied directly, while existing TWSVMs have to construct another two primal problems for nonlinear cases based on approximate kernel-generated surfaces; furthermore, their nonlinear problems cannot degenerate to the linear case even when the linear kernel is used; (3) the dual problems have the same elegant formulation as that of standard SVMs and can be solved efficiently by the sequential minimal optimization (SMO) algorithm, while existing GEPSVMs or TWSVMs are not suitable for large-scale problems; (4) it has the inherent sparseness of standard SVMs; (5) existing TWSVMs are only special cases of the NPSVM when its parameters are appropriately chosen. Experimental results on many data sets show the effectiveness of our method in both sparseness and classification accuracy, further confirming the above conclusions. In some sense, our NPSVM is a new starting point for nonparallel classifiers.
Support Vector Machines have acquired a central position in the fields of Machine Learning and Pattern Recognition over the past decade and are known to deliver state-of-the-art performance in applications such as text categorization, hand-written character recognition, and bio-sequence analysis. In this article we provide a gentle introduction to the workings of Support Vector Machines (also known as SVMs) and attempt to provide some insight into the learning mechanisms involved. We begin with a general introduction to mathematical learning and move on to discuss the learning framework used by the SVM architecture.
