Faster Directions for Second Order SMO
2010, Lecture Notes in Computer Science
https://doi.org/10.1007/978-3-642-15822-3_4…
Abstract
Second order SMO represents the state of the art in SVM training for moderate-size problems. In it, the solution is attained by solving a series of subproblems, each optimized with respect to just a pair of multipliers. In this paper we illustrate how SMO works in a two-stage fashion, first setting the values of the bounded multipliers to the penalty factor C and then adjusting the non-bounded multipliers. Moreover, during this second stage the pairs selected for update often appear repeatedly over the course of the algorithm. Taking advantage of this, we propose a procedure to combine previously used descent directions that results in far fewer iterations in this second stage and that may also lead to noticeable savings in kernel operations.
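For orientation, the sketch below illustrates the kind of second-order pair selection underlying this family of solvers, following the standard maximal-gain rule used in LIBSVM-style implementations; the variable names (alpha, grad, K, C, tau) are illustrative and the paper's specific two-stage procedure is not reproduced here.

import numpy as np

def select_pair_second_order(alpha, grad, y, K, C, tau=1e-12):
    # Illustrative second-order working-set selection for the dual problem
    #   min 1/2 a'Qa - e'a  s.t.  y'a = 0, 0 <= a <= C,   with grad = Q a - e.
    # Indices still free to move up/down along the equality constraint:
    I_up = np.where(((alpha < C) & (y == 1)) | ((alpha > 0) & (y == -1)))[0]
    I_low = np.where(((alpha < C) & (y == -1)) | ((alpha > 0) & (y == 1)))[0]

    # First index: maximal first-order violation among I_up.
    i = I_up[np.argmax(-y[I_up] * grad[I_up])]
    g_i = -y[i] * grad[i]

    # Second index: maximal second-order gain among violating partners in I_low.
    best_j, best_gain = -1, 0.0
    for t in I_low:
        b = g_i + y[t] * grad[t]              # first-order decrease of the pair (i, t)
        if b <= 0:
            continue                          # (i, t) is not a violating pair
        a = K[i, i] + K[t, t] - 2.0 * K[i, t]
        a = max(a, tau)                       # guard against non-positive curvature
        gain = 0.5 * b * b / a                # decrease achieved by the exact 2-D step
        if gain > best_gain:
            best_gain, best_j = gain, t
    return i, best_j                          # best_j == -1 means no violating partner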
Related papers
2007 International Joint Conference on Neural Networks, 2007
The Support Vector Machine is a widely employed machine learning model due to its repeatedly demonstrated superior generalization performance. The Sequential Minimal Optimization (SMO) algorithm is one of the most popular SVM training approaches. SMO is fast, as well as easy to implement; however, it has a limited working set size (2 points only). Faster training times can result if the working set size can be increased without significantly increasing the computational complexity. In this paper, we extend the 2-point SMO formulation to a 4-point formulation and address the theoretical issues associated with such an extension. We show that modifying the SMO algorithm to increase the working set size is beneficial in terms of the number of iterations required for convergence, and shows promise for reducing the overall training time.
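As a point of reference for the 4-point extension discussed above, the following sketch shows the classical analytic two-multiplier update of Platt's SMO, which is what a larger working set generalizes; variable names are illustrative and the degenerate-curvature case is handled only in simplified form.

def smo_two_point_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, C):
    # Classical analytic SMO step over two multipliers (Platt).
    # E1, E2 are the prediction errors f(x1) - y1 and f(x2) - y2.
    eta = K11 + K22 - 2.0 * K12                # curvature along the feasible line
    if eta <= 0:
        return a1, a2                          # degenerate case: skip (simplified)
    a2_new = a2 + y2 * (E1 - E2) / eta         # unconstrained optimum for a2
    # Bounds L, H keep both multipliers in [0, C] while preserving y1*a1 + y2*a2.
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    a2_new = min(max(a2_new, L), H)
    a1_new = a1 + y1 * y2 * (a2 - a2_new)      # restore the equality constraint
    return a1_new, a2_new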
IEEE Transactions on Neural Networks, 2008
Global convergence of the sequential minimal optimization (SMO) algorithm for support vector regression (SVR) is studied in this paper. Given l training samples, SVR is formulated as a convex quadratic programming problem with l pairs of variables. We prove that if two pairs of variables violating the optimality condition are chosen for update in each step and subproblems are solved in a certain way then the SMO algorithm always stops within a finite number of iterations after finding an optimal solution. Also, efficient implementation techniques for the SMO algorithm are presented and compared experimentally with other SMO algorithms.
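For reference, the convex quadratic program with l pairs of variables mentioned above is the standard epsilon-insensitive SVR dual, which in the usual notation reads

\min_{\alpha,\alpha^*} \; \frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,K(x_i,x_j) \;+\; \varepsilon\sum_{i=1}^{l}(\alpha_i+\alpha_i^*) \;-\; \sum_{i=1}^{l} y_i(\alpha_i-\alpha_i^*)

\text{s.t.}\quad \sum_{i=1}^{l}(\alpha_i-\alpha_i^*) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C, \quad i=1,\dots,l.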
We propose a refined gradient ascent method including heuristic parameters for solving the dual problem of nonlinear SVM. Aiming at a better tuning to the particular training sequence, the proposed refinement consists of using heuristically established weights to correct the search direction at each step of the learning algorithm, which evolves in the feature space. We propose three variants for computing the correcting weights, and their effectiveness is analyzed experimentally in the final part of the paper. The tests showed good convergence properties and, moreover, the proposed variants achieved higher convergence rates than Platt's SMO algorithm. The experimental analysis aimed to derive conclusions on the recognition rate as well as on the generalization capacities. The learning phase of the SVM involved linearly separable samples randomly generated from Gaussian distributions and the WINE and WDBC datasets. The generalization capacities in c...
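Purely as an illustration of a weighted correction of the search direction (the three weighting schemes proposed in the paper are not reproduced here), a projected gradient ascent on the SVM dual with a heuristic blending weight could look as follows; all names and the weight rule are hypothetical, and the handling of the equality constraint is omitted.

import numpy as np

def weighted_gradient_ascent(Q, C, steps=100, lr=1e-2, w=0.5):
    # Hypothetical sketch: ascend the dual objective e'a - 1/2 a'Qa with the
    # search direction corrected by a heuristic weight w on the previous
    # direction, then project back onto the box [0, C]^n.
    n = Q.shape[0]
    alpha = np.zeros(n)
    direction = np.zeros(n)
    for _ in range(steps):
        grad = 1.0 - Q @ alpha                 # gradient of the dual objective
        direction = grad + w * direction       # heuristically weighted correction
        alpha = np.clip(alpha + lr * direction, 0.0, C)
    return alpha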
2006
Decomposition methods are currently one of the major methods for training support vector machines. They vary mainly according to different working set selections. Existing implementations and analyses usually consider some specific selection rules. This paper studies sequential minimal optimization type decomposition methods under a general and flexible way of choosing the two-element working set.
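In the standard notation for these analyses, with I_up(\alpha) and I_low(\alpha) the usual index sets of multipliers that can still move up or down along the equality constraint, optimality of the dual problem is characterized by

m(\alpha) \;=\; \max_{i \in I_{up}(\alpha)} -y_i \nabla f(\alpha)_i \;\le\; M(\alpha) \;=\; \min_{j \in I_{low}(\alpha)} -y_j \nabla f(\alpha)_j ,

and any two-element working set \{i, j\} with i \in I_{up}(\alpha), j \in I_{low}(\alpha) and -y_i \nabla f(\alpha)_i > -y_j \nabla f(\alpha)_j is a violating pair eligible for selection.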
ArXiv, 2021
Typically, nonlinear Support Vector Machines (SVMs) produce significantly higher classification quality than linear ones but, at the same time, their computational complexity is prohibitive for large-scale datasets: this drawback is essentially related to the need to store and manipulate large, dense and unstructured kernel matrices. Despite the fact that at the core of training an SVM there is a simple convex optimization problem, the presence of kernel matrices is responsible for a dramatic performance reduction, making SVMs unworkably slow for large problems. Aiming at an efficient solution of large-scale nonlinear SVM problems, we propose the use of the Alternating Direction Method of Multipliers coupled with Hierarchically Semi-Separable (HSS) kernel approximations. As shown in this work, the detailed analysis of the interaction among their algorithmic components unveils a particularly efficient framework and, indeed, the presented experimental results demonstrate ...
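For context, the generic scaled-form ADMM iteration on which such approaches build is, for a splitting \min_x f(x) + g(z) subject to Ax + Bz = c,

x^{k+1} = \arg\min_x \; f(x) + \tfrac{\rho}{2}\|Ax + Bz^k - c + u^k\|^2,
z^{k+1} = \arg\min_z \; g(z) + \tfrac{\rho}{2}\|Ax^{k+1} + Bz - c + u^k\|^2,
u^{k+1} = u^k + Ax^{k+1} + Bz^{k+1} - c,

where, in this setting, the HSS kernel approximation serves to make the linear algebra inside these subproblems tractable; the specific splitting used in the paper is not reproduced here.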
International Journal of Computer Applications, 2010
Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Despite the fact that this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. Here we propose an algorithm which aims at reducing the learning time. This algorithm is based on the decomposition method proposed by Osuna for optimizing SVMs: it divides the original optimization problem into subproblems that the machine can handle in terms of CPU time and memory storage. The obtained solution is in practice more parsimonious than that found by Osuna's approach in terms of learning time, while offering similar performance.
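A generic decomposition loop of this kind can be sketched as follows; this is an illustrative outline, not the specific algorithm of the paper, and select_working_set and solve_subproblem are placeholders.

import numpy as np

def decomposition_training(K, y, C, select_working_set, solve_subproblem, tol=1e-3):
    # Generic Osuna-style decomposition: repeatedly optimize a small working set B
    # of the dual 1/2 a'Qa - e'a while the remaining multipliers stay fixed.
    n = len(y)
    Q = (y[:, None] * y[None, :]) * K
    alpha = np.zeros(n)
    grad = -np.ones(n)                               # gradient of the dual at alpha = 0
    while True:
        B, violation = select_working_set(alpha, grad, y, C)
        if violation < tol:                          # approximate KKT conditions hold
            break
        old = alpha[B].copy()
        alpha[B] = solve_subproblem(alpha, grad, B, Q, y, C)   # small QP over B only
        grad += Q[:, B] @ (alpha[B] - old)           # incremental gradient update
    return alpha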
2007 European Conference on Power Electronics and Applications, 2007
The authors would like to acknowledge the partial financial support of the Junta de Castilla y León under grants VA004B06 and VA021B06.
2008 IEEE International Conference on Systems, Man and Cybernetics, 2008
Maximizing the classification performance on the training data is a typical procedure in training a classifier. It is well known that training a Support Vector Machine (SVM) requires the solution of an enormous quadratic programming (QP) optimization problem, and the serious challenges posed by such large-scale training can be addressed using Sequential Minimal Optimization (SMO). This paper investigates the performance of the SMO solver in terms of CPU time, number of support vectors and decision boundaries when applied to 2-dimensional datasets. Next, the chunking algorithm is employed for comparison purposes. Initial results demonstrated that the SMO algorithm could enhance performance on the training dataset. Both algorithms yielded similar patterns in the decision boundaries attained, and the classification rates achieved by both solvers are superb.
Lecture Notes in Computer Science, 2008
We propose a new algorithm for training a linear Support Vector Machine in the primal. The algorithm mixes ideas from nonsmooth optimization, subgradient methods, and cutting-plane methods. This yields a fast algorithm that compares well to state-of-the-art algorithms. It is proved to require O(1/λε) iterations to converge to a solution with accuracy ε. Additionally, we provide an exact shrinking method in the primal that allows reducing the complexity of an iteration to much less than O(N), where N is the number of training samples.
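For orientation only, a plain stochastic subgradient step on the regularized primal hinge loss is sketched below; the cutting-plane component and the exact primal shrinking step of the paper are not shown, and all names are illustrative.

import numpy as np

def primal_subgradient_svm(X, y, lam, epochs=10, seed=0):
    # (Sub)gradient descent on P(w) = lam/2 ||w||^2 + (1/N) sum_i max(0, 1 - y_i <w, x_i>).
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.zeros(d)
    for t in range(1, epochs * N + 1):
        i = rng.integers(N)
        eta = 1.0 / (lam * t)                  # standard diminishing step size
        margin = y[i] * X[i].dot(w)            # margin under the current iterate
        w *= 1.0 - eta * lam                   # gradient of the regularization term
        if margin < 1.0:
            w += eta * y[i] * X[i]             # subgradient of the active hinge term
    return w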
Neurocomputing, 2009
Fast SVM training is an important goal for which many proposals have been given in the literature. In this work we will study from a geometrical point of view the presence, in both the Mitchell-Demyanov-Malozemov (MDM) algorithm and Platt's Sequential Minimal Optimization, of training cycles, that is, the repeated selection of some concrete updating patterns. We shall see how to take advantage of these cycles by partially collapsing them in a single updating vector that gives better minimizing directions. We shall numerically illustrate the resulting procedure, showing that it can lead to substantial savings in the number of iterations and kernel operations for both algorithms.
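A minimal sketch of how such a cycle could be detected in the sequence of selected pairs is given below; the helper name is hypothetical, and the actual collapsing of the cycle into a single updating vector (summing the corresponding directions and taking one exact clipped step along the result) is not shown.

def detect_update_cycle(pair_history, max_len=8):
    # Look for a block of recently selected index pairs (i, j) that repeats
    # immediately before itself, i.e. a training "cycle" as described above.
    h = list(pair_history)
    for length in range(2, min(max_len, len(h) // 2) + 1):
        if h[-length:] == h[-2 * length:-length]:
            return h[-length:]        # the repeating block of pairs
    return None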
