A Quasi-Newton Approach to Nonsmooth Convex Optimization
J. Yu, S. V. N. Vishwanathan, S. Günter, and N. N. Schraudolph.
A Quasi-Newton Approach to Nonsmooth Convex Optimization. In Proc. 25th
Intl. Conf. Machine Learning (ICML), pp. 1216–1223, Omnipress, Helsinki,
Finland, 2008.
Long version
Abstract
We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: The local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We apply the resulting subLBFGS algorithm to $L_2$-regularized risk minimization with binary hinge loss, and its direction-finding component to $L_1$-regularized risk minimization with logistic loss. In both settings our generic algorithms perform comparably to or better than their counterparts in specialized state-of-the-art solvers.
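To give a concrete feel for the setting, here is a minimal NumPy sketch (not from the paper) that feeds one arbitrary subgradient of the $L_2$-regularized hinge loss into the standard LBFGS two-loop recursion. It deliberately omits the paper's actual contributions: subLBFGS solves a direction-finding problem over the full subdifferential rather than picking a single subgradient, and it replaces the fixed step size below with generalized Wolfe line search conditions. All names and constants here are illustrative.

import numpy as np

def hinge_risk_subgradient(w, X, y, lam):
    # Subgradient of J(w) = lam/2 ||w||^2 + (1/n) sum_i max(0, 1 - y_i <w, x_i>).
    # At the hinge (margin exactly 1) we arbitrarily pick the zero element of
    # the subdifferential; subLBFGS instead optimizes over this choice.
    margins = y * (X @ w)
    active = margins < 1.0
    return lam * w - (X[active] * y[active, None]).sum(axis=0) / len(y)

def lbfgs_direction(g, s_hist, y_hist):
    # Standard LBFGS two-loop recursion: returns -H g, where H is the
    # implicit inverse-Hessian estimate built from the stored (s, y) pairs.
    q = g.copy()
    rhos = [1.0 / (yk @ sk) for sk, yk in zip(s_hist, y_hist)]
    alphas = []
    for sk, yk, rho in zip(reversed(s_hist), reversed(y_hist), reversed(rhos)):
        a = rho * (sk @ q)
        alphas.append(a)
        q -= a * yk
    if s_hist:  # scale by gamma = s'y / y'y (initial inverse-Hessian guess)
        q *= (s_hist[-1] @ y_hist[-1]) / (y_hist[-1] @ y_hist[-1])
    for (sk, yk, rho), a in zip(zip(s_hist, y_hist, rhos), reversed(alphas)):
        q += (a - rho * (yk @ q)) * sk
    return -q

# Toy usage: 20 quasi-Newton steps on random linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]))
w = np.zeros(3)
s_hist, y_hist = [], []
g = hinge_risk_subgradient(w, X, y, lam=0.1)
for _ in range(20):
    d = lbfgs_direction(g, s_hist, y_hist)
    w_new = w + 0.5 * d  # fixed step; the paper uses a subgradient Wolfe line search
    g_new = hinge_risk_subgradient(w_new, X, y, lam=0.1)
    if (g_new - g) @ (w_new - w) > 1e-10:  # keep pair only if curvature holds
        s_hist.append(w_new - w)
        y_hist.append(g_new - g)
        s_hist, y_hist = s_hist[-10:], y_hist[-10:]  # memory of 10 pairs
    w, g = w_new, g_new

Substituting the paper's subdifferential-aware direction finding and line search into this skeleton yields the subLBFGS method evaluated in the experiments.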
BibTeX Entry
@inproceedings{YuVisGueSch08,
author = {Jin Yu and S.~V.~N. Vishwanathan and
Simon G\"unter and Nicol N. Schraudolph},
title = {\href{http://nic.schraudolph.org/pubs/YuVisGueSch08.pdf}{
A Quasi-{N}ewton Approach to Nonsmooth Convex Optimization}},
pages = {1216--1223},
editor = {Andrew McCallum and Sam Roweis},
booktitle = {Proc.\ 25$^{th}$ Intl.\ Conf.\ Machine Learning (ICML)},
address = {Helsinki, Finland},
publisher = {Omnipress},
year = 2008,
b2h_type = {Top Conferences},
b2h_topic = {>Quasi-Newton Methods},
b2h_note = {<a href="b2hd-YuVisGueSch10.html">Long version</a>},
abstract = {
We extend the well-known BFGS quasi-Newton method and its
limited-memory variant LBFGS to the optimization of nonsmooth
convex objectives. This is done in a rigorous fashion by
generalizing three components of BFGS to subdifferentials: The
local quadratic model, the identification of a descent direction,
and the Wolfe line search conditions. We apply the resulting
subLBFGS algorithm to $L_{2}$-regularized risk minimization
with binary hinge loss, and its direction-finding component to
$L_{1}$-regularized risk minimization with logistic loss. In
both settings our generic algorithms perform comparably to or
better than their counterparts in specialized state-of-the-art
solvers.
}}