Breaking the Curse of Dimensionality with Convex Neural Networks

48 mins 4 secs,  698.62 MB,  MPEG-4 Video  640x360,  29.97 fps,  44100 Hz,  1.93 Mbits/sec
About this item
Description: Bach, F
Tuesday 31st October 2017 - 14:50 to 15:40
Created: 2017-11-03 13:39
Collection: Variational methods and effective algorithms for imaging and vision
Publisher: Isaac Newton Institute
Copyright: Bach, F
Language: eng (English)
Distribution: World     (downloadable)
Explicit content: No
Aspect Ratio: 16:9
Screencast: No
Bumper: UCS Default
Trailer: UCS Default
Abstract: We consider neural networks with a single hidden layer and non-decreasing positively homogeneous activation functions such as the rectified linear units. By letting the number of hidden units grow unbounded and using classical non-Euclidean regularization tools on the output weights, they lead to a convex optimization problem, and we provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors. We show in particular that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace. Moreover, when using sparsity-inducing norms on the input weights, we show that high-dimensional non-linear variable selection may be achieved, without any strong assumption regarding the data and with a total number of variables potentially exponential in the number of observations. However, solving this convex optimization problem in infinite dimensions is only possible if the non-convex subproblem of adding a new unit can be solved efficiently. We provide a simple geometric interpretation for our choice of activation functions and describe simple conditions for convex relaxations of the finite-dimensional non-convex subproblem to achieve the same generalization error bounds, even when constant-factor approximations cannot be found. We were not able to find strong enough convex relaxations to obtain provably polynomial-time algorithms and leave open the existence or non-existence of such tractable algorithms with non-exponential sample complexities.
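The key point of the abstract — that with the hidden units (input weights) fixed, regularizing the output weights with a non-Euclidean (here l1) norm gives a convex problem — can be illustrated numerically. The sketch below is not the talk's algorithm: it uses a fixed finite dictionary of random ReLU units and solves the resulting Lasso problem by proximal gradient (ISTA); all sizes, the target function, and the regularization strength are illustrative assumptions.

```python
import numpy as np

# With input weights W fixed, fitting the OUTPUT weights c of a
# single-hidden-layer ReLU network under an l1 penalty is a convex
# (Lasso) problem -- a finite-width analogue of the infinite-dimensional
# convex formulation described in the abstract.
rng = np.random.default_rng(0)
n, d, m = 200, 5, 100                    # samples, input dim, hidden units

X = rng.standard_normal((n, d))
y = np.maximum(X[:, 0] + X[:, 1], 0.0)   # target depends on a 2-d projection

# Random unit-norm input weights; ReLU is positively homogeneous of degree 1.
W = rng.standard_normal((d, m))
W /= np.linalg.norm(W, axis=0)
H = np.maximum(X @ W, 0.0)               # hidden-layer features, shape (n, m)

# ISTA (proximal gradient) for:  min_c (1/2n)||H c - y||^2 + lam * ||c||_1
lam = 0.05
L = np.linalg.norm(H, 2) ** 2 / n        # Lipschitz constant of the gradient
c = np.zeros(m)
for _ in range(500):
    grad = H.T @ (H @ c - y) / n
    z = c - grad / L
    c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold

active = int((np.abs(c) > 1e-8).sum())
mse = float(np.mean((H @ c - y) ** 2))
print("active units:", active, "of", m)
print("train MSE:", mse)
```

The l1 penalty selects a sparse subset of the random units, mirroring how the infinite-dimensional problem encourages a measure supported on few neurons; the hard part the talk addresses is the non-convex step of generating the best *new* unit, which this fixed-dictionary sketch sidesteps.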

Related links: http://jmlr.org/papers/volume18/14-546/14-546.pdf - JMLR paper
Available Formats
Format        Quality   Bitrate           Size
MPEG-4 Video  640x360   1.93 Mbits/sec    698.62 MB
WebM          640x360   516.36 kbits/sec  181.85 MB
iPod Video    480x270   522.21 kbits/sec  183.84 MB
MP3           44100 Hz  249.76 kbits/sec  88.02 MB
Auto (allows the browser to choose a format it supports)