Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to your Research Finance Administrator. You can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department.
Please report any queries concerning the student data shown on the profile page to:
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
Vector-valued Distribution Regression - Keep It Simple and Consistent
Publication Type: Conference presentation
Publication Sub Type: Presentation
Authors: Szabo Z, Sriperumbudur B, Poczos B, Gretton A
Date: 01/05/2015
Name of Conference: CSML reading group, Department of Statistics, University of Oxford
Conference place: Oxford, UK
Conference start date: 01/05/2015
Conference finish date: 01/05/2015
Notes: Code: "https://bitbucket.org/szzoli/ite/" Event: "https://sites.google.com/site/ocsmlrg/home", "https://sites.google.com/site/ocsmlrg/schedule"
Abstract
We tackle the distribution regression problem (DRP): regressing from probability measures to vector-valued outputs in the two-stage sampled case, where the input distributions are only available through samples. Numerous important and challenging machine learning and statistical tasks fit into the studied problem family, such as multi-instance learning or point estimation tasks. Although there is a vast number of heuristics in the literature to address the DRP, to the best of our knowledge the only existing technique with performance guarantees requires density estimation (which often scales poorly in practice) and requires the distributions to have densities on compact Euclidean domains. In my talk, I am going to present a simple alternative to solve the DRP by embedding the input distributions into a reproducing kernel Hilbert space, followed by ridge regression from the embeddings to the outputs. We prove that the proposed approach is consistent: we derive finite-sample excess risk bounds which hold with high probability and establish explicit convergence rates as a function of the problem difficulty and sample numbers. Specifically, we justify the applicability of set kernels in regression, which was a 15-year-old open question, and construct alternative kernels on the embedded distributions. The studied scheme is viable under mild conditions, on separable topological domains endowed with kernels. We demonstrate the efficiency of the method in two applications, supervised entropy learning and aerosol optical depth prediction based on multispectral satellite images.
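The two-stage approach the abstract describes — embed each sample bag via its empirical kernel mean embedding, then run ridge regression on the embeddings — can be sketched as follows. This is a minimal illustration only: the Gaussian kernel, the bandwidth and regularization values, and the toy mean-estimation task are my assumptions, not the experiments or exact estimator from the talk.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian kernel values between the rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def set_kernel(bag_a, bag_b, sigma=1.0):
    # Inner product of the empirical mean embeddings of two bags:
    # K(P, Q) ~ mean over x in bag_a, y in bag_b of k(x, y).
    return gaussian_kernel(bag_a, bag_b, sigma).mean()

def fit_merr(bags, y, lam=1e-4, sigma=0.5):
    # Kernel ridge regression over the embedded bags:
    # alpha = (G + n*lam*I)^{-1} y, with G the set-kernel Gram matrix.
    n = len(bags)
    G = np.array([[set_kernel(bi, bj, sigma) for bj in bags] for bi in bags])
    alpha = np.linalg.solve(G + n * lam * np.eye(n), y)
    return alpha

def predict_merr(alpha, train_bags, test_bag, sigma=0.5):
    k = np.array([set_kernel(test_bag, b, sigma) for b in train_bags])
    return k @ alpha

# Toy usage (hypothetical task): each bag is sampled from a 1-D Gaussian,
# and the regression target is that Gaussian's mean.
rng = np.random.default_rng(0)
means = rng.uniform(-1.0, 1.0, size=20)
bags = [rng.normal(m, 0.1, size=(30, 1)) for m in means]
alpha = fit_merr(bags, means)
pred = predict_merr(alpha, bags, bags[0])
```

Using the set kernel here is exactly the design choice the abstract highlights: regression on mean embeddings reduces each (arbitrarily sized) bag of samples to a single point in the RKHS, so a standard ridge-regression analysis applies without any density estimation.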