Publication Detail
Density Estimation in Infinite Dimensional Exponential Families

Publication Type:Journal article

Authors:Sriperumbudur B, Fukumizu K, Gretton A, Hyvärinen A, Kumar R

Publication date:12/12/2013

Keywords:math.ST, stat.ME, stat.ML, stat.TH

Notes:58 pages, 8 figures; Fixed some errors and typos
Abstract
In this paper, we consider an infinite dimensional exponential family
$\mathcal{P}$ of probability densities, parametrized by functions in
a reproducing kernel Hilbert space $H$, and show it to be quite rich in the
sense that a broad class of densities on $\mathbb{R}^d$ can be approximated
arbitrarily well in Kullback-Leibler (KL) divergence by elements in
$\mathcal{P}$. The main goal of the paper is to estimate an unknown density
$p_0$ through an element in $\mathcal{P}$. Standard techniques like maximum
likelihood estimation (MLE) or pseudo MLE (based on the method of sieves),
which are based on minimizing the KL divergence between $p_0$ and
$\mathcal{P}$, do not yield practically useful estimators because of their
inability to efficiently handle the log-partition function. Instead, we propose
an estimator $\hat{p}_n$ based on minimizing the \emph{Fisher divergence}
$J(p_0\Vert p)$ between $p_0$ and $p\in \mathcal{P}$, which involves solving a
simple finite-dimensional linear system. When $p_0\in\mathcal{P}$, we show that
the proposed estimator is consistent, and provide a convergence rate of
$n^{-\min\left\{\frac{2}{3},\frac{2\beta+1}{2\beta+2}\right\}}$ in Fisher
divergence under the smoothness assumption that $\log
p_0\in\mathcal{R}(C^\beta)$ for some $\beta\ge 0$, where $C$ is a certain
Hilbert-Schmidt operator on $H$ and $\mathcal{R}(C^\beta)$ denotes the image of
$C^\beta$. We also investigate the misspecified case of $p_0\notin\mathcal{P}$
and show that $J(p_0\Vert\hat{p}_n)\rightarrow \inf_{p\in\mathcal{P}}J(p_0\Vert
p)$ as $n\rightarrow\infty$, and provide a rate for this convergence under a
smoothness condition similar to the one above. Through numerical simulations, we
demonstrate that the proposed estimator outperforms the nonparametric kernel
density estimator, and that the advantage of the proposed estimator grows as
$d$ increases.
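
For reference, a short sketch of the quantity being minimized. It assumes the form of the Fisher divergence that is standard in the score-matching literature; the abstract itself does not spell it out:

$$J(p_0\Vert p)=\frac{1}{2}\int p_0(x)\,\left\Vert \nabla_x\log p_0(x)-\nabla_x\log p(x)\right\Vert_2^2\,dx.$$

If each element of $\mathcal{P}$ is written as $p_f(x)=e^{f(x)-A(f)}\,q_0(x)$ with $f\in H$, a base density $q_0$, and log-partition function $A(f)$ (notation assumed here for illustration), then

$$\nabla_x\log p_f(x)=\nabla_x f(x)+\nabla_x\log q_0(x),$$

which does not involve $A(f)$. Under these assumptions, minimizing $J$ sidesteps the log-partition function that makes MLE impractical.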