Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to
your Research Finance Administrator. Your can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department
Please report any queries concerning the student data shown on the profile page to:
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
Efficient Aggregated Kernel Tests using Incomplete U-statistics
-
Publication Type:Working discussion paper
-
Authors:Schrab A, Kim I, Guedj B, Gretton A
-
Publication date:18/06/2022
-
Status:Published
-
Language:English
-
Keyword:stat.ML, stat.ML, cs.LG, math.ST, stat.ME, stat.TH
-
Notes:31 pages
Abstract
We propose a series of computationally efficient, nonparametric tests for the
two-sample, independence and goodness-of-fit problems, using the Maximum Mean
Discrepancy (MMD), Hilbert Schmidt Independence Criterion (HSIC), and Kernel
Stein Discrepancy (KSD), respectively. Our test statistics are incomplete
$U$-statistics, with a computational cost that interpolates between linear time
in the number of samples, and quadratic time, as associated with classical
$U$-statistic tests. The three proposed tests aggregate over several kernel
bandwidths to detect departures from the null on various scales: we call the
resulting tests MMDAggInc, HSICAggInc and KSDAggInc. For the test thresholds,
we derive a quantile bound for wild bootstrapped incomplete $U$- statistics,
which is of independent interest. We derive uniform separation rates for
MMDAggInc and HSICAggInc, and quantify exactly the trade-off between
computational efficiency and the attainable rates: this result is novel for
tests based on incomplete $U$-statistics, to our knowledge. We further show
that in the quadratic-time case, the wild bootstrap incurs no penalty to test
power over more widespread permutation-based approaches, since both attain the
same minimax optimal rates (which in turn match the rates that use oracle
quantiles). We support our claims with numerical experiments on the trade-off
between computational efficiency and test power. In the three testing
frameworks, we observe that our proposed linear-time aggregated tests obtain
higher power than current state-of-the-art linear-time kernel tests.
› More search options
UCL Researchers