Robust gradient-based Markov subsampling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Subsampling is a widely used and effective method for dealing with the challenges brought by big data. Most subsampling procedures are designed within the importance sampling framework, where samples with high importance measures are given correspondingly high sampling probabilities. However, in the highly noisy case, these samples may yield an unstable estimator, which could lead to misleading results. To tackle this issue, we propose a gradient-based Markov subsampling (GMS) algorithm to achieve robust estimation. The core idea is to construct a subset that allows us to conservatively correct a crude initial estimate towards the true signal. Specifically, GMS selects samples with small gradients via a probabilistic procedure, constructing a subset that is likely to exclude noisy samples and provide a safe improvement over the initial estimate. We show that the GMS estimator is statistically consistent at a rate that matches the minimax-optimal rate. The promising performance of GMS is supported by simulation studies and real data examples.
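
The selection idea in the abstract (start from a crude initial estimate, then prefer samples with small gradients through a probabilistic procedure) can be illustrated with a minimal sketch. The abstract does not give the acceptance rule, the loss, or the model, so everything specific here is an assumption made for illustration: a linear model with squared loss, a uniform proposal, a Metropolis-style acceptance probability min(1, g_current / g_candidate), and the hypothetical function name `gradient_markov_subsample`. It should not be read as the authors' GMS algorithm.

```python
import numpy as np


def gradient_markov_subsample(X, y, beta0, n_sub, rng):
    """Illustrative sketch only (not the published GMS procedure).

    Runs a Markov chain over sample indices and records accepted
    candidates, favouring samples whose gradient norm at the pilot
    estimate beta0 is small.
    """
    n = X.shape[0]
    # Gradient of 0.5 * (x @ beta - y)**2 w.r.t. beta is (x @ beta - y) * x,
    # so its norm is |residual| * ||x||. A small constant avoids division by zero.
    resid = X @ beta0 - y
    gnorm = np.abs(resid) * np.linalg.norm(X, axis=1) + 1e-12

    current = rng.integers(n)
    chosen = [current]
    while len(chosen) < n_sub:
        cand = rng.integers(n)  # uniform proposal (assumption)
        # Assumed acceptance rule: always accept a smaller gradient,
        # accept a larger one with probability g_current / g_candidate.
        if rng.random() < min(1.0, gnorm[current] / gnorm[cand]):
            current = cand
            chosen.append(cand)

    idx = np.array(chosen)
    # Refit by least squares on the selected subset.
    beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return idx, beta


# Synthetic data: about 20% of the responses carry heavy extra noise.
rng = np.random.default_rng(0)
n, d = 2000, 3
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.1 * rng.normal(size=n)
is_outlier = rng.random(n) < 0.2
y[is_outlier] += 10.0 * rng.normal(size=is_outlier.sum())

# Crude pilot estimate from a small uniform subsample.
pilot_idx = rng.choice(n, size=200, replace=False)
beta0, *_ = np.linalg.lstsq(X[pilot_idx], y[pilot_idx], rcond=None)

idx, beta_hat = gradient_markov_subsample(X, y, beta0, n_sub=300, rng=rng)
print("outlier share in full data:", is_outlier.mean())
print("outlier share in subsample:", is_outlier[idx].mean())
```

In this toy setting, heavily contaminated samples have large residuals at the pilot estimate and therefore large gradient norms, so the chain rarely accepts them. This is the opposite of importance-style schemes that up-weight high-gradient samples, which is the contrast the abstract draws.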

Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Pages: 4004-4011
Number of pages: 8
ISBN (Electronic): 9781577358350
State: Published - 2020
Externally published: Yes
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: 7 Feb 2020 to 12 Feb 2020

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York
Period: 7/02/20 to 12/02/20
