Data perturbation for data science
44 mins 41 secs, 81.76 MB, MP3, 44100 Hz, 249.81 kbits/sec
About this item
Description: | Samworth, R. Friday 29th June 2018 - 11:00 to 11:45 |
---|---|
Created: | 2018-06-29 17:05 |
Collection: | Statistical scalability |
Publisher: | Isaac Newton Institute |
Copyright: | Samworth, R |
Language: | eng (English) |
Distribution: | World (downloadable) |
Explicit content: | No |
Aspect Ratio: | 16:9 |
Screencast: | No |
Abstract: | When faced with a dataset and a problem of interest, should we propose a statistical model and use that to inform an appropriate algorithm, or dream up a potential algorithm and then seek to justify it? The former is the more traditional statistical approach, but the latter appears to be becoming more popular. I will discuss a class of algorithms that belong in the second category, namely those that involve data perturbation (e.g. subsampling, random projections, artificial noise, knockoffs, ...). As examples, I will consider Complementary Pairs Stability Selection for variable selection and sparse PCA via random projections. This will involve joint work with Rajen Shah, Milana Gataric and Tengyao Wang. |
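As a rough illustration of the subsampling idea named in the abstract, the following Python sketch applies a variable selector to complementary pairs of disjoint half-samples and keeps the variables that are chosen in a large fraction of them. It is not taken from the talk. The base selector `top_k_by_correlation`, the function name `cpss`, the simulated data and all parameter values (`k`, `n_pairs`, `threshold`) are assumptions made purely for illustration, and the threshold here is not calibrated to any error bound.

```python
import random
from statistics import fmean


def abs_correlation(xs, ys):
    """Absolute sample correlation between two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return abs(sxy) / ((sxx * syy) ** 0.5 + 1e-12)


def top_k_by_correlation(rows, y, k):
    """Hypothetical base selector: the k columns most correlated with y."""
    p = len(rows[0])
    scores = [abs_correlation([r[j] for r in rows], y) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:k]


def cpss(rows, y, k=5, n_pairs=50, threshold=0.9, seed=0):
    """Selection frequencies over complementary pairs of half-samples."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    half = n // 2
    counts = [0] * p
    for _ in range(n_pairs):
        perm = list(range(n))
        rng.shuffle(perm)
        # The two halves are disjoint: this is the "complementary pair".
        for idx in (perm[:half], perm[half:2 * half]):
            sub_rows = [rows[i] for i in idx]
            sub_y = [y[i] for i in idx]
            for j in top_k_by_correlation(sub_rows, sub_y, k):
                counts[j] += 1
    freq = [c / (2 * n_pairs) for c in counts]
    selected = [j for j in range(p) if freq[j] >= threshold]
    return selected, freq


# Simulated example (assumed setup): only columns 0 and 1 carry signal.
data_rng = random.Random(1)
n, p = 200, 20
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3 * row[0] - 3 * row[1] + data_rng.gauss(0, 1) for row in X]

selected, freq = cpss(X, y)
print("selected columns:", selected)
```

The design choice worth noting is that each random split contributes two selections from non-overlapping halves, so a variable's frequency is averaged over `2 * n_pairs` subsamples; any other base selector could be swapped in for `top_k_by_correlation`.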
Available Formats
Format | Quality | Bitrate | Size |
---|---|---|---|
MPEG-4 Video | 640x360 | 1.94 Mbits/sec | 650.26 MB |
WebM | 640x360 | 390.28 kbits/sec | 127.63 MB |
iPod Video | 480x270 | 522.14 kbits/sec | 170.75 MB |
MP3 | 44100 Hz | 249.81 kbits/sec | 81.76 MB |