High-Dimensional Variable Screening and Dimension Reduction and Their Applications To Classification

Mr Cai Zhibo, Department of Statistics and Applied Probability, NUS

Date: 8 April 2021, Thursday

Location: Zoom: https://nus-sg.zoom.us/j/86711359893?pwd=YW1TNHNGZFFoa0JQeHd5N1Q4SFA0Zz09

Time: 10:00am - 11:00am, Singapore time

PhD Oral Presentation

Feature selection and dimension reduction are the most popular approaches to analysing high-dimensional data. Sure independence screening (SIS) provides an efficient and fast ranking of variables' importance in ultra-high-dimensional regressions. However, SIS can also place falsely important variables near the top of the rankings. To mitigate this problem, we propose a new approach that consecutively partitions the sample into subsets, yielding a tree structure of sub-samples, called SIS-tree hereafter. Final importance rankings are obtained by averaging the dependence measures over these sub-samples. SIS-tree can be combined with any measure of dependence, provided it meets some basic requirements. A cut-off for the rankings is also studied. As a direct application of the screening, we consider the classification of high-dimensional data.
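The SIS-tree idea above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual construction: absolute Pearson correlation stands in for the generic dependence measure, the sample is split into consecutive halves to a fixed depth, and the function name and depth parameter are illustrative choices.

```python
import numpy as np

def sis_tree_ranking(X, y, depth=2):
    """Rank predictors by an SIS-tree-style averaged dependence measure.

    The sample is consecutively partitioned into halves, giving a tree
    of sub-samples (the root is the full sample).  A marginal dependence
    measure between each predictor and y is computed in every sub-sample
    and averaged; variables are then ranked by the averaged measure.
    """
    n, p = X.shape
    # Collect index sets for every node of the partition tree.
    subsets = [np.arange(n)]
    frontier = [np.arange(n)]
    for _ in range(depth):
        next_frontier = []
        for idx in frontier:
            half = len(idx) // 2
            next_frontier.extend([idx[:half], idx[half:]])
        subsets.extend(next_frontier)
        frontier = next_frontier

    # Average the dependence measure (here: |Pearson correlation|)
    # over all sub-samples in the tree.
    scores = np.zeros(p)
    for idx in subsets:
        Xs, ys = X[idx], y[idx]
        Xc = Xs - Xs.mean(axis=0)
        yc = ys - ys.mean()
        num = Xc.T @ yc
        den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
        scores += np.abs(num / den)
    scores /= len(subsets)

    return np.argsort(scores)[::-1]  # indices, most important first
```

For example, with `y` driven mainly by the first column of `X`, the ranking should place index 0 at the front even when many noise predictors are present.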
Unlike variable selection, sufficient dimension reduction (SDR) selects important linear combinations of predictors to reduce the dimension of the data. In the second part of the thesis, we extend the method and theory of the outer product of gradients (OPG), one of the most efficient SDR methods, to high-dimensional data, allowing the dimension to diverge to infinity with the sample size. We call this method high-dimensional OPG (HOPG). As an application in high-dimensional space, we propose an ensemble classifier that aggregates the results of classifiers built on subspaces obtained by applying random projection and then HOPG consecutively to the original data.
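The random-projection ensemble described above can be sketched in miniature. This is an assumed, simplified stand-in: each ensemble member projects the data onto a random Gaussian subspace and fits a nearest-centroid base classifier there, with predictions combined by majority vote. The HOPG reduction step that the thesis applies within each projected subspace is omitted, and all names and parameters are illustrative.

```python
import numpy as np

def rp_ensemble_predict(X_train, y_train, X_test, n_members=20, k=5, seed=0):
    """Majority-vote ensemble of classifiers on random projections.

    Each member draws a Gaussian random projection of the predictors to
    k dimensions and fits a nearest-centroid classifier in the projected
    space; test points are classified by majority vote across members.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)), dtype=int)
    p = X_train.shape[1]

    for _ in range(n_members):
        # Random projection to a k-dimensional subspace.
        R = rng.normal(size=(p, k)) / np.sqrt(k)
        Ztr, Zte = X_train @ R, X_test @ R

        # Nearest-centroid base classifier in the projected space.
        centroids = np.stack([Ztr[y_train == c].mean(axis=0) for c in classes])
        dist = ((Zte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        pred = dist.argmin(axis=1)
        np.add.at(votes, (np.arange(len(X_test)), pred), 1)

    # Aggregate by majority vote.
    return classes[votes.argmax(axis=1)]
```

On two well-separated Gaussian classes in a moderately high-dimensional space, the aggregated vote is far more stable than any single randomly projected member.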