[Statlist] Reminder: Statistics Seminar, Thursday 23 February 2012

ISTAT Messagerie, sending from unine.ch
Thu Feb 16 10:27:41 CET 2012


STATISTICS SEMINAR

Institut de Statistique, Université de Neuchâtel, Pierre-à-Mazel 7, 2000 Neuchâtel, http://www2.unine.ch/statistics

THURSDAY 23 February 2012 at 11:00, room PAM 101, 1st floor.

Maria-Pia Victoria-Feser
University of Geneva

Robust VIF Regression

The sophisticated and automated means of data collection used by an increasing number of institutions and companies lead to extremely large data sets. Subset selection in regression is essential when a huge number of covariates can potentially explain a response variable of interest. The recent statistical literature has seen the emergence of new selection methods that offer a compromise between implementation (computational speed) and statistical optimality (e.g. prediction error minimization). Global methods such as Mallows' $C_p$ have been supplanted by sequential methods such as stepwise regression. More recently, streamwise regression, which is faster than stepwise regression, has emerged. A recently proposed streamwise regression approach based on the variance inflation factor (VIF) is promising, but its least-squares based implementation makes it susceptible to the outliers that are inevitable in such large data sets. This lack of robustness can lead to poor feature selection. This talk proposes a robust VIF regression, based on fast robust estimators, that inherits all the good properties of classical VIF regression in the absence of outliers and continues to perform well in their presence, where the classical approach fails. The analysis of two real data sets shows the necessity of a robust approach for policy makers.
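
For readers unfamiliar with the variance inflation factor mentioned in the abstract, the following is a minimal sketch of its textbook definition only, not of the streamwise or robust procedure presented in the talk. It assumes the simplest case of two covariates, where the VIF of one covariate reduces to 1 / (1 - r^2), with r the Pearson correlation between the two. The function names `pearson_r` and `vif_two_covariates` are hypothetical and chosen for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Classical (least-squares based) Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_covariates(x1, x2):
    """VIF of x1 given x2 (illustrative, two-covariate case only).

    With an intercept and a single other covariate, the R^2 from
    regressing x1 on x2 equals r^2, so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up toy data, for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_covariates(x1, x2)  # r = 0.8 here, so VIF = 1 / 0.36
```

A VIF of 1 means the two covariates are uncorrelated; larger values indicate stronger collinearity. Because the correlation above is computed from means and sums of squares, a single extreme observation can change it substantially, which is the sensitivity to outliers that the abstract refers to.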



