ZüKoSt: Seminar on Applied Statistics
Time/Place: every Thursday at 4.15 pm at the Main Building of ETH, HG G 19.1
Spring Semester 2015
Date 
Speaker 
Title 
Time 
Location 
5 Mar 2015 (Thu)
Maria-Pia Victoria-Feser

Robust Generalised Method of Wavelet Moments

16:15-17:00
HG G 19.1 
Abstract: 
The estimation of complex time-series or state-space models via maximum likelihood can often be extremely complicated and burdensome. In addition, the existing estimation procedures can become highly biased if the true process is characterized by contamination which is unrelated to the process itself. Recently, however, Guerrier et al. (2013) proposed a new methodology which employs the Wavelet Variance (WV), a measure which quantifies the amount of variation present in each of the sub-processes resulting from a wavelet decomposition. This methodology, called the Generalized Method of Wavelet Moments (GMWM), takes advantage of the unique matching that exists between the WV and a stochastic process P_θ: it estimates the parameters θ by minimizing the distance between the observed WV and that implied by the model P_θ. Moreover, the GMWM is often the only viable method to estimate the parameters of processes which are composed of an ensemble of underlying processes that operate at different scales (hereinafter composite processes).
Nonetheless, many of the domains in which the GMWM can be employed suffer from different sources of data contamination which can severely bias the parameter estimation process. It is therefore necessary to employ robust estimation methods which are able to limit the bias under different contamination settings. By using a robust estimator for the WV based on Huber's Proposal 2 or the approach proposed by Mondal and Percival (2011), it is possible to deliver a robust version of the GMWM (RGMWM) which provides a method to robustly estimate simple time series models as well as complex state-space models or composite processes.
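As a rough illustration of the moment-matching idea in this abstract (not the speaker's implementation), the Python sketch below estimates the Haar wavelet variance of a simulated series and fits a composite process made of white noise plus a random walk. Several things here are assumptions that do not come from the abstract: the Haar filter, the example model, the closed-form WV expressions σ²/τ_j and γ²(τ_j² + 2)/(12 τ_j) with τ_j = 2^j, the simple diagonal weighting used in place of the GMWM weighting matrix, and all function names. The robust WV estimator behind the RGMWM is not shown.

```python
import numpy as np


def haar_wavelet_variance(x, n_levels):
    """Empirical Haar wavelet variance at scales tau_j = 2**j, j = 1..n_levels."""
    x = np.asarray(x, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    wv = []
    for j in range(1, n_levels + 1):
        tau, half = 2 ** j, 2 ** (j - 1)
        # Each coefficient is (sum of newer half - sum of older half) / tau
        # over a sliding window of length tau.
        newer = csum[tau:] - csum[half:-half]
        older = csum[half:-half] - csum[:-tau]
        coeffs = (newer - older) / tau
        wv.append(np.mean(coeffs ** 2))
    return np.array(wv)


def model_wv_basis(n_levels):
    """Model-implied Haar WV per unit sigma2 (white noise) and gamma2 (random walk)."""
    tau = 2.0 ** np.arange(1, n_levels + 1)
    white_noise = 1.0 / tau
    random_walk = (tau ** 2 + 2.0) / (12.0 * tau)
    return np.column_stack([white_noise, random_walk])


def fit_wn_plus_rw(x, n_levels=8):
    """Match empirical and model WV by weighted least squares.

    The weights 1 / wv_hat (i.e. relative errors) are an assumed, simplified
    stand-in for the GMWM weighting matrix. The fit is unconstrained, so
    estimates are not forced to be positive.
    """
    wv_hat = haar_wavelet_variance(x, n_levels)
    basis = model_wv_basis(n_levels)
    weights = 1.0 / wv_hat
    theta, *_ = np.linalg.lstsq(basis * weights[:, None], wv_hat * weights, rcond=None)
    return theta  # (sigma2_hat, gamma2_hat)


# Simulated example: white noise (variance 4) plus random walk (innovation variance 0.01).
rng = np.random.default_rng(0)
n = 50_000
series = rng.normal(0.0, 2.0, n) + np.cumsum(rng.normal(0.0, 0.1, n))
sigma2_hat, gamma2_hat = fit_wn_plus_rw(series)
print(f"sigma2_hat = {sigma2_hat:.3f}, gamma2_hat = {gamma2_hat:.4f}")
```

Because the model WV is linear in (σ², γ²) for this particular example, minimising the distance reduces to weighted least squares; a general model P_θ would need a numerical optimiser instead.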
References
S. Guerrier, Y. Stebler, J. Skaloud, and M.-P. Victoria-Feser. Wavelet-variance-based estimation for composite stochastic processes. Journal of the American Statistical Association, 2013, 108 (503): 1021-1030.
D. Mondal and D.B. Percival. M-estimation of wavelet variance. Annals of the Institute of Statistical Mathematics, February 2012, Volume 64, pp. 27-53.

Speakers: 
Maria-Pia Victoria-Feser
(Research Center for Statistics, Université de Genève)


15 Apr 2015 (Wed)
Friedrich Leisch

Flexible Implementation of Resampling Schemes for Cluster Validation

16:15-17:00
HG G 19.1 
Abstract: 
Model diagnostics for cluster analysis are still a developing field because of the exploratory nature of clustering. Numerous indices have been proposed in the literature to evaluate goodness-of-fit, but no clear winner that works in all situations has been found yet. Derivation of (asymptotic) distribution properties is not possible in most cases. Resampling schemes provide an elegant framework to computationally derive the distribution of interesting quantities describing the quality of a partition. Special emphasis will be given to the stability of a partition, i.e., given a new sample from the same population, how likely is it to obtain a similar clustering? This framework has been implemented in R with automatic support for parallel processing on multiple cores or compute clusters. An example from market segmentation is used to illustrate the procedures.
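The stability question posed in this abstract can be sketched in a few lines. The example below is a minimal Python illustration, not the R implementation mentioned above: it draws pairs of bootstrap samples, clusters each with a plain k-means, uses both solutions to partition the original data, and records the adjusted Rand index between the two partitions. The choice of k-means, the adjusted Rand index as the similarity measure, the simulated data and all function names are assumptions made for illustration.

```python
import numpy as np


def assign(x, centers):
    """Label each row of x with the index of its nearest centre."""
    dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def kmeans(x, k, rng, n_iter=50):
    """Plain Lloyd's algorithm with a single random start; returns k centres."""
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centers)
        for c in range(k):
            members = x[labels == c]
            if len(members):  # keep the old centre if a cluster becomes empty
                centers[c] = members.mean(axis=0)
    return centers


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same observations."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1)

    def pairs(m):
        return m * (m - 1) / 2.0

    sum_cells = pairs(table).sum()
    sum_rows = pairs(table.sum(axis=1)).sum()
    sum_cols = pairs(table.sum(axis=0)).sum()
    expected = sum_rows * sum_cols / pairs(len(ai))
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # e.g. both labelings have a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def bootstrap_stability(x, k, n_pairs=20, seed=0):
    """Adjusted Rand index for n_pairs pairs of bootstrap-based partitions of x."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_pairs):
        partitions = []
        for _ in range(2):
            sample = x[rng.integers(0, len(x), size=len(x))]
            partitions.append(assign(x, kmeans(sample, k, rng)))
        scores.append(adjusted_rand_index(*partitions))
    return np.array(scores)


# Simulated data with two well-separated groups.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(10.0, 1.0, (200, 2))])
for k in (2, 4):
    print(k, bootstrap_stability(data, k).mean())
```

Each pair of bootstrap samples is independent of the others, which is what makes this kind of scheme easy to run in parallel.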

Speakers: 
Friedrich Leisch
(Universität Wien)


23 Apr 2015 (Thu)
Diego Kuonen

A Statistician's 'Big Tent' View on Big Data and Data Science

16:15-17:00
HG G 19.1 
Abstract: 
There is no question that big data have hit the business, government
and scientific sectors. The demand for skills in data science is
unprecedented in sectors where value, competitiveness and efficiency
are driven by data. However, there is plenty of misleading hype
around the terms 'big data' and 'data science'. This presentation
gives a professional statistician's 'big tent' view on these terms,
illustrates the connection between data science and statistics, and
highlights some challenges and opportunities from a statistical
perspective. 
Speakers: 
Diego Kuonen
(Statoo Consulting, Bern)


25 Jun 2015 (Thu)
Hadley Wickham

Pure, predictable, pipeable: creating fluent interfaces with R.

16:15-17:00
HG E 1.1 
Abstract: 
A fluent interface lets you easily express yourself in code. Over time a fluent interface retreats to your subconscious. You don't need to bring it to mind; the code just flows out of your fingers. I strive for this fluency in all the packages I write, and while I don't always succeed, I think I've learned some valuable lessons along the way.
In this talk, I'll discuss three guidelines that make it easier to develop fluent interfaces:
* __Pure functions__. A pure function only interacts with the world
through its inputs and outputs; it has no side-effects. Pure
functions make great building blocks because they're easy to
reason about and can be easily composed.
* __Predictable interfaces__. It's easier to learn a function if it's
consistent, because you can learn the behaviour of a whole group of
functions at once. I'll highlight the benefits of predictability with
some of my favourite R "WAT"s (including `c()`, `sapply()` and
`sample()`).
* __Pipes__. Pure predictable functions are nice in isolation but
are most powerful in combination. The pipe, `%>%`, is particularly
important when combining many functions because it turns function
composition on its head so you can read it from left-to-right. I'll
show you how this has helped me build dplyr, rvest, ggvis, lowliner,
stringr and more.
This talk will help you make best use of my recent packages, and teach you how to apply the same principles to make your own code easier to use.
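The talk itself concerns R and the `%>%` pipe. The short sketch below only illustrates the general idea of pure functions combined left-to-right, written in Python so that it stands alone; `pipe`, `drop_missing` and `square_all` are hypothetical helper names and are not part of any of the packages mentioned above.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through steps from left to right: pipe(x, f, g) == g(f(x))."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Pure functions: the result depends only on the argument,
# and the argument itself is never modified.
def drop_missing(xs):
    return [x for x in xs if x is not None]


def square_all(xs):
    return [x * x for x in xs]


raw = [3, None, 1, 2]

nested = sum(square_all(drop_missing(raw)))       # read from the inside out
piped = pipe(raw, drop_missing, square_all, sum)  # read from left to right

print(nested, piped)
```

Both expressions compute the same value; the piped form lists the steps in the order in which they are applied.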

Speakers: 
Hadley Wickham
(Rice University)


Further information: sekretariat@stat.math.ethz.ch
Mailing list: Would you like to receive notice of these presentations via email? Register here