Research Seminar
Time/Place: every Friday at 3.15 pm in the Main Building of ETH, HG G 19.1
Spring Semester 2013
| Date |
Speaker |
Title |
Time |
Location |
| 22-feb-2013 (fri) |
Thanh Mai Pham Ngoc
|
Goodness of fit tests for noisy directional data
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
In astrophysics, a burning issue is understanding the behaviour of the so-called Ultra High Energy Cosmic Rays (UHECR). These are cosmic rays with extreme kinetic energy and are the rarest particles in the universe. The source of these most energetic particles remains a mystery. Finding out more about the probability law of their incoming directions is crucial to gaining insight into the mechanisms generating the UHECR.
Astrophysicists have at their disposal directional data, which are measurements of the incoming directions of the UHECR on Earth. Unfortunately, their trajectories are deflected by Galactic and intergalactic fields. A first way to model the deflection of the incoming directions is the following model with random rotations: $Z_i = \varepsilon_i X_i$, $i = 1, \dots, N$. We define a nonparametric test procedure to distinguish $H_0$: the density $f$ of $X_i$ is the uniform density $f_0$ on the sphere, and $H_1$. We show that an adaptive procedure cannot have a faster rate of separation than $\psi_{ad}(s) = (N/\log\log N)^{-2s/(2s+2\nu+1)}$, and we provide a procedure which reaches this rate. We illustrate the theory by implementing our test procedure for various kinds of noise on SO(3) and by comparing it to other procedures.
Applications to real data in astrophysics and paleomagnetism are provided. |
| Speakers: |
Thanh Mai Pham Ngoc
(Université de Paris Sud Orsay)
|
|
| 1-mar-2013 (fri) |
Sébastien Loustau
|
Inverse Statistical Learning
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
We propose to consider the problem of statistical learning when we observe a contaminated sample. More precisely, we state fast rates of convergence in classification with errors in variables for deconvolution empirical risk minimizers. These rates depend on the ill-posedness, the margin and the complexity of the problem. The cornerstone of the proof is a bias-variance decomposition of the excess risk.
After a theoretical study of the problem, we turn to more practical considerations by presenting a new algorithm for noisy finite-dimensional clustering called noisy K-means.
|
| Speakers: |
Sébastien Loustau
(Université d'Angers, France)
|
|
| 8-mar-2013 (fri) |
Alexei Onatski
|
Asymptotic Analysis of the Squared Estimation Error in Misspecified Factor Models
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
In this paper, we obtain asymptotic approximations to the mean squared error of the least squares estimator of the common component in large approximate factor models with a possibly misspecified number of factors. The approximations are derived under both strong and weak factor asymptotics, assuming that the cross-sectional and temporal dimensions of the data are comparable. We develop consistent estimators of these approximations and propose to use them as new criteria for selecting the number of factors. We show that the estimators of the number of factors that minimize these criteria are asymptotically loss efficient in the sense of Shibata (1980), Li (1987), and Shao (1997). |
| Speakers: |
Alexei Onatski
(University of Cambridge)
|
|
| 27-mar-2013 (wed) |
Patrik Guggenberger
|
Subset inference in the linear IV model
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
In the linear instrumental variables model we are interested in testing a hypothesis on the coefficient of an exogenous variable when one right-hand-side endogenous variable is present. Under the assumption of conditional homoskedasticity but no restriction on the reduced form coefficient vector, we derive the asymptotic size of the subset Lagrange multiplier (LM) test and provide the nonrandom size-corrected (SC) critical value that ensures that the resulting SC subset LM test has correct asymptotic size. We introduce an easy-to-implement generalized moment selection plug-in SC (GMS-PSC) subset LM test that uses a data-dependent critical value. We compare the local power properties of the GMS-PSC subset LM and subset AR tests and also provide a Monte Carlo study that compares the finite-sample properties of the two tests. The GMS-PSC test is shown to have competitive power properties. |
| Speakers: |
Patrik Guggenberger
(University of California, San Diego)
|
|
| 2-apr-2013 (tue) |
Joris M. Mooij
|
Cyclic Causal Discovery from Equilibrium Data
|
15:15-16:15 |
HG G 19.2 |
| Abstract: |
Causal feedback loops play important roles in many biological systems. In the
absence of time series data, inferring the structure of cyclic causal systems
can be extremely challenging. An example of such a biological system is a
cellular signalling network that plays an important role in human immune system
cells (Sachs et al., Science 2005), consisting of several interacting proteins
and phospholipids. The protein concentration data measured by Sachs et al.
utilizing flow cytometry has been analyzed by different researchers in order to
evaluate various causal inference methods. Most of these methods only consider
acyclic causal structures, even though the data shows strong evidence that
feedback loops are present. In this talk I will propose a new method for cyclic
causal discovery from a combination of observational and interventional
equilibrium data. I will show that the method indeed finds evidence for feedback
loops in the flow cytometry data and that it gives a more accurate quantitative
description of the data at comparable model complexity.
|
| Speakers: |
Joris M. Mooij
(Radboud University Nijmegen)
|
|
| 19-apr-2013 (fri) |
Alexander Sokol
|
Stochastic differential equations as causal models
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
We define a notion of interventions in a stochastic differential equation
based on simple substitution in the SDE. We prove that this notion of intervention
is the same as can be obtained by making do()-interventions in the Euler scheme for
the SDE and taking the limit. We show that when the driving semimartingale is a Lévy
process and there are no latent variables, the postintervention distribution is
always identifiable from the observational distribution. We also relate our results
to the literature on weak conditional local independence by Gégout-Petit and
Commenges. |
| Speakers: |
Alexander Sokol
(University of Copenhagen)
|
|
| 19-apr-2013 (fri) |
Johanna G. Neslehova and Christian Genest
|
Tests of independence for sparse contingency tables and beyond
|
16:30-17:30 |
HG G 19.1 |
| Abstract: |
New statistics are proposed for testing the hypothesis that arbitrary random variables are mutually independent. These tests are consistent and well-behaved for any type of data, even for sparse contingency tables and tables whose dimension depends on the sample size. The statistics are Cramér-von Mises and Kolmogorov-Smirnov type functionals of the empirical checkerboard copula. The asymptotic behavior of the corresponding empirical process will be characterized and illustrated; it will also be shown how replicates from the limiting process can be generated using a multiplier bootstrap procedure. As will be seen through simulations, the new tests are considerably more powerful than those based on the Pearson chi-squared, likelihood ratio, and Zelterman statistics often used in this context.
|
| Speakers: |
Johanna G. Neslehova and Christian Genest
(McGill University, Montréal, Canada)
|
|
| 3-may-2013 (fri) |
Niels Richard Hansen
|
Non-parametric estimation of linear filters for point processes
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
A main challenge in neuroscience is to model the dynamic activity of the brain and how it responds to external stimuli. We present models of neuron network activity based on multichannel spike data. The models form a class of point process models with spike rates determined through linear filters of the spike histories. The filters are given in terms of filter functions that are estimated non-parametrically as elements of, e.g., a reproducing kernel Hilbert space. We discuss how the models can be used to infer network connectivity and to predict the effects of stimuli (interventions). The methods used are available via the R package ppstat.
|
| Speakers: |
Niels Richard Hansen
(University of Copenhagen)
|
|
| 20-jun-2013 (thu) |
Andrew B. Nobel
|
Large Average Submatrices of a Gaussian Random Matrix: Landscapes and Local Optima
|
15:15-16:15 |
HG G 19.1 |
| Abstract: |
The problem of finding large average submatrices of a real-valued matrix arises in the exploratory analysis of data from a variety of disciplines, ranging from genomics to social sciences. This talk details several new theoretical results concerning the asymptotic behavior of large average submatrices of an n × n Gaussian random matrix. The first result identifies the average and joint distribution of the (globally optimal) k × k submatrix having largest average value. We then turn our attention to submatrices with dominant row and column sums, which arise as the local optima of a useful iterative search procedure for large average submatrices. Paralleling the result for global optima, the second result identifies the average and joint distribution of a typical locally optimal k × k submatrix. The last part of the talk considers the *number* of locally optimal k × k submatrices, L_n(k), beginning with the asymptotic behavior of its mean and variance for fixed k and increasing n. The final result is a Gaussian central limit theorem for L_n(k) that is based on a new variant of Stein's method for normal approximation.
Joint work with Shankar Bhamidi and Partha S. Dey.
|
| Speakers: |
Andrew B. Nobel
(University of North Carolina at Chapel Hill)
|
|
Further information: sekretariat@stat.math.ethz.ch
Mailing list: Would you like to receive notice of these presentations via e-mail? Please subscribe here: https://stat.ethz.ch/mailman/listinfo/statlist