[R] PCA for Binary data

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 13 08:20:57 CEST 2007


On Tue, 12 Jun 2007, Spencer Graves wrote:

>      The problem with applying prcomp to binary data is that it's not
> clear what problem you are solving.
>
>      The standard principal components and factor analysis models
> assume that the observations are linear combinations of unobserved
> "common" factors (shared variability), normally distributed, plus normal
> noise, independent between observations and variables.  Those
> assumptions are clearly violated for binary data.
>
>      RSiteSearch("PCA for binary data") produced references to 'ade4'
> and 'FactoMineR'.  Have you considered these?  I have not used them, but
> FactoMineR included functions for 'Multiple Factor Analysis for Mixed
> [quantitative and qualitative] Data'

AFAIK, that is not using 'factor analysis' in the same sense as e.g. 
factanal().

Models with continuous underlying variables and binary manifest variables 
are part of latent variable analysis.  Package 'ltm' covers a variety of 
such models.
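
To illustrate what 'continuous underlying, binary manifest' means, here is 
a minimal base-R sketch.  It does not use 'ltm' at all: the latent 
correlation rho, the threshold at zero and all variable names are 
assumptions made for the illustration.  It dichotomises two correlated 
normal variables and compares the correlation of the binary indicators 
with that of the latent variables.

```r
# Hypothetical simulation: two latent normal variables with correlation rho,
# each observed only as a 0/1 indicator of exceeding zero.
set.seed(42)
n   <- 1e5
rho <- 0.6                                   # assumed latent correlation

u1 <- rnorm(n)
u2 <- rho * u1 + sqrt(1 - rho^2) * rnorm(n)  # cor(u1, u2) is close to rho

y1 <- as.integer(u1 > 0)                     # binary manifest variables
y2 <- as.integer(u2 > 0)

phi      <- cor(y1, y2)                      # Pearson correlation of the 0/1 data
expected <- (2 / pi) * asin(rho)             # value for bivariate normal cut at 0

c(latent = cor(u1, u2), manifest = phi, theory = expected)
```

The correlation between the indicators is attenuated relative to the 
latent one, so an analysis based on the covariance of the 0/1 data is not 
estimating the latent structure directly.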

But to begin to give advice we would need to know the scientific problem 
for which Ranga Chandra Gudivada is looking for a tool.  Simon Blomberg 
mentioned ordination, but that is only one of several classes of uses of 
PCA (which finds a linear subspace that both has maximal variance within 
it and gives the best least-squares fit to the data).
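
The two properties in that parenthesis can be checked numerically.  The 
following is only a sketch on simulated 0/1 data (the data-generating 
step, the choice k = 2 and the variable names are assumptions, not 
anything from the original question).  It shows that prcomp() runs on a 
binary matrix and that both properties hold for it as for any numeric 
matrix; whether the result answers a useful question is a separate matter.

```r
# Simulated binary data (hypothetical; not from the original thread).
set.seed(1)
n <- 200
p <- 6
z <- rnorm(n)                                # one latent trait per row
X <- sapply(seq_len(p),
            function(j) rbinom(n, 1, plogis(z + (j - 3.5) / 2)))

pc <- prcomp(X, center = TRUE, scale. = FALSE)
k  <- 2                                      # dimension of the subspace

# Property 1: the variance within the k-dimensional subspace equals the sum
# of the k largest eigenvalues of the sample covariance matrix.
ev         <- eigen(cov(X), symmetric = TRUE)$values
var_within <- sum(pc$sdev[1:k]^2)

# Property 2: projecting the centred data onto the subspace gives the rank-k
# least-squares fit; the residual sum of squares equals (n - 1) times the
# sum of the discarded eigenvalues.
Xc  <- scale(X, center = TRUE, scale = FALSE)
V   <- pc$rotation[, 1:k]
fit <- Xc %*% V %*% t(V)
rss <- sum((Xc - fit)^2)

c(var_within = var_within, top_eigen = sum(ev[1:k]))
c(rss = rss, discarded = (n - 1) * sum(ev[-(1:k)]))
```

In each printed pair the two numbers agree up to rounding error.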

>
>      Hope this helps.
>      Spencer Graves
>
> Josh Gilbert wrote:
>> I don't understand: what's wrong with using prcomp in this situation?
>>
>> On Sunday 10 June 2007 12:50 pm, Ranga Chandra Gudivada wrote:
>>
>>> Hi,
>>>
>>>     I was wondering whether there is any package implementing Principal
>>> Component Analysis for binary data.
>>>
>>>                                               Thanks chandra
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented, minimal,
>>> self-contained, reproducible code.
>>>
>>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


