[R] Different PCA results under Windows and Linux

Mark Difford mark_difford at yahoo.co.uk
Wed Sep 17 21:34:15 CEST 2008


Hi Jathine,

To see this more clearly still, you can run something like the following on
your test results:

format(formatC(p1$var$coord, digits=15, format="f"), justify="right")

and

format(formatC(p1$var$coord, digits=16, format="f"), justify="right")

Though I do hope that the second command, which prints down to the level of
double-precision rounding error, doesn't begin to concern you even more.
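
For what it's worth, here is a minimal base-R sketch of the same point. The
numbers are copied by hand from the Dim.1 and Dim.2 columns of your two
outputs (rows M1 to M4 only); the object names linux and windows are mine,
purely for illustration, and nothing here calls FactoMineR.

```r
# Dim.1 and Dim.2 for M1..M4, copied from the two quoted outputs
linux   <- cbind(Dim.1 = 1,
                 Dim.2 = c(-3.925231e-16,  7.850462e-17,  7.850462e-17,  7.850462e-17))
windows <- cbind(Dim.1 = 1,
                 Dim.2 = c( 2.458061e-16, -4.916122e-17, -4.916122e-17, -4.916122e-17))

eps <- .Machine$double.eps   # spacing of doubles near 1

# Bit-for-bit comparison versus comparison with a tolerance
exactly_same <- identical(linux, windows)
nearly_same  <- isTRUE(all.equal(linux, windows))

# Every Dim.2 value on either platform is smaller than two machine epsilons
within_2_eps <- max(abs(linux[, "Dim.2"]), abs(windows[, "Dim.2"])) < 2 * eps

# zapsmall() rounds away entries that are negligible next to the largest one
linux_clean   <- zapsmall(linux, digits = 7)
windows_clean <- zapsmall(windows, digits = 7)
same_after_zap <- all(linux_clean == windows_clean)
```

Here identical() reports a difference, while all.equal() and the zapsmall()-ed
matrices agree, which is all the discrepancy amounts to.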

Regards, Mark.



Mark Difford wrote:
> 
> Hi Jathine,
> 
>>> I hope this can explain the problem a bit more clearly. 
>>> Why does PCA give different results on the two platforms?
> 
> What is amazing, Jathine, is how nearly identical the two sets of
> results are, not that they begin to differ at the 16th decimal place. To
> assuage your concerns, do the following on the results from your two
> trials:
> 
> round(p1$var$coord, 15)
> ?round
> 
> ## And read the famous FAQ on floating-point arithmetic (R FAQ 7.31)
> 
> It also isn't a very good idea to be doing PCAs on 0s and 1s.
> 
> Regards, Mark.
> 
> 
> jathine wrote:
>> 
>> Thank you for your reply.
>> Here is some more information; I hope this can explain the problem a bit
>> more clearly. 
>> Why does PCA give different results on the two platforms?
>> 
>> Contents of freqtest.txt:
>> M1 M2 M3 M4 M5 M6 M7 M8
>> -1 -1 -1 -1 -1 -1 -1 -1
>> 0 0 0 0 -1 -1 1 1
>> -1 -1 -1 -1 -1 -1 -1 -1
>> 0 0 0 0 -1 -1 1 1
>> 
>> ******Linux R script result and sessionInfo()
>>> library(FactoMineR)
>>> x1=read.table("freqtest.txt", header=TRUE)
>>> xrcc2=x1[,1:8]
>>> p1=PCA(xrcc2, graph=FALSE)
>>> p1$var
>> 
>> $coord
>>    Dim.1         Dim.2         Dim.3
>> M1     1 -3.925231e-16 -2.287663e-48
>> M2     1  7.850462e-17 -3.600641e-32
>> M3     1  7.850462e-17  9.001602e-33
>> M4     1  7.850462e-17  9.001602e-33
>> M5     0  0.000000e+00  0.000000e+00
>> M6     0  0.000000e+00  0.000000e+00
>> M7     1  7.850462e-17  9.001602e-33
>> M8     1  7.850462e-17  9.001602e-33
>> 
>> $cor
>>    Dim.1         Dim.2         Dim.3
>> M1     1 -3.925231e-16 -2.287663e-48
>> M2     1  7.850462e-17 -3.600641e-32
>> M3     1  7.850462e-17  9.001602e-33
>> M4     1  7.850462e-17  9.001602e-33
>> M5   NaN           NaN           NaN
>> M6   NaN           NaN           NaN
>> M7     1  7.850462e-17  9.001602e-33
>> M8     1  7.850462e-17  9.001602e-33
>> 
>> $cos2
>>    Dim.1        Dim.2        Dim.3
>> M1     1 1.540744e-31 5.233404e-96
>> M2     1 6.162976e-33 1.296462e-63
>> M3     1 6.162976e-33 8.102884e-65
>> M4     1 6.162976e-33 8.102884e-65
>> M5   NaN          NaN          NaN
>> M6   NaN          NaN          NaN
>> M7     1 6.162976e-33 8.102884e-65
>> M8     1 6.162976e-33 8.102884e-65
>> 
>> $contrib
>>       Dim.1     Dim.2        Dim.3
>> M1 16.66667 83.333333 3.229346e-31
>> M2 16.66667  3.333333 8.000000e+01
>> M3 16.66667  3.333333 5.000000e+00
>> M4 16.66667  3.333333 5.000000e+00
>> M5  0.00000  0.000000 0.000000e+00
>> M6  0.00000  0.000000 0.000000e+00
>> M7 16.66667  3.333333 5.000000e+00
>> M8 16.66667  3.333333 5.000000e+00
>> 
>>> sessionInfo()
>> R version 2.7.1 (2008-06-23)
>> x86_64-redhat-linux-gnu
>> 
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> other attached packages:
>> [1] FactoMineR_1.09
>>>
>> 
>> ******Windows R script result and sessionInfo()
>>> library(FactoMineR)
>>> x1=read.table("freqtest.txt", header=TRUE)
>>> xrcc2=x1[,1:8]
>>> p1=PCA(xrcc2, graph=FALSE)
>>> p1$var
>> $coord
>>    Dim.1         Dim.2         Dim.3
>> M1     1  2.458061e-16 -4.590163e-49
>> M2     1 -4.916122e-17 -4.750455e-32
>> M3     1 -4.916122e-17  1.187614e-32
>> M4     1 -4.916122e-17  1.187614e-32
>> M5     0  0.000000e+00  0.000000e+00
>> M6     0  0.000000e+00  0.000000e+00
>> M7     1 -4.916122e-17  1.187614e-32
>> M8     1 -4.916122e-17  1.187614e-32
>> 
>> $cor
>>    Dim.1         Dim.2         Dim.3
>> M1     1  2.458061e-16 -4.590163e-49
>> M2     1 -4.916122e-17 -4.750455e-32
>> M3     1 -4.916122e-17  1.187614e-32
>> M4     1 -4.916122e-17  1.187614e-32
>> M5   NaN           NaN           NaN
>> M6   NaN           NaN           NaN
>> M7     1 -4.916122e-17  1.187614e-32
>> M8     1 -4.916122e-17  1.187614e-32
>> 
>> $cos2
>>    Dim.1        Dim.2        Dim.3
>> M1     1 6.042064e-32 2.106959e-97
>> M2     1 2.416826e-33 2.256682e-63
>> M3     1 2.416826e-33 1.410426e-64
>> M4     1 2.416826e-33 1.410426e-64
>> M5   NaN          NaN          NaN
>> M6   NaN          NaN          NaN
>> M7     1 2.416826e-33 1.410426e-64
>> M8     1 2.416826e-33 1.410426e-64
>> 
>> $contrib
>>       Dim.1     Dim.2        Dim.3
>> M1 16.66667 83.333333 7.469228e-33
>> M2 16.66667  3.333333 8.000000e+01
>> M3 16.66667  3.333333 5.000000e+00
>> M4 16.66667  3.333333 5.000000e+00
>> M5  0.00000  0.000000 0.000000e+00
>> M6  0.00000  0.000000 0.000000e+00
>> M7 16.66667  3.333333 5.000000e+00
>> M8 16.66667  3.333333 5.000000e+00
>> 
>>> sessionInfo()
>> R version 2.7.2 (2008-08-25)
>> i386-pc-mingw32
>> 
>> locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> other attached packages:
>> [1] FactoMineR_1.09
>>>
>> 
>> 
>> 
>> Steven McKinney wrote:
>>> 
>>> 
>>> Not likely that anyone can explain, as
>>> there is not enough information in your
>>> email.
>>> 
>>> Including the contents of the freqtest.txt file
>>> was a good idea, as the posting guide suggests
>>> (the posting guide is that clearly labeled bit
>>> at the bottom that looks like this:
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> Check it out! It is cool.)
>>> 
>>> Additionally, include the command 
>>>   sessionInfo() 
>>> and its output from all machines you refer to
>>> so maintainers know which versions of software
>>> you are running.  Also, include the output you obtained
>>> from your code (with your code being a self-contained 
>>> and reproducible set of  R commands).
>>> 
>>> Finally, describe what the difference is and why
>>> the difference is problematic (i.e. don't report
>>> machine precision differences, or sign differences
>>> for PCA results: the sign of each PCA axis is arbitrary,
>>> so directions are defined only modulo 180 degrees).
>>> 
>>>> I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results
>>>> are the same. 
>>>> Please explain.
>>> 
>>> The R maintainers do an amazing job of creating
>>> numerically stable platform-independent software,
>>> so you get the same results almost everywhere.
>>> (Thank you R core!)
>>> 
>>> 
>>> HTH
>>> 
>>> Steve McKinney
>>> 
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org on behalf of jathine
>>> Sent: Tue 9/16/2008 2:19 PM
>>> To: r-help at r-project.org
>>> Subject: [R]  Different PCA results under Windows and Linux
>>>  
>>> 
>>> I ran the following R script under both Linux and Windows, and got two
>>> different results.
>>> Linux R version 2.7.1 and Windows R version 2.7.2.
>>> 
>>>> library(FactoMineR)
>>>> x1=read.table("freqtest.txt", header=TRUE)
>>>> xrcc2=x1[,1:8]
>>>> p1=PCA(xrcc2, graph=FALSE)
>>>> p1$var
>>> 
>>> Contents of freqtest.txt:
>>> M1 M2 M3 M4 M5 M6 M7 M8
>>> -1 -1 -1 -1 -1 -1 -1 -1
>>> 0 0 0 0 -1 -1 1 1
>>> -1 -1 -1 -1 -1 -1 -1 -1
>>> 0 0 0 0 -1 -1 1 1 
>>> 
>>> I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are
>>> the same. 
>>> Please explain.
>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19520449.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
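
P.S. On two other points in the quoted messages (the NaN rows for M5 and M6,
and Steven's remark that PCA directions are arbitrary up to sign), a rough
sketch follows. It is an illustration under my own assumptions, not a
description of what FactoMineR does internally: the data are typed in by hand
from freqtest.txt, base R's prcomp() stands in for PCA(), and the names x,
keep, pc and flipped are mine.

```r
# The four rows of freqtest.txt, typed in directly
x <- data.frame(
  M1 = c(-1, 0, -1, 0), M2 = c(-1, 0, -1, 0),
  M3 = c(-1, 0, -1, 0), M4 = c(-1, 0, -1, 0),
  M5 = c(-1, -1, -1, -1), M6 = c(-1, -1, -1, -1),
  M7 = c(-1, 1, -1, 1), M8 = c(-1, 1, -1, 1)
)

# M5 and M6 never vary, so their standard deviation is exactly zero
sds <- sapply(x, sd)
constant <- names(sds)[sds == 0]

# Standardising a constant column divides 0 by 0, which yields NaN
scaled_m5 <- (x$M5 - mean(x$M5)) / sd(x$M5)

# prcomp() refuses to scale a constant column, so drop those first
keep <- x[, sds > 0]
pc <- prcomp(keep, center = TRUE, scale. = TRUE)

# Flip the sign of the first component in both loadings and scores
flipped <- pc
flipped$rotation[, 1] <- -pc$rotation[, 1]
flipped$x[, 1] <- -pc$x[, 1]

# Both versions rebuild the same standardised data
recon_original <- pc$x %*% t(pc$rotation)
recon_flipped  <- flipped$x %*% t(flipped$rotation)
sign_is_arbitrary <- isTRUE(all.equal(recon_original, recon_flipped))
```

The six remaining columns are perfectly correlated once standardised, so the
first component carries essentially all of the variance and everything after
it is rounding noise, which is where the platform differences show up.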

-- 
View this message in context: http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19539474.html
Sent from the R help mailing list archive at Nabble.com.


