[R] Different PCA results under Windows and Linux
Mark Difford
mark_difford at yahoo.co.uk
Wed Sep 17 21:34:15 CEST 2008
Hi Jathine,
And then to see things more clearly still, you can do something like this on
your test results:
format(formatC(p1$var$coord, digits=15, format="f"), justify="right")
and
format(formatC(p1$var$coord, digits=16, format="f"), justify="right")
Though I do hope that the second command doesn't begin to concern you even
more.
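
As a minimal sketch of the same idea, assuming the Dim.1 and Dim.2 columns of p1$var$coord are copied by hand from the two sessions quoted below (the names linux, windows and max_diff are made up for illustration), a tolerance-based comparison might look something like this:

```r
# Dim.1 and Dim.2 of p1$var$coord, typed in from the printed output of the
# two sessions (rows M1..M8). These are the printed values, so they carry
# only the 7 significant digits that R displayed.
vars <- paste0("M", 1:8)
linux <- cbind(
  Dim.1 = c(1, 1, 1, 1, 0, 0, 1, 1),
  Dim.2 = c(-3.925231e-16, 7.850462e-17, 7.850462e-17, 7.850462e-17,
            0, 0, 7.850462e-17, 7.850462e-17)
)
windows <- cbind(
  Dim.1 = c(1, 1, 1, 1, 0, 0, 1, 1),
  Dim.2 = c(2.458061e-16, -4.916122e-17, -4.916122e-17, -4.916122e-17,
            0, 0, -4.916122e-17, -4.916122e-17)
)
rownames(linux) <- rownames(windows) <- vars

# Bit-for-bit comparison: the two matrices are not identical.
identical(linux, windows)

# Largest absolute difference, compared with machine epsilon (about 2.2e-16).
max_diff <- max(abs(linux - windows))
max_diff
.Machine$double.eps

# Comparison up to a numerical tolerance (default is about 1.5e-8).
isTRUE(all.equal(linux, windows))

# After rounding to 15 decimal places the two sets of coordinates agree.
all(round(linux, 15) == round(windows, 15))
```

On the real objects you would compare the two p1$var$coord matrices directly, for example after saving one of them with saveRDS() on one machine and reading it back on the other.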
Regards, Mark.
Mark Difford wrote:
>
> Hi Jathine,
>
>>> I hope this can explain the problem a bit more clearly.
>>> Why does PCA give different results on the two platforms?
>
> What is amazing, Jathine, is how nearly identical the two sets of
> results are, not that they begin to differ at the 16th decimal place,
> which is about the limit of double precision. To assuage your
> concerns, do the following on the results from your two trials:
>
> round(p1$var$coord, 15)
> ?round
>
> ## And read the famous FAQ on floating point arithmetic (R FAQ 7.31)
>
> It also isn't a very good idea to be doing PCAs on 0s and 1s.
>
> Regards, Mark.
>
>
> jathine wrote:
>>
>> Thank you for your reply.
>> Here are some more info, I hope this can explain the problem a bit more
>> clearly.
>> Why does PCA give different results on the two platforms?
>>
>> Contents of the freqtest.txt file:
>> M1 M2 M3 M4 M5 M6 M7 M8
>> -1 -1 -1 -1 -1 -1 -1 -1
>> 0 0 0 0 -1 -1 1 1
>> -1 -1 -1 -1 -1 -1 -1 -1
>> 0 0 0 0 -1 -1 1 1
>>
>> ******Linux R script result and sessionInfo()
>>> library(FactoMineR)
>>> x1=read.table("freqtest.txt", header=TRUE)
>>> xrcc2=x1[,1:8]
>>> p1=PCA(xrcc2, graph=FALSE)
>>> p1$var
>>
>> $coord
>> Dim.1 Dim.2 Dim.3
>> M1 1 -3.925231e-16 -2.287663e-48
>> M2 1 7.850462e-17 -3.600641e-32
>> M3 1 7.850462e-17 9.001602e-33
>> M4 1 7.850462e-17 9.001602e-33
>> M5 0 0.000000e+00 0.000000e+00
>> M6 0 0.000000e+00 0.000000e+00
>> M7 1 7.850462e-17 9.001602e-33
>> M8 1 7.850462e-17 9.001602e-33
>>
>> $cor
>> Dim.1 Dim.2 Dim.3
>> M1 1 -3.925231e-16 -2.287663e-48
>> M2 1 7.850462e-17 -3.600641e-32
>> M3 1 7.850462e-17 9.001602e-33
>> M4 1 7.850462e-17 9.001602e-33
>> M5 NaN NaN NaN
>> M6 NaN NaN NaN
>> M7 1 7.850462e-17 9.001602e-33
>> M8 1 7.850462e-17 9.001602e-33
>>
>> $cos2
>> Dim.1 Dim.2 Dim.3
>> M1 1 1.540744e-31 5.233404e-96
>> M2 1 6.162976e-33 1.296462e-63
>> M3 1 6.162976e-33 8.102884e-65
>> M4 1 6.162976e-33 8.102884e-65
>> M5 NaN NaN NaN
>> M6 NaN NaN NaN
>> M7 1 6.162976e-33 8.102884e-65
>> M8 1 6.162976e-33 8.102884e-65
>>
>> $contrib
>> Dim.1 Dim.2 Dim.3
>> M1 16.66667 83.333333 3.229346e-31
>> M2 16.66667 3.333333 8.000000e+01
>> M3 16.66667 3.333333 5.000000e+00
>> M4 16.66667 3.333333 5.000000e+00
>> M5 0.00000 0.000000 0.000000e+00
>> M6 0.00000 0.000000 0.000000e+00
>> M7 16.66667 3.333333 5.000000e+00
>> M8 16.66667 3.333333 5.000000e+00
>>
>>> sessionInfo()
>> R version 2.7.1 (2008-06-23)
>> x86_64-redhat-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] FactoMineR_1.09
>>>
>>
>> ******Windows R script result and sessionInfo()
>>> library(FactoMineR)
>>> x1=read.table("freqtest.txt", header=TRUE)
>>> xrcc2=x1[,1:8]
>>> p1=PCA(xrcc2, graph=FALSE)
>>> p1$var
>> $coord
>> Dim.1 Dim.2 Dim.3
>> M1 1 2.458061e-16 -4.590163e-49
>> M2 1 -4.916122e-17 -4.750455e-32
>> M3 1 -4.916122e-17 1.187614e-32
>> M4 1 -4.916122e-17 1.187614e-32
>> M5 0 0.000000e+00 0.000000e+00
>> M6 0 0.000000e+00 0.000000e+00
>> M7 1 -4.916122e-17 1.187614e-32
>> M8 1 -4.916122e-17 1.187614e-32
>>
>> $cor
>> Dim.1 Dim.2 Dim.3
>> M1 1 2.458061e-16 -4.590163e-49
>> M2 1 -4.916122e-17 -4.750455e-32
>> M3 1 -4.916122e-17 1.187614e-32
>> M4 1 -4.916122e-17 1.187614e-32
>> M5 NaN NaN NaN
>> M6 NaN NaN NaN
>> M7 1 -4.916122e-17 1.187614e-32
>> M8 1 -4.916122e-17 1.187614e-32
>>
>> $cos2
>> Dim.1 Dim.2 Dim.3
>> M1 1 6.042064e-32 2.106959e-97
>> M2 1 2.416826e-33 2.256682e-63
>> M3 1 2.416826e-33 1.410426e-64
>> M4 1 2.416826e-33 1.410426e-64
>> M5 NaN NaN NaN
>> M6 NaN NaN NaN
>> M7 1 2.416826e-33 1.410426e-64
>> M8 1 2.416826e-33 1.410426e-64
>>
>> $contrib
>> Dim.1 Dim.2 Dim.3
>> M1 16.66667 83.333333 7.469228e-33
>> M2 16.66667 3.333333 8.000000e+01
>> M3 16.66667 3.333333 5.000000e+00
>> M4 16.66667 3.333333 5.000000e+00
>> M5 0.00000 0.000000 0.000000e+00
>> M6 0.00000 0.000000 0.000000e+00
>> M7 16.66667 3.333333 5.000000e+00
>> M8 16.66667 3.333333 5.000000e+00
>>
>>> sessionInfo()
>> R version 2.7.2 (2008-08-25)
>> i386-pc-mingw32
>>
>> locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] FactoMineR_1.09
>>>
>>
>>
>>
>> Steven McKinney wrote:
>>>
>>>
>>> Not likely that anyone can explain, as
>>> there is not enough information in your
>>> email.
>>>
>>> Including the contents of the freqtest.txt file
>>> was a good idea, as the posting guide suggests
>>> (the posting guide is that clearly labeled bit
>>> at the bottom that looks like this:
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> Check it out! It is cool.)
>>>
>>> Additionally, include the command
>>> sessionInfo()
>>> and its output from all machines you refer to
>>> so maintainers know which versions of software
>>> you are running. Also, include the output you obtained
>>> from your code (with your code being a self-contained
>>> and reproducible set of R commands).
>>>
>>> Finally, describe what the difference is and why
>>> the difference is problematic (i.e. don't report
>>> machine precision differences, or sign differences
>>> for PCA results - the sign of each principal axis is
>>> arbitrary, so a vector and its negation are equivalent).
>>>
>>>> I also tried mean(xrcc2) and sd(xrcc2) on both machines; the results
>>>> are the same.
>>>> Please explain.
>>>
>>> The R maintainers do an amazing job of creating
>>> numerically stable platform-independent software,
>>> so you get the same results almost everywhere.
>>> (Thank you R core!)
>>>
>>>
>>> HTH
>>>
>>> Steve McKinney
>>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org on behalf of jathine
>>> Sent: Tue 9/16/2008 2:19 PM
>>> To: r-help at r-project.org
>>> Subject: [R] Different PCA results under Windows and Linux
>>>
>>>
>>> I ran the following R script under both Linux and Windows, and got 2
>>> different results.
>>> Linux R version 2.7.1 and Windows R version 2.7.2.
>>>
>>>> library(FactoMineR)
>>>> x1=read.table("freqtest.txt",header=TRUE)
>>>> xrcc2=x1[,1:8]
>>>> p1=PCA(xrcc2, graph=FALSE)
>>>> p1$var
>>>
>>> Contents of the freqtest.txt file:
>>> M1 M2 M3 M4 M5 M6 M7 M8
>>> -1 -1 -1 -1 -1 -1 -1 -1
>>> 0 0 0 0 -1 -1 1 1
>>> -1 -1 -1 -1 -1 -1 -1 -1
>>> 0 0 0 0 -1 -1 1 1
>>>
>>> I also tried mean(xrcc2) and sd(xrcc2) on both machines; the results are
>>> the same.
>>> Please explain.
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19520449.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>
>
--
View this message in context: http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19539474.html
Sent from the R help mailing list archive at Nabble.com.