[R] A small nag

Joshua Wiley jwiley.psych at gmail.com
Mon Aug 15 06:03:11 CEST 2011


On Sun, Aug 14, 2011 at 8:41 PM, Chintanu <chintanu at gmail.com> wrote:
> Hello Joshua,
>
> I can see that my explanation so far was poor, so here is another attempt
> to simplify things:
>
> I have a data frame ("file") containing 8 samples (in columns). Each
> sample's numeric results are in the data frame's rows.
>
> LGD is another vector.
>
> LGD <- c(11.6, 12.3, 15.8, 33.1, 43.5, 51.3, 67.3, 84.9)
>
> Now, correlation needs to be found between -
>
> i) each of the rows of the dataframe, and
> ii) LGD

Ah, rows, then try:

apply(file[1:47231, 3:10], 1, cor, y = LGD)

It is basically an implicit for loop that goes through the first
argument (your file data frame, which apply() converts to a matrix)
row by row, correlating each row with LGD.  So:

cor(unlist(file[1, 3:10]), LGD)

for each row from 1 to 47231.  It is not very efficient, but even on my
slow laptop it takes only a matter of seconds, so speed is probably not
a big issue unless it is part of a simulation or something.  I have a
sense that a smidge of clever work with matrices could avoid the
apply() call, but it's not jumping out at me.
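
For what it is worth, here is a rough sketch of how the apply() call, the
loop it stands in for, and two matrix-based alternatives line up.  The
toy matrix m (5 rows by 8 samples of random numbers) and all the r_*
names are made up for illustration; only LGD comes from the thread.  The
hand-rolled version covers Pearson correlation only and assumes no
missing values.

```r
## Toy stand-in for file[, 3:10]: 5 rows of results across 8 samples.
## (m and the r_* names are hypothetical; LGD is from the original post.)
set.seed(1)
LGD <- c(11.6, 12.3, 15.8, 33.1, 43.5, 51.3, 67.3, 84.9)
m <- matrix(rnorm(5 * 8), nrow = 5, ncol = 8)

## 1. apply(): one cor() call per row
r_apply <- apply(m, 1, cor, y = LGD)

## 2. the explicit for loop that apply() replaces
r_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  r_loop[i] <- cor(m[i, ], LGD)
}

## 3. cor() correlates the columns of x with y, so transpose first
##    so that each original row becomes a column
r_t <- drop(cor(t(m), LGD))

## 4. by hand: center each row and LGD, then
##    r = sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))
mc <- m - rowMeans(m)   # recycling down columns subtracts each row's mean
yc <- LGD - mean(LGD)
r_mat <- drop(mc %*% yc) / sqrt(rowSums(mc^2) * sum(yc^2))

## all four should agree up to floating point error
all.equal(r_apply, r_loop)
all.equal(r_apply, r_t)
all.equal(r_apply, r_mat)
```

On the real data this would presumably start from something like
as.matrix(file[1:47231, 3:10]) in place of m, assuming those eight
columns are all numeric.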

Cheers!

Josh

> Thanks,
> Chintanu
>
>
>
> ===============================================================
>
> On Mon, Aug 15, 2011 at 1:12 PM, Joshua Wiley <jwiley.psych at gmail.com>
> wrote:
>>
>> On Sun, Aug 14, 2011 at 7:21 PM, Chintanu <chintanu at gmail.com> wrote:
>> > Hi Joshua,
>> >
>> > SORRY for not making that clear. I wish to have the correlation values
>> > between each column of my "file" with the "LGD". For example:
>> >
>> > cor (Column 1, LGD)
>> > cor (column 2, LGD) ... so on.
>>
>> Okay, you need to make a tractable example.  Create or give us data
>> where cor(Column1, LGD) works.  LGD is a vector of length 8, file is
>> probably some sort of matrix or data frame, which you are extracting
>> part of, but there are way too many possible ways to repeat,
>> transpose, twist, and otherwise manipulate the data into some sort of
>> correlatable form (using rep() is not sufficient---that just gives you
>> a really long vector).
>>
>> If you are currently under the impression that it is possible to
>> correlate a 47231 x 1 matrix with a vector of length 8, read the
>> Wikipedia page so you understand how correlation works:
>> http://en.wikipedia.org/wiki/Correlation_and_dependence.
>>
>> >
>> > The first one you have provided is producing an error :
>> >
>> >> sapply(file[1:47231, 3:10], FUN = cor, y = rep(LGD, 47231), method =
>> >> "pearson")
>> > Error in FUN(X[[1L]], ...) : incompatible dimensions
>> > Cheers,
>> > Chintanu
>> >
>> >
>> > ===============================================
>> >
>> > On Mon, Aug 15, 2011 at 12:09 PM, Joshua Wiley <jwiley.psych at gmail.com>
>> > wrote:
>> >>
>> >> Hi Chintanu,
>> >>
>> >> Do you want the correlation of columns 3:10 of file with the y vector
>> >> or do you want a correlation matrix of all variables?
>> >>
>> >> ## correlation between cols 3:10 and y
>> >> sapply(file[1:47231, 3:10], FUN = cor, y = rep(LGD, 47231), method =
>> >> "pearson")
>> >>
>> >> ## correlation matrix
>> >> cor(cbind(file[1:47231, 3:10], rep(LGD, 47231)), method = "pearson")
>> >>
>> >> HTH,
>> >>
>> >> Josh
>> >>
>> >>
>> >> On Sun, Aug 14, 2011 at 7:02 PM, Chintanu <chintanu at gmail.com> wrote:
>> >> > Hi,
>> >> >
>> >> > I am not sure how to fix the following error.
>> >> >
>> >> > LGD <- c(11.6, 12.3, 15.8, 33.1, 43.5, 51.3, 67.3, 84.9)
>> >> >
>> >> > cor (x=(file [1:47231,3:10]), y= rep (LGD, 47231), method =
>> >> > "pearson")
>> >> >
>> >> > Error in cor(x = (file[1:47231, 3:10]), y = rep(LGD, 47231), method =
>> >> > "pearson") :
>> >> >
>> >> >  incompatible dimensions
>> >> >
>> >> >> sessionInfo()
>> >> >
>> >> > R version 2.13.0 (2011-04-13)
>> >> >
>> >> > Platform: i386-pc-mingw32/i386 (32-bit)
>> >> >
>> >> > locale:
>> >> >
>> >> > [1] LC_COLLATE=English_Australia.1252
>> >> >  LC_CTYPE=English_Australia.1252
>> >> >   LC_MONETARY=English_Australia.1252
>> >> > LC_NUMERIC=C                       LC_TIME=English_Australia.1252
>> >> >
>> >> > attached base packages:
>> >> >
>> >> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >> >
>> >> > loaded via a namespace (and not attached):
>> >> >
>> >> > [1] tools_2.13.0
>> >> >
>> >> > Thank you.
>> >> >
>> >> > Kind regards,
>> >> >
>> >> > Chintanu
>> >>
>> >>
>> >>
>> >> --
>> >> Joshua Wiley
>> >> Ph.D. Student, Health Psychology
>> >> Programmer Analyst II, ATS Statistical Consulting Group
>> >> University of California, Los Angeles
>> >> https://joshuawiley.com/
>> >
>> >
>>
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


