[R-sig-eco] Association Routine?
Bob O'Hara
bohara at senckenberg.de
Sat Feb 28 19:42:45 CET 2015
On 02/28/2015 06:48 PM, Alexandre F. Souza wrote:
> Dear friends,
>
> I need to write code to look up data using one variable as a reference. The
> code I wrote, however, is not working and I can't figure out why. Could
> anyone help me?
>
> Imagine a data set with two variables, B and C. Now I have variable A,
> which holds the same kind of values as variable B, but the data are not in
> the same order and do not necessarily have the same length as B (it may be
> a sample of B, for example).
>
> I want to find the values of variable C that match each line in variable A
> using B as the association criterion. So the code should perform a loop in
> which it would take the first line in A, search B until it finds it there,
> then copy the corresponding value of C and store it in a new variable D,
> repeating until every line in A has been associated with a C value.
Starting with...
df <- data.frame(B = sample(letters[1:10], replace = FALSE), C = rnorm(10),
                 stringsAsFactors = FALSE)
A <- letters[1:10]
Two thoughts spring to mind:
(a) Would merge() do what you want? For example,
df2 <- merge(df, data.frame(A = A), by.x = "B", by.y = "A")
and then extract the values of C with, for example, df2$C[df2$B == "f"].
(b) sapply(A, function(lt, DF) DF$C[DF$B == lt], DF = df)
R's looping is generally more efficient when it's done internally, so it
will be easier for you if you understand the R mentality, in particular
vectorisation. Usually, if you have a for() loop, you're not writing R
code efficiently.
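As one illustration of the vectorised style, here is a minimal sketch using match(), which does the whole lookup in a single call. The toy data and the deliberately unmatched value "z" are assumptions made for the example, not part of the original problem.

```r
# Toy data, assumed for illustration (same shape as the example above)
set.seed(1)
df <- data.frame(B = sample(letters[1:10]), C = rnorm(10),
                 stringsAsFactors = FALSE)
A <- c("c", "a", "j", "z")   # "z" is deliberately absent from B

# match() gives, for each element of A, the position of its first match in B
idx <- match(A, df$B)

# Index C by those positions; entries of A not found in B give NA
D <- df$C[idx]

data.frame(A = A, D = D)
```

D has one element per element of A, in the order of A, so it also copes with A being only a sample of B.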
Bob
> Here is the code I wrote:
>
>
> # Considering that matrices data.ref and data.assoc have been already read, containing the
>
> # User-defined number of columns to be associated with A (I imagined that more than one variable could be associated at once)
> col.assoc = 20
>
> # To ensure that the data are not of an unusable type
> ref = as.matrix(data.ref)
> assoc = as.matrix(data.assoc)
>
>
> # Table where results will be stored
> # Number of columns = n associated variables plus one column
> # Reserved to receive the initial data (example column A)
>
> result = matrix(nrow = nrow(ref), ncol = col.assoc + 1)
>
> # Fill the first column of the result table with the original reference variable
>
> result[,1] = ref[,1]
>
>
> for (i in 1:nrow(ref)){
> for (j in 1:nrow(assoc))
> if (ref[i, 1] == assoc[j, 1]){
> resultado[i, 2] == assoc[j, 2]
> }
> }
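Two things in the loop quoted above look like likely culprits: the assignment targets resultado although the matrix was created as result, and it uses == (comparison) where an assignment (<- or =) is needed. A minimal sketch of the same loop with those two points changed, run on small toy matrices that are assumed here purely for illustration:

```r
# Toy matrices, assumed for illustration; column 1 is the key, column 2 the value
ref   <- cbind(c("b", "d", "a"))
assoc <- cbind(c("a", "b", "c", "d"), c("1.5", "2.5", "3.5", "4.5"))

# One column for the reference values plus one for the associated values
result <- matrix(NA_character_, nrow = nrow(ref), ncol = 2)
result[, 1] <- ref[, 1]

for (i in seq_len(nrow(ref))) {
  for (j in seq_len(nrow(assoc))) {
    if (ref[i, 1] == assoc[j, 1]) {
      result[i, 2] <- assoc[j, 2]   # assignment, and into `result`
    }
  }
}

result
```

The second column of result then holds the values associated with "b", "d" and "a", in that order.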
>
>
>
> col = ncol(dados)
>
> ####
>
> Any thoughts?
>
> Thanks in advance,
>
> Alexandre
>
--
Bob O'Hara
Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main,
Germany
Tel: +49 69 7542 1863
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org