[R] Big data and column correspondence problem
Daniel Malter
daniel at umd.edu
Tue Jul 26 23:31:27 CEST 2011
This is much clearer. Here is what I think you want to do, in theory and
in practice:
Theory:
For each i, check whether AA[i] occurs in BB.
If it does, take the row where BB[j] == AA[i] and check whether A1[i] and
A2[i] both occur among B1 to B3. Is that right? The indicator should be 1
only if both do.
Here is how you do this:
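# Inner join: keep only rows whose key AA also occurs in BB
newdata<-merge(A,B,by.x='AA',by.y='BB',all.x=FALSE,all.y=FALSE)
# Is A1 (resp. A2) equal to any of B1, B2, B3 in the matched row?
A1.check<-with(newdata,A1==B1|A1==B2|A1==B3)
A2.check<-with(newdata,A2==B1|A2==B2|A2==B3)
# Comparisons that involve NA can yield NA; count those as "no match"
A1.check<-replace(A1.check,which(is.na(A1.check)),0)
A2.check<-replace(A2.check,which(is.na(A2.check)),0)
newdata<-data.frame(newdata,A1.check,A2.check)
# The indicator is 1 only if both A1 and A2 were found
newdata$index<-with(newdata,ifelse(A1.check+A2.check==2,1,0))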
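One caveat: merge() pairs each row of A with every row of B that has the same key, so keys that occur more than once in BB (2 and 6 in the example data) give several rows per row of A, and each check then looks at one row of B at a time. If the intent is what the quoted loop does, i.e. compare A1 and A2 against all B values that share the key, something along these lines may be closer. It is only a sketch on the example data; the names pool and in.pool are made up for illustration.

```r
# Example data from the quoted message, as data frames
A <- data.frame(AA = c(4, 4, 4, 2, 2, 6, 8, 9),
                A1 = c(3, 3, 11, 5, 5, 7, 11, 12),
                A2 = c(3, 3, 7, 3, 5, 7, 11, 12))
B <- data.frame(BB = c(2, 2, 4, 6, 6),
                B1 = c(5, 11, 7, 13, NA),
                B2 = c(4, 12, 11, NA, NA),
                B3 = c(12, 13, NA, NA, NA))

# Stack B1..B3 into one vector and repeat the keys to line up with it
b.values <- unlist(B[, c("B1", "B2", "B3")], use.names = FALSE)
b.keys <- rep(B$BB, times = 3)

# One pool of non-missing B values per key (hypothetical helper object)
keep <- !is.na(b.values)
pool <- split(b.values[keep], b.keys[keep])

# TRUE if 'value' occurs among the B values stored under 'key';
# a key that is absent from BB gives NULL and therefore FALSE
in.pool <- function(key, value) {
  value %in% pool[[as.character(key)]]
}

# The indicator is 1 only if both A1 and A2 are found under the key AA
A$index <- as.integer(mapply(in.pool, A$AA, A$A1) &
                      mapply(in.pool, A$AA, A$A2))
A
```

This keeps one row per row of A, so the indicator can be used directly as a new column of A. Keys in AA that never occur in BB simply get 0.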
HTH,
Daniel
murilofm wrote:
>
>>>I can not see A1[1]=20 in your example data.
>
> Sorry about the typos.... A1[1]=3.
>
>>> Why B[3,]?
>
> Because AA[1]=BB[3]=4.
>
> I will reformulate the example with the code I'm running:
>
> AA = c(4,4,4,2,2,6,8,9)
> A1 = c(3,3,11,5,5,7,11,12)
> A2 = c(3,3,7,3,5,7,11,12)
> A = cbind(AA, A1, A2)
>
> BB = c(2,2,4,6,6)
> B1 =c(5,11,7,13,NA)
> B2 =c(4,12,11,NA,NA)
> B3 =c(12,13,NA,NA,NA)
>
> A = cbind(AA, A1, A2,0)
> B=cbind(BB,B1,B2,B3)
>
> for(i in 1:dim(A)[1]){
> if (!is.na(sum(match(A[i,2:3],B[B[,1]==A[i,1],2:dim(B)[2]])))) A[i,4]<-1
> }
>
> Thanks
>
--
View this message in context: http://r.789695.n4.nabble.com/Big-data-and-column-correspondence-problem-tp3694912p3697067.html