[R] Alternative to loops
Berry, Charles
ccberry at ucsd.edu
Thu Apr 4 20:01:16 CEST 2019
Comments inline, but first:
Please review the posting guide and follow the instructions there, especially:
1) "No HTML posting..."
2) "When providing examples, it is best to give an R command that constructs the data,..."
> On Apr 4, 2019, at 9:41 AM, Ek Esawi <esawiek at gmail.com> wrote:
>
> Hi All--
>
> Sorry, I sent the last one inadvertently.
>
> Here is a sample of my data: a data frame (MyDF) and a list (MyList). My
> own data frame has over 10,000 rows. I want to find out which elements of
> MyDF$B contain any element(s) of MyList, then change MyDF$C to the name of
> the vector of the list that has the match.
>
> I solved this via loops and if statements, using %in%, but I am hoping for
> a better solution using the apply family of functions. I tried something like
> this, but it did not work.
>
> lapply(strsplit(MyDF$B," "),function(x) lapply(MyList,function(y) if(sum(y
> %in% x)>0,x$Code==y[[1]]))
>
> Thanks in advance--EK
>
> My Sample data
>
>> MyDF
>
> A B C
> 1 1 aa ab ac 0
> 2 2 bb bc bd 0
> 3 3 cc cf 0
> 4 4 dd 0
> 5 5 ee 0
Note: You did not tell us whether MyDF$B is a factor. If it is, strsplit needs as.character() and has to accommodate leading and multiple blanks:
levels(MyDF$B)
[1] " dd" " ee" " cc cf" "aa ab ac" "bb bc bd"
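A small sketch of why the blanks matter, using three strings copied from the levels above. The names b, raw_tokens and trimmed_tokens are made up for illustration, and trimws() is just one possible way to drop the leading blank, not something the original post used:

```r
# Three of the strings shown in levels(MyDF$B); two start with a blank
b <- c("aa ab ac", " cc cf", " dd")

# Splitting on runs of blanks keeps an empty first token
# whenever the string starts with a blank
raw_tokens <- strsplit(b, "[ ]+")

# Trimming first (an assumption, not in the original post) avoids that
trimmed_tokens <- strsplit(trimws(b), "[ ]+")

raw_tokens[[2]]      # "" "cc" "cf"
trimmed_tokens[[2]]  # "cc" "cf"
```

The empty token is harmless for matching, since "" does not appear in MyList, but it does add an extra element to each affected row.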
>
>
>> MyList
>
> $X
> [1] "a" "ba" "cc"
>
> $Y
> [1] "abs" "aa" "BA" "BB"
>
> $z
> [1] "ab" "bb" "xy" "zy" "gh"
>
>
>
> Desired results.
>
>
>
>> MyDF
>
> A B C
> 1 1 aa ab ac Y
'aa' matches Y, 'ab' matches z, 'ac' does not match
> 2 2 bb bc bd Y
Huh? 'bb' matches z, 'bc' and 'bd' do not match.
> 3 3 cc cf X
'cc' matches X, 'cf' does not match
> 4 4 dd 0
> 5 5 ee 0
>
Neither matches.
You need to clarify what it is you seek. The example is hard to penetrate.
Maybe this helps you:
> queries <- strsplit(as.character(MyDF$B), "[ ]+")
> matches <- match( unlist(queries), unlist(MyList), 0)
> hits <- findInterval( matches, 1+cumsum(c(0,lengths(MyList))))
> hitList <- relist(hits, queries)
> hitList
[[1]]
[1] 2 3 0
[[2]]
[1] 3 0 0
[[3]]
[1] 0 1 0
[[4]]
[1] 0 0
[[5]]
[1] 0 0
You can now process hitList to get the desired vector.
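One possible way to do that, as a sketch only. Since the desired result is ambiguous, this assumes the rule "use the list name of the first token in each row that matches anything", and it rebuilds the sample data as character strings with the leading blanks seen in the levels above. The name first_hit is made up for illustration:

```r
# Sample data reconstructed from the post (B as character, leading blanks kept)
MyDF <- data.frame(
  A = 1:5,
  B = c("aa ab ac", "bb bc bd", " cc cf", " dd", " ee"),
  C = "0",
  stringsAsFactors = FALSE
)
MyList <- list(
  X = c("a", "ba", "cc"),
  Y = c("abs", "aa", "BA", "BB"),
  z = c("ab", "bb", "xy", "zy", "gh")
)

# Same steps as above: 0 means no match, k means a match in MyList[[k]]
queries <- strsplit(as.character(MyDF$B), "[ ]+")
matches <- match(unlist(queries), unlist(MyList), 0)
hits <- findInterval(matches, 1 + cumsum(c(0, lengths(MyList))))
hitList <- relist(hits, queries)

# Assumed rule: keep the first nonzero hit per row, or 0 if there is none
first_hit <- vapply(hitList, function(h) {
  nz <- h[h > 0]
  if (length(nz)) nz[1] else 0
}, numeric(1))

# Index into c("0", names) so that 0 maps to "0" and k maps to the k-th name
MyDF$C <- c("0", names(MyList))[first_hit + 1]
MyDF$C  # "Y" "z" "X" "0" "0"
```

Under that rule row 2 comes out as z rather than the Y shown in the desired results, which is the discrepancy flagged inline above. A different rule (say, the list with the most matches in a row) would only change the function passed to vapply.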
HTH,
Chuck