[R] Nested foreach loops in R repeating items

Bert Gunter gunter.berton at gene.com
Thu Feb 6 01:42:17 CET 2014


I don't think you answered the OP's query, although I confess that I
am not sure I understand it either (see below). In any case, I
believe the R-level loop (i.e. apply()) is unnecessary. There are
unique() and duplicated() methods for data frames, so simply

unique(x)

returns a data frame with all the unique rows of x.
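
As a minimal sketch of the point above (the toy data frame df is
hypothetical and not from the thread), unique() and duplicated() work
row-wise on a data frame without any explicit loop:

```r
# Hypothetical toy data frame; row 3 repeats row 1
df <- data.frame(a = c(1L, 2L, 1L, 3L),
                 b = c("u", "v", "u", "w"))

# duplicated() flags each row that repeats an earlier row
dup <- duplicated(df)

# unique() keeps only the first occurrence of each distinct row
uniq <- unique(df)

print(dup)
print(uniq)
```

Note that both functions compare whole rows here, which differs from
applying duplicated() to each column separately.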

However, I don't think that's what the OP wanted. (S)he appeared to
want all unique combinations of 2 columns. If I got that right (??),
combn(ncol(x), 2) gives that, as a 2-row matrix with one pair of column
indices per column, and could be used for indexing. I'm not sure
parallel processing is useful here, but then again, I may have
misunderstood the query. If so, my apologies, and feel free to ignore
all the above :-(
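
A rough sketch of the combn() idea, assuming the OP's example data
(rebuilt here with data.frame() for brevity) and assuming that unique
unordered column pairs are what is wanted. The names idx, pair_names
and by_first are mine, chosen for illustration:

```r
# The OP's example data, rebuilt with data.frame()
mydata <- data.frame(
  col1 = c(231L, 8946L, 534L), col2  = c(123L, 2361L, 65L),
  col3 = c(5645L, 45L, 51L),   col4  = c(654L, 356L, 32L),
  col5 = c(21L, 1L, 51L),      col6  = c(4L, 4515L, 15L),
  col7 = c(6L, 1L, 535L),      col8  = c(894L, 20L, 35L),
  col9 = c(68L, 21L, 123L),    col10 = c(46L, 2L, 2L)
)

# 2-row matrix: each column holds one unique pair of column indices
idx <- combn(ncol(mydata), 2)

# Label each pair by its column names, e.g. "col1 col2"
pair_names <- apply(idx, 2, function(p) {
  paste(colnames(mydata)[p], collapse = " ")
})

# Group the labels by the first column of each pair; the groups have
# different lengths, so a list is used rather than a matrix
by_first <- split(pair_names, idx[1, ])

length(pair_names)   # choose(10, 2) pairs
by_first[["2"]]      # pairs whose first member is col2
```

The repeated entries in the quoted output look consistent with cbind()
recycling the shorter result vectors to a common length; keeping the
pairs in a list (or in the index matrix idx itself) sidesteps that.
Each column of idx could then be handed to whatever per-pair
computation is needed.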


Cheers,
Bert




Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Wed, Feb 5, 2014 at 3:26 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
> Try ?duplicated()
>  apply(x,2,function(x) {x[duplicated(x)]<-"";x})
> A.K.
>
>
>
> Hi all,
>
> I have a dataset of around a thousand columns and a few thousand
> rows. I'm trying to get all the possible combinations (without
> repetition) of the data columns and process them in parallel. Here's a
> simplification of what my data and my code look like:
>
> mydata <- structure(list(col1 = c(231L, 8946L, 534L), col2 = c(123L, 2361L,
> 65L), col3 = c(5645L, 45L, 51L), col4 = c(654L, 356L, 32L), col5 = c(21L,
> 1L, 51L), col6 = c(4L, 4515L, 15L), col7 = c(6L, 1L, 535L), col8 = c(894L,
> 20L, 35L), col9 = c(68L, 21L, 123L), col10 = c(46L, 2L, 2L)), .Names = c("col1",
> "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9",
> "col10"), class = "data.frame", row.names = c(NA, -3L))
>
> require(foreach)
>
> x <-
> foreach(m=1:5, .combine='cbind') %:%
> foreach(j=(m+1):10, .combine='c') %do% {
> paste(colnames(mydata)[m], colnames(mydata)[j])
>
> }
>
> x
>
>
>
> If you execute the command above in R, you will get this result:
>
>
>
>       result.1     result.2     result.3     result.4     result.5
>  [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
>  [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
>  [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
>  [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
>  [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"  "col5 col10"
>  [6,] "col1 col7"  "col2 col8"  "col3 col9"  "col4 col10" "col5 col6"
>  [7,] "col1 col8"  "col2 col9"  "col3 col10" "col4 col5"  "col5 col7"
>  [8,] "col1 col9"  "col2 col10" "col3 col4"  "col4 col6"  "col5 col8"
>  [9,] "col1 col10" "col2 col3"  "col3 col5"  "col4 col7"  "col5 col9"
>
> Notice the first problem I face: the last row of the second column
> of the "x" matrix says "col2 col3", which is a repetition of the
> first item in that column (this also happens in all succeeding
> columns). I was planning to have unique combinations of all columns,
> which obviously did not work.
>
> Can somebody please help me with this? My desired output would be
>
>
>
>       result.1     result.2     result.3     result.4     result.5
>  [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
>  [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
>  [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
>  [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
>  [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"
>  [6,] "col1 col7"  "col2 col8"  "col3 col9"
>  [7,] "col1 col8"  "col2 col9"
>  [8,] "col1 col9"  "col2 col10"
>  [9,] "col1 col10"
>
>
> Many thanks
>



