[R] integrating 2 lists and a data frame in R

Bogdan Tanasa tanasa at gmail.com
Tue Jun 6 17:27:06 CEST 2017


Thank you Bert for your suggestion ;).
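
For anyone reading the archive, here is a self-contained sketch of the matrix-indexing approach quoted below, run on the example data from earlier in the thread. The names rows, cols, idx and res_df are illustrative and not from the thread, and the final as.data.frame() step is an assumption added only because the original question asked for a data frame.

```r
# Example data from the thread; stringsAsFactors = FALSE keeps the keys as character
N <- data.frame(N = c("n1", "n2", "n3", "n4"), stringsAsFactors = FALSE)
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"), stringsAsFactors = FALSE)
C <- data.frame(n = c("n1", "n2", "n3"),
                m = c("m1", "m1", "m3"),
                I = c(100, 300, 400),
                stringsAsFactors = FALSE)

# Row and column labels as plain character vectors
rows <- sort(as.character(M$M))
cols <- sort(as.character(N$N))

# Empty M x N matrix filled with NA, labelled by the two lists
res <- matrix(NA_real_, nrow = length(rows), ncol = length(cols),
              dimnames = list(rows, cols))

# A two-column character matrix indexes cells by (row name, column name)
idx <- cbind(as.character(C$m), as.character(C$n))
res[idx] <- C$I

# Assumption: convert to a data frame, as the original question requested
res_df <- as.data.frame(res)
print(res_df)
```

Cells with no entry in C stay NA, which distinguishes "no measurement" from a measured intensity of 0.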

On Tue, Jun 6, 2017 at 8:19 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> Simple matrix indexing suffices without any fancier functionality.
>
> ## First convert M and N to character vectors -- which they should have been
> ## in the first place!
>
> M <- sort(as.character(M[,1]))
> N <- sort(as.character(N[,1]))
>
> ## This could be a one-liner, but I'll split it up for clarity.
>
> res <- matrix(NA, length(M), length(N), dimnames = list(M, N))
>
> res[as.matrix(C[,2:1])] <- C$I ## matrix indexing
>
> res
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Jun 6, 2017 at 7:46 AM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> > Thank you David. Using the xtabs operation simplifies the code very much,
> > many thanks ;)
> >
> > On Tue, Jun 6, 2017 at 7:44 AM, David Winsemius <dwinsemius at comcast.net>
> > wrote:
> >
> >>
> >> > On Jun 6, 2017, at 4:01 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >> >
> >> > Hi Bogdan,
> >> > Kinda messy, but:
> >> >
> >> > N <- data.frame(N=c("n1","n2","n3","n4"))
> >> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> >> >                 I=c(100,300,400))
> >> > MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> >> > names(MN)<-M[,1]
> >> > rownames(MN)<-N[,1]
> >> > C[,1]<-as.character(C[,1])
> >> > C[,2]<-as.character(C[,2])
> >> > for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
> >>
> >> `xtabs` offers another route:
> >>
> >> C$m <- factor(C$m, levels=M$M)
> >> C$n <- factor(C$n, levels=N$N)
> >>
> >> Option 1:  Zeroes in the empty positions:
> >> > (X <- xtabs(I ~ m+n , C, addNA=TRUE))
> >>     n
> >> m     n1  n2  n3  n4
> >>   m1 100 300   0   0
> >>   m2   0   0   0   0
> >>   m3   0   0 400   0
> >>   m4   0   0   0   0
> >>   m5   0   0   0   0
> >>
> >> Option 2: Sparse matrix
> >> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> >> 5 x 4 sparse Matrix of class "dgCMatrix"
> >>     n
> >> m     n1  n2  n3 n4
> >>   m1 100 300   .  .
> >>   m2   .   .   .  .
> >>   m3   .   . 400  .
> >>   m4   .   .   .  .
> >>   m5   .   .   .  .
> >>
> >> I wasn't sure if the sparse results of xtabs would make a distinction
> >> between 0 and NA, but happily they do:
> >>
> >> > C <- data.frame(n=c("n1","n2","n3", "n3", "n4"),
> >> >                 m=c("m1","m1","m3", "m4", "m5"), I=c(100,300,400, NA, 0))
> >> > C
> >>    n  m   I
> >> 1 n1 m1 100
> >> 2 n2 m1 300
> >> 3 n3 m3 400
> >> 4 n3 m4  NA
> >> 5 n4 m5   0
> >> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> >> 4 x 4 sparse Matrix of class "dgCMatrix"
> >>     n
> >> m     n1  n2  n3 n4
> >>   m1 100 300   .  .
> >>   m3   .   . 400  .
> >>   m4   .   .   .  .
> >>   m5   .   .   .  0
> >>
> >> (In the example I forgot to repeat the lines that augmented the factor
> >> levels, so m2 is not seen.)
> >>
> >> --
> >> David
> >> >
> >> >
> >> > Jim
> >> >
> >> > On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> >> >> Dear Bert,
> >> >>
> >> >> Thank you for your response. Here is the piece of R code, given the 3 data
> >> >> frames below:
> >> >>
> >> >> N <- data.frame(N=c("n1","n2","n3","n4"))
> >> >>
> >> >> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >> >>
> >> >> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> >> >>                 I=c(100,300,400))
> >> >>
> >> >> How shall I integrate N, M, and C in such a way that at the end we have
> >> >> a data frame with:
> >> >>
> >> >>
> >> >>   - list N as the column names
> >> >>   - list M as the row names
> >> >>   - the values in the cells of N * M, corresponding to the numerical
> >> >>   values in the data frame C.
> >> >>
> >> >> More precisely, the result should be:
> >> >>
> >> >>      n1   n2   n3   n4
> >> >> m1  100  300    -    -
> >> >> m2    -    -    -    -
> >> >> m3    -    -  400    -
> >> >> m4    -    -    -    -
> >> >> m5    -    -    -    -
> >> >>
> >> >> thank you !
> >> >>
> >> >>
> >> >> On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> >> >>
> >> >>> Reproducible example, please. -- In particular, what exactly does C look
> >> >>> like?
> >> >>>
> >> >>> (You should know this by now).
> >> >>>
> >> >>> -- Bert
> >> >>>
> >> >>> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> >> >>>> Dear all,
> >> >>>>
> >> >>>> Could you please advise on the R code I could use to do the
> >> >>>> following operation:
> >> >>>>
> >> >>>> 1 -- I have 2 lists of "genome coordinates": each list is composed of
> >> >>>> numbers that represent genome coordinates;
> >> >>>>
> >> >>>> let's say list N:
> >> >>>>
> >> >>>> n1
> >> >>>> n2
> >> >>>> n3
> >> >>>> n4
> >> >>>>
> >> >>>> and a list M:
> >> >>>>
> >> >>>> m1
> >> >>>> m2
> >> >>>> m3
> >> >>>> m4
> >> >>>> m5
> >> >>>>
> >> >>>> 2 -- and a data frame C, where for some pairs of coordinates (n,m) from the
> >> >>>> lists above, we have a numerical intensity;
> >> >>>>
> >> >>>> for example:
> >> >>>>
> >> >>>> n1; m1; 100
> >> >>>> n1; m2; 300
> >> >>>>
> >> >>>> The question would be: what is the most efficient R code I could use
> >> >>>> to integrate the list N, the list M, and the data frame C, in order
> >> >>>> to obtain a DATA FRAME with:
> >> >>>>
> >> >>>> -- list N as the column names
> >> >>>> -- list M as the row names
> >> >>>> -- the values in the cells of N * M, corresponding to the numerical values
> >> >>>> in the data frame C.
> >> >>>>
> >> >>>> A little example would be:
> >> >>>>
> >> >>>>      n1   n2   n3   n4
> >> >>>> m1  100    -    -    -
> >> >>>> m2  300    -    -    -
> >> >>>> m3    -    -    -    -
> >> >>>> m4    -    -    -    -
> >> >>>> m5    -    -    -    -
> >> >>>>
> >> >>>> I wrote a script in Perl, although I would like to do this in R.
> >> >>>>
> >> >>>> Many thanks ;)
> >> >>>> -- bogdan
> >> >>>>
> >> >>>>        [[alternative HTML version deleted]]
> >> >>>>
> >> >>>> ______________________________________________
> >> >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> >>>> and provide commented, minimal, self-contained, reproducible code.
> >> >>>
> >>
> >> David Winsemius
> >> Alameda, CA, USA
> >>
> >>
> >
>
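
The xtabs route quoted above can also be sketched end to end. This version sets the factor levels from the full lists before tabulating, so rows and columns with no entries (such as m2) still appear. M_levels, N_levels and wide are illustrative names, and the as.data.frame.matrix() conversion is an assumption about the desired final shape rather than something proposed in the thread.

```r
# Full sets of labels, taken from the example lists in the thread
M_levels <- c("m1", "m2", "m3", "m4", "m5")
N_levels <- c("n1", "n2", "n3", "n4")

C <- data.frame(n = c("n1", "n2", "n3"),
                m = c("m1", "m1", "m3"),
                I = c(100, 300, 400))

# Declare every possible level so that empty rows/columns are kept
C$m <- factor(C$m, levels = M_levels)
C$n <- factor(C$n, levels = N_levels)

# Sum I within each (m, n) cell; cells with no rows in C come out as 0
X <- xtabs(I ~ m + n, data = C)

# Assumption: keep the wide layout when converting to a data frame
wide <- as.data.frame.matrix(X)
print(wide)
```

Unlike the NA-filled matrix approach, this dense xtabs result reports 0 for empty cells, so it fits best when a missing pair can be treated as zero intensity.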
