[R] Help with data management

Jim Lemon drjimlemon at gmail.com
Fri Feb 24 01:00:23 CET 2017


Hi Andre,
As far as I am aware, merge() only works on two data frames at a time,
so I think you would have to do it one by one. It is probably possible
to program this to operate on your list of data frames, but I suspect
that it would take as much time as a bit of copying and pasting. If
your data is being extracted from an external database, it may be
possible to perform the operation in SQL, but I don't have the time to
work that out at the moment.
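
For what it is worth, here is a minimal sketch of what the programmed
version might look like, assuming the data frames sit in a named list
like the "mylist" from the original post, that every element has
"Family" and "Hits" columns, and that the list names are unique. It
uses Reduce() to fold merge() over the list; that is one possible
approach, not the only one, and "hits" is just an illustrative name.

```r
# Rebuild the three example data frames from the thread and put them in a list.
A <- data.frame(Family = c("c", "d", "e"), NormalizedCount = c(4.4, 5.4, 6.4),
                Hits = c(1, 2, 3), stringsAsFactors = FALSE)
B <- data.frame(Family = c("c", "f", "a"), NormalizedCount = c(3.2, 6.4, 4.4),
                Hits = c(1, 4, 3), stringsAsFactors = FALSE)
C <- data.frame(Family = c("q", "o", "f"), NormalizedCount = c(7.2, 9.4, 41.4),
                Hits = c(10, 4, 5), stringsAsFactors = FALSE)
mylist <- list(A = A, B = B, C = C)

# Keep Family plus Hits from each data frame, renaming Hits to the name of
# the list element so the merged columns do not collide.
hits <- lapply(names(mylist), function(nm) {
  df <- mylist[[nm]][, c("Family", "Hits")]
  names(df)[2] <- nm
  df
})

# Fold merge() over the list: merge(merge(A, B), C), and so on for any length.
D <- Reduce(function(x, y) merge(x, y, by = "Family", all = TRUE), hits)

# Families missing from a data frame come back as NA; replace those with 0.
D[is.na(D)] <- 0
D
```

The same lines should work unchanged on a list of 48 data frames, as
long as each one has a single row per Family.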

Jim


On Fri, Feb 24, 2017 at 10:53 AM, André Luis Neves <andrluis at ualberta.ca> wrote:
> Hi, Jim:
>
> Your code worked great, but I have 48 dataframes. After merging A and B in
> D, you merged C in D. In this case, do I need to add them one by one until
> getting the 48 dataframes merged in one?
>
> Thank you for your great help.
>
> Andre
>
> On Thu, Feb 23, 2017 at 4:24 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
>>
>> Hi Andre,
>> This might do it:
>>
>> A<-data.frame(c("c", "d", "e"),4.4:6.8,c(1,2,3))
>> colnames(A) <- c ("Family", "NormalizedCount", "Hits")
>> B<-data.frame(c("c", "f", "a"),c(3.2,6.4, 4.4), c(1,4,3))
>> colnames(B) <- c ("Family", "NormalizedCount", "Hits")
>> C<-data.frame(c("q", "o", "f"),c(7.2,9.4, 41.4), c(10,4,5))
>> colnames(C) <- c ("Family", "NormalizedCount", "Hits")
>> keepcols<-c("Family","Hits")
>> D<-merge(A[,keepcols],B[,keepcols],by="Family",all=TRUE)
>> D<-merge(D,C[,keepcols],by="Family",all=TRUE)
>> D[,2:4]<-sapply(D[,-1],function(x) { x[is.na(x)]<-0; x })
>> names(D)<-c("Family","A","B","C")
>>
>> Jim
>>
>>
>> On Fri, Feb 24, 2017 at 9:37 AM, André Luis Neves <andrluis at ualberta.ca> wrote:
>> > Dear R users,
>> >
>> > I have the following dataframes (A, B, and C) stored in a list:
>> >
>> > A= data.frame(c("c", "d", "e"),4.4:6.8,c(1,2,3))
>> > colnames(A) <- c ("Family", "NormalizedCount", "Hits")
>> > A
>> >
>> >
>> > B= data.frame(c("c", "f", "a"),c(3.2,6.4, 4.4), c(1,4,3))
>> > colnames(B) <- c ("Family", "NormalizedCount", "Hits")
>> > B
>> >
>> >
>> > C= data.frame(c("q", "o", "f"),c(7.2,9.4, 41.4), c(10,4,5))
>> > colnames(C) <- c ("Family", "NormalizedCount", "Hits")
>> > C
>> >
>> > mylist <- list(A=A,B=B,C=C)
>> > mylist
>> >
>> >
>> > My idea is to merge the three dataframes into another dataframe (let's
>> > name it 'D') with a structure in which the rows are the Families and
>> > the columns are the "Hits" of each family detected in the dataframes A,
>> > B, and C. If a given 'Family' does NOT have a 'Hit' in a dataframe, we
>> > need to assign the number 0 to it.
>> >
>> > The dataframe 'D' would need to be populated as follows:
>> >
>> >
>> > Family   A   B   C
>> > c        1   1   0
>> > d        2   0   0
>> > e        3   0   0
>> > f        0   4   5
>> > a        0   3   0
>> > q        0   0  10
>> > o        0   0   4
>> >
>> >
>> > Thank you very much for your great help,
>> >
>> >
>> >
>> > --
>> > Andre
>> >
>
>
>
>
> --
> Andre


