[Bioc-devel] rbind,DataFrame vs rbind.data.frame
Hervé Pagès
hpages at fhcrc.org
Wed Apr 30 19:36:15 CEST 2014
Thanks! I didn't see the test in test_IRanges-class.R. In the process
of moving things from IRanges to S4Vectors, I keep running into code
that was written a long time ago. When the code looks suspect, I try
to run it in various ways to check its behavior. I was doing this for
our internal utility rbind.mcols() when I ran into this rbind,DataFrame
vs rbind.data.frame discrepancy. There are other problems with
rbind.mcols() and I might come back to that later.
H.
On 04/30/2014 04:58 AM, Michael Lawrence wrote:
> There is a pre-historic coercion in rbind,DataFrame that dates back to
> XDataFrame that turns every combined column into the class of the column
> in the first argument to rbind(). At least with Emacs VCS I couldn't get
> it to go back far enough in the logs. There is a comment there that I no
> longer understand and may no longer be relevant.
>
> So I got rid of that line, and somewhat amusingly there was a test
> almost *exactly* like your example except it expected the undesired
> result. I think Patrick wrote it but it's been touched too many times
> over the years to easily know, and I'm not sure what he was thinking.
>
> Checked it in; we'll see how it goes.
>
> Thanks,
> Michael
>
>
> On Tue, Apr 29, 2014 at 11:24 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
>
> Hi Michael,
>
> I noticed this difference between DataFrame vs data.frame when doing
> rbind():
>
> > rbind(data.frame(aa=NA), data.frame(aa=1:2))
> aa
> 1 NA
> 2 1
> 3 2
>
> > rbind(DataFrame(aa=NA), DataFrame(aa=1:2))
> DataFrame with 3 rows and 1 column
> aa
> <logical>
> 1 NA
> 2 TRUE
> 3 TRUE
>
> If the DataFrame with NAs is put after the DataFrame with integers,
> things look better:
>
> > rbind(DataFrame(aa=1:3), DataFrame(aa=NA))
> DataFrame with 4 rows and 1 column
> aa
> <integer>
> 1 1
> 2 2
> 3 3
> 4 NA
>
> As a consequence, combining 2 Vector objects, one with no metadata cols
> and one with metadata cols, will lose the data if the object with no
> metadata cols is put first:
>
> ir1 <- IRanges(1:2, 5)
> ir2 <- IRanges(11:13, 14)
> mcols(ir2) <- DataFrame(score=2:0)
>
> Then:
>
> > mcols(c(ir1, ir2))
> DataFrame with 5 rows and 1 column
> score
> <logical>
> 1 NA
> 2 NA
> 3 TRUE
> 4 TRUE
> 5 FALSE
>
> How hard would it be to bring the rbind,DataFrame method in line with
> rbind.data.frame?
>
> Thanks,
> H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
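
The behavior discussed in this thread can be reproduced in base R without
loading S4Vectors. The sketch below is only an emulation under stated
assumptions: combine_column() is a hypothetical helper written for this
illustration, not the actual rbind,DataFrame code. It combines two atomic
columns with c() and then, optionally, coerces the result to the class of
the first column, which is the step Michael describes removing.

```r
# Hypothetical helper (an assumption for illustration, NOT the S4Vectors
# source): combine two atomic columns, then optionally coerce the result
# to the class of the first column, as the old method is described doing.
combine_column <- function(first, second, coerce_to_first = TRUE) {
  combined <- c(first, second)  # c() promotes logical NA + integer to integer
  if (coerce_to_first) {
    combined <- methods::as(combined, class(first))
  }
  combined
}

# Old behavior: the logical NA column comes first, so integers become logical.
old_result <- combine_column(NA, 1:2)
print(old_result)   # NA TRUE TRUE

# Without the coercion the integer values survive.
new_result <- combine_column(NA, 1:2, coerce_to_first = FALSE)
print(new_result)   # NA 1 2

# Same effect on the metadata column example: all-NA column first, scores second.
scores_old <- combine_column(c(NA, NA), 2:0)
print(scores_old)   # NA NA TRUE TRUE FALSE

# Base R reference point: rbind.data.frame keeps the values as integers.
base_result <- rbind(data.frame(aa = NA), data.frame(aa = 1:2))
print(class(base_result$aa))   # "integer"
```

With coerce_to_first = FALSE the helper agrees with rbind.data.frame on
these inputs, which is the result asked for in the original message.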