[Rd] nrow(rbind(character(), character())) returns 2 (as documented but very unintuitive, IMHO)

Pages, Herve hpages using fredhutch.org
Fri May 17 01:45:16 CEST 2019


Hi Gabe,

   ncol(data.frame(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
   # [1] 2

   ncol(data.frame(aa="a", AA="A"))
   # [1] 2

   ncol(data.frame(aa=character(0), AA=character(0)))
   # [1] 2

   ncol(cbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
   # [1] 2

   ncol(cbind(aa="a", AA="A"))
   # [1] 2

   ncol(cbind(aa=character(0), AA=character(0)))
   # [1] 2

   nrow(rbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
   # [1] 2

   nrow(rbind(aa="a", AA="A"))
   # [1] 2

   nrow(rbind(aa=character(0), AA=character(0)))
   # [1] 2

hmmm... not sure why ncol(cbind(aa=character(0), AA=character(0))) or 
nrow(rbind(aa=character(0), AA=character(0))) should do anything 
different from what they do.
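To make the shapes concrete, here is a minimal sketch using dim() on the zero-length calls above. The helper has_multiple_data_rows() is hypothetical (it is not part of base R) and only illustrates one way a downstream check could test both extents rather than nrow() alone.

```r
# Zero-length inputs still contribute one row (or one column) each.
m_r <- rbind(aa = character(0), AA = character(0))
m_c <- cbind(aa = character(0), AA = character(0))
dim(m_r)  # 2 0
dim(m_c)  # 0 2

# Hypothetical helper (an assumption, not base R): TRUE only when the
# matrix has more than one row AND at least one column of data.
has_multiple_data_rows <- function(m) nrow(m) > 1L && ncol(m) > 0L

has_multiple_data_rows(m_r)              # FALSE
has_multiple_data_rows(rbind("a", "b"))  # TRUE
```

Whether a 2 x 0 matrix should count as "more than one row" depends on the caller; the helper just makes that choice explicit.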

In my experience, and more generally speaking, the desire to treat 
0-length vectors as a special case that deviates from the 
non-zero-length case has never been productive.
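As a rough illustration of that point, the sketch below checks that, for two vectors of equal length n, rbind() and cbind() follow the same shape rule whether n is zero or not. The function name shape_rule_holds is made up for this example.

```r
# Hypothetical check (name is an assumption): for two character vectors
# of equal length n, rbind() should give a 2 x n matrix and cbind() an
# n x 2 matrix, including when n is 0.
shape_rule_holds <- function(n) {
  x <- character(n)
  y <- character(n)
  identical(dim(rbind(x, y)), c(2L, n)) &&
    identical(dim(cbind(x, y)), c(n, 2L))
}

# Same rule for lengths 0 through 3, with no special case for n = 0.
stopifnot(all(vapply(0:3, shape_rule_holds, logical(1))))
```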

H.


On 5/16/19 13:17, Gabriel Becker wrote:
> Hi all,
>
> Apologies if this has been asked before (a quick google didn't find it for
> me), and I know this is a case of behaving as documented, but it's so
> unintuitive (to me at least) that I figured I'd bring it up here anyway. I
> figure it's probably not going to be changed, but I'm happy to submit a
> patch if this is something R-core feels can/should change.
>
> So I recently got bitten by the fact that
>
>> nrow(rbind(character(), character()))
> [1] 2
>
>
> I was checking whether the result of an rbind call had more than one row,
> and that unexpectedly returned TRUE, causing all sorts of shenanigans
> downstream, as I'm sure you can imagine.
>
> Now I know that from ?rbind
>
>>       For ‘cbind’ (‘rbind’), vectors of zero length (including ‘NULL’)
>>       are ignored unless the result would have zero rows (columns), for
>>       S compatibility.  (Zero-extent matrices do not occur in S3 and are
>>       not ignored in R.)
>
> But there are a couple of things here. First, for the rowbind case this
> reads as "if there would be zero columns, the vectors will not be
> ignored". This wording implies to me that not ignoring the vectors is a
> remedy to the "problem" of the potential for a zero-column return, but
> that's not the case. The result still has 0 columns, it just does not also
> have zero rows. So even if the behavior is not changed, perhaps this
> wording can be massaged for clarity?
>
> The other issue, which I admit is likely a problem with my intuition, but
> which I don't think I'm alone in having, is that even if I can't have a 0x0
> matrix (which is what I'd prefer) I would have expected/preferred a 1x0
> matrix, the reasoning being that if we must avoid a 0x0 return value, we
> would do the minimum required to avoid it, which is to not ignore the first
> length 0 vector, to ensure a non-zero-extent matrix, but then ignore the
> remaining ones as they contain information for 0 new rows.
>
> Of course I can program around this now that I know the behavior, but
> again, it's so unintuitive (even for someone with a fairly well developed
> intuition for R's sometimes "quirky" behavior) that I figured I'd bring it
> up.
>
> Thoughts?
>
> Best,
> ~G

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


