[R] Tables package - remove NAs and NaN
Duncan Murdoch
murdoch.duncan at gmail.com
Tue Apr 23 14:13:10 CEST 2013
On 13-04-23 6:31 AM, Duncan Murdoch wrote:
> On 13-04-22 10:40 PM, David Winsemius wrote:
>>
>> On Apr 22, 2013, at 5:49 PM, Santosh wrote:
>>
>>> Dear Rxperts,
>>> q <- data.frame(p=rep(c("A","B"),each=10,len=30),
>>> a=rep(c(1,2,3),each=10),id=seq(30),
>>> b=round(runif(30,10,20)),
>>> c=round(runif(30,40,70)))
>>> The operation below...
>>> tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
>>> (mean+sd),data=q)
>>> yields some rows of NAs and NaN as shown below
>>>
>>>                b            c
>>> p   a  N   mean    sd   mean     sd
>>> A   1 10  16.30 2.497  52.30  9.358
>>>     2  0    NaN    NA    NaN     NA
>>>     3 10  15.60 2.716  60.30  8.001
>>> B   1  0    NaN    NA    NaN     NA
>>>     2 10  15.40 2.366  57.70 10.414
>>>     3  0    NaN    NA    NaN     NA
>>> All   30  15.77 2.473  56.77  9.601
>>>
>>> How do I remove the rows having N=0 ?
>>> I would like the resulting table look like..
>>>                b            c
>>> p   a  N   mean    sd   mean     sd
>>> A   1 10  16.30 2.497  52.30  9.358
>>>     3 10  15.60 2.716  60.30  8.001
>>> B   2 10  15.40 2.366  57.70 10.414
>>> All   30  15.77 2.473  56.77  9.601
>>
>> Here's a bit of a hack:
>>
>> tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b + c)*
>> (mean+sd),data=q)
>>
>>             b            c
>> p a  N  mean     sd  mean    sd
>> A 1 10  12.8 0.7888  52.1 8.020
>> B 2 10  16.3 3.0569  54.9 8.711
>> A 3 10  14.6 3.7771  56.5 6.980
>>
>> I have been rather hoping that Duncan Murdoch would have noticed the earlier thread, but maybe he can comment on whether there is a more direct route.
>>
>
> This isn't something that the package is designed to handle: if you say
> p*a, it wants all combinations of p and a.
>
> If I wanted a table like that, I'd use a different hack. One
> possibility is to create that interaction column, but display it as just
> the initial letter, labelled p, and then add another column to contain
> the a values as data. It would be tricky to get the formatting right.
>
> Another possibility is to generate the whole table with the N=0 rows,
> and then post-process it to remove those rows, and adjust the row labels
> appropriately. This approach probably gives the nicer result, but the
> post-processing is quite messy: you need to delete some rows from the
> table, from its rowLabels attribute, and from the justification
> attributes of both the table and its rowLabels. (I should add a [
> method to the package to hide this messiness.)
I've done this now, in version 0.7.54 on R-forge. To leave out the rows
with N=0, you can select a subset of the table where N (the first
column) is non-zero:
    tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) +
                   (b + c)*(mean+sd), data=q)
    tab[ tab[,1] > 0, ]
and it produces this:
               b           c
p   a  N   mean    sd  mean     sd
A   1 10  16.20 3.458  56.3 10.155
    3 10  13.60 2.119  58.1  8.075
B   2 10  14.40 2.547  51.2  9.438
All   30  14.73 2.888  55.2  9.419
Indexing of tables isn't as general as indexing of matrices, but most of
the simple forms should work. I haven't tested yet, but I expect this
will be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.
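
For anyone who wants to try this end to end, here is a minimal sketch that puts the pieces from this thread into one script. It assumes an installed version of tables that includes the `[` method (0.7.54 or later). The set.seed() call and the name `sub` are additions for illustration only, so the means and sds will not match the output printed above.

```r
library(tables)

set.seed(1)  # assumption: fixed seed, added only so reruns give the same numbers

# Data from the original question: only A/1, B/2 and A/3 actually occur.
q <- data.frame(p  = rep(c("A", "B"), each = 10, len = 30),
                a  = rep(c(1, 2, 3), each = 10),
                id = seq(30),
                b  = round(runif(30, 10, 20)),
                c  = round(runif(30, 40, 70)))

# Full table: p*a requests every combination, so empty cells show N = 0.
tab <- tabular(((p = factor(p)) * (a = factor(a)) + 1) ~
                 (N = 1) + (b + c) * (mean + sd), data = q)

# Keep only the rows whose first column (N) is non-zero.
# `sub` is a hypothetical name, not part of the package.
sub <- tab[ tab[, 1] > 0, ]

print(nrow(tab))  # six p*a combinations plus the "All" row
print(sub)
```

The subset is still a tabular object, so it prints with the row labels already adjusted.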
Duncan Murdoch