[R] tapply
Liaw, Andy
andy_liaw at merck.com
Tue Jun 21 19:30:54 CEST 2005
Try:
> (x <- factor(1:2, levels=1:5))
[1] 1 2
Levels: 1 2 3 4 5
> (x <- x[, drop=TRUE])
[1] 1 2
Levels: 1 2
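
The same idea carries over to a factor column in a data frame after subsetting. A minimal sketch, assuming made-up data (the names y, id and amt are illustrative only and not taken from the poster's data):

```r
# Hypothetical data: 'id' is a factor with three levels.
y <- data.frame(id  = factor(c("a", "a", "b", "c")),
                amt = c(1, 2, 3, 4))

# Subsetting keeps all original levels, even those with no rows left.
z <- y[y$id %in% c("a", "b"), ]
levels(z$id)                        # still includes "c"

# tapply() returns one cell per level, so the empty level "c" shows up as NA.
before <- tapply(z$amt, z$id, sum)

# Drop the unused levels, then aggregate again: only "a" and "b" remain.
z$id <- z$id[, drop = TRUE]
after <- tapply(z$amt, z$id, sum)

print(before)
print(after)
```

The column stays a factor throughout; only its unused levels are removed. In current versions of R, droplevels(z) should do the same for every factor column at once.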
Andy
> From: Weiwei Shi [mailto:helprhelp at gmail.com]
>
> Even before I tried it, I realized it must be true when I read
> this reply! Great job! Thanks, Andy.
>
> > str(z)
> `data.frame': 235 obs. of 2 variables:
> $ CLAIMNUM : Factor w/ 1907 levels "0","10000001849",..: 1083 1083
> 1083 1582 1582 1084 1681 1681 1391 1391 ...
> $ SIU.SAVED: int 475 3000 3000 0 0 4352 0 0 4500 3000 ...
>
> So, I have another general question: how do I avoid this when I
> do the matching?
> In my case, claimnum does not have to be a factor. I think I can use
> as.integer on it to de-factor it. But I want to know how to do it while
> keeping it as a factor. BTW, what's your way to drop those levels? :)
>
> weiwei
>
>
> On 6/21/05, Liaw, Andy <andy_liaw at merck.com> wrote:
> > What does str(z) say? I suspect the second column is a factor, which,
> > after the subsetting, has some empty levels. If so, just drop those
> > levels.
> >
> > Andy
> >
> > > From: Weiwei Shi
> > >
> > > hi
> > > I tried all the methods suggested above:
> > > ave and rowsum with the "with" function work for my situation. I think
> > > the problem might not be due to tapply.
> > > My data z comes from
> > > z<-y[y[[1]] %in% x[[2]], c(1,9)]
> > >
> > > z is supposed to have no entries for the values not matched
> > > between x and y.
> > >
> > > However, when I run tapply, the result also includes those
> > > non-matched entries. I used the is.na function to remove those entries
> > > from z first and then ran tapply again, but the result is the same: the
> > > NAs and the non-matched results are still there. That's what I mean
> > > by "it doesn't work".
> > >
> > > Is there something I missed here so that z "implicitly" has some
> > > "trace" back to y dataset?
> > >
> > > thanks,
> > >
> > > On 6/20/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> > > > On 6/20/05, Weiwei Shi <helprhelp at gmail.com> wrote:
> > > > > hi,
> > > > > i have another question on tapply:
> > > > > i have a dataset z like this:
> > > > > 5540 389100307391 2600
> > > > > 5541 389100307391 2600
> > > > > 5542 389100307391 2600
> > > > > 5543 389100307391 2600
> > > > > 5544 389100307391 2600
> > > > > 5546 381300302513 NA
> > > > > 5547 387000307470 NA
> > > > > 5548 387000307470 NA
> > > > > 5549 387000307470 NA
> > > > > 5550 387000307470 NA
> > > > > 5551 387000307470 NA
> > > > > 5552 387000307470 NA
> > > > >
> > > > > I want to sum the column 3 by column 2.
> > > > > I removed NA by calling:
> > > > > tapply(z[[3]], z[[2]], sum, na.rm=T)
> > > > > but it does not work.
> > > > >
> > > > > then, i used
> > > > > > z1<-z[!is.na(z[[3]]),]
> > > > > and repeat
> > > > > still doesn't work.
> > > > >
> > > > > please help.
> > > > >
> > > >
> > > > Depending on what you want you may be able to use rowsum:
> > > >
> > > > - display only groups that have at least one non-NA with the sum
> > > > being the sum of the non-NAs:
> > > >
> > > > with(na.omit(z), rowsum(V3, V2))
> > > >
> > > > - display all groups with the sum being NA if any member is NA:
> > > >
> > > > rowsum(z$V3, z$V2)
> > > >
> > >
> > >
> > > --
> > > Weiwei Shi, Ph.D
> > >
> > > "Did you always know?"
> > > "No, I did not. But I believed..."
> > > ---Matrix III
> > >
> >
> >
> >
> >
>
>