[R] help on deleting NAs
Petr Pikal
petr.pikal at precheza.cz
Fri Feb 18 08:41:51 CET 2005
On 18 Feb 2005 at 13:56, Patrick Connolly wrote:
> On Thu, 17-Feb-2005 at 02:54PM -0600, KeLin at mdanderson.org wrote:
>
> |> Dear R friends
> |>
> |> My goal is to eliminate this specific group(1) if the # of NAs in this
> |> group greater than 50%(specifically say greater than 3). Would you
> |> please show me how to do it.
> |> I have a sample data as following:
> |>
> |> Thanks a lot.
> |>
> |> Kevin Lin
> |>
> |>             y group f1 f2 f3
> |> 30         NA     1  1  1  1
> |> 27         NA     1  1  2  2
> |> 48         NA     1  2  1  2
> |> 40 -0.6066416     1  2  2  1
> |> 24 -0.8323225     1  3  2  2
> |> 25  1.3401226     2  1  1  1
> |> 13  1.2619082     2  1  2  1
> |> 14 -0.4323220     2  3  1  1
> |> 36  0.8406529     2  3  2  2
> |> 21  0.9604758     3  1  2  1
> |> 18  0.9562072     3  2  1  1
> |> 45  1.1285016     3  2  1  1
> |> 50         NA     4  1  1  1
> |> 11         NA     4  1  1  2
> |> 41 -1.1017167     4  2  1  1
> |> 37  0.9661283     4  3  1  1
> |> 39 -0.2540905     4  3  1  2
>
>
> There's probably a lot of niftier ways but this will give an idea: If
> X is your dataframe above,
Hi
I am not sure if it is niftier but
> x <- read.table("clipboard",header=T)
> x[!x$group %in% which(tapply(is.na(x$y), x$group, sum) > 2), ]
            y group f1 f2 f3
25  1.3401226     2  1  1  1
13  1.2619082     2  1  2  1
14 -0.4323220     2  3  1  1
36  0.8406529     2  3  2  2
21  0.9604758     3  1  2  1
18  0.9562072     3  2  1  1
45  1.1285016     3  2  1  1
50         NA     4  1  1  1
11         NA     4  1  1  2
41 -1.1017167     4  2  1  1
37  0.9661283     4  3  1  1
39 -0.2540905     4  3  1  2
or if you want to use this 50% margin
x[!x$group %in% which (tapply(is.na(x$y),x$group,sum)/
tapply(is.na(x$y),x$group,length)>.5),]
gives you what you want.
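
One caveat with the lines above: which() returns positions in the tapply() result, so comparing them with x$group only works when the groups happen to be labelled 1, 2, 3, ... in sorted order, as they are here. A minimal sketch that avoids this by working with the per-row NA fraction is given below. It rebuilds the sample data inline so it can be run without the clipboard; the names dat, na_frac and kept are made up for illustration.

```r
# Sample data from the original question, rebuilt so the sketch is self-contained.
# The header has one field fewer than the rows, so read.table() uses the
# first column as row names.
dat <- read.table(header = TRUE, text = "
y group f1 f2 f3
30 NA 1 1 1 1
27 NA 1 1 2 2
48 NA 1 2 1 2
40 -0.6066416 1 2 2 1
24 -0.8323225 1 3 2 2
25 1.3401226 2 1 1 1
13 1.2619082 2 1 2 1
14 -0.4323220 2 3 1 1
36 0.8406529 2 3 2 2
21 0.9604758 3 1 2 1
18 0.9562072 3 2 1 1
45 1.1285016 3 2 1 1
50 NA 4 1 1 1
11 NA 4 1 1 2
41 -1.1017167 4 2 1 1
37 0.9661283 4 3 1 1
39 -0.2540905 4 3 1 2
")

# Fraction of missing y values within each group, repeated for every row.
na_frac <- ave(as.numeric(is.na(dat$y)), dat$group, FUN = mean)

# Keep only rows whose group has at most 50% missing values.
kept <- dat[na_frac <= 0.5, ]
kept
```

Because ave() returns one value per row, the result can be used directly as a row filter and the group labels never have to be matched against positions.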
Cheers
Petr
>
> > aa <- with(X, tapply((y), group, function(x) length(x[is.na(x)])))
> > names(aa[aa>2])
> [1] "1"
>
> > X[!with(X, group%in%as.numeric(names(aa[aa>2]))),]
>             y group f1 f2 f3
>  6  1.3401226     2  1  1  1
>  7  1.2619082     2  1  2  1
>  8 -0.4323220     2  3  1  1
>  9  0.8406529     2  3  2  2
> 10  0.9604758     3  1  2  1
> 11  0.9562072     3  2  1  1
> 12  1.1285016     3  2  1  1
> 13         NA     4  1  1  1
> 14         NA     4  1  1  2
> 15 -1.1017167     4  2  1  1
> 16  0.9661283     4  3  1  1
> 17 -0.2540905     4  3  1  2
> >
>
> The function in the tapply part could be made more general if 3
> doesn't always constitute a majority.
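
One way the generalisation mentioned above might look is sketched here, assuming the cutoff is expressed as a fraction of each group's size rather than a fixed count. The function name drop_na_heavy_groups, its arguments, and the toy data frame are all hypothetical and not part of the original thread.

```r
# Drop every group in which more than max_frac of the values are NA.
# data: a data frame; value, group: column names given as strings.
drop_na_heavy_groups <- function(data, value, group, max_frac = 0.5) {
  # Per-group fraction of missing values (mean of a logical vector).
  frac <- tapply(is.na(data[[value]]), data[[group]], mean)
  # Labels of the groups that exceed the cutoff.
  bad <- names(frac)[frac > max_frac]
  # Compare labels as character so arbitrary group codes work.
  data[!(as.character(data[[group]]) %in% bad), , drop = FALSE]
}

# Toy data with non-consecutive group labels.
toy <- data.frame(
  y     = c(NA, NA, 1, 2, NA, 3),
  group = c(10, 10, 10, 20, 20, 20)
)

# Group 10 has 2 of 3 values missing and is dropped; group 20 has 1 of 3.
res <- drop_na_heavy_groups(toy, "y", "group")
res
```

Matching on group labels rather than on positions keeps the sketch usable when the group codes are not 1, 2, 3, ....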
>
> HTH
>
> --
> Patrick Connolly
> HortResearch
> Mt Albert
> Auckland
> New Zealand
> Ph: +64-9 815 4200 x 7188
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
> I have the world`s largest collection of seashells. I keep it on all
> the beaches of the world ... Perhaps you`ve seen it. ---Steven Wright
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
Petr Pikal
petr.pikal at precheza.cz