[R] help on deleting NAs
Patrick Connolly
p.connolly at hortresearch.co.nz
Fri Feb 18 01:56:33 CET 2005
On Thu, 17-Feb-2005 at 02:54PM -0600, KeLin at mdanderson.org wrote:
|> Dear R friends
|>
|> My goal is to eliminate a specific group (here, group 1) if the number
|> of NAs in that group is greater than 50% (specifically, say, greater
|> than 3). Would you please show me how to do it?
|> I have sample data as follows:
|>
|> Thanks a lot.
|>
|> Kevin Lin
|>
|>             y group f1 f2 f3
|> 30         NA     1  1  1  1
|> 27         NA     1  1  2  2
|> 48         NA     1  2  1  2
|> 40 -0.6066416     1  2  2  1
|> 24 -0.8323225     1  3  2  2
|> 25  1.3401226     2  1  1  1
|> 13  1.2619082     2  1  2  1
|> 14 -0.4323220     2  3  1  1
|> 36  0.8406529     2  3  2  2
|> 21  0.9604758     3  1  2  1
|> 18  0.9562072     3  2  1  1
|> 45  1.1285016     3  2  1  1
|> 50         NA     4  1  1  1
|> 11         NA     4  1  1  2
|> 41 -1.1017167     4  2  1  1
|> 37  0.9661283     4  3  1  1
|> 39 -0.2540905     4  3  1  2
There are probably a lot of niftier ways, but this will give you an idea.
If X is your data frame above:
> aa <- with(X, tapply((y), group, function(x) length(x[is.na(x)])))
> names(aa[aa>2])
[1] "1"
> X[!with(X, group%in%as.numeric(names(aa[aa>2]))),]
            y group f1 f2 f3
6   1.3401226     2  1  1  1
7   1.2619082     2  1  2  1
8  -0.4323220     2  3  1  1
9   0.8406529     2  3  2  2
10  0.9604758     3  1  2  1
11  0.9562072     3  2  1  1
12  1.1285016     3  2  1  1
13         NA     4  1  1  1
14         NA     4  1  1  2
15 -1.1017167     4  2  1  1
16  0.9661283     4  3  1  1
17 -0.2540905     4  3  1  2
>
The function in the tapply part could be made more general if 3
doesn't always constitute a majority.
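
One way that generalisation might look is to compare the proportion of NAs in each group against a cutoff, rather than a fixed count. This is only a sketch: the data frame is retyped from the sample in the question (without the original row names), and the names max_na_frac, na_frac, drop_groups and X_kept are made up for illustration.

```r
# Sample data retyped from the question (original row names omitted).
X <- data.frame(
  y = c(NA, NA, NA, -0.6066416, -0.8323225,
        1.3401226, 1.2619082, -0.4323220, 0.8406529,
        0.9604758, 0.9562072, 1.1285016,
        NA, NA, -1.1017167, 0.9661283, -0.2540905),
  group = rep(1:4, times = c(5, 4, 3, 5)),
  f1 = c(1, 1, 2, 2, 3, 1, 1, 3, 3, 1, 2, 2, 1, 1, 2, 3, 3),
  f2 = c(1, 2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1),
  f3 = c(1, 2, 2, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2)
)

# Hypothetical cutoff: drop a group when more than half of its y values are NA.
max_na_frac <- 0.5

# mean() of a logical vector gives the proportion of TRUEs,
# so this is the fraction of NAs in y for each group.
na_frac <- with(X, tapply(y, group, function(x) mean(is.na(x))))

# Labels of the groups whose NA fraction exceeds the cutoff.
drop_groups <- names(na_frac)[na_frac > max_na_frac]

# Keep only the rows whose group is not among those to drop.
X_kept <- X[!(as.character(X$group) %in% drop_groups), ]
```

With the sample data, group 1 has 3 NAs out of 5 values and is the only group over the cutoff, so the result should match the output shown above apart from the row names.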
HTH
--
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world`s largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you`ve seen it. ---Steven Wright
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~