[R] Unique.data.frame...still getting duplicates
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Jun 25 08:13:10 CEST 2004
Your code cannot possibly work in a recent version of R, so please try the
current version (1.9.1).
data[ID, ] is what? Why not just call unique() on ID?
BTW, if you call methods such as unique.data.frame directly you are adding a
possible source of error -- here I suspect data[ID, ] is not what you intend.
Please call the generic.
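
A minimal sketch of the point above, on made-up data (the data frame `dat`, its
column `x` and the values are hypothetical stand-ins, not the poster's data).
It assumes ID is a factor column of the data frame. Indexing rows by a factor
uses the factor's integer codes as row numbers, which is probably why
data[ID, ] misbehaves; calling the generics unique() and duplicated() on the
ID column avoids that.

```r
# Hypothetical toy data standing in for the poster's data frame.
dat <- data.frame(ID = factor(c("a", "b", "a", "c", "b")),
                  x  = c(1, 2, 3, 4, 5))

# Row-indexing by a factor uses its integer codes as row numbers,
# so this picks rows by code rather than one row per ID.
wrong <- unique(dat[dat$ID, ])
any(duplicated(wrong$ID))   # IDs can still repeat here

# Distinct ID values, via the generic.
ids <- unique(dat$ID)

# One row per ID: keep the first row seen for each ID.
dedup <- dat[!duplicated(dat$ID), ]
any(duplicated(dedup$ID))   # no ID is repeated in dedup
```

Keeping the first row per ID is only one possible choice; which row should
represent each ID depends on the data.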
On Fri, 25 Jun 2004, F Z wrote:
> Hi there
>
> I have a data frame with about 65,000 rows and 8 variables. I am trying to
> get rid of the double entries of a factor variable "ID" so I can get a
> unique observation for each ID
>
> I tried:
>
> >dupl_unique.data.frame(data[ID,]) #I obtain a data frame with 21,547
> >observations..so far so good, but then when I check for duplicates
>
> >d_duplicated(dupl2$ID)
> >summary(as.factor(d))
> FALSE TRUE
> 6836 14711
>
> Meaning that I am still getting 14,711 duplicates!
>
> I tried changing the ID type to integer and repeated the process but I got
> identical results....what am I missing?
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595