[R] avoiding time-consuming for loop renaming identifiers
Benilton Carvalho
bcarvalh at jhsph.edu
Sat Jul 21 03:55:20 CEST 2007
as.integer(factor(dta[["school_id"]]))
b
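
For illustration, here is a minimal sketch of that one-liner on a toy data
frame built from the sample rows in the question (not the real 600000-row
data). One caveat, stated as an assumption: factor() numbers the schools in
sorted order of school_id, so it matches the loop's output only when the rows
are already grouped in ascending order, as they are in the sample. The names
sid_first and sid_runs are hypothetical and show two alternatives in case the
data are ordered differently.

```r
# Toy data mirroring the sample in the question
dta <- data.frame(
  school_id = c(8, 8, 8, 8, 20, 20, 20, 31, 31),
  y         = c(9.87, 8.89, 7.89, 8.88, 6.78, 9.99, 8.79, 10.1, 11)
)

# The one-liner: codes follow the sorted order of the distinct ids
dta$sid <- as.integer(factor(dta[["school_id"]]))

# Alternative (hypothetical name): codes follow the order of first appearance
sid_first <- match(dta$school_id, unique(dta$school_id))

# Alternative (hypothetical name): mimics the loop, starting a new code
# whenever the id differs from the one in the previous row
sid_runs <- cumsum(c(TRUE, dta$school_id[-1] != dta$school_id[-nrow(dta)]))

dta$sid
# [1] 1 1 1 1 2 2 2 3 3
```

All three are vectorized, so they should avoid the row-by-row assignment that
likely made the loop slow. On the sample they give the same result; they
differ only if the ids are unsorted or a school appears in more than one run.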
On Jul 20, 2007, at 9:26 PM, toby909 at gmail.com wrote:
> Hi All
>
> I was wondering if I can avoid a time-consuming for loop on my
> 600000 obs dataset.
>
> school_id y
> 8 9.87
> 8 8.89
> 8 7.89
> 8 8.88
> 20 6.78
> 20 9.99
> 20 8.79
> 31 10.1
> 31 11
>
> There are, say, 143 different schools in this 600000 obs dataset.
>
> I need to have sequential identifiers, 1,2,3,4,5,...,143.
>
> I was using an awkward for loop that took 30 minutes to run.
> sid = 1
> dta$sid[1] = 1
> for (i in 2:nrow(dta)) {
>   if (dta$school_id[i] != dta$school_id[i-1]) sid = sid+1
>   dta$sid[i] = sid
> }
>
> Any hints appreciated.
>
> Thanks Toby