[R] Automatic Recoding
William Dunlap
wdunlap at tibco.com
Thu Sep 1 18:51:03 CEST 2011
You could also use match() directly instead
of going through factors. Any of the following
would map your inputs to small integers:
> match(x, x)-1
[1] 0 1 0 3 0 5 0 7 8 9
> match(x, unique(x))-1
[1] 0 1 0 2 0 3 0 4 5 6
> match(x, sort(unique(x)))-1
[1] 3 4 3 6 3 2 3 0 5 1
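
If the values are held in a two-column matrix, as the original question suggests, the same match() idea can be applied while keeping the edge-list shape. This is only a sketch: the names edges, ids and recoded are made up for illustration, and numbering the ids in reading order (row by row) is an assumption about what is wanted.

```r
# Edge list from the original question, one row per edge.
edges <- matrix(c(676529098667,  1000198767829,
                  676529098667,  100867672856227,
                  676529098667,  91098726278,
                  676529098667,  98928373,
                  1092837363526, 716172829),
                ncol = 2, byrow = TRUE)

# c(t(edges)) flattens the matrix row by row, so the ids are
# numbered in the order they appear when reading the file.
ids <- unique(c(t(edges)))

# match() returns a plain vector in column-major order;
# matrix() puts it back into the original two-column shape.
recoded <- matrix(match(edges, ids) - 1L, ncol = 2)
recoded
```

With the data above the rows of recoded are 0 1, 0 2, 0 3, 0 4 and 5 6, which is the layout asked for in the question.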
Your numbers are pretty big, c. 2^46. Above 2^53 a double
can no longer represent every integer, so you won't always
be able to distinguish between adjacent numbers:
> (2^53 + 5) == (2^53 + 4)
[1] TRUE
so you may want to input them as character strings.
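
One possible way to do that is to have read.table() keep every column as character and then recode the strings with match(). This is a sketch under assumptions: the inline text stands in for the real file (which would be passed through the file argument instead of text), and the names txt, edges, ids, recoded and big are invented for the example.

```r
# Hypothetical input: the edge list from the question as inline text.
txt <- "676529098667 1000198767829
676529098667 100867672856227
676529098667 91098726278
676529098667 98928373
1092837363526 716172829"

# colClasses = "character" keeps each id as a string, so no digits
# are lost however long the ids get.
edges <- as.matrix(read.table(text = txt, colClasses = "character"))

# Same recoding as for numbers: match() compares strings just as well.
ids <- unique(c(t(edges)))
recoded <- matrix(match(edges, ids) - 1L, ncol = 2)

# Two ids that differ only in the last digit, at and just above 2^53.
big <- c("9007199254740992", "9007199254740993")
n_as_strings <- length(unique(big))              # 2: still distinct
n_as_doubles <- length(unique(as.numeric(big)))  # 1: collapsed into one
```

The strings stay distinct because they are never converted to doubles; only the small integer codes are numeric.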
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Thursday, September 01, 2011 9:24 AM
> To: Thomas Chesney
> Cc: r-help at r-project.org
> Subject: Re: [R] Automatic Recoding
>
>
> On Sep 1, 2011, at 10:54 AM, Thomas Chesney wrote:
>
> > I have a text file full of numbers (it's an edgelist for a graph) and
> > I would like to recode the numbers as they are way too big to work
> > with. So for instance the following:
> >
> > 676529098667 1000198767829
> > 676529098667 100867672856227
> > 676529098667 91098726278
> > 676529098667 98928373
> > 1092837363526 716172829
> >
> > would become:
> >
> > 0 1
> > 0 2
> > 0 3
> > 0 4
> > 5 6
> >
> > i.e. all 676529098667 would become 0, all 1000198767829 would become
> > 1 etc.
>
> Depending on how that set of numbers was entered, see if this is helpful:
>
> 1) First, entering across and then down (row by row).
>
> x <- c(676529098667 , 1000198767829,
> 676529098667 , 100867672856227,
> 676529098667 , 91098726278,
> 676529098667 , 98928373,
> 1092837363526 ,716172829)
> as.numeric(factor(x, levels=unique(x)) )
> # [1] 1 2 1 3 1 4 1 5 6 7
>
> 2) Now entering down and then over (column by column).
>
> # Matrices are stored in column-first order.
> x2 <- matrix(x, ncol=2, byrow=TRUE)
>
> # c() is needed to avoid a warning.
> as.numeric(factor(x2, levels=unique(c(x2))) )
> # [1] 1 1 1 1 2 3 4 5 6 7
>
> > If I read all the values into a matrix, is there a pre-existing
> > function that can do the recoding?
>
> You can just subtract one from the factor results. The trick is to
> supply explicit levels in the order you want. Otherwise the levels
> would be sorted first.
>
> --
> David Winsemius, MD
> West Hartford, CT
>