[R] Automatic Recoding

William Dunlap wdunlap at tibco.com
Thu Sep 1 18:51:03 CEST 2011


You could also use match() directly instead
of going through factors.  Any of the following
would map your inputs to small integers:

  > match(x, x)-1
   [1] 0 1 0 3 0 5 0 7 8 9
  > match(x, unique(x))-1
   [1] 0 1 0 2 0 3 0 4 5 6
  > match(x, sort(unique(x)))-1
   [1] 3 4 3 6 3 2 3 0 5 1
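
If you want the result back in edge-list form, one way is to reshape the codes into a two-column matrix. This is only a sketch: it assumes the values were entered row by row, as in the original question, and the names `codes` and `edges` are made up for illustration.

```r
# Edge list values, entered row by row as in the original question
x <- c(676529098667,  1000198767829,
       676529098667,  100867672856227,
       676529098667,  91098726278,
       676529098667,  98928373,
       1092837363526, 716172829)

# Recode by order of first appearance, starting at 0
codes <- match(x, unique(x)) - 1

# Put the codes back into a two-column edge list, one edge per row
edges <- matrix(codes, ncol = 2, byrow = TRUE)
edges
```

With byrow=TRUE each row of `edges` is one edge, so the rows line up with the lines of the input file.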

Your numbers are pretty big, c. 2^46.  If they get
bigger than 2^53 you won't always be able to distinguish
between adjacent integers, because a double has only
53 bits of precision:
  > (2^53 + 5) == (2^53 + 4)
  [1] TRUE
so you may want to input them as character strings.
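
A rough sketch of the character-string route follows. The ids are made-up values just above 2^53, and `txt`, `el`, `v` and `codes` are hypothetical names. It assumes a whitespace-separated, two-column file, simulated here with the text= argument of read.table().

```r
# Made-up edge list with ids just above 2^53 (stands in for the real file)
txt <- "9007199254740996 9007199254740997
9007199254740996 9007199254740998"

# Read both columns as character so no precision is lost
el <- read.table(text = txt, colClasses = "character")

# Flatten row by row, then recode by order of first appearance
v <- c(t(as.matrix(el)))
codes <- matrix(match(v, unique(v)) - 1, ncol = 2, byrow = TRUE)
codes
```

match() compares the strings exactly, so ids that would collapse to the same double stay distinct.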


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Thursday, September 01, 2011 9:24 AM
> To: Thomas Chesney
> Cc: r-help at r-project.org
> Subject: Re: [R] Automatic Recoding
> 
> 
> On Sep 1, 2011, at 10:54 AM, Thomas Chesney wrote:
> 
> > I have a text file full of numbers (it's an edge list for a graph) and
> > I would like to recode the numbers as they are way too big to work
> > with. So for instance the following:
> >
> > 676529098667    1000198767829
> > 676529098667    100867672856227
> > 676529098667    91098726278
> > 676529098667    98928373
> > 1092837363526   716172829
> >
> > would become:
> >
> > 0   1
> > 0   2
> > 0   3
> > 0   4
> > 5   6
> >
> > i.e. all 676529098667 would become 0, all 1000198767829 would become
> > 1 etc.
> 
> Depending on how that set of numbers was entered, see if this is helpful:
> 
> 1) Entering across first, then down.
> 
>   x <- c(676529098667 ,   1000198767829,
>   676529098667 ,   100867672856227,
>   676529098667  ,  91098726278,
>   676529098667   , 98928373,
>   1092837363526   ,716172829)
>   as.numeric(factor(x, levels=unique(x))  )
> # [1] 1 2 1 3 1 4 1 5 6 7
> 
> 2) Now entering down first, then across.
> 
>   x2 <- matrix(x, ncol=2, byrow=TRUE) # Matrices are stored column first.
> 
>   as.numeric(factor(x2, levels=unique(c(x2))) ) # need c() to avoid warning.
> # [1] 1 1 1 1 2 3 4 5 6 7
> 
> > If I read all the values into a matrix, is there a pre-existing
> > function that can do the recoding?
> 
> You can just subtract one from the factor results. The trick is to use
> explicit levels chosen to match the order you want. Otherwise
> the levels would be sorted first.
> 
> --
> David Winsemius, MD
> West Hartford, CT
> 


