[R] algorithm to create unique identifiers
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Sep 5 09:12:07 CEST 2008
For a much simpler solution that does always work for numbers, see
unique's methods for matrices and data frames.
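
As a minimal sketch of that suggestion: unique() returns the distinct rows of a data frame, and each original row can then be given the position of its matching distinct row as an ID. The data, the helper key() and the pasted-key matching step are assumptions made for illustration, not part of the original advice. Note that paste() converts numbers to text at 15 significant digits.

```r
# Hypothetical example data: three numeric columns, rows 1 and 3 identical.
df <- data.frame(a = c(1, 2, 1, 3),
                 b = c(10, 20, 10, 10),
                 c = c(0.5, 0.5, 0.5, 1.5))

# unique() has a data.frame method that returns the distinct rows.
u <- unique(df)

# Hypothetical helper: one text key per row, built by pasting the columns.
key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))

# Each row's ID is the index of its matching row in the distinct set.
df$id <- match(key(df), key(u))
df$id   # 1 2 1 3
```

Identical rows share an ID, so the IDs label distinct combinations of the three columns rather than row positions.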
On Thu, 4 Sep 2008, Henrik Bengtsson wrote:
> On Thu, Sep 4, 2008 at 8:44 PM, Ralph S. <ruffel1 at hotmail.com> wrote:
>>
>> Hi all,
>>
>> I am trying to create a unique identifier for each row, combining
>> numbers from three columns.
>>
>> Do you know if there is a general formula to do this (or some manual
>> where I can read about this)?
>>
>> I figure I can use the numeric entries of the columns as "coordinates"
>> and multiply them with different coefficients (different magnitudes) to
>> get the unique ID - but it would be nice to read about such algorithms
>> in general.
>
> What are your numbers? Are they in a fixed range? Integers or reals?
> If they are fixed-range integers, it is easy: think of a regular
> positional representation, e.g. binary, octal, decimal or hexadecimal.
>
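
The positional idea above can be sketched as a mixed-radix encoding: each column acts as one digit, weighted by the product of the ranges of the columns before it. The ranges and data below are hypothetical, and every column is assumed to hold integers starting at 0.

```r
# Hypothetical fixed ranges: a in 0..9, b in 0..99, c in 0..4.
ranges <- c(a = 10, b = 100, c = 5)   # number of possible values per column

df <- data.frame(a = c(3L, 3L, 9L), b = c(42L, 43L, 99L), c = c(0L, 0L, 4L))

# Place value of each column: 1, 10, 10 * 100.
weights <- cumprod(c(1, head(ranges, -1)))

# Each row's ID is the weighted sum of its "digits".
id <- as.vector(as.matrix(df) %*% weights)
id   # 423 433 4999
```

Distinct rows always get distinct IDs as long as every value stays inside its stated range and the product of the ranges stays below 2^53, the limit for exact integers in double precision.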
> For a more generic solution that works with any data type, see e.g.
> MD5 [http://en.wikipedia.org/wiki/MD5]. It is not guaranteed to
> generate unique codes, but it is extremely rare for two different
> inputs to give the same MD5 code. MD5 (and others) are implemented in
> the 'digest' package, e.g.
>
>> library(digest)
>> digest(list(a=1, b=list(1:10, c=letters)))
> [1] "73e0ae066a97bfff7f79d41c65b55fde"
>
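
Applied to the original question, one possible sketch hashes each row separately, so that every row gets its own MD5 string. The data and the helper row_id() are hypothetical; only digest() and its algo argument come from the 'digest' package.

```r
library(digest)

# Hypothetical example data; rows 1 and 3 are identical.
df <- data.frame(a = c(1, 2, 1), b = c("x", "y", "x"), c = c(0.5, 0.5, 0.5),
                 stringsAsFactors = FALSE)

# Hash each row's values as an unnamed list, so that row and column
# names do not influence the result.
row_id <- function(i) digest(unname(as.list(df[i, ])), algo = "md5")
ids <- vapply(seq_len(nrow(df)), row_id, character(1))

ids[1] == ids[3]     # TRUE: identical rows hash identically
anyDuplicated(ids)   # nonzero here, because row 3 repeats row 1
```

This works for mixed column types, at the cost of IDs that are 32-character strings and that cannot be decoded back into the original values.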
> My $.02
>
> /Henrik
>
>
>>
>> Any links/input would be great -
>>
>> Ralph
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595