[R-sig-hpc] Handling data with thousands of variables

Håvard Wahl Kongsgård haavard.kongsgaard at gmail.com
Sun Jun 26 16:42:00 CEST 2011


> - are the response variables numeric? (integer or floating point?)
integer

> - does the order of the tuples matter ?
no

> - do you know all the possible keywords ?
>  (so that they could be encoded with numerical representations)
nope

The database relates to file-sharing activity; the keywords are plot
keywords, like those at http://www.imdb.com/title/tt1133985/keywords

-Håvard


