[R-sig-hpc] Handling data with thousands of variables

Håvard Wahl Kongsgård haavard.kongsgaard at gmail.com
Sun Jun 26 09:07:33 CEST 2011


In machine learning settings it is not uncommon for data to have
thousands of variables. The same is true of genetic studies.

What is the best approach for handling such data in R? Does anyone
have personal experience with it?

In my case the raw data is a response variable and an unstructured
tuple of string keywords:

1341,{"Harry","Larry","Kline"}
54232,{"Mary","Kline","Larry"}
54232,{"David","Line","Lars"}
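
To make the format concrete, here is a minimal sketch of how rows like
these could be read into one indicator column per keyword. It is only an
illustration under assumptions, not a recommended solution: it assumes
the Matrix package is available, and the names raw, response, keywords,
vocab and x are made up for this example.

```r
library(Matrix)  # assumption: the Matrix package is installed

# The three example rows from above, one string per record.
raw <- c('1341,{"Harry","Larry","Kline"}',
         '54232,{"Mary","Kline","Larry"}',
         '54232,{"David","Line","Lars"}')

# Response variable: everything before the first comma.
response <- as.numeric(sub(",.*$", "", raw))

# Keywords: the quoted strings inside the braces, with the quotes removed.
keywords <- regmatches(raw, gregexpr('"[^"]*"', raw))
keywords <- lapply(keywords, function(k) gsub('"', "", k))

# One column per distinct keyword, one row per record.
vocab <- sort(unique(unlist(keywords)))
i <- rep(seq_along(keywords), lengths(keywords))  # row index of each keyword
j <- match(unlist(keywords), vocab)               # column index of each keyword

# Sparse 0/1 matrix: only the keywords that occur are stored.
x <- sparseMatrix(i = i, j = j, x = rep(1, length(i)),
                  dims = c(length(raw), length(vocab)),
                  dimnames = list(NULL, vocab))
```

With thousands of distinct keywords most entries of such a matrix are
zero, which is why the sketch uses a sparse representation instead of a
dense data frame.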


- Håvard


