[R-sig-hpc] Handling data with thousands of variables

Sun Jun 26 09:07:33 CEST 2011

In machine learning settings it is not uncommon for the data to have
thousands of variables. The same is true of genetic studies.

What is the best approach for handling such data in R? Does anyone
have personal experience with it?

In my case the raw data consists of a response variable and an
unstructured tuple of string keywords, for example:

1341,{"Harry","Larry","Kline"}
54232,{"Mary","Kline","Larry"}
54232,{"David","Line","Lars"}
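
One representation that might suit data like this is a sparse
indicator matrix with one column per distinct keyword. The following
is only a minimal sketch, assuming the Matrix package is installed and
that every line looks like the three sample rows above; the names raw,
y, kw, vocab and X are made up for illustration.

```r
library(Matrix)

# The three sample rows from above, as raw text lines (assumed format)
raw <- c('1341,{"Harry","Larry","Kline"}',
         '54232,{"Mary","Kline","Larry"}',
         '54232,{"David","Line","Lars"}')

# Response: everything before the first comma
y <- as.numeric(sub(",.*$", "", raw))

# Keywords: the quoted strings on each line, with the quotes stripped
kw <- regmatches(raw, gregexpr('"[^"]*"', raw))
kw <- lapply(kw, function(k) gsub('"', "", k, fixed = TRUE))

# One column per distinct keyword
vocab <- sort(unique(unlist(kw)))

# Sparse 0/1 matrix: rows are observations, columns are keywords
X <- sparseMatrix(
  i = rep(seq_along(kw), lengths(kw)),
  j = match(unlist(kw), vocab),
  x = 1,
  dims = c(length(kw), length(vocab)),
  dimnames = list(NULL, vocab)
)
```

Only the non-zero entries are stored, so memory use grows with the
number of keywords actually present rather than with rows times
columns. Whether a given modelling function accepts such a matrix
directly depends on the package being used.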

- Håvard