# [R] Convert COLON separated format

Tue Oct 9 07:28:07 CEST 2012

```
Hello,

Here's a function that doesn't do it all but might help.

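fun <- function(x){
    # split on runs of whitespace (spaces or tabs) and drop empty tokens
    x1 <- unlist(strsplit(x, "[[:space:]]+"))
    x2 <- x1[nchar(x1) > 0]
    # the first token is the label, i.e., the row number
    i <- as.integer(x2[1])
    # the remaining tokens are "column:value" pairs
    x3 <- unlist(strsplit(x2[-1], ":"))
    odd <- rep(c(TRUE, FALSE), length(x3)/2)
    j <- as.integer(x3[odd])
    # dense vector, zero wherever the line has no entry
    y <- numeric(max(j))
    y[j] <- as.numeric(x3[!odd])
    list(row = i, line = y)
}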

x <- "1  5:1  27:3  345:10"
fun(x)

If you know that your labels, i.e., the row numbers, are consecutive, have
the function return just 'y', not a list.
Then use readLines to read the file into a character vector, say 'ln', and
lapply 'fun' to it. Something like

lst <- lapply(ln, fun)
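
# A minimal sketch of that step, not a definitive implementation. It assumes
# the labels are consecutive, so only 'y' is returned. 'parse_line' and the
# sample lines are made up for illustration; a textConnection stands in for
# the real file, which you would pass to readLines() by name.
parse_line <- function(x){
    tok <- unlist(strsplit(x, "[[:space:]]+"))
    tok <- tok[nchar(tok) > 0]
    # drop the label, then split the "column:value" pairs
    kv <- strsplit(tok[-1], ":")
    j <- as.integer(sapply(kv, `[`, 1))
    y <- numeric(max(j))
    y[j] <- as.numeric(sapply(kv, `[`, 2))
    y
}

con <- textConnection(c("1  5:1  27:3  345:10", "2  3:2  8:7"))
ln <- readLines(con)
close(con)

lst <- lapply(ln, parse_line)
sapply(lst, length)    # one length per line: 345 and 8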

Then you'll have another problem: the lengths of the results. They generally
won't all be the same, since each vector is only as long as the largest
column index on its line, so to make a data.frame or matrix you'll need
extra work. Try the code above and say whether it's on the right track.
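
# One possible sketch of that extra work, under the assumption that a missing
# trailing entry means zero. 'lst' here is a small made-up stand-in for the
# list returned by lapply, and 'pad' is a hypothetical helper.
lst <- list(c(0, 0, 2), c(1, 0, 0, 0, 5), c(0, 4))
n <- max(sapply(lst, length))
# append zeros so that every vector has length n
pad <- function(y, n) c(y, numeric(n - length(y)))
mat <- do.call(rbind, lapply(lst, pad, n = n))
dat <- as.data.frame(mat)
dim(mat)    # 3 rows, 5 columns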

Also, take a look at package Matrix. It's a recommended package and it
implements sparse matrices.
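
# A hedged sketch of the sparse route, assuming the Matrix package is
# installed. Instead of dense vectors it collects (row, column, value)
# triplets and hands them to sparseMatrix(). 'to_triplets' and the sample
# lines are hypothetical, and the row index is the line's position in the
# input, not its label.
library(Matrix)

to_triplets <- function(x, i){
    tok <- unlist(strsplit(x, "[[:space:]]+"))
    tok <- tok[nchar(tok) > 0]
    kv <- strsplit(tok[-1], ":")
    data.frame(i = i,
               j = as.integer(sapply(kv, `[`, 1)),
               x = as.numeric(sapply(kv, `[`, 2)))
}

ln <- c("1  5:1  27:3  345:10", "2  3:2  8:7")
trip <- do.call(rbind, lapply(seq_along(ln),
                              function(k) to_triplets(ln[k], k)))
m <- sparseMatrix(i = trip$i, j = trip$j, x = trip$x)
dim(m)    # 2 rows, 345 columns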

Hope this helps,

On 09-10-2012 05:56, Noah Silverman wrote:
> I have a bunch of data sets that were created for the libsvm tool.  They are in "colon separated sparse format".
>
> i.e.
>
> 1  5:1  27:3  345:10
>
> Is a row with the label of "1" and only has values in columns 5, 27, and 345.
>
> I want to read these into a data.frame in R.
>
> Is there a simple way to do this?
>
> --
> Noah Silverman, M.S.
> UCLA Department of Statistics
> 8117 Math Sciences Building
> Los Angeles, CA 90095