[R] how to read this kind of csv in R?

Rui Barradas ruipbarradas using sapo.pt
Mon Oct 7 18:55:33 CEST 2019


Hello,

OK, I had some spare time. Try



readCSVFile <- function(filename){
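   lns <- readLines(filename)
   # drop empty lines, remove all spaces and the trailing separator
   lns <- lns[nzchar(lns)]
   lns <- gsub(" ", "", lns)
   lns <- sub(";$", "", lns)
   # title lines are the only ones containing letters
   i_title <- grep("[[:alpha:]]", lns)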

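   # data lines of every block after the first; a block runs from the line
   # after its title to the line before the next title (or the end of file)
   blocks <- lapply(seq_along(i_title)[-1], function(i){
     j <- i_title[i] + 1
     k <- if(i == length(i_title)) length(lns) else i_title[i + 1] - 1
     lns[j:k]
   })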

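   # each value of the first block is repeated once per column of the others
   n <- length(unlist(strsplit(blocks[[1]][1], ";")))
   first <- unlist(strsplit(lns[i_title[1] + 1], ";"))
   first <- as.numeric(first)
   first <- rep(first, each = n)

   # flatten each remaining block row by row into one numeric column
   blocks <- lapply(blocks, function(x){
     as.numeric(unlist(strsplit(x, ";")))
   })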
   res <- do.call(cbind.data.frame, blocks)
   res <- cbind.data.frame(first, res)

   names(res) <- sub("\\[.*\\]$", "", lns[i_title])
   res
}

df1 <- readCSVFile("strange.csv")
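
To check the idea without the real file, here is a minimal self-contained sketch. The helper name read_blocks, its sep argument and the temporary test file are assumptions made for illustration, not part of the function above. It assumes that sep is a plain character such as ";" or ",", that every title line contains a letter, that no data line does, and that every block has at least one data line.

```r
# Hypothetical helper (illustration only): parse one block-structured file.
read_blocks <- function(filename, sep = ";") {
  lns <- readLines(filename)
  lns <- gsub(" ", "", lns[nzchar(lns)])      # drop empty lines and spaces
  lns <- sub(paste0(sep, "$"), "", lns)       # drop the trailing separator
  is_title <- grepl("[[:alpha:]]", lns)       # title lines contain letters
  block_id <- cumsum(is_title)                # block number of every line
  # group the data lines by block and flatten each block row by row
  cols <- lapply(split(lns[!is_title], block_id[!is_title]), function(x) {
    as.numeric(unlist(strsplit(x, sep, fixed = TRUE)))
  })
  # repeat each value of the first block to match the length of the others
  n <- max(lengths(cols)) / length(cols[[1]])
  cols[[1]] <- rep(cols[[1]], each = n)
  res <- as.data.frame(cols)
  # strip a unit such as "(cm)" or "[cm]" from the titles
  names(res) <- sub("(\\[|\\().*$", "", lns[is_title])
  res
}

# Small test file modelled on the example in the original question,
# with ";" as the separator.
path <- tempfile(fileext = ".csv")
writeLines(c("aa(cm)", "1; 2 ; 3;", "",
             "bb(mm)", "1; 2; 3;", "4; 5; 6;", "7; 8; 9;", "",
             "cc(mm)", "3; 4; 5;", "7; 5; 9;", "6; 5; 8;"), path)

df1 <- read_blocks(path)
print(df1)

# The same call works on many files; here the list holds only the test file.
files <- path
all_dfs <- lapply(files, read_blocks)
```

With hundreds of files, files could instead come from something like list.files(pattern = "\\.csv$", full.names = TRUE).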


If this function doesn't do it, please make an effort on your own.
R-Help is not a code-writing service; it's a mailing list for *questions*
about R code.

Hope this helps,

Rui Barradas

At 09:18 on 07/10/19, vodvos using zoho.com wrote:
> I am going mad trying to import this strange CSV format.
> 
> The real CSV file is now attached. The raw data set is huge.
> 
> Many thanks.
> 
> 
> 
> 
>   ---- On Sunday, 06 October 2019 07:58:37 -0700 Rui Barradas <ruipbarradas using sapo.pt> wrote ----
>   > Hello,
>   >
>   > It is not clear if all files have
>   >
>   > * a first block with just one data line
>   > * all other blocks with as many rows as the numbers in that first data line.
>   >
>   > If yes, maybe something like this?
>   >
>   > lns <- readLines("strange.csv")
>   > lns <- lns[sapply(lns, nchar) > 0]
>   > lns <- sub(",$", "", lns)
>   > i_title <- grep("[[:alpha:]]", lns)
>   >
>   > tmp <- lapply(seq_along(i_title), function(i){
>   >    tmp <- if(i < length(i_title)){
>   >      lns[(i_title[i] + 1):(i_title[i + 1] - 1)]
>   >    }else{
>   >      lns[(i_title[i] + 1):length(lns)]
>   >    }
>   >    list(n = length(tmp), text = unlist(strsplit(tmp, ",")))
>   > })
>   >
>   > n <- max(sapply(tmp, '[[', 'n'))
>   > tmp <- lapply(tmp, function(x) as.numeric(x$text))
>   > tmp[[1]] <- rep(tmp[[1]], each = n)
>   > res <- do.call(cbind.data.frame, tmp)
>   > names(res) <- lns[i_title]
>   > res
>   >
>   >
>   > If you have hundreds of files, you should make a function out of the
>   > code above.
>   >
>   > Hope this helps,
>   >
>   > Rui Barradas
>   >
>   > At 12:29 on 06/10/19, vod vos via R-help wrote:
>   > > I have hundreds of CSV files. The actual format of each file is as follows:
>   > >
>   > > aa(cm)
>   > > 1, 2 , 3,
>   > >
>   > > bb(mm)
>   > > 1, 2, 3,
>   > > 4, 5, 6,
>   > > 7, 8, 9,
>   > >
>   > > cc(mm)
>   > > 3, 4, 5,
>   > > 7, 5, 9,
>   > > 6, 5, 8,
>   > >
>   > > How can I use read.table or read.csv to convert the CSV files
>   > > to a tidy data frame format as follows:
>   > >
>   > > aa, bb, cc
>   > > 1, 1, 3
>   > > 1, 2, 4
>   > > 1, 3, 5
>   > > 2, 4, 7
>   > > 2, 5, 5
>   > > 2, 6, 9
>   > > 3, 7, 6
>   > > 3, 8, 5
>   > > 3, 9, 8
>   > >
>   > > many thanks.
>   > >
>   > >
>   >
>


