[R] Split fixed width data in R

Zilefac Elvis zilefacelvis at yahoo.com
Wed Oct 22 23:41:16 CEST 2014


Thanks Ellison.
Just what I wanted.
Cheers.
AT.


On Wednesday, October 22, 2014 10:03 AM, S Ellison <S.Ellison at LGCGroup.com> wrote:
This seems to handle most of it on your example data; you can pull out the date parts separately using Date functions if you need them.

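# Split each fixed-width record by character position
decode.lst <- function(x) {
    data.frame(Date=as.Date(substr(x,1,8), format="%Y%m%d"),  # chars 1-8: YYYYMMDD
               Site=substr(x, 9,12),                          # chars 9-12: site code
               Precipitation=as.numeric(substring(x,13)))     # char 13 onward: value
}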

decode.lst(lst1Sub)
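
If separate year, month and day columns are needed, something along these lines might work. It is only a sketch: the three sample strings are copied from the question, and zero-filling the blank-padded day with chartr() before parsing is an added assumption, not something the function above does.

```r
# Three sample records copied from the original question
lst1Sub <- c("20131124GGG1 23.00", "201312 1GGG1  0.00", "201312 7GGG1 10.00")

# Assumption: a blank in the date field stands for a zero
# ("201312 1" is treated as "20131201") before parsing.
dates <- as.Date(chartr(" ", "0", substr(lst1Sub, 1, 8)), format = "%Y%m%d")

# Pull the date parts back out with format() on the Date vector
parts <- data.frame(
    Year          = as.integer(format(dates, "%Y")),
    Month         = as.integer(format(dates, "%m")),
    Day           = as.integer(format(dates, "%d")),
    Site          = substr(lst1Sub, 9, 12),
    Precipitation = as.numeric(substring(lst1Sub, 13)),
    stringsAsFactors = FALSE
)
print(parts)
```

Keeping the Date column and deriving the parts from it means invalid dates show up as NA rather than as plausible-looking numbers.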


S Ellison






> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Zilefac Elvis
> Sent: 22 October 2014 16:38
> To: R. Help
> Subject: [R] Split fixed width data in R
> 
> Hi,
> I have fixed width data that I would like to split into columns. Here is a snapshot
> of the data (actual data is a list object):
> lst1Sub <- c(
>   "20131124GGG1 23.00",
>   "20131125GGG1 15.00",
>   "20131128GGG1  0.00",
>   "201312 1GGG1  0.00",
>   "201312 4GGG1  0.00",
>   "201312 7GGG1 10.00",
>   "20131210GGG1  0.00",
>   "20131213GGG1  0.00",
>   "20131216GGG1  0.00",
>   "20131219GGG1  0.00",
>   "20131222GGG1  0.00",
>   "20131225GGG1  0.00",
>   "20131228GGG1  0.00")
> 
> The following script will split the data into [Year Month Day Site Precipitation]
> ------------------------------------------------------------------------------------------------------
> library(stringr)
> dateSite <- gsub("(.*G.{3}).*","\\1",lst1Sub);
> dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)),
>                    Month=as.numeric(substr(dateSite,5,6)),
>                    Day=as.numeric(substr(dateSite,7,8)),
>                    Site=substr(dateSite,9,12),
>                    Rain=substr(dateSite,13,18),
>                    stringsAsFactors=FALSE);
> lst3 <- lapply(lst1Sub,function(x) {
>     dateSite <- gsub("(.*G.{3}).*","\\1",x);
>     dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)),
>                        Month=as.numeric(substr(dateSite,5,6)),
>                        Day=as.numeric(substr(dateSite,7,8)),
>                        Site=substr(dateSite,9,12),
>                        stringsAsFactors=FALSE);
>     Sims <- str_trim(gsub(".*G.{3}\\s?(.*)","\\1",x));
>     Sims[grep("\\d+-",Sims)] <- gsub("(.*)([- ][0-9]+\\.[0-9]+)","\\1 \\2",
>                                      gsub("^([0-9]+\\.[0-9]+)(.*)","\\1 \\2",
>                                           Sims[grep("\\d+-",Sims)]));
>     Sims1 <- read.table(text=Sims,header=FALSE);
>     names(Sims1) <- c("Precipitation");
>     dat2 <- cbind(dat1,Sims1)})
> ------------------------------------------------------------------------------------------------------
> 
> Problem: the above script drops the first digit of my precipitation values. For
> example, after splitting, "20131124GGG1 23.00" becomes
> 2013 11 24 GGG1 3.00 instead of 2013 11 24 GGG1 23.00 (the right answer).
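
A possible explanation for the missing digit, offered as a sketch rather than a confirmed diagnosis: the greedy `.*G` appears to run up to the last "G" of the site code, so `.{3}` then swallows the "1", the blank and the first digit of a two-digit value. The position-based sub() below is a hypothetical repair, not code from the thread, and it assumes the date and site always occupy the first 12 characters.

```r
x <- c("20131124GGG1 23.00", "20131128GGG1  0.00")

# Pattern from the question: ".*G" is greedy and stops at the LAST "G",
# so ".{3}" consumes "1", the blank, and the next character.
original <- gsub(".*G.{3}\\s?(.*)", "\\1", x)

# Hypothetical repair: skip the fixed 12-character prefix by position,
# then any padding blanks, and keep the rest.
repaired <- sub("^.{12}\\s*", "", x)

print(original)
print(repaired)
```

With a one-digit value the third character taken by `.{3}` is a padding blank, which would explain why only the two-digit values lose a digit.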
> 
> Anything wrong with the string trimming? Is there another way to arrive at the
> same answer?
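
Another route, assuming the field widths are always 4, 2, 2, 4 and 6 characters as in the sample, is to let base R's read.fwf() split on widths directly. The column names and the textConnection() wrapper below are illustrative choices, not something taken from the thread.

```r
# Three sample records copied from the question
lst1Sub <- c("20131124GGG1 23.00", "201312 1GGG1  0.00", "201312 7GGG1 10.00")

# Widths: year (4), month (2), day (2), site (4), precipitation (6)
fw <- read.fwf(textConnection(lst1Sub),
               widths = c(4, 2, 2, 4, 6),
               col.names = c("Year", "Month", "Day", "Site", "Precipitation"),
               stringsAsFactors = FALSE)
print(fw)
str(fw)
```

If the real data is a list of character vectors, the same call could be applied to each element with lapply().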
> 
> Thanks,
> AT.
> 

