[R] Split fixed width data in R
Clint Bowman
clint at ecy.wa.gov
Wed Oct 22 17:54:04 CEST 2014
?read.fortran
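
A minimal sketch of how read.fortran could be applied here, assuming each record is laid out as a 4-digit year, 2-digit month, 2-digit day, 4-character site code, and a 6-character right-justified precipitation field. The sample records, their padding, and the column names are assumptions for illustration, not taken from the actual data. F6.0 is used because the values already carry an explicit decimal point, so no implied-decimal scaling is wanted.

```r
# Sample records (assumed layout: I4 year, I2 month, I2 day,
# A4 site, 6-character precipitation field, right-justified).
recs <- c("20131124GGG1 23.00",
          "201312 1GGG1  0.00",
          "201312 7GGG1 10.00")

# read.fortran() accepts a connection, so the character vector
# can be read without writing a temporary file by hand.
con <- textConnection(recs)
dat <- read.fortran(con,
                    format = c("I4", "I2", "I2", "A4", "F6.0"),
                    col.names = c("Year", "Month", "Day",
                                  "Site", "Precipitation"))
close(con)

print(dat)
```

If the real data is a list of character vectors, the same call could be wrapped in lapply(), one connection per element.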
Clint Bowman INTERNET: clint at ecy.wa.gov
Air Quality Modeler INTERNET: clint at math.utah.edu
Department of Ecology VOICE: (360) 407-6815
PO Box 47600 FAX: (360) 407-7534
Olympia, WA 98504-7600
USPS: PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274
On Wed, 22 Oct 2014, Zilefac Elvis wrote:
> Hi,
> I have fixed width data that I would like to split into columns. Here is a snapshot of the data (the actual data is a list object):
> lst1Sub<-
> "20131124GGG1 23.00"
> "20131125GGG1 15.00"
> "20131128GGG1 0.00"
> "201312 1GGG1 0.00"
> "201312 4GGG1 0.00"
> "201312 7GGG1 10.00"
> "20131210GGG1 0.00"
> "20131213GGG1 0.00"
> "20131216GGG1 0.00"
> "20131219GGG1 0.00"
> "20131222GGG1 0.00"
> "20131225GGG1 0.00"
> "20131228GGG1 0.00"
>
> The following script will split the data into [Year Month Day Site Precipitation]
> ------------------------------------------------------------------------------------------------------
> library(stringr)
> dateSite <- gsub("(.*G.{3}).*","\\1",lst1Sub);
> dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)), Month=as.numeric(substr(dateSite,5,6)),
> Day=as.numeric(substr(dateSite,7,8)),Site=substr(dateSite,9,12),Rain=substr(dateSite,13,18),stringsAsFactors=FALSE);
> lst3 <- lapply(lst1Sub,function(x) {dateSite <- gsub("(.*G.{3}).*","\\1",x);
> dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)), Month=as.numeric(substr(dateSite,5,6)),Day=as.numeric(substr(dateSite,7,8)),Site=substr(dateSite,9,12),stringsAsFactors=FALSE);
> Sims <- str_trim(gsub(".*G.{3}\\s?(.*)","\\1",x));Sims[grep("\\d+-",Sims)] <- gsub("(.*)([-][0-9]+\\.[0-9]+)","\\1 \\2",gsub("^([0-9]+\\.[0-9]+)(.*)","\\1 \\2", Sims[grep("\\d+-",Sims)]));
> Sims1 <- read.table(text=Sims,header=FALSE); names(Sims1) <- c("Precipitation");dat2 <- cbind(dat1,Sims1)})
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Problem: the above script drops the first digit of my precipitation values. For example, after splitting, "20131124GGG1 23.00" becomes
> 2013 11 24 GGG1 3.00 instead of 2013 11 24 GGG1 23.00 (the right answer).
>
> Anything wrong with the string trimming? Is there another way to arrive at the same answer?
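
A short sketch of why the leading digit goes missing, assuming a record has a single space before a two-digit value as in the first sample line. The leading ".*" in the pattern is greedy, so it runs up to the last "G" in the site code, and ".{3}" then consumes the "1", the space, and the first digit of the value. Extracting by position sidesteps the problem. The variable names below are illustrative only.

```r
x <- "20131124GGG1 23.00"

# Original pattern: the greedy ".*" stops just before the last "G",
# so ".{3}" swallows "1", the space, and the leading "2".
greedy <- gsub(".*G.{3}\\s?(.*)", "\\1", x)

# Positional alternative: drop the 12-character date/site prefix,
# then trim any padding around the value.
fixed <- trimws(substr(x, 13, nchar(x)))

print(c(greedy = greedy, fixed = fixed))
```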
>
> Thanks,
> AT.
>