[R] parsing text files
jim holtman
jholtman at gmail.com
Fri Mar 9 14:33:47 CET 2012
Here is one way of doing it; it reads the file line by line and creates a 'long' version (one row per ID, date and measured variable).
##########
input <- file("/temp/ClinicalReports.txt", 'r')
outFile <- '/temp/output.txt'  # tempfile()
output <- file(outFile, 'w')
writeLines("ID, Date, variable, value", output)
ID <- NULL      # set once an "Acc.ne" header line has been seen
dataSw <- NULL  # set while we are inside a block of data lines
repeat{
    line <- readLines(input, n = 1)
    if (length(line) == 0) break  # end of file
    if (!is.null(dataSw)){
        if (line == ''){  # end of data
            ID <- NULL
            dataSw <- NULL
            next
        }
        # now write CSV output file
        cat(ID
            , ','
            , Date
            , ','
            , substring(line, 1, 31)   # variable name (fixed width)
            , ','
            , substring(line, 32, 43)  # measured value (fixed width)
            , '\n'
            , sep = ''
            , file = output
            )
        next
    }
    if (grepl("Acc.ne", line)){  # header line carrying ID and date
        ID <- (substring(line, 29, 35))
        Date <- (substring(line, 52, 61))
        next
    }
    if (!is.null(ID)){  # looking for Esame
        if (grepl("Esame", line)){
            # skip two lines
            readLines(input, n = 2)
            dataSw <- 1
            next
        }
    }
}
# now read in the data in a long format
close(input)
close(output)
result <- read.csv(outFile, as.is = TRUE)
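One caveat about the read.csv() call: it guesses the column types, so an ID such as 0000185 comes back as the integer 185. If the leading zeros matter, the ID column can be read as character instead. A minimal sketch, using a made-up two-line file (the 'csv' temporary file and its single data row are only stand-ins for the real output file):

```r
# Hypothetical two-line CSV standing in for the generated output file
csv <- tempfile(fileext = ".csv")
writeLines(c("ID, Date, variable, value",
             "0000185,05/12/2011,AZOTEMIA    ,33.6"), csv)

# Default type guessing: ID is converted to integer, leading zeros are lost
as_int <- read.csv(csv, as.is = TRUE)

# Forcing ID to character keeps it exactly as written in the file
as_chr <- read.csv(csv, as.is = TRUE,
                   colClasses = c(ID = "character"))

print(as_int$ID)
print(as_chr$ID)
```

Whether this matters depends on how the IDs are used later, e.g. if they have to be matched against another table that stores them with the zeros.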
The result from your test data is:
> str(result)
'data.frame': 43 obs. of 4 variables:
$ ID : int 185 185 185 185 185 185 185 185 185 185 ...
$ Date : chr "05/12/2011" "05/12/2011" "05/12/2011" "05/12/2011" ...
$ variable: chr "AZOTEMIA " "CREATININEMIA " "SODIEMIA " "POTASSIEMIA " ...
$ value : num 33.6 0.99 136 4.22 94.2 8.68 1.87 1.79 189 118 ...
> head(result)
   ID       Date      variable  value
1 185 05/12/2011      AZOTEMIA  33.60
2 185 05/12/2011 CREATININEMIA   0.99
3 185 05/12/2011      SODIEMIA 136.00
4 185 05/12/2011   POTASSIEMIA   4.22
5 185 05/12/2011      CLOREMIA  94.20
6 185 05/12/2011      CALCEMIA   8.68
>
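Since the request quoted below asks for one row per report with the parameters as columns, the long result can be turned into that layout afterwards. This is only a sketch, assuming each variable occurs at most once per ID/Date pair; the small 'result' data frame here is a stand-in built from the first rows shown above, and 'wide' is just a name chosen for the illustration:

```r
# Stand-in for 'result' (values taken from the head() output above; the
# trailing blanks mimic the padding of the fixed-width fields)
result <- data.frame(
    ID       = c(185L, 185L, 185L),
    Date     = c("05/12/2011", "05/12/2011", "05/12/2011"),
    variable = c("AZOTEMIA   ", "CREATININEMIA  ", "SODIEMIA  "),
    value    = c(33.6, 0.99, 136),
    stringsAsFactors = FALSE
)

# Strip the padding so the names can be used as column names
result$variable <- sub("[[:space:]]+$", "", result$variable)

# One row per (ID, Date), one column per variable
wide <- reshape(result,
                idvar     = c("ID", "Date"),
                timevar   = "variable",
                direction = "wide")

# reshape() names the new columns "value.AZOTEMIA" etc.; drop the prefix
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```

If a variable can appear more than once within a report, reshape() keeps only the first occurrence (with a warning), so the duplicates would have to be dealt with first.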
On Thu, Mar 8, 2012 at 8:24 AM, ginger <biino at igm.cnr.it> wrote:
> Ooops,
> I forgot to specify that each row should contain the record of one
> clinical report, with the values of the 22 parameter measurements reported.
> For example, first row, first few columns:
>
>      ID       DATE GLICEMIA AZOTEMIA CREATININEMIA SODIEMIA ...
> 0000185 05/12/2011      115     33.6          0.99      136 ...
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.