[R] reading formatted txt file into a data frame
Tony B
tony.breyal at googlemail.com
Thu May 6 15:58:54 CEST 2010
Dear all
Let's say I have a plain text file as follows:
> cat(c("[ID: 001 ] [Writer: Steven Moffat ] [Rating: 8.9 ] Doctor Who",
+ "[ID: 002 ] [Writer: Joss Whedon ] [Rating: 8.8 ] Buffy",
+ "[ID: 003 ] [Writer: J. Michael Straczynski ] [Rating: 7.4 ] Babylon [5]"),
+ sep = "\n", file = "tmp.txt")
I would like to read this file into R and convert it into a
data frame like this:
> DF <- data.frame(ID = c("001", "002", "003"),
+ Writer = c("Steven Moffat", "Joss Whedon", "J. Michael Straczynski"),
+ Rating = c("8.9", "8.8", "7.4"),
+ Text = c("Doctor Who", "Buffy", "Babylon [5]"),
+ stringsAsFactors = FALSE)
My initial thought was to use readLines on the text file, apply some
regular expressions, and perhaps use strsplit(..). Having confused
myself after several attempts, I was wondering whether there is a
simpler way, perhaps using read.table instead? My end goal is to
convert DF into an XML structure.
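For reference, here is a minimal sketch of the readLines-plus-regular-expression route mentioned above. It is only one possible approach, and it assumes that every line carries exactly the three bracketed fields ID, Writer and Rating, in that order, followed by free text. The pattern name pat is made up for illustration. It sticks to sub() with backreferences, which should also work on older versions of R.

```r
# Recreate the sample file so the sketch is self-contained.
cat(c("[ID: 001 ] [Writer: Steven Moffat ] [Rating: 8.9 ] Doctor Who",
      "[ID: 002 ] [Writer: Joss Whedon ] [Rating: 8.8 ] Buffy",
      "[ID: 003 ] [Writer: J. Michael Straczynski ] [Rating: 7.4 ] Babylon [5]"),
    sep = "\n", file = "tmp.txt")

lines <- readLines("tmp.txt")

# One pattern for the whole line: three "[Key: value ]" fields, then the
# remaining text. The lazy groups (.*?) stop at the first closing bracket,
# so brackets in the trailing text (e.g. "Babylon [5]") are left alone.
# Assumption: the keys always appear in this order on every line.
pat <- "^\\[ID: *(.*?) *\\] *\\[Writer: *(.*?) *\\] *\\[Rating: *(.*?) *\\] *(.*)$"

# Pull out each capture group with a backreference.
DF <- data.frame(ID     = sub(pat, "\\1", lines, perl = TRUE),
                 Writer = sub(pat, "\\2", lines, perl = TRUE),
                 Rating = sub(pat, "\\3", lines, perl = TRUE),
                 Text   = sub(pat, "\\4", lines, perl = TRUE),
                 stringsAsFactors = FALSE)

print(DF)
```

A line that does not match the pattern is returned unchanged by sub(), so it would show up as the full original line in every column. Checking grepl(pat, lines, perl = TRUE) first is a cheap way to catch that.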
Thank you kindly in advance for your time,
Tony Breyal
# Windows Vista
> sessionInfo()
R version 2.11.0 (2010-04-22)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] XML_2.8-1
loaded via a namespace (and not attached):
[1] tools_2.11.0