[R] reading formatted txt file into a data frame

Tony B tony.breyal at googlemail.com
Thu May 6 15:58:54 CEST 2010


Dear all

Let's say I have a plain text file as follows:

> cat(c("[ID: 001 ] [Writer: Steven Moffat ] [Rating: 8.9 ] Doctor Who",
+       "[ID: 002 ] [Writer: Joss Whedon ] [Rating: 8.8 ] Buffy",
+       "[ID: 003 ] [Writer: J. Michael Straczynski ] [Rating: 7.4 ] Babylon [5]"),
+       sep = "\n", file = "tmp.txt")

I would like to read this file into R and convert it into a
data frame like this:

> DF <- data.frame(ID = c("001", "002", "003"),
+                 Writer = c("Steven Moffat", "Joss Whedon",
+                            "J. Michael Straczynski"),
+                 Rating = c("8.9", "8.8", "7.4"),
+                 Text = c("Doctor Who", "Buffy", "Babylon [5]"),
+                 stringsAsFactors = FALSE)


My initial thought was to use readLines on the text file and then
apply some regular expressions and strsplit(..); but having confused
myself after several attempts, I was wondering whether there is a
simpler way, perhaps using read.table instead?  My end goal is to
convert DF into an XML structure.
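
For reference, a minimal sketch of the readLines plus regular expression
route, assuming every line follows exactly the
"[ID: ... ] [Writer: ... ] [Rating: ... ] text" layout shown above. The
names pattern and field are made up for illustration, and sub() is used
with perl = TRUE so that the lazy quantifiers work:

```r
# Recreate the sample file from the post so the sketch is self-contained.
cat(c("[ID: 001 ] [Writer: Steven Moffat ] [Rating: 8.9 ] Doctor Who",
      "[ID: 002 ] [Writer: Joss Whedon ] [Rating: 8.8 ] Buffy",
      "[ID: 003 ] [Writer: J. Michael Straczynski ] [Rating: 7.4 ] Babylon [5]"),
    sep = "\n", file = "tmp.txt")

lines <- readLines("tmp.txt")

# One capture group per bracketed field; the last group takes the rest of
# the line, so square brackets inside the title ("Babylon [5]") survive.
pattern <- paste("^\\[ID:\\s*(.*?)\\s*\\]\\s*",
                 "\\[Writer:\\s*(.*?)\\s*\\]\\s*",
                 "\\[Rating:\\s*(.*?)\\s*\\]\\s*",
                 "(.*)$", sep = "")

# sub() returns non-matching lines unchanged, so check the layout first.
stopifnot(all(grepl(pattern, lines, perl = TRUE)))

# Hypothetical helper: pull out the i-th capture group from every line.
field <- function(i) sub(pattern, paste("\\", i, sep = ""), lines, perl = TRUE)

DF <- data.frame(ID     = field(1),
                 Writer = field(2),
                 Rating = field(3),
                 Text   = field(4),
                 stringsAsFactors = FALSE)
```

All four columns come back as character vectors, matching the target
data frame; Rating would need as.numeric() if numbers are wanted.
read.table is probably a poor fit here, since the fields are delimited
by labelled brackets rather than by a single separator.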

Thank you kindly in advance for your time,
Tony Breyal

# Windows Vista
> sessionInfo()
R version 2.11.0 (2010-04-22)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United Kingdom.1252
    LC_CTYPE=English_United Kingdom.1252
    LC_MONETARY=English_United Kingdom.1252
    LC_NUMERIC=C
    LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] XML_2.8-1

loaded via a namespace (and not attached):
[1] tools_2.11.0


