[R] stacking imported data
Gabor Grothendieck
ggrothendieck at myway.com
Tue Nov 2 02:41:34 CET 2004
Gabor Grothendieck <ggrothendieck <at> myway.com> writes:
:
: Sundar Dorai-Raj <sundar.dorai-raj <at> pdf.com> writes:
:
: :
: : Hi all,
: : I have a question that I don't have a good answer for (note the word
: : "good"; I have an answer, but I consider it not "good"). Take the
: : following data in a single tab-delimited text file:
: :
: : <text>
: :
: : A
: : Labels Value SE 2.5% 97.5%
: : R90 0.231787 1.148044 0.035074 1.531779
: : R0 0.500861 0.604406 0.185336 1.353552
: :
: : B
: : Labels Value SE 2.5% 97.5%
: : (Intercept) 1.367514 0.036431 1.287975 1.451964
: : </text>
: :
: : (Note: the <text> tags are not present and are added here only to show
: : blank lines.)
: :
: : I would like to read the data into a single data.frame which looks like
: :
: : Labels Value SE 2.5% 97.5%
: : A.R90 0.231787 1.148044 0.035074 1.531779
: : A.R0 0.500861 0.604406 0.185336 1.353552
: : B.(Intercept) 1.367514 0.036431 1.287975
1.451964
: :
: : A few rules:
: :
: : 1. the number of rows in "A" and "B" will vary from 1 to ???. Here "A"
: : has 1 row (excluding header) and B has 2 rows (excluding header).
: : 2. the number of columns in "A" and "B" will always be the same.
: : 4. the headers for "A" and "B" will always be the same.
: : 3. there is always an empty line at the beginning of the file and in
: : between "A" and "B".
: :
:
: Read the lines into vector z, one line per element.
:
: Define a grouping variable, g, which is 1 for the lines
: starting at the first blank line and 2 for the lines
: starting at the 2nd. Define a function f which accepts such
: a group of lines and creates the appropriate data frame from
: them. tapply the lines, grouped by g, and bind the rows of
: the data frame produced from each group together into one
: large data frame.
:
: z <- readLines("file.dat")
:
: g <- cumsum(nchar(z) == 0)
: f <- function(x) {
: x[-(1:3)] <- paste(trim(x[2]), x[-(1:3)], sep = ".")
: read.table(textConnection(x[-(1:2)]), header = TRUE)
: }
: do.call("rbind", tapply(z, cumsum(nchar(z) == 0), f))
A correction:
z <- readLines("file.dat")
g <- cumsum(nchar(z) == 0)
f <- function(x) {
x[-(1:3)] <- paste(x[2], x[-(1:3)], sep = ".")
read.table(textConnection(x[-(1:2)]), header = TRUE)
}
do.call("rbind", tapply(z, cumsum(nchar(z) == 0), f))
:
: Note: if the blank lines or the A and B lines contain
: whitespace trim this off first. That is, insert these
: two lines after the readLines statement:
:
: trim <- function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)
: z <- trim(z)
:
: ______________________________________________
: R-help <at> stat.math.ethz.ch mailing list
: https://stat.ethz.ch/mailman/listinfo/r-help
: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
:
:
More information about the R-help
mailing list