[R] How to read in this data format?
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Mar 1 18:35:43 CET 2007
Read in the data using readLines and keep only the
desired lines: those containing only numbers, dots
and spaces, plus those containing the word Time.
Then remove Retention from every line so that all
remaining lines have exactly two fields, and read
them in using read.table.

Finally, split the rows into groups, one per scan,
and restructure each group using "by". The last
line converts the "by" output to a data frame.

At the end we show an alternate function f for use
with "by", should we wish to generate long rather
than wide output (using the terminology of the
reshape command).
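
The filtering step described above can be tried on its own before running the full code. This is only a small sketch: the vector x is a made-up sample that mimics the layout in the question, not the real file.

```r
# Made-up sample lines mimicking the file layout from the question
x <- c("$$ Experiment Number:", "FUNCTION 1", "Scan 1",
       "Retention Time 0.017", "399.8112 184", "....", "")

# Keep lines that start with 1-9 and otherwise hold only digits,
# dots and spaces, plus any line containing the word "Time"
kept <- grep("^[1-9][0-9. ]*$|Time", x, value = TRUE)

# Drop "Retention" so that every kept line has exactly two fields
kept <- gsub("Retention", "", kept)
print(kept)
```

Note that the pattern requires a data line to begin with a digit from 1 to 9, so lines with leading spaces or a leading 0 would need an adjusted pattern.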
Lines <- "$$ Experiment Number:
$$ Associated Data:
FUNCTION 1
Scan 1
Retention Time 0.017
399.8112 184
399.8742 0
399.9372 152
....
Scan 2
Retention Time 0.021
399.8112 181
399.8742 1
399.9372 153
"
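# replace next line with: Lines. <- readLines("myfile.dat")
Lines. <- readLines(textConnection(Lines))
# keep only the numeric data lines and the "Retention Time" lines
Lines. <- grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE)
# drop "Retention" so that every remaining line has two fields
Lines. <- gsub("Retention", "", Lines.)
DF <- read.table(textConnection(Lines.), as.is = TRUE)
closeAllConnections()
# each group starts with a "Time" row: its value becomes the id,
# the remaining rows become values named by their first field
f <- function(x) c(id = x[1,2], structure(x[-1,2], .Names = x[-1,1]))
out.by <- by(DF, cumsum(DF[,1] == "Time"), f)
# bind the per-scan vectors together into one wide data frame
as.data.frame(do.call("rbind", out.by))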
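
The wide result uses id as the name of the retention time column, whereas the question asked for a column called Time. A possible follow-up, sketched here on hand-typed two-column data (standing in for the filtered lines) and using the text argument of read.table found in current versions of R, is to rename that column afterwards. The name wide is introduced only for this illustration.

```r
# Two-column data as it would look after the filtering step
# (typed in by hand here so the example is self-contained)
DF <- read.table(text = "Time 0.017
399.8112 184
399.8742 0
399.9372 152
Time 0.021
399.8112 181
399.8742 1
399.9372 153", as.is = TRUE)

f <- function(x) c(id = x[1, 2], structure(x[-1, 2], .Names = x[-1, 1]))
wide <- as.data.frame(do.call("rbind", by(DF, cumsum(DF[, 1] == "Time"), f)))

# Rename the id column to match the header requested in the question
names(wide)[names(wide) == "id"] <- "Time"
print(wide)
```

The remaining column names are kept as the character strings taken from the first field, so they are accessed as wide[["399.8112"]] rather than with $ and a bare number.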
Alternatively, to produce long format, replace
the function f with:
f <- function(x) data.frame(x[-1,], id = x[1,2])
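
For completeness, here is a sketch of how the long-format variant might be used end to end. It again assumes hand-typed two-column data in place of the filtered file, and the names long and wide2 are introduced only for this illustration. The final step shows that reshape can bring the long result back to wide form.

```r
# Two-column data as it would look after the filtering step
DF <- read.table(text = "Time 0.017
399.8112 184
399.8742 0
399.9372 152
Time 0.021
399.8112 181
399.8742 1
399.9372 153", as.is = TRUE)

# Long-format variant of f: one row per data line, tagged with its id
f <- function(x) data.frame(x[-1, ], id = x[1, 2])
long <- do.call("rbind", by(DF, cumsum(DF[, 1] == "Time"), f))
print(long)

# Going back to wide format with reshape: one row per id,
# one column per distinct value of V1
wide2 <- reshape(long, idvar = "id", timevar = "V1", direction = "wide")
print(wide2)
```

With by returning data frames instead of vectors, rbind stacks them by row, so as.data.frame is no longer needed.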
On 3/1/07, Bart Joosen <bartjoosen at hotmail.com> wrote:
> Hi,
>
> I received an ASCII file containing the following information:
>
> $$ Experiment Number:
> $$ Associated Data:
>
> FUNCTION 1
>
> Scan 1
> Retention Time 0.017
>
> 399.8112 184
> 399.8742 0
> 399.9372 152
> ....
>
> Scan 2
> Retention Time 0.021
>
> 399.8112 181
> 399.8742 1
> 399.9372 153
> .....
>
>
> I would like to import this data in R into a dataframe, where there is a
> column time, the first numbers as column names, and the second numbers as
> data in the dataframe:
>
> Time   399.8112  399.8742  399.9372
> 0.017  184       0         152
> 0.021  181       1         153
>
> I did take a look at read.table, read.delim, scan, ... but I have no idea
> how to solve this problem.
>
> Anyone?
>
>
> Thanks
>
> Bart