[R] combining data from multiple read.delim() invocations.
Bert Gunter
gunter.berton at gene.com
Tue Jul 1 19:33:00 CEST 2014
Maybe, David, but this isn't really it.
Your code basically reproduces the explicit for() loop with lapply().
There might be some advantage in rbind-ing the list once rather than
incrementally adding rows to the data frame, but I would be surprised
if it made much of a difference either way. Of course, someone with
actual data might prove me wrong...
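
For anyone who wants to check, here is a minimal sketch of such a comparison. It is only an illustration under stated assumptions: the helper names (read_one, read_loop, read_lapply), the number of files, and the temporary file layout are all made up for the test and are not from the original posts. It confirms that both approaches return the same data frame and prints their timings, which will vary with the machine and with the number and size of the files.

```r
# Hypothetical helper: read one file with the settings used in the thread.
read_one <- function(f) {
  read.delim(f, header = FALSE,
             col.names = c("lpar", "started", "ended"),
             na.strings = "\\N",
             colClasses = c("character", "POSIXct", "POSIXct"))
}

# Approach 1: grow the data frame inside an explicit for() loop.
read_loop <- function(files) {
  out <- NULL
  for (f in files) out <- rbind(out, read_one(f))
  out
}

# Approach 2: read all files into a list, then rbind once.
read_lapply <- function(files) {
  do.call(rbind, lapply(files, read_one))
}

# Made-up test data: n_files small tab-delimited files in a temp directory.
n_files <- 50
tmp <- tempfile("rbind_demo")
dir.create(tmp)
files <- file.path(tmp, sprintf("f%03d.tab", seq_len(n_files)))
now <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
for (f in files) {
  write.table(data.frame(rep("A", 5), now, now), f, sep = "\t",
              row.names = FALSE, col.names = FALSE)
}

a <- read_loop(files)
b <- read_lapply(files)

# Both approaches should produce the same combined data frame.
stopifnot(isTRUE(all.equal(a, b)))

# Timings; no particular result is assumed here.
print(system.time(read_loop(files)))
print(system.time(read_lapply(files)))

unlink(tmp, recursive = TRUE)
```

Raising n_files or the number of rows per file is the obvious way to see whether the gap between the two grows with the amount of data.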
Cheers,
Bert
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Tue, Jul 1, 2014 at 9:31 AM, David L Carlson <dcarlson at tamu.edu> wrote:
> There is a better way. First we need some data. This creates three files in your current working directory, each with five rows:
>
> write.table(data.frame(rep("A", 5), Sys.time(), Sys.time()),
>             "A.tab", sep="\t", row.names=FALSE, col.names=FALSE)
> write.table(data.frame(rep("B", 5), Sys.time(), Sys.time()),
>             "B.tab", sep="\t", row.names=FALSE, col.names=FALSE)
> write.table(data.frame(rep("C", 5), Sys.time(), Sys.time()),
>             "C.tab", sep="\t", row.names=FALSE, col.names=FALSE)
>
> Now to read and combine them into a single data.frame:
>
> fls <- c("A.tab", "B.tab", "C.tab")
> df.list <- lapply(fls, read.delim, header=FALSE,
>                   col.names=c("lpar","started","ended"),
>                   as.is=TRUE, na.strings='\\N',
>                   colClasses=c("character","POSIXct","POSIXct"))
> df.all <- do.call(rbind, df.list)
> > str(df.all)
> 'data.frame': 15 obs. of 3 variables:
> $ lpar : chr "A" "A" "A" "A" ...
> $ started: POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...
> $ ended : POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of John McKown
> Sent: Tuesday, July 1, 2014 7:07 AM
> To: r-help at r-project.org
> Subject: [R] combining data from multiple read.delim() invocations.
>
> Is there a better way to do the following? I have data in a number of tab
> delimited files. I am using read.delim() to read them, in a loop. I am
> invoking my code on Linux Fedora 20, from the BASH command line, using
> Rscript. The code I'm using looks like:
>
> arguments <- commandArgs(trailingOnly=TRUE);
> # initialize the capped_data data.frame
> capped_data <- data.frame(lpar="NULL",
>                           started=Sys.time(),
>                           ended=Sys.time(),
>                           stringsAsFactors=FALSE);
> # and empty it.
> capped_data <- capped_data[-1,];
> #
> # Read in the data from the files listed
> for (file in arguments) {
>     data <- read.delim(file,
>                        header=FALSE,
>                        col.names=c("lpar","started","ended"),
>                        as.is=TRUE,
>                        na.strings='\\N',
>                        colClasses=c("character","POSIXct","POSIXct"));
>     capped_data <- rbind(capped_data,data)
> }
> #
>
> I.e. is there an easier way than doing a read.delim/rbind in a loop?
>
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha! <><
> John McKown
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.