[R] Multiple cbind according to filename
Ken
katakagi at bu.edu
Fri May 25 16:44:38 CEST 2012
Matthew Ouellette <mouellette89 <at> gmail.com> writes:
>
> Hi all,
>
> I'm just a beginner with R but I have not been able to search for any
> relevant answer to my problem. I apologize if it has in fact been asked
> before.
>
> Recently I've realized that I need to combine hundreds of pairs of data
> frames. The filenames of the frames I need to combine have unique strings.
> This is my best guess as to the approach to take:
>
> filenames<-list.files()
>
> filenames
> [1] "a1.csv" "a2.csv" "b1.csv" "b2.csv" "c1.csv" "c2.csv"
>
> alldata<-lapply(filenames, read.csv, header=TRUE)
>
> names(alldata)<-filenames
> summary(alldata)
> Length Class Mode
> a1.csv 27 data.frame list
> a2.csv 27 data.frame list
> b1.csv 27 data.frame list
> b2.csv 27 data.frame list
> c1.csv 27 data.frame list
> c2.csv 27 data.frame list
>
> My next step would be to cbind files that share a common string at the
> beginning, such as:
> cbind(alldata[[1]],alldata[[2]])
> cbind(alldata[[3]],alldata[[4]])
> cbind(alldata[[5]],alldata[[6]])
> ...
>
> but the file list is hundreds of files long (it is sorted alphanumerically,
> as in this example; not sure if this is relevant). If I had to
> guess, I'd do something like this:
>
> which(names(alldata)==...), to identify which elements to combine based on
> unique filename
>
> OR
> x<-seq(1,length(alldata), 2)
> y=x+1
> z<-cbind(x,y)
> z
> x y
> [1,] 1 2
> [2,] 3 4
> [3,] 5 6
>
> to use the frame created in z to combine based on rows,
>
> then use a looped cbind function (or *apply function with nested cbind
> function?) using the previously returned indexes to create my new combined
> data frames, including a step to write the frames to a new unique filename
> (not sure how to do that step in this context). These last steps I've
> tried a lot of code but nothing worth mentioning as it has all failed
> miserably.
>
> I appreciate the help,
>
> M
>
Hi Matthew,
You could try substr() if the cbind is based on a common string at the start
of the file name. Just make sure the strings in filenames are in the same
order as the data frames in the list:
# Example data frames standing in for the files read from disk
a1 <- data.frame("col1" = seq(1,10, 1))
a2 <- data.frame("col2" = seq(11,20, 1))
b1 <- data.frame("col3" = seq(21,30, 1))
b2 <- data.frame("col4" = seq(31,40, 1))
filenames <- c("a1", "a2", "b1", "b2")
# Named file.list so it does not mask the base function list.files()
file.list <- list(a1, a2, b1, b2)
# Grouping key: the first character of each file name
first.letter <- substr(filenames, 1, 1)
unique.first.letter <- unique(first.letter)
l.files <- list()
# Column-bind all data frames that share a first letter
for (i in seq_along(unique.first.letter)) {
  l.files[[i]] <- do.call(cbind, file.list[first.letter == unique.first.letter[i]])
}
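The question also asked how to write each combined frame to its own file, and a one-letter key will not cover longer prefixes. Here is a rough sketch of one way to handle both, not the only approach. It assumes the file names look like <prefix><number>.csv, that every frame in a group has the same number of rows, and it uses made-up toy data plus a hypothetical "_combined.csv" output name written to a temporary directory:

```r
# Toy stand-ins for the data frames read from disk (hypothetical data).
alldata <- list(
  "a1.csv" = data.frame(col1 = 1:3),
  "a2.csv" = data.frame(col2 = 4:6),
  "b1.csv" = data.frame(col3 = 7:9),
  "b2.csv" = data.frame(col4 = 10:12)
)

# Grouping key: drop the extension, then any trailing digits.
# Assumes names of the form <prefix><number>.csv.
prefix <- sub("[0-9]+$", "", tools::file_path_sans_ext(names(alldata)))

# split() groups the list elements by prefix; do.call(cbind, ...) joins each
# group column-wise. unname() stops the file names from being prepended to
# the column names of multi-column frames.
combined <- lapply(split(alldata, prefix),
                   function(g) do.call(cbind, unname(g)))

# Write each combined frame to <prefix>_combined.csv (hypothetical naming)
# in a temporary directory.
out.dir <- tempdir()
out.files <- file.path(out.dir, paste0(names(combined), "_combined.csv"))
for (i in seq_along(combined)) {
  write.csv(combined[[i]], out.files[i], row.names = FALSE)
}
```

With real data, alldata would be the named list from lapply(filenames, read.csv) in the original question, and out.dir would be whichever directory the results should go to.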
HTH,
Ken