[R] Multiple cbind according to filename

Ken katakagi at bu.edu
Fri May 25 16:44:38 CEST 2012


Matthew Ouellette <mouellette89 <at> gmail.com> writes:

> 
> Hi all,
> 
> I'm just a beginner with R but I have not been able to search for any
> relevant answer to my problem.  I apologize if it has in fact been asked
> before.
> 
> Recently I've realized that I need to combine hundreds of pairs of data
> frames.  The filenames of the frames I need to combine have unique strings.
>  This is my best guess as to the approach to take:
> 
>  filenames<-list.files()
> 
> filenames
> [1] "a1.csv" "a2.csv" "b1.csv" "b2.csv" "c1.csv" "c2.csv"
> 
> alldata<-lapply(filenames, read.csv, header=TRUE)
> 
>  names(alldata)<-filenames
>  summary(alldata)
>        Length Class      Mode
> a1.csv 27     data.frame list
> a2.csv 27     data.frame list
> b1.csv 27     data.frame list
> b2.csv 27     data.frame list
> c1.csv 27     data.frame list
> c2.csv 27     data.frame list
> 
> My next step would be to cbind files that share a common string at the
> beginning, such as:
> cbind(alldata[[1]],alldata[[2]])
> cbind(alldata[[3]],alldata[[4]])
> cbind(alldata[[5]],alldata[[6]])
> ...
> 
> but file list is hundreds of files long (but is sorted alphanumerically
> such as in this example - not sure if this is relevant).  If I had to
> guess, I'd do something like this:
> 
> which(names(alldata)==...), to identify which elements to combine based on
> unique filename
> 
> OR
> x<-seq(1,length(alldata), 2)
> y=x+1
> z<-cbind(x,y)
> z
>      x y
> [1,] 1 2
> [2,] 3 4
> [3,] 5 6
> 
> to use the frame created in z to combine based on rows,
> 
> then use a looped cbind function (or *apply function with nested cbind
> function?) using the previously returned indexes to create my new combined
> data frames, including a step to write the frames to a new unique filename
> (not sure how to do that step in this context).  These last steps I've
> tried a lot of code but nothing worth mentioning as it has all failed
> miserably.
> 
> I appreciate the help,
> 
> M
> 

Hi Matthew,

You could try substr() if the cbind is based on a common string at the start
of the file name. Just make sure that the strings in filenames are in the same
order as the data frames in the list:

a1 <- data.frame(col1 = seq(1, 10, 1))
a2 <- data.frame(col2 = seq(11, 20, 1))
b1 <- data.frame(col3 = seq(21, 30, 1))
b2 <- data.frame(col4 = seq(31, 40, 1))

filenames <- c("a1", "a2", "b1", "b2")

# List of data frames, in the same order as filenames
# (named file.list so it is not confused with the base function list.files())
file.list <- list(a1, a2, b1, b2)
first.letter <- substr(filenames, 1, 1)
unique.first.letter <- unique(first.letter)

# Build one combined data frame per first letter
l.files <- list()
for (i in seq_along(unique.first.letter)) {
  l.files[[i]] <- do.call(cbind, file.list[first.letter == unique.first.letter[i]])
}
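
If the common string is longer than one character, the same idea might be
sketched with sub() and split() instead of substr(). This is only a rough
sketch: the file names below are made up, the data frames are stand-ins for
what read.csv would return, and it assumes every name looks like
<prefix><digits>.csv.

```r
# Hypothetical file names; assumes each name is <prefix><digits>.csv
filenames <- c("a1.csv", "a2.csv", "b1.csv", "b2.csv",
               "sample10.csv", "sample11.csv")

# Stand-in for lapply(filenames, read.csv): one single-column frame per file
alldata <- lapply(seq_along(filenames), function(i) {
  df <- data.frame(seq_len(3) + 10 * i)
  names(df) <- paste0("col", i)
  df
})
names(alldata) <- filenames

# Strip the trailing digits and the extension to get the grouping key
prefix <- sub("[0-9]+\\.csv$", "", filenames)

# split() groups the list elements by prefix; each group is then cbind-ed.
# unname() keeps the file names from being treated as argument names.
combined <- lapply(split(alldata, prefix), function(group) {
  do.call(cbind, unname(group))
})
```

The result is a list with one combined data frame per prefix, named by that
prefix, so it does not depend on the files coming in pairs.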

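For the step of writing each combined frame to its own file, something along
these lines may work. It is a sketch under assumptions: the list of combined
frames is made up, the "_combined.csv" suffix is just an illustrative naming
choice, and the output goes to a temporary directory that you would replace
with your own folder.

```r
# Hypothetical combined frames, named by their shared prefix
combined <- list(
  a = data.frame(col1 = 1:3, col2 = 4:6),
  b = data.frame(col3 = 7:9, col4 = 10:12)
)

# Temporary directory for the example; replace with your own output folder
outdir <- tempdir()

for (prefix in names(combined)) {
  # Build a unique file name from the prefix, e.g. "a" -> "a_combined.csv"
  outfile <- file.path(outdir, paste0(prefix, "_combined.csv"))
  write.csv(combined[[prefix]], outfile, row.names = FALSE)
}
```

Using the prefix in the file name keeps each output unique as long as the
prefixes themselves are unique.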

HTH,
Ken


