[R] how to make the code more efficient using lapply
Stephen HonKit Wong
@tephen66 @ending from gm@il@com
Fri May 25 08:24:12 CEST 2018
Dear All,
I have the following for-loop code, which reads many Excel files (each
with many rows and columns) from a directory, extracts some rows and
columns from each file, and combines them into one data frame. The for
loop does the job but is quite slow. How can I make it faster using the
lapply function? Thanks in advance!
library(readxl)   # read_xlsx(), cell_cols()
library(writexl)  # write_xlsx()

temp.df <- list()  # empty list to store the extracted result from each Excel file
for (i in list.files()) {  # loop through each Excel file in the directory
  # read the file with read_xlsx() from package "readxl"
  temp <- read_xlsx(i, sheet = 1, range = cell_cols(c(1, 30, 38:42)))
  # extract rows based on temp$Id
  temp <- temp[grep("^geneA$|^geneB$|^geneC$", temp$Id), ]
  # clean up column names
  names(temp) <- gsub("^.*] ", "", names(temp))
  # wrap the data frame in a list so it can be appended to the list
  temp.df <- append(temp.df, list(as.data.frame(temp)))
  # after the last Excel file, combine all rows in the list into one data
  # frame; this works because every file has the same column names
  if (i == list.files()[length(list.files())]) {
    temp.df.all <- do.call("rbind", temp.df)
    write_xlsx(temp.df.all, path = "output.xlsx")  # write_xlsx() from package "writexl"
  }
}
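
For comparison, here is a minimal sketch of how the same steps might look with lapply. The helper names clean_one and read_one are hypothetical, and so are the two assumptions that the input files end in .xlsx and that output.xlsx from an earlier run should be skipped. The read_xlsx arguments are copied unchanged from the loop above. lapply mainly makes the code tidier; most of the run time probably goes into reading the workbooks, so the speed-up may be modest.

```r
# Hypothetical helper: keep the wanted rows and tidy the column names.
clean_one <- function(temp, genes = c("geneA", "geneB", "geneC")) {
  # exact matching selects the same rows as the anchored regex in the loop
  temp <- temp[temp$Id %in% genes, , drop = FALSE]
  names(temp) <- gsub("^.*] ", "", names(temp))  # clean up column names
  as.data.frame(temp)
}

# Hypothetical helper: read one workbook with the same arguments as the loop.
read_one <- function(path) {
  temp <- readxl::read_xlsx(path, sheet = 1,
                            range = readxl::cell_cols(c(1, 30, 38:42)))
  clean_one(temp)
}

# Assumption: inputs end in .xlsx; skip the result file of an earlier run.
files <- setdiff(list.files(pattern = "\\.xlsx$"), "output.xlsx")

if (length(files) > 0) {
  temp.df <- lapply(files, read_one)        # one data frame per file
  temp.df.all <- do.call("rbind", temp.df)  # combine once, after all reads
  writexl::write_xlsx(temp.df.all, path = "output.xlsx")
}
```

Combining and writing after lapply has finished also removes the need to test for the last file inside the loop.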
*Stephen*