[R] how to merge commands

MacQueen, Don macqueen1 at llnl.gov
Thu Feb 23 00:49:06 CET 2012

Are you absolutely certain that the data must be stored in Excel?

In the long run I believe you will find it easier if the data is stored in
an external database, or some other data repository that does not require
you to read so many separate files.

Probably the best you can hope for as it is now is to put these commands
inside a loop, or nested loops, with the input and output file names
constructed from the loop indexes [see help('paste') for constructing file
names].
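
A minimal sketch of such a loop follows. It rests on several assumptions:
every month folder follows the "January09" pattern from the original post,
the output names follow the "january09_avgsst.xlsx" pattern, and each daily
file has sst in its first column and chl in its second. The helper names
(out.file, period.means, run.all) are made up for illustration. Unlike the
original commands it passes na.rm = TRUE to rowMeans(), on the assumption
that NA readings should be skipped rather than turn the whole row mean
into NA.

```r
# Sketch only: paths and the "09" suffix are assumptions based on the
# January example in the original post.
in.root  <- "C:/Temp"
out.root <- "C:/Users/AAA/Desktop/Data"
months   <- month.name                       # "January" ... "December"

# Input folders built with paste(), one per month
in.dirs <- paste(in.root, "/", months, "09", sep = "")

# Output file name for one month and one variable ("sst" or "chl")
out.file <- function(month, var) {
  paste(out.root, "/", tolower(month), "09_avg", var, ".xlsx", sep = "")
}

# Row means over all daily files found in one or more folders.
# 'reader' is any function that takes a file name and returns a
# two-column data frame (sst, chl).
period.means <- function(dirs, reader) {
  files <- list.files(dirs, full.names = TRUE)
  daily <- as.matrix(do.call("cbind", lapply(files, reader)))
  daily[daily == -999] <- NA                 # flag the fake readings
  sst <- daily[, seq(1, ncol(daily), by = 2), drop = FALSE]
  chl <- daily[, seq(2, ncol(daily), by = 2), drop = FALSE]
  list(sst = rowMeans(sst, na.rm = TRUE),
       chl = rowMeans(chl, na.rm = TRUE))
}

# Loop over the months; 'writer' takes (data, file name)
run.all <- function(reader, writer) {
  for (i in seq_along(months)) {
    avg <- period.means(in.dirs[i], reader)
    writer(avg$sst, out.file(months[i], "sst"))
    writer(avg$chl, out.file(months[i], "chl"))
  }
}
```

With the xlsx package loaded, the call would presumably look like
run.all(function(f) read.xlsx(f, 1, header = FALSE), write.xlsx).
Because list.files() accepts several folders at once, a seasonal average
can reuse the same helper, e.g. period.means(in.dirs[1:3], reader) for
January-March.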


Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550

On 2/21/12 9:45 AM, "Nino Pierantonio" <nino.p.80 at gmail.com> wrote:

>Dear all,
>I am using R to work on a large amount of telemetry data divided by day.
>Each file (an xlsx file) contains 2 columns, the first one for sst
>readings and the second one for chl readings, and 72360 rows, each
>corresponding to the centre of a cell in my study area. The columns have
>no headings. Lots of cells have fake readings (-999.0000000). What I
>want to do is merge the files together, by month and season, replace
>the fake values with "NA" and then calculate the row averages for both
>sst and chl. I have stored the files in the directory C:/TEMP. This
>directory contains 12 subfolders, January to December, and each subfolder
>contains a certain number of files, corresponding to the number of days
>in each month (e.g. January 31 files, February 28 files, and so on).
>I already have commands that work properly but would really like to know
>if it is possible to reduce their number and maybe to run some of them
>automatically. I work "month-by-month" as follows (I am
>aware this is not the most elegant way to do it; I'm new to R and for
>the moment "elegance & style" is not my main goal):
> >setwd("C:/Temp/January09")	# to set my working directory
> >library(xlsx)	# to load the "xlsx" library necessary to handle the
>original *.xlsx files
> >list.jan09<-list.files("C:/Temp/January09", full=TRUE)
> >read.all.jan09<-lapply(list.jan09, read.xlsx, 1, header=FALSE)
> >daily.all.jan09<-do.call("cbind",read.all.jan09)	# to create a data
>frame containing all my data
> >daily.sst.jan09<-daily.all.jan09[,seq(from=1,to=61,by=2)]	# to create
>a second data frame containing only sst readings (sst readings
>correspond to the first column of each daily file). The resulting file
>will have 31 columns and 72360 lines
> >daily.chl.jan09<-daily.all.jan09[,seq(from=2,to=62,by=2)]	# to create
>a third data frame containing only chl readings (chl readings correspond
>to the second column of each daily file). The resulting file will have
>31 columns and 72360 lines	
>>)	# used to replace -999.0000000 values with "NA" 		
> >jan09_avgsst<-rowMeans(daily.sst.jan09)	# to create a vector
>containing the mean sst value of all the rows		
> >write.xlsx(jan09_avgsst,
>"C:/Users/AAA/Desktop/Data/january09_avgsst.xlsx")	# to store the sst
>>)	# used to replace -999.0000000 values with "NA"		
> >jan09_avgchl<-rowMeans(daily.chl.jan09)	# to create a vector
>containing the mean value of all the rows			
> >write.xlsx(jan09_avgchl,
>"C:/Users/AAA/Desktop/Data/january09_avgchl.xlsx")	# to store the chl
>I repeat these same commands for all the months and for the seasons
>(January-March; April-June; July-September; October-December), so the
>whole thing is a bit redundant.
>How can I speed up the process, reduce the number of commands and maybe
>automate them? Many thanks for your help.
>Nino Pierantonio
>Mobile: +39 349.532.9370
>Skype: pierantonio_nino
>R-help at r-project.org mailing list
>PLEASE do read the posting guide
>and provide commented, minimal, self-contained, reproducible code.
