[R] using for loop with data frames.

MacQueen, Don
Thu May 10 18:11:16 CEST 2018


Evidently, you want your loop to create new data frames, named (in this example)
  df_selected1
  df_selected2
  df_selected3

Yes, it can be done. But to do it you will have to use the get() and assign() functions, and construct the data frame names as character strings. Syntax like
    df_bs_id[3]
does not give you df_bs_id3.
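
For illustration, the get()/assign() route might look roughly like the sketch below. The small data frames are made-up stand-ins for the read.csv() results, and the names in_name and out_name are only illustrative.

```r
# Stand-in data; in the original question these come from read.csv().
df_bs_id1 <- data.frame(column1 = 1:2,   column2 = 3:4,   other = 5:6)
df_bs_id2 <- data.frame(column1 = 7:8,   column2 = 9:10,  other = 11:12)
df_bs_id3 <- data.frame(column1 = 13:14, column2 = 15:16, other = 17:18)

for (i in 1:3) {
  in_name  <- paste0("df_bs_id", i)     # build the name, e.g. "df_bs_id1"
  out_name <- paste0("df_selected", i)  # build the name, e.g. "df_selected1"
  df <- get(in_name)                    # look up the object by its name
  # create a new variable whose name is the string in out_name
  assign(out_name, df[, c("column1", "column2")])
}
```

After the loop, df_selected1, df_selected2 and df_selected3 exist as separate objects in the environment where the loop ran.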

R experts typically discourage this kind of approach. A method more consistent with how R is designed to work would be to store the data frames as elements of a list.

dflst <- list(df_bs_id1, df_bs_id2, df_bs_id3)
nframes <- length(dflst)
newdf <- dflst

for (id in seq_len(nframes)) {
   # use [[ ]] on the left so the whole data frame is stored as one list element
   newdf[[id]] <- dflst[[id]][ , c("column1", "column2")]
}

Optionally, you could name the elements of the list of processed data frames:
 
    names(newdf) <- paste0('df_selected', seq_along(newdf))

After which you would have the original data frames as elements of dflst, and the processed data frames as elements of newdf. The loop can be simplified a bit if you don't need to keep copies of the original data frames.
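
One way that simplification might look, as a sketch with made-up stand-in data: each list element is overwritten in place, so no second list is needed.

```r
# Stand-in list of data frames; in practice these would be the files read in.
dflst <- list(
  data.frame(column1 = 1:2, column2 = 3:4,  other = 5:6),
  data.frame(column1 = 7:8, column2 = 9:10, other = 11:12)
)

# Overwrite each element with its two-column subset.
for (id in seq_along(dflst)) {
  dflst[[id]] <- dflst[[id]][, c("column1", "column2")]
}

# The same thing without an explicit loop would be:
# dflst <- lapply(dflst, function(d) d[, c("column1", "column2")])
```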

With this approach, it would be better to create dflst using a loop over the incoming file names, running read.csv() inside the loop. In that case you would never create the separate data frames df_bs_id1, df_bs_id2, etc.
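
A sketch of reading the files directly into a list follows. To keep it self-contained it first writes three small CSV files to a temporary directory; the file contents, tmpdir and files are assumptions for the example, and real code would use the actual file names.

```r
# Write three small stand-in CSV files so the sketch runs on its own.
tmpdir <- tempfile("csvdemo")
dir.create(tmpdir)
files <- file.path(tmpdir, paste0("test", 1:3, ".csv"))
for (f in files) {
  write.csv(data.frame(column1 = 1:2, column2 = 3:4, other = 5:6),
            f, row.names = FALSE)
}

# Read each file and keep only the two columns of interest.
dflst <- vector("list", length(files))
for (i in seq_along(files)) {
  dat <- read.csv(files[i], header = TRUE)
  dflst[[i]] <- dat[, c("column1", "column2")]
}
names(dflst) <- paste0("df_selected", seq_along(files))
```

The individual results are then available as dflst[["df_selected1"]], dflst[["df_selected2"]], and so on.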

I have used both approaches at various times over the years, and each has pros and cons. In general, however, I would recommend the list approach, especially if you have a large number of files to process.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 
On 5/10/18, 7:33 AM, "R-help on behalf of Marcelo Mariano Silva" <r-help-bounces using r-project.org on behalf of marcelomarianosilva using gmail.com> wrote:

    Hi,
    
    Is it possible to use a loop to process many data frames in the same way?
    
    For example, if I have three data frames, all with the same variables
    
    
    df_bs_id1 <- read.csv("test1.csv",header =TRUE)
    df_bs_id2 <- read.csv("test2.csv",header =TRUE)
    df_bs_id3 <- read.csv("test3.csv",header =TRUE)
    
    
    How could I implement a loop that, for instance, would select
    two columns of interest, along the lines of the code below?
    
    
    # selecting only 2 columns of interest
    
    for (1, 1:3) {
    df_selected [i] <- df_bs_id[i]  [ , c("column1", "column2")]  }
    
    
    Tks
    
    MMS
    