[R] Using while statements to insert rows in a dataframe

Luc Villandre villandl at dms.umontreal.ca
Tue May 19 18:00:32 CEST 2009


Eric McKibben wrote:
> Hi.
> I am very new to R and have been diligently working my way through the manual and various tutorials.  I am now trying to work with some of my own data and have encountered a problem that I need to fix.  I have a dataframe with 8 columns and approximately 850 rows; an excerpt is provided below.  Within column 6 (Question), the numbers 1:33 repeat down the entire column.  Occasionally, however, another value (-32767) appears.  I need to locate this value every time it appears and, in its place, insert 33 rows numbered 1:33 in column Question.  I also need to maintain the integrity of the other columns, so that the values at that location in each column are repeated 33 times.  So, in the example below, I currently have 68 rows of data, but I actually need 132 rows (two -32767 values need to be replaced).  Based on my reading, I am guessing that I need to use a while loop, but I cannot seem to get it right.  Is this the appropriate function, or is there another more efficient method for achieving my goal?  Again, I am quite new to R.  Thanks for your help!
>
> Year Month Day Time PartID Question Latency Response
> 2008 2 7 194556 6 1 265 -1
> 2008 2 7 194556 6 2 466 84
> 2008 2 7 194556 6 3 199 68
> 2008 2 7 194556 6 4 152 83
> 2008 2 7 194556 6 5 177 100
> 2008 2 7 194556 6 6 177 61
> 2008 2 7 194556 6 7 400 43
> 2008 2 7 194556 6 8 225 88
> 2008 2 7 194556 6 9 249 32
> 2008 2 7 194556 6 10 172 8
> 2008 2 7 194556 6 11 163 17
> 2008 2 7 194556 6 12 326 70
> 2008 2 7 194556 6 13 232 26
> 2008 2 7 194556 6 14 157 22
> 2008 2 7 194556 6 15 135 -1
> 2008 2 7 194556 6 16 133 2
> 2008 2 7 194556 6 17 222 2
> 2008 2 7 194556 6 18 357 4
> 2008 2 7 194556 6 19 131 -1
> 2008 2 7 194556 6 20 222 90
> 2008 2 7 194556 6 21 230 35
> 2008 2 7 194556 6 22 374 32
> 2008 2 7 194556 6 23 275 85
> 2008 2 7 194556 6 24 141 -1
> 2008 2 7 194556 6 25 264 19
> 2008 2 7 194556 6 26 380 17
> 2008 2 7 194556 6 27 240 21
> 2008 2 7 194556 6 28 127 -1
> 2008 2 7 194556 6 29 232 92
> 2008 2 7 194556 6 30 205 95
> 2008 2 7 194556 6 31 185 96
> 2008 2 7 194556 6 32 319 61
> 2008 2 7 194556 6 33 101 -1
> 2008 2 8 122203 6 -32767 0 NA
> 2008 2 7 194556 6 1 265 -1
> 2008 2 7 194556 6 2 466 84
> 2008 2 7 194556 6 3 199 68
> 2008 2 7 194556 6 4 152 83
> 2008 2 7 194556 6 5 177 100
> 2008 2 7 194556 6 6 177 61
> 2008 2 7 194556 6 7 400 43
> 2008 2 7 194556 6 8 225 88
> 2008 2 7 194556 6 9 249 32
> 2008 2 7 194556 6 10 172 8
> 2008 2 7 194556 6 11 163 17
> 2008 2 7 194556 6 12 326 70
> 2008 2 7 194556 6 13 232 26
> 2008 2 7 194556 6 14 157 22
> 2008 2 7 194556 6 15 135 -1
> 2008 2 7 194556 6 16 133 2
> 2008 2 7 194556 6 17 222 2
> 2008 2 7 194556 6 18 357 4
> 2008 2 7 194556 6 19 131 -1
> 2008 2 7 194556 6 20 222 90
> 2008 2 7 194556 6 21 230 35
> 2008 2 7 194556 6 22 374 32
> 2008 2 7 194556 6 23 275 85
> 2008 2 7 194556 6 24 141 -1
> 2008 2 7 194556 6 25 264 19
> 2008 2 7 194556 6 26 380 17
> 2008 2 7 194556 6 27 240 21
> 2008 2 7 194556 6 28 127 -1
> 2008 2 7 194556 6 29 232 92
> 2008 2 7 194556 6 30 205 95
> 2008 2 7 194556 6 31 185 96
> 2008 2 7 194556 6 32 319 61
> 2008 2 7 194556 6 33 101 -1
> 2008 2 8 143056 6 -32767 0 NA
>
>
>
>
> Eric S McKibben
> Industrial-Organizational Psychology Graduate Student
> Clemson University
> Clemson, SC
Hi Eric,

A /while/ loop would probably work, but it would not take advantage of 
R's vectorized indexing, which lets you repeat rows simply by repeating 
their row numbers in the index. What I suggest instead is the following 
(my.data is the data.frame you provided):

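> ## Locate the rows that hold the placeholder value
>
> row.pos <- which(my.data$Question == -32767)
> repeat.index <- rep(row.pos, 33)
>
> ## Build the row index: ordinary rows once, placeholder rows 33 times
>
> index.vector <- sort(c(which(my.data$Question != -32767), repeat.index))
> final.result <- my.data[index.vector, ]
>
> ## Number the repeated rows 1:33 in column Question
>
> is.repeated <- which(final.result$Question == -32767)
> final.result$Question[is.repeated] <- rep(1:33, times = length(row.pos))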
This should do the trick.
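
To check the indexing idea on something small, here is a minimal self-contained sketch. It is not the code from this thread: the helper name expand_placeholder, its arguments, and the toy data (blocks of 3 questions instead of 33) are made up for illustration. It also builds the index with the times argument of rep() rather than with sort(), which is a slightly different route to the same row order.

```r
## Hypothetical helper (illustration only): replace each placeholder row by
## n copies of itself and number those copies 1:n in column Question.
expand_placeholder <- function(df, n = 33, placeholder = -32767) {
  ## TRUE for rows that hold the placeholder; NA is treated as "not a placeholder"
  is.ph <- !is.na(df$Question) & df$Question == placeholder

  ## Keep ordinary rows once and placeholder rows n times, in original order
  idx <- rep(seq_len(nrow(df)), times = ifelse(is.ph, n, 1))
  out <- df[idx, ]

  ## Renumber Question within every expanded block
  out$Question[is.ph[idx]] <- rep(seq_len(n), times = sum(is.ph))

  ## Drop the "4", "4.1", "4.2", ... row names produced by repeated indexing
  rownames(out) <- NULL
  out
}

## Toy data (assumed, not the poster's): two blocks of 3 questions,
## each followed by one placeholder row
toy <- data.frame(
  PartID   = 6,
  Day      = c(7, 7, 7, 8, 7, 7, 7, 8),
  Question = c(1, 2, 3, -32767, 1, 2, 3, -32767),
  Latency  = c(265, 466, 199, 0, 265, 466, 199, 0),
  Response = c(-1, 84, 68, NA, -1, 84, 68, NA)
)

result <- expand_placeholder(toy, n = 3)
print(result)
```

Each placeholder row in the toy data becomes three rows that keep their own Day, Latency and Response values, with Question running 1, 2, 3. Under the same assumptions, calling expand_placeholder(my.data) with the default n = 33 on the 68-row excerpt should give 66 + 2 * 33 = 132 rows, which matches the count Eric expects.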

Cheers,
-- 
*Luc Villandré*
/Biostatistician
McGill University Health Center -
Montreal Children's Hospital Research Institute/



