[R] Inserting missing seq number

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Wed Mar 30 05:39:31 CEST 2022


Jeff,

There may well be such a function somewhere, but I would have asked for
something less ambitious than a function that does exactly that.

As I see it, there are many ways to do what you specifically want, but much
depends on exact conditions in your data.

Your example shows seq values ascending from 1 to 7, with missing rows
for values 3 and 6. Can we assume the data will always be in order with
only single missing values, or might two or more consecutive values be
missing? Does the sequence always begin with 1, or might you have numbers
from, say, 666 to 999?

Can the first or last entries be missing? How would we know?

One of many approaches I can outline is to create a second data structure
containing all valid integers between the lowest and highest values of seq;
in your case, from 1 to 7. One way is to make a data.frame holding just that
full seq column and use a function such as merge() to combine it with your
original. You would end up with a data.frame with more rows, in which the
new rows contain an NA in the count column.
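
As a minimal sketch of that merge idea in base R, assuming the sequence
numbers are whole numbers and that the first and last values are present
(the names full and expanded are just ones I made up for illustration):

```r
# Example data from the original question
df <- data.frame(seq = c(1, 2, 4, 5, 7), count = c(4, 7, 3, 5, 2))

# Every integer between the lowest and highest seq actually present
full <- data.frame(seq = seq(min(df$seq), max(df$seq)))

# Keep all rows of `full`; seq values absent from df get NA in count
expanded <- merge(full, df, by = "seq", all.x = TRUE)

print(expanded)
```

The rows for seq 3 and 6 should come back with NA in count, ready to be
filled in by whatever rule you choose.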

Now, yes, there are many packages out there that have functions for
filling in missing values. Some allow you to fill in the previous value,
and some may allow a mean or interpolation and so on.

But consider a simple loop over the row indices of the enlarged data.frame.
Each step in the loop makes note of the value stored in count on the
previous pass and, if the current pass has an NA, changes it to the value
being held. There are more details, but as an outline this may suffice.
And, yes, if you write a simple set of program lines that performs what
you want, and make it general enough, you should be able to wrap it in an
R function that accepts a data.frame, and perhaps an indication of which
columns to use, and calculates everything needed to return the enhanced
data.frame as a result. You will then have a function that does this!
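
Here is one way such a function might look, as a rough sketch rather than
a finished tool. The name fill_seq_gaps and its arguments are my own
invention, and it assumes whole-number sequence values with no duplicates
and a non-missing value in the first row:

```r
fill_seq_gaps <- function(data, seq_col = "seq", value_col = "count") {
  # Build the complete run of sequence numbers from lowest to highest
  full <- data.frame(seq(min(data[[seq_col]]), max(data[[seq_col]])))
  names(full) <- seq_col

  # Enlarge the data; rows for missing sequence numbers get NA
  out <- merge(full, data, by = seq_col, all.x = TRUE)

  # Walk the rows, carrying the last non-NA value forward
  held <- NA
  for (i in seq_len(nrow(out))) {
    if (is.na(out[[value_col]][i])) {
      out[[value_col]][i] <- held
    } else {
      held <- out[[value_col]][i]
    }
  }
  out
}

df <- data.frame(seq = c(1, 2, 4, 5, 7), count = c(4, 7, 3, 5, 2))
result <- fill_seq_gaps(df)
print(result)
```

Only the column named in value_col is filled; any other columns in the
inserted rows would stay NA unless you extend the loop to handle them.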

I am sure someone else will eventually offer a dplyr function or two, or
suggest you could have searched for something like: "R fill in missing info".
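
For what it is worth, a tidyverse-style version might look something like
the following. This is only a sketch and assumes the tidyr package is
installed; it uses complete(), full_seq() and fill() from tidyr, and the
variable names are again made up:

```r
df <- data.frame(seq = c(1, 2, 4, 5, 7), count = c(4, 7, 3, 5, 2))

# Only run if tidyr is available (an assumption about your setup)
if (requireNamespace("tidyr", quietly = TRUE)) {
  # Add rows for every seq value from min to max in steps of 1
  expanded <- tidyr::complete(df, seq = tidyr::full_seq(seq, period = 1))

  # Replace each NA in count with the nearest non-NA value above it
  filled <- tidyr::fill(expanded, count, .direction = "down")

  print(filled)
}
```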

-----Original Message-----
From: Jeff Reichman <reichmanj using sbcglobal.net>
To: R-help using r-project.org
Sent: Tue, Mar 29, 2022 10:47 pm
Subject: [R] Inserting missing seq number


R-help

Is there a R function that will insert missing sequence number(s) and then
fill a missing observation with the preceding value.

For example df <- data.frame(seq = c(1,2,4,5,7), count = c(4,7,3,5,2))

  seq count
1   1     4
2   2     7
3   4     3
4   5     5
5   7     2

What I need is

  seq count
1   1     4
2   2     7
3   3     7
4   4     3
5   5     5
6   6     5
7   7     2

Jeff


