[R] Fill NA values in columns with values of another column
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Aug 28 10:18:42 CEST 2024
At 11:23 on 27/08/2024, Francesca PANCOTTO via R-help wrote:
> Dear Contributors,
> I have a problem with a database composed of many individuals for many
> periods, for which I need to perform a manipulation of data as follows.
> Here I report the procedure I need to do for the first 32 observations of
> the first period.
>
>
> cbind(VB1d[,1],s1id[,1])
> [,1] [,2]
> [1,] 6 8
> [2,] 9 5
> [3,] NA 1
> [4,] 5 6
> [5,] NA 7
> [6,] NA 2
> [7,] 4 4
> [8,] 2 7
> [9,] 2 7
> [10,] NA 3
> [11,] NA 2
> [12,] NA 4
> [13,] 5 6
> [14,] 9 5
> [15,] NA 5
> [16,] NA 6
> [17,] 10 3
> [18,] 7 2
> [19,] 2 1
> [20,] NA 7
> [21,] 7 2
> [22,] NA 8
> [23,] NA 4
> [24,] NA 5
> [25,] NA 6
> [26,] 2 1
> [27,] 4 4
> [28,] 6 8
> [29,] 10 3
> [30,] NA 3
> [31,] NA 8
> [32,] NA 1
>
>
> In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups,
> randomly mixed in the larger group of 32.
> For each group, I want the value that is reported for only two group
> members to be assigned to all four group members.
>
> For example, the value 8 in the first row, second column, is group 8. The
> value of the variable VB1d for group 8 is 6. At row 28, again for s1id equal
> to 8, I have 6.
> But in row 22, where the second variable is also 8, VB1d is NA.
> The same happens in every group: only two members have the correct number,
> the other two are NA.
> I need each group, identified by the values of the variable s1id,
> to correctly report the value of VB1d that is present for just two
> group members.
>
> I hope my explanation is acceptable.
> The task appears complex to me right now, especially because I will need to
> repeat this procedure for x12x14 similar databases.
>
> Has anyone ever encountered a similar problem?
> Thanks in advance for any help provided.
>
> ----------------------------------
>
> Francesca Pancotto
>
> Associate Professor Political Economy
>
> University of Modena, Largo Santa Eufemia, 19, Modena
>
> Office Phone: +39 0522 523264
>
> Web: *https://sites.google.com/view/francescapancotto/home
> <https://sites.google.com/view/francescapancotto/home>*
>
> ----------------------------------
>
Hello,
Here is a solution.
Split the 1st column by the 2nd, keep only the non-NA values and unlist,
to get a named vector.
Then put the names and the values together with cbind.
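# Data from the original post: column 1 is VB1d, column 2 is s1id
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))

# Split VB1d by group, drop the NAs, flatten to a named vector
# (the |> and \(x) syntax needs R >= 4.1)
res <- split(mat[, 1L], mat[, 2L]) |> lapply(\(x) x[!is.na(x)]) |> unlist()
# unlist() names each element group id + position (e.g. "81", "82");
# dropping the last character recovers the group id, which works here
# because each group has fewer than 10 non-NA values
nms <- names(res)
res <- cbind(
  VB1d = res,
  s1id = substr(nms, 1, nchar(nms) - 1L) |> as.integer()
)
res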
#> VB1d s1id
#> 11 2 1
#> 12 2 1
#> 21 7 2
#> 22 7 2
#> 31 10 3
#> 32 10 3
#> 41 4 4
#> 42 4 4
#> 51 9 5
#> 52 9 5
#> 61 5 6
#> 62 5 6
#> 71 2 7
#> 72 2 7
#> 81 6 8
#> 82 6 8
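
The result above has one row per non-NA value, 16 rows in total. If the goal is instead to keep all 32 rows in their original order and fill each NA with its group's value, one possible sketch uses base R's ave(). The helper name fill_by_group is hypothetical, and the code assumes that every group has at least one non-NA value and that the non-NA values within a group agree, as they do in the posted data.

```r
# Data from the post: column 1 is VB1d, column 2 is s1id.
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))

# Hypothetical helper: replace every value of x with the single non-NA
# value found in its group g. Stops if a group has no non-NA value or
# has conflicting non-NA values (an assumption about the data).
fill_by_group <- function(x, g) {
  ave(x, g, FUN = function(v) {
    known <- unique(v[!is.na(v)])
    stopifnot(length(known) == 1L)
    rep(known, length(v))
  })
}

# Same row order as the input, with the NAs in VB1d filled in.
filled <- cbind(
  VB1d = fill_by_group(mat[, 1L], mat[, 2L]),
  s1id = mat[, 2L]
)
filled
```

Since the post indexes VB1d[,1] and s1id[,1], the same helper could presumably be called once per column (period) in a loop over the columns of the two matrices, but that depends on how the full data are laid out.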
Hope this helps,
Rui Barradas