[R] Replace NAs in split lists
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Mon Jan 8 17:44:30 CET 2018
I don't know. You seem to be posting in HTML, so your code is mangled. Can you post plain text and use the reprex package to make sure it produces the error in a clean R session?
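One possible cause, assuming the line was typed exactly as quoted below rather than mangled in transit: inside braces, R separates expressions with a newline or a semicolon. Once the function body is collapsed onto one line, the closing parenthesis of ifelse() is followed directly by z, which the parser reports as an unexpected symbol. A minimal sketch (the string bad and the variable msg are only illustrative names, not part of the original code):

```r
# Same data as in the quoted example below.
df1 <- read.table( text=
"ID ID_2 Firist Value
1 a aa TRUE 2
2 a ab FALSE NA
3 a ac FALSE NA
4 b aa TRUE 5
5 b ab FALSE NA
", header=TRUE, as.is=TRUE )
sdf <- split( df1, df1$ID )

# Two expressions on one line with no separator do not parse.
bad <- "function( z ) { z$Value <- 1 z }"
msg <- tryCatch( parse( text = bad )
               , error = function( e ) conditionMessage( e )
               )

# One-line form that parses: note the ";" before the final z.
sdf2 <- lapply( sdf, function( z ) { z$Value <- ifelse( is.na( z$Value ), z$Value[ !is.na( z$Value ) ][ 1 ], z$Value ); z } )
df2 <- do.call( rbind, sdf2 )
```

If that is what happened, df2 should match the multi-line version. A reprex would still confirm whether the posted line is what was actually run.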
--
Sent from my phone. Please excuse my brevity.
On January 8, 2018 8:03:45 AM PST, Ek Esawi <esawiek at gmail.com> wrote:
>Thank you, Jeff. Your code works perfectly, as usual. I am just
>wondering why, if I put the whole code on one line, I get an error
>message.
>sdf2 <- lapply( sdf, function(z){z$Value
><-ifelse(is.na(z$Value),z$Value[!is.na(z$Value)][1],z$Value)z})
>error. unexpected symbol in sdf2
>
>Thanks again
>
>EK
>
>
>On Mon, Jan 8, 2018 at 3:12 AM, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
>> Upon closer examination I see that you are not using the split version of
>> df1 as I usually would, so here is a reproducible example:
>>
>> #----
>> df1 <- read.table( text=
>> "ID ID_2 Firist Value
>> 1 a aa TRUE 2
>> 2 a ab FALSE NA
>> 3 a ac FALSE NA
>> 4 b aa TRUE 5
>> 5 b ab FALSE NA
>> ", header=TRUE, as.is=TRUE )
>>
>> sdf <- split( df1, df1$ID )
>> # note the extra [ 1 ] in case you have more than one non-NA value
>> # per ID
>> sdf2 <- lapply( sdf
>> , function( z ) {
>> z$Value <- ifelse( is.na( z$Value )
>> , z$Value[ !is.na( z$Value ) ][ 1 ]
>> , z$Value
>> )
>> z
>> }
>> )
>> df2 <- do.call( rbind, sdf2 )
>> df2
>> #> ID ID_2 Firist Value
>> #> a.1 a aa TRUE 2
>> #> a.2 a ab FALSE 2
>> #> a.3 a ac FALSE 2
>> #> b.4 b aa TRUE 5
>> #> b.5 b ab FALSE 5
>>
>> # or using tidyverse methods
>>
>> library(dplyr)
>> #>
>> #> Attaching package: 'dplyr'
>> #> The following objects are masked from 'package:stats':
>> #>
>> #> filter, lag
>> #> The following objects are masked from 'package:base':
>> #>
>> #> intersect, setdiff, setequal, union
>> df3 <- ( df1
>> %>% group_by( ID )
>> %>% do({
>> mutate( .
>> , Value = ifelse( is.na( Value )
>> , Value[ !is.na( Value ) ][ 1 ]
>> , Value
>> )
>> )
>> })
>> %>% ungroup
>> )
>> df3
>> #> # A tibble: 5 x 4
>> #> ID ID_2 Firist Value
>> #> <chr> <chr> <lgl> <int>
>> #> 1 a aa T 2
>> #> 2 a ab F 2
>> #> 3 a ac F 2
>> #> 4 b aa T 5
>> #> 5 b ab F 5
>> #----
>>
>>
>> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
>>
>>> Why do you want to modify df1?
>>>
>>> Why not just reassemble the parts as a new data frame and use that going
>>> forward in your calculations? That is generally the preferred approach in R
>>> so you can re-do your calculations easily if you find a mistake later.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>>>>
>>>> I just came up with a solution right after I posted the question, but
>>>> I figured there must be a better and shorter one than my solution:
>>>> sdf1[[1]][1,4]<-lapplyresults[[1]]
>>>> sdf1[[2]][1,4]<-lapplyresults[[2]]
>>>>
>>>> EK
>>>>
>>>> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi <esawiek at gmail.com> wrote:
>>>>>
>>>>> Hi all--
>>>>>
>>>>> I stumbled on this problem online. I did not like the solution given
>>>>> there, which was a long UDF. I wondered why split and l/sapply could not
>>>>> work here. My aim is to split the data frame, use l/sapply, make
>>>>> changes on the split lists and combine the split lists into a new data
>>>>> frame with the desired changes/output.
>>>>>
>>>>> The data frame shown below has a column named ID which has two values,
>>>>> a and b; I want to replace the NAs in the Value column by 2, which is
>>>>> the only numeric entry, for ID=a and by 5 for ID=b.
>>>>>
>>>>> I worked out the solution but could not replace the results in the
>>>>> split lists.
>>>>>
>>>>>
>>>>> Original data frame, df1
>>>>> ID ID_2 Firist Value
>>>>> 1 a aa TRUE 2
>>>>> 2 a ab FALSE NA
>>>>> 3 a ac FALSE NA
>>>>> 4 b aa TRUE 5
>>>>> 5 b ab FALSE NA
>>>>> sdf1
>>>>> $a
>>>>> ID ID_2 Firist Value
>>>>> 1 a aa TRUE 2
>>>>> 2 a ab FALSE NA
>>>>> 3 a ac FALSE NA
>>>>> $b
>>>>> ID ID_2 Firist Value
>>>>> 4 b aa TRUE 5
>>>>> 5 b ab FALSE NA
>>>>> Desired results
>>>>> $a
>>>>> ID ID_2 Firist Value
>>>>> 1 a aa TRUE 2
>>>>> 2 a ab FALSE 2
>>>>> 3 a ac FALSE 2
>>>>>
>>>>> $b
>>>>> ID ID_2 Firist Value
>>>>> 4 b aa TRUE 5
>>>>> 5 b ab FALSE 5
>>>>>
>>>>> My code
>>>>>
>>>>> sdf <- split(df1,df1$ID)
>>>>> lapply(sdf, function(z)
>>>>> ifelse(is.na(z$Value),z$Value[!is.na(z$Value)],z$Value))
>>>>>
>>>>> result:
>>>>> $ a: num [1:3] 2 2 2
>>>>> $ b: num [1:2] 5 5
>>>>>
>>>>> How could I put these two lists back in the split data frame, sdf1?
>>>>> Then I could use do.call to reassemble a data frame from the split
>>>>> lists.
>>>>> Thanks,
>>>>> EK
>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------