[R] cleanup/replacing a value on condition of another value
Rui Barradas
Mon Oct 25 19:06:22 CEST 2021
Hello,
Here is a pipe that replaces the value based on the composite condition.
It uses ?base::replace with an integer index vector built by which().
The filter at the end is only there to show the changed value in context;
remove it and assign the resulting data.frame or tibble back to the input
to change the original.

library(dplyr)

data(coronavirus, package = "coronavirus")

coronavirus %>%
  select(country, date, type, cases) -> covid

covid %>%
  mutate(cases = replace(cases,
                         which(country == 'Namibia' &
                               date == '2021-10-23' &
                               cases == 357), NA)
  ) %>%
  filter(
    country == 'Namibia',
    date >= '2021-10-22' & date <= '2021-10-24'
  )

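
For anyone without the package installed, the same idea can be sketched in base R on a toy data frame. This is only an illustration: covid_toy and blank_suspect_value() are made-up names, the six rows are copied from the tibble quoted further down, and the extra type == "confirmed" test is my assumption about the "this particular type" requirement.

```r
# Toy stand-in for the 'covid' tibble; rows copied from the quoted output.
covid_toy <- data.frame(
  country = "Namibia",
  date    = as.Date(c("2021-10-24", "2021-10-24", "2021-10-23",
                      "2021-10-23", "2021-10-22", "2021-10-22")),
  type    = rep(c("confirmed", "death"), times = 3),
  cases   = c(23L, 4L, 357L, 1L, 30L, 1L)
)

# Hypothetical helper: set 'cases' to NA only where ALL four conditions hold.
blank_suspect_value <- function(df) {
  # which() turns the logical test into integer positions and drops NAs
  idx <- which(df$country == "Namibia" &
               df$date    == as.Date("2021-10-23") &
               df$type    == "confirmed" &
               df$cases   == 357L)
  df$cases <- replace(df$cases, idx, NA)
  df
}

fixed <- blank_suspect_value(covid_toy)
print(fixed)

# Simulate the upstream data being corrected: no row matches any more,
# which() returns integer(0) and replace() leaves the column untouched.
corrected <- covid_toy
corrected$cases[3] <- 23L
unchanged <- blank_suspect_value(corrected)
print(unchanged)
```

Since the index comes out empty once the 357 is gone, the replacement step can stay in the script after the dataset is fixed without altering anything.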
Hope this helps,
Rui Barradas
At 16:49 on 25/10/21, Dr Eberhard W Lisse wrote:
> Rui,
>
> that works for me too, but is not what I need to do.
>
> I want to make the 'cases' value for this particular country AND this
> particular date AND this particular type AND this particular value (ie
> ALL conditions must be fulfilled) become NA so that the tibble would
> change from
>
> [...]
> 2 Namibia 2021-10-24 death 4
> 3 Namibia 2021-10-23 confirmed 357
> 4 Namibia 2021-10-23 death 1
> [...]
>
> to
>
> [...]
> 2 Namibia 2021-10-24 death 4
> 3 Namibia 2021-10-23 confirmed NA
> 4 Namibia 2021-10-23 death 1
> [...]
>
> as long as they don't fix the dataset, and if/when they do it goes to
> the expected 23 value :-)-O
>
> greetings, el
>
> On 2021-10-25 17:26 , Rui Barradas wrote:
> > Hello,
> >
> > The following works with me.
> >
> >
> > library(coronavirus)
> > library(dplyr)
> >
> > data(coronavirus, package = "coronavirus")
> > #update_dataset(silence = FALSE)
> >
> > coronavirus %>%
> >   select(country, date, type, cases) %>%
> >   filter(
> >     country == 'Namibia',
> >     date == '2021-10-23',
> >     cases == 357
> >   )
> >
> >
> >
> > Can you post the pipe code you are running?
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 12:25 on 25/10/21, Dr Eberhard W Lisse wrote:
> >> Hi,
> >>
> >> I have data from JHU via the 'coronavirus' package which has a value for
> >> the confirmed cases for 2021-10-23 which differs drastically (357) from
> >> what is reported in country (23).
> >>
> >> # A tibble: 962 × 4
> >>   country date       type      cases
> >>   <chr>   <date>     <chr>     <int>
> >> 1 Namibia 2021-10-24 confirmed    23
> >> 2 Namibia 2021-10-24 death         4
> >> 3 Namibia 2021-10-23 confirmed   357
> >> 4 Namibia 2021-10-23 death         1
> >> 5 Namibia 2021-10-22 confirmed    30
> >> 6 Namibia 2021-10-22 death         1
> >> # … with 956 more rows
> >>
> >> I am using a '%>%' pipeline and am struggling to mutate 'cases' to NA
> >> using something like
> >>
> >> country == 'Namibia' & date == '2021-10-23' & cases == 357
> >>
> >> so that if or when the data-set is corrected I don't have to change the
> >> code (immediately), even after some googling.
> >>
> >> I can do
> >>
> >> cases == 357
> >>
> >> only, but that could find other cases as well, which is obviously not
> >> the thing to do
> >>
> >> Any suggestions?
> >>
> >> greetings, el
> >>
> >> ______________________________________________
> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
>