[R] alternative for multiple if_else statements
Eric Berger
ericjberger at gmail.com
Thu Feb 22 11:04:16 CET 2018
Hi,
1. I think the different ordering leads to different results for the
following reason. An expression of the form
date[ some condition is true ][1]
gives you an NA if there are no rows where 'some condition' holds. Any
comparison against that NA is itself NA, and if_else() returns NA wherever
its condition is NA. In the code that 'works' the first conditions tested
presumably don't hit such a situation, but in the code that 'does not
work' you presumably hit an NA before you get to the result that you
really want.
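To illustrate point 1, here is a minimal sketch on made-up data (the vectors
and the name start_2007 are hypothetical, not taken from your data). It uses
base ifelse() so that it runs without extra packages; dplyr's if_else() with
its default `missing` argument likewise returns NA where the condition is NA.

```r
# Toy data (hypothetical): visits in 2014 only, so no row has year == 2007.
date <- as.Date(c("2014-01-08", "2014-05-05", "2014-06-10"))
year <- c(2014L, 2014L, 2014L)

# Subsetting with an all-FALSE condition gives a zero-length Date vector,
# and taking element [1] of a zero-length vector yields NA.
start_2007 <- date[year == 2007][1]

# Every comparison against that NA is NA as well.
cond <- date >= start_2007

# With an NA condition the result is NA, whatever the "false" branch holds,
# so later (nested) branches never get a chance to supply a value.
res <- ifelse(cond, "survey_2007", "some later branch")

print(start_2007)
print(cond)
print(res)
```

So if the first test in a nested chain involves a year with no matching rows,
the NA can mask everything nested inside it.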
2. I am not a big fan of your "nested if" layout. I think you could rewrite
it more clearly, and without nesting, with something like
> trialData$survey_year <- rep(NA_character_, nrow(trialData))
> trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
> trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
> etc
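A possible concrete version of this idea, as a sketch only: the data frame
`toy` and its values are made up, and I am assuming a simplified rule in which
each survey year runs from its "Y" row until the next "Y" row. The fixed-month
cutoffs from your code (e.g. month == 4 of the following year) are left out and
would need to be added to the condition as an extra `& date < end` term.

```r
# Toy stand-in for trialData (hypothetical values, a single participant).
toy <- data.frame(
  date = as.Date(c("2014-01-08", "2014-05-05", "2014-09-01",
                   "2015-02-10", "2016-03-01", "2016-04-12")),
  survey_start = c("", "Y", "", "", "Y", ""),
  stringsAsFactors = FALSE
)
toy$year <- as.integer(format(toy$date, "%Y"))

# Start with NA everywhere, then fill in one survey year at a time.
toy$survey_year <- rep(NA_character_, nrow(toy))

for (yr in 2007:2016) {
  # First date flagged as a survey start in this calendar year (NA if none).
  start <- toy$date[toy$survey_start == "Y" & toy$year == yr][1]

  # No survey start recorded for this year: skip it instead of letting
  # the NA leak into the comparison below.
  if (is.na(start)) next

  # Label every row on or after the start; later years overwrite earlier ones.
  toy$survey_year[toy$date >= start] <- paste0("survey_", yr)
}

print(toy)
```

Because each year is handled in its own step, a year with no start row (2015
in this toy example) is simply skipped and cannot affect the other years. For
several participants the same loop would have to be run per studyno.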
HTH,
Eric
On Wed, Feb 21, 2018 at 10:33 PM, Kevin Wamae <KWamae at kemri-wellcome.org>
wrote:
> Hi, I am having trouble figuring out why if_else is behaving the way it
> is; it may be my code or the way the data is structured.
>
> Below is a snapshot of a database I am working on; it represents a
> longitudinal survey of study participants in a trial with weekly follow-up.
>
> The variable "survey_start" represents the start of the study-defined one
> year follow up (which we called "survey_year").
>
> I am trying to populate all subsequent entries for each participant, per
> survey year, with the entry "survey" followed by an underscore and the
> respective year, e.g. survey_2014.
>
> There are missing entries; for example, the participant represented here
> wasn't available at the start of the 2015 survey. Also, some participants
> don't have complete one-year follow-ups, but I still need to include them.
>
> I have written two versions of the code: the first one fails while the
> second works. The only difference is that I have reversed the order in
> which the entries are populated in the second version (from 2007-2016 to
> 2016-2007) and removed the if_else statement for 2015. I also noticed that
> the second version, which spans the years 2007-2016 (less 2015), fails if
> a participant's entries start from 2010-2016.
>
> Kindly assist in figuring this out...or better yet, an alternative.
>
> trialData <- structure(list(study = c("site_1", "site_1", "site_1",
> "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1"), date = structure(c(16078, 16085, 16092,
> 16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
> 16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
> 16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
> 16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
> 16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
> 16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
> 16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
> 16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
> 16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
> 16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
> 16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
> 16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
> 16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
> 16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
> 2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
> 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
> 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
> 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
> 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
> 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
> 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
> 12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
> 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
> 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
> 11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
> 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
> 6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
> "", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
> -125L), .Names = c("study", "studyno", "date", "year", "month",
> "survey_start"))
>
>
> code 1 fails:
>
> trialData <- trialData %>%
>   arrange(studyno, date) %>%
>   group_by(studyno) %>%
>   mutate(survey_year =
>     if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2008 & study == "site_1"][1],
>             "survey_2007",
>     if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2009 & study == "site_1"][1],
>             "survey_2008",
>     if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2010 & study == "site_1"][1],
>             "survey_2009",
>     if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2011 & study == "site_1"][1],
>             "survey_2010",
>     if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2012 & study == "site_1"][1],
>             "survey_2011",
>     if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2013 & study == "site_1"][1],
>             "survey_2012",
>     if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2014 & study == "site_1"][1],
>             "survey_2013",
>     if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2015 & study == "site_1"][1],
>             "survey_2014",
>     if_else(date >= date[survey_start == "Y" & year == 2015 & study == "site_1"][1] &
>             date < date[month == 3 & year == 2016 & study == "site_1"][1],
>             "survey_2015",
>     if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1],
>             "survey_2016", "")))))))))))
>
> code 2 works:
>
> trialData <- trialData %>%
>   arrange(studyno, date) %>%
>   group_by(studyno) %>%
>   mutate(survey_year =
>     if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1],
>             "survey_2016",
>     if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2015 & study == "site_1"][1],
>             "survey_2014",
>     if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2014 & study == "site_1"][1],
>             "survey_2013",
>     if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2013 & study == "site_1"][1],
>             "survey_2012",
>     if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2012 & study == "site_1"][1],
>             "survey_2011",
>     if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2011 & study == "site_1"][1],
>             "survey_2010",
>     if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2010 & study == "site_1"][1],
>             "survey_2009",
>     if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] &
>             date < date[month == 4 & year == 2009 & study == "site_1"][1],
>             "survey_2008",
>     if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] &
>             date < date[month == 5 & year == 2008 & study == "site_1"][1],
>             "survey_2007", ""))))))))))
>