[R] Looping Through DataFrames with Differing Lengths

Paul Bernal paulbernal07 at gmail.com
Tue Mar 28 16:32:15 CEST 2017


Dear Bo Lin,

I tried doing
Containerdata$TransitDate<-as.Date(Containerdata$TransitDate, "%e-%B-%y")
but I keep getting NAs.

I also tried a solution that I saw in stackoverflow doing:

> lct<-Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
[1] "C"
>
> Sys.setlocale("LC_TIME", lct)
[1] "English_United States.1252"

but it didn't work.

Any other suggestion?

Thank you for your valuable help,

Regards,

Paul
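
One thing to check: in the transcript above the locale is switched to "C" and then restored straight away, so no date is parsed while "C" is in effect. Below is a minimal sketch of the intended pattern. It assumes the dates look like "1-Oct-85"; the actual contents of Container.csv are not shown in this thread, so the sample values and the helper name parse_transit_date are hypothetical. The column is converted to character first in case read.csv returned a factor.

```r
# Hypothetical sample values; the real TransitDate column is not shown.
transit_raw <- factor(c("1-Oct-85", "15-Nov-85", "1-Mar-17"))

# Parse day-month-year strings with English month names, whatever the
# session locale is. The previous locale is restored when the function exits.
parse_transit_date <- function(x, format = "%d-%b-%y") {
  old_locale <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old_locale), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(as.character(x), format = format)
}

transit_date <- parse_transit_date(transit_raw)
format(transit_date, "%Y-%m-%d")
```

If NAs persist, printing head(as.character(Containerdata$TransitDate)) shows the exact layout of the strings, which is what the format argument has to match.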

2017-03-28 9:19 GMT-05:00 Ng Bo Lin <ngbolin91 at gmail.com>:

> Hi Paul,
>
> The date format that you have supplied to R isn’t exactly right.
>
> Instead of supplying the format “%Y-%m-%d”, it appears that the format of
> your data adheres to the “%e-%B-%y” format. In this case, %e refers to the
> day of the month (1 - 31), %B refers to the month name (on input it matches
> the abbreviated name as well as the full one), and %y refers to the year in
> two-digit form.
>
> Hope this helps!
>
> Thank you.
>
> Regards,
> Bo Lin
>
> On 28 Mar 2017, at 10:12 PM, Paul Bernal <paulbernal07 at gmail.com> wrote:
>
> Dear friends Ng Bo Lin, Mark and Ulrik, thank you all for your kind and
> valuable replies,
>
> I am trying to reformat a date as follows:
>
> Data<-read.csv("Container.csv")
>
> DataFrame<-data.frame(Data)
>
> DataFrame$TransitDate<-as.Date(DataFrame$TransitDate, "%Y-%m-%d")
>
> #trying to put it in YYYY-MM-DD format
>
> However, when I do this, I get a bunch of NAs for the dates.
>
> I am providing a sample dataset as a reference.
>
> Any help will be greatly appreciated,
>
> Best regards,
>
> Paul
>
> 2017-03-28 8:15 GMT-05:00 Ng Bo Lin <ngbolin91 at gmail.com>:
>
>> Hi Paul,
>>
>> Using the example provided by Ulrik, where
>>
>> > exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
>> "1986-01-01"), Transits = c(NA, NA, NA, NA))
>> > exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"), Transits =
>> c(15, 20))
>>
>> You could also try the following loop:
>>
>> for (i in seq_len(nrow(exdf1))) {
>>         if (!exdf1[i, 1] %in% exdf2[, 1]) {
>>                 # Date not yet in exdf2: append the row (Transits is NA)
>>                 exdf2 <- rbind(exdf2, exdf1[i, ])
>>         }
>> }
>>
>> Basically, the loop runs through the rows of exdf1 and checks whether the
>> Date of each row already exists in the Date column of exdf2. If so, it
>> skips the row. Otherwise, it binds the row to exdf2.
>>
>> Hope this helps!
>>
>>
>> Side note: in terms of computational efficiency, I think Ulrik’s answer is
>> probably better. Presentation-wise, his is also much better.
>>
>> Regards,
>> Bo Lin
>>
>> > On 28 Mar 2017, at 5:22 PM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>> >
>> > Hi Paul,
>> >
>> > does this do what you want?
>> >
>> > exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
>> > "1986-01-01"), Transits = c(NA, NA, NA, NA))
>> > exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"), Transits = c(15, 20))
>> >
>> > tmpdf <- subset(exdf1, !Date %in% exdf2$Date)
>> >
>> > rbind(exdf2, tmpdf)
>> >
>> > HTH,
>> > Ulrik
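
The rbind() result above keeps the rows of exdf2 first and appends the missing dates at the end. Here is a short sketch of putting the combined rows back into date order, assuming the Date column holds YYYY-MM-DD strings as in the example. stringsAsFactors = FALSE is set explicitly so the column is character on any R version, and the name combined is made up for illustration.

```r
exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01", "1986-01-01"),
                    Transits = NA_real_, stringsAsFactors = FALSE)
exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
                    Transits = c(15, 20), stringsAsFactors = FALSE)

# Rows of exdf1 whose Date is not already present in exdf2
tmpdf <- subset(exdf1, !Date %in% exdf2$Date)

# Append them, then sort the combined rows chronologically
combined <- rbind(exdf2, tmpdf)
combined <- combined[order(as.Date(combined$Date)), ]
rownames(combined) <- NULL
combined
```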
>> >
>> > On Tue, 28 Mar 2017 at 10:50 Paul Bernal <paulbernal07 at gmail.com> wrote:
>> >
>> > Dear friend Mark,
>> >
>> > Great suggestion! Thank you for replying.
>> >
>> > I have two dataframes, dataframe1 and dataframe2.
>> >
>> > dataframe1 has two columns, one with the dates in YYYY-MM-DD format and
>> > the other column with the number of transits (all of which were set to NA).
>> > dataframe1 starts on 1985-10-01 (October 1st, 1985) and ends on
>> > 2017-03-01 (March 1st, 2017).
>> >
>> > dataframe2 has the same two columns, one with the dates in YYYY-MM-DD
>> > format, and the other column with the number of transits. dataframe2 has
>> > the same start and end dates; however, dataframe2 has missing dates
>> > between the start and end dates, so it has fewer observations.
>> >
>> > dataframe1 has a total of 378 observations and dataframe2 has a total of
>> > 362 observations.
>> >
>> > I would like to come up with code that could do the following:
>> >
>> > Get the dates of dataframe1 that are missing in dataframe2 and add them as
>> > records to dataframe2, but with NA values.
>> >
>> > <dataframe1                    <dataframe2
>> >
>> > Date          Transits         Date          Transits
>> > 1985-10-01    NA               1985-10-01    15
>> > 1985-11-01    NA               1986-01-01    20
>> > 1985-12-01    NA               1986-02-01    5
>> > 1986-01-01    NA
>> > 1986-02-01    NA
>> > 2017-03-01    NA
>> >
>> > I would like to fill in the missing dates in dataframe2, with NA as the
>> > value for the missing transits, so that I could end up with a dataframe3
>> > looking as follows:
>> >
>> > <dataframe3
>> > Date          Transits
>> > 1985-10-01    15
>> > 1985-11-01    NA
>> > 1985-12-01    NA
>> > 1986-01-01    20
>> > 1986-02-01    5
>> > 2017-03-01    NA
>> >
>> > This is what I want to accomplish.
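
The dataframe3 layout described above can also be produced with a left join, sketched here with base R's merge() on the small example tables from this message. The variables stand in for the real 378-row and 362-row data frames, and the Date columns are assumed to be character strings in YYYY-MM-DD form. The all-NA Transits column of dataframe1 is left out because only its dates are needed.

```r
dataframe1 <- data.frame(
  Date = c("1985-10-01", "1985-11-01", "1985-12-01",
           "1986-01-01", "1986-02-01", "2017-03-01"),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  Date = c("1985-10-01", "1986-01-01", "1986-02-01"),
  Transits = c(15, 20, 5),
  stringsAsFactors = FALSE
)

# Keep every date from dataframe1; dates absent from dataframe2 get NA.
# merge() sorts the result by the key column by default.
dataframe3 <- merge(dataframe1, dataframe2, by = "Date", all.x = TRUE)
dataframe3
```

Because the result has one row per date in dataframe1, the two inputs no longer need to have the same length.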
>> >
>> > Thanks beforehand for your help,
>> >
>> > Best regards,
>> >
>> > Paul
>> >
>> >
>> > 2017-03-27 15:15 GMT-05:00 Mark Sharp <msharp at txbiomed.org>:
>> >
>> >> Make some small dataframes of just a few rows that illustrate the problem
>> >> structure. Make a third that has the result you want. You will get an
>> >> answer very quickly. Without a self-contained reproducible problem,
>> >> results vary.
>> >>
>> >> Mark
>> >> R. Mark Sharp, Ph.D.
>> >> msharp at TxBiomed.org
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>> On Mar 27, 2017, at 3:09 PM, Paul Bernal <paulbernal07 at gmail.com> wrote:
>> >>>
>> >>> Dear friends,
>> >>>
>> >>> I have one dataframe which contains 378 observations, and another one,
>> >>> containing 362 observations.
>> >>>
>> >>> Both dataframes have two columns, one date column and another one with
>> >>> the number of transits.
>> >>>
>> >>> I wanted to come up with code so that I could fill in the dates that are
>> >>> missing in one of the dataframes and replace the column of transits with
>> >>> the value NA.
>> >>>
>> >>> I have tried several things but R obviously complains that the lengths of
>> >>> the dataframes are different.
>> >>>
>> >>> How can I solve this?
>> >>>
>> >>> Any guidance will be greatly appreciated,
>> >>>
>> >>> Best regards,
>> >>>
>> >>> Paul
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >
>>
>>
> <Container.csv>
>
>
>
