[R] Looping Through DataFrames with Differing Lengths
Paul Bernal
paulbernal07 at gmail.com
Tue Mar 28 16:40:43 CEST 2017
Dear friend David,
Thank you for your valuable suggestion. So here is the file in .txt format.
Best of regards,
Paul
2017-03-28 9:35 GMT-05:00 David L Carlson <dcarlson at tamu.edu>:
> We did not get the file on the list. You need to rename your file to
> "Container.txt" or the mailing list will strip it from your message. The
> read.csv() function returns a data frame so Data is already a data frame.
> The command DataFrame<-data.frame(Data) just makes a copy of Data.
>
> Without the file, it is difficult to be certain, but your dates are
> probably stored as character strings and read.csv() will turn those to
> factors unless you tell it not to do that. Try
>
> Data<-read.csv("Container.csv", stringsAsFactors=FALSE)
> str(Data) # To see how the dates are stored
>
> and see if things work better. If not, rename the file or use dput(Data)
> and copy the result into your email message. If the data is very long, use
> dput(head(Data, 15)).
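
For reference, the sample attached at the end of this message stores the dates as text such as 1-Oct-85. That would explain the NAs: the format string passed to as.Date() has to describe the text being read, not the layout wanted afterwards. The following is only a sketch under a few assumptions: csv_text is a made-up three-row stand-in for Container.csv, the file is assumed to be comma-separated, and the month abbreviations are assumed to be English (hence the locale call).

```r
# %b (abbreviated month name) is locale-dependent; the "C" locale uses English names.
Sys.setlocale("LC_TIME", "C")

# Hypothetical inline stand-in for Container.csv (first three rows of the sample).
csv_text <- "TransitDate,Transits
1-Oct-85,4
1-Nov-85,4
1-Dec-85,5"

Data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(Data)  # TransitDate should be listed as chr, not Factor

# A format that does not match the text yields NA for every row.
wrong <- as.Date(Data$TransitDate, format = "%Y-%m-%d")

# %d = day of month, %b = abbreviated month name, %y = two-digit year
# (R reads 69-99 as 19xx and 00-68 as 20xx).
Data$TransitDate <- as.Date(Data$TransitDate, format = "%d-%b-%y")

# Character form in YYYY-MM-DD layout, if a string is needed.
iso <- format(Data$TransitDate, "%Y-%m-%d")
print(iso)
```

Once the column is of class Date it prints as YYYY-MM-DD by default, so no further reformatting should be needed.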
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Paul Bernal
> Sent: Tuesday, March 28, 2017 9:12 AM
> To: Ng Bo Lin <ngbolin91 at gmail.com>
> Cc: r-help at r-project.org
> Subject: Re: [R] Looping Through DataFrames with Differing Lengths
>
> Dear friends Ng Bo Lin, Mark and Ulrik, thank you all for your kind and
> valuable replies,
>
> I am trying to reformat a date as follows:
>
> Data<-read.csv("Container.csv")
>
> DataFrame<-data.frame(Data)
>
> DataFrame$TransitDate<-as.Date(DataFrame$TransitDate, "%Y-%m-%d")
>
> #trying to put it in YYYY-MM-DD format
>
> However, when I do this, I get a bunch of NAs for the dates.
>
> I am providing a sample dataset as a reference.
>
> Any help will be greatly appreciated,
>
> Best regards,
>
> Paul
>
> 2017-03-28 8:15 GMT-05:00 Ng Bo Lin <ngbolin91 at gmail.com>:
>
> > Hi Paul,
> >
> > Using the example provided by Ulrik, where
> >
> > > exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
> > > "1986-01-01"), Transits = c(NA, NA, NA, NA))
> > > exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
> > > Transits = c(15, 20)),
> >
> > You could also try the following function:
> >
> > for (i in seq_len(nrow(exdf1))) {
> >   # append the row only if its Date is not already in exdf2
> >   if (!exdf1[i, 1] %in% exdf2[, 1]) {
> >     exdf2 <- rbind(exdf2, exdf1[i, ])
> >   }
> > }
> >
> > Basically, the loop runs through the rows of exdf1 and checks whether the
> > Date of each row already exists in the Date column of exdf2. If so, it
> > skips that row. Otherwise, it binds the row to exdf2.
> >
> > Hope this helps!
> >
> >
> > Side note: in terms of computational efficiency, I think Ulrik's answer
> > is probably better. It also reads much more cleanly.
> >
> > Regards,
> > Bo Lin
> >
> > > On 28 Mar 2017, at 5:22 PM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
> > >
> > > Hi Paul,
> > >
> > > does this do what you want?
> > >
> > > exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
> > > "1986-01-01"), Transits = c(NA, NA, NA, NA))
> > > exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
> > > Transits = c(15, 20))
> > >
> > > tmpdf <- subset(exdf1, !Date %in% exdf2$Date)
> > >
> > > rbind(exdf2, tmpdf)
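
The rbind() above appends the missing dates at the bottom, whereas the dataframe3 requested further down the thread is in date order. A possible extension, offered as a sketch rather than as part of Ulrik's answer: sort the combined rows by date afterwards. The name combined and the stringsAsFactors = FALSE arguments are additions for this illustration.

```r
exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01", "1986-01-01"),
                    Transits = c(NA, NA, NA, NA),
                    stringsAsFactors = FALSE)
exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
                    Transits = c(15, 20),
                    stringsAsFactors = FALSE)

# Rows of exdf1 whose Date does not occur in exdf2
tmpdf <- subset(exdf1, !Date %in% exdf2$Date)

# Append the missing rows, then put everything in chronological order
combined <- rbind(exdf2, tmpdf)
combined <- combined[order(as.Date(combined$Date)), ]
rownames(combined) <- NULL  # drop the row names left over from subsetting

print(combined)
```

Converting with as.Date() before ordering is a precaution; ISO-formatted strings happen to sort correctly as text, but other date layouts would not.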
> > >
> > > HTH,
> > > Ulrik
> > >
> > > On Tue, 28 Mar 2017 at 10:50 Paul Bernal <paulbernal07 at gmail.com> wrote:
> > >
> > > Dear friend Mark,
> > >
> > > Great suggestion! Thank you for replying.
> > >
> > > I have two dataframes, dataframe1 and dataframe2.
> > >
> > > dataframe1 has two columns, one with the dates in YYYY-MM-DD format
> > > and the other column with the number of transits (all of which were
> > > set to NA). dataframe1 starts on 1985-10-01 (October 1st, 1985) and
> > > ends on 2017-03-01 (March 1st, 2017).
> > >
> > > dataframe2 has the same two columns, one with the dates in YYYY-MM-DD
> > > format and the other with the number of transits. dataframe2 has the
> > > same start and end dates; however, it is missing some dates between
> > > them, so it has fewer observations.
> > >
> > > dataframe1 has a total of 378 observations and dataframe2 has a total
> > > of 362 observations.
> > >
> > > I would like to come up with code that does the following:
> > >
> > > Get the dates of dataframe1 that are missing in dataframe2 and add them
> > > as records to dataframe2, with NA as the number of transits.
> > >
> > > dataframe1                 dataframe2
> > >
> > > Date        Transits       Date        Transits
> > > 1985-10-01  NA             1985-10-01  15
> > > 1985-11-01  NA             1986-01-01  20
> > > 1985-12-01  NA             1986-02-01  5
> > > 1986-01-01  NA
> > > 1986-02-01  NA
> > > 2017-03-01  NA
> > >
> > > I would like to fill in the missing dates in dataframe2, with NA as the
> > > value for the missing transits, so that I end up with a dataframe3
> > > that looks as follows:
> > >
> > > dataframe3
> > >
> > > Date        Transits
> > > 1985-10-01  15
> > > 1985-11-01  NA
> > > 1985-12-01  NA
> > > 1986-01-01  20
> > > 1986-02-01  5
> > > 2017-03-01  NA
> > >
> > > This is what I want to accomplish.
> > >
> > > Thanks, beforehand for your help,
> > >
> > > Best regards,
> > >
> > > Paul
> > >
> > >
> > > 2017-03-27 15:15 GMT-05:00 Mark Sharp <msharp at txbiomed.org>:
> > >
> > >> Make some small dataframes of just a few rows that illustrate the
> > >> problem structure. Make a third that has the result you want. You will
> > >> get an answer very quickly. Without a self-contained reproducible
> > >> problem, results vary.
> > >>
> > >> Mark
> > >> R. Mark Sharp, Ph.D.
> > >> msharp at TxBiomed.org
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>> On Mar 27, 2017, at 3:09 PM, Paul Bernal <paulbernal07 at gmail.com> wrote:
> > >>>
> > >>> Dear friends,
> > >>>
> > >>> I have one dataframe which contains 378 observations, and another
> > >>> one containing 362 observations.
> > >>>
> > >>> Both dataframes have two columns, one date column and another one
> > >>> with the number of transits.
> > >>>
> > >>> I wanted to come up with some code to fill in the dates that are
> > >>> missing in one of the dataframes and set the transits column to NA
> > >>> for those dates.
> > >>>
> > >>> I have tried several things, but R complains that the lengths of the
> > >>> dataframes are different.
> > >>>
> > >>> How can I solve this?
> > >>>
> > >>> Any guidance will be greatly appreciated,
> > >>>
> > >>> Best regards,
> > >>>
> > >>> Paul
> > >>>
> > >>> [[alternative HTML version deleted]]
> > >>>
> > >>> ______________________________________________
> > >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >>
> > >
>
-------------- next part --------------
TransitDate Transits
1-Oct-85 4
1-Nov-85 4
1-Dec-85 5
1-Jan-86 4
1-Feb-86 3
1-Mar-86 6
1-Apr-86 4
1-May-86 3
1-Jun-86 4
1-Jul-86 5
1-Aug-86 5
1-Sep-86 4
1-Oct-86 4
1-Nov-86 5
1-Dec-86 2
1-Feb-88 1
1-Mar-88 1
1-Apr-88 2
1-May-88 2
1-Jul-88 1
1-Aug-88 1
1-Sep-88 1
1-Oct-88 2
1-Dec-88 2
1-Jan-89 3
1-Mar-89 2
1-Apr-89 3
1-May-89 4
1-Jun-89 3
1-Jul-89 3
1-Aug-89 2
1-Sep-89 5
1-Oct-89 3
1-Nov-89 3
1-Dec-89 4
1-Jan-90 6
1-Feb-90 4
1-Mar-90 6
1-Apr-90 3
1-May-90 7
1-Jun-90 7
1-Jul-90 3
1-Aug-90 6
1-Sep-90 5
1-Oct-90 6
1-Nov-90 7
1-Dec-90 6
1-Jan-91 5
1-Feb-91 7
1-Mar-91 7
1-Apr-91 7
1-May-91 8
1-Jun-91 7
1-Jul-91 7
1-Aug-91 8
1-Sep-91 9
1-Oct-91 8
1-Nov-91 8
1-Dec-91 9
1-Jan-92 10
1-Feb-92 8
1-Mar-92 8
1-Apr-92 7
1-May-92 9
1-Jun-92 8
1-Jul-92 12
1-Aug-92 12
1-Sep-92 11
1-Oct-92 12
1-Nov-92 12
1-Dec-92 11
1-Jan-93 13
1-Feb-93 10
1-Mar-93 11
1-Apr-93 12
1-May-93 15
1-Jun-93 14
1-Jul-93 12
1-Aug-93 14
1-Sep-93 11
1-Oct-93 16
1-Nov-93 10
1-Dec-93 14
1-Jan-94 12
1-Feb-94 14
1-Mar-94 14
1-Apr-94 16
1-May-94 15
1-Jun-94 14
1-Jul-94 16
1-Aug-94 16
1-Sep-94 14
1-Oct-94 17
1-Nov-94 14
1-Dec-94 14
1-Jan-95 16
1-Feb-95 18
1-Mar-95 15
1-Apr-95 17
1-May-95 19
1-Jun-95 21
1-Jul-95 23
1-Aug-95 24
1-Sep-95 21
1-Oct-95 24
1-Nov-95 20
1-Dec-95 26
1-Jan-96 22
1-Feb-96 21
1-Mar-96 25
1-Apr-96 23
1-May-96 24
1-Jun-96 24
1-Jul-96 22
1-Aug-96 25
1-Sep-96 24
1-Oct-96 24
1-Nov-96 25
1-Dec-96 25
1-Jan-97 25
1-Feb-97 20
1-Mar-97 26
1-Apr-97 22
1-May-97 26
1-Jun-97 24
1-Jul-97 21
1-Aug-97 27
1-Sep-97 23
1-Oct-97 25
1-Nov-97 25
1-Dec-97 26
1-Jan-98 25
1-Feb-98 20
1-Mar-98 25
1-Apr-98 19
1-May-98 28
1-Jun-98 24
1-Jul-98 25
1-Aug-98 25
1-Sep-98 26
1-Oct-98 28
1-Nov-98 25
1-Dec-98 26
1-Jan-99 28
1-Feb-99 24
1-Mar-99 26
1-Apr-99 26
1-May-99 30
1-Jun-99 24
1-Jul-99 28
1-Aug-99 26
1-Sep-99 24
1-Oct-99 29
1-Nov-99 27
1-Dec-99 25
1-Jan-00 29
1-Feb-00 25
1-Mar-00 29
1-Apr-00 25
1-May-00 31
1-Jun-00 24
1-Jul-00 36
1-Aug-00 29
1-Sep-00 30
1-Oct-00 37
1-Nov-00 34
1-Dec-00 42
1-Jan-01 41
1-Feb-01 37
1-Mar-01 42
1-Apr-01 43
1-May-01 46
1-Jun-01 49
1-Jul-01 41
1-Aug-01 50
1-Sep-01 46
1-Oct-01 47
1-Nov-01 49
1-Dec-01 56
1-Jan-02 55
1-Feb-02 54
1-Mar-02 55
1-Apr-02 59
1-May-02 60
1-Jun-02 58
1-Jul-02 66
1-Aug-02 68
1-Sep-02 66
1-Oct-02 68
1-Nov-02 67
1-Dec-02 79
1-Jan-03 73
1-Feb-03 71
1-Mar-03 85
1-Apr-03 79
1-May-03 80
1-Jun-03 82
1-Jul-03 86
1-Aug-03 78
1-Sep-03 86
1-Oct-03 81
1-Nov-03 90
1-Dec-03 93
1-Jan-04 95
1-Feb-04 84
1-Mar-04 93
1-Apr-04 88
1-May-04 92
1-Jun-04 99
1-Jul-04 90
1-Aug-04 105
1-Sep-04 99
1-Oct-04 103
1-Nov-04 97
1-Dec-04 97
1-Jan-05 106
1-Feb-05 95
1-Mar-05 102
1-Apr-05 98
1-May-05 117
1-Jun-05 100
1-Jul-05 111
1-Aug-05 115
1-Sep-05 111
1-Oct-05 116
1-Nov-05 120
1-Dec-05 118
1-Jan-06 126
1-Feb-06 107
1-Mar-06 128
1-Apr-06 123
1-May-06 140
1-Jun-06 135
1-Jul-06 142
1-Aug-06 138
1-Sep-06 147
1-Oct-06 149
1-Nov-06 146
1-Dec-06 153
1-Jan-07 143
1-Feb-07 131
1-Mar-07 134
1-Apr-07 132
1-May-07 143
1-Jun-07 137
1-Jul-07 152
1-Aug-07 146
1-Sep-07 152
1-Oct-07 153
1-Nov-07 141
1-Dec-07 142
1-Jan-08 130
1-Feb-08 122
1-Mar-08 124
1-Apr-08 127
1-May-08 138
1-Jun-08 126
1-Jul-08 138
1-Aug-08 142
1-Sep-08 137
1-Oct-08 137
1-Nov-08 139
1-Dec-08 130
1-Jan-09 134
1-Feb-09 115
1-Mar-09 122
1-Apr-09 129
1-May-09 130
1-Jun-09 122
1-Jul-09 117
1-Aug-09 114
1-Sep-09 119
1-Oct-09 112
1-Nov-09 102
1-Dec-09 98
1-Jan-10 92
1-Feb-10 86
1-Mar-10 108
1-Apr-10 95
1-May-10 109
1-Jun-10 110
1-Jul-10 109
1-Aug-10 118
1-Sep-10 115
1-Oct-10 123
1-Nov-10 110
1-Dec-10 117
1-Jan-11 114
1-Feb-11 110
1-Mar-11 114
1-Apr-11 120
1-May-11 131
1-Jun-11 122
1-Jul-11 124
1-Aug-11 133
1-Sep-11 129
1-Oct-11 133
1-Nov-11 133
1-Dec-11 126
1-Jan-12 137
1-Feb-12 110
1-Mar-12 128
1-Apr-12 127
1-May-12 132
1-Jun-12 127
1-Jul-12 150
1-Aug-12 136
1-Sep-12 135
1-Oct-12 140
1-Nov-12 124
1-Dec-12 136
1-Jan-13 136
1-Feb-13 127
1-Mar-13 128
1-Apr-13 130
1-May-13 132
1-Jun-13 128
1-Jul-13 122
1-Aug-13 130
1-Sep-13 124
1-Oct-13 129
1-Nov-13 117
1-Dec-13 108
1-Jan-14 115
1-Feb-14 104
1-Mar-14 120
1-Apr-14 117
1-May-14 122
1-Jun-14 109
1-Jul-14 111
1-Aug-14 116
1-Sep-14 117
1-Oct-14 115
1-Nov-14 110
1-Dec-14 106
1-Jan-15 109
1-Feb-15 93
1-Mar-15 111
1-Apr-15 107
1-May-15 120
1-Jun-15 113
1-Jul-15 131
1-Aug-15 127
1-Sep-15 120
1-Oct-15 124
1-Nov-15 123
1-Dec-15 117
1-Jan-16 132
1-Feb-16 117
1-Mar-16 124
1-Apr-16 121
1-May-16 122
1-Jun-16 114
1-Jul-16 99
1-Aug-16 76
1-Sep-16 60
1-Oct-16 64
1-Nov-16 47
1-Dec-16 54
1-Jan-17 48
1-Feb-17 41
1-Mar-17 30
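
The sample above contains gaps; for example, 1-Dec-86 is followed directly by 1-Feb-88. One way to expose such gaps as NA rows, sketched here under stated assumptions: build the complete monthly sequence between the first and last date and left-join the data onto it. The four rows below are copied from the sample; the names sample_rows, all_months and filled are illustrative only, and English month abbreviations are assumed.

```r
# %b is locale-dependent; the "C" locale uses English month abbreviations.
Sys.setlocale("LC_TIME", "C")

# Four rows from the sample, on either side of the first gap.
sample_rows <- data.frame(
  TransitDate = c("1-Nov-86", "1-Dec-86", "1-Feb-88", "1-Mar-88"),
  Transits    = c(5, 2, 1, 1),
  stringsAsFactors = FALSE
)
sample_rows$TransitDate <- as.Date(sample_rows$TransitDate, format = "%d-%b-%y")

# Every first-of-month between the earliest and the latest date.
all_months <- data.frame(
  TransitDate = seq(min(sample_rows$TransitDate),
                    max(sample_rows$TransitDate),
                    by = "month")
)

# Left join: months absent from the sample receive Transits = NA.
filled <- merge(all_months, sample_rows, by = "TransitDate", all.x = TRUE)

print(nrow(filled))                 # number of months in the full range
print(sum(is.na(filled$Transits)))  # number of months that were missing
```

Applied to the whole file, the same steps should yield one row per month from October 1985 to March 2017, with NA wherever no transits were recorded.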