[R] Problem with adding a row in a data table

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sun Sep 4 22:56:05 CEST 2016


My suggested approach:

dta <- structure(list(Prod_name = c("Banana", "Apple", "Orange", "Yoghurt",
"Eggs", "Milk", "Day_num"), X1.1.2000 = c("1", "0", "4", "3",
"6", "2", "1"), X2.1.2000 = c("2", "4", "1", "5", "3", "0", "2"
), X3.1.2000 = c("1", "5", "2", "3", "0", "4", "3"), X4.1.2000 = c("2",
"4", "4", "1", "0", "0", "4"), X5.1.2000 = c("0", "0", "1", "0",
"2", "3", "5"), X6.1.2000 = c("1", "3", "2", "1", "4", "1", "6"
), X7.1.2000 = c("5", "4", "5", "2", "2", "1", "7")), .Names = c("Prod_name",
"X1.1.2000", "X2.1.2000", "X3.1.2000", "X4.1.2000", "X5.1.2000",
"X6.1.2000", "X7.1.2000"), row.names = c(NA, 7L), class = "data.frame")

# The Day_num values ARE NOT data you will be aggregating and
# should not be in the data frame with meaningful values.
dta <- dta[ 1:6, ] # forget last garbage line
# assuming your data are intended to be numeric
for( i in 2:8 ) {
     dta[[ i ]] <- as.numeric( dta[[ i ]] )
}
# you didn't say what computation you want to accomplish on the data
# assuming you want to add values up by product and part of week

# base R functions
# generally useful to set timezone when using POSIXt types
Sys.setenv( TZ="Etc/GMT" )
# gather data values from multiple columns into long form
# I find this function very confusing, but it does work if you
# don't like depending on contributed packages that are easier to
# understand
dtaLong <- reshape( dta
                   , idvar = "Prod_name"
                   , varying = 1+seq.int( length( dta ) - 1 )
                   , v.names = "value"
                   , timevar = "XDates"
                   , times = names( dta )[ 1+seq.int( length( dta ) - 1 ) ]
                   , direction = "long"
                   )
# extract Date values from column names
dtaLong$Dates <- as.Date( dtaLong$XDates, format="X%d.%m.%Y" )
# read about POSIX types in the help page ?DateTimeClasses
dt_lt <- as.POSIXlt( dtaLong$Dates )
# extract the weekday information from the POSIXlt
dtaLong$wday <- dt_lt$wday # Sunday==0
# identify rows corresponding to time of week
dtaLong$WkPart <- ifelse( dtaLong$wday %in% c( 0, 6 )
                         , "Weekend"
                         , "Weekday" )
# sum the value column, grouped by Prod_name and WkPart
# (aggregate names the result column "x")
dtaAgg <- aggregate( dtaLong$value
                    , dtaLong[ , c( "Prod_name", "WkPart" ), drop=FALSE ]
                    , FUN=sum
                    )
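
As a rough cross-check of the base R result, the same weekend/weekday totals can be computed directly from the wide data with rowSums. This is only a sketch: dta_small is a hypothetical three-column subset of the data above, and is_weekend and check are names made up for this illustration.

```r
# hypothetical subset of dta, already converted to numeric
dta_small <- data.frame( Prod_name = c( "Banana", "Apple" )
                       , X1.1.2000 = c( 1, 0 )
                       , X2.1.2000 = c( 2, 4 )
                       , X3.1.2000 = c( 1, 5 )
                       , stringsAsFactors = FALSE
                       )
# parse every column name except the first as a Date
col_dates <- as.Date( names( dta_small )[ -1 ], format = "X%d.%m.%Y" )
# POSIXlt wday: Sunday==0, Saturday==6
is_weekend <- as.POSIXlt( col_dates )$wday %in% c( 0, 6 )
# keep only the value columns, then sum across the selected days
vals <- dta_small[ , -1, drop = FALSE ]
check <- data.frame( Prod_name = dta_small$Prod_name
                   , Weekend = rowSums( vals[ , is_weekend, drop = FALSE ] )
                   , Weekday = rowSums( vals[ , !is_weekend, drop = FALSE ] )
                   )
```

1 and 2 January 2000 fall on a Saturday and a Sunday, so in this subset only the X3.1.2000 column counts as a weekday.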

# or using dplyr/tidyr
library(dplyr)
library(tidyr)
library(lubridate)
# "pipe" data frames from one step to the next
dtaAgg2.a <- (   dta
              # tidyr way of making long form data
              %>% gather( XDates, value, -Prod_name )
              )
# dtaAgg2.a is purely for studying what is happening
dtaAgg2.b <- (   dta
            # tidyr way of making long form data
            %>% gather( XDates, value, -Prod_name )
            %>% mutate( Dates = as.Date( XDates, format="X%d.%m.%Y" )
                      , WkPart = ifelse( wday( Dates ) %in% c( 1, 7 ) # lubridate: Sunday==1, Saturday==7
                                       , "WeekEnd"
                                       , "WeekDay" )
                      )
            )
# dtaAgg2.b is also for studying what happens
# finally, run the whole pipeline of calculations
dtaAgg2 <- (   dta
            # tidyr way of making long form data
            %>% gather( XDates, value, -Prod_name )
            %>% mutate( Dates = as.Date( XDates, format="X%d.%m.%Y" )
                      , WkPart = ifelse( wday( Dates ) %in% c( 1, 7 ) # lubridate: Sunday==1, Saturday==7
                                       , "WeekEnd"
                                       , "WeekDay" )
                      )
            %>% group_by( Prod_name, WkPart )
            %>% summarise( SumOfValues = sum( value ) )
            )
# the group_by and summarise steps work together
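
One thing worth checking when moving between the two approaches is the weekday numbering: the wday element of a POSIXlt counts from Sunday==0, while lubridate's wday() with its default week_start counts from Sunday==1. A small sketch to compare them on three known dates (d, lt_wday and lub_wday are just illustrative names):

```r
library(lubridate)
# Saturday, Sunday and Monday of the first week of 2000
d <- as.Date( c( "2000-01-01", "2000-01-02", "2000-01-03" ) )
# base R: Sunday==0 ... Saturday==6
lt_wday <- as.POSIXlt( d )$wday
# lubridate default: Sunday==1 ... Saturday==7
lub_wday <- wday( d )
data.frame( d, lt_wday, lub_wday )
```

If the lubridate.week.start option has been changed in your session, the lubridate numbering will differ, so it is safer to verify it on a date whose weekday you know.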

On Sun, 4 Sep 2016, Filippos Katsios wrote:

> Dear all,
> I believe that this will be a more helpful way to put the problem:
> structure(list(Prod_name = c("Banana", "Apple", "Orange", "Yoghurt", 
> "Eggs", "Milk", "Day_num"), X1.1.2000 = c("1", "0", "4", "3", 
> "6", "2", "1"), X2.1.2000 = c("2", "4", "1", "5", "3", "0", "2"
> ), X3.1.2000 = c("1", "5", "2", "3", "0", "4", "3"), X4.1.2000 = c("2", 
> "4", "4", "1", "0", "0", "4"), X5.1.2000 = c("0", "0", "1", "0", 
> "2", "3", "5"), X6.1.2000 = c("1", "3", "2", "1", "4", "1", "6"
> ), X7.1.2000 = c("5", "4", "5", "2", "2", "1", "7")), .Names = c("Prod_name", 
> "X1.1.2000", "X2.1.2000", "X3.1.2000", "X4.1.2000", "X5.1.2000", 
> "X6.1.2000", "X7.1.2000"), row.names = c(NA, 7L), class = "data.frame")
> 
> and the code:
> https://gist.github.com/anonymous/750b02ad5db448d45c92a79059bf9844
> 
> Thank you for your help
> Filippos
> 
> On 4 September 2016 at 19:30, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>       Please use Reply-all to keep the mailing list in the loop. I cannot provide private assistance,
>       and others may provide valuable input or respond faster than I can.
>
>       It is very common that people cannot provide the original data. That means more work for YOU,
>       though,  not for us.  It is up to you to create a small simulated data set and process it as if
>       it were your original data.
>
>       Your idea will indeed be a good algorithm, but you will fail in R if you don't set it up
>       differently. Read [1] and provide us with a reproducible example data set and desired result and
>       someone here will be able to show you how to do it correctly.
>
>       [1] http://adv-r.had.co.nz/Reproducibility.html
>       --
>       Sent from my phone. Please excuse my brevity.
>
>       On September 4, 2016 8:28:39 AM PDT, Filippos Katsios <katsiosf at gmail.com> wrote:
>       >Dear Jeff,
>       >I am sorry but I am not allowed to share the original data. You are
>       >right
>       >about the Prod_name row. However, my goal is to split the columns
>       >"Date 1"
>       >etc into weekdays and weekends and manipulate them separately. I
>       >thought
>       >this would be the best way to do that (Assign to each day a number from
>       >1:7
>       >and then splitting them by a logical vector). Thank you for your help
>       >and
>       >your time!
>       >
>       >Filippos
>       >
>       >On 4 September 2016 at 18:20, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>       >wrote:
>       >
>       >> The "c" function creates vectors. Rows of data frames are data
>       >frames, not
>       >> vectors.
>       >>
>       >> new_row  <- data.frame( Prod_name = "Day_num", `Date 1`=1, `Date 2`=2, `Date 3`=3, check.names = FALSE )
>       >> data_may  <- rbind( new_row, data_may )
>       >>
>       >> Furthermore, data frames are NOT spreadsheets. "Day_num" looks
>       >> suspiciously UNlike a product name, which may mean the corresponding
>       >values
>       >> in that row are not Dates, which would also lead you into trouble.
>       >>
>       >> Please read the Posting Guide. In particular, you should read about
>       >making
>       >> your examples reproducible. Part of that is posting in plain text and
>       >using
>       >> the dput function to give us your sample data, because all too often
>       >the
>       >> problem lies in the details of how you have imported and manipulated
>       >your
>       >> data and the shortest way for us to see that the data are okay is to
>       >see it
>       >> as it exists in your R script so far.
>       >> --
>       >> Sent from my phone. Please excuse my brevity.
>       >>
>       >> On September 4, 2016 6:22:48 AM PDT, Filippos Katsios
>       ><katsiosf at gmail.com>
>       >> wrote:
>       >> >Dear All,
>       >> >
>       >> >I am relatively new to R and certainly new to the e-mailing list. I
>       >> >need
>       >> >your help. I am working on a data frame, which looks like this:
>       >> >
>       >> >Prod_name |  Date 1  |  Date 2 |  Date 3  |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 1    |     3      |      4     |     0       |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 2    |     5      |      3     |     3       |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 3    |     2      |      8     |     5       |
>       >> >
>       >> >I am trying to add a new row with the following results:
>       >> >
>       >> >Prod_name |  Date 1  |  Date 2 |  Date 3  |
>       >> >------------------|-------------|------------|--------------|
>       >> >Day_num    |     1      |      2     |      3      |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 1    |     3      |      4     |     0       |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 2    |     5      |      3     |     3       |
>       >> >------------------|-------------|------------|--------------|
>       >> >Product 3    |     2      |      8     |     5       |
>       >> >
>       >> >Below you can find the things I tried and the results.
>       >> >1)
>       >> >r <- 1
>       >> >newrow <- rep(1:7, 5, len=ncol(data_may)-1)
>       >> >insertRow <- function(data_may, newrow, r) {
>       >> >data_may[seq(r+1,nrow(data_may)+1),] <-
>       >> >data_may[seq(r,nrow(data_may)),]
>       >> >  data_may[r,] <- newrow
>       >> >  data_may
>       >> >}
>       >> >
>       >> >It doesn't put the new row.
>       >> >2)
>       >> >data_may<-rbind(data_may,c("Day_num",newrow))
>       >> >
>       >> >Error: cannot convert object to a data frame
>       >> >
>       >> >3)
>       >> >data_may[2093,]<-c("Day_num",rep(1:7, 5, len=ncol(data_may)-1))
>       >> >
>       >> >It makes all the columns characters and when i try to change it it
>       >says
>       >> >that you can change a list
>       >> >
>       >> >How can I add the row while keeping the columns (apart from the
>       >first
>       >> >one)
>       >> >as numeric or double or integer?
>       >> >
>       >> >Thank you, in advance, for your help!
>       >> >
>       >> >Kind regards
>       >> >Filippos
>       >> >
>       >> >       [[alternative HTML version deleted]]
>       >> >
>       >> >______________________________________________
>       >> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>       >> >https://stat.ethz.ch/mailman/listinfo/r-help
>       >> >PLEASE do read the posting guide
>       >> >http://www.R-project.org/posting-guide.html
>       >> >and provide commented, minimal, self-contained, reproducible code.
>       >>
>       >>
> 
> 
> 
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------

