[R] Can't understand syntax

Rui Barradas ruipbarradas at sapo.pt
Sun Jul 15 01:41:57 CEST 2012


Hello,

It's simpler than you think. Let's take it one thing at a time.

First, to make the instructions easier to read, create logical index vectors.

test2 <- test  # save 'test' for later

na.v1 <- is.na(test[["v1"]])
na.v2 <- is.na(test[["v2"]])
na.v3 <- is.na(test[["v3"]])
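
To see what such an index vector contains, here is a small sketch. It rebuilds 'test' from the data in the original question and looks only at 'v1'; the other two columns behave the same way.

```r
# Rebuild the example data frame from the original question
test <- read.table(text = "
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA
", header = TRUE)

na.v1 <- is.na(test[["v1"]])

na.v1                  # logical vector, one element per row: FALSE TRUE TRUE
!na.v1                 # negated: TRUE only where v1 is not missing
test[["v1"]][!na.v1]   # the non-missing value(s) of v1: 3
```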


Now use them.


test[[ "result" ]][ !na.v1 ] <- test[[ "v1" ]][ !na.v1 ]
test[[ "result" ]][ !na.v2 ] <- test[[ "v2" ]][ !na.v2 ]
test[[ "result" ]][ !na.v3 ] <- test[[ "v3" ]][ !na.v3 ]


Note that above, for instance in the first line, each side of '<-' uses 
two different types of indexing, in a certain sense.

First, a data.frame is a special type of list: each list member is a 
(random?) variable, and all variables have the same number of 
observations. So test[[ "result" ]] refers to a vector of the data.frame.
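
A quick way to check the "data.frame is a list" point for yourself is the sketch below. It assumes the same 'test' data as in the original question and only uses base R functions.

```r
# Rebuild the example data frame from the original question
test <- read.table(text = "
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA
", header = TRUE)

is.list(test)                          # TRUE: a data.frame is a list
length(test)                           # 4: one list member per column
names(test)                            # "v1" "v2" "v3" "result"

is.vector(test[["v1"]])                # TRUE: '[[' extracts the column itself
identical(test[["v1"]], test$v1)       # TRUE: '$' is a shorthand for '[['
class(test["v1"])                      # "data.frame": single '[' keeps a data.frame
```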

The second is the indexing of that vector's elements. Imagine that we had 
assigned

test.res <- test[[ "result" ]]

and then accessed the elements of 'test.res' with

test.res[ !na.v1 ] <- ...etc...

That's what we are doing.
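
Written out in full, the two-step version might look like the sketch below. One assumption to keep in mind: 'test.res' is a copy, so changing it does not change 'test' until the vector is written back into the data.frame (the last line). The one-line form does that write-back for you.

```r
# Rebuild the example data frame from the original question
test <- read.table(text = "
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA
", header = TRUE)

na.v1 <- is.na(test[["v1"]])

test.res <- test[["result"]]              # step 1: extract the column (a copy)
test.res[!na.v1] <- test[["v1"]][!na.v1]  # step 2: index the vector's elements
test[["result"]] <- test.res              # write the modified copy back

test[["result"]]                          # 3 NA NA
```
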
Since a df is a list in tabular form, we could also use row/column 
indexing, which may be more intuitive. The following is exactly 
equivalent to the code above:


test2[  !na.v1 , "result" ] <- test2[ !na.v1 , "v1" ]
test2[  !na.v2 , "result" ] <- test2[ !na.v2 , "v2" ]
test2[  !na.v3 , "result" ] <- test2[ !na.v3 , "v3" ]

all.equal(test, test2) # TRUE
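
Since the three lines differ only in the column name, the same idea can be sketched as a loop. This is just one possible way to write it, not something from the original question; the loop variable 'v' and the index 'ok' are names made up for the illustration.

```r
# Rebuild the example data frame from the original question
test <- read.table(text = "
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA
", header = TRUE)

for (v in c("v1", "v2", "v3")) {
  ok <- !is.na(test[[v]])            # rows where this column is not missing
  test[ok, "result"] <- test[ok, v]  # row/column indexing, as above
}

test[["result"]]                     # 3 3 3
```

The indexing operators themselves are documented in the help pages ?Extract and ?Extract.data.frame.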


Hope this helps,

Rui Barradas

On 14-07-2012 21:22, Charles Stangor wrote:
> OK, I need help!!
>
> I've been searching, but I don't understand the logic of some of this
> dataframe addressing syntax.
>
> What is this type of code called?
>
> test [["v3"]] [is.na(test[["v2"]])] <- 10  #choose column
> v3 where column v2 is == 4 and replace with 10
>
> and where is it documented?
>
>
> The code below works for what I want to do (find the non-missing value
> in a row), but why?
>
> test <- read.table(text="
> v1  v2  v3  result
> 3  NA  NA  NA
> NA  3   NA NA
> NA  NA   3 NA
> "
> , header=TRUE)
>
> test [["result"]] [!(is.na(test[["v1"]]))] <- test [["v1"]] [!(is.na(test[["v1"]]))]
> test [["result"]] [!(is.na(test[["v2"]]))] <- test [["v2"]] [!(is.na(test[["v2"]]))]
> test [["result"]] [!(is.na(test[["v3"]]))] <- test [["v3"]] [!(is.na(test[["v3"]]))]
>
> thanks!
>
>
> On Fri, Jul 13, 2012 at 6:41 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
>     Hello,
>
>     Check the structure of what you have, df and newDf. You will see
>     that in df dateTime is of class POSIXlt and in newDf newDateTime is
>     of class POSIXct.
>
>     Solution:
>
>     [...]
>     df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")
>     df$dateTime <- as.POSIXct(df$dateTime)
>     [...]
>
>     Hope this helps,
>
>     Rui Barradas
>
>     On 13-07-2012 10:24, vioravis wrote:
>
>         I have the following dataframe with the first column being of
>         type datetime:
>
>         dateTime <- c("10/01/2005 0:00",
>                         "10/01/2005 0:20",
>                         "10/01/2005 0:40",
>                         "10/01/2005 1:00",
>                         "10/01/2005 1:20")
>         var1 <- c(1,2,3,4,5)
>         var2 <- c(10,20,30,40,50)
>         df <- data.frame(dateTime = dateTime, var1 = var1, var2 = var2)
>         df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")
>
>         I want to create 10 minute interval data as follows:
>
>         minTime <- min(df$dateTime)
>         maxTime <- max(df$dateTime)
>         newTime <- seq(minTime,maxTime,600)
>         newDf <- data.frame(newDateTime = newTime)
>         newDf <- merge(newDf,df,by.x = "newDateTime",by.y =
>         "dateTime",all.x = TRUE)
>
>         The objective here is to create a data frame with the values
>         from df for the datetimes present in df and NA for the missing
>         ones. However, I am getting the following data frame with both
>         var1 and var2 having all NAs.
>
>             newDf
>
>                     newDateTime var1 var2
>         1 2005-10-01 00:00:00   NA   NA
>         2 2005-10-01 00:10:00   NA   NA
>         3 2005-10-01 00:20:00   NA   NA
>         4 2005-10-01 00:30:00   NA   NA
>         5 2005-10-01 00:40:00   NA   NA
>         6 2005-10-01 00:50:00   NA   NA
>         7 2005-10-01 01:00:00   NA   NA
>         8 2005-10-01 01:10:00   NA   NA
>         9 2005-10-01 01:20:00   NA   NA
>
>         Can someone help me on how to do the merge based on the two datetime
>         columns?
>
>         Thank you.
>
>         Ravi
>
>
>
>
>
>
>         --
>         View this message in context:
>         http://r.789695.n4.nabble.com/Merging-on-Datetime-Column-tp4636417.html
>         Sent from the R help mailing list archive at Nabble.com.
>
>         ________________________________________________
>         R-help at r-project.org mailing list
>         https://stat.ethz.ch/mailman/listinfo/r-help
>         PLEASE do read the posting guide
>         http://www.R-project.org/posting-guide.html
>         and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
>
> --
> Charles Stangor
> Professor and Associate Chair
>


