[R] Data.frames can not hold objects...What can be done in the following scenario?

Rui Barradas ruipbarradas at sapo.pt
Tue Jun 12 00:35:08 CEST 2012


Hello,

There are also other possibilities. What I believe is the easiest is to 
go back to the beginning, i.e., have the function return a vector as 
before, and then use lapply on the data.frame's rows.

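testfun <- function (x, y) seq(x, y, 1)

# same data.frame as in the earlier message
testframe <- data.frame(xvalues = c(2, 3), yvalues = c(4, 5))

# one list element per row; seq_len() is safe even for zero rows
testframe$newcolumn <- lapply(seq_len(nrow(testframe)), function(i)
     testfun(testframe[i, 1], testframe[i, 2]))
class(testframe$newcolumn)  # [1] "list"

testframe$newcolumn[[1]]    # a vector, no longer a list
testframe$newcolumn[[1]][2]  # 2nd element of that vector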
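
As a hedged aside, the same list column can probably be built without 
indexing the rows by hand. The sketch below uses base R's Map(), which 
always returns a list, so there is no extra layer to unwrap. The names 
testfun and testframe come from this thread; testframe2 is a 
hypothetical copy used only for illustration.

```r
testfun <- function(x, y) seq(x, y, 1)
testframe <- data.frame(xvalues = c(2, 3), yvalues = c(4, 5))

# hypothetical copy, so the original data.frame is left untouched
testframe2 <- testframe

# Map() calls testfun on parallel elements of the two columns and
# always returns a list with one element per row
testframe2$newcolumn <- Map(testfun, testframe2$xvalues, testframe2$yvalues)

class(testframe2$newcolumn)   # "list"
testframe2$newcolumn[[2]]     # the vector for the second row
```

Whether this reads better than the lapply version is a matter of taste; 
both store one plain vector per row.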


The main point is that data.frames are lists of a special kind: they 
implement the statistical concept of variables and their observations, 
the columns and the rows. And like any list, a data.frame's elements 
can be any R object, including lists.
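
To make that concrete, here is a small sketch. The object names df and 
stuff are hypothetical and not from this thread; the point is only that 
a data.frame passes is.list() and that a list column can hold different 
kinds of R objects in different rows.

```r
# hypothetical example objects, not from the thread
df <- data.frame(id = 1:2)

is.list(df)        # TRUE: a data.frame is a list of columns

# a list column whose elements are different kinds of R objects
df$stuff <- list(matrix(0, 2, 2), list(a = 1, b = "z"))

is.matrix(df$stuff[[1]])   # TRUE: the first row holds a matrix
df$stuff[[2]]$b            # "z", reaching into the nested list
```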

Rui Barradas

On 11-06-2012 23:02, R. Michael Weylandt wrote:
> It is possible to chain together uses of `[[` -- e.g.,
>
> x <- list(1:5, list(letters[1:5], list(LETTERS[1:5])))
>
> x[[c(1,2)]] # 2L
>
> x[[c(2,1,3)]] # "c"
>
> x[[c(2,2,1,3)]] # "C"
>
> which is sometimes useful.
>
> Best,
> Michael
>
> On Mon, Jun 11, 2012 at 4:35 PM, Onur Uncu <onuruncu at gmail.com> wrote:
>> Rui and the R-help team,
>>
>> In Rui's helpful answer below, the function returns a list as output.
>> When we apply() the function to the data.frame, dataframe$newcolumn
>> has 2 layers of list before we can access each vector's elements. For
>> instance, dataframe$newcolumn[[1]][[1]] is a vector whereas
>> dataframe$newcolumn and dataframe$newcolumn[[1]] are lists. Is there a
>> solution that involves fewer layers of lists? I am just trying to
>> understand the R language better.
>>
>> Thank you.
>>
>>
>> On Sun, Jun 10, 2012 at 3:18 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>> Hello,
>>>
>>> What you need is to have your function return a list, not a vector. Like
>>> this
>>>
>>> testfun <- function (x, y) list(seq(x, y, 1))
>>>
>>> testframe<-data.frame(xvalues=c(2,3),yvalues=c(4,5))
>>>
>>> testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2]))
>>> class(testframe$newcolumn)  # [1] "list"
>>>
>>> Then you access lists and list elements.
>>>
>>> testframe$newcolumn[[1]]  # a list with just one element
>>> testframe$newcolumn[[1]][[1]]  # that element, a vector
>>> testframe$newcolumn[[1]][[1]][2]  # the vector's 2nd element
>>>
>>>
>>> Since you want the function to return vectors in order to do further
>>> computations, you'll access those vectors by varying the list index,
>>>
>>>
>>> testframe$newcolumn[[1]][[1]]  # first list, its only vector
>>> testframe$newcolumn[[2]][[1]]  # second list, its only vector
>>>
>>>
>>> Etc.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> On 10-06-2012 12:29, Onur Uncu wrote:
>>>>
>>>> Thank you Duncan. A follow-up question is, how can I achieve the
>>>> desired result in the earlier email? (i.e. Add the resulting vectors
>>>> as a new column to the existing data.frame?)   I tried the following:
>>>>
>>>> testframe$newcolumn<-apply(testframe,1,function(x)testfun(x[1],x[2]))
>>>>
>>>> but I am getting the following error:
>>>>
>>>> Error in `$<-.data.frame`(`*tmp*`, "vecss", value = c(2, 3, 4, 3, 4, 5
>>>> : replacement has 3 rows, data has 2
>>>>
>>>> Thanks for the help.
>>>>
>>>>
>>>> On Sun, Jun 10, 2012 at 12:02 PM, Duncan Murdoch
>>>> <murdoch.duncan at gmail.com> wrote:
>>>>>
>>>>> On 12-06-10 6:41 AM, Onur Uncu wrote:
>>>>>>
>>>>>>
>>>>>> R-Help community,
>>>>>>
>>>>>> I understand that data.frames can hold elements of type double, string
>>>>>> etc but NOT objects (such as a matrix etc).
>>>>>
>>>>>
>>>>>
>>>>> That is incorrect.  Dataframes can hold list vectors.  For example:
>>>>>
>>>>> A <- data.frame(x = 1:3)
>>>>> A$y <- list(matrix(1, 2,2), matrix(2, 3,3), matrix(3,4,4))
>>>>>
>>>>> A[[1,2]] will now extract the 2x2 matrix, A[[2,2]] will extract the 3x3, etc.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>> This is not convenient for me in the following situation. I have a
>>>>>> function that takes 2 inputs and returns a vector:
>>>>>>
>>>>>> testfun<- function (x,y) seq(x,y,1)
>>>>>>
>>>>>> I have a data.frame defined as follows:
>>>>>>
>>>>>> testframe<-data.frame(xvalues=c(2,3),yvalues=c(4,5))
>>>>>>
>>>>>> I would like to apply testfun to every row of testframe and then
>>>>>> create a new column in the data.frame which holds the returned vectors
>>>>>> as objects. Why do I want this? Because the returned vectors are an
>>>>>> intermediate step towards further calculations. It would be great to
>>>>>> keep adding new columns to the data.frame with the intermediate
>>>>>> objects. But this is not possible since data.frames can not hold
>>>>>> objects as elements. What do you suggest as an elegant solution in
>>>>>> this scenario? Thank you for any help!
>>>>>>
>>>>>>
>>>>>> I would love to hear if forum
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>


