[R] Adding column to existing data frame
Ralf B
ralf.bierig at gmail.com
Wed Aug 4 04:54:18 CEST 2010
Actually it does work -- one just has to feed the result back into the
original variable:
add.col <- function(df, vec, namevec){
  if (nrow(df) < length(vec)) {   # pads rows if needed
    df <- rbind(df, matrix(NA, length(vec) - nrow(df), ncol(df),
                           dimnames = list(NULL, names(df))))
  }
  length(vec) <- nrow(df)   # pads with NA's
  df[, namevec] <- vec      # names new col properly
  return(df)
}
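
A side note on the naming part, as a minimal sketch rather than anything definitive: the toy data frame `d` and the variable `nm` below are made up for illustration and are not from the thread. It shows why `cbind(df, namevec=vec)` produces a column literally called "namevec", while assigning with `[<-` uses the string stored in the variable.

```r
# Toy data with hypothetical names (not from the original thread)
d  <- data.frame(a = 1:3)
nm <- "myNewColumn"

# cbind() treats `nm = ...` as a literal argument name,
# so the new column is called "nm", not "myNewColumn"
by_cbind <- cbind(d, nm = c(10, 20, 30))

# `[<-` evaluates nm, so the column gets the string stored in it
by_index <- d
by_index[, nm] <- c(10, 20, 30)

names(by_cbind)   # "a" "nm"
names(by_index)   # "a" "myNewColumn"
```
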
mydata <- NULL
mydata <- data.frame(userid = c(5, 6, 5, 6, 5, 6), taskid = c(1, 1, 2, 2, 3, 3),
stuff = 11:16)
mydata <- add.col(mydata, c(1,2,3,4),"test1")
mydata <- add.col(mydata, c(1,2,3,4,5,6,7,8),"test2")
mydata
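
For anyone following along, here is a self-contained sketch of the point about feeding the result back. R functions work on a copy of their arguments, so the caller's data frame only changes if the return value is assigned. The helper `add_one_col` and the data frame `d` are hypothetical and exist only for this illustration.

```r
# Hypothetical helper for illustration only: adds a column of zeros
add_one_col <- function(df) {
  df[, "extra"] <- 0    # modifies the function's local copy
  df
}

d <- data.frame(a = 1:3)

add_one_col(d)          # result is only printed; d itself is untouched
ncol(d)                 # still 1

d <- add_one_col(d)     # assigning the result back updates d
ncol(d)                 # now 2
```
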
Thanks a lot, David and all the others here who made the effort!
Ralf
On Tue, Aug 3, 2010 at 10:37 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 3, 2010, at 10:35 PM, David Winsemius wrote:
>
>>
>> On Aug 3, 2010, at 8:32 PM, Ralf B wrote:
>>
>>> Hi experts,
>>>
>>> I am trying to write a very flexible method that allows me to add a
>>> new column to an existing data frame. This is what I have so far:
>>>
>>> add.column <- function(df, new.col, name) {
>>> n.row <- dim(df)[1]
>>> length(new.col) <- n.row
>>> names(new.col) <- name
>>> return(cbind(df, new.col))
>>> }
>>>
>>> df <- NULL
>>> df <- data.frame(a=c(1,2,3))
>>> df
>>> # correct: added NA to new column
>>> df <- add.column(df,c(1,2),'myNewColumn2')
>>> df
>>> # problem: not added, data frame should be extended with NAs
>>> add.column(df,c(1,2,3,4),'myNewColumn3')
>>> df
>>>
>>>
>>> However, there are two problems:
>>>
>>> 1) The column is not named correctly but is always called
>>> 'new.col'. Surely this could be done outside the function, but it
>>> would be better if it were self-contained.
>>
>> Try this:
>>
>> add.col <- function(df, vec, namevec){
>> length(vec) <- nrow(df) # pads with NA's
>> cbind(df, namevec=vec)} # names new col properly
>>
> Actually it doesn't name the column correctly... see below for a method
> with "[<-".
>
>>> 2) It does not work for cases where new.col is longer than the number
>>> of rows in the data frame. In such cases, I would like to add NA's to
>>> the data frame if it has fewer rows.
>>
>> Don't have a compact answer to this. (Tried re-dimensioning with
>> "dim() <-" but it was not accepted by the interpreter.) Would need to add
>> a test at the beginning and then pad with rows of NA's using rbind before
>> cbinding as above.
>>
>> add.col <- function(df, vec, namevec){
>> if (nrow(df) < length(vec) ){ df <- # pads rows if needed
>> rbind(df, matrix(NA, length(vec)-nrow(df), ncol(df),
>> dimnames=list( NULL, names(df) ) ) ) }
>> length(vec) <- nrow(df) # pads with NA's
>> df[, namevec] <- vec; # names new col properly
>> return(df)}
>>
>>>
>>> Any ideas to solve this?
>>
>> Has not been tested with columns of varying types.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>