[R] coalesce columns within a data frame

Ivan Alves papucho at mac.com
Thu Oct 23 00:55:26 CEST 2008


Many thanks to all for their help.  Factors are indeed very tricky, and I
settled on the conversion to character.
Kind regards,
Ivan
On 22 Oct 2008, at 19:01, Duncan Murdoch wrote:

> On 10/22/2008 12:09 PM, Ivan Alves wrote:
>> Dear all,
>> Thanks for all the replies.
>> I get something with Duncan's code (slightly more compact than the
>> other two), but of class "integer", whereas the two inputs are
>> class "factor".  Clearly the name information is lost.  I did not
>> see anything on this in the help page for ifelse.
>
> It is there, in this warning:
>
>     The mode of the result may depend on the value of 'test', and the
>     class attribute of the result is taken from 'test' and may be
>     inappropriate for the values selected from 'yes' and 'no'.
>
> You'd want the result to be a factor, but those attributes are  
> lost.  I think this is a result of two design flaws:  ifelse()  
> shouldn't base the class on the test, it should base it on the  
> values.  And factors in S and R have all sorts of problems.
>
> You can work around this by converting to character vectors:
>
> Name <- ifelse(is.na(Name.x), as.character(Name.y), as.character(Name.x))
>
> If you really want factors, you can convert back at the end, but why  
> would you want to?
>
> Duncan Murdoch
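The behaviour described in this reply can be reproduced with a small sketch. The data below are made up to mirror the example in the original question; they are not the poster's actual data, and the variable names codes and Name.f are illustrative only.

```r
# Two factor columns with different level sets (illustrative data).
Name.x <- factor(c("nx1", "nx2", NA, NA))
Name.y <- factor(c("ny1", NA, "ny3", NA))

# ifelse() on factors returns the underlying integer codes, not the labels,
# because the result starts out as a copy of the logical 'test' vector.
codes <- ifelse(is.na(Name.x), Name.y, Name.x)
class(codes)

# Converting both inputs to character first keeps the labels.
Name <- ifelse(is.na(Name.x), as.character(Name.y), as.character(Name.x))
Name

# Optional: convert back to a factor at the end, if one is really needed.
Name.f <- factor(Name)
```

Note that the integer codes from the two inputs refer to different level sets, so the integer result cannot be mapped back to names reliably.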
>
>> Building on this, I also tried
>> df$Name <- df$NAME.x
>> df[is.na(df$NAME.x),"Name"] <- df[is.na(df$NAME.x),"NAME.y"]
>> but then again the "factor" issue was a problem (clearly the
>> levels are not the same and then there is a conflict)
>> Any further guidance?
>> Kind regards,
>> Ivan
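The level conflict in the subassignment attempt above can be avoided in at least two ways; the following is a sketch under assumed data, not the poster's actual data frame. The helper names use.y and all.levels and the column Name.f are hypothetical. Option 1 sidesteps factors entirely; option 2 keeps a factor by giving the target column the union of both level sets before assigning.

```r
# Illustrative data frame with two factor columns whose levels differ.
df <- data.frame(
  NAME.x = factor(c("nx1", "nx2", NA, NA)),
  NAME.y = factor(c("ny1", NA, "ny3", NA))
)
use.y <- is.na(df$NAME.x)   # rows where the fallback column is needed

# Option 1: work on character copies, so levels never come into play.
df$Name <- as.character(df$NAME.x)
df$Name[use.y] <- as.character(df$NAME.y[use.y])

# Option 2: stay with factors, but widen the levels of the target first
# so that every label from NAME.y is a valid level.
all.levels <- union(levels(df$NAME.x), levels(df$NAME.y))
df$Name.f <- factor(df$NAME.x, levels = all.levels)
df$Name.f[use.y] <- df$NAME.y[use.y]
```

Without the widened levels, assigning a label such as "ny3" into a factor that lacks that level produces NA with an "invalid factor level" warning.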
>> On 22 Oct 2008, at 17:26, Duncan Murdoch wrote:
>>> On 10/22/2008 11:21 AM, Ivan Alves wrote:
>>>> Dear all,
>>>> I searched the mail archives and the R site and found no guidance
>>>> (tried "merge", "cbind" and terms like "coalesce" with no success).
>>>> There surely is a way to coalesce (like in SQL) columns in a
>>>> dataframe, right?  For example, I would like to go from a dataframe
>>>> with two columns to one with only one as follows:
>>>> From
>>>> Name.x Name.y
>>>> nx1 ny1
>>>> nx2 NA
>>>> NA ny3
>>>> NA NA
>>>> ...
>>>> To
>>>> Name
>>>> nx1
>>>> nx2
>>>> ny3
>>>> NA
>>>> ...
>>>> where column Name.x is taken if there is a value, and if not, then
>>>> column Name.y.
>>>> Any help would be appreciated.
>>>
>>> I don't know of any special function to do that, but ifelse() can
>>> handle it easily:
>>>
>>> Name <- ifelse(is.na(Name.x), Name.y, Name.x)
>>>
>>> If those are columns of a dataframe named df, you'd prefix each
>>> column name by df$, or do
>>>
>>> within(df, Name <- ifelse(is.na(Name.x), Name.y, Name.x))
>>>
>>> Duncan Murdoch
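As a minimal sketch of the within() form on a complete data frame, assuming the columns are character rather than factor (hence stringsAsFactors = FALSE, which matters on R versions before 4.0.0). The data are illustrative, and coalesce_chr is a hypothetical helper that extends the same ifelse() idea to any number of columns; it is not a base R function.

```r
# Illustrative data frame with character columns.
df <- data.frame(
  Name.x = c("nx1", "nx2", NA, NA),
  Name.y = c("ny1", NA, "ny3", NA),
  stringsAsFactors = FALSE
)

# Take Name.x where present, otherwise fall back to Name.y.
df <- within(df, Name <- ifelse(is.na(Name.x), Name.y, Name.x))
df$Name

# Hypothetical helper: first non-NA value across several vectors,
# converting each to character so factors do not lose their labels.
coalesce_chr <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a),
         lapply(list(...), as.character))
}
coalesce_chr(df$Name.x, df$Name.y)
```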