[R] coalesce columns within a data frame
Ivan Alves
papucho at mac.com
Thu Oct 23 00:55:26 CEST 2008
Many thanks to all for their help. Factors are indeed very tricky, and
I sided with the conversion to character.
Kind regards,
Ivan
On 22 Oct 2008, at 19:01, Duncan Murdoch wrote:
> On 10/22/2008 12:09 PM, Ivan Alves wrote:
>> Dear all,
>> Thanks for all the replies.
>> I get something with Duncan's code (slightly more compact than the
>> other two), but of class "integer", whereas the two inputs are
>> class "factor". Clearly the name information is lost. I did not
>> see anything on this in the help page for ifelse.
>
> It is there, in this warning:
>
> The mode of the result may depend on the value of 'test', and the
> class attribute of the result is taken from 'test' and may be
> inappropriate for the values selected from 'yes' and 'no'.
>
> You'd want the result to be a factor, but those attributes are
> lost. I think this is a result of two design flaws: ifelse()
> shouldn't base the class on the test, it should base it on the
> values. And factors in S and R have all sorts of problems.
>
> You can work around this by converting to character vectors:
>
> Name <- ifelse(is.na(Name.x), as.character(Name.y),
> as.character(Name.x))
>
> If you really want factors, you can convert back at the end, but why
> would you want to?
>
> Duncan Murdoch
>
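A minimal sketch of the behaviour described above, using a hypothetical data frame `df` built to mirror the example in the original question (the data values are assumptions for illustration only). It shows ifelse() returning the integer codes when given factors, and the as.character() workaround keeping the labels:

```r
# Hypothetical data mirroring the example in this thread
df <- data.frame(
  Name.x = factor(c("nx1", "nx2", NA, NA)),
  Name.y = factor(c("ny1", NA, "ny3", NA))
)

# With factor inputs, ifelse() drops the factor attributes,
# so only the underlying integer codes survive
codes <- with(df, ifelse(is.na(Name.x), Name.y, Name.x))
class(codes)

# Converting both columns to character first keeps the labels
Name <- with(df, ifelse(is.na(Name.x),
                        as.character(Name.y),
                        as.character(Name.x)))
Name
```

Note that the codes from the two columns refer to different level sets, so the integer result cannot be mapped back to names reliably.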
>> Based on this experience I also tried
>> df$Name <- df$NAME.x
>> df[is.na(df$NAME.x),"Name"] <- df[is.na(df$NAME.x),"NAME.y"]
>> but then again the "factor" issue was a problem (clearly the
>> levels are not the same and then there is a conflict)
>> Any further guidance?
>> Kind regards,
>> Ivan
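If a factor result really is wanted, one possible approach (a sketch, not something proposed in this thread) is to give the result the union of both level sets before filling in the missing values, so the assignment does not hit a level conflict. The vectors `x` and `y` below are hypothetical stand-ins for the two columns:

```r
# Hypothetical stand-ins for the two factor columns
x <- factor(c("nx1", "nx2", NA, NA))
y <- factor(c("ny1", NA, "ny3", NA))

# Rebuild x with the union of both level sets so values from y are valid
Name <- factor(x, levels = union(levels(x), levels(y)))

# Fill the gaps in x with the corresponding entries of y
fill <- is.na(Name)
Name[fill] <- y[fill]
Name
```

Without the extra levels, assigning a value such as "ny3" into a factor that lacks that level produces NA with a warning.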
>> On 22 Oct 2008, at 17:26, Duncan Murdoch wrote:
>>> On 10/22/2008 11:21 AM, Ivan Alves wrote:
>>>> Dear all,
>>>> I searched the mail archives and the R site and found no
>>>> guidance (tried "merge", "cbind" and terms like "coalesce" with
>>>> no success). There surely is a way to coalesce (like in SQL)
>>>> columns in a dataframe, right? For example, I would like to go
>>>> from a dataframe with two columns to one with only one as
>>>> follows:
>>>> From
>>>> Name.x Name.y
>>>> nx1 ny1
>>>> nx2 NA
>>>> NA ny3
>>>> NA NA
>>>> ...
>>>> To
>>>> Name
>>>> nx1
>>>> nx2
>>>> ny3
>>>> NA
>>>> ...
>>>> where column Name.x is taken if there is a value, and if not
>>>> then column Name.y
>>>> Any help would be appreciated
>>>
>>> I don't know of any special function to do that, but ifelse() can
>>> handle it easily:
>>>
>>> Name <- ifelse(is.na(Name.x), Name.y, Name.x)
>>>
>>> If those are columns of a dataframe named df, you'd prefix each
>>> column name by df$, or do
>>>
>>> within(df, Name <- ifelse(is.na(Name.x), Name.y, Name.x))
>>>
>>> Duncan Murdoch