[R] Adding SORT to UNIQUE

Duncan Murdoch
Tue Dec 21 17:07:48 CET 2021


On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
> Thanks everyone for the replies.
> 
> It is clear one either needs to write a function or put the unique
> entries into another dataframe.
> 
> It seems odd R cannot sort a list of unique column entries with ease.
> Python and SQL can do it with ease.

I've seen several responses that looked pretty simple.  It's hard to 
beat sort(unique(x)), though there's a fair bit of confusion about what 
you actually want.  Maybe you should post an example of the code you'd 
use in Python?
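
For what it's worth, here is a minimal sketch of that approach on a made-up data frame. The column names and values are invented for illustration and are not taken from the original Source.csv. It assumes the goal is the sorted distinct values of a column, and it uses Data[[1]] rather than Data[1] so that sort() receives a vector and not a one-column data frame.

```r
# Toy data frame standing in for the poster's Source.csv (hypothetical values).
Data <- data.frame(
  id   = c("10", "2", "1", "2", "10"),
  city = c("Oslo", "Bern", "Oslo", "Rome", "Bern"),
  stringsAsFactors = FALSE
)

# Data[[1]] is the column vector, so unique() and sort() act on its values.
# The column is character, so "10" sorts before "2".
sort(unique(Data[[1]]))

# Convert first if numeric order is wanted (assumes every entry is a number).
sort(unique(as.numeric(Data[[1]])))

# The same idea applied to every column at once; the result is a named list,
# because columns can have different numbers of distinct values.
sorted_unique <- lapply(Data, function(x) sort(unique(x)))
sorted_unique$city
```

If the columns mix digits and text, the as.numeric() step would not apply as written; that case needs something like the stringr approach quoted below.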

Duncan Murdoch

> 
> QUESTION
> Is there a simpler means, other than the unique function, to capture
> distinct column entries and then sort that list?
> 
> 
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com <http://www.shdawson.com>
> 
> 
> On 12/20/21 5:53 PM, Rui Barradas wrote:
>> Hello,
>>
>> Inline.
>>
>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>> Thanks.
>>>
>>> sort(unique(Data[[1]]))
>>>
>>> This syntax provides row numbers, not column values.
>>
>> This is not right.
>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>> extracts the column vector.
>>
>> My previous answer did not address the question: I misread it as asking
>> how to sort in numeric order when the data is not numeric. Here is a
>> hopefully complete answer, still using package stringr.
>>
>>
>> cols_to_sort <- 1:4
>>
>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>    stringr::str_sort(unique(x), numeric = TRUE)
>> })
>>
>>
>> Or using Avi's suggestion of writing a function to do all the work and
>> simplify the lapply loop later,
>>
>>
>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>>
>>>
>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>> Hi,
>>>>
>>>>
>>>> I am running a simple set of commands to review the entries in data
>>>> frame columns. Here is the working code.
>>>>
>>>> Data <- read.csv("./input/Source.csv", header = TRUE)
>>>> describe(Data)
>>>> summary(Data)
>>>> unique(Data[1])
>>>> unique(Data[2])
>>>> unique(Data[3])
>>>> unique(Data[4])
>>>>
>>>> I would like to sort the unique entries. The columns hold text as well
>>>> as numbers, so they are not defined as numeric. I realize 1 and 10 will
>>>> not sort properly in a column that is not defined as a number, but I
>>>> want to see the contents of each column in sorted order.
>>>>
>>>> QUESTION
>>>> What is the best process to sort unique output, please?
>>>>
>>>>
>>>> Thanks.
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
> 