[R] Adding SORT to UNIQUE
Stephen H. Dawson, DSL
@erv|ce @end|ng |rom @hd@w@on@com
Wed Dec 22 17:13:29 CET 2021
OK, now I get what you are suggesting.
Much appreciated.
Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com <http://www.shdawson.com>
On 12/22/21 11:08 AM, Duncan Murdoch wrote:
> On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:
>> I see.
>>
>> So, we are talking about taking the output into a new dataframe. I was hoping
>> to have the output rendered on screen without another dataframe, but I
>> can live with this option if it must occur.
>>
>> Am I correct that the desired vertical output must first go to a dataframe?
>
> No, that's just one option. The other 3 don't use dataframes.
>
> Duncan Murdoch
>>
>>
>> On 12/22/21 10:47 AM, Duncan Murdoch wrote:
>>> On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:
>>>> Thanks for the reply.
>>>>
>>>> Both syntax options work to render the correct (unique) output. However,
>>>> the output is rendered horizontally. What needs to happen to get the
>>>> output to render vertically, please?
>>>
>>> The result of those expressions is a vector of the same type as the
>>> column, so your question is really about how to get a vector to print
>>> one element per line.
>>>
>>> Probably the simplest way is to put the vector in a dataframe (or
>>> matrix, or tibble, depending on which formatting you prefer). For
>>> example,
>>>
>>> > v <- c("red", "green", "blue")
>>> > data.frame(v)
>>>       v
>>> 1   red
>>> 2 green
>>> 3  blue
>>>
>>> If you want a more minimal display, try
>>>
>>> > cat(v, sep = "\n")
>>> red
>>> green
>>> blue
>>>
>>> or
>>>
>>> > cat(format(v, justify = "right"), sep = "\n")
>>>   red
>>> green
>>>  blue
>>>
>>> If you want this to happen when you auto-print the object, you can
>>> give it a class attribute and write a function to print that class,
>>> e.g.
>>>
>>> > class(v) <- "oneperline"
>>> >
>>> > print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")
>>> >
>>> > v
>>>   red
>>> green
>>>  blue
>>>
>>> Duncan Murdoch
>>>
>>>>
>>>>
>>>> On 12/21/21 11:38 AM, Duncan Murdoch wrote:
>>>>> On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>>>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>>>>>> Thanks for the reply.
>>>>>>>
>>>>>>> sort(unique(Data[1]))
>>>>>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>>>>>> decreasing)) :
>>>>>>> undefined columns selected
>>>>>>
>>>>>> That's the wrong syntax: Data[1] is not "column one of Data". Use
>>>>>> Data[[1]] for that, so
>>>>>>
>>>>>> sort(unique(Data[[1]]))
>>>>>
>>>>> Actually, I'd probably recommend
>>>>>
>>>>> sort(unique(Data[, 1]))
>>>>>
>>>>> instead. This treats Data as a matrix rather than as a list.
>>>>> Dataframes are lists that look like matrices, but to me the matrix
>>>>> aspect is usually more intuitive.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>>
>>>>>> I think Rui already pointed out the typo in the quoted text below...
>>>>>>
>>>>>> Duncan Murdoch
>>>>>>
>>>>>>>
>>>>>>> The recommended syntax did not work, as listed above.
>>>>>>>
>>>>>>> What I want is the sorted distinct column output. Again, the column may
>>>>>>> be text or numbers. This is a huge analysis effort with data coming at
>>>>>>> me from many different sources.
>>>>>>>
>>>>>>>
>>>>>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>>>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>> Thanks everyone for the replies.
>>>>>>>>>
>>>>>>>>> It is clear one either needs to write a function or put the unique
>>>>>>>>> entries into another dataframe.
>>>>>>>>>
>>>>>>>>> It seems odd R cannot sort a list of unique column entries with ease.
>>>>>>>>> Python and SQL can do it with ease.
>>>>>>>>
>>>>>>>> I've seen several responses that looked pretty simple. It's hard to
>>>>>>>> beat sort(unique(x)), though there's a fair bit of confusion about
>>>>>>>> what you actually want. Maybe you should post an example of the code
>>>>>>>> you'd use in Python?
>>>>>>>>
>>>>>>>> Duncan Murdoch
>>>>>>>>
>>>>>>>>>
>>>>>>>>> QUESTION
>>>>>>>>> Is there a simpler means other than the unique function to capture
>>>>>>>>> distinct column entries, then sort that list?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Inline.
>>>>>>>>>>
>>>>>>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> sort(unique(Data[[1]]))
>>>>>>>>>>>
>>>>>>>>>>> This syntax provides row numbers, not column values.
>>>>>>>>>>
>>>>>>>>>> This is not right.
>>>>>>>>>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>>>>>>>>>> extracts the column vector.
>>>>>>>>>>
>>>>>>>>>> As for my previous answer, it was not addressing the question, I
>>>>>>>>>> misinterpreted it as being a question on how to sort by numeric order
>>>>>>>>>> when the data is not numeric. Here is a, hopefully, complete answer.
>>>>>>>>>> Still with package stringr.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> cols_to_sort <- 1:4
>>>>>>>>>>
>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], \(x) {
>>>>>>>>>>   stringr::str_sort(unique(x), numeric = TRUE)
>>>>>>>>>> })
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Or using Avi's suggestion of writing a function to do all the work and
>>>>>>>>>> simplify the lapply loop later,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hope this helps,
>>>>>>>>>>
>>>>>>>>>> Rui Barradas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Running a simple syntax set to review entries in dataframe columns.
>>>>>>>>>>>> Here is the working code.
>>>>>>>>>>>>
>>>>>>>>>>>> Data <- read.csv("./input/Source.csv", header=T)
>>>>>>>>>>>> describe(Data)
>>>>>>>>>>>> summary(Data)
>>>>>>>>>>>> unique(Data[1])
>>>>>>>>>>>> unique(Data[2])
>>>>>>>>>>>> unique(Data[3])
>>>>>>>>>>>> unique(Data[4])
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to sort the unique entries. The data in the various
>>>>>>>>>>>> columns are not only numbers, but also text. I realize 1 and
>>>>>>>>>>>> 10 will not sort properly, as the column is not defined as a number,
>>>>>>>>>>>> but want to see what I have in the columns viewed as sorted.
>>>>>>>>>>>>
>>>>>>>>>>>> QUESTION
>>>>>>>>>>>> What is the best process to sort unique output, please?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> ______________________________________________
>>>>>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and
>>>>>>>>>>> more, see
>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>>>> and provide commented, minimal, self-contained, reproducible
>>>>>>>>>>> code.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
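The points made in the thread above (Data[1] versus Data[[1]] or Data[, 1], sort(unique(x)), and printing one value per line with cat) can be pulled together in a minimal sketch. The small data frame and the names sorted_ids and sorted_all are made up for illustration; they stand in for the contents of Source.csv, which the thread does not show.

```r
# Hypothetical stand-in for the Data read from Source.csv in the thread.
Data <- data.frame(
  id    = c("10", "2", "1", "2"),
  color = c("red", "green", "blue", "red"),
  stringsAsFactors = FALSE
)

# Data[1] is a one-column data frame, which sort() cannot handle;
# Data[[1]] and Data[, 1] both give the underlying column vector.
is.data.frame(Data[1])            # TRUE
identical(Data[[1]], Data[, 1])   # TRUE

# Sorted distinct values of one column. This is a character sort,
# so "10" comes before "2".
sorted_ids <- sort(unique(Data[[1]]))

# The same operation over every column; the result is a named list of vectors.
sorted_all <- lapply(Data, function(x) sort(unique(x)))

# Print one value per line without building another data frame.
cat(sorted_ids, sep = "\n")
```

For text columns that hold numbers, the stringr::str_sort(..., numeric = TRUE) call quoted above is one way to get 1, 2, 10 ordering instead.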