[R] why is 9 after 10?
peter dalgaard
pdalgd at gmail.com
Fri Feb 12 23:10:13 CET 2016
It can also happen if you use colClasses, since that applies as.factor to the input column without first converting it to numeric. To wit:
> read.table(text="
+ 9
+ 10", colClasses="factor")$V1
[1] 9 10
Levels: 10 9
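
A minimal sketch of two ways around this, assuming the column really holds only numbers. The names txt, f_bad, n, vals and f_ok are made up for illustration and do not come from the original data:

```r
# Two values, read in different ways, to compare the resulting level order.
txt <- "9\n10"

# colClasses = "factor" builds the factor from the strings, so the levels
# are sorted as text and "10" comes before "9".
f_bad <- read.table(text = txt, colClasses = "factor")$V1
levels(f_bad)

# Option 1: read the column as numeric; a factor built from numbers
# gets its levels in numeric order.
n <- read.table(text = txt, colClasses = "numeric")$V1
levels(factor(n))

# Option 2: repair an existing factor. Go through as.character() first,
# because as.numeric(f_bad) alone returns the internal codes, not 9 and 10.
vals <- as.numeric(as.character(f_bad))
f_ok <- factor(vals, levels = sort(unique(vals)))
levels(f_ok)
table(f_ok)
```

If the conversion in option 2 produces NAs with a warning, the column contains at least one non-numeric entry, which is worth tracking down before reordering the levels.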
-pd
> On 12 Feb 2016, at 22:43 , Jim Lemon <drjimlemon at gmail.com> wrote:
>
> It depends upon what goes into the "data reshaping pipeline". If there is a
> single non-numeric value in the data read in, it will alpha sort it upon
> conversion to a factor:
>
> x<-factor(c(sample(6:37,1000,TRUE)," "))
> z<-factor(x)
> levels(z)
>  [1] " "  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23"
> [16] "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "6"
> [31] "7"  "8"  "9"
>
> Jim
>
>
> On Sat, Feb 13, 2016 at 2:41 AM, Fox, John <jfox at mcmaster.ca> wrote:
>
>> Dear Federico,
>>
>>> -----Original Message-----
>>> From: Federico Calboli [mailto:federico.calboli at helsinki.fi]
>>> Sent: February 12, 2016 10:27 AM
>>> To: Fox, John <jfox at mcmaster.ca>
>>> Cc: R Help <r-help at r-project.org>
>>> Subject: Re: [R] why is 9 after 10?
>>>
>>> Dear John,
>>>
>>> that is fortunately not the case, I just managed to figure out that the
>>> problem was that in the data reshaping pipeline the numeric column was
>>> transformed into a factor.
>>
>> But that shouldn't have this effect, I think:
>>
>> > z <- as.factor(x)
>> > table(z)
>> z
>>  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
>> 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26
>> 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
>> 28 22 21 34 32 33 31 34 23 32 35 39 31 40 35 29
>>
>> > levels(z)
>>  [1] "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31"
>> [27] "32" "33" "34" "35" "36" "37"
>>
>> Best,
>> John
>>
>>>
>>> Many thanks for your time.
>>>
>>> BW
>>>
>>> F
>>>
>>>
>>>
>>>> On 12 Feb 2016, at 17:22, Fox, John <jfox at mcmaster.ca> wrote:
>>>>
>>>> Dear Federico,
>>>>
>>>> Might my.data[, 2] contain character data, which therefore would be
>>>> sorted in this manner? For example:
>>>>
>>>> > x <- sample(6:37, 1000, replace=TRUE)
>>>> > table(x)
>>>> x
>>>>  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
>>>> 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26
>>>> 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
>>>> 28 22 21 34 32 33 31 34 23 32 35 39 31 40 35 29
>>>> > y <- as.character(x)
>>>> > table(y)
>>>> y
>>>> 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
>>>> 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34
>>>> 26 27 28 29 30 31 32 33 34 35 36 37  6  7  8  9
>>>> 32 33 31 34 23 32 35 39 31 40 35 29 29 30 35 29
>>>>
>>>> I hope this helps,
>>>> John
>>>>
>>>> -----------------------------
>>>> John Fox, Professor
>>>> McMaster University
>>>> Hamilton, Ontario
>>>> Canada L8S 4M4
>>>> Web: socserv.mcmaster.ca/jfox
>>>>
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
>>>>> Federico Calboli
>>>>> Sent: February 12, 2016 10:13 AM
>>>>> To: R Help <r-help at r-project.org>
>>>>> Subject: [R] why is 9 after 10?
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I have some data, one of the columns is a bunch of numbers from 6 to 41.
>>>>>
>>>>> table(my.data[,2])
>>>>>
>>>>> returns
>>>>>
>>>>>   10   11   12   13   14   15   16   17   18   19   20   21   22   23
>>>>> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973
>>>>>   24   25   26   27   28   29   30   31   32   33   34   35   36   37
>>>>> 1951 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
>>>>>   38   39   40   41    6    7    8    9
>>>>> 2681 2905 3399 3941 1648 1690 1727 1668
>>>>>
>>>>> whereas the reasonable expectation is that the numbers from 6 to 9
>>>>> would come before 10 to 41.
>>>>>
>>>>> How do I sort this incredibly silly behaviour so that my table
>>>>> follows a reasonable expectation that 9 comes before 10 (and so on and
>>>>> so forth)?
>>>>>
>>>>> BW
>>>>>
>>>>> F
>>>>>
>>>>> --
>>>>> Federico Calboli
>>>>> Ecological Genetics Research Unit
>>>>> Department of Biosciences
>>>>> PO Box 65 (Biocenter 3, Viikinkaari 1)
>>>>> FIN-00014 University of Helsinki
>>>>> Finland
>>>>>
>>>>> federico.calboli at helsinki.fi
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html and provide commented,
>>>>> minimal, self-contained, reproducible code.
>>>
>>
>>
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com