[R] bug in interaction order when using drop?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Aug 11 13:31:55 CEST 2006
On Thu, 10 Aug 2006, Petr Pikal wrote:
> Oops, my first suggestion reorders the factor itself, but
>
> if (drop) factor(ans) else ans
>
> in place of the whole drop construction should preserve the order of
> the levels without changing the order of the factor's values.
Even easier would be to return ans[,drop=drop]. It seems to me that there
is an argument for expecting interaction(..., drop=TRUE) to give the same
result as interaction(...)[,drop=TRUE], but little argument that any
ordering is a *bug*.
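
As a small illustration of the point above (the vectors a and b are made up for this sketch, not taken from Petr's data), subsetting the full interaction with drop=TRUE removes the unused levels and leaves the remaining ones in their original relative order. Re-factoring with factor(), as Petr suggests, gives the same levels:

```r
# Hypothetical input, chosen so that first-appearance order differs
# from the order of the levels.
a <- c("B", "A", "A")
b <- c("c", "a", "b")

full <- interaction(a, b)
levels(full)
# [1] "A.a" "B.a" "A.b" "B.b" "A.c" "B.c"

# Drop the unused levels after the fact, via the factor method for "[".
dropped <- full[, drop = TRUE]
levels(dropped)
# [1] "A.a" "A.b" "B.c"

# The values themselves are unchanged; only the level set shrinks.
as.character(dropped)
# [1] "B.c" "A.a" "A.b"

# factor() on a factor also keeps the surviving levels in level order.
identical(levels(factor(full)), levels(dropped))
# [1] TRUE
```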
The order of the levels of a factor is arbitrary, and in fact they seem
to me to be in a strange order, with the levels of the first factor
varying fastest (reverse lexicographic order).
> levels(interaction(c("A", "A", "B"), letters[1:3]))
[1] "A.a" "B.a" "A.b" "B.b" "A.c" "B.c"
so the existing
> levels(interaction(c("A", "A", "B"), letters[1:3], drop=T))
[1] "A.a" "A.b" "B.c"
looks more sensible in this case.
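
To make the difference between the two orderings concrete, here is a rough sketch of the drop step on its own, modelled on the code quoted below. The helper drop_unused() and its keep_order argument are invented for this illustration and are not part of R; with keep_order = FALSE it follows the unique()-based approach, and with keep_order = TRUE it sorts the surviving codes first so the levels keep their original order. NA codes pass through match() as NA in both cases.

```r
# codes: integer codes into lvs (may contain NA); lvs: all level labels.
# drop_unused() is a hypothetical helper, not an R function.
drop_unused <- function(codes, lvs, keep_order = TRUE) {
    f <- unique(codes[!is.na(codes)])   # codes that actually occur
    if (keep_order) f <- sort(f)        # restore the original level order
    structure(match(codes, f), levels = lvs[f], class = "factor")
}

lvs   <- c("A.a", "B.a", "A.b", "B.b", "A.c", "B.c")
codes <- c(6L, 1L, NA, 3L, 1L)

by_appearance <- drop_unused(codes, lvs, keep_order = FALSE)
by_level      <- drop_unused(codes, lvs, keep_order = TRUE)

levels(by_appearance)
# [1] "B.c" "A.a" "A.b"
levels(by_level)
# [1] "A.a" "A.b" "B.c"

# Both variants represent the same values, NA included.
as.character(by_level)
# [1] "B.c" "A.a" NA    "A.b" "A.a"
```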
>
> Petr
>
> On 10 Aug 2006 at 16:32, Petr Pikal wrote:
>
> From: "Petr Pikal" <petr.pikal at precheza.cz>
> To: r-help at stat.math.ethz.ch
> Date sent: Thu, 10 Aug 2006 16:32:54 +0200
> Priority: normal
> Subject: [R] bug in interaction order when using drop?
>
> > Hallo all
> >
> > > version
> >                _
> > platform       i386-pc-mingw32
> > arch           i386
> > os             mingw32
> > system         i386, mingw32
> > status         beta
> > major          2
> > minor          3.1
> > year           2006
> > month          05
> > day            23
> > svn rev        38179
> > language       R
> > version.string Version 2.3.1 beta (2006-05-23 r38179)
> > >
> >
> > When I use interaction(...) without the drop=T argument, I get a
> > neatly organized factor with the "protiproud" and "souproud" levels
> > grouped together.
> >
> > > levels(interaction(vykon, teplota, proudeni))
> >  [1] "3.750.protiproud"  "12.750.protiproud" "3.775.protiproud"
> >  [4] "12.775.protiproud" "3.800.protiproud"  "12.800.protiproud"
> >  [7] "3.825.protiproud"  "12.825.protiproud" "3.850.protiproud"
> > [10] "12.850.protiproud" "3.750.souproud"    "12.750.souproud"
> > [13] "3.775.souproud"    "12.775.souproud"   "3.800.souproud"
> > [16] "12.800.souproud"   "3.825.souproud"    "12.825.souproud"
> > [19] "3.850.souproud"    "12.850.souproud"
> >
> > However when I use
> >
> > > levels(interaction(vykon, teplota, proudeni, drop=T))
> > [1] "3.775.protiproud" "3.800.souproud" "3.750.souproud"
> > "12.850.souproud" "12.825.protiproud"
> >
> > everything is out of order. I know I can reorder any factor as I
> > wish, but it would be good to have the levels ordered the same way as
> > without drop.
> >
> > It all comes from the call to unique in
> >
> > if (drop) {
> >     f <- unique(ans[!is.na(ans)])
> >     ans <- match(ans, f)
> >     lvs <- lvs[f]
> > }
> >
> > Maybe it can be modified:
> >
> > if (drop) {
> >     f <- unique(ans[!is.na(ans)])
> >     ord <- order(f)
> >     ans <- match(ans, f)
> >     lvs <- lvs[f[ord]]
> > }
> >
> > which seems to work, but I am not sure whether it causes problems
> > when there are NA values in the data.
> >
> > Here is my data frame.
> > Thank you
> >
> > Petr Pikal
> >
> > > dump("df", file=stdout())
> > df <-
> > structure(list(proudeni = structure(as.integer(c(1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> > 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
> > 1, 1, 1)), .Label = c("protiproud", "souproud"), class = "factor"),
> > vykon = as.integer(c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
> > 12, 12, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 12, 12, 12,
> > 12, 12, 12, 12, 12, 12, 12, 12, 12, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 12, 12, 12, 12, 12, 12,
> > 12, 12, 12, 12, 12, 12)), teplota = as.integer(c(775, 775,
> > 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775,
> > 775, 775, 775, 775, 800, 800, 800, 800, 800, 800, 800, 800,
> > 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 750, 850,
> > 850, 850, 850, 850, 850, 825, 825, 825, 825, 825, 825, 775,
> > 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775,
> > 775, 775, 775, 775, 775, 800, 800, 800, 800, 800, 800, 800,
> > 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 750,
> > 850, 850, 850, 850, 850, 850, 825, 825, 825, 825, 825, 825,
> > 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775,
> > 775, 775, 775, 775, 775, 775, 800, 800, 800, 800, 800, 800,
> > 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800,
> > 750, 850, 850, 850, 850, 850, 850, 825, 825, 825, 825, 825,
> > 825, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775,
> > 775, 775, 775, 775, 775, 775, 775, 800, 800, 800, 800, 800,
> > 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800,
> > 800, 750, 850, 850, 850, 850, 850, 850, 825, 825, 825, 825,
> > 825, 825))), .Names = c("proudeni", "vykon", "teplota"),
> > row.names = c("1",
> > "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> > "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
> > "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> > "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
> > "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
> > "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
> > "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
> > "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
> > "91", "92", "93", "94", "95", "96", "97", "98", "99", "100", "101",
> > "102", "103", "104", "105", "106", "107", "108", "109", "110", "111",
> > "112", "113", "114", "115", "116", "117", "118", "119", "120", "121",
> > "122", "123", "124", "125", "126", "127", "128", "129", "130", "131",
> > "132", "133", "134", "135", "136", "137", "138", "139", "140", "141",
> > "142", "143", "144", "145", "146", "147", "148", "149", "150", "151",
> > "152", "153", "154", "155", "156", "157", "158", "159", "160", "161",
> > "162", "163", "164", "165", "166", "167", "168", "169", "170", "171",
> > "172", "173", "174", "175", "176", "177", "178", "179", "180", "181",
> > "182", "183", "184", "185", "186", "187", "188", "189", "190", "191",
> > "192", "193", "194", "195", "196"), class = "data.frame")
> >
> > Petr Pikal
> > petr.pikal at precheza.cz
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html and provide commented,
> > minimal, self-contained, reproducible code.
>
> Petr Pikal
> petr.pikal at precheza.cz
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595