[R-sig-DB] SQLite problem with random()

Thierry Onkelinx th|erry@onke||nx @end|ng |rom |nbo@be
Sun Oct 25 11:56:47 CET 2015


Dear all,

It seems that the sorting goes wrong on columns filled with the result of
the random() function in SQLite. See the example below, run on R version
3.2.2 (2015-08-14), Platform: i386-w64-mingw32/i386 (32-bit).

Replacing random() with random() % .Machine$max.integer solves the problem.
So I expect that the problem is that the integers generated by random() are
larger than .Machine$max.integer. But I don't see why that would affect the
order. Isn't the order determined in SQLite?

Best regards,

library(RSQLite)
filename <- tempfile(fileext = ".db")
conn <- dbConnect(SQLite(), filename)
sql <- c("CREATE TABLE Base (BaseValue);",
"INSERT INTO Base (BaseValue) VALUES (1);",
"INSERT INTO Base (BaseValue) VALUES (2);",
"INSERT INTO Base (BaseValue) VALUES (3);",
"INSERT INTO Base (BaseValue) VALUES (4);")
junk <- lapply(sql, dbGetQuery, conn = conn)
dbReadTable(conn, "Base")
sql <- "CREATE TABLE GroupRanking AS
SELECT
 x1.BaseValue AS Level,
 x2.BaseValue AS Parent,
 x3.BaseValue AS Child,
 random() AS Ranking
FROM
  Base AS x1
CROSS JOIN
  Base AS x2
CROSS JOIN
  Base AS x3;
"
dbGetQuery(conn, sql)
sql <- "
    SELECT
      x.Level,
      x.Parent,
      x.Ranking,
      y.minr,
      y.maxr
    FROM
      GroupRanking AS x
    INNER JOIN
    (
      SELECT
        Level,
        Parent,
        min(Ranking) AS minr,
        max(Ranking) AS maxr
      FROM
        GroupRanking
      GROUP BY
        Level, Parent
    ) AS y
    ON
      x.Level = y.Level AND x.Parent = y.Parent
    ORDER BY
      x.Level, x.Parent, x.Ranking;
"
dbGetQuery(conn, sql)

sql <- paste("CREATE TABLE GroupRanking AS
SELECT
 x1.BaseValue AS Level,
 x2.BaseValue AS Parent,
 x3.BaseValue AS Child,
 random() %", .Machine$integer.max, " AS Ranking
FROM
  Base AS x1
CROSS JOIN
  Base AS x2
CROSS JOIN
  Base AS x3;
")
dbGetQuery(conn, sql)


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

	[[alternative HTML version deleted]]




More information about the R-sig-DB mailing list