[R] Sampling from a Postgres database
Joe Conway
mail at joeconway.com
Fri Jan 15 18:07:13 CET 2010
On 01/15/2010 01:49 AM, Bart Joosen wrote:
>
> One way could be to first select only the unique ID's, sample this and then
> select only the relevant records:
>
> strQuery <- "SELECT ID FROM tblFoo;"
> IDs <- sqlQuery(channel, strQuery)
> sample.IDs <- sample(IDs[[1]], 10)  # sample the ID column, not the data frame
> strQuery <- paste0("SELECT ID FROM tblFoo WHERE ID IN (", paste(sample.IDs, collapse = ", "), ");")
> IDs <- sqlQuery(channel, strQuery)
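It is better to use the built-in random() function in Postgres, so that
the sampling happens on the server and only the sampled rows come back.
Each row is kept independently with the given probability, so the sample
size is approximate rather than exact: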
# select count(*) from visits;
  count
---------
 4846604
(1 row)

# select count(*) from visits where random() < 0.005;
 count
-------
 24391
(1 row)
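
To use this from R, something like the following might do. It is only a
sketch: sample_query() is a hypothetical helper, not part of RODBC or
Postgres, and the sqlQuery() call is left commented out because it assumes
an open connection named channel, as in the quoted code. The last few lines
check that the count above is in line with a 0.5% sample: the expected size
is about 24,233 rows with a standard deviation of about 155, so 24,391 is
roughly one standard deviation above the mean.

```r
# Hypothetical helper: build a query that keeps each row of `table`
# independently with probability `frac` (the sample size is approximate).
sample_query <- function(table, frac) {
  stopifnot(is.numeric(frac), length(frac) == 1, frac > 0, frac <= 1)
  sprintf("SELECT * FROM %s WHERE random() < %s;",
          table, format(frac, scientific = FALSE))
}

strQuery <- sample_query("visits", 0.005)
print(strQuery)

# With an open RODBC connection `channel` (an assumption here):
# visits.sample <- sqlQuery(channel, strQuery)

# Sanity check on the counts shown above: the sample size is
# Binomial(n, p), with mean n * p and sd sqrt(n * p * (1 - p)).
n <- 4846604
p <- 0.005
expected <- n * p
std.dev  <- sqrt(n * p * (1 - p))
z <- (24391 - expected) / std.dev
print(c(expected = expected, sd = std.dev, z = z))
```

If exactly 10 rows are wanted, as in the original question, a query ending
in "ORDER BY random() LIMIT 10" gives a fixed sample size, at the cost of
sorting the whole table.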
HTH,
Joe