[R] How to view un-sampled data from a randomly sampled dataset

peter dalgaard pdalgd at gmail.com
Wed Oct 23 22:47:23 CEST 2013


On Oct 23, 2013, at 21:50 , William Dunlap wrote:

>> s <- sample(1:nrow(data), 40, replace=FALSE)
>> y <- data[s,]
>> x <- data[-s,]
> 
> If you don't know the size of the sample and it might be 0 then
> you have to be a bit more wordy:
>    x <- data[setdiff(seq_len(nrow(data)), s), ]
> or the uglier
>   x <- if (length(s) > 0) x else x[-s,]

Yes, I took the liberty of assuming that 40 was not 0... (Your "ugly" example seems to have a few problems, though: it tests the condition the wrong way round and indexes x instead of data. Surely you mean: x <- if (length(s)) data[-s,] else data )
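
To make the empty-sample pitfall concrete, here is a minimal sketch. The data frame below is a made-up stand-in for the poster's 150-row dataset (its columns are an assumption, not from the thread). The point is that -integer(0) is still integer(0), so a negative index built from an empty sample selects no rows instead of all of them:

```r
# Hypothetical stand-in for the poster's 150-row data frame
data <- data.frame(id = 1:150, value = rnorm(150))

# An empty sample: s is integer(0)
s <- sample(seq_len(nrow(data)), 0)

# Negating an empty index leaves it empty, so no rows are returned
bad <- data[-s, ]

# setdiff() keeps every row index that is not in s
ok1 <- data[setdiff(seq_len(nrow(data)), s), ]

# The corrected conditional form
ok2 <- if (length(s)) data[-s, ] else data

nrow(bad)   # 0
nrow(ok1)   # 150
nrow(ok2)   # 150
```

With a non-empty s all three forms give the same rows, so the guard only matters when the sample size is computed and might be zero.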

There's also the option of a logical index:

N <- nrow(data)
n <- 40
ix <- sample(rep(c(TRUE,FALSE), c(n, N-n)))
y <- data[ix,]
x <- data[!ix,]
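
A quick check of the logical-index version, again on a made-up data frame (the id column and the seed are assumptions for illustration only). It should show that the two pieces are disjoint and together cover every row, and that n = 0 needs no special handling:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical stand-in for the poster's 150-row data frame
data <- data.frame(id = 1:150)

N <- nrow(data)
n <- 40

# n TRUEs and N - n FALSEs, shuffled into random positions
ix <- sample(rep(c(TRUE, FALSE), c(n, N - n)))

y <- data[ix, , drop = FALSE]    # sampled rows
x <- data[!ix, , drop = FALSE]   # leftover rows

nrow(y)                          # 40
nrow(x)                          # 110
length(intersect(y$id, x$id))    # 0

# With n = 0 the index is all FALSE, so !ix0 keeps every row
ix0 <- sample(rep(c(TRUE, FALSE), c(0, N)))
nrow(data[!ix0, , drop = FALSE]) # 150
```

One difference from the integer-index version: y comes out in the original row order rather than in the order the rows were drawn.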


> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of peter dalgaard
>> Sent: Wednesday, October 23, 2013 12:37 PM
>> To: erinu
>> Cc: r-help at r-project.org
>> Subject: Re: [R] How to view un-sampled data from a randomly sampled dataset
>> 
>> 
>> On Oct 23, 2013, at 20:13 , erinu wrote:
>> 
>>> Hi there-
>>> 
>>> I have a 150 row dataset (data). I create "y" a randomly sampled (without
>>> replacement) set number of observations (40):
>>> 
>>> y<-data[sample(1:nrow(data),40,replace=FALSE),]
>>> 
>>> I would like to make a new variable "x" that contains the leftover
>>> non-sampled 110 observations.  I am sure there is a fairly easy way to do
>>> this.
>>> 
>>> Any help would be greatly appreciated.
>>> 
>>> THANKS!
>>> 
>> 
>> Just hold on to the indices:
>> 
>> s <- sample(1:nrow(data), 40, replace=FALSE)
>> y <- data[s,]
>> x <- data[-s,]
>> 
>> -pd
>> 
>> 
>>> 
>>> 
>>> --
>>> View this message in context: http://r.789695.n4.nabble.com/How-to-view-un-sampled-data-from-a-randomly-sampled-dataset-tp4678887.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


