[R] HELP!! how to remove 10% of data randomly in R
David Winsemius
dwinsemius at comcast.net
Wed Oct 31 16:49:00 CET 2012
On Oct 31, 2012, at 5:42 AM, Eugenie wrote:
> tDate tTime O3 No2 Temp Sun Wspeed Wdirect Hum Indicator
> 1 19980101 2400 0.065 0.036 31.4 765 9.9 351 NA 1
> 2 19980102 2400 0.053 0.025 31.8 624 7.7 351 NA 1
> 3 19980103 2400 0.027 0.033 31.5 852 8.8 331 NA 2
> 4 19980104 2400 0.034 0.023 30.7 679 7.0 338 NA 2
> 5 19980105 2400 0.019 0.016 28.1 376 9.6 354 NA 1
> 6 19980106 2400 0.021 0.018 29.9 603 9.3 356 NA 1
> 7 19980107 2400 0.026 0.047 31.2 857 10.7 336 NA 1
> 8 19980108 2400 0.024 0.014 31.1 635 7.8 330 NA 1
> 9 19980109 2400 0.058 0.033 32.5 742 10.7 334 NA 1
> 10 19980110 2400 0.026 0.032 33.9 923 10.6 347 NA 2
> 11 19980111 2400 0.064 0.034 32.5 751 6.3 355 NA 2
> 12 19980112 2400 0.066 0.034 33.3 697 8.5 319 NA 1
> 13 19980113 2400 0.026 0.030 33.4 992 12.5 341 NA 1
> 14 19980114 2400 0.101 0.028 33.8 705 8.7 349 NA 1
> 15 19980115 2400 0.069 0.030 33.3 718 11.4 348 NA 1
> 16 19980116 2400 0.054 0.026 33.4 639 10.9 354 NA 1
> 17 19980117 2400 0.090 0.039 33.1 653 13.2 342 NA 2
> 18 19980118 2400 0.048 0.017 33.2 825 10.8 323 NA 2
> 19 19980119 2400 0.038 0.027 33.7 984 10.3 353 NA 1
> 20 19980120 2400 0.026 0.032 34.2 994 15.0 357 NA 1
> 21 19980121 2400 0.065 0.044 33.8 999 17.5 343 NA 1
> 22 19980122 2400 0.046 0.024 33.5 931 10.1 332 NA 1
> 23 19980123 2400 0.050 0.041 33.9 881 11.3 353 NA 1
> 24 19980124 2400 0.036 0.027 33.8 877 9.1 328 NA 2
> 25 19980125 2400 0.043 0.021 33.2 777 10.5 340 NA 2
> 26 19980126 2400 0.029 0.016 33.1 999 14.1 341 NA 1
> 27 19980127 2400 0.033 0.030 33.9 943 12.9 344 NA 1
> 28 19980128 2400 0.040 0.022 33.7 805 12.6 354 NA 1
> 29 19980129 2400 0.029 0.015 30.2 512 7.4 356 NA 1
> 30 19980130 2400 0.027 0.013 31.7 656 13.9 349 NA 1
>
>
>
> If given data like this, how can I remove the data in O3, NO2, sun, temp, wspeed
> at random? (That is, create missing values in these rows and columns.)
It is not clear whether those entries are to be set to NA or whether you want a data frame with fewer rows. Perhaps:
is.na(dfrm[ sample(NROW(dfrm), round(0.1*NROW(dfrm))), c('O3','No2','Temp','Sun','Wspeed') ]) <- TRUE
Note that the column names in your data (O3, No2, Temp, Sun, Wspeed) and the targets you listed (O3, NO2, sun, temp, wspeed) are not spelled the same. R matches names case-sensitively, so that is a further problem with your problem specification.
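Since the question can be read two ways, here is a minimal sketch of both, assuming "10%" means 10% of the rows. The data frame holds only the first ten rows of the posted data; the names dfrm, targets, rows, with_na and reduced are illustrative, and the seed is arbitrary.

```r
# First ten rows of the posted data (tTime, Wdirect, Hum, Indicator omitted).
dfrm <- data.frame(
  tDate  = 19980101:19980110,
  O3     = c(0.065, 0.053, 0.027, 0.034, 0.019, 0.021, 0.026, 0.024, 0.058, 0.026),
  No2    = c(0.036, 0.025, 0.033, 0.023, 0.016, 0.018, 0.047, 0.014, 0.033, 0.032),
  Temp   = c(31.4, 31.8, 31.5, 30.7, 28.1, 29.9, 31.2, 31.1, 32.5, 33.9),
  Sun    = c(765, 624, 852, 679, 376, 603, 857, 635, 742, 923),
  Wspeed = c(9.9, 7.7, 8.8, 7.0, 9.6, 9.3, 10.7, 7.8, 10.7, 10.6)
)

set.seed(42)  # arbitrary seed, only so the draw is reproducible

# Column names spelled exactly as they appear in the data.
targets <- c("O3", "No2", "Temp", "Sun", "Wspeed")

# Sample 10% of the row indices without replacement.
n_pick <- round(0.10 * nrow(dfrm))
rows   <- sample(nrow(dfrm), n_pick)

# Reading 1: keep every row, set the target columns to NA in the sampled rows.
with_na <- dfrm
with_na[rows, targets] <- NA

# Reading 2: drop the sampled rows entirely.
# setdiff() is used instead of dfrm[-rows, ] because negative indexing with
# an empty 'rows' would return zero rows rather than all of them.
reduced <- dfrm[setdiff(seq_len(nrow(dfrm)), rows), ]
```

If the intent is instead to blank 10% of the individual cells rather than whole rows, the sampling would have to be done per column or over cell positions, which this sketch does not cover.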
--
David Winsemius, MD
Alameda, CA, USA