[R] Subset using grepl
Kang Min
ngokangmin at gmail.com
Thu Mar 17 06:40:05 CET 2011
Ok thank you!
On Mar 17, 12:12 pm, <Bill.Venab... at csiro.au> wrote:
> subset(data, grepl("[1-5]", section) & !grepl("0", section))
>
> BTW
>
> grepl("[1:5]", section)
>
> does work. It checks for the characters 1, :, or 5.
>
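The difference between the two character classes can be checked on a small vector. This is only a sketch: the `section` values below are made-up labels in the style of the posted data, and the anchored pattern `^[1-5][[:alpha:]]$` is an assumption about the label format (one digit followed by one letter), not something proposed in the thread.

```r
# Hypothetical section labels in the same style as the posted data
section <- c("10a", "1b", "2c", "5d", "9h", "3a")

# "[1:5]" is a character class holding '1', ':' and '5', not a range
in_colon_class <- grepl("[1:5]", section)

# "[1-5]" is the range 1 to 5, but it still matches the '1' in "10a"
in_range <- grepl("[1-5]", section)

# The approach from the reply: some digit in 1 to 5 and no '0' anywhere
no_zero <- grepl("[1-5]", section) & !grepl("0", section)

# Assumed alternative: require the whole label to be one digit plus one letter
anchored <- grepl("^[1-5][[:alpha:]]$", section)

print(section[in_colon_class])
print(section[in_range])
print(section[no_zero])
print(section[anchored])
```

On labels of this shape the last two filters select the same elements. The anchored form would also reject a label such as "15a", which the "no 0" test lets through.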
> -----Original Message-----
> From: r-help-boun... at r-project.org [mailto:r-help-boun... at r-project.org] On Behalf Of Kang Min
> Sent: Thursday, 17 March 2011 1:29 PM
> To: r-h... at r-project.org
> Subject: Re: [R] Subset using grepl
>
> I have a new question, also regarding grepl.
> I would like to subset rows with numbers from 1 to 5 in the section
> column, so I used
>
> subset(data, grepl("[1:5]", section))
>
> but this gave me rows with 10, 1 and 5. (Why is this so?) So I tried
>
> subset(data, grepl("[1,2,3,4,5]", section))
>
> which worked. But I also got 10 in the dataframe as well. How can I
> exclude 10?
>
> >data
> section piece LTc1 LTc2
> 10a 10-1 0.729095368 NA
> 10a 10-2 59.53292189 13.95612454
> 10h 10-3 0.213756661 NA
> 10i 10-4 NA NA
> 10b NA NA NA
> 10c NA NA NA
> 10d NA NA NA
> 10e NA NA NA
> 10f NA NA NA
> 10g NA NA NA
> 10h NA NA NA
> 10j NA NA NA
> 1b 1-1 NA NA
> 1d 1-2 29.37971303 12.79688209
> 1g 1-6 NA 7.607911603
> 1h 1-3 0.298059164 27.09896941
> 1i 1-4 25.11261782 19.87149991
> 1j 1-5 36.66969601 42.28507923
> 1a NA NA NA
> 1c NA NA NA
> 1e NA NA NA
> 1f NA NA NA
> 2a 2-1 15.98582117 10.58696146
> 2a 2-2 0.557308341 41.52650718
> 2c 2-3 14.99499024 10.0896793
> 2e 2-4 148.4530636 56.45493191
> 2f 2-5 25.27493551 12.98808577
> 2i 2-6 20.32857108 22.76075728
> 2b NA NA NA
> 2d NA NA NA
> 2g NA NA NA
> 2h NA NA NA
> 2j NA NA NA
> 3a 3-1 13.36602867 11.47541439
> 3a 3-7 NA 111.9007822
> 3c 3-2 10.57406701 5.587777567
> 3d 3-3 11.73240891 10.73833651
> 3e 3-8 NA 14.54214165
> 3h 3-4 21.56072089 21.59748884
> 3i 3-5 15.42846935 16.62715409
> 3i 3-6 129.7367193 121.8206045
> 3b NA NA NA
> 3f NA NA NA
> 3g NA NA NA
> 3j NA NA NA
> 5b 5-1 18.61733498 18.13545293
> 5d 5-3 NA 7.81018526
> 5f 5-2 12.5158971 14.37884817
> 5a NA NA NA
> 5c NA NA NA
> 5e NA NA NA
> 5g NA NA NA
> 5h NA NA NA
> 5i NA NA NA
> 5j NA NA NA
> 9h 9-1 NA NA
> 9a NA NA NA
> 9b NA NA NA
> 9c NA NA NA
> 9d NA NA NA
> 9e NA NA NA
> 9f NA NA NA
> 9g NA NA NA
> 9i NA NA NA
> 9j NA NA NA
> 8a 8-1 14.29712852 12.83178905
> 8e 8-2 23.46594953 9.097377872
> 8f 8-3 NA NA
> 8f 8-4 22.20001584 20.39646766
> 8h 8-5 50.54497551 56.93752065
> 8b NA NA NA
> 8c NA NA NA
> 8d NA NA NA
> 8g NA NA NA
> 8i NA NA NA
> 8j NA NA NA
> 4b 4-1 40.83468857 35.99017683
> 4f 4-3 NA 182.8060799
> 4f 4-4 NA 36.81401955
> 4h 4-2 17.13625062 NA
> 4a NA NA NA
> 4c NA NA NA
> 4d NA NA NA
> 4e NA NA NA
> 4g NA NA NA
> 4i NA NA NA
> 4j NA NA NA
> 7b 7-1 8.217605633 8.565035083
> 7a NA NA NA
> 7c NA NA NA
> 7d NA NA NA
> 7e NA NA NA
> 7f NA NA NA
> 7g NA NA NA
> 7h NA NA NA
> 7i NA NA NA
> 7j NA NA NA
> 6b 6-6 NA 11.57887288
> 6c 6-1 27.32608984 17.17778959
> 6c 6-2 78.21988783 61.80558768
> 6d 6-7 NA 3.599685625
> 6f 6-3 26.78838281 23.33258286
> 6h 6-4 NA NA
> 6h 6-5 NA NA
> 6a NA NA NA
> 6e NA NA NA
> 6g NA NA NA
> 6i NA NA NA
> 6j NA NA NA
>
> On Jan 29, 10:43 pm, Prof Brian Ripley <rip... at stats.ox.ac.uk> wrote:
> > On Sat, 29 Jan 2011, Kang Min wrote:
> > > Thanks Prof Ripley, the condition worked!
> > > Btw I tried to search ?repl but I don't have documentation for it. Is
> > > it in a non-basic package?
>
> > I meant grepl: the edit messed up (but not on my screen, as sometimes
> > happens when working remotely). The point is that 'perl=TRUE'
> > guarantees that [A-J] is interpreted in ASCII order.
>
> > > On Jan 29, 6:54 pm, Prof Brian Ripley <rip... at stats.ox.ac.uk> wrote:
> > >> The grep condition is "[A-J]"
>
> > >> BTW, there are lots of unnecessary steps here, including using
> > >> cbind() and subset():
>
> > >> x <- rep(LETTERS[1:20],3)
> > >> y <- rep(1:3, 20)
> > >> z <- paste(x,y, sep="")
> > >> random.data <- rnorm(60)
> > >> data <- data.frame(z, random.data)
> > >> data[grepl("[A-J]", z), ]
>
> > >> Now (for the paranoid and not needed in this example) in general the
> > >> effect of "[A-Z]" depends on the locale, so you could write out
> > >> "[ABCDEFGHIJ]" or create it by
>
> > >> cond <- paste("[", paste(LETTERS[1:10], collapse=""), "]", sep="")
>
> > >> Or use repl("[A-J]", z, perl=TRUE).
>
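As a rough check of the locale point above, the explicitly built class and the `perl=TRUE` range can be compared on the example data. This is a sketch only; the names `explicit` and `pcre` are introduced here for illustration and do not appear in the thread.

```r
# Rebuild the example vector from the thread
x <- rep(LETTERS[1:20], 3)
y <- rep(1:3, 20)
z <- paste(x, y, sep = "")

# Write the class out letter by letter, so no range is involved
cond <- paste("[", paste(LETTERS[1:10], collapse = ""), "]", sep = "")

# Match with the explicit class, and with the range under PCRE
explicit <- grepl(cond, z)
pcre <- grepl("[A-J]", z, perl = TRUE)

print(cond)
print(identical(explicit, pcre))
print(sum(explicit))
```

Both forms avoid relying on how the current locale orders letters inside a range.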
> > >> On Sat, 29 Jan 2011, Kang Min wrote:
> > >>> Hi all,
>
> > > >>> I would like to subset a data frame by using part of the level name.
>
> > >>> x <- rep(LETTERS[1:20],3)
> > >>> y <- rep(1:3, 20)
> > >>> z <- paste(x,y, sep="")
> > >>> random.data <- rnorm(60)
> > >>> data <- as.data.frame(cbind(z, random.data))
>
> > >>> I need rows that contain the letters A to J, so I tried:
>
> > > >>> subset(data, grepl(LETTERS[1:10], z)) # got only rows with A
> > > >>> subset(data, z %in% LETTERS[1:10]) # got no rows
>
> > >>> I think I'm getting close to the solution but need a little bit of
> > >>> help here, thanks in advance.
>
> > >>> Kang Min
>
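The two attempts in the question above can be examined on their own. A minimal sketch, assuming the same `z` as in the question: `%in%` compares whole strings, so "A1" never equals "A", and `grepl()` is documented to use only the first element of a longer pattern vector, which is imitated here by passing `LETTERS[1]` directly. The `substr()` variant is an illustrative alternative, not a suggestion made in the thread.

```r
# Rebuild the example vector from the question
x <- rep(LETTERS[1:20], 3)
y <- rep(1:3, 20)
z <- paste(x, y, sep = "")

# %in% tests whole strings, and no element of z equals a bare letter
whole_match <- z %in% LETTERS[1:10]

# Comparing only the first character makes %in% usable here
first_letter <- substr(z, 1, 1) %in% LETTERS[1:10]

# A pattern made of the first letter alone gives "only rows with A"
only_a <- grepl(LETTERS[1], z)

print(sum(whole_match))
print(sum(first_letter))
print(unique(z[only_a]))
```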
> > >>> ______________________________________________
> > >>> R-h... at r-project.org mailing list
> > > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > > >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
>
> > >> --
> > > >> Brian D. Ripley,                  rip... at stats.ox.ac.uk
> > > >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > > >> University of Oxford,             Tel:  +44 1865 272861 (self)
> > > >> 1 South Parks Road,                     +44 1865 272866 (PA)
> > > >> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> > --
> > Brian D. Ripley, rip... at stats.ox.ac.uk
> > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford, Tel: +44 1865 272861 (self)
> > 1 South Parks Road, +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK Fax: +44 1865 272595
>