[R] Extract student ID that match certain criteria

Ulrik Stervbo ulrik.stervbo at gmail.com
Mon Mar 13 09:06:20 CET 2017


Hi Roslinazairimah,

As Bert suggested, you should get acquainted with regular expressions. They
can be confusing at times, but they pay off in the long run.

In your case, the pattern "^[A-Z]{2}14.*" might work: it matches any ID that
starts with two uppercase letters followed by 14.
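
As a rough sketch of how the pattern could be used (the variable names ids,
matched, df and subset_df are made up for illustration, and the IDs are the
sample values from your question):

```r
# Sample IDs copied from the question; `ids` is a hypothetical variable name.
ids <- c("AA14004", "AB15035", "CB14024", "PA14009", "PA14009")

# Two uppercase letters at the start of the string, then "14".
# The trailing ".*" does not change which elements grep() selects.
pattern <- "^[A-Z]{2}14.*"

# value = TRUE returns the matching strings instead of their positions.
matched <- grep(pattern, ids, value = TRUE)
print(matched)

# grepl() returns a logical vector, which is handy for subsetting a data frame.
df <- data.frame(id = ids, stringsAsFactors = FALSE)
subset_df <- df[grepl(pattern, df$id), , drop = FALSE]
print(subset_df)
```

Note that AA14004 matches as well, since the pattern accepts any two uppercase
letters. If only certain prefixes are wanted, an alternation such as
"^(AB|CB|PA)14" would be one option.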

Best,
Ulrik

On Mon, 13 Mar 2017 at 06:20 roslinazairimah zakaria <roslinaump at gmail.com>
wrote:

> Another question,
>
> How do I extract IDs based on their third and fourth characters?
>
> I have, for example, AA14004, AB15035, CB14024, PA14009, PA14009, etc.
>
> I would like to extract the IDs of the form AB14..., CB14..., PA14...
>
> On Mon, Mar 13, 2017 at 12:37 PM, roslinazairimah zakaria <
> roslinaump at gmail.com> wrote:
>
> > Hi Bert,
> >
> > Thank you so much for your help.  However, I am not really sure what the
> > y values are for.  Can we do without them?
> >
> > x <- as.character(FKASA$STUDENT_ID)
> > y <- c(1,786)
> > My.Data <- data.frame (x,y)
> >
> > My.Data[grep("^AA14", My.Data$x), ]
> >
> > I got the following data:
> >
> >           x   y
> > 1   AA14068   1
> > 7   AA14090   1
> > 11  AA14099   1
> > 14  AA14012 786
> > 15  AA14039   1
> > 22  AA14251 786
> >
> > On Mon, Mar 13, 2017 at 11:51 AM, Bert Gunter <bgunter.4567 at gmail.com>
> > wrote:
> >
> >> 1. Your code is incorrect. All entries are character strings and must be
> >> quoted.
> >>
> >> 2. See ?grep  and note in particular (in the "Value" section):
> >>
> >> "grep(value = TRUE) returns a character vector containing the selected
> >> elements of x (after coercion, preserving names but no other
> >> attributes)."
> >>
> >>
> >> 3. While the fixed = TRUE option will work here, you may wish to learn
> >> about "regular expressions", which can come in very handy for
> >> character string manipulation tasks. ?regex in R has a terse, but I
> >> have found comprehensible, discussion. There are many good gentler
> >> tutorials on the web, also.
> >>
> >>
> >> Cheers,
> >> Bert
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Sun, Mar 12, 2017 at 8:32 PM, roslinazairimah zakaria
> >> <roslinaump at gmail.com> wrote:
> >> > Dear r-users,
> >> >
> >> > I have this list of student IDs,
> >> >
> >> > dt <- c(AA14068, AA13194, AE11054, AA12251, AA13228, AA13286, AA14090,
> >> > AA13256, AA13260, AA13291, AA14099, AA15071, AA13143, AA14012, AA14039,
> >> > AA15018, AA13234, AA13149, AA13282, AA13218)
> >> >
> >> > and I would like to extract only the students whose ID starts with AA14.
> >> >
> >> > I searched and tried substr, subset and select, but they all failed.
> >> >
> >> >  substr(FKASA$STUDENT_ID, 2, nchar(string1))
> >> > Error in nchar(string1) : 'nchar()' requires a character vector
> >> >> subset(FKASA, STUDENT_ID=="AA14" )
> >> >  [1] FAC_CODE    FACULTY     STUDENT_ID  NAME        PROGRAM     KURSUS      CGPA        ACT_SS      ACT_VAL     ACT_CS      ACT_LED     ACT_PS      ACT_IM
> >> > [14] ACT_ENT     ACT_CRE     ACT_UNI     ACT_VOL...
> >> >
> >> > Thank you so much for your help.
> >> >
> >> > How do I do it?
> >> > --
> >> > *Roslinazairimah Zakaria*
> >> > *Tel: +609-5492370; Fax. No. +609-5492766*
> >> >
> >> > *Email: roslinazairimah at ump.edu.my; roslinaump at gmail.com*
> >> > Faculty of Industrial Sciences & Technology
> >> > University Malaysia Pahang
> >> > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
> >> >
> >> >         [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> >
>
>
>
>
