[R] substract start from the end of the vector

arnaud Gaboury arnaud.gaboury at gmail.com
Fri Apr 23 18:28:35 CEST 2010


TY Steve, using a regular expression does the job nicely. I now need to fully
understand your code and learn more about what a regular expression is. Any
good reference is welcome.
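
As a starting point for understanding the code, the pattern from Steve's reply (quoted below) can be split into its parts. This is only a sketch: the variable names `pattern`, `x` and `cleaned` are made up for illustration, and the two input strings are copied from the DESCRIPTION column in the original question.

```r
# The pattern from the quoted reply, assembled piece by piece (base R only).
pattern <- paste0(
  " *",                # zero or more spaces before the date
  "(\\d+/\\d+/\\d+)",  # digits/digits/digits, e.g. 09/07/10
  " *",                # zero or more spaces after the date
  "$"                  # anchor: the match must end at the end of the string
)

# Sample values taken from the DESCRIPTION column (note the trailing space).
x <- c("PRM HGH GD ALUMINIUM USD 09/07/10 ", "PRIMARY NICKEL USD 04/06/10 ")

# Replace the matched date (and the spaces around it) with nothing.
cleaned <- gsub(pattern, "", x)
print(cleaned)
```

Because the pattern is anchored with `$`, only a date at the very end of each string is removed; a date in the middle of a description would be left alone.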


> -----Original Message-----
> From: Steve Lianoglou [mailto:mailinglist.honeypot at gmail.com]
> Sent: Friday, April 23, 2010 6:11 PM
> To: arnaud Gaboury
> Cc: r-help at r-project.org
> Subject: Re: [R] substract start from the end of the vector
> 
> Hi,
> 
> On Fri, Apr 23, 2010 at 11:57 AM, arnaud Gaboury
> <arnaud.gaboury at gmail.com> wrote:
> > Dear group,
> >
> > Here is my df :
> >
> > df <-
> > structure(list(DESCRIPTION = c("PRM HGH GD ALUMINIUM USD 09/07/10 ",
> > "PRM HGH GD ALUMINIUM USD 09/07/10 ", "PRIMARY NICKEL USD 04/06/10 "
> > ), CREATED.DATE = structure(c(18361, 18361, 18325), class = "Date"),
> >    QUANITY = c(-1L, 1L, 1L), CLOSING.PRICE = c("2,415.90", "2,415.90",
> >    "25,755.71")), .Names = c("DESCRIPTION", "CREATED.DATE",
> > "QUANITY", "CLOSING.PRICE"), row.names = c(NA, 3L), class = "data.frame")
> >
> >> df
> >                          DESCRIPTION CREATED.DATE QUANITY CLOSING.PRICE
> > 1 PRM HGH GD ALUMINIUM USD 09/07/10    2020-04-09      -1      2,415.90
> > 2 PRM HGH GD ALUMINIUM USD 09/07/10    2020-04-09       1      2,415.90
> > 3       PRIMARY NICKEL USD 04/06/10    2020-03-04       1     25,755.71
> >
> > In the DESCRIPTION column, I want to get rid of the date (09/07/10 ...). I
> > know the function substr(x, start, stop), but in my case I need to count
> > from the end of the string, and I have no idea how to pass this as an
> > argument.
> >
> > TY for any help
> 
> How about using a regular expression:
> 
> R> gsub(" *(\\d+/\\d+/\\d+)$", "", "HGH GD ALUMINIUM USD 09/07/10", perl=TRUE)
> [1] "HGH GD ALUMINIUM USD"
> 
> And to replace your DESCRIPTION column:
> 
> R> df$DESCRIPTION <- gsub(" *(\\d+/\\d+/\\d+) *$", "", df$DESCRIPTION)
> R> df
>                DESCRIPTION CREATED.DATE QUANITY CLOSING.PRICE
> 1 PRM HGH GD ALUMINIUM USD   2020-04-09      -1      2,415.90
> 2 PRM HGH GD ALUMINIUM USD   2020-04-09       1      2,415.90
> 3       PRIMARY NICKEL USD   2020-03-04       1     25,755.71
> 
> The pattern also matches any spaces before and after the date, so the
> trailing whitespace is stripped along with it.
> 
> HTH,
> -steve
> 
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
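
The original question asked how to make substr() count from the end of a string. One possible approach, sketched here, is to compute the positions with nchar(). It rests on an assumption the regular expression does not need: that every date is exactly 8 characters long (dd/mm/yy) and is preceded by a single space. The names `x`, `trimmed`, `without_date` and `date_part` are illustrative only.

```r
# Sample values taken from the DESCRIPTION column (note the trailing space).
x <- c("PRM HGH GD ALUMINIUM USD 09/07/10 ", "PRIMARY NICKEL USD 04/06/10 ")

# Drop trailing spaces first so the date really is the last 8 characters.
trimmed <- sub(" +$", "", x)

# Assumption: the date is always 8 characters and preceded by one space,
# so 9 characters are cut from the end of each string.
n <- nchar(trimmed)
without_date <- substr(trimmed, 1, n - 9)  # everything before " dd/mm/yy"
date_part    <- substr(trimmed, n - 7, n)  # the last 8 characters

print(without_date)
print(date_part)
```

If the dates can vary in length (for example 9/7/10), the fixed offsets above would cut in the wrong place, which is one reason the regular expression is the more robust choice.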


