[R] Regex to stop at first capital letter after sequence
Sarah Goslee
sarah.goslee at gmail.com
Mon Dec 19 23:01:46 CET 2016
Hi,
If your actual data are of the same form as your sample data, why not just:
x <- c("PPA 06 - Promo Vasito", "PPA 05 - Cuentos",
"PPA 04 - Promo vasito", "PPA 03 - Promoción escolar",
"PPA - Saluda a tu pediatra", "PPL - Dia del Pediatra")
sub("^.* - ", "", x)
[1] "Promo Vasito" "Cuentos" "Promo vasito"
[4] "Promoción escolar" "Saluda a tu pediatra" "Dia del Pediatra"
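
For what it's worth, both original attempts seem to fail for reasons that are easy to miss. In the first, `.*` is greedy, so `(PPA.*)` swallows everything up to the last capital letter in the string rather than the first, and the PPL string is never matched at all. In the second, the alternation `(PPA|PPL.*)` reads as "PPA" or "PPL.*", so the PPA strings would need a capital letter immediately after "PPA" and are left unchanged. Below is a minimal sketch of two alternatives, assuming the prefix is always PPA or PPL and that the separator is always " - ". The names `first_sep`, `by_capital` and `extra`, and the string in `extra`, are made up for illustration and are not from the original data.

```r
x <- c("PPA 06 - Promo Vasito", "PPA 05 - Cuentos",
       "PPA 04 - Promo vasito", "PPA 03 - Promoción escolar",
       "PPA - Saluda a tu pediatra", "PPL - Dia del Pediatra")

# Strip only up to the FIRST " - ": [^-]* cannot run past a hyphen,
# so a later " - " inside the title is left alone.
first_sep <- sub("^[^-]* - ", "", x)

# Closer in spirit to the original attempts: drop the PPA/PPL prefix and
# then everything that is not an uppercase letter, which stops the match
# at the first capital letter after the prefix.
by_capital <- sub("^(PPA|PPL)[^A-Z]*", "", x)

# Hypothetical string with a second " - " in the title (not in the data):
extra <- "PPA 07 - Promo - Vasito"
sub("^.* - ", "", extra)     # greedy: "Vasito"
sub("^[^-]* - ", "", extra)  # first separator only: "Promo - Vasito"
```

On the sample data both `first_sep` and `by_capital` give the same six strings as the `sub("^.* - ", "", x)` call above; they only differ from it when a title itself contains " - ". Note that `[A-Z]` ranges can be locale-dependent in R, so `[[:upper:]]` may be the safer choice if the real data contain accented capitals.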
On Mon, Dec 19, 2016 at 4:25 PM, Omar André Gonzáles Díaz
<oma.gonzales at gmail.com> wrote:
> I have the following strings:
>
> [1] "PPA 06 - Promo Vasito" [2] "PPA 05 - Cuentos"
> [3] "PPA 04 - Promo vasito" [4] "PPA 03 - Promoción escolar"
> [5] "PPA - Saluda a tu pediatra" [6] "PPL - Dia del Pediatra"
>
> *Desired result*:
>
> [1] "Promo Vasito" "Cuentos" "Promo vasito"
>
> [4] "Promoción escolar" "Saluda a tu pediatra" "Dia del Pediatra"
>
>
> *First attempt*:
>
> After this line:
>
> mead_nov$`Nombre del anuncio` <- gsub("(PPA.*)([A-Z].*)", "\\2",
> mead_nov$`Nombre del anuncio`)
>
> I get these:
>
> [1] "Vasito" [2] "Cuentos"
> [3] "Promo vasito" [4] "Promoción escolar"
> [5] "Saluda a tu pediatra" [6] "PPL - Dia del Pediatra"
>
>
> *Second attempt:*
>
> mead_nov$`Nombre del anuncio` <- gsub("(PPA|PPL.*)([A-Z].*)", "\\2",
> mead_nov$`Nombre del anuncio`)
>
> I get this:
>
> [1] "PPA 06 - Promo Vasito" [2] "PPA 05 - Cuentos"
> [3] "PPA 04 - Promo vasito" [4] "PPA 03 - Promoción escolar"
> [5] "PPA - Saluda a tu pediatra" [6] "Pediatra"
>
>
> Thank you for your help.
>
--
Sarah Goslee
http://www.functionaldiversity.org