[R] extracting characters from a string

Rui Barradas ruipbarradas at sapo.pt
Wed Jan 23 19:33:25 CET 2013


Hello,

Try the following.

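fun <- function(x, sep = ", "){
	# split every string into one element per author
	s <- unlist(strsplit(x, sep))
	# drop the trailing initials (one or two capitals), so that
	# multi-word last names such as "Van den Hoops" stay intact
	sub(" +[[:upper:]]{1,2}$", "", s)
}

fun(pub)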
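
Since a short explanation was asked for, here is what the individual steps do on a single string. This is only a sketch: the names x, s, first_word and last_name are made up for the illustration. It shows two ways of pulling out the name. Taking the first run of letters is simple but shortens multi-word names, while deleting the trailing initials keeps them whole.

```r
x <- "Brown DK, Santos R, Rome DF, Don Juan X"

# strsplit() returns a list with one element per input string;
# unlist() flattens it into a plain character vector.
s <- unlist(strsplit(x, ", "))
s
# "Brown DK" "Santos R" "Rome DF" "Don Juan X"

# regexpr() locates the first match of the pattern in each element and
# regmatches() extracts the matched text. "[[:alpha:]]*" is a run of
# letters, so the match stops at the first space.
first_word <- regmatches(s, regexpr("[[:alpha:]]*", s))
first_word
# "Brown" "Santos" "Rome" "Don"

# sub() replaces the first match of the pattern with "". The pattern is
# one or more spaces followed by one or two capital letters at the end
# of the string, i.e. the initials.
last_name <- sub(" +[[:upper:]]{1,2}$", "", s)
last_name
# "Brown" "Santos" "Rome" "Don Juan"
```

The second pattern relies on the initials being at most two capitals at the end of each author entry, as described in the original post.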

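To get the layout asked for, with one publication per row and one last name per column, the names can be kept grouped by publication instead of being flattened with unlist(). The following is a possible sketch, not the only way to do it. The names authors, n, padded and res and the column names author1, author2, ... are my own choices, and the separator ", *" and the initials pattern assume the input looks like the example data.

```r
pub <- rbind(pub1 = "Brown DK, Santos R, Rome DF, Don Juan X",
             pub2 = "Benigni D",
             pub3 = "Arstra SD, Van den Hoops DD, lamarque D")

# One character vector of last names per publication:
# split at the commas, then delete the trailing initials.
authors <- lapply(strsplit(pub[, 1], ", *"),
                  function(a) sub(" +[[:upper:]]{1,2}$", "", a))

# Publications have different numbers of authors, so pad the
# shorter vectors with NA up to the longest one.
n <- max(sapply(authors, length))
padded <- do.call(rbind,
                  lapply(authors,
                         function(a) c(a, rep(NA_character_, n - length(a)))))

# One row per publication, one column per author position.
res <- data.frame(padded, stringsAsFactors = FALSE)
names(res) <- paste0("author", seq_len(n))
res
```

lapply() and sapply() still iterate internally, but there is no explicit for loop to write.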

Hope this helps,

Rui Barradas

On 23-01-2013 17:38, Biau David wrote:
> Dear All,
>
> I have a data frame of vectors of publication names such as 'pub':
>
> pub1 <- c('Brown DK, Santos R, Rome DF, Don Juan X')
> pub2 <- c('Benigni D')
> pub3 <- c('Arstra SD, Van den Hoops DD, lamarque D')
>
> pub <- rbind(pub1, pub2, pub3)
>
>
> I would like to construct a data frame containing only the authors' last names, with one last name per column and one publication per row. Basically, I want to get rid of the initials (at most two, always before a comma) and the spaces surrounding each last name. I would like to avoid a loop.
>
> PS: A short explanation of the code that extracts the values from the character string would also be great!
>
>
> David