[R] extracting characters from a string

Rui Barradas ruipbarradas at sapo.pt
Wed Jan 23 19:57:22 CET 2013


Hello,

I've just noticed that my first solution only returns the first run of 
alphabetic characters in each name, such as "Van", not "Van den Hoops".
The following solves that problem. It returns a list with one character 
vector of last names per publication.


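fun2 <- function(x, sep = ", "){
	# one character vector of "Name Initials" entries per publication
	x <- strsplit(x, sep)
	# locate the trailing initials: a space, then letters up to the end
	m <- lapply(x, function(y) gregexpr(" [[:alpha:]]*$", y))
	# invert = TRUE keeps what the pattern did not match, i.e. the names
	res <- lapply(seq_along(x), function(i)
		regmatches(x[[i]], m[[i]], invert = TRUE))
	res <- lapply(res, unlist)
	# drop the empty strings left behind after each removed match
	lapply(res, function(y) y[nchar(y) > 0])
}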
fun2(pub)
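
Since the original question asked for a data frame with one publication per row and one last name per column, here is a minimal sketch of one way to get there from the list that fun2 returns. It repeats the example data and fun2 so that it runs on its own. The padding with NA and the column names author1, author2, ... are my own assumptions, not something the question specified.

```r
# Example data from the original question
pub1 <- c('Brown DK, Santos R, Rome DF, Don Juan X')
pub2 <- c('Benigni D')
pub3 <- c('Arstra SD, Van den Hoops DD, lamarque D')
pub <- rbind(pub1, pub2, pub3)

# fun2 as posted above
fun2 <- function(x, sep = ", "){
	x <- strsplit(x, sep)
	m <- lapply(x, function(y) gregexpr(" [[:alpha:]]*$", y))
	res <- lapply(seq_along(x), function(i)
		regmatches(x[[i]], m[[i]], invert = TRUE))
	res <- lapply(res, unlist)
	lapply(res, function(y) y[nchar(y) > 0])
}

authors <- fun2(pub)

# Publications have different numbers of authors, so pad each
# vector with NA up to the length of the longest one
n <- max(sapply(authors, length))
padded <- lapply(authors, function(a) c(a, rep(NA_character_, n - length(a))))

# Stack the padded vectors as rows and convert to a data frame
out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(out) <- paste0("author", seq_len(n))  # assumed column names
rownames(out) <- rownames(pub)              # pub1, pub2, pub3
out
```

Publications with fewer authors than the longest one get NA in the unused columns. No explicit loop is needed, only lapply/sapply.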


Hope this helps,

Rui Barradas

On 23-01-2013 18:33, Rui Barradas wrote:
> Hello,
>
> Try the following.
>
> fun <- function(x, sep = ", "){
>      s <- unlist(strsplit(x, sep))
>      regmatches(s, regexpr("[[:alpha:]]*", s))
> }
>
> fun(pub)
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 23-01-2013 17:38, Biau David wrote:
>> Dear All,
>>
>> I have a data frame of vectors of publication names such as 'pub':
>>
>> pub1 <- c('Brown DK, Santos R, Rome DF, Don Juan X')
>> pub2 <- c('Benigni D')
>> pub3 <- c('Arstra SD, Van den Hoops DD, lamarque D')
>>
>> pub <- rbind(pub1, pub2, pub3)
>>
>>
>> I would like to construct a data frame containing only the authors' last
>> names, with each last name in a column and each publication in a row.
>> Basically I want to get rid of the initials (max 2, always before a comma)
>> and the spaces surrounding each last name. I would like to avoid a loop.
>>
>> PS: If I could also have a short explanation of the code that extracts
>> the values from the character string, that would be great!
>>
>>
>> David
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


