[R] Help with text separation
David Winsemius
dwinsemius at comcast.net
Mon Nov 14 18:05:13 CET 2011
On Nov 14, 2011, at 4:20 AM, Michael Griffiths wrote:
> Good morning R list,
>
> My apologies if this has already been answered elsewhere, but I have
> not found the answer that I am looking for.
>
> I have a character string, i.e.
>
>
> form<-c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M')
>
> Now, my aim is to find the position of all those instances of '*' and to
> remove said '*'. However, I would also like to remove the preceding
> variable name before the '*', the math operator preceding this, and also
> the variable name after the '*'. So, here I would like to remove '+L*M'
This would be a very narrow implementation that requires the
+/spc/alnum/spc/*/alnum sequence exactly (the "." in the pattern matches
the space after the "*"):
> sub("\\+\\s[[:alnum:]]\\s\\*.[[:alnum:]]", "", form)
[1] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
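To see what "exactly" means here, a pattern that takes the +/spc/alnum/spc/*/alnum description literally (one character per slot, no quantifiers) can be tried against a few spacing variants. This is only a sketch: the shortened test strings and the names `x`, `strict` and `res_strict` are made up for illustration.

```r
# Hypothetical, shortened test strings (not the full formula from the question)
x <- c("K + L * M", "K + L*M", "K +Llll*M")

# "+", one whitespace, one alnum, one whitespace, "*", any one character, one alnum
strict <- "\\+\\s[[:alnum:]]\\s\\*.[[:alnum:]]"

# Only the first string has exactly that spacing, so only it is changed
res_strict <- sub(strict, "", x)
res_strict
# returns c("K ", "K + L*M", "K +Llll*M")
```

The second and third strings come back untouched because the pattern insists on a single space on each side of a one-character name.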
This is a more general implementation using the "*" quantifier, which
matches the preceding item zero or more times, so the spaces become
optional and the variable names may be longer than one character:
form<-c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M',
'~ A + B + C + C / D + E + E / F + G + H + I + J + K + L*M',
'~ A + B + C + C / D + E + E / F + G + H + I + J + K +Llll*M'
)
> sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]*", "", form)
[1] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
[2] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
[3] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
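The question also asked for the positions of the '*' characters, and sub() only replaces the first match in each string. A possible extension, assuming the formula may contain several "name * name" terms, is to use gregexpr() for the positions and gsub() for the removal. The string `f` and the pattern `pat` below are illustrative assumptions, not part of the original question.

```r
# Hypothetical formula string with two "*" terms (not from the original post)
f <- "~ A + B * C + D + E*F + G"

# Character positions of every literal "*"; fixed = TRUE disables regex parsing
star_pos <- as.integer(gregexpr("*", f, fixed = TRUE)[[1]])
star_pos
# returns c(9L, 20L)

# Optional spaces, "+", optional spaces, a name, optional spaces, "*",
# optional spaces, a name. The "+" after [[:alnum:]] requires at least
# one character, so an empty name cannot match.
pat <- "\\s*\\+\\s*[[:alnum:]]+\\s*\\*\\s*[[:alnum:]]+"

# gsub() replaces every match, where sub() stops after the first
cleaned <- gsub(pat, "", f)
cleaned
# returns "~ A + D + G"
```

Two limitations of this sketch: [[:alnum:]] does not cover "." or "_" in variable names, and a "*" term that comes first in the formula (with no "+" in front of it) will not be matched.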
--
David Winsemius, MD
West Hartford, CT