[R] Parsing a Simple Chemical Formula
hanson at depauw.edu
Mon Dec 27 00:29:52 CET 2010
Hello R Folks...
I've been looking around the 'net and I see many complex solutions in
various languages to this question, but I have a pretty simple need
(and I'm not much good at regex). I want to use a chemical formula as
a function argument. The formula would be in "Hill order" which is to
list C, then H, then all other elements in alphabetical order. My
example will have only a limited number of elements, few enough that
one can search directly for each element. So some examples would be
C5H12, C5H12O or C5H11BrO (note that for oxygen and bromine, O or
Br, there is no following number, meaning a 1 is implied).
> form <- "C5H11BrO"
I'd like to get the count of each element, so in this case I need to
extract C and 5, H and 11, Br and 1, O and 1 (I want to calculate the
molecular weight by multiplying). Sounds pretty simple, but my
experiments with grep and strsplit don't immediately clue me into an
obvious solution. As I said, I don't need a general solution to the
problem of calculating molecular weight from an arbitrary formula,
that seems quite challenging, just a way to convert "form" into a list
or data frame which I can then do the math on.
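To make the target concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes every element symbol is one uppercase letter optionally followed by one lowercase letter, and that the formula has no parentheses, charges or hydrates. The names parse_formula, atomic_weight and mol_weight are made up for illustration, and the atomic weights are approximate values covering only the elements in my examples.

```r
# Hypothetical helper (illustrative only): split a simple formula such as
# "C5H11BrO" into element symbols and integer counts.
parse_formula <- function(form) {
  # One token = an uppercase letter, an optional lowercase letter, and
  # optional digits, e.g. "C5", "H11", "Br", "O".
  pattern <- "[[:upper:]][[:lower:]]?[[:digit:]]*"
  tokens <- regmatches(form, gregexpr(pattern, form))[[1]]

  element <- sub("[[:digit:]]+$", "", tokens)  # drop the trailing digits
  count <- sub("^[[:alpha:]]+", "", tokens)    # drop the leading letters
  count[count == ""] <- "1"                    # no number means 1 is implied

  data.frame(element = element, count = as.integer(count),
             stringsAsFactors = FALSE)
}

form <- "C5H11BrO"
parsed <- parse_formula(form)

# Approximate atomic weights (assumed values), limited to the elements
# that appear in the example formulas.
atomic_weight <- c(C = 12.011, H = 1.008, Br = 79.904, O = 15.999)

# Molecular weight: each count times the weight looked up by element name.
mol_weight <- sum(parsed$count * atomic_weight[parsed$element])
```

For "C5H11BrO" the data frame has the rows C 5, H 11, Br 1 and O 1. Looking weights up by name in a named vector keeps the math to a single line, though an element missing from atomic_weight would give NA rather than an error.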
Here's hoping this is a simple issue for more experienced R users!
Professor of Chemistry & Biochemistry