[R] Decoding subscripts/superscripts from CSVs
Jim Lemon
jim at bitwrit.com.au
Wed Jul 23 14:00:18 CEST 2008
On Tue, 2008-07-22 at 16:18 -0400, naw3 at duke.edu wrote:
> Hi,
>
> I have a CSV file with various biological reactions. Subscripts, superscripts,
> and italics are encoded in carats, and I was wondering if R can actually
> recognize those and print actual superscripts, etc. Here's an example:
>
> <i>S</i>-adenosyl-L-methionine + rRNA = <i>S</i>-adenosyl-L-homocysteine +
> rRNA containing <i>N<sup>6</sup></i>-methyladenine
>
Hi Nina,
Embedded formatting commands enclosed in angle brackets (a caret is ^)
are almost certainly from the SGML family of markup languages, probably
HTML or XML, as these are becoming more common in data files. If you
want to display this markup with plotmath, you must change the tags to
their plotmath equivalents. Here is a toy function for your example:
xml2pm<-function(xmlstring) {
 # <i>...</i> becomes italic(...)
 xmlstring<-gsub("<[iI]>","italic(",xmlstring)
 xmlstring<-gsub("</[Ii]>",")",xmlstring)
 # <sup> becomes the plotmath superscript operator; </sup> is dropped
 xmlstring<-gsub("<[Ss][Uu][Pp]>","^",xmlstring)
 xmlstring<-gsub("</[Ss][Uu][Pp]>","",xmlstring)
 return(xmlstring)
}
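As a rough sketch of how the function might be used, here it is applied to
the last fragment of your example. The function is repeated so the snippet
runs on its own, and the names frag, pm and pmexpr are only illustrative.
I have used a fragment because the full reaction string has words separated
by spaces, which parse() will not accept as a single expression, so longer
strings would need more work than this toy covers.

```r
# Toy translator, repeated here so the snippet is self-contained
xml2pm <- function(xmlstring) {
  xmlstring <- gsub("<[iI]>", "italic(", xmlstring)
  xmlstring <- gsub("</[Ii]>", ")", xmlstring)
  xmlstring <- gsub("<[Ss][Uu][Pp]>", "^", xmlstring)
  xmlstring <- gsub("</[Ss][Uu][Pp]>", "", xmlstring)
  xmlstring
}

# Illustrative input: one fragment of the reaction string
frag <- "<i>N<sup>6</sup></i>-methyladenine"

# Translate the tags into plotmath syntax
pm <- xml2pm(frag)
print(pm)

# Turn the string into an expression that text(), title() etc. can draw
pmexpr <- parse(text = pm)

# Only draw when run interactively, so a batch run opens no device
if (interactive()) {
  plot.new()
  text(0.5, 0.5, pmexpr)
}
```

Note that plotmath will treat the hyphen as a minus sign, so the spacing
may not be quite what you want in a chemical name.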
Jim