[R] Query about data manipulation

Marc Schwartz marc_schwartz at comcast.net
Thu Mar 1 18:18:36 CET 2007


On Thu, 2007-03-01 at 08:34 -0800, lalitha viswanath wrote:
> Hi
> Thanks much for the prompt response to my earlier
> enquiry on packages for regression analyses.
> Along the same topic(?), I have another question about
> which I could use some input.
> 
> I am retreiving data from a MySQL database using
> RODBC. 
> The table has many BLOB columns and each BLOB column
> has data in the format
> "id1 \t id2 \t measure \n id3 \t id4 \t measure...."
> (i.e. multiple rows compressed as one long string)
> 
> I am retreiving them as follows.
> 
> dataFromDB <- sqlQuery(channel, "select
> uncompress(columnName) from tableName");
> 
> 
> I am looking for ways to convert this long "string"
> into a table/dataframe in R, making it easier for
> further post processing etc without reading/writing it
> to a file first.
> 
> Although by doing write.table and reading it in again,
> I got the result in a data frame, with the \t and \n
> interpreted correctly, I wish to sidestep this as I
> need to carry out this analyses for over 4 million
> such entries.
> I tried 
> write.table(dataFromDB, file="FileName");
> dataFromFile <- read.table(FileName, sep="\t") 
> dataFromFile is of the form
> 
> 92_8_nmenA      993_7_mpul      1.042444
> 92_8_nmenA      3_5_cpneuA      0.900939
> 190_1_rpxx      34_4_ctraM      0.822532
> 190_1_rpxx      781_6_pmul      0.870016
> 
> Your input on the above is greatly appreciated.
> Thanks
> Lalitha

The easiest way might be to use a textConnection().

Let's say that you have read in your data as above and you have a column
called 'blob':

> dataFromDB
                                            blob
1 id1 \t id2 \t measure \n id3 \t id4 \t measure


#Open textConnection.  Note coercion to character
BLOB <- textConnection(as.character(dataFromDB$blob))

# Read in the column
DF <- read.table(BLOB, sep = "\t")

# Close the connection
close(BLOB)


> DF
     V1    V2        V3
1  id1   id2   measure 
2  id3   id4   measure


See ?textConnection

HTH,

Marc Schwartz



More information about the R-help mailing list