[R-SIG-Mac] Preparing mySQL queries for R and SNA
elw at stderr.org
Wed Nov 7 03:30:06 CET 2007
> input in R. I am hoping to do both the stats and SNA portions of my
> analysis in R (using the sna package as well). I endeavor to write a
> script to handle everything. My process flow is: mySQL >> R >> R(sna)
> >> output tables and graphs >> insert into docs.
This is indeed a sane workflow - pretty much what I use for my own social
network analysis work. I happen to use postgresql rather than mysql, but
that is somewhat tangential to the problem and occasionally driven by the
kind of data I tend to like to work with ;-)
Have you looked at the igraph package?
Side-tip -- the sooner one starts writing functions for big blobs of
analysis rather than do-it-all scripts, the happier one seems to be. A
lot of my old code is not written as functions, and frankly I seem to be
putting a lot of work into converting it TO functions prior to being able
to reuse it. Oops.
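
To make the side-tip concrete, here is a minimal sketch of one per-project analysis wrapped in a function and then applied across projects. Everything named here is hypothetical: the helper summarise_project, the 'sender' and 'receiver' column names, and the two toy projects are invented for illustration, and the density formula assumes directed ties with no self-ties.

```r
# Hypothetical helper: summarise one project's edge list (a data frame with
# 'sender' and 'receiver' columns; those column names are assumptions).
summarise_project <- function(edges) {
  sender   <- as.character(edges$sender)
  receiver <- as.character(edges$receiver)
  n <- length(unique(c(sender, receiver)))
  # Count distinct directed ties, ignoring repeated sender/receiver pairs.
  ties <- nrow(unique(data.frame(sender, receiver)))
  data.frame(
    actors  = n,
    ties    = ties,
    # Directed density: ties over possible ordered pairs (assumes no self-ties).
    density = if (n > 1) ties / (n * (n - 1)) else NA_real_
  )
}

# Two made-up projects standing in for the real per-project data.
projects <- list(
  p1 = data.frame(sender = c("ann", "bob"),
                  receiver = c("bob", "ann")),
  p2 = data.frame(sender = c("cat", "cat", "dan"),
                  receiver = c("dan", "eve", "eve"))
)

# One row per project, ready to be aggregated and summarized.
summary_table <- do.call(rbind, lapply(projects, summarise_project))
print(summary_table)
```

The same function can then be reused unchanged whether there are two projects or ~1200; only the list passed to lapply changes.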
> Is .csv the best intermediate format for input into R and the sna
> package, or will I have to write my own script to prepare the data in
> special R(sna) matrix form?
As Byron suggests, pulling the data directly from MySQL into R is probably
much simpler, and frankly is pretty straightforward. You'll basically be
writing selects that return vectors of data objects - pretty much what you
likely already have in SQL, I would guess.
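
As a rough sketch of what happens once the select results are in R: assuming the query returns one row per tie (who replied to whom), an edge list like that can be turned into a plain adjacency matrix with base R alone. The data frame below is a stand-in for a query result; its column names and values are invented, and the DBI call mentioned in the comment is only one possible way to fetch it.

```r
# Stand-in for the result of a SELECT. In practice this data frame might come
# from something like DBI::dbGetQuery(con, "SELECT ...") on an open connection;
# the column names and values here are invented for illustration.
edges <- data.frame(
  sender   = c("ann", "bob", "ann", "cat"),
  receiver = c("bob", "cat", "cat", "ann"),
  stringsAsFactors = FALSE
)

# Every actor who appears on either side of a tie.
actors <- sort(unique(c(edges$sender, edges$receiver)))

# Square matrix of zeros, one row and one column per actor.
adj <- matrix(0, nrow = length(actors), ncol = length(actors),
              dimnames = list(actors, actors))

# Mark each directed tie; a two-column character matrix indexes by dimnames.
adj[cbind(edges$sender, edges$receiver)] <- 1

print(adj)
```

A square 0/1 matrix of this kind is the ordinary adjacency form, so no intermediate .csv file is needed to get there.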
There is support for things like Harwell-Boeing sparse matrix format, but
you can probably avoid the need for that.
The more you can do from within R, the more headaches you can save
yourself. R rules :-)
> I am analyzing online bulletin board projects, so each project (~1200)
> will have a separate set of analyses done on it, then they will be
> aggregated and summarized.
Nice scale. Did I perhaps see one of your talks in Vancouver (at Sunbelt)
a couple years ago?
Several people in my department do very similar work (CMC, SNA,
computer-mediated discourse analysis, ethnography, etc.), and I would
love to hear more about what you're doing. Offlist, maybe?
--elijah wright
School of Library and Information Science
Indiana University, Bloomington