[R] To Merge or to use Indicator Variables?

David Winsemius dwinsemius at comcast.net
Wed Jul 27 23:20:34 CEST 2011


On Jul 27, 2011, at 8:40 AM, oaxacamatt wrote:

> Greetings all,
> I have two sets of data that I would like to investigate. The first is
> gene/genome related data given different 'cell-states'. The second set of
> data relates the genes to a biological pathway. /(I think in pictures so
> here goes.)/
> *dataframe1*
> gene, cell-state1, cell-state2
> gene1, x1, y1
> gene2, x2, y2
> gene.x, ..., ...
>
> *dataframe2*
> pathway1, gene-x1, gene-x2, ...
> pathway2, gene-y1, gene-y2, ...
>
> What I want to do is, see if 'cell-state1' (in-)activates different genes /
> pathways from 'cell-state2.' Furthermore, I would like to test for
> correlation /(t-test and maybe graphing)/ between the cell-states 1 Vs 2 for
> specific pathways.
>
> My question is, which commands or method would allow me to do this

Do what exactly?

> most easily/efficiently: *merge* or using *dummy variables*?

Here's a counter question: Why don't you post a reproducible example?

You posted an identical question to StackOverflow this morning. No one
bit on the first one, and I suspect you might attract more interest
here if you read the Posting Guide and paid more attention to both its
overall message and its specifics. There is, of course, no written
rule that you must notify rhelp readers that you are cross-posting,
but it is generally considered rude to crosspost without saying you
are doing so. There is a written guideline that rhelp queries come
with reproducible code that creates an R object that can be tested
against output which has been unambiguously described. I think your
message fails both criteria.
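
For illustration only, a reproducible example for a question of this
kind might look something like the sketch below. Everything in it is
an assumption rather than the poster's actual data: the gene and
pathway names, the numeric values, and the choice to store pathway
membership in long form (one row per pathway/gene pair) are all
invented. It shows the merge() route, one of the two options asked
about, followed by a per-pathway summary.

```r
# Hypothetical toy data: all names and values are invented for illustration.
expr <- data.frame(
  gene   = c("g1", "g2", "g3", "g4"),
  state1 = c(1.0, 2.0, 3.0, 4.0),
  state2 = c(1.5, 1.0, 3.5, 6.0),
  stringsAsFactors = FALSE
)

# Assumed layout: pathway membership in long form,
# one row per (pathway, gene) pair.
pathways <- data.frame(
  pathway = c("p1", "p1", "p2", "p2"),
  gene    = c("g1", "g2", "g3", "g4"),
  stringsAsFactors = FALSE
)

# merge() joins the two data frames on the shared 'gene' column.
merged <- merge(pathways, expr, by = "gene")

# Mean value per pathway in each cell state.
pathway_means <- aggregate(cbind(state1, state2) ~ pathway,
                           data = merged, FUN = mean)
print(pathway_means)

# dput() prints a text form of an object that readers can paste
# into their own session to recreate it exactly.
dput(expr)
```

Stating the result you expect from such toy data (here, one row per
pathway with a mean for each cell state) gives readers something
concrete to test their answers against.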

-- 

David Winsemius, MD
West Hartford, CT


