[BioC] ConnectivityMap Package (instances, rankMatrix data)

Wed Jun 25 17:20:41 CEST 2014

Hi Azam,

Your best bet for exploring these questions is to contact the Connectivity Map project at the Broad, which is the source of the data:

    http://www.broadinstitute.org/cmap/

Sorry that I cannot be more directly helpful.

 - Paul

On Jun 24, 2014, at 8:39 PM, "Azam" <azam.peyvand at gmail.com> wrote:

> Hi All,
>  
> In the ConnectivityMap package, the data set "rankMatrix" contains a value for each gene in each of the instances inst_1, inst_2, ..., inst_6100.
> I am a bit confused: what exactly do these instances mean, and how are they computed?
>  
> If they represent the difference between perturbation and vehicle samples, why are they all positive? I would expect at least some negative values for some genes (for example, down-regulated genes).
> 
> I tried downloading the dataset GSE5258 from GEO. It is titled "Connectivity Map dataset (build01)" and is referred to in the supplemental material of the paper "The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease".
> 
> For example, the expression value of the gene "1007_s_at" for the sample "EC2003090503AA" (perturbation_scan_id) is 387.6, while the value of the same gene for the sample "EC2003090502AA" (vehicle_scan_id3) is 393.2.
> 
> In the matrix rankMatrix from the ConnectivityMap package, the same gene "1007_s_at" has values for the instances inst_1 to inst_6100.
> I don't understand how these instance values for the gene "1007_s_at" come about. For example, this gene has the values 6432 and 12201 for inst_1 and inst_2.
> What is the relationship between these instance values and the sample values in dataset GSE5258?
> 
> Thank you,
> Azam