[BioC] identifying sets of correlated genes
Robert Castelo
robert.castelo at upf.edu
Mon Nov 26 14:35:59 CET 2012
hi Alyaa,
i also think you should try the simple approach that Sean described in
his previous email and see whether it gives you a clustering of samples
close to what you're looking for. the MLInterfaces package and its first
vignette show how to do that with microarray expression data stored in
ExpressionSet objects.
as for what you are specifically asking about, correlations, you can
always use the function cor() to calculate all pairwise Pearson
correlations between genes. this function correlates the columns of its
input, so it needs a matrix of expression values with genes on the
columns and samples on the rows. you can then threshold the correlations
at some cutoff to obtain sets of correlated genes, and later use those
sets to cluster the samples. this is not much different from what Sean
was already proposing, though.
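a minimal sketch of the cor()-and-threshold idea, using simulated data in
place of a real ExpressionSet. the object names, the 0.8 cutoff and the use
of single-linkage hclust() to turn the thresholded correlations into gene
sets are illustrative assumptions, not something prescribed in this thread:

```r
## simulated stand-in for exprs(eset): genes on rows, samples on columns
set.seed(123)
nSamples <- 257
nGenes <- 50
expr <- matrix(rnorm(nGenes * nSamples), nrow=nGenes,
               dimnames=list(paste0("g", 1:nGenes), paste0("s", 1:nSamples)))

## make the first 5 genes share a common signal so that they are correlated
shared <- rnorm(nSamples)
expr[1:5, ] <- expr[1:5, ] + 3 * matrix(shared, nrow=5, ncol=nSamples, byrow=TRUE)

## cor() correlates columns, so transpose to put genes on the columns
r <- cor(t(expr), method="pearson")

## threshold: two genes are linked when |r| >= cutoff (cutoff is an assumption)
cutoff <- 0.8
adj <- abs(r) >= cutoff
diag(adj) <- FALSE

## single-linkage clustering on 1 - |r|, cut at height 1 - cutoff, groups
## together the genes connected through links of the thresholded matrix
hc <- hclust(as.dist(1 - abs(r)), method="single")
geneSets <- cutree(hc, h=1 - cutoff)

## pick the largest gene set and re-cluster the samples using only those genes
sizes <- table(geneSets)
bigSet <- names(geneSets)[geneSets == as.integer(names(sizes)[which.max(sizes)])]
sampleClust <- hclust(dist(t(expr[bigSet, ])))
```

with real data you would replace the simulated 'expr' with exprs() of your
ExpressionSet, and the cutoff would need to be chosen by looking at the
distribution of the correlations.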
in any case, you should be aware that Pearson correlations are a
marginal measure of association and are thus sensitive to confounding
factors. you say you do not expect any, but with 257 samples the chances
of non-biological variability are high. you may want to try a more
restrictive measure of association, such as conditional dependence,
which can give you better results in the presence of confounding. for
that purpose you can use the 'qpgraph' package.
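to illustrate why a marginal correlation can mislead under confounding, here
is a small base R simulation. it does not use the qpgraph interface (see that
package's vignettes for it). the 'batch' variable, the two gene names and the
effect sizes are hypothetical, and partial correlation is used here as one
common way of measuring association conditional on another variable:

```r
## two simulated genes that are unrelated except for a shared batch effect
set.seed(1)
n <- 257
batch <- rnorm(n)                    # hypothetical confounder
geneA <- batch + rnorm(n, sd=0.5)
geneB <- batch + rnorm(n, sd=0.5)

## marginal Pearson correlation: high, but driven entirely by 'batch'
marginal <- cor(geneA, geneB)

## partial correlation given 'batch': correlate the residuals left after
## regressing each gene on the confounder
partial <- cor(resid(lm(geneA ~ batch)), resid(lm(geneB ~ batch)))

print(c(marginal=marginal, partial=partial))
```

the marginal correlation comes out large while the partial one stays close
to zero, which is the kind of spurious association that thresholding cor()
alone would not filter out.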
cheers,
robert.
On 11/26/2012 01:50 PM, Alyaa Mahmoud wrote:
>
>
>
> On Mon, Nov 26, 2012 at 1:51 PM, Robert Castelo
> <robert.castelo at upf.edu> wrote:
>
> hi,
>
> few more questions,
>
> how many samples do you have?
>
> 257
>
>
> what is the structure of these data: are all samples from the same
> experimental condition?
>
> yes
>
>
> do you suspect the presence of some confounding factors such as
> batch, gender (if applicable), strain (if applicable), etc...
>
> not really, I need to obtain sets of correlated genes (in
> expression/regulation...etc) and then re-cluster using these sets and
> observe the pattern of samples clustering.
>
>
> are you looking for some specific type of correlated genes, such as
> targets of DNA or RNA binding proteins?
>
> no, I am more interested in the behaviour of the samples rather, but I
> want to re-cluster using subsets of the genes.
>
>
>
> robert.
>
>
> On 11/26/2012 12:34 PM, Alyaa Mahmoud wrote:
>
> Hi Dr Castelo
>
> Gene expression data
>
> Thanks
>
>
> On Mon, Nov 26, 2012 at 1:28 PM, Robert Castelo
> <robert.castelo at upf.edu> wrote:
>
> hi Alyaa,
>
> from what kind of data?
>
> cheers,
> robert.
>
>
> On 11/22/2012 10:14 AM, Alyaa Mahmoud wrote:
>
> Dear Group
>
> What is the most convenient direct way of identifying sets of
> correlated genes?
>
> Thanks a lot
> Alyaa
>
>
> --
> Alyaa Mahmoud
>
> "Love all, trust a few, do wrong to none"- Shakespeare
>
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550