Hi Thomas,

Thank you very much. I will check this out.

Best,

Alex

On 6/19/07, Thomas Girke <thomas.girke@ucr.edu> wrote:
>
> Alex,
>
> I guess Martin answered your question.
>
> A similar result, but with slower computation, can be obtained by applying
> the IQR function like this:
>
>         apply(iris[,1:3], 1, IQR)
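>
> As a minimal sketch of how this could serve as a row filter (the toy
> matrix m and the 75% cut-off are assumptions for illustration; the
> cut-off mirrors the tutorial approach quoted below):
>
> ```r
> # Toy numeric matrix standing in for the real data (assumption)
> set.seed(1)
> m <- matrix(rnorm(200), nrow = 20, ncol = 10)
>
> # Row-wise interquartile range; apply() calls IQR() once per row
> iqrs <- apply(m, 1, IQR)
>
> # Keep only rows whose IQR lies above the 75th percentile of all row IQRs
> keep <- iqrs > quantile(iqrs, probs = 0.75)
> m_sub <- m[keep, , drop = FALSE]
> dim(m_sub)
> ```
>
> The drop = FALSE keeps m_sub a matrix even if only one row passes the
> filter.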
>
> Thomas
>
> On Tue 06/19/07 21:10, Martin Morgan wrote:
> > Alex,
> >
> > > library(Biobase)
> > [snip]
> > > args(rowQ)
> > function (imat, which)
> > NULL
> > > showMethods("rowQ")
> > Function: rowQ (package Biobase)
> > imat="ExpressionSet", which="numeric"
> > imat="exprSet", which="numeric"
> > imat="matrix", which="numeric"
> >
> > so it looks like x should be a matrix rather than a data frame.
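> >
> > A minimal sketch of the conversion, using a small made-up data frame in
> > place of x (an assumption, since the real x is not shown):
> >
> > ```r
> > # Hypothetical stand-in for x: a data frame of numeric columns
> > x <- data.frame(s1 = c(1.86, 1.72, 2.04),
> >                 s2 = c(1.51, 1.84, 1.53),
> >                 s3 = c(1.73, 1.86, 1.65))
> >
> > # as.matrix() turns an all-numeric data frame into a numeric matrix
> > xm <- as.matrix(x)
> > is.matrix(xm) && is.numeric(xm)
> >
> > # With Biobase loaded, rowQ(xm, which) should then find the
> > # imat="matrix" method listed above instead of failing on a data frame
> > ```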
> >
> > Martin
> >
> > "ssls sddd" <ssls.sddd@gmail.com> writes:
> >
> > > Hi Thomas,
> > >
> > > Thanks! Sorry for getting back to it late because I was out
> > > of town for a couple of days.
> > >
> > > I like the idea of 'removing all rows with low variability across
> > > samples'. I searched around and found an online tutorial,
> > > http://www.economia.unimi.it/projects/marray/2006/material/Lab3/MachineLearning/ML-lab.pdf
> > > which does something very similar: it teaches how to filter out
> > > non-differentially expressed genes.
> > >
> > > It takes the simplistic approach of using the 75th percentile of the
> > > interquartile range (IQR) as the cut-off point and computes quantiles
> > > using rowQ.
> > >
> > > I followed their method and my code is:
> > >
> > > library("Biobase")
> > > lowQ = rowQ(x, floor(0.25 * 49))#49 for 49 samples
> > > upQ = rowQ(x, ceiling(0.75 * 49))
> > > iqrs = upQ - lowQ
> > > giqr = iqrs > quantile(iqrs, probs = 0.75)
> > > sum(giqr)
> > > xsub = x[giqr, ]
> > > dim(xsub)
> > >
> > > But the error message is like:
> > >
> > > function (classes, fdef, mtable)  :
> > >         unable to find an inherited method for function "rowQ", for
> > > signature "data.frame", "numeric"
> > >
> > > Perhaps you have some experience with 'rowQ'? If I want to use the IQR
> > > function instead, how should I approach this?
> > >
> > > I really appreciate your help!
> > >
> > > Thank you very much!
> > >
> > > Sincerely,
> > >
> > > Alex
> > >
> > >
> > >
> > > On 6/13/07, Thomas Girke <thomas.girke@ucr.edu> wrote:
> > >>
> > >> Dear Alex,
> > >>
> > >> In addition to Sean's advice, I would like to point out that the
> > >> sample you are giving below indicates that you are trying to pass
> > >> both a column dendrogram and a row dendrogram to the heatmap
> > >> function. With your matrix of 238,000 rows by 49 columns you should
> > >> have only a column dendrogram, because the row dendrogram would take
> > >> more than 200 GB of memory to calculate. You can still use the
> > >> heatmap or heatmap.2 functions by turning off the row sorting, which
> > >> is done by setting the Rowv argument to NA. In addition, I would
> > >> consider filtering your rows in a meaningful manner down to a much
> > >> smaller number, perhaps by using R's IQR function to remove all rows
> > >> with very low variability. I am suggesting this because you won't
> > >> see any patterns in the heatmap when you have so many rows. If the
> > >> row filtering works, then you could generate a dendrogram for the
> > >> row dimension as well. Remember: hclust will require ~4 GB of memory
> > >> to cluster ~30,000 items and < 1 GB for 10,000 items, and pvclust,
> > >> which uses hclust internally, will need even more than this.
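> > >>
> > >> As a rough check of these figures, here is a back-of-the-envelope
> > >> sketch. It assumes memory is dominated by the pairwise distance
> > >> matrix, n*(n-1)/2 values stored as 8-byte doubles; the helper name
> > >> dist_gb is made up for illustration, and hclust needs working memory
> > >> on top of this estimate:
> > >>
> > >> ```r
> > >> # Approximate size in GB of the lower-triangle distance matrix
> > >> # for n items: n*(n-1)/2 doubles at 8 bytes each
> > >> dist_gb <- function(n) n * (n - 1) / 2 * 8 / 1e9
> > >>
> > >> dist_gb(10000)    # about 0.4 GB
> > >> dist_gb(30000)    # about 3.6 GB
> > >> dist_gb(238000)   # about 227 GB
> > >> ```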
> > >>
> > >> As more general advice, when working with large data sets in R,
> > >> always subset your data to something very small to test out your
> > >> strategy first, because this will save you a lot of time.
> > >> In your case, this could be done by selecting just the first 100
> > >> rows of your matrix like this:
> > >>                 my_matrix <- my_matrix[1:100, ]
> > >>
> > >> Once you have tested things out, just remove the '[1:100,]' part
> > >> from your script/protocol.
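> > >>
> > >> Put together with the Rowv setting, a test run might look like the
> > >> following sketch. The toy matrix and the name test_matrix are
> > >> assumptions for illustration; a separate name is used here so the
> > >> full data is not overwritten:
> > >>
> > >> ```r
> > >> # Toy matrix standing in for the full data set (assumption)
> > >> set.seed(1)
> > >> my_matrix <- matrix(rnorm(500 * 6), nrow = 500,
> > >>                     dimnames = list(paste0("row", 1:500),
> > >>                                     paste0("s", 1:6)))
> > >>
> > >> # Small test subset: first 100 rows only
> > >> test_matrix <- my_matrix[1:100, ]
> > >>
> > >> # Column dendrogram only; Rowv = NA switches off row clustering
> > >> heatmap(test_matrix, Rowv = NA, scale = "row")
> > >> ```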
> > >>
> > >> Best,
> > >>
> > >> Thomas
> > >>
> > >>
> > >> On Wed 06/13/07 06:02, Sean Davis wrote:
> > >> > ssls sddd wrote:
> > >> > > Dear Dr.Thomas Girke,
> > >> > >
> > >> > > I have one more question for you. I tried pvclust as described in
> > >> > > the section 'Obtain significant clusters by pvclust bootstrap
> > >> > > analysis' for my data, x.
> > >> > >
> > >> > > But I have a problem with:
> > >> > >
> > >> > > heatmap(x, Rowv=dend_colored, Colv=as.dendrogram(hc),
> > >> > > col=my.colorFct(), scale="row", RowSideColors=mycolhc)
> > >> > >
> > >> > > the error was:
> > >> > >
> > >> > > error in heatmap(x, Rowv = dend_colored, Colv = as.dendrogram(hc),
> > >> > > col = my.colorFct(),  :
> > >> > >         'x' must be a numeric matrix
> > >> > >
> > >> > > I ran 'x[1:3,1:3]' and it produced the following:
> > >> > >
> > >> > >               AIRNS_A09 AIRNS_A11 AIRNS_A12
> > >> > > SNP_A-1780271   1.85642   1.50956   1.73154
> > >> > > SNP_A-1780274   1.72140   1.83712   1.85948
> > >> > > SNP_A-1780277   2.04241   1.53458   1.65270
> > >> > >
> > >> > > I think x is a numeric matrix. Where do you think I may have gone
> > >> > > wrong?
> > >> >
> > >> > Try coercing the x into a matrix directly:
> > >> >
> > >> > heatmap(as.matrix(x), Rowv=dend_colored, Colv=as.dendrogram(hc),
> > >> > col=my.colorFct(), scale="row", RowSideColors=mycolhc)
> > >> >
> > >> > Does this fix the problem?  You can always check the class of an
> > >> > object by doing something like:
> > >> >
> > >> > class(x)
> > >> >
> > >> > which should report:
> > >> >
> > >> > [1] "matrix"
> > >> >
> > >> > Hope that helps.
> > >> >
> > >> > Sean
> > >> >
> > >>
> > >> --
> > >> Dr. Thomas Girke
> > >> Assistant Professor of Bioinformatics
> > >> Director, IIGB Bioinformatic Facility
> > >> Center for Plant Cell Biology (CEPCEB)
> > >> Institute for Integrative Genome Biology (IIGB)
> > >> Department of Botany and Plant Sciences
> > >> 1008 Noel T. Keen Hall
> > >> University of California
> > >> Riverside, CA 92521
> > >>
> > >> E-mail: thomas.girke@ucr.edu
> > >> Website: http://faculty.ucr.edu/~tgirke
> > >> Ph: 951-827-2469
> > >> Fax: 951-827-4437
> > >>
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor@stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> > --
> > Martin Morgan
> > Bioconductor / Computational Biology
> > http://bioconductor.org
> >
>
