[BioC] Unable to Generate QC Report for mogene10stv1

James W. MacDonald jmacdon at med.umich.edu
Fri Dec 17 16:20:40 CET 2010


Hi Rick,

On 12/16/2010 4:13 PM, Rick Frausto wrote:
> Hi Jim,
>
> How do I run an RMA analysis without a proper ExpressionSet? Honest answer, I
> don't know, I just put in a command line from a manual I found online and it
> spit out some result; see #3, Affy packages, in the following link (
> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#biocon_intro).

You are mistaken. All of the functions mentioned there result in a 
proper ExpressionSet. And if you just do

abatch <- ReadAffy()
eset <- rma(abatch)

then you will certainly get an ExpressionSet.
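
If you want to confirm that for yourself, something along these lines should do it. This is only a sketch: it assumes the CEL files are in the current working directory and that the affy package (which loads Biobase) is installed, and the object names are just the ones used above.

```r
library(affy)                    # provides ReadAffy() and rma(); loads Biobase

abatch <- ReadAffy()             # assumption: CEL files are in the working directory
eset <- rma(abatch)              # background-correct, normalize and summarize

is(eset, "ExpressionSet")        # TRUE if rma() returned an ExpressionSet
dim(exprs(eset))                 # rows are probesets, columns are chips

# Index of the first duplicated feature name, or 0 if there are none
anyDuplicated(featureNames(eset))
```

Running the same checks on whatever object is being passed to QCReport() shows quickly whether it is the object you think it is.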

>
> Perhaps you don't need an ExpressionSet until after the preprocessing, at
> least that is what I get from the "An Introduction to Bioconductor's
> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert
> Gentleman. Everything seemed to be going smoothly until I tried to get a QC
> Report.
>
> Now, the answer for why I would want to do such a thing is easy. Simply that
> I don't know any better :) Just started working with R a few days ago, but
> I'm learning.
>
>
> Apparently Snow Leopard running in 32-bit mode can only use about 3.2 GB of
> RAM, whereas in 64-bit mode it can make use of all 4 GB. I'll switch to the
> 64-bit OS and see if it makes a difference.

Well, it won't be much different. The reason a 32-bit OS can only use 
about 3.2 GB of RAM is that the OS needs some to run. The 64-bit OS also 
needs to use some RAM, so you won't get all 4 GB there either. The issue 
is how much RAM can be allocated to a single process, and on a 64-bit OS 
that gets bumped up significantly.
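
To put rough numbers on that, here is a small base R sketch (the variable names are only illustrative). It takes the size of the failed mmap request from the malloc error quoted further down in this thread and compares it with the most address space a 32-bit process could have in theory; what a process can use in practice is less than that.

```r
# Size of the failed allocation, copied from the malloc error in this thread
failed_bytes <- 458665984

# R reports vector sizes in units of 1024^2 bytes
failed_mb <- failed_bytes / 1024^2
round(failed_mb, 1)              # 437.4, matching "vector of size 437.4 Mb"

# Theoretical ceiling on the address space of one 32-bit process, same units
addr_limit_mb <- 2^32 / 1024^2
addr_limit_mb                    # 4096

# Share of that ceiling needed for this one vector, which has to fit in a
# single contiguous block alongside everything else already allocated
failed_bytes / 2^32
```

So a single vector of this size needs roughly a tenth of the whole 32-bit address space in one piece, which is why the request can fail even when the total free memory looks sufficient.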

Best,

Jim



>
> Thanks for your insight!
>
> Cheers,
> Rick
>
>
>
>
> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at med.umich.edu>  wrote:
>
>> Hi Rick,
>>
>> On 12/16/2010 12:57 PM, Rick Frausto wrote:
>>> Thanks Jim! How much memory would I need, I currently have 4GB, but have
>>> quite a few other programs running in the background...I'll see if closing
>>> them helps. Perhaps setting up an "ExpressionSet" would solve the problem. I
>>> just started reading up on how to set one of these up yesterday. Will do
>>> this and see if the duplicates will go away.
>>>
>>> The "mydata" originates from CEL files and then I run the RMA analysis on
>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing that
>>> doing this might reduce the QCReport PDF file size quite considerably since
>>> I won't have any duplication and will make further analysis easier.
>>
>> How do you run an RMA analysis without setting up a proper
>> ExpressionSet? The default behavior is to create one. In addition, why
>> would you want to do such a thing? The ExpressionSet class is
>> specifically designed to contain these sorts of data.
>>
>>
>>>
>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would running as
>>> 64bit still necessitate more RAM?
>>
>> Probably. The difference isn't efficiency, but the ability to address
>> more RAM. A 32-bit OS can still address all the available memory that
>> you will have with just 4 GB RAM, so you need to bump that up if you
>> want to do all the chips together. As for how much, I don't know. Since
>> RAM isn't that expensive these days, you might look at maxing your box out.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>
>>>
>>> Thanks again,
>>> Rick
>>>
>>>
>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at med.umich.edu>   wrote:
>>>
>>>> Hi Rick,
>>>>
>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote:
>>>>> Dear All,
>>>>>
>>>>> I have recently entered the world of R. Through some trial and error I'm
>>>>> becoming more familiar with R and the relevant Bioconductor Affy packages.
>>>>> I'm a molecular and cell biologist with rudimentary statistical knowledge
>>>>> and even less knowledge with respect to R.
>>>>>
>>>>> When I enter the following:
>>>>>
>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>
>>>>> I get some errors in return.
>>>>>
>>>>> Loading required package: lattice
>>>>> Error: cannot allocate vector of size 437.4 Mb
>>>>
>>>> This indicates that you need more RAM, as you are running out of memory.
>>>>
>>>>> In addition: Warning message:
>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>      some row.names duplicated:
>>>>>
>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,
>>>>> 53,54,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,
>>>>> 99,102,103,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,
>>>>> 139,141,142,147,148,149,152,153,156,157,158,159,162,163,164,165,166,167,
>>>>> 168,169,170,171,173,175,176,179,180,183,184,185,186,191,192,195,197,198,
>>>>> 199,200,202,206,207,210,219,220,227,228,229,230,233,234,235,240,241,243,
>>>>> 245,246,248,249,250,251,252,253,257,259,260,266,271,272,276,277,280,281,
>>>>> 284,286,287,289,290,291,292,296,297,298,302,304,305,306,310,311,312,313,
>>>>> 317,318,319,321,322,324,334,337,338,339,340,341,345,346,350,351,356,359,
>>>>> 362,364,366,367,370,371,373,376,378,382,383,384,385,386,387,388,389,391,
>>>>> 394,395,397,398,399,400,402,403,405,406,407,409,410,411,415,416,418,419,
>>>>> 425,431,432,433,434,435,440,441,443,445,447,449,450,452,454,455,456,461,
>>>>> 464,466,470,472,473,481,487,488,491,492,493,494,495,496,497,498,499,501,
>>>>> 502,504,506,507,509,511,513,515,516,51 [... truncated]
>>>>
>>>> What exactly is 'mydata', and how did you generate it? The above warning
>>>> indicates that you have duplicate row names, which IIRC isn't possible
>>>> with an ExpressionSet.
>>>>
>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>> *** error: can't allocate region
>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>> *** error: can't allocate region
>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>
>>>> More lack of memory errors.
>>>>
>>>>
>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) :
>>>>>      unused argument(s) (htmlhelp = TRUE)
>>>>> In addition: Warning messages:
>>>>> 1: In data(package = .packages(all.available = TRUE)) :
>>>>>      datasets have been moved from package 'base' to package 'datasets'
>>>>> 2: In data(package = .packages(all.available = TRUE)) :
>>>>>      datasets have been moved from package 'stats' to package 'datasets'
>>>>> starting httpd help server ... done
>>>>>
>>>>> Would someone be able to diagnose the problem and suggest a solution?
>>>>
>>>> First, get more RAM. Second, you will be better off using a 64-bit OS.
>>>> Depending on your hardware, you might be able to just install a 64-bit
>>>> version of R.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>>
>>>>> If it is useful, I am using the following R software: R for Mac OS X GUI
>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that would be
>>>>> useful please let me know.
>>>>>
>>>>> I had a read of the AffyQCReport Package pdf and I have added the following
>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried library(affyQCReport);
>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be doing
>>>>> something, in other words it doesn't go to the error, yet, but it's been
>>>>> processing for about 10 minutes. I am analyzing 35 chips.
>>>>>
>>>>> Perhaps it would work if I tried to generate each QCReport page separately
>>>>> rather than as a whole.
>>>>>
>>>>> Cordially,
>>>>> Rick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


