[BioC] making a cdf package for a custom affymetrix array and set up for limma

James W. MacDonald jmacdon at uw.edu
Thu Nov 8 19:18:17 CET 2012


Hi Lauren,

On 11/8/2012 11:12 AM, Lauren Sassoubre wrote:
> Hi Jim,
> Thank you for the help!
>
> I'm sorry I wasn't clear. I actually do have a time series, 5 time 
> points. So, I have a treatment and a control, 3 replicate experiments, 
> 5 time course points, and the treatment and control are paired. I 
> updated the table below to show all the information.
>
> I looked at 8.6 and see that I might need a contrast matrix to handle 
> the pairing but I'm not sure how this would work with the model 
> matrix? I thought section 8.8 also handled paired data? I think the 
> table below is my target fie but I'm not sure what my next step should 
> be? How can I account for the different experiments, time points and 
> pairings?

I don't see a section 8.8 in the user's guide, so I am not sure what you 
are talking about. Perhaps you are talking about 8.7?

In addition, you don't state which comparisons you are particularly 
interested in. However, I think section8.7 is applicable here, as you 
are trying to make comparisons both within and between the subjects. So 
the relevant part is that you need to fit your subjects as a random 
factor, using the duplicateCorrelation() function. Note that the pairing 
should not have any negative numbers.

Depending on what comparisons you might want to make, you may be better 
off combining e.g., treatment and time into one factor.

As you note, this is a pretty complex experiment and you appear to be a 
novice. I would highly recommend finding a statistician at Stanford who 
is willing to help (there are tons of people with microarray experience 
there). Even if you get things to work, you still have to interpret and 
more importantly, defend what you have done. Just getting the code to 
run is the easy part.

Best,

Jim

>
> Sorry for the beginner questions.
> Thanks so much for your help!
> Lauren
>
> Filename 	Treatment 	experiment 	time 	pairing
> EF731DO.CEL 	dark 	1 	0hr 	-1
> EF731D2.CEL 	dark 	1 	2hr 	-2
> EF731D6.CEL 	dark 	1 	6hr 	-3
> EF731D12.CEL 	dark 	1 	12hr 	-4
> EF731D24.CEL 	dark 	1 	24hr 	-5
> EF731LO.CEL 	light 	1 	0hr 	1
> EF731L2.CEL 	light 	1 	2hr 	2
> EF731L6.CEL 	light 	1 	6hr 	3
> EF731L12.CEL 	light 	1 	12hr 	4
> EF731L24.CEL 	light 	1 	24hr 	5
> EF813DO.CEL 	dark 	2 	0hr 	-6
> EF813D2.CEL 	dark 	2 	2hr 	-7
> EF813D6.CEL 	dark 	2 	6hr 	-8
> EF813D12.CEL 	dark 	2 	12hr 	-9
> EF813D24.CEL 	dark 	2 	24hr 	-10
> EF813LO.CEL 	light 	2 	0hr 	6
> EF813L2.CEL 	light 	2 	2hr 	7
> EF813L6.CEL 	light 	2 	6hr 	8
> EF813L12.CEL 	light 	2 	12hr 	9
> EF813L24.CEL 	light 	2 	24hr 	10
> EF815DO.CEL 	dark 	3 	0hr 	-11
> EF815D2.CEL 	dark 	3 	2hr 	-12
> EF815D6.CEL 	dark 	3 	6hr 	-13
> EF815D12.CEL 	dark 	3 	12hr 	-14
> EF815D24.CEL 	dark 	3 	24hr 	-15
> EF815LO.CEL 	light 	3 	0hr 	11
> EF815L2.CEL 	light 	3 	2hr 	12
> EF815L6.CEL 	light 	3 	6hr 	13
> EF815L12.CEL 	light 	3 	12hr 	14
> EF815L24.CEL 	light 	3 	24hr 	15
>
>
>
>
>
> On Nov 8, 2012, at 5:51 AM, James W. MacDonald wrote:
>
>> Hi Lauren,
>>
>> On 11/7/2012 8:04 PM, Lauren Sassoubre wrote:
>>> Thanks Jim!
>>>
>>> I realized that the code was calling on a cdf file with a typo in 
>>> the name. After I corrected the cdf file name, make.cdf.package 
>>> worked and I was able to install it in the terminal window.
>>>
>>> I would like to use limma to analyze the data. I have a treatment 
>>> (light) and a control (dark), 5 time points for each (i'm 
>>> considering lights and darks paired for each time point), and 3 sets 
>>> of experiments (biological replicates) so I have a total of 30 
>>> arrays. I made a txt file (see below) with information on the 
>>> filename and corresponding treatment and experiment. *I'm not sure 
>>> how to include the information about which are paired within this 
>>> file??*
>>>
>>> The data has been loaded with ReadAffy and processed with the rma 
>>> function:
>>> cel <- ReadAffy()
>>> pre<-rma(cel)
>>> eset <- exprs(pre)
>>
>> This last step is unnecessary. The vast majority of functions in BioC 
>> are designed to work on ExpressionSet objects, so there is no need to 
>> extract the expression data. In other words, you will get the same 
>> results if you do
>>
>> fit <- lmFit(pre, design)
>>
>> or
>>
>> fit <- lmFit(eset, design)
>>
>> And the upside of using an ExpressionSet is you a) get to append 
>> phenotypic and other data with the object, and b) you can manipulate 
>> the ExpressionSet as if it were a matrix, and everything gets 
>> subsetted accordingly. So
>>
>> pre[,1:3]
>>
>> will do exactly what you would expect, resulting in an ExpressionSet 
>> containing data for only the first three samples.
>>
>>>
>>> *How do I combine the file with the information about filename, 
>>> treatment and experiment (see below) with my processed data using 
>>> factor and model matrix??*
>>> I'm trying to use/understand the code in section 8.8 of the Limma 
>>> user guide:
>>
>> I assume that is a typo and you meant section 8.6. But I wonder why 
>> you are emulating that section, given it shows how to analyze a 
>> timecourse experiment, and it appears your experiment isn't one?
>>
>> I think you would be better off looking at section 8.4.1, which shows 
>> how to do a paired experiment by blocking on subject. The file you 
>> show below doesn't contain the pairing information, but what you want 
>> to do is add a column similar to the SibShip column in 8.4.1 that you 
>> can use as a blocking variable.
>>
>> Best,
>>
>> Jim
>>
>>
>>>
>>> lev <- c("wt.0hr","wt.6hr","wt.24hr","mu.0hr","mu.6hr","mu.24hr")
>>> f <- factor(targets$Target, levels=lev)
>>> design <- model.matrix(~0+f)
>>> colnames(design) <- lev
>>> fit <- lmFit(eset, design)
>>>
>>> This is what the txt file with information on filename, treatment, 
>>> and experiment looks like:
>>>
>>> Filename Treatment experiment
>>> EF731DO.CEL dark 1
>>> EF731D12.CEL dark 1
>>> EF731D2.CEL dark 1
>>> EF731D24.CEL dark 1
>>> EF731D6.CEL dark 1
>>> EF731LO.CEL light 1
>>> EF731L12.CEL light 1
>>> EF731L2.CEL light 1
>>> EF731L24.CEL light 1
>>> EF731L6.CEL light 1
>>> EF813DO.CEL dark 2
>>> EF813D12.CEL dark 2
>>> EF813D2.CEL dark 2
>>> EF813D24.CEL dark 2
>>> EF813D6.CEL dark 2
>>> EF813LO.CEL light 2
>>> EF813L12.CEL light 2
>>> EF813L2.CEL light 2
>>> EF813L24.CEL light 2
>>> EF813L6.CEL light 2
>>> EF815DO.CEL dark 3
>>> EF815D12.CEL dark 3
>>> EF815D2.CEL dark 3
>>> EF815D24.CEL dark 3
>>> EF815D6.CEL dark 3
>>> EF815LO.CEL light 3
>>> EF815L12.CEL light 3
>>> EF815L2.CEL light 3
>>> EF815L24.CEL light 3
>>> EF815L6.CEL light 3
>>>
>>>
>>> Thanks in advance for your help!!
>>> Best,
>>> Lauren
>>> (graduate student)
>>>
>>>
>>> On Nov 7, 2012, at 12:27 PM, James W. MacDonald wrote:
>>>
>>>> Hi Lauren,
>>>>
>>>> Please don't take conversations off-line; we hope that the archived 
>>>> questions and answers will be useful for others.
>>>>
>>>> On 11/7/2012 3:10 PM, Lauren Sassoubre wrote:
>>>>> Thank you Jim!
>>>>>
>>>>> If you don't mind me asking another question, I was able to make a 
>>>>> cdf env last week but deleted it and now when I try to use the 
>>>>> function I get the error message I posted. I'm not sure what 
>>>>> changed. Are previously created cdf environments saved in some way 
>>>>> that would complicate making another with the same cdf file?
>>>>
>>>> You could have saved the old environment if when you quit R you 
>>>> said 'yes'. It will automatically be dumped into a file .RData, 
>>>> which will automatically be loaded if you start R in that same 
>>>> directory. If this is true, you will get a message
>>>>
>>>> [Previously saved workspace restored]
>>>>
>>>> after you start R. In which case you can do ls() to see what is in 
>>>> the saved workspace, and maybe your old env is there.
>>>>
>>>> However, this shouldn't interfere with the creation of a new 
>>>> environment. You can always overwrite things in the .GlobalEnv (the 
>>>> default workspace).
>>>>
>>>> You can always see if the file is where you think it is. Start R, 
>>>> and then do a
>>>>
>>>> dir(".", pattern = "[Cc][Dd][Ff]")
>>>>
>>>> if it responds with a character(0), then the cdf isn't in your 
>>>> working directory. You then have to use setwd() to change to the 
>>>> correct dir, or just point to the directory where you have the cdf.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>> Thanks again!
>>>>> Lauren
>>>>>
>>>>>
>>>>> On Nov 7, 2012, at 7:20 AM, James W. MacDonald wrote:
>>>>>
>>>>>> Hi Lauren,
>>>>>>
>>>>>> On 11/6/2012 9:11 PM, Lauren Sassoubre wrote:
>>>>>>> Hi,
>>>>>>> This error message has come up in previous posts but I haven't 
>>>>>>> found a solution that works. I'm having trouble loading a cdf 
>>>>>>> file for a custom array and using that cdf for the function 
>>>>>>> ReadAffy. Below are the code and error message. What does this 
>>>>>>> error mean? I have tried reinstalling R and deleting the cdf 
>>>>>>> files and reloading them from the original disk. I'm working 
>>>>>>> with the most recent version of R for Macs.
>>>>>>> Thanks in advance for the help!
>>>>>>> Lauren
>>>>>>>
>>>>>>>> make.cdf.env("Gilmorea520817.cdf")
>>>>>>> Error in isCDFXDA(file.path(path.expand(cdf.path), filename)) :
>>>>>>>  Unable to open the file 
>>>>>>> /Users/lmsassoubre/Desktop/solarsimandmicroarrays/CD_Gilmorea520187F/Full/Gilmorea520817.cdf
>>>>>> There are two obvious possibilities. First, the file might not be 
>>>>>> in the path you specify. For example:
>>>>>>
>>>>>>> dir(".", "nopackage.cdf")
>>>>>> character(0)
>>>>>>> make.cdf.env("nopackage.cdf")
>>>>>> Error in isCDFXDA(file.path(path.expand(cdf.path), filename)) :
>>>>>> Unable to open the file /misc/staff/jmacdon/nopackage.cdf
>>>>>>
>>>>>> alternatively, you might not have read permissions. This is much 
>>>>>> less likely, especially if you are copying the file and running R 
>>>>>> as the same user. To make the below work, I had to change 
>>>>>> permissions to a state that is unlikely to exist in that 
>>>>>> situation. However unlikely, it is still a possibility:
>>>>>>
>>>>>>> system("ls -lah AG.CDF")
>>>>>> --wx------ 1 jmacdon staff 20M Nov  7 07:13 AG.CDF
>>>>>>> make.cdf.env("AG.CDF")
>>>>>> Error in isCDFXDA(file.path(path.expand(cdf.path), filename)) :
>>>>>> Unable to open the file /misc/staff/jmacdon/AG.CDF
>>>>>>
>>>>>> I suppose there are other alternatives as well, but none come 
>>>>>> readily to mind.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org <mailto:Bioconductor at r-project.org> 
>>>>>>> <mailto:Bioconductor at r-project.org>
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives: 
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>> -- 
>>>>>> James W. MacDonald, M.S.
>>>>>> Biostatistician
>>>>>> University of Washington
>>>>>> Environmental and Occupational Health Sciences
>>>>>> 4225 Roosevelt Way NE, # 100
>>>>>> Seattle WA 98105-6099
>>>>>>
>>>>
>>>> -- 
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> University of Washington
>>>> Environmental and Occupational Health Sciences
>>>> 4225 Roosevelt Way NE, # 100
>>>> Seattle WA 98105-6099
>>>>
>>>
>>
>> -- 
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list