[BioC] different gal files using limma

Sat Sep 15 04:08:01 CEST 2007

Dear Tiandao,

I'm glad that you've successfully merged your separate gal files. 
But, please, do not address questions specifically to me. This is a 
mailing list with many people who might make helpful comments.

You've given us no information at all about the design of your 
experiment, so no one has any chance of being able to tell you why 
some of your coefficients cannot be estimated. The message means that 
your experiment does not provide any information about the difference 
between these RNA sources and "N0", which you have specified as a 
reference. You need to give a little more thought to what comparisons 
you are really trying to make in your experiment.
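
A minimal sketch of how this warning can arise, in base R only. The targets
table below is hypothetical (the real one is not shown in this thread), and
reference_design() is an illustrative helper, not a limma function; it only
mimics the idea of a reference design with ref="N0". Here M55 and N6 are
hybridized only against each other, never against anything linked to N0:

```r
# Hypothetical two-colour targets; the real targets file is not shown here.
targets <- data.frame(
  Cy3 = c("N0", "N0", "N1", "M55", "N6"),
  Cy5 = c("N1", "N1", "H1", "N6", "M55"),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not part of limma): one column per
# non-reference RNA source, +1 where the source is in the Cy5 channel and
# -1 where it is in the Cy3 channel, matching M = log2(Cy5/Cy3).
reference_design <- function(targets, ref) {
  sources <- setdiff(unique(c(targets$Cy3, targets$Cy5)), ref)
  design <- matrix(0, nrow = nrow(targets), ncol = length(sources),
                   dimnames = list(NULL, sources))
  for (s in sources) {
    design[, s] <- (targets$Cy5 == s) - (targets$Cy3 == s)
  }
  design
}

design <- reference_design(targets, ref = "N0")

# A coefficient can be estimated only if its column adds rank to the design.
qrd <- qr(design)
n_coef <- ncol(design)
not_estimable <- if (qrd$rank < n_coef) {
  # Columns pivoted to the end of the QR decomposition add no new information.
  colnames(design)[qrd$pivot[(qrd$rank + 1):n_coef]]
} else {
  character(0)
}
print(design)
print(not_estimable)
```

In this made-up design the arrays identify only the difference between M55
and N6, not the difference of either one from N0, so the design has fewer
independent columns than coefficients and one of the two is flagged. Adding
an array that links M55 or N6 to N0, N1 or H1 would restore full rank.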

Best wishes
Gordon

At 07:08 AM 15/09/2007, Tiandao Li wrote:
>Dear Dr. Smyth,
>
>MA2 had the full set of IDs (2716 genes), while MA1 had 8 fewer IDs than
>the full set, i.e. 2708 genes. I want to match MA1 to MA2; however,
>there are 8 "NA" in the new MA$genes$ID instead of the IDs from MA2. The rest
>of them are the same. I will check whether there is any difference between MA1
>and the MA1 part of the new MA.
>
>I used your code to merge the MALists MA1 and MA2, but I can't get the
>correct result file.
> > m <- match(MA2$genes$ID, MA1$genes$ID)
> > MA <- cbind(MA1[m,], MA2)
>
>So I used the following to merge MA1 and MA2. The new MA file is the same
>as the one I joined manually.
>rownames(MA1$M) <- rownames(MA1$A) <- MA1$genes$ID
>MA3 <- new("MAList",list(M=MA1$M,A=MA1$A))
>rownames(MA2$M) <- rownames(MA2$A) <- MA2$genes$ID
>MA4 <- new("MAList",list(M=MA2$M,A=MA2$A))
>MA <- merge(MA4,MA3)
>
>I imported the entire gpr files and exported them to see if I did anything
>wrong. I also used some items as quality controls or to make some plots.
>Everything is fine; however, "Log Ratio (635/532)" sometimes gives me
>"character" instead of "numeric". Without importing the entire data, the
>PrintLayout was always wrong.
>
>Since I had 2 target files for reading in the gpr files, I have now put the
>2 target files together to create a new target file, and I use it to build
>the design matrix for the linear model.
>design <- modelMatrix(targets,ref="N0")
>fit <- lmFit(MA,design)
>However, I got the warning message:
>
>Coefficients not estimable: M55 N6
>
>Would you let me know why some coefficients can't be estimated from the
>linear model?
>
>Sincerely,
>
>Tiandao
>
>
>On Wed, 12 Sep 2007, Gordon Smyth wrote:
>
>Dear Tiandao,
>
>It doesn't necessarily make sense to try to merge MALists if they aren't the
>same length and don't have the same IDs. I suggest you get down to a subset of
>probes for which this is true, then try the merge command again. This assumes
>that the ID column of RG$genes has unambiguous identifiers for each probe. (I
>can't give you a lot of detail, because trying to troubleshoot this over email
>is very hard.)
>
>BTW, I notice that you're reading the entire GPR files into your RGList
>objects. This will make huge objects. Do you need to do that? Why not just
>
>   RG <- read.maimages(targets,source="genepix.median",ext="gpr")
>
>Best wishes
>Gordon
>
>At 07:26 AM 12/09/2007, Tiandao Li wrote:
> > Dear Dr. Smyth,
> >
> > Thanks for your help. I read in the gpr files using the 2 gal files
> > separately, then identified the spot types separately, normalized
> > separately, and removed all control spots separately, keeping only the gene
> > type for further analysis. Both MA1 and MA2 use the same gene IDs;
> > however, MA2$genes$ID has 8 more genes than MA1. I used your code to
> > match MA1 to MA2
> >
> > m <- match(MA2$genes$ID, MA1$genes$ID)
> > MA <- cbind(MA1[m,], MA2)
> >
> > I compared MA2 to the MA2 part of MA and the numbers are identical; however,
> > there are some "NA" in MA$genes$ID instead of gene IDs from MA2$genes$ID,
> > because MA1 and MA2 don't have the same length and IDs. Could I still use it?
> > There are 4 duplicate spots per gene on the array.
> >
> > I put 2 target files together to create a new target file, and use it to
> > build design matrix for linear model. Is it OK?
> >
> > Sincerely,
> >
> > Tiandao
> >
> > On Tue, 11 Sep 2007, Gordon Smyth wrote:
> >
> > Dear Tiandao,
> >
> > Dealing with multiple gal files is very tricky, but possible. In limma, you
> > need to read in the GPR files for each GAL file separately, identify control
> > spots separately, and normalize separately. So, if you have two GAL files,
> > you will end up with two normalized MAList objects MA1 and MA2.
> >
> > You will then need to align MA1 and MA2 by gene ID. There is a merge command,
> > but very often the situation is too complex for this command to handle.
> > Usually you will need to remove the control spots from MA1 and MA2
> > separately, to get down to a list of common genes, then sort MA1 to match the
> > gene order of MA2, then cbind them together.
> >
> > If MA1 and MA2 are of the same length, with the same gene IDs, then something
> > like this will do the merge:
> >
> >    m <- match(MA2$genes$ID, MA1$genes$ID)
> >    MA <- cbind(MA1[m,], MA2)
> >
> > There is an alternative method, which is to use the printorder() function to
> > map spots back to the original 384-well plate positions, then align the
> > arrays by 384-well plate. This method requires that the plates were used in
> > the same order throughout the printing, except for control plates.
> >
> > You need to be very careful!
> > Good luck.
> > Gordon
> >
> > > Date: Sun, 9 Sep 2007 14:26:47 -0500 (CDT)
> > > From: Tiandao Li <Tiandao.Li at usm.edu>
> > > Subject: [BioC] different gal files using limma
> > > To: Bioconductor_help <bioconductor at stat.math.ethz.ch>
> > > Message-ID: <Pine.LNX.4.64.0709091401440.32134 at orca.st.usm.edu>
> > > Content-Type: TEXT/PLAIN; charset=US-ASCII
> > >
> > > Hello,
> > >
> > > I am analyzing cDNA microarray data using limma. I generated the GAL file
> > > using the program that comes with chipwriter, and everything looked great.
> > > However, when I printed the first batch of chips, after the last dip of the
> > > pins in the first plate (print, wash), the pins dipped again in the first
> > > plate from the beginning (print, wash), and only then stopped for the plate
> > > change. The company gave us a patch to solve this problem. So the gal file
> > > for this batch is a little different from that of the other batches of
> > > chips: the locations of genes, MSP, and controls differ (5%). After
> > > hybridization, I used GenePix Pro 6.1 for spot finding. After reading the
> > > data into limma, I want to use the MSP and control spots for normalization.
> > > I don't know how to label the different gal files using readSpotTypes()
> > > across all chips.
> > >
> > > Thanks,
> > >
> > > Tiandao
> > >
> > > I am kind of new to R and limma. The following is my setting.
> > >
> > > > sessionInfo()
> > > R version 2.5.1 (2007-06-27)
> > > i386-pc-mingw32
> > >
> > > locale:
> > > LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> > > States.1252;LC_MONETARY=English_United
> > > States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> > >
> > > attached base packages:
> > > [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
> > > [7] "base"
> > >
> > > other attached packages:
> > >  statmod    limma
> > >  "1.3.0" "2.10.5"
> > >
> > > Code for analysis
> > >
> > > library(limma)
> > >
> > > A <- list(R="F635 Median",G="F532 Median",Rb="B635",Gb="B532")
> > > B <- list("Block", "Column", "Row", "Name", "ID", "X", "Y", "Dia.",
> > >   "F635 Median", "F635 Mean", "F635 SD", "F635 CV",
> > >   "B635", "B635 Median", "B635 Mean", "B635 SD", "B635 CV",
> > >   "% > B635+1SD", "% > B635+2SD", "F635 % Sat.",
> > >   "F532 Median", "F532 Mean", "F532 SD", "F532 CV",
> > >   "B532", "B532 Median", "B532 Mean", "B532 SD", "B532 CV",
> > >   "% > B532+1SD", "% > B532+2SD", "F532 % Sat.",
> > >   "Ratio of Medians (635/532)", "Ratio of Means (635/532)",
> > >   "Median of Ratios (635/532)", "Mean of Ratios (635/532)",
> > >   "Ratios SD (635/532)", "Rgn Ratio (635/532)", "Rgn R2 (635/532)",
> > >   "F Pixels", "B Pixels", "Circularity",
> > >   "Sum of Medians (635/532)", "Sum of Means (635/532)",
> > >   "Log Ratio (635/532)", "F635 Median - B635", "F532 Median - B532",
> > >   "F635 Mean - B635", "F532 Mean - B532",
> > >   "F635 Total Intensity", "F532 Total Intensity",
> > >   "SNR 635", "SNR 532", "Flags", "Normalize", "Autoflag")
> > >
> > > # read 6 test files
> > > targets<-readTargets(file="targets.txt", row.name="Name") # 6 test files
> > > RG <- read.maimages(targets$FileName, source="genepix", ext="gpr",
> > >                     columns=A, other.columns=B)
> > > spottypes <- readSpotTypes("spottypes3.txt") # short spot types
> > > RG$genes$Status <- controlStatus(spottypes,RG)
> > >
> > > targets
> > > SlideNumber     FileName        Cy3     Cy5     Name
> > > 1       13582917        N0      N1      N0N121
> > > 2       13582918        N0      N1      N0N122
> > > 3       13590446        N0      N1      N0N123
> > > 4       13590420        N1      H1      N1H121
> > > 5       13590521        N1      H1      N1H122
> > > 6       13591193        N1      H1      N1H123
> > >
> > > spottypes3
> > > SpotType        ID      Color
> > > gene    *       black
> > > Calibration     Calib*  blue
> > > Ratio   Ratio*  red
> > > Negative        Neg*|Util*      brown
> > > MSP     MSP     orange
> > > Alexa   Alexa*  yellow
> > > blank   NotDefined      green
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives: 
>http://news.gmane.org/gmane.science.biology.informatics.conductor