[BioC] analyzing HumanHT12 with lumi
die_stevie at web.de
Wed Sep 9 11:00:14 CEST 2009
P.S. Just tried loading the chips individually. For one chip everything is fine. The other one seems to be the problem -> again "Inf" and "NaN" values...
I still don't know the reason.
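
One way to narrow this down, as a minimal sketch rather than anything confirmed in this thread: in base R, a mean of -Inf together with a standard deviation of NaN is what a log2-transformed column gives when it contains a zero. Presumably the quality control works on the log2 scale, so it may be worth checking the raw AVG_Signal values of the affected sample for zero or negative entries. The matrix `raw` below is made-up toy data standing in for the real intensity matrix; the sample names are borrowed from the QC table for illustration only.

```r
# Toy raw intensities (hypothetical values); one probe of sample *_I is 0.
raw <- matrix(c(51.6, 120.3, 88.1,
                75.2,   0.0, 64.9),
              nrow = 3,
              dimnames = list(c("probe1", "probe2", "probe3"),
                              c("4433719068_H", "4433719068_I")))

# Count non-positive values per sample; log2() turns 0 into -Inf.
n_bad <- colSums(raw <= 0)
print(n_bad)

# Reproduce the symptom: the affected column gets mean -Inf and sd NaN.
log_raw  <- log2(raw)
col_mean <- colMeans(log_raw)
col_sd   <- apply(log_raw, 2, sd)
print(rbind(mean = col_mean, sd = col_sd))

# Locate the offending probe/sample pairs (row and column indices).
bad_idx <- which(raw <= 0, arr.ind = TRUE)
print(bad_idx)
```

With the real data, `raw` would be the expression matrix of the imported object, e.g. `exprs(x.lumi)`. If zeros do show up for a single sample, that would point at the export for that chip rather than at lumiR itself.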
> -----Original Message-----
> From: <die_stevie at web.de>
> Sent: 09.09.09 10:36:34
> To: "Paul Leo" <p.leo at uq.edu.au>
> CC: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] analyzing HumanHT12 with lumi
> Hei!
>
> This also does not work.
> Loading the BeadStudio export always causes these "Inf" and "NA" values in quality control even with skipping the annotation at this point of the analysis.
>
> My export looks like this:
>
>
> [Header]
> BSGX Version 3.2.3
> Report Date 9/8/2009 13:41
> Project tmp
> Group Set all_seperated
> Analysis all_seperated_nonorm
> Normalization none
> [Group Probe Profile]
>
> TargetID ProbeID 4433719067_A:AVG_Signal 4433719067_A:BEAD_STDERR 4433719067_A:Avg_NBEADS 4433719067_A:Detection Pval
> 7A5 6450255 51.57291 2.277424 28 0.9106901
>
> (just for the first slide on the array)
>
>
> Is the layout correct?
> I wonder about the ":" in ArrayID_Slide":"AVG_Signal
>
> x.lumi<-lumiR(filename, convertNuID=FALSE, inputAnnotation=FALSE) should work, shouldn't it?
>
> Kind regards,
> Steffi
>
> > -----Original Message-----
> > From: "Paul Leo" <p.leo at uq.edu.au>
> > Sent: 09.09.09 09:35:14
> > To: die_stevie at web.de
> > CC: bioconductor at stat.math.ethz.ch
> > Subject: Re: [BioC] analyzing HumanHT12 with lumi
>
>
> > Not sure if this will help you ...
> > Have you tried using the annotations file at:
> >
> > http://www.switchtoi.com/annotationprevfiles.ilmn
> >
> > get the text version. see if that works for lumiR ?
> >
> > Personally I don't bother.
> > x.lumi<-lumiR(filenames,convertNuID=FALSE,inputAnnotation=FALSE)
> > and annotate later with Bioconductor libraries via the probe IDs, or with
> > the Illumina annotation file via the Array_Address_Id (do whatever makes
> > sense to you with multi-mappers)....
> >
> > ann<-read.delim("HumanHT-12_V3_0_R1_11283641_T.txt",header=T,skip=8,sep="\t",fill=TRUE)
> > dim(ann)
> >
> > [1] 48803 28
> > > colnames(ann)
> >  [1] "Species"             "Source"              "Search_Key"          "Transcript"
> >  [5] "ILMN_Gene"           "Source_Reference_ID" "RefSeq_ID"           "Unigene_ID"
> >  [9] "Entrez_Gene_ID"      "GI"                  "Accession"           "Symbol"
> > [13] "Protein_Product"     "Probe_Id"            "Array_Address_Id"    "Probe_Type"
> > [17] "Probe_Start"         "Probe_Sequence"      "Chromosome"          "Probe_Chr_Orientation"
> > [21] "Probe_Coordinates"   "Cytoband"            "Definition"          "Ontology_Component"
> > [25] "Ontology_Process"    "Ontology_Function"   "Synonyms"            "Obsolete_Probe_Id"
> > length(unique(ann[,"Array_Address_Id"]))
> > [1] 48803
> >
> > -----Original Message-----
> > From: die_stevie at web.de
> > To: amit491 at gmail.com
> > Cc: bioconductor at stat.math.ethz.ch
> > Subject: Re: [BioC] analyzing HumanHT12 with lumi
> > Date: Wed, 09 Sep 2009 09:01:53 +0200
> >
> > Hello Amit,
> >
> > first thank you very much for your response!
> >
> > I included the TargetID column and tried to run the lumiR function
> > with all options available but the result is still the same.
> >
> >
> > test <- lumiR(file = "D:/Programme/eclipse/tmp/FinalReport.txt",
> >     sep = "\t", detectionTh = 0.01, na.rm = TRUE, convertNuID = TRUE,
> >     lib.mapping = NULL, dec = '.', parseColumnName = TRUE,
> >     checkDupId = TRUE, QC = TRUE,
> >     columnNameGrepPattern = list(exprs='AVG_SIGNAL', se.exprs='BEAD_STD',
> >         detection='DETECTION', beadNum='Avg_NBEADS'),
> >     inputAnnotation=TRUE, annotationColumn=c('PROBE_SEQUENCE'),
> >     verbose = TRUE)
> >
> >
> > I also pre-processed a Mouse WG-6 chip (V2) and everything is fine
> > there: no duplicated IDs or "Inf" in the quality control.
> >
> > Maybe there is a problem with the HumanHT12 chip?
> >
> > Does anyone else have any advice?
> >
> > Thanks again!
> >
> > Kind regards,
> >
> > Steffi
> >
> >
> > > From: amit mandal [mailto:amit491 at gmail.com]
> > > *Sent:* Tuesday, September 08, 2009 5:30 PM
> > > *To:* stefanie.figura
> > > *Cc:* BioC_mail
> > > *Subject:* Fwd: [BioC] analyzing HumanHT12 with lumi
> > >
> > > hello Steffi,
> > > In 'lumi' one needs to import the data out of BeadStudio in a
> > > particular order of various columns (it can be arranged outside BS
> > > also). The columns in order are-
> > > 1) TargetID
> > > 2) ProbeID (this is different from the Probe_ID col)
> > > 3) Avg_Signal
> > > 4) BEAD_STDERR
> > > 5) Detection Pval
> > > These are the mandatory columns. Annotation columns can be added as
> > > well; information about them is given in the "Using lumi.." PDF that
> > > comes as a vignette with the package.
> > > Also, when importing the data with the lumiR command, one needs to
> > > specify the grep pattern of the column headers so that lumiR can
> > > recognize which column contains what. The default output already has
> > > the columns in the right order for lumiR's default settings, so this
> > > is only needed just in case.
> > > I haven't analyzed HT-12, only WG-6, and the above method works fine
> > > there. lumi takes "ProbeID" as the unique identifier (v3.0 chips
> > > onward), and I didn't encounter a 'duplicate ID..' message.
> > > I'm also unsure about the 'Inf' message. Maybe try importing the data
> > > with most of the options specified explicitly, e.g. the column grep
> > > pattern.
> > >
> > > regards
> > > amit mandal
> > >
> > > Graduate student
> > > Genomics & Molecular Medicine lab
> > > IGIB, Delhi
> > >
> > > On Tue, Sep 8, 2009 at 7:30 PM, stefanie.figura <figura at uni-muenster.de> wrote:
> > >
> > > Dear all!
> > >
> > > I tried to analyse the Illumina HumanHT12 chip with the lumi package
> > > and I have some questions about the import and the results of the
> > > quality control.
> > >
> > > My first question: which columns have to be exported from BeadStudio
> > > at a minimum? I am not sure, because the figure in the .pdf manual for
> > > the lumi package is not shown completely.
> > >
> > > I only exported ProbeID, PROBE_SEQUENCE (for nuID mapping with
> > > biocLite("lumiHumanAll.db")), AVG_Signal, BEAD_STDERR, Avg_NBEADS and
> > > Detection Pval from Group Gene Profile for all samples. Is there
> > > anything important for the analysis that I missed?
> > >
> > > I am not sure whether I made a mistake in the code, because of the
> > > results of the quality control:
> > >
> > > > importData <- lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> > >
> > > Perform Quality Control assessment of the LumiBatch object ...
> > >
> > > Directly converting probe sequence to nuIDs ...
> > >
> > > Duplicated IDs found and were merged!
> > >
> > > > importData
> > >
> > > Summary of data information:
> > >
> > > Data File Information:
> > >
> > > BSGX Version 3.2.3
> > >
> > > Report Date 9/8/2009 1:41:49 PM
> > >
> > > Project tmp
> > >
> > > Group Set all_seperated
> > >
> > > Analysis all_seperated_nonorm
> > >
> > > Normalization none
> > >
> > > Major Operation History:
> > >             submitted            finished
> > > 1 2009-09-08 15:44:37 2009-09-08 15:45:12
> > > 2 2009-09-08 15:45:12 2009-09-08 15:45:14
> > > 3 2009-09-08 15:45:34 2009-09-08 15:45:34
> > > 4 2009-09-08 15:45:14 2009-09-08 15:45:35
> > >   command
> > > 1 lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> > > 2 lumiQ(x.lumi = x.lumi, detectionTh = detectionTh, verbose = verbose)
> > > 3 Subsetting 48803 features.
> > > 4 addNuID2lumi(x.lumi = x.lumi, lib.mapping = lib.mapping, verbose = verbose)
> > >   lumiVersion
> > > 1 1.10.1
> > > 2 1.10.1
> > > 3 1.10.1
> > > 4 1.10.1
> > >
> > > Object Information:
> > >
> > > LumiBatch (storageMode: lockedEnvironment)
> > >
> > > assayData: 48802 features, 24 samples
> > >
> > > element names: beadNum, detection, exprs, se.exprs
> > >
> > > phenoData
> > >
> > > sampleNames: 4433719067_A, 4433719067_B, ..., 4433719068_L (24 total)
> > >
> > > varLabels and varMetadata description:
> > >
> > > sampleID: The unique Illumina microarray Id
> > >
> > > featureData
> > >
> > > featureNames: Ku8QhfS0n_hIOABXuE, fqPEquJRRlSVSfL.8A, ...,
> > > N8t5EuJCr0Tk9.zHno (48802 total)
> > >
> > > fvarLabels and fvarMetadata description:
> > >
> > > ProbeID: The Illumina microarray identifier
> > >
> > > experimentData: use 'experimentData(object)'
> > >
> > > Annotation:
> > >
> > > Control Data: Available
> > >
> > > QC information: Please run summary(x, 'QC') for details!
> > >
> > > > summary(importData, 'QC')
> > >
> > > Data dimension: 48802 genes x 24 samples
> > >
> > > Summary of Samples:
> > >                         4433719067_A 4433719067_B 4433719067_C 4433719067_D
> > > mean                          6.8010       6.7230       6.6660       6.6870
> > > standard deviation            1.6760       1.6360       1.6370       1.6550
> > > detection rate(0.01)          0.3367       0.3432       0.3459       0.3436
> > > distance to sample mean          Inf          Inf          Inf          Inf
> > >                         4433719067_E 4433719067_F 4433719067_G 4433719067_H
> > > mean                          6.7220       6.6060        6.623       6.5730
> > > standard deviation            1.6470       1.6440        1.675       1.6440
> > > detection rate(0.01)          0.3531       0.3318        0.346       0.3378
> > > distance to sample mean          Inf          Inf          Inf          Inf
> > >                         4433719067_I 4433719067_J 4433719067_K 4433719067_L
> > > mean                          6.5400       6.5390       6.5470       6.4570
> > > standard deviation            1.6420       1.6740       1.6790       1.6390
> > > detection rate(0.01)          0.3316       0.3424       0.3464       0.3518
> > > distance to sample mean          Inf          Inf          Inf          Inf
> > >                         4433719068_A 4433719068_B 4433719068_C 4433719068_D
> > > mean                          6.3170       6.3000        6.304        6.213
> > > standard deviation            1.5630       1.5970        1.619        1.566
> > > detection rate(0.01)          0.3348       0.3257        0.336        0.320
> > > distance to sample mean          Inf          Inf          Inf          Inf
> > >                         4433719068_E 4433719068_F 4433719068_G 4433719068_H
> > > mean                           6.253       6.2510        6.169       6.2380
> > > standard deviation             1.600       1.6170        1.579       1.6590
> > > detection rate(0.01)           0.347       0.3434        0.335       0.3455
> > > distance to sample mean          Inf          Inf          Inf          Inf
> > >                         4433719068_I 4433719068_J 4433719068_K 4433719068_L
> > > mean                            -Inf        6.191       6.1420       6.0510
> > > standard deviation               NaN        1.642       1.6150       1.5360
> > > detection rate(0.01)          0.3319        0.346       0.3429       0.3462
> > > distance to sample mean      62.4000          Inf          Inf          Inf
> > >
> > > I wonder about the "Inf" and "NaN" values, and I really think something
> > > went wrong.
> > >
> > > Any advice is welcome, because I just started to learn R.
> > >
> > > Thank you very much in advance!
> > >
> > > Kind regards,
> > >
> > > Steffi
> > >
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > >
> > > --
> > > ---------------------------------------------------------------
> > > The robbed that smiles, steals something
> > > from the thief.
> > > - Shakespeare
> > > ---------------------------------------------------------------
> > >
> > >
> > >
> >
> >
> >
> >
>
>