[BioC] analyzing HumanHT12 with lumi
Paul Leo
p.leo at uq.edu.au
Wed Sep 9 09:34:48 CEST 2009
Not sure if this will help you ...
Have you tried using the annotations file at:
http://www.switchtoi.com/annotationprevfiles.ilmn
Get the text version and see whether that works with lumiR.
Personally I don't bother. I read the data without annotation:
x.lumi<-lumiR(filenames,convertNuID=FALSE,inputAnnotation=FALSE)
and annotate later, either with Bioconductor libraries via the probe IDs
or with the Illumina annotation file via the Array_Address_Id (handle
multi-mappers in whatever way makes sense to you):
> ann<-read.delim("HumanHT-12_V3_0_R1_11283641_T.txt",header=TRUE,skip=8,sep="\t",fill=TRUE)
> dim(ann)
[1] 48803 28
> colnames(ann)
 [1] "Species"           "Source"              "Search_Key"       "Transcript"
 [5] "ILMN_Gene"         "Source_Reference_ID" "RefSeq_ID"        "Unigene_ID"
 [9] "Entrez_Gene_ID"    "GI"                  "Accession"        "Symbol"
[13] "Protein_Product"   "Probe_Id"            "Array_Address_Id" "Probe_Type"
[17] "Probe_Start"       "Probe_Sequence"      "Chromosome"       "Probe_Chr_Orientation"
[21] "Probe_Coordinates" "Cytoband"            "Definition"       "Ontology_Component"
[25] "Ontology_Process"  "Ontology_Function"   "Synonyms"         "Obsolete_Probe_Id"
> length(unique(ann[,"Array_Address_Id"]))
[1] 48803
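
To make the "annotate later" step concrete, here is a minimal sketch in base R of looking up annotation by Array_Address_Id. The data are toy values made up for illustration (the probe IDs and symbols are hypothetical, not taken from the real annotation file), and it assumes the row names of the expression matrix are array address IDs, which may not hold for your import settings.

```r
# Toy stand-in for the Illumina annotation table (the real file has 48803 rows).
# Probe IDs and symbols below are made up for illustration.
ann <- data.frame(
  Array_Address_Id = c(10008L, 10014L, 10017L),
  Probe_Id         = c("ILMN_A", "ILMN_B", "ILMN_C"),
  Symbol           = c("GENE1", "GENE2", "GENE3"),
  stringsAsFactors = FALSE
)

# Toy expression matrix: row names are assumed to be array address IDs,
# in a different order than ann and with one ID missing from the annotation.
expr <- matrix(c(7.1, 6.3, 8.0, 6.9, 6.1, 7.7), nrow = 3,
               dimnames = list(c("10017", "10008", "99999"), c("s1", "s2")))

# match() gives, for each expression row, its row in ann (NA if absent).
idx <- match(rownames(expr), as.character(ann$Array_Address_Id))

annotated <- data.frame(
  Array_Address_Id = rownames(expr),
  Symbol           = ann$Symbol[idx],
  expr,
  stringsAsFactors = FALSE
)
print(annotated)
```

Because Array_Address_Id is unique in the annotation file (48803 unique values for 48803 rows), match() is unambiguous here. Unmatched rows simply get NA rather than being dropped.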
-----Original Message-----
From: die_stevie at web.de
To: amit491 at gmail.com
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] analyzing HumanHT12 with lumi
Date: Wed, 09 Sep 2009 09:01:53 +0200
Hello Amit,
first thank you very much for your response!
I included the TargetID column and tried to run the lumiR function
with all available options, but the result is still the same.
test <- lumiR(file = "D:/Programme/eclipse/tmp/FinalReport.txt", sep = "\t",
    detectionTh = 0.01, na.rm = TRUE, convertNuID = TRUE, lib.mapping = NULL,
    dec = '.', parseColumnName = TRUE, checkDupId = TRUE, QC = TRUE,
    columnNameGrepPattern = list(exprs='AVG_SIGNAL', se.exprs='BEAD_STD',
        detection='DETECTION', beadNum='Avg_NBEADS'),
    inputAnnotation=TRUE, annotationColumn=c('PROBE_SEQUENCE'), verbose = TRUE)
I also pre-processed a Mouse WG-6 chip (V2) and everything is fine
there: no duplicated IDs or “Inf” in the quality control.
Maybe there is a problem with the HumanHT12 chip?
Does anyone else have any advice?
Thanks again!
Kind regards,
Steffi
> From: amit mandal [mailto:amit491 at gmail.com]
> *Sent:* Tuesday, September 08, 2009 5:30 PM
> *To:* stefanie.figura
> *Cc:* BioC_mail
> *Subject:* Fwd: [BioC] analyzing HumanHT12 with lumi
>
> hello Steffi,
> In 'lumi' the data has to be exported from BeadStudio with the columns
> in a particular order (they can also be rearranged outside BeadStudio).
> The columns, in order, are:
> 1) TargetID
> 2) ProbeID (this is different from the Probe_ID col)
> 3) Avg_Signal
> 4) BEAD_STDERR
> 5) Detection Pval
> These columns are mandatory. Annotation columns can be added as well;
> they are described in the "Using lumi.." pdf that comes as a vignette
> with the package.
> Also, when importing the data with the lumiR command, one can specify
> the grep pattern of the column headers by which lumiR recognizes which
> column contains what. The default BeadStudio output already has the
> columns in the order lumiR expects with its default settings, so this
> is just in case.
> I haven't analyzed HT-12, only WG-6, and the above method works fine
> there. lumi takes "ProbeID" as the unique identifier (v 3.0 chips
> onward) and I didn't encounter a 'duplicate ID..' message.
> I'm also unsure about the 'Inf' message. Maybe try importing the data
> with most of the options specified explicitly, e.g. the column grep
> pattern.
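
As a rough illustration of what a column grep pattern does, the sketch below checks in base R which header columns each pattern would pick up. The header vector is hypothetical (modelled on the column names mentioned in this thread), and the case-insensitive matching is an assumption of the sketch, not a statement about how lumiR matches internally.

```r
# Hypothetical header of a BeadStudio export for a single sample.
hdr <- c("TargetID", "ProbeID",
         "4433719067_A.AVG_Signal", "4433719067_A.BEAD_STDERR",
         "4433719067_A.Avg_NBEADS", "4433719067_A.Detection Pval")

# Patterns as passed to columnNameGrepPattern elsewhere in this thread.
patterns <- list(exprs = "AVG_SIGNAL", se.exprs = "BEAD_STD",
                 detection = "DETECTION", beadNum = "Avg_NBEADS")

# For each pattern, list the matching column names.
# ignore.case = TRUE is an assumption made for this sketch.
hits <- lapply(patterns, function(p) grep(p, hdr, ignore.case = TRUE, value = TRUE))
print(hits)

# Each data type should match exactly one column per sample.
n_hits <- sapply(hits, length)
print(n_hits)
```

Running the same check on the first line of your own export (for example via readLines on the file, after skipping the BeadStudio header block) shows quickly whether a pattern matches nothing or matches more columns than intended.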
>
> regards
> amit mandal
>
> Graduate student
> Genomics & Molecular Medicine lab
> IGIB, Delhi
>
> On Tue, Sep 8, 2009 at 7:30 PM, stefanie.figura <figura at uni-muenster.de> wrote:
>
> Dear all!
>
> I tried to analyse the Illumina HumanHT12 chip with the lumi package
> and I have some questions about the import and the results of the
> quality control.
>
> My first question is which columns have to be exported from
> BeadStudio at least? I am not sure because in the .pdf manual for the
> lumi package the figure is not completely represented.
>
> I only exported ProbeID, PROBE_SEQUENCE (for nuID mapping with
> biocLite("lumiHumanAll.db")), AVG_Signal, BEAD_STDERR, Avg_NBEADS and
> Detection Pval from Group Gene Profile for all samples. Is there
> anything I missed which is important for the analysis?
>
> I am not sure whether I made a mistake in the code, given the results
> of the quality control:
>
> > importData <- lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
>
> Perform Quality Control assessment of the LumiBatch object ...
> Directly converting probe sequence to nuIDs ...
> Duplicated IDs found and were merged!
>
> > importData
> Summary of data information:
>   Data File Information:
>     BSGX Version   3.2.3
>     Report Date    9/8/2009 1:41:49 PM
>     Project        tmp
>     Group Set      all_seperated
>     Analysis       all_seperated_nonorm
>     Normalization  none
>
> Major Operation History:
>             submitted            finished
> 1 2009-09-08 15:44:37 2009-09-08 15:45:12
> 2 2009-09-08 15:45:12 2009-09-08 15:45:14
> 3 2009-09-08 15:45:34 2009-09-08 15:45:34
> 4 2009-09-08 15:45:14 2009-09-08 15:45:35
>   command
> 1 lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> 2 lumiQ(x.lumi = x.lumi, detectionTh = detectionTh, verbose = verbose)
> 3 Subsetting 48803 features.
> 4 addNuID2lumi(x.lumi = x.lumi, lib.mapping = lib.mapping, verbose = verbose)
>   lumiVersion
> 1      1.10.1
> 2      1.10.1
> 3      1.10.1
> 4      1.10.1
>
> Object Information:
> LumiBatch (storageMode: lockedEnvironment)
> assayData: 48802 features, 24 samples
>   element names: beadNum, detection, exprs, se.exprs
> phenoData
>   sampleNames: 4433719067_A, 4433719067_B, ..., 4433719068_L (24 total)
>   varLabels and varMetadata description:
>     sampleID: The unique Illumina microarray Id
> featureData
>   featureNames: Ku8QhfS0n_hIOABXuE, fqPEquJRRlSVSfL.8A, ..., N8t5EuJCr0Tk9.zHno (48802 total)
>   fvarLabels and fvarMetadata description:
>     ProbeID: The Illumina microarray identifier
> experimentData: use 'experimentData(object)'
> Annotation:
> Control Data: Available
> QC information: Please run summary(x, 'QC') for details!
>
> > summary(importData, 'QC')
> Data dimension: 48802 genes x 24 samples
>
> Summary of Samples:
>                         4433719067_A 4433719067_B 4433719067_C 4433719067_D
> mean                          6.8010       6.7230       6.6660       6.6870
> standard deviation            1.6760       1.6360       1.6370       1.6550
> detection rate(0.01)          0.3367       0.3432       0.3459       0.3436
> distance to sample mean          Inf          Inf          Inf          Inf
>
>                         4433719067_E 4433719067_F 4433719067_G 4433719067_H
> mean                          6.7220       6.6060        6.623       6.5730
> standard deviation            1.6470       1.6440        1.675       1.6440
> detection rate(0.01)          0.3531       0.3318        0.346       0.3378
> distance to sample mean          Inf          Inf          Inf          Inf
>
>                         4433719067_I 4433719067_J 4433719067_K 4433719067_L
> mean                          6.5400       6.5390       6.5470       6.4570
> standard deviation            1.6420       1.6740       1.6790       1.6390
> detection rate(0.01)          0.3316       0.3424       0.3464       0.3518
> distance to sample mean          Inf          Inf          Inf          Inf
>
>                         4433719068_A 4433719068_B 4433719068_C 4433719068_D
> mean                          6.3170       6.3000        6.304        6.213
> standard deviation            1.5630       1.5970        1.619        1.566
> detection rate(0.01)          0.3348       0.3257        0.336        0.320
> distance to sample mean          Inf          Inf          Inf          Inf
>
>                         4433719068_E 4433719068_F 4433719068_G 4433719068_H
> mean                           6.253       6.2510        6.169       6.2380
> standard deviation             1.600       1.6170        1.579       1.6590
> detection rate(0.01)           0.347       0.3434        0.335       0.3455
> distance to sample mean          Inf          Inf          Inf          Inf
>
>                         4433719068_I 4433719068_J 4433719068_K 4433719068_L
> mean                            -Inf        6.191       6.1420       6.0510
> standard deviation               NaN        1.642       1.6150       1.5360
> detection rate(0.01)          0.3319        0.346       0.3429       0.3462
> distance to sample mean      62.4000          Inf          Inf          Inf
>
> I am puzzled by the "Inf" and "NaN" values and suspect that something
> went wrong.
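
One possible explanation, which nothing in this thread confirms, is that a single sample (here 4433719068_I, the only one with mean -Inf) contains a zero intensity, so that a log2 transform yields -Inf for that probe. The sketch below reproduces the symptom on toy numbers in base R and shows how to locate such values. The matrix `raw` is made up, and the assumption that the QC summary is computed on a log2 scale is mine.

```r
# Toy raw intensities for two samples; sample "I" contains a single zero.
raw <- cbind(H = c(120, 85, 300, 95),
             I = c(110,  0, 280, 90))

# log2(0) is -Inf, which propagates into the column summaries.
lg <- log2(raw)
col_mean <- colMeans(lg)      # mean of column I becomes -Inf
col_sd   <- apply(lg, 2, sd)  # sd of column I becomes NaN
print(col_mean)
print(col_sd)

# Locate non-positive raw values: count per sample, then row/column positions.
n_bad <- colSums(raw <= 0)
print(n_bad)
print(which(raw <= 0, arr.ind = TRUE))
```

If your imported object shows the same pattern, applying the colSums check to its raw expression matrix (for a LumiBatch, the matrix returned by exprs()) would tell you which sample and probe are involved.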
>
> Any advice is welcome, because I just started to learn R.
>
> Thank you very much in advance!
>
> Kind regards,
>
> Steffi
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> ---------------------------------------------------------------
> The robbed that smiles, steals something
> from the thief.
> - Shakespeare
> ---------------------------------------------------------------