[BioC] Question about sequence data results
Hrishikesh Deshmukh
d_hrishikesh at yahoo.com
Mon Feb 7 18:53:35 CET 2005
Dear Bioconductorians,
I have written a script which reads in Affymetrix
data, filters based on intensity values and then pulls
sequence data for the probes which satisfy the
intensity-based filter. I notice that the results given by
the script differ from the sequence data files that
Affymetrix supplies!
Here are the top three sequences from my result, and below
them are the sequences for the same probe IDs from the
Affymetrix sequence file. Why do I get different sequences?
I am attaching the script file for your perusal; your
help is appreciated.
h.pbn my.in.seq
1000_at TTGCGCTACAGCTAGGCCGCATGCT
1000_at TGGAAGCCAGGAAGGCCTATGTGAA
100_g_at TTCCCTGAAGGAACATTCCTTAGTC
>probe:HG-U95Av2:1000_at:399:559;
TCTCCTTTGCTGAGGCCTCCAGCTT
>probe:HG-U95Av2:1000_at:544:185;
AGGCCTCCAGCTTCAGGCAGGCCAA
>probe:HG-U95Av2:1000_at:530:505;
CCAGCTTCAGGCAGGCCAAGGCCTT
>probe:HG-U95Av2:1000_at:617:349;
AGCTCAGGTGGCCCCAGTTCAATCT
>probe:HG-U95Av2:1000_at:459:489;
AGTTCTGGAATGGAAGGGTTCTGGC
>probe:HG-U95Av2:1000_at:408:545;
TAGGGACTCAGGGCCATGCCTGCCC
>probe:HG-U95Av2:1000_at:484:311;
TTCCCTGAAGGAACATTCCTTAGTC
>probe:HG-U95Av2:1000_at:548:333;
GAAGGAACATTCCTTAGTCTCAAGG
>probe:HG-U95Av2:1000_at:578:369;
CTTAGTCTCAAGGGCTAGCATCCCT
>probe:HG-U95Av2:1000_at:498:465;
CTCAAGGGCTAGCATCCCTGAGGAG
>probe:HG-U95Av2:1000_at:503:441;
GGCTAGCATCCCTGAGGAGCCAGGC
>probe:HG-U95Av2:1000_at:482:439;
CTGTCAAAGCTGTCACTTCGCGTGC
>probe:HG-U95Av2:1000_at:397:545;
AAGCTGTCACTTCGCGTGCCCTCGC
>probe:HG-U95Av2:1000_at:352:465;
CGCGTGCCCTCGCTGCTTCTGTGTG
>probe:HG-U95Av2:1000_at:253:495;
CCCTCGCTGCTTCTGTGTGTGGTGA
>probe:HG-U95Av2:1000_at:228:631;
CTGCTTCTGTGTGTGGTGAGCAGAA
++++++++++++++++++++++++++++++++++++++++
>probe:HG-U95Av2:100_g_at:497:273;
CATCTGGAACAGCTGCTCTTGGTCA
>probe:HG-U95Av2:100_g_at:208:557;
AACAGCTGCTCTTGGTCACCCATCT
>probe:HG-U95Av2:100_g_at:495:355;
GCTGCTCTTGGTCACCCATCTTGAC
>probe:HG-U95Av2:100_g_at:478:371;
TTGAGGTGCTGCAGGCCAGTGATAA
>probe:HG-U95Av2:100_g_at:612:429;
CTACCCCGGCTGCAGGAGCTGCTAC
>probe:HG-U95Av2:100_g_at:563:317;
GCAGGAGCTGCTACTGTGCAACAAC
>probe:HG-U95Av2:100_g_at:223:559;
GCAGCCTGCAGTGCTCCAGCCTCTT
>probe:HG-U95Av2:100_g_at:523:575;
GTCCTCCTCAACCTGCAGGGTAACC
>probe:HG-U95Av2:100_g_at:551:445;
AGGGTAACCCGCTGTGCCAAGCGGT
>probe:HG-U95Av2:100_g_at:509:475;
GCATCTTGGAGCAACTGGCTGAACT
>probe:HG-U95Av2:100_g_at:576:249;
AGCAACTGGCTGAACTGCTGCCTTC
>probe:HG-U95Av2:100_g_at:568:349;
CTGGCTGAACTGCTGCCTTCAGTTA
>probe:HG-U95Av2:100_g_at:523:441;
GCTGCCTTCAGTTAGCAGCGTCCTC
>probe:HG-U95Av2:100_g_at:562:421;
CCTTCAGTTAGCAGCGTCCTCACCT
>probe:HG-U95Av2:100_g_at:622:473;
AGTTAGCAGCGTCCTCACCTAAGAG
>probe:HG-U95Av2:100_g_at:567:607;
GCCCTTTAACTTATTGGGACTGAAT
library(affy)
library(hgu95av2probe)
data(hgu95av2probe)
summary(hgu95av2probe)
Data <- ReadAffy()
pmi <- pm(Data)
mmi <- mm(Data)
pbn <- probeNames(Data)
rng.pmi <- apply(pmi,1,range)
rng.mmi <- apply(mmi,1,range)
in.boundspm <- ((rng.pmi[1,] >= 200) & (rng.pmi[1,] <= 20000))
in.boundsmm <- ((rng.mmi[1,] >= 200) & (rng.mmi[1,] <= 20000))
in.bounds <- (in.boundspm & in.boundsmm)
length(pmi[,1])
ac1 <- 1:201800
ac2 <- ac1[in.bounds]
h.pbn <- pbn[ac2]
h.pmi <- pmi[ac2,]
h.mmi <- mmi[ac2,]
my.in.seqpm <- hgu95av2probe$sequence[in.boundspm]
my.in.seqmm <- hgu95av2probe$sequence[in.boundsmm]
my.in.seq <- hgu95av2probe$sequence[in.bounds]
seq.data<- cbind(h.pbn,my.in.seq)
write.table(seq.data, file="SeqData.txt", quote=FALSE, row.names=FALSE, col.names=TRUE, sep=" ")
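
One thing worth checking in the script above is that in.bounds is computed from the rows of pm(Data) but is then used to index hgu95av2probe$sequence by position. That only pairs each probe with its own sequence if both objects list the probes in the same row order. Whether they do is an assumption to verify, not something established here. The toy sketch below uses made-up probe sets, coordinates, and sequences (none of it is real HG-U95Av2 data). It shows how a positional mask can pick the wrong sequences when the orders differ, and how matching on a shared key avoids this.

```r
# Toy stand-in for the intensity data: one row per probe, in "array" order.
# All names and values are hypothetical and only illustrate the indexing issue.
array_order <- data.frame(
  probeset      = c("A_at", "A_at", "B_at", "B_at"),
  x             = c(10, 11, 20, 21),
  y             = c(5, 5, 7, 7),
  min_intensity = c(150, 300, 5000, 90),
  stringsAsFactors = FALSE
)

# Toy stand-in for a probe sequence table: same probes, different row order.
seq_table <- data.frame(
  probeset = c("A_at", "B_at", "A_at", "B_at"),
  x        = c(11, 21, 10, 20),
  y        = c(5, 7, 5, 7),
  sequence = c("CCCC", "GGGG", "AAAA", "TTTT"),
  stringsAsFactors = FALSE
)

# Intensity filter computed in array order.
keep <- array_order$min_intensity >= 200 & array_order$min_intensity <= 20000

# Positional indexing: applies the array-order mask to a table in another order.
wrong <- seq_table$sequence[keep]

# Key-based lookup: pair rows by their (x, y) coordinates instead of by position.
key_array <- paste(array_order$x, array_order$y, sep = ":")
key_seq   <- paste(seq_table$x, seq_table$y, sep = ":")
idx       <- match(key_array[keep], key_seq)
right     <- seq_table$sequence[idx]

print(data.frame(probeset = array_order$probeset[keep], wrong, right))
```

With the real objects, the x and y columns of hgu95av2probe could serve as the key on the sequence side, provided the coordinates of the pm(Data) rows are obtained in the same convention. Comparing nrow(hgu95av2probe) with nrow(pmi) is a quick first check, since a logical mask of one length applied to a vector of another length will also misalign.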
Eagerly waiting for your reply.
Thanks in advance.
Hrishi