[Bioc-sig-seq] Import Solexa Data

Martin Morgan mtmorgan at fhcrc.org
Tue Aug 18 20:10:12 CEST 2009


Hi John --

John Lande wrote:
> Dear Biocers,
> 
> I have just received data from a Solexa facility.
> 
> I tried the packages ShortRead and Rolexa, but it seems to me they require a
> specific file organization with 3 main folders as the input.
> 
> Instead I have 2 types of file: the first looks like this
> 
> HWI-EAS373    1    2    1    1    1    0    1
> NAGCCAGCTTACCTCCCGGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCCGATCCCGGACGAGCCCCCCAAT
> DGPRTTTTRRRTTTRTRTTTTTQPPTSUQPQPTSPSSSSBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            Y

These look like Solexa 'export' files, so try

  library(ShortRead)
  aln <- readAligned('path/to/file', type='SolexaExport')

See ?readAligned for details, and also confirm that the files are
actually in 'SolexaExport' format.
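
One rough way to do that confirmation before calling readAligned is to
count the tab-separated fields on each line. The sketch below assumes the
export layout has 22 tab-delimited fields per read (machine, run, lane,
tile, x, y, index, read number, sequence, quality, the alignment columns,
and a final Y/N filter flag); that count is an assumption to check against
your pipeline's documentation. looksLikeSolexaExport is a hypothetical
helper name, not a ShortRead function.

```r
## Hypothetical helper (not part of ShortRead): TRUE when every non-blank
## line of 'path' has 'nFields' tab-separated fields.
## ASSUMPTION: Solexa export files carry 22 fields per line.
looksLikeSolexaExport <- function(path, nFields = 22L) {
    ## quote = "" and comment.char = "" so that quote or '#' characters
    ## inside quality strings are treated as ordinary data
    n <- count.fields(path, sep = "\t", quote = "", comment.char = "",
                      blank.lines.skip = TRUE)
    length(n) > 0L && all(n == nFields)
}
```

For example, looksLikeSolexaExport('path/to/file') returning FALSE would
suggest the file is some other format (or was truncated), in which case a
different 'type' argument to readAligned may be needed.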

Martin

> HWI-EAS373    1    2    1    2    2    0    1
> NATCTGTTCTTGGCCCTGAGCCGGGGCAGGAACTGCTTACCACAGATATCCTGTTTGGCCCATATTCAGCTGTTC
> DKWUWUUWVUSWUWWWWUUUWWUWVVWSSSWVWSWUWVWWSSTQUVWWVWUWWQWVWWWQWSRKPWVWVWVUQWW
> NM                                            Y
> HWI-EAS373    1    2    1    3    3    0    1
> NACCTTACACAGTCCTGCTGACCACCCCCACCGCCCTCAAAGTAGACGGCATCGCAGCTTGGATACACGCCGCCC
> DLRPPNSPRTTSPRRFSTSTSRRTRPRPSNSPRRNQRNTTRPKTSNRPPPSTTSRTBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    4    4    0    1
> NAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTG
> DOZXZUZZVZZZOXZUZWWUYXXXRZWMMMGWYNYXXTXXXXVBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            Y
> HWI-EAS373    1    2    1    5    5    0    1
> NATAGAATGTCATGGCACTGTTCAGAATGGCATACAACAACGGCCAGGTCTGCAGCCACATCAGGGCAAACACAT
> DMWRWSVXYRSNPXXWSPXXXYXUXXXYXTGTUXXSRWUWWYXYWXXYUXYXUXWWVXWVVVRUYXTTTVXXWTW
> NM                                            Y
> HWI-EAS373    1    2    1    6    6    0    1
> NAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAAC
> DKUUYXUXXXXXURYXUXYXXXXYXXYUXWTTXXWVXXUXYYYUYYWSWXVSVXUXWPVWXVVVVVTNTWVVSVB
> NM                                            Y
> HWI-EAS373    1    2    1    7    7    0    1
> NCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA
> DMWYYYUWSWWTYWYWWUWUWWYWYVVWWWWUWWWTVVYYWWWYTUWUWTYUVXTTUSSWVTWWURWUWNSUWQP
> NM                                            Y
> HWI-EAS373    1    2    1    8    8    0    1
> NAGCGTGGGTCCCACTGTATCATTGGGGCACTGGTGCCAGGTTGAAAGTCCACCCGTCACGACCTTCTACACATG
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    9    9    0    1
> NAAACCTCTNACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCC
> DKTTSTTTMDNTTSSSTTTTSTTTRSUTRTTTTRSRSNPRRSSPSSTRBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    10    10    0    1
> NGGTTCTCTAGAAACTGCTGAGGGCTGGACCGCATCTGGGGACCATCTGTTCTTGGCCCTGAGCCGGGGCAGGAA
> DKRRTTRSSSSTNOSSNTQKQPNPUTMPPRSSOSSPRRRRSRSRSTBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM
> 
> 
> The second looks like this:
> 
> HWI-EAS373    1    2    1    1    1    0    1
> NAGCCAGCTTACCTCCCGGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCCGATCCCGGACGAGCCCCCCAAT
> DGPRTTTTRRRTTTRTRTTTTTQPPTSUQPQPTSPSSSSBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            Y
> HWI-EAS373    1    2    1    2    2    0    1
> NATCTGTTCTTGGCCCTGAGCCGGGGCAGGAACTGCTTACCACAGATATCCTGTTTGGCCCATATTCAGCTGTTC
> DKWUWUUWVUSWUWWWWUUUWWUWVVWSSSWVWSWUWVWWSSTQUVWWVWUWWQWVWWWQWSRKPWVWVWVUQWW
> NM                                            Y
> HWI-EAS373    1    2    1    3    3    0    1
> NACCTTACACAGTCCTGCTGACCACCCCCACCGCCCTCAAAGTAGACGGCATCGCAGCTTGGATACACGCCGCCC
> DLRPPNSPRTTSPRRFSTSTSRRTRPRPSNSPRRNQRNTTRPKTSNRPPPSTTSRTBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    4    4    0    1
> NAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTG
> DOZXZUZZVZZZOXZUZWWUYXXXRZWMMMGWYNYXXTXXXXVBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            Y
> HWI-EAS373    1    2    1    5    5    0    1
> NATAGAATGTCATGGCACTGTTCAGAATGGCATACAACAACGGCCAGGTCTGCAGCCACATCAGGGCAAACACAT
> DMWRWSVXYRSNPXXWSPXXXYXUXXXYXTGTUXXSRWUWWYXYWXXYUXYXUXWWVXWVVVRUYXTTTVXXWTW
> NM                                            Y
> HWI-EAS373    1    2    1    6    6    0    1
> NAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAAC
> DKUUYXUXXXXXURYXUXYXXXXYXXYUXWTTXXWVXXUXYYYUYYWSWXVSVXUXWPVWXVVVVVTNTWVVSVB
> NM                                            Y
> HWI-EAS373    1    2    1    7    7    0    1
> NCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA
> DMWYYYUWSWWTYWYWWUWUWWYWYVVWWWWUWWWTVVYYWWWYTUWUWTYUVXTTUSSWVTWWURWUWNSUWQP
> NM                                            Y
> HWI-EAS373    1    2    1    8    8    0    1
> NAGCGTGGGTCCCACTGTATCATTGGGGCACTGGTGCCAGGTTGAAAGTCCACCCGTCACGACCTTCTACACATG
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    9    9    0    1
> NAAACCTCTNACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCC
> DKTTSTTTMDNTTSSSTTTTSTTTRSUTRTTTTRSRSNPRRSSPSSTRBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM                                            N
> HWI-EAS373    1    2    1    10    10    0    1
> NGGTTCTCTAGAAACTGCTGAGGGCTGGACCGCATCTGGGGACCATCTGTTCTTGGCCCTGAGCCGGGGCAGGAA
> DKRRTTRSSSSTNOSSNTQKQPNPUTMPPRSSOSSPRRRRSRSRSTBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM
> 
> 
> Can I import them into the analysis pipeline provided by Bioconductor?
> Which package is the most suitable?
