[Bioc-sig-seq] Import Solexa Data
Martin Morgan
mtmorgan at fhcrc.org
Tue Aug 18 20:10:12 CEST 2009
Hi John --
John Lande wrote:
> Dear Biocers,
>
> I have just received data from a Solexa facility.
>
> I tried the packages ShortRead and Rolexa, but it seems to me they require a
> specific file organization with 3 main folders as the input.
>
> Instead I have 2 types of file: the first looks like this
>
> HWI-EAS373 1 2 1 1 1 0 1
> NAGCCAGCTTACCTCCCGGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCCGATCCCGGACGAGCCCCCCAAT
> DGPRTTTTRRRTTTRTRTTTTTQPPTSUQPQPTSPSSSSBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM Y
These look like Solexa 'export' files, so
library(ShortRead)
aln <- readAligned('path/to/file', type='SolexaExport')
See ?readAligned for the supported types and arguments, and also confirm
that the files really are in 'SolexaExport' format.
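One way to do that confirmation before loading is sketched below. It assumes that an export file has 22 tab-separated fields per line, which is my understanding of the Solexa pipeline's export layout and not something stated in this thread. The helper name looksLikeExport is hypothetical and is not part of ShortRead, and 'path/to/file' is still a placeholder.

```r
# Hypothetical helper (not part of ShortRead): returns TRUE when each of the
# first `n` lines of `path` has `expected` tab-separated fields.
# Assumption: Solexa export files carry 22 tab-separated fields per line.
looksLikeExport <- function(path, n = 100L, expected = 22L) {
    lines <- readLines(path, n = n)
    if (length(lines) == 0L) return(FALSE)
    # split on literal tabs; quality strings may contain '#' or quotes,
    # so avoid parsers that treat those characters specially
    nfields <- lengths(strsplit(lines, "\t", fixed = TRUE))
    all(nfields == expected)
}

exportFile <- "path/to/file"   # placeholder path, as above

# Only attempt the import when the file exists, passes the check,
# and ShortRead is installed.
if (file.exists(exportFile) && looksLikeExport(exportFile) &&
    requireNamespace("ShortRead", quietly = TRUE)) {
    aln <- ShortRead::readAligned(exportFile, type = "SolexaExport")
    print(aln)                           # summary of the AlignedRead object
    print(head(ShortRead::sread(aln)))   # first few read sequences
}
```

If the check fails, the files may be in a different format (or the tabs may have been lost in transit), and ?readAligned lists the other values of 'type' to try.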
Martin
> HWI-EAS373 1 2 1 2 2 0 1
> NATCTGTTCTTGGCCCTGAGCCGGGGCAGGAACTGCTTACCACAGATATCCTGTTTGGCCCATATTCAGCTGTTC
> DKWUWUUWVUSWUWWWWUUUWWUWVVWSSSWVWSWUWVWWSSTQUVWWVWUWWQWVWWWQWSRKPWVWVWVUQWW
> NM Y
> HWI-EAS373 1 2 1 3 3 0 1
> NACCTTACACAGTCCTGCTGACCACCCCCACCGCCCTCAAAGTAGACGGCATCGCAGCTTGGATACACGCCGCCC
> DLRPPNSPRTTSPRRFSTSTSRRTRPRPSNSPRRNQRNTTRPKTSNRPPPSTTSRTBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 4 4 0 1
> NAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTG
> DOZXZUZZVZZZOXZUZWWUYXXXRZWMMMGWYNYXXTXXXXVBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM Y
> HWI-EAS373 1 2 1 5 5 0 1
> NATAGAATGTCATGGCACTGTTCAGAATGGCATACAACAACGGCCAGGTCTGCAGCCACATCAGGGCAAACACAT
> DMWRWSVXYRSNPXXWSPXXXYXUXXXYXTGTUXXSRWUWWYXYWXXYUXYXUXWWVXWVVVRUYXTTTVXXWTW
> NM Y
> HWI-EAS373 1 2 1 6 6 0 1
> NAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAAC
> DKUUYXUXXXXXURYXUXYXXXXYXXYUXWTTXXWVXXUXYYYUYYWSWXVSVXUXWPVWXVVVVVTNTWVVSVB
> NM Y
> HWI-EAS373 1 2 1 7 7 0 1
> NCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA
> DMWYYYUWSWWTYWYWWUWUWWYWYVVWWWWUWWWTVVYYWWWYTUWUWTYUVXTTUSSWVTWWURWUWNSUWQP
> NM Y
> HWI-EAS373 1 2 1 8 8 0 1
> NAGCGTGGGTCCCACTGTATCATTGGGGCACTGGTGCCAGGTTGAAAGTCCACCCGTCACGACCTTCTACACATG
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 9 9 0 1
> NAAACCTCTNACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCC
> DKTTSTTTMDNTTSSSTTTTSTTTRSUTRTTTTRSRSNPRRSSPSSTRBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 10 10 0 1
> NGGTTCTCTAGAAACTGCTGAGGGCTGGACCGCATCTGGGGACCATCTGTTCTTGGCCCTGAGCCGGGGCAGGAA
> DKRRTTRSSSSTNOSSNTQKQPNPUTMPPRSSOSSPRRRRSRSRSTBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM
>
>
> the second looks like this
>
> HWI-EAS373 1 2 1 1 1 0 1
> NAGCCAGCTTACCTCCCGGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCCGATCCCGGACGAGCCCCCCAAT
> DGPRTTTTRRRTTTRTRTTTTTQPPTSUQPQPTSPSSSSBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM Y
> HWI-EAS373 1 2 1 2 2 0 1
> NATCTGTTCTTGGCCCTGAGCCGGGGCAGGAACTGCTTACCACAGATATCCTGTTTGGCCCATATTCAGCTGTTC
> DKWUWUUWVUSWUWWWWUUUWWUWVVWSSSWVWSWUWVWWSSTQUVWWVWUWWQWVWWWQWSRKPWVWVWVUQWW
> NM Y
> HWI-EAS373 1 2 1 3 3 0 1
> NACCTTACACAGTCCTGCTGACCACCCCCACCGCCCTCAAAGTAGACGGCATCGCAGCTTGGATACACGCCGCCC
> DLRPPNSPRTTSPRRFSTSTSRRTRPRPSNSPRRNQRNTTRPKTSNRPPPSTTSRTBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 4 4 0 1
> NAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTG
> DOZXZUZZVZZZOXZUZWWUYXXXRZWMMMGWYNYXXTXXXXVBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM Y
> HWI-EAS373 1 2 1 5 5 0 1
> NATAGAATGTCATGGCACTGTTCAGAATGGCATACAACAACGGCCAGGTCTGCAGCCACATCAGGGCAAACACAT
> DMWRWSVXYRSNPXXWSPXXXYXUXXXYXTGTUXXSRWUWWYXYWXXYUXYXUXWWVXWVVVRUYXTTTVXXWTW
> NM Y
> HWI-EAS373 1 2 1 6 6 0 1
> NAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAAC
> DKUUYXUXXXXXURYXUXYXXXXYXXYUXWTTXXWVXXUXYYYUYYWSWXVSVXUXWPVWXVVVVVTNTWVVSVB
> NM Y
> HWI-EAS373 1 2 1 7 7 0 1
> NCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA
> DMWYYYUWSWWTYWYWWUWUWWYWYVVWWWWUWWWTVVYYWWWYTUWUWTYUVXTTUSSWVTWWURWUWNSUWQP
> NM Y
> HWI-EAS373 1 2 1 8 8 0 1
> NAGCGTGGGTCCCACTGTATCATTGGGGCACTGGTGCCAGGTTGAAAGTCCACCCGTCACGACCTTCTACACATG
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 9 9 0 1
> NAAACCTCTNACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCC
> DKTTSTTTMDNTTSSSTTTTSTTTRSUTRTTTTRSRSNPRRSSPSSTRBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM N
> HWI-EAS373 1 2 1 10 10 0 1
> NGGTTCTCTAGAAACTGCTGAGGGCTGGACCGCATCTGGGGACCATCTGTTCTTGGCCCTGAGCCGGGGCAGGAA
> DKRRTTRSSSSTNOSSNTQKQPNPUTMPPRSSOSSPRRRRSRSRSTBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> NM
>
>
> Can I import them into the analysis pipeline provided by Bioconductor?
> Which package is the most suitable?