[BioC] RE : RE : RE : maping SNPs

Simon Noël simon.noel.2 at ulaval.ca
Tue Jan 11 22:18:38 CET 2011


   Hi,


   You said that one of the warnings was because I only had SNPs from
   chr1.  I decided to add the first 2 SNPs I have from chr2, but I get a new
   error...

   > myids <- c("rs7547453", "rs2840542", "rs1999527", "rs4648545",
   "rs10915459", "rs16838750", "rs12128230", "rs4637157", "rs11900053",
   "rs999999999")
   > mysnps <- makeGRangesFromRefSNPids(myids)
   Error in solveUserSEW0(start = start, end = end, width = width) :
     solving row 8: range cannot be determined from the supplied arguments
   (too many NAs)
   In addition: Warning messages:
   1: In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
     number of items to replace is not a multiple of replacement length
   2: In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
     number of items to replace is not a multiple of replacement length

   But if I try with only the first 2 SNPs on chr2, I get

   > myids <- c("rs4637157", "rs11900053", "rs999999999")
   > mysnps <- makeGRangesFromRefSNPids(myids)
   Warning message:
   In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
     number of items to replace is not a multiple of replacement length
   > mysnps
   GRanges with 3 ranges and 1 elementMetadata value
       seqnames         ranges strand |   RefSNP_id
          <Rle>      <IRanges>  <Rle> | <character>
   [1]      ch2 [29443, 29443]      * |   rs4637157
   [2]      ch2 [36787, 36787]      * |  rs11900053
   [3]  unknown [    0,     0]      * | rs999999999
   seqlengths
        ch2 unknown
         NA      NA

   So does that mean that I will have to go chr by chr and split my big file?
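
If a per-chromosome split does turn out to be necessary, a minimal base-R sketch could look like this. It assumes the big file has already been read into a data frame that carries a chromosome column; the data frame `snp_table`, its columns `snp_id` and `chrom`, and the helper `process_chunk()` are hypothetical names used only for illustration.

```r
# Hypothetical table of SNP ids with a chromosome column (names are assumptions).
snp_table <- data.frame(
  snp_id = c("rs7547453", "rs2840542", "rs4637157", "rs11900053"),
  chrom  = c("ch1", "ch1", "ch2", "ch2"),
  stringsAsFactors = FALSE
)

# split() groups the ids by chromosome and returns a named list.
ids_by_chrom <- split(snp_table$snp_id, snp_table$chrom)

# Placeholder for the real per-chromosome work, e.g. a call to
# makeGRangesFromRefSNPids(); here it only counts the ids.
process_chunk <- function(ids) length(ids)

# Run the helper once per chromosome.
counts <- vapply(ids_by_chrom, process_chunk, integer(1))
```

Each element of `ids_by_chrom` can then be handled separately, so the big file itself would not need to be split by hand.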

   Now for the problem of changing ch1 to chr1:

   > seqnames(mysnps)
   'factor' Rle of length 8 with 2 runs
     Lengths:       7       1
     Values :     ch1 unknown
   Levels(2): ch1 unknown
   > seqnames(mysnps) <- sub("ch", "chr", seqnames(mysnps))
   > seqnames(mysnps)
   'factor' Rle of length 8 with 2 runs
     Lengths:       7       1
     Values :    chr1 unknown
   Levels(2): chr1 unknown
   > map <- as.matrix(findOverlaps(mysnps, tx))
   Warning message:
   In .local(query, subject, maxgap, minoverlap, type, select, ...) :
     Only some seqnames from 'query' and 'subject' were not identical
   > mapExon <- as.matrix(findOverlaps(mysnps, txExon))
   Warning message:
   In .local(query, subject, maxgap, minoverlap, type, select, ...) :
     Only some seqnames from 'query' and 'subject' were not identical
   >
   > mapped_genes <- values(tx)$gene_id[map[, 2]]
   > mapped_snps <- rep.int(values(mysnps)$RefSNP_id[map[, 1]],
   elementLengths(mapped_genes))
   > snp2gene <- unique(data.frame(snp_id=mapped_snps,
   gene_id=unlist(mapped_genes)))
   > rownames(snp2gene) <- NULL
   > snp2gene[1:4, ]
        snp_id gene_id
   1 rs7547453    6497
   2 rs2840542   79906
   3 rs1999527   63976
   4 rs4648545    7161
   So now it's working on my computer :) but, as I said, I am only able to do
   SNPs from one chromosome.
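
One caveat on the renaming step: `sub("ch", "chr", ...)` is not anchored, so running it a second time would turn `chr1` into `chrr1`. Below is a minimal base-R sketch of a pattern that can be applied repeatedly. It works on a plain factor rather than on the GRanges object; the vector `chroms` and the helper `add_chr_prefix()` are made-up names standing in for the seqnames.

```r
# Stand-in for the seqnames values shown above (plain factor, not an Rle).
chroms <- factor(c(rep("ch1", 7), "unknown"))

# Match "ch" only at the start and only when it is not already "chr".
add_chr_prefix <- function(x) sub("^ch(?!r)", "chr", x, perl = TRUE)

# Rename the levels; the data values follow automatically.
levels(chroms) <- add_chr_prefix(levels(chroms))

# Applying it again leaves the names unchanged.
levels(chroms) <- add_chr_prefix(levels(chroms))
```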

   On the supercomputer, it still doesn't work, and on my computer it is still
   taking a lot of time.  What isn't working is

   > txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="refGene")
   Download the refGene table ... OK
   Download the refLink table ... OK
   Extract the 'transcripts' data frame ... OK
   Extract the 'splicings' data frame ... OK
   Download and preprocess the 'chrominfo' data frame ... OK
   Prepare the 'metadata' data frame ... OK
   Make the TranscriptDb object ... Error in .writeMetadataTable(conn,
   metadata) : subscript out of bounds
   In addition: There were 50 or more warnings (use warnings() to see the first
   50)


   Simon Noël
   CdeC

