[Bioc-devel] a day in the life of gwascat

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Thu Apr 30 12:59:44 CEST 2020


I'd look instead at or around line 35264 of the file for stray quote characters, e.g., "3' DNA", and pass quote = "" to read.delim() (though I never get that right, so this is probably wrong again...). A comment character might also be a problem.
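
A minimal sketch of that suggestion, not run against the real catalog file: the tiny TSV below is a hypothetical stand-in (column names and the planted stray quote are made up for illustration). It relies only on documented defaults: read.delim() uses quote = "\"" and comment.char = "", so a single quote as in "3' DNA" should be harmless on its own, while an unbalanced double quote would swallow the remaining lines.

```r
# Hypothetical stand-in for the catalog file: a header plus 10 tab-separated
# rows, with one unbalanced double quote planted in row 8.
tsv <- tempfile(fileext = ".tsv")
rows <- sprintf("%d\ttrait%d\tok", 1:10, 1:10)
rows[8] <- "8\ttrait8\t3' DNA \"unterminated"
writeLines(c("id\ttrait\tnote", rows), tsv)

# Default read.delim() has quote = "\""; collect warnings instead of printing.
msgs <- character()
bad <- withCallingHandlers(
    read.delim(tsv),
    warning = function(w) {
        msgs <<- c(msgs, conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)

# Same file with quote handling switched off. comment.char = "" is already
# the read.delim() default and is spelled out here only for emphasis.
good <- read.delim(tsv, quote = "", comment.char = "")

# Number of physical data lines, for comparison with the row counts.
n_lines <- length(readLines(tsv)) - 1L

print(msgs)
cat("default quote:", nrow(bad), "rows\n")
cat("quote = \"\":   ", nrow(good), "rows from", n_lines, "data lines\n")
```

If the row count with quote = "" matches the number of data lines while the default read stops short with a warning, an unbalanced quote near the last row read is the likely culprit.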

If you point to the location of the file I could investigate further...

Martin

On 4/30/20, 6:55 AM, "Bioc-devel on behalf of Vincent Carey" <bioc-devel-bounces using r-project.org on behalf of stvjc using channing.harvard.edu> wrote:

    The EBI GWAS catalog is large -- now the download is over 100MB for 179K
    associations.  A "bug" in the package was reported, so I acquired the file
    by hand.

    > nn = read.delim("gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv",
    +                 sep="\t")
    Warning message:
    In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
      EOF within quoted string
    > dim(nn)
    [1] 35264    38


    The "bug" is the number 35264 ...


    >
    [1]+  Stopped                 R
    %vjcair> wc gwas_cat*tsv
      179365 13243516 120140148 gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv
    %vjcair> vi gwas_cat*tsv
    %vjcair> fg
    R


    > tail(nn)
    Error: C stack usage  98161262 is too close to the limit

    Maybe my R needs to be updated.

    If I use data.table::fread to consume the tsv over HTTP, all seems well,
    and perhaps I will switch to that.


