[Bioc-devel] a day in the life of gwascat
Martin Morgan
mtmorg@n@b|oc @end|ng |rom gm@||@com
Thu Apr 30 12:59:44 CEST 2020
I'd look instead at or around line 35264 for use of quotes, e.g., "3' DNA", and turn off quote handling with the quote argument, read.delim(..., quote = "") (though I never get that right so probably wrong again...). A comment character might also be a problem.
If you point to the location of the file I could investigate further...
Martin
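A minimal sketch of the suggestion above, using a made-up five-line TSV rather than the real catalog file (the file contents, the column names, and the variable names tsv, odd_quote_lines, bad and good are all hypothetical). It assumes the culprit is an unbalanced double quote, since read.delim's default is quote = "\""; the actual offending character in the GWAS file has not been confirmed here.

```r
# Toy TSV with one unbalanced double quote (hypothetical data,
# not taken from the GWAS catalog download).
tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "id\ttrait",
  "1\theight",
  "2\t5\" deletion",
  "3\tweight",
  "4\tBMI"
), tsv)

# Diagnostic: find lines holding an odd number of double quotes.
lines <- readLines(tsv)
n_quotes <- lengths(regmatches(lines, gregexpr("\"", lines, fixed = TRUE)))
odd_quote_lines <- which(n_quotes %% 2 == 1)
odd_quote_lines

# Default quoting: the stray quote opens a quoted field that can
# swallow the lines after it, so rows may go missing (with the
# "EOF within quoted string" warning, suppressed here).
bad <- suppressWarnings(read.delim(tsv, sep = "\t"))

# Quoting (and comment handling) disabled: one record per line.
good <- read.delim(tsv, sep = "\t", quote = "", comment.char = "")

nrow(bad)
nrow(good)
```

On the real file the same idea would be to compare nrow() against the line count from wc, and to inspect the lines flagged by the odd-quote check near the point where the rows stop.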
On 4/30/20, 6:55 AM, "Bioc-devel on behalf of Vincent Carey" <bioc-devel-bounces using r-project.org on behalf of stvjc using channing.harvard.edu> wrote:
The EBI GWAS catalog is large -- now the download is over 100MB for 179K associations.
A "bug" in the package was reported, so I acquired the file by hand.
> nn = read.delim("gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv",
+                 sep="\t")
Warning message:
In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  EOF within quoted string
> dim(nn)
[1] 35264 38
The "bug" is the number 35264 ...
>
[1]+ Stopped R
%vjcair> wc gwas_cat*tsv
179365 13243516 120140148 gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv
%vjcair> vi gwas_cat*tsv
%vjcair> fg
R
> tail(nn)
Error: C stack usage 98161262 is too close to the limit
Maybe my R needs to be updated.
If I use data.table::fread to consume the tsv over HTTP all seems well, and perhaps I will switch to that.