[Bioc-devel] a day in the life of gwascat
Vincent Carey
stvjc using channing.harvard.edu
Thu Apr 30 13:15:29 CEST 2020
Right, line 35265 of
http://www.ebi.ac.uk/gwas/api/search/downloads/alternative has an unclosed
quote in a field:
35265 2019-04-10 30804558 Grove J 2019-02-25 Nat Genet
www.ncbi.nlm.nih.gov/pubmed/30804558 Identification of common
genetic risk variants for autism spectrum disorder. Autism spectrum
disorder 18,381 European ancestry cases, 27,969 European
ancestry controls 2,119 European ancestry cases, 142,379 European
ancestry controls Intergenic
chr11:102751102"-?
chr11:102751102
0 1 0.037 8E-6 5.096910013008056
1.1641443 [NR] Illumina [9112387] (imputed) N autism
spectrum disorder http://www.ebi.ac.uk/efo/EFO_0003756
GCST007556 Genome-wide genotyping array
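
For the archive, here is a minimal sketch of the quote = "" approach Martin suggests below, on a made-up three-record file. The column names and rs IDs are invented for illustration; only the chr11 field mimics the catalog record above. I have not run this against the full catalog download, so treat it as a sketch under those assumptions.

```r
# Hypothetical miniature TSV: the first record carries a stray, unclosed
# double quote, like the chr11 field in the catalog line shown above.
tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "snp\tregion\tpvalue",
  "rs1\tchr11:102751102\"-?\t8E-6",
  "rs2\tchr1:1000\t1E-8",
  "rs3\tchr2:2000\t5E-9"
), tsv)

# Default read.delim uses quote = "\"", so the stray quote opens a quoted
# string that runs to end of file; this typically yields too few rows and
# the "EOF within quoted string" warning (suppressed here).
bad <- suppressWarnings(read.delim(tsv, sep = "\t"))

# quote = "" turns quote processing off, so the quote is kept as ordinary
# text; comment.char = "" (already the read.delim default) is spelled out.
good <- read.delim(tsv, sep = "\t", quote = "", comment.char = "",
                   stringsAsFactors = FALSE)

# Sanity check in the spirit of comparing dim() with wc: the number of
# records should equal the number of lines minus the header.
n_lines <- length(readLines(tsv))
stopifnot(nrow(good) == n_lines - 1L)
```

The same comparison of row count against line count is what exposed the problem in the real file (35264 rows read versus 179365 lines).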
On Thu, Apr 30, 2020 at 6:59 AM Martin Morgan <mtmorgan.bioc using gmail.com>
wrote:
> I'd look instead at or around line 35264 for use of quotes, e.g., "3'
> DNA", and change the argument read.delim(quote = "") (though I never get
> that right so probably wrong again...). A comment character might also be a
> problem.
>
> If you point to the location of the file I could investigate further...
>
> Martin
>
> On 4/30/20, 6:55 AM, "Bioc-devel on behalf of Vincent Carey" <
> bioc-devel-bounces using r-project.org on behalf of stvjc using channing.harvard.edu>
> wrote:
>
> The EBI GWAS catalog is large -- now the download is over 100MB for 179K
> associations. A "bug" in the package was reported, so I acquired the file
> by hand.
>
> > nn =
> read.delim("gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv",
> sep="\t")
>
> Warning message:
> In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
>   EOF within quoted string
>
> > dim(nn)
>
> [1] 35264 38
>
>
> The "bug" is the number 35264 ...
>
>
> >
>
> [1]+ Stopped R
>
> %vjcair> wc gwas_cat*tsv
>
> 179365 13243516 120140148
> gwas_catalog_v1.0.2-associations_e98_r2020-03-08.tsv
>
> %vjcair> vi gwas_cat*tsv
>
> %vjcair> fg
>
> R
>
>
> > tail(nn)
>
> Error: C stack usage 98161262 is too close to the limit
>
>
> Maybe my R needs to be updated.
>
> If I use data.table::fread to consume the tsv over HTTP all seems well,
> and perhaps I will switch to that.
>
> --
> The information in this e-mail is intended only for the
> ...{{dropped:18}}
>
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>