[R-sig-genetics] Reading large VCF into genind

Jombart, Thibaut t.jombart at imperial.ac.uk
Fri May 29 16:30:59 CEST 2015


Dear Stefano, 

The genind class was not designed for such a large number of SNPs. Substantial improvements have been made in adegenet 2.0:
https://github.com/thibautjombart/adegenet

For this number of loci, though, you may want to retain only biallelic SNPs and use genlight objects instead. These are documented in the 'adegenet-genomics' tutorial. That tutorial still needs to be updated for adegenet 2.0, but most of it should remain relevant, as the genlight class has not changed much.
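
As a rough illustration of the genlight route, here is a minimal sketch on simulated data. It assumes the genotypes are already available as a matrix of allele counts (0, 1 or 2), with individuals in rows and SNPs in columns; how that matrix is obtained from the VCF is not shown. The object names (geno, x) and the toy dimensions are made up for the example.

```r
library(adegenet)

# Toy genotype matrix (simulated, not real data): 18 individuals (rows)
# by 1000 biallelic SNPs (columns), coded as the number of copies of
# the second allele (0, 1 or 2).
set.seed(1)
n_ind <- 18
n_snp <- 1000
geno <- matrix(sample(0:2, n_ind * n_snp, replace = TRUE),
               nrow = n_ind, ncol = n_snp)
rownames(geno) <- paste0("ind", seq_len(n_ind))

# Build the genlight object; genotypes are stored in a compact binary form.
# parallel = FALSE keeps the example independent of multicore support.
x <- new("genlight", geno, parallel = FALSE)

nInd(x)  # number of individuals
nLoc(x)  # number of SNPs

# Compare memory footprints of the plain matrix and the genlight object
object.size(geno)
object.size(x)
```

The same call should work unchanged with the full set of SNPs, provided the genotype matrix itself fits in memory.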

Best
Thibaut

==============================
Dr Thibaut Jombart
MRC Centre for Outbreak Analysis and Modelling
Department of Infectious Disease Epidemiology
Imperial College - School of Public Health
Norfolk Place, London W2 1PG, UK
Tel. : 0044 (0)20 7594 3658
http://sites.google.com/site/thibautjombart/
http://sites.google.com/site/therepiproject/
http://adegenet.r-forge.r-project.org/
Twitter: @thibautjombart



________________________________________
From: R-sig-genetics [r-sig-genetics-bounces at r-project.org] on behalf of Stefano Iantorno [si3 at sanger.ac.uk]
Sent: 29 May 2015 15:07
To: r-sig-genetics at r-project.org
Subject: [R-sig-genetics] Reading large VCF into genind

Hello

I have a VCF file containing 306596 biallelic SNP calls from 18 individuals. I was able to read the entire file as a 'loci' object but when I try to convert it into a genind object I run into memory issues.

How can I circumvent this problem? Should I break the single 'loci' object into smaller subsets with only 50 000 SNPs or so, and convert each one into a genind object separately with loci2genind? How do I combine the resulting genind objects?

- Stefano








