[R-sig-genetics] Reading large VCF into genind

Vikram Chhatre crypticlineage at gmail.com
Fri May 29 16:32:52 CEST 2015


Hi Stefano,

That's certainly a large data set, so you are bound to run into memory
issues.  That said, I have had luck reading a data set of about 150K SNP
calls for more than 500 individuals into a genind object from a Structure
file.  It does take a long time, though (close to 18 hours or so).

PGDSpider will convert VCF to Structure format for a data set of any size.
Its speed depends on how much memory you allow it.

How much RAM do you have?
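
For a rough sense of scale, here is a back-of-envelope sketch in Python (not
R, and not a measurement) of how big a dense individuals-by-alleles table
would be for the two data sets mentioned in this thread.  The function names
are hypothetical, and the defaults of 2 alleles per locus and 8 bytes per
cell are assumptions, not figures taken from adegenet; set bytes_per_cell=4
if the table is stored as integers.

```python
def dense_table_bytes(n_individuals, n_loci, alleles_per_locus=2, bytes_per_cell=8):
    """Bytes needed for one dense individuals x alleles table.

    alleles_per_locus and bytes_per_cell are assumptions for illustration:
    biallelic SNPs, one 8-byte number per cell.
    """
    n_columns = n_loci * alleles_per_locus  # one column per allele
    return n_individuals * n_columns * bytes_per_cell


def to_mib(n_bytes):
    """Convert a byte count to mebibytes (2**20 bytes)."""
    return n_bytes / 2**20


if __name__ == "__main__":
    # The two data sets described in this thread.
    for label, n_ind, n_loci in [
        ("18 individuals, 306596 SNPs", 18, 306596),
        ("500 individuals, 150000 SNPs", 500, 150000),
    ]:
        size = dense_table_bytes(n_ind, n_loci)
        print(f"{label}: about {to_mib(size):.0f} MiB for the table alone")
```

This counts only the final table.  Memory use during the conversion itself
is likely to be higher, because intermediate copies are made along the way,
so treat the result as a lower bound rather than a requirement.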

V

On Fri, May 29, 2015 at 10:07 AM, Stefano Iantorno <si3 at sanger.ac.uk> wrote:

> Hello
>
> I have a VCF file containing 306596 biallelic SNP calls from 18
> individuals. I was able to read the entire file as a “loci” object but when
> I try to convert it into a genind object I run into memory issues.
>
> How can I circumvent this problem? Should I break the single “loci” object
> into smaller subsets with only 50 000 SNPs or so, and convert each one into
> a genind object separately with loci2genind? How do I combine the resulting
> genind objects?
>
> - Stefano
