[R] Mismatch distribution
Myriam Croze
myriam.croze07 at gmail.com
Tue Jan 22 02:28:07 CET 2019
Hello!
I need your help. I am trying to calculate the pairwise differences between
sequences from several fasta files.
For each of my DNA alignments (fasta files), I would like to calculate the
pairwise differences and then:
- 1. Combine the data from all the files into one data set and plot one
histogram (mismatch distribution)
- 2. Calculate the mean for each difference across all the files and again
make a mismatch distribution plot
Here is the script that I wrote:
library("pegas")
library("seqinr")
library("ggplot2")

Files <- list.files(pattern="fas")
nb_files <- length(Files)

for (i in 1:nb_files) {
  Dist <- as.numeric(dist.gene(read.dna(Files[i], "fasta"),
                               method = "pairwise",
                               pairwise.deletion = FALSE, variance = FALSE))

  Data <- merge(Data, Dist, by=c("x"), all=TRUE)
}

hist(Data, prob=TRUE)
lines(density(Data), col="blue", lwd=2)
However, the script does not work, and I do not know what to change to make
it work.
Thanks in advance for your help.
Myriam
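
For reference, here is a minimal sketch of one way the two steps could be
approached. It is not a definitive fix. In the script above, `Data` is never
created before the loop, and `merge()` joins data frames on a key column,
while `Dist` is a plain numeric vector with no column "x". The sketch instead
collects the distances in a list. It swaps in `dist.dna(model = "N")` from ape
(the number of differing sites) in place of `dist.gene()`. The helper names
`pairwise_diffs`, `diff_freq` and `mean_mismatch` are hypothetical, and the
file pattern "\\.fas$" is an assumption about how the files are named.

```r
# Sketch only: assumes each fasta file holds one alignment of equal-length
# sequences and that the ape package is installed (pegas depends on it).

# Number of differing sites for every pair of sequences in one alignment.
# dist.dna(model = "N") is used here instead of dist.gene() (a substitution).
pairwise_diffs <- function(file) {
  aln <- ape::read.dna(file, format = "fasta")
  as.numeric(ape::dist.dna(aln, model = "N", pairwise.deletion = FALSE))
}

# Relative frequency of each difference class 0..max_diff in one vector.
diff_freq <- function(d, max_diff) {
  tabulate(d + 1, nbins = max_diff + 1) / length(d)
}

# Mean frequency of each difference class across files
# (per_file is a list with one numeric vector of differences per file).
mean_mismatch <- function(per_file) {
  max_diff <- max(unlist(per_file))
  freqs <- do.call(rbind, lapply(per_file, diff_freq, max_diff = max_diff))
  colMeans(freqs)  # one row per file, one column per difference class
}

files <- list.files(pattern = "\\.fas$")  # assumed naming of the fasta files
if (length(files) > 0) {
  per_file <- lapply(files, pairwise_diffs)

  # 1. Pool the differences from all files into one mismatch distribution.
  pooled <- unlist(per_file)
  hist(pooled, breaks = seq(-0.5, max(pooled) + 0.5, by = 1), freq = FALSE,
       xlab = "Pairwise differences", main = "Pooled mismatch distribution")
  if (length(pooled) > 1) lines(density(pooled), col = "blue", lwd = 2)

  # 2. Average the per-file frequency of each difference class.
  mean_freq <- mean_mismatch(per_file)
  barplot(mean_freq, names.arg = seq_along(mean_freq) - 1,
          xlab = "Pairwise differences", ylab = "Mean frequency",
          main = "Mean mismatch distribution")
}
```

Averaging relative frequencies rather than raw counts gives each file the same
weight regardless of how many sequences it contains. Whether that weighting is
the one wanted here is a judgment call.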
--
Myriam Croze, PhD
Post-doctorante
Division of EcoScience,
Ewha Womans University
Seoul, South Korea
Email: myriam.croze07 using gmail.com