[R] Mismatch distribution

Myriam Croze myriam.croze07 at gmail.com
Tue Jan 22 02:28:07 CET 2019


Hello!

I need your help. I am trying to calculate the pairwise differences between
sequences from several FASTA files.
For each of my DNA alignments (FASTA files), I would like to calculate the
pairwise differences and then:
- 1. Combine the data from all the files into a single data set and plot one
histogram (mismatch distribution)
- 2. Calculate the mean for each difference value across all the files and
again make a mismatch distribution plot
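
One possible reading of the two goals above, sketched in base R on toy data.
Everything here is an assumption made for illustration: the two small
alignments are invented, `pairwise_diffs` is a hypothetical helper, and goal 2
is interpreted as "average the relative frequency of each difference class
across files", which may not be what is ultimately wanted.

```r
# Toy alignments (invented data): character matrices, one row per sequence.
aln1 <- rbind(s1 = c("A", "C", "G", "T", "A"),
              s2 = c("A", "C", "G", "T", "T"),
              s3 = c("A", "G", "G", "T", "T"))
aln2 <- rbind(s1 = c("A", "C", "G", "T"),
              s2 = c("T", "C", "G", "A"),
              s3 = c("A", "C", "G", "T"))
alignments <- list(file1 = aln1, file2 = aln2)

# Hypothetical helper: number of differing sites for every pair of rows.
pairwise_diffs <- function(aln) {
  pairs <- combn(nrow(aln), 2)  # each column holds one pair of row indices
  apply(pairs, 2, function(p) sum(aln[p[1], ] != aln[p[2], ]))
}

diffs_per_file <- lapply(alignments, pairwise_diffs)

# Goal 1: pool the pairwise differences of all files into one vector.
pooled <- unlist(diffs_per_file, use.names = FALSE)

# Goal 2 (assumed interpretation): relative frequency of each difference
# class within each file, then the mean of those frequencies across files.
classes <- 0:max(pooled)
freq_per_file <- sapply(diffs_per_file, function(d) {
  as.numeric(table(factor(d, levels = classes))) / length(d)
})
mean_freq <- rowMeans(freq_per_file)
names(mean_freq) <- classes
```

`pooled` could then go to `hist()`, and `mean_freq` to `barplot()`, to draw
the two mismatch distributions.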

Here is the script that I wrote:

    library("pegas")
    library("seqinr")
    library("ggplot2")

    Files <- list.files(pattern="fas")
    nb_files <- length(Files)

    for (i in 1:nb_files) {
        Dist <- as.numeric(dist.gene(read.dna(Files[i], "fasta"),
                                     method = "pairwise",
                                     pairwise.deletion = FALSE,
                                     variance = FALSE))

        Data <- merge(Data, Dist, by=c("x"), all=TRUE)
    }

    hist(Data, prob=TRUE)
    lines(density(Data), col="blue", lwd=2)


However, the script does not work, and I do not know what to change to make
it work.
Thanks in advance for your help.

Myriam

-- 
Myriam Croze, PhD
Post-doctorante
Division of EcoScience,
Ewha Womans University
Seoul, South Korea

Email: myriam.croze07 using gmail.com
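
A few things in the loop of the script above would stop it from running:
`Data` is never created before the first call to `merge()`, `Dist` is a plain
numeric vector with no column named "x" to merge on, and `hist()` and
`density()` expect a numeric vector rather than a merged data frame. The
following is only a sketch of one way around this, not a tested solution for
the real data. It assumes the `ape` package is installed (`pegas` attaches it),
swaps `dist.gene()` for `dist.dna(model = "N")`, which counts differing sites
between each pair, and writes two invented FASTA files to a temporary directory
so that it runs on its own; with real data, `files` would point at the actual
alignments instead.

```r
library(ape)

# Invented example files so the sketch is self-contained.
aln_dir <- tempfile("alignments")
dir.create(aln_dir)
writeLines(c(">s1", "ACGTA", ">s2", "ACGTT", ">s3", "AGGTT"),
           file.path(aln_dir, "a.fas"))
writeLines(c(">s1", "ACGT", ">s2", "TCGA", ">s3", "ACGT"),
           file.path(aln_dir, "b.fas"))

files <- list.files(aln_dir, pattern = "\\.fas$", full.names = TRUE)

# One numeric vector of pairwise differences per file, kept in a list
# instead of being merged into a data frame.
dist_list <- lapply(files, function(f) {
  aln <- read.dna(f, format = "fasta")
  as.numeric(dist.dna(aln, model = "N", pairwise.deletion = FALSE))
})
names(dist_list) <- basename(files)

# Pool all files into a single vector for the combined distribution.
pooled <- unlist(dist_list, use.names = FALSE)

# Histogram with one bin per integer number of differences,
# overlaid with a kernel density estimate.
hist(pooled, breaks = seq(-0.5, max(pooled) + 0.5, by = 1), prob = TRUE,
     xlab = "Pairwise differences", main = "Mismatch distribution")
lines(density(pooled), col = "blue", lwd = 2)
```

Keeping the per-file vectors in `dist_list` leaves the door open for per-file
summaries later, since alignments with different numbers of sequences yield
vectors of different lengths that do not fit neatly into one data frame.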
