# [R] Fitting a distribution to peaks in histogram

Petr Pikal petr.pikal at precheza.cz
Wed Jul 19 14:54:51 CEST 2006

```Hi

There are some packages for mass spectra processing (spectrino,
caMassClass). I have not used them, so I do not know how well they suit
your needs.

However, you can compute the area (integrate) with these functions:

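# Works interactively on a plot of (x, y).
# First it replots the data between corners - replot(x, y) -
# then it computes the area between the x axis and the y values (osum)
# and the area between a "baseline" and the y values (cista), with the
# limits taken from two locator() clicks.
# Note: replot() is my own helper and is not included in this message.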

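integ <- function(x, y)
{
  replot(x, y)
  meze <- locator(2)              # click the lower and upper limits
  dm <- meze$x[1]
  hm <- meze$x[2]
  abline(v = c(dm, hm), col = 2)
  vyber <- x <= hm & x >= dm      # points inside the limits
  f3 <- splinefun(x, y)
  osum <- integrate(f3, dm, hm)$value
  # trapezoid under the line joining the first and last selected points
  o1 <- (y[x == min(x[vyber])] + y[x == max(x[vyber])]) *
    (max(x[vyber]) - min(x[vyber])) / 2
  cista <- osum - o1
  return(c(osum, cista))
}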

# Similar to integ, but you have to supply the lower and upper limits
# (dm, hm) manually if you do not want to integrate the whole area
# under the curve.

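integ1 <- function(x, y, dm = -Inf, hm = +Inf)
{
  if (dm == -Inf) dm <- min(x)    # default: start at the smallest x
  if (hm == +Inf) hm <- max(x)    # default: end at the largest x
  vyber <- x <= hm & x >= dm      # points inside the limits
  f3 <- splinefun(x, y)
  osum <- integrate(f3, dm, hm)$value
  # trapezoid under the line joining the first and last selected points
  o1 <- (y[x == min(x[vyber])] + y[x == max(x[vyber])]) *
    (max(x[vyber]) - min(x[vyber])) / 2
  cista <- osum - o1
  return(c(osum, cista))
}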
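
# A minimal non-interactive sketch of the same calculation, written out
# inline on synthetic data so it can be pasted and run on its own.
# The peak shape, the baseline and the limits are made-up assumptions
# for illustration only, not values from the original question.

x <- seq(0, 10, by = 0.05)
# Gaussian peak of area 3 centred at 5 (sd 0.5) on a flat baseline of 2
y <- 2 + 3 * dnorm(x, mean = 5, sd = 0.5)
dm <- 2                                  # lower limit (assumed)
hm <- 8                                  # upper limit (assumed)
f3 <- splinefun(x, y)                    # spline through the points
osum <- integrate(f3, dm, hm)$value      # area down to the x axis
# trapezoid under the straight line joining the curve at dm and hm
o1 <- (f3(dm) + f3(hm)) * (hm - dm) / 2
cista <- osum - o1                       # baseline-corrected area
c(osum, cista)                           # roughly 15 and 3

# Here the baseline is taken from the spline at the limits, f3(dm) and
# f3(hm), rather than from the nearest data points as in the functions
# above; for limits that fall on data points the two agree.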

On 19 Jul 2006 at 11:58, Ulrik Stervbo wrote:

Date sent:  Wed, 19 Jul 2006 11:58:38 +0200
From:       "Ulrik Stervbo" <ulriks at ruc.dk>
To:         r-help at stat.math.ethz.ch
Subject:    [R] Fitting a distribution to peaks in histogram

> Hello list!
>
> I would like to fit a distribution to each of the peaks in a
> histogram, such as this:
> http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.png .
>
> The peaks are identified using Petr Pikal's peaks function (
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html), but after
> that I am quite stuck.
>
> Any idea as to how I can:
> - Fit a distribution to each peak
> - Integrate the area between each two peaks, using the means and widths
>   of the distributions fitted to the two peaks. I will be using the
>   integrate function
>
> The histogram is based on approximately 15000 events, which makes
> Mclust and pam (which both deliver the information I need) less
> useful.
>
> The whole point of this exercise is to find the percentage of cells in
> peak 1, 2, 3, and so on, and between peak 1-2, peak 2-3, peak 3-4 and
> so on. Having more than 6 peaks does not appear likely.
>
> I am quite new to R and apologise if the solution is fairly basic.
>
> Thank you in advance for any help and suggestions
>
> Sincerely,
> Ulrik