[R] geom_smooth with sd

Sun Aug 11 18:40:43 CEST 2024

Thanks Erin

That is quite relevant. Yes, the +sd and -sd offsets are now the same. However, they are about +/- 5 rather than the values returned by the simple statistics below. I still think this is because the length of y differs.
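One possible reason the band comes out near +/- 5: as far as I understand stat_smooth(), `se` is the standard error of the fitted curve at each grid point, and `y` is the vector of fitted values on the prediction grid, so se * sqrt(length(y)) need not equal sd(QI_A). Below is a minimal base R sketch on simulated data. The names x, y, fit, grid and band are made up for illustration, and the grid size of 80 is an assumption meant to mirror the stat_smooth() default.

```r
# Simulated data; x, y, fit and grid are illustrative names, not MS1/MS2020.
set.seed(1)
n <- 200
x <- runif(n, 0, 5)                      # e.g. years since the first survey
y <- 20 + 2 * x + rnorm(n, sd = 15)
y[sample(n, 20)] <- NA                   # 20 missing responses

fit  <- lm(y ~ x + I(x^2))               # lm() drops incomplete rows by default
grid <- data.frame(x = seq(min(x), max(x), length.out = 80))
pred <- predict(fit, newdata = grid, se.fit = TRUE)

nobs(fit)          # observations used in the fit: 180
length(pred$fit)   # points on the prediction grid: 80

# Standard error of the fitted mean, rescaled by the grid length:
band <- pred$se.fit * sqrt(length(pred$fit))
range(band)

# Spread of the raw data, for comparison:
sd(y, na.rm = TRUE)   # overall sd of the response
sigma(fit)            # residual standard deviation of the model
```

In this sketch the number of grid points is unrelated to the number of non-missing observations, and se.fit varies along x, so the rescaled band is not a constant sd.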

Simple statistics

> mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
[1] 26.81225
> sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
[1] 21.12419
> mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
[1] 15.86196
> sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
[1] 15.00405

Kind regards

Sibylle 

From: Erin Hodgess <erinm.hodgess using gmail.com> 
Sent: Sunday, August 11, 2024 6:30 PM
To: sibylle.stoeckli using gmx.ch
Cc: R-help using r-project.org
Subject: Re: [R] geom_smooth with sd

Hi!

This is probably completely off base, but your ymin and ymax setup lines are different. One uses sqrt(y), while the other uses sqrt(length(y)).

Could that play a part, please?
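A minimal sketch of what a symmetric version might look like, assuming ggplot2 is installed. Here mtcars stands in for MS1, and the variables wt, mpg and am are illustrative only. This addresses only the asymmetry between the two mappings; it does not by itself make the band equal to the sd.

```r
library(ggplot2)

# mtcars stands in for MS1 here; the variables are illustrative only.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(am))) +
  geom_smooth(
    aes(fill = factor(am),
        ymax = after_stat(y + se * sqrt(length(y))),
        ymin = after_stat(y - se * sqrt(length(y)))),
    method = "lm", formula = y ~ x + I(x^2)
  )

ld <- layer_data(p)        # the values the smooth layer actually draws
upper <- ld$ymax - ld$y    # distance from the fitted line to the upper edge
lower <- ld$y - ld$ymin    # distance from the fitted line to the lower edge
all.equal(upper, lower)    # should be TRUE once both mappings use length(y)
```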

Thank you

Erin Hodgess, PhD

erinm.hodgess using gmail.com

On Sun, Aug 11, 2024 at 10:10 AM SIBYLLE STÖCKLI via R-help <r-help using r-project.org> wrote:

Dear community

Using after_stat() I was able to draw a ggplot smooth with a band based on
standard deviations instead of the confidence interval shown in the R help.

p1 <- ggplot(data = MS1, aes(x = Jahr, y = QI_A, color = Bio, linetype = Bio)) +
  geom_smooth(aes(fill = Bio,
                  ymax = after_stat(y + se * sqrt(length(y))),
                  ymin = after_stat(y - se * sqrt(y))),
              method = "lm", formula = y ~ x + I(x^2), linewidth = 1) +
  theme(panel.background = element_blank()) +
  theme(axis.line = element_line(colour = "black")) +
  theme(axis.text = element_text(size = 18)) +
  theme(axis.title = element_text(size = 20)) +
  ylab("Anteil BFF an LN [%]") + xlab("Jahr") +
  scale_color_manual(values = c("red", "darkgreen"), labels = c("ÖLN", "BIO")) +
  scale_fill_manual(values = c("red", "darkgreen"), labels = c("ÖLN", "BIO")) +
  theme(legend.title = element_blank()) +
  theme(legend.text = element_text(size = 20)) +
  scale_linetype_manual(values = c("dashed", "solid"), labels = c("ÖLN", "BIO"))

p1 <- p1 + expand_limits(y = c(0, 30))

When comparing the plots to the simple statistics, the standard deviations do
not match. I assume this is because na.rm=TRUE does not correspond to
length(y) in the after_stat code. However, I was not able to adapt the code
to handle the NA values.
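If the goal is a band of mean +/- sd per year and group, one alternative to rescaling se inside after_stat() is to compute the summaries first, which also makes the NA handling explicit. Note that this swaps the quadratic lm smooth for per-year means. The sketch below runs on simulated data: dat and summ are made-up names, the simulated values are not the MS1 data, and the plotting part assumes ggplot2 is installed.

```r
# Hypothetical stand-in for MS1: columns Jahr, Bio, QI_A with some NA values.
set.seed(42)
dat <- data.frame(
  Jahr = rep(2015:2020, each = 40),
  Bio  = factor(rep(c("0", "1"), times = 120))
)
dat$QI_A <- ifelse(dat$Bio == "1", 27, 16) + rnorm(nrow(dat), sd = 15)
dat$QI_A[sample(nrow(dat), 25)] <- NA

# Mean, sd and number of non-missing values per year and group.
# The formula interface of aggregate() drops NA rows before calling FUN.
summ <- aggregate(
  QI_A ~ Jahr + Bio, data = dat,
  FUN = function(v) c(mean = mean(v), sd = sd(v), n = length(v))
)
summ <- do.call(data.frame, summ)    # flatten the matrix column
names(summ) <- c("Jahr", "Bio", "mean", "sd", "n")

# Optional plot: ribbon of mean +/- sd around the per-year means.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(summ, aes(x = Jahr, y = mean, colour = Bio, fill = Bio)) +
    geom_ribbon(aes(ymin = mean - sd, ymax = mean + sd),
                alpha = 0.2, colour = NA) +
    geom_line(linewidth = 1)
}
```

The n column holds the count of non-missing values per cell, which is the count that sd(..., na.rm=TRUE) works with.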

Simple statistics

> mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
[1] 26.81225
> sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
[1] 21.12419
> mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
[1] 15.86196
> sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
[1] 15.00405

Kind regards

Sibylle

        [[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
