[R] how to combine uncertainty and weighting in spearman correlation?

Abby Spurdle spurdle.a using gmail.com
Tue Jun 23 06:57:31 CEST 2020


Hi,

I suspect that there's a formula for this.
However, I couldn't find it.
So, here's (most of) the code to produce a simulated data set.

---------------------------
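sim.data <- function (xmean, ymean, xsd=NULL, ysd=NULL, w, ...,
    print=FALSE, plot=FALSE, nsub=1000)
{   N <- length (xmean)   # number of groups (original observations)

    # one column per group, nsub simulated values per column
    x <- y <- u <- matrix (0, nsub, N)
    for (k in seq_len (N) )
    {   # without an SD, the group mean is repeated unchanged
        if (is.null (xsd) ) x [,k] <- xmean [k]
        else x [,k] <- rnorm (nsub, xmean [k], xsd [k])
        if (is.null (ysd) ) y [,k] <- ymean [k]
        else y [,k] <- rnorm (nsub, ymean [k], ysd [k])
        # split the weight of group k evenly over its subsample
        u [,k] <- w [k] / nsub
    }
    # flatten column by column, so each group stays contiguous
    x <- as.vector (x)
    y <- as.vector (y)
    u <- as.vector (u)

    if (print)
        print (cbind (x, y, u) )
    if (plot)
    {   plot (x, y)
        points (xmean, ymean, pch=16, col="blue")
    }

    # unweighted Pearson correlation, u is not used here
    cor (x, y)
}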
---------------------------

I didn't install the wCorr package, so you'll need to change the
second to last line.
The weights are in the vector, u (not w).
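
As one possible replacement for the cor (x, y) line, here is a minimal
sketch of a weighted Spearman-type correlation in base R.
It takes ordinary (unweighted) ranks of x and y, then a weighted Pearson
correlation of those ranks using stats::cov.wt.
The name weighted.spearman is made up for this illustration.
This is a simplification: as far as I understand it, wCorr's
weightedCorr (x, y, method="Spearman", weights=u) also applies the
weights when ranking, so the two need not give identical values.

```r
# Hypothetical helper (not from the original post, and not part of wCorr):
# weighted Pearson correlation of the unweighted ranks of x and y.
weighted.spearman <- function (x, y, u)
{   stopifnot (length (x) == length (y), length (x) == length (u) )
    r <- cbind (rank (x), rank (y) )   # ties get average ranks
    # cov.wt returns a weighted covariance and, with cor=TRUE, a correlation
    cov.wt (r, wt = u / sum (u), cor=TRUE)$cor [1, 2]
}

# Sanity check on made-up data:
# with equal weights this should match the usual Spearman correlation
set.seed (1)
x <- rnorm (50)
y <- x + rnorm (50)
u <- rep (1, 50)
rho.w <- weighted.spearman (x, y, u)
rho.s <- cor (x, y, method="spearman")
```

Inside sim.data, the last line would then be something like
weighted.spearman (x, y, u).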

A subsample is generated for each group.
I've computed weights, such that the total weights for each group are
equal to the original weight for that group.
That sounds right, but I'm not completely sure.
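
The claim about the group totals can be checked directly.
Below is a small sketch that rebuilds u the same way sim.data does and
sums it per group.
The values of w and nsub are invented for illustration only.

```r
# Illustrative values only (not the poster's data)
w <- c (2, 5, 3)   # one original weight per group
nsub <- 1000       # subsample size per group
N <- length (w)

# Same construction as in sim.data:
# column k holds nsub copies of w [k] / nsub
u <- matrix (0, nsub, N)
for (k in seq_len (N) )
    u [,k] <- w [k] / nsub

# total weight carried by each group, before flattening
group.totals <- colSums (u)

# after as.vector (u), group k occupies a contiguous block of length nsub
group <- rep (seq_len (N), each=nsub)
flat.totals <- as.vector (tapply (as.vector (u), group, sum) )

# both comparisons should hold up to floating point error
ok <- isTRUE (all.equal (group.totals, w) ) &&
    isTRUE (all.equal (flat.totals, w) )
```

So the per-group totals match w, and the grand total of u equals sum (w).
Whether that is the right weighting for the correlation is a separate
question.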

Then call it using something like:
sim.data (x1, y1, x1_SD, NULL, corr_weight)

You can remove the print/plot parts if you want.
But if you call it with print=TRUE, then set nsub to a smaller value.
sim.data (x1, y1, x1_SD, NULL, corr_weight, print=TRUE, plot=TRUE, nsub=10)

If there are any problems, let me know.

On Mon, Jun 22, 2020 at 7:55 PM Frederik Feys <frefeys using gmail.com> wrote:
>
> Thanks Abby, some info on the data:
>
> score   score_SD   death_count   population_size
> x1      x1_SD      y1            corr_weight
> 4.3     2.3        5800          900.000
> 5.7     6.1        250           11.000.600
> ..      ..         ..            ..
>
> > On 22 Jun 2020, at 02:02, Abby Spurdle <spurdle.a using gmail.com> wrote:
> >
> > I need to fix my mistakes, from earlier this morning.
> > The sums should be over densities, so:
> >
> > fh (X, Y) = [fh1 (X1, Y1) + fh2 (X2, Y2) + ... + fhn (Xn, Yn)] / n
> >
> > fh (X, Y) = w1*fh1 (X1, Y1) + w2*fh2 (X2, Y2) + ... + wn*fhn (Xn, Yn)
> >
> >    assuming the weights sum to 1
> >
> > If simulated data is used, then the expressions above can be replaced
> > with the union of multiple (sub)samples.
> > Then an estimate/inference (say correlation) can be computed from one
> > or more combined samples.
> >
> > Sorry for triple posting.
> >
> >
> > On Mon, Jun 22, 2020 at 10:00 AM Abby Spurdle <spurdle.a using gmail.com> wrote:
> >>
> >> Hi Frederik,
> >>
> >> I glanced at the webpage you've linked.
> >> (But only the top three snippets).
> >>
> >> This is what I would call the sum of random variables.
> >> (X, Y) = (X1, Y1) + (X2, Y2) + ... + (Xn, Yn)
> >>
> >> The example makes the mistake of assuming that the Xs are normally
> >> distributed, and each of the Ys is from exactly the same uniform
> >> distribution.
> >> By "combine"-ing both approaches, are you wanting to weight each pair?
> >>
> >> w1(X1, Y1) + w2(X2, Y2) + ... + wn(Xn, Yn)
> >>
> >> I note that you haven't told us much about your data.
> >> There may be an easier way of doing things...
> >>
> >>
> >> On Mon, Jun 22, 2020 at 1:53 AM Frederik Feys <frefeys using gmail.com> wrote:
> >>>
> >>> Hello everyone
> >>>
> >>> At the moment I am paying a lot of attention to the uncertainty in my analyses. I want to compute a Spearman correlation that takes into account the uncertainty in my observations and also uses weights.
> >>>
> >>> uncertainty of observations: I came across this excellent blog that proposes a bootstrap function: https://www.r-bloggers.com/finding-correlations-in-data-with-uncertainty/
> >>>
> >>> weighted: I do weighted correlations with the wCorr package.
> >>>
> >>> Now I want to combine both approaches in one approach for a final analysis. How would you do that?
> >>>
> >>> Thanks for the help!
> >>>
> >>> Frederik Feys
> >>> PhD Medical Sciences
> >>> Independent Methodologist
> >>> https://www.researchgate.net/profile/Frederik_Feys
> >>> +32488020010
> >>>