[R-sig-ME] Regression analysis with small but complete dataset (fully representing reality)?

sree datta @reedt@8 @end|ng |rom gm@||@com
Sat Dec 26 07:27:05 CET 2020


Hi Diana

In addition to using descriptive statistics, I would also recommend
Partial Least Squares (PLS) regression, which was designed for data with few
observations and many predictors. Your dependent variable can be continuous,
binary or multinomial in PLS. I have used PLS regression successfully in the
medical/healthcare arena for rare and orphan disease analyses, where the
affected population is very small and data from 30 patients can represent
anywhere from 25% to 60% of the overall population.
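
To make the idea concrete, here is a minimal sketch of single-response PLS
(PLS1) in plain Python. It is only an illustration under stated assumptions,
not a substitute for an established package: the function names pls1_fit and
pls1_predict are hypothetical, the toy numbers are made up, and it handles a
continuous response only (binary or multinomial outcomes need a
discriminant-analysis variant that is not shown).

```python
from math import sqrt


def pls1_fit(X, y, n_components=1):
    """Fit PLS1 (one continuous response) by sequential deflation of X."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre predictors and response.
    Xk = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space that has maximal
        # covariance with the response.
        w = [sum(Xk[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in X that covaries with y
            break
        w = [v / norm for v in w]
        # Scores: each case projected onto the weight vector.
        t = [sum(Xk[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings (regress X on t) and response coefficient (regress y on t).
        p_load = [sum(Xk[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate X so the next component captures new information.
        Xk = [[Xk[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        components.append((w, p_load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new case x (a list of p values)."""
    x_mean, y_mean, components = model
    xk = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(a * b for a, b in zip(xk, w))
        y_hat += q * t
        xk = [a - t * b for a, b in zip(xk, p_load)]
    return y_hat


# Made-up toy data: 4 cases, 2 predictors, response = 1 + 2*x1 + 3*x2.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y_toy = [3.0, 4.0, 6.0, 8.0]
model = pls1_fit(X_toy, y_toy, n_components=2)
print(pls1_predict(model, [3.0, 2.0]))
```

With as many components as there are independent predictors, this reproduces
the ordinary least-squares fit; the benefit for small samples comes from
keeping fewer components than predictors.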

I strongly recommend this excellent resource by Gaston Sanchez (a detailed
235-page PDF on PLS path modeling, a close relative of PLS regression),
available on his website:
https://www.gastonsanchez.com/PLS_Path_Modeling_with_R.pdf

Hope this helps. If you have any questions or need additional information,
please get back to me and I can help you work out whether PLS regression
would be relevant and helpful for you.

Sree




On Fri, Dec 25, 2020 at 12:08 PM Patrick (Malone Quantitative) <
malone using malonequantitative.com> wrote:

> Diana,
>
> cc'ing the list again in case anyone else has input
>
> I was asking whether the missingness was structural, for example hours per
> shift for someone who is unemployed at the time of measurement. In that
> scenario, you could have missing "values" but still completely observed
> *data*.
>
> Normally, I would assume that questions about missing data refer to
> incomplete observation, but you clearly have a special situation, which is
> why I asked.
>
> If your population data is completely observed, again, you don't need
> inferential statistics.
>
> If not, you do indeed have a sample of the data, not the population, even
> though you have most of it. I believe there are corrections that need to be
> made to inferential statistics for small populations. I don't have
> experience with that, but that might get you started.
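
One widely used correction of this kind is the finite population correction
(FPC), which shrinks the standard error as the observed share of a finite
population grows. A minimal sketch follows; the function names and the toy
values are hypothetical, and it assumes the observed cases behave like a
simple random sample drawn without replacement from the N existing cases,
which may not hold if the missingness is systematic.

```python
from math import sqrt
from statistics import stdev


def fpc(n, N):
    """Finite population correction factor for n observed out of N units."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    if N == 1:
        return 0.0
    return sqrt((N - n) / (N - 1))


def se_mean_finite(values, N):
    """Standard error of the mean, corrected for a finite population of size N."""
    n = len(values)
    return stdev(values) / sqrt(n) * fpc(n, N)


# Made-up example: shift lengths (hours) for 12 observed cases out of 16.
shift_hours = [3, 4, 5, 4, 3, 5, 4, 4, 3, 5, 4, 4]
print(se_mean_finite(shift_hours, N=16))
```

When n equals N the factor is zero, so the standard error vanishes, which is
the same point as above: with the whole population observed there is nothing
left to infer.
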
>
> Pat
>
> On Fri, Dec 25, 2020 at 9:55 AM Diana Michl <dianamichl using aikq.de> wrote:
>
> > Hi Pat,
> >
> > thanks very much for your help! Helps me see things a bit more clearly.
> > Well, the present values aren't the only ones that could exist. There are
> > questions like "How long is your shift", which could be 3, 4, or 5 hours;
> > "How many shifts per week do you have", which could be between 1 and 7; or
> > "How many callers do you have per semester", which could be - in theory -
> > between 0 and thousands. Of course, there's only one response to every
> > question that's actually true.
> > (Maybe I'm misunderstanding your question, though, because you probably
> > didn't mean whether there could be only one possible response to every
> > question, right?)
> >
> > Diana
> >
> >
> > On 24.12.2020 at 17:22, Patrick (Malone Quantitative) wrote:
> >
> > Diana,
> >
> > It depends on the nature of the missingness. Are the present values the
> > only ones that could exist? If so, you have the entire population's data,
> > and descriptive statistics are in fact preferable to inferential ones.
> > There's no need to run inferential statistics if you have the
> > population: they are by definition for inferring population values from
> > a sample.
> >
> > Pat
> >
> > On Thu, Dec 24, 2020 at 6:21 AM Diana Michl <dianamichl using aikq.de> wrote:
> >
> >> I have a repeated measures design with about 16 cases and 5-6
> >> measurement points. Sometimes, 1-4 full cases or some measurement points
> >> are missing. (The measures are 20 numerical and categorical variables
> >> taken from questionnaires.)
> >>
> >> The catch is: It's a small dataset with holes in it, but the 16 cases are
> >> all that even exist. So they fully represent reality wherever they're
> >> complete.
> >>
> >> I wanted to run logistic regressions with up to 6 predictors. But can I
> >> do that? I know about the many problems such small datasets have for
> >> regression analysis - but do they matter as much if there aren't any
> >> more cases in reality?
> >> Are descriptive analyses the only ones I can use?
> >>
> >> Many thanks
> >>
> >> --
> >> Dr. Diana Michl
> >> #www.diana-michl.de
> >>
> >> #Film: Der unberührte Garten - eine ungewöhnliche Geschichte übers
> >> Erwachsenwerden (www.vimeo.com/148014360)
> >>
> >> #Musik: Singer-Songwriter (www.youtube.com/user/ghiaghiafy)
> >>
> >>
> >>
> >> _______________________________________________
> >> R-sig-mixed-models using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>
> >
> >
> > --
> > Patrick S. Malone, Ph.D., Malone Quantitative
> > NEW Service Models: http://malonequantitative.com
> >
> > He/Him/His
> >
> > --
> > Dr. Diana Michl
> > Kastanienallee 4
> > 14471 Potsdam
> > Tel: 0331 – 27 34 15 10
> > 01577 – 3065650
> > dianamichl using aikq.de
> >
>
