[R] Distribution to use to calculate p values

Jim Lemon drjimlemon at gmail.com
Tue Apr 28 07:25:01 CEST 2015


Hi Lalitha,
If you want to find a reasonable model distribution for your data, try
plotting a histogram of the variable you want to predict and comparing
it with the density curves of the distributions that you think will
fit. For example:

# plot a histogram of evenly spread (uniform-like) values
hist(seq(1,10,length.out=100))
# overlay a normal density function with the same mean
# (the factor 30 only rescales the density to the histogram's count axis)
lines(seq(1,10,length.out=91),dnorm(seq(1,10,by=0.1),mean=5.5)*30)

Not a very good fit, but:

hist(rnorm(100,5.5))
lines(seq(1,10,length.out=91),dnorm(seq(1,10,by=0.1),mean=5.5)*90)

Much better. You can then perform a "goodness of fit" test if you need
it to justify your choice of distribution. In most cases, you will
have to choose a "family" (an error distribution together with its
link function) to use in a generalized linear model (glm).
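
As a rough sketch of those two steps, the lines below run two
goodness-of-fit tests and then fit a glm with an explicit family.
Everything here is an assumption for illustration: the simulated
vectors, the seed, the names (x_unif, x_norm, w, y, fit) and the
choice of a Gamma family with a log link. None of it comes from
your data.

```r
set.seed(42)                            # arbitrary seed, for reproducibility
x_norm <- rnorm(100, mean = 5.5)        # a sample that really is normal
x_unif <- seq(1, 10, length.out = 100)  # evenly spread values, as in the first histogram

# Shapiro-Wilk test: a small p-value suggests the data are not normal
sw_unif <- shapiro.test(x_unif)
sw_norm <- shapiro.test(x_norm)
print(c(uniform = sw_unif$p.value, normal = sw_norm$p.value))

# Kolmogorov-Smirnov test against a normal with the sample's own mean and sd
# (estimating the parameters from the same data makes this p-value approximate)
ks_unif <- ks.test(x_unif, "pnorm", mean = mean(x_unif), sd = sd(x_unif))
print(ks_unif$p.value)

# Hypothetical positive, right-skewed response that depends on w:
# a Gamma family with a log link is one candidate for such data
w <- runif(100, 1, 10)
y <- rgamma(100, shape = 2, rate = 2 / exp(0.2 * w))  # mean of y is exp(0.2 * w)
fit <- glm(y ~ w, family = Gamma(link = "log"))
print(summary(fit)$coefficients)
```

The family is chosen from what the response looks like (counts,
proportions, skewed positive values and so on), so the histogram
comparison above is a reasonable way to narrow down the candidates
before fitting.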

Another approach is to use a non-parametric test if one gives an
appropriate answer to your question.
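
A minimal sketch of the non-parametric route, assuming the question
is whether a numeric outcome differs between groups. The data frame
dat is made up for illustration and only borrows the column names
Type and Mileage from your header. The Kruskal-Wallis test used here
is one possible choice, not the only one.

```r
set.seed(1)  # arbitrary seed, for reproducibility

# Made-up data: skewed mileage values for three car types, each shifted
dat <- data.frame(
  Type    = factor(rep(c("Small", "Medium", "Large"), each = 20)),
  Mileage = c(rexp(20, 0.5) + 30,   # Small
              rexp(20, 0.5) + 24,   # Medium
              rexp(20, 0.5) + 18)   # Large
)

# Kruskal-Wallis rank sum test: works on ranks, so it does not assume
# that Mileage follows a normal (or any other named) distribution
kw <- kruskal.test(Mileage ~ Type, data = dat)
print(kw)
```

A small p-value here only says that at least one group tends to have
different values from the others. It does not say which one.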

Jim


On Tue, Apr 28, 2015 at 5:07 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Apr 27, 2015, at 10:50 AM, Lalitha Viswanathan wrote:
>
>> Hi
>> I have a dataset as below
>> Price  Country  Reliability  Mileage  Type   Weight  Disp.  HP
>> 8895   USA      4            33       Small  2560    97     113
>> (Hundreds of rows)
>>
>> I am trying to find the best possible distribution to use, to find p-values
>> and compute which factors most influence efficiency.
>
> "Finding p-values" is a task that requires research questions. You obviously have some sort of meaning attached to the word "efficiency" but have not stated what it is. This appears to be a request for a statistical tutorial on a topic that has not been described. (And if this is course homework, then it is off-topic for r-help.)
>
>>
>> Any starting points for the functions I could use, or similar examples I
>> could follow, would be a start.
>> I am a relative novice at R having used it many years ago and am now
>> getting back to it.
>> So looking for pointers
>>
>> Thanks
>>
>>       [[alternative HTML version deleted]]
>
> The Posting Guide suggests that you create a small example in R code and describe your question more clearly (if it's not homework.)
>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA


