[R] Relative Novice: "Can I get some explanation of the docs for fitdistr(MASS)?"
r.ted.byers at gmail.com
Fri Sep 19 21:28:03 CEST 2008
In the docs I see:
fitdistr(x, densfun, start, ...)
x A numeric vector.
densfun Either a character string or a function returning a density
evaluated at its first argument.
Distributions "beta", "cauchy", "chi-squared", "exponential", "f", "gamma",
"geometric", "log-normal", "lognormal", "logistic", "negative binomial",
"normal", "Poisson", "t" and "weibull" are recognised, case being ignored.
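For reference, here is a minimal sketch of the kind of call that usage line describes. It assumes x is a vector of raw observations (one number per observed event); the values below are simulated for illustration and are not my data.

```r
library(MASS)  # fitdistr() is provided by the MASS package

set.seed(1)
# Assumption: x holds raw observations, simulated here from a gamma distribution.
x <- rgamma(200, shape = 2, rate = 0.5)

# Fit a gamma distribution by maximum likelihood.
fit <- fitdistr(x, "gamma")

fit$estimate  # estimated shape and rate
fit$sd        # approximate standard errors of the estimates
```

The returned object carries the parameter estimates and their standard errors; nothing in it refers to a second "index" column.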
OK, on first glance this seemed simple enough. But now I am puzzled about
the precise meaning of:
"x A numeric vector."
Yes, I know it is a vector of numbers, but what precisely are those numbers
supposed to represent? I have data, imported into R, where each column
after the first is an independent sample, and the numbers are exactly the
observed fraction of one kind of event that produces an event of a second
kind occurring in a given week (specified by an integer in the first column).
This ISN'T an empirical density function or probability function since there
is a fraction of the events of the first kind that never result in an event
of the second kind (hence the sum of all the fractions within a column
varies between 0.25 and 0.45).
My first question is, how do I tell fitdistr what my data means and get it
to relate the data in column i to the data in column 1 (assuming arrays in R
count columns from 1 rather than 0, as is the case in C++ and Java)?
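On the indexing point, R does count from 1. A small sketch of the layout I am describing, using a hypothetical data frame named dat with made-up fractions (the names week, s1 and s2 are placeholders, not my real column names):

```r
# Hypothetical layout: column 1 is the week number, columns 2..n are samples.
dat <- data.frame(week = 1:4,
                  s1 = c(0.10, 0.08, 0.05, 0.02),
                  s2 = c(0.12, 0.09, 0.06, 0.03))

dat[[1]]          # first column (the weeks); indexing starts at 1, not 0
dat[[2]]          # first sample column
ncol(dat)         # number of columns, so the samples are 2:ncol(dat)
colSums(dat[-1])  # totals of the fractions per sample, week column excluded
```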
My second question is, given that all columns end this week, how do I use
fitdistr, or what it produces, to get an estimate of the number of events of
my second type that can be expected next week, or that can be expected over
the next six months, with a confidence interval?
I am hoping that I can iterate over columns 2 through n to apply fitdistr to
each column in turn, and then use the answer to my second question to get
estimates from each. Obviously, the total number of events I should expect
to see next week will be the sum of the estimates from each of the columns.
Is fitdistr able to handle the data I have, or do I have to massage my data
into a form fitdistr can handle? If the latter, is there something
available with R that will make the massage simple and quick?
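To make the question concrete, here is a rough sketch of one possible massage, under assumptions that may well be wrong: that fitdistr wants one entry per observed event, that each sample started from a known number of first-kind events (n_events, a made-up value), and that an exponential distribution is a reasonable stand-in. The data frame dat and its values are hypothetical. The sketch also ignores the events that never produce an event of the second kind.

```r
library(MASS)

# Hypothetical data: column 1 is the week, columns 2..n are observed fractions.
dat <- data.frame(week = 1:6,
                  s1 = c(0.10, 0.08, 0.05, 0.03, 0.02, 0.01),
                  s2 = c(0.12, 0.10, 0.07, 0.04, 0.02, 0.01))

# Assumption: each sample started from n_events events of the first kind,
# so fraction * n_events approximates the count of second-kind events per week.
n_events <- 1000

# Loop over columns 2..n: expand the counts into one week value per event,
# then fit a distribution to those week values.
fits <- lapply(dat[-1], function(frac) {
  counts <- round(frac * n_events)
  delays <- rep(dat$week, times = counts)
  fitdistr(delays, "exponential")
})

sapply(fits, function(f) f$estimate)  # fitted rate for each sample column
```

Since the weeks are integers, a discrete choice such as "geometric" might suit better than "exponential"; the distribution here is only a placeholder.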