[R] "Re-creating" distributions

Bert Gunter gunter.berton at gene.com
Fri Jun 8 06:29:51 CEST 2012


Related comment:

"Even the data aren't sufficient." -- Brian Joiner (some years ago).

Explanation: See W.E. Deming on "analytic" vs "enumerative" statistics.

--- Bert

On Thu, Jun 7, 2012 at 8:06 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> Short answer: no, those are (in general) insufficient parameters to
> characterize a distribution.
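
A quick illustration of the short answer, as a minimal sketch in base R. The two distributions below are chosen purely for illustration: a standard normal and a uniform on [-sqrt(3), sqrt(3)] share the same mean (0), median (0) and SD (1), yet their quantiles differ, so those three numbers alone cannot tell them apart.

```r
# Illustrative only: two distributions with identical mean, median and SD.
# Uniform(a, b) has variance (b - a)^2 / 12, so a = -sqrt(3), b = sqrt(3)
# gives variance 1, matching the standard normal.
q <- c(0.025, 0.25, 0.75, 0.975)

normal_q  <- qnorm(q, mean = 0, sd = 1)
uniform_q <- qunif(q, min = -sqrt(3), max = sqrt(3))

# Same mean/median/SD, but different quartiles and tails
print(round(rbind(normal = normal_q, uniform = uniform_q), 3))
```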
>
> Long answer: unfortunately, it's not uncommon that those "summary
> statistics" are the only ones reported based on someone or other's
> limited experience with the Gaussian. There are a few things you could
> try, but each of them has problems:
>
> i) Pretend like your data is in fact normal and use those parameters
> because they do uniquely characterize a normal distribution. MASS
> (among others) provides a multivariate normal distribution [mvrnorm]
> if you have a covariance matrix available.
>
> ii) If you have reason to imagine another distribution [guided by
> domain knowledge], try to get its parameters in so far as possible by
> moment matching. Covariance structures are much harder for the general
> case though.
>
> iii) If you can get something that resembles original data, simply
> work by bootstrapping / imputation.
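
Options (i) and (ii) above might be sketched roughly as follows. All numbers here (means, SDs, the correlation of 0.5, the reported median) are made up for illustration, and the lognormal in part (ii) is just one example of "another distribution", not a recommendation. The implied median in (ii) gives a rough check against the reported one: a large gap suggests the assumed family is a poor fit.

```r
library(MASS)  # provides mvrnorm()

## (i) Pretend the data are normal.
## Hypothetical reported summaries for two variables:
means <- c(10, 25)
sds   <- c(4, 6)
rho   <- 0.5  # ASSUMED correlation; not recoverable from mean/median/SD
R     <- matrix(c(1, rho, rho, 1), nrow = 2)
Sigma <- diag(sds) %*% R %*% diag(sds)  # covariance matrix from SDs and R

set.seed(1)
x <- mvrnorm(n = 1000, mu = means, Sigma = Sigma)

## (ii) Moment matching for an assumed lognormal.
## For a lognormal: mean = exp(mu + sigma^2 / 2),
##                  var  = (exp(sigma^2) - 1) * mean^2
m <- 10   # hypothetical reported mean
s <- 4    # hypothetical reported SD
reported_median <- 9.3  # hypothetical reported median

sigma2 <- log(1 + s^2 / m^2)
mu_log <- log(m) - sigma2 / 2

implied_median <- exp(mu_log)  # the median of a lognormal is exp(mu)
print(c(reported = reported_median, implied = implied_median))

y <- rlnorm(1000, meanlog = mu_log, sdlog = sqrt(sigma2))
```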
>
> Hope this helps,
> Michael
>
> On Thu, Jun 7, 2012 at 3:34 PM, Andras Farkas <motyocska at yahoo.com> wrote:
>> Dear All,
>>
>> I often have to work with certain models in which I try to "reproduce" a distribution as best I can with very little known information available. Is there a package or function in R that could best reproduce a probability distribution using only the mean, median and SD values available, without knowing the actual distribution type to begin with and/or the covariance matrix (for more than 1 data set)? All I usually have reported is the mean, median and SD. I hope I made my question clear enough...
>>
>> thanks,
>>
>> Andras
>>
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


