[R] Waaaayy off topic...Statistical methods, pub bias, scientific validity
Ravi Varadhan
rvaradhan at jhmi.edu
Fri Jan 7 17:43:50 CET 2011
I have recently written about this issue (i.e., open learning and data
sharing) in a manuscript that is currently under review at a clinical
journal. I have argued that data hoarding is unethical. Participants in
research studies give their time, effort, saliva, and blood in the
altruistic hope that their sacrifice will benefit humankind. If they
were to realize that the real (ulterior) motive of the study
investigators is only to advance their careers, they would think hard
about participating in the studies. Study participants should consent
to participate only if they can get a signed assurance that the
investigators will make their data available for scrutiny and for
public use (under reasonable conditions that are fair to the study
investigators). As Vickers (Trials, 2006) asks, "whose data set is it
anyway?" I believe that we can achieve great progress in clinical
research if and only if we make a concerted effort towards open
learning. Stakeholders (i.e., patients, clinicians, and policy-makers)
should demand that all data potentially relevant to a critical clinical
question be made available in an open learning environment. Unless we
can achieve this, we cannot solve the problems of publication bias and
of inefficient, sub-optimal use of data.
Best,
Ravi.
-------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology, School of Medicine,
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Spencer Graves
Sent: Friday, January 07, 2011 8:26 AM
To: Mike Marchywka
Cc: r-help at r-project.org
Subject: Re: [R] Waaaayy off topic...Statistical methods, pub bias,
scientific validity
I wholeheartedly agree with the trend towards publishing datasets.
One way to do that is as datasets in an R package contributed to CRAN.
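For instance, here is a minimal sketch of shipping a dataset inside a
package (all names hypothetical; "Writing R Extensions" describes the
full requirements):

mydata <- data.frame(id = 1:5, outcome = rnorm(5))  # the data to share
dir.create("mypackage/data", recursive = TRUE)      # package data directory
save(mydata, file = "mypackage/data/mydata.rda")    # loaded via data(mydata)
# After R CMD build and submission to CRAN, any user can run:
#   library(mypackage); data(mydata)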
Beyond this, there seems to be an increasing trend towards journals
requiring authors of scientific research to publish their data as well.
The Public Library of Science (PLoS) has such a policy, but it is not
enforced: Savage and Vickers (2010) were able to get the raw data behind
only one of ten published articles they tried, and that one came only
after reminding the author that s/he had agreed to make the data
available as a condition of publishing in PLoS. (Four other authors
refused to share their data in spite of their legal and moral
commitment to do so as a condition of publishing in PLoS.)
There are other venues for publishing data. For example, much
astronomical data is now routinely web-published so anyone interested
can test their pet algorithm on real data
(http://sites.google.com/site/vousergroup/presentations/publishing-astronomical-data).
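In R, such web-published data can often be pulled straight into a
session; a hedged sketch (the URL is hypothetical):

dat <- read.csv("http://example.org/catalog/measurements.csv")
str(dat)  # inspect the columns before testing any algorithm on them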
Regarding my earlier comment, I just found a Wikipedia article on
"scientific misconduct" that mentions the tendency to refuse to publish
research showing that one's new drug is positively harmful. This is an
extreme version of both types of bias I previously mentioned: (1) only
significant results get published, and (2) private funding provides its
own biases.
Spencer
#########
Savage and Vickers (2010), "Empirical Study of Data Sharing by Authors
Publishing in PLoS Journals", Scientific Data Sharing, added Apr. 26, 2010
(http://scientificdatasharing.com/medicine/empirical-study-of-data-sharing-by-authors-publishing-in-plos-journals-2/).
On 1/7/2011 4:08 AM, Mike Marchywka wrote:
>> Date: Thu, 6 Jan 2011 23:06:44 -0800
>> From: peter.langfelder at gmail.com
>> To: r-help at r-project.org
>> Subject: Re: [R] Waaaayy off topic...Statistical methods, pub bias,
>> scientific validity
>>
>> From a purely statistical and maybe somewhat naive point of view,
>> published p-values should be corrected for the multiple testing that
>> is effectively happening because of the large number of published
>> studies. My experience is also that people will often try several
>> statistical methods to get the most significant p-value but neglect
>> to share that fact with the audience and/or at least attempt to
>> correct the p-values for the selection bias.
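A quick R sketch of the selection Peter describes, under the assumption
of pure-noise data (so every "significant" result is spurious; all
settings illustrative):

set.seed(1)
best.p <- replicate(2000, {
  x <- rnorm(20); y <- rnorm(20)       # no true difference between groups
  min(t.test(x, y)$p.value,            # try several tests ...
      wilcox.test(x, y)$p.value,
      ks.test(x, y)$p.value)           # ... and report only the smallest p
})
mean(best.p < 0.05)                    # well above the nominal 0.05
p.adjust(c(0.012, 0.030, 0.040), method = "bonferroni")  # one standard correction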
> You see this everywhere in one form or another, from medical to
> financial modelling. My solution here is simply to publish more raw
> data in a computer-readable form, in this case of course something
> easy to get with R, so disinterested or adversarial parties can run
> their own "analysis."
> I think there was also a push to create a database for failed drug
> trials that may contain data of some value later. The value of R,
> combined with easily available data for a large cross-section of
> users, could be to moderate problems like the one cited here.
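A minimal sketch of the round trip Mike describes (file name and data
hypothetical):

trial <- data.frame(arm = rep(c("drug", "placebo"), each = 10),
                    response = rnorm(20))            # toy raw data
write.csv(trial, "trial_raw.csv", row.names = FALSE) # investigator publishes
replica <- read.csv("trial_raw.csv")                 # anyone re-analyzes
t.test(response ~ arm, data = replica)               # an independent check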
>
> I almost slammed a poster here earlier who wanted a simple rule for
> "when do I use this test" with something like "when your mom tells you
> to", since post hoc you do just about everything to assume you messed
> up and missed something, but a priori you hope you have designed a
> good hypothesis. And at the end of the day, a given p-value is one
> piece of evidence in the overall objective of learning about some
> system, not appeasing a sponsor. Personally, I'm a big fan of post hoc
> analysis on biotech data in some cases, especially as more pathway or
> other theory is published, but it is easy to become deluded if you
> have a conclusion that you know JUST HAS TO BE RIGHT.
>
> Also, FWIW, in the few cases of FDA-sponsor rhetoric I've examined,
> the data I've been able to get tends to make me side with the FDA. I
> still hate the idea of any regulation or access restrictions, but it
> seems to be the only way to keep sponsors honest to any extent. Your
> mileage may vary, however; take a look at some rather loud
> disagreement with the FDA over earlier DNDN panel results, possibly
> involving threats against critics. LOL.
>
>> That being said, it would seem that biomedical sciences do make
>> progress, so some of the published results are presumably correct :)
>>
>> Peter
>>
>> On Thu, Jan 6, 2011 at 9:13 PM, Spencer Graves
>> wrote:
>>> Part of the phenomenon can be explained by the natural
>>> censorship in what is accepted for publication: stronger results
>>> tend to have less difficulty getting published. Therefore, given
>>> that a result is published, the estimated magnitude of the effect
>>> is on average larger than it is in reality, simply because weaker
>>> results are less likely to be published. A study of the literature
>>> on this subject might yield an interesting and valuable estimate of
>>> the magnitude of this selection bias.
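A minimal R simulation of this censorship (true effect and study size
are illustrative):

set.seed(2)
true.effect <- 0.2
studies <- replicate(5000, {
  x <- rnorm(25, mean = true.effect)          # one small study
  c(estimate = mean(x), p = t.test(x)$p.value)
})
published <- studies["estimate", studies["p", ] < 0.05]   # significant only
c(truth = true.effect, published.mean = mean(published))  # inflated on average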
>>>
>>>
>>> A more insidious problem, which may not affect the work of
>>> Jonah Lehrer, is political corruption in the way research is funded,
>>> with less public and more private funding of research
>>> (http://portal.unesco.org/education/en/ev.php-URL_ID=21052&URL_DO=DO_TOPIC&URL_SECTION=201.html).
>>> For example, I've heard claims (which I cannot substantiate right
>>> now) that cell phone companies allegedly lobbied successfully to
>>> block funding for researchers they thought were likely to document
>>> health problems with their products. Related claims have been made
>>> by scientists in the US Food and Drug Administration that certain
>>> therapies were approved on political grounds in spite of substantive
>>> questions about the validity of the research backing the request for
>>> approval (e.g., www.naturalnews.com/025298_the_FDA_scientists.html).
>>> Some of these accusations of political corruption may be groundless.
>>> However, as private funding replaces tax money for basic science, we
>>> must expect an increase in research results that match the needs of
>>> the funding agency while degrading the quality of published
>>> research. This produces more research that cannot be replicated --
>>> effects that get smaller upon replication. (My wife and I routinely
>>> avoid certain therapies recommended by physicians, because the
>>> physicians get much of their information on recent drugs from the
>>> pharmaceutical companies, which have a vested interest in presenting
>>> their products in the most positive light.)
>>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567