[R] predictive accuracy

Marc Schwartz marc_schwartz at me.com
Tue May 31 17:22:31 CEST 2011


Ahmed,

I agree with your final statement, regarding the competency of contributors to the R lists. That has been my experience over a 10+ year time frame.

That being said, you are really seeking conceptual assistance regarding your particular problem, which is not R specific, although you may require a solution in R once you have identified a possible course of action.

There is another resource that you should join and post to, which is MedStats:

  http://groups.google.com/group/MedStats

That group is set up for such interaction, focused on, of course, medical applications such as yours. There are also quite a few of us on the R lists that participate in MedStats, so you will get some overlap. Of course, there are others on MedStats, who are also highly competent, who may not be useRs, but can offer conceptual insights.

As has been noted in both on and off list communications with you, Pfizer (like any very large pharma company) certainly has internal folks who will have been exposed to the unique nature of multi-national, multi-site clinical studies and should have the expertise to assist you. If they are either unwilling or not available in a timely fashion, then perhaps somebody from here or MedStats will contact you to engage separately. You may find, however, that such assistance, depending upon scope and timeline, will ultimately come at a real financial cost, as part of a formal consultative engagement.

Regards,

Marc

On May 31, 2011, at 9:44 AM, El-Tahtawy, Ahmed wrote:

> 1-	I used R packages (design, lasso) to develop and validate prognostic models. I could have enclosed optimism from the model with and without the strong irrelevant predictor, but that will make the message very long (against guidelines for the site). 
> 
> 2-	This issue is challenging and interesting, and we can all learn something from each other. No one answer is right, but we are seeking the more reliable and accurate method to handle such a situation, which many of us may encounter.
> 
> I came to this group after exhausting many consults. This group has some of the best minds in the area of using R to solve challenging issues.
> 
> Best Regards
> Ahmed 
> 
> 
> -----Original Message-----
> From: Mike Marchywka [mailto:marchywka at hotmail.com] 
> Sent: Thursday, May 26, 2011 8:55 PM
> To: gunter.berton at gene.com; El-Tahtawy, Ahmed
> Cc: r-help at r-project.org
> Subject: RE: [R] predictive accuracy
> 
> ----------------------------------------
>> Date: Thu, 26 May 2011 13:50:15 -0700
>> From: gunter.berton at gene.com
>> To: Ahmed.El-Tahtawy at pfizer.com
>> CC: r-help at r-project.org
>> Subject: Re: [R] predictive accuracy
>> 
>> 1. This is not about R, and should be taken off list.
> 
> Well, depending on what the mods think, a little bit of
> generic "how do I REALLY use this tool" discussion may be of
> benefit for all here: a mailing list for a certain brand of hammer
> may discuss various uses and types of nails, etc.
> 
> Personally, I have an interest in this. If the OP will
> post the data, it may be possible to explore some analysis
> options.
> 
>> 2. You are wading in an alligator infested swamp. Get help from
>> (other) statisticians at Pfizer (there are many good ones there).
> 
> I thought that is what statisticians do? LOL.
> We don't know the situation: an intern, someone looking for outside ideas after
> exhausting internal ones, specific issues with internal peers,
> a summer student not wishing to bother everyone there for details, etc.
> 
>> 
>> Best,
>> Bert
>> 
>> P.S. The answer to all your questions is "no" (imho).
> 
> 
> 
>> 
>> 
>> 
>> On Thu, May 26, 2011 at 1:35 PM, El-Tahtawy, Ahmed
>> wrote:
>>> The strong predictor is the country/region where the study was
>>> conducted. So it is not important/useful for a clinician to use it (as
>>> long as he/she is in the USA or Europe).
>>> Excluding that predictor makes another 2 insignificant predictors
>>> become significant!! Can the new model have a reliable predictive
>>> accuracy? I thought of excluding all patients from other countries and
>>> developing the model accordingly. Is excluding a lot of patients, and
>>> compromising the power, more acceptable??
> 
> LOL, quite the contrary, post hoc selection increases power to find
> whatever you or sponsor desire... 
> 
> Presuming your general interest is in finding out attributes of a given
> drug under various conditions, you would probably want to combine 
> the observations with tentative thoughts on causality and see
> what makes the best story. 
> 
> Statistical significance in isolation is a function of the data and analysis method;
> it doesn't really have anything specific to do with the underlying systems.
> 
> In this case, if you have other continuous prognostic factors (age,
> LDH, and hemoglobin come to mind), you may find that you
> have non-monotonic relations between prognostic factor and outcome.
> But, further, say you have enough patients that you could in fact map
> dose-response curves. It may turn out that this curve is in fact non-monotonic,
> with parameters that are non-monotonic in the prognostic factor. Consider
> 
> avg_survival= a+b*d-c*d^2
> 
> where d is the dose. For small d, it seems to help, but for larger doses it
> makes things worse. Now suppose that "c" is a complicated function
> of hematocrit; it may not be hard to imagine that anemics and siderositic (is
> that a word, LOL?) patients have some underlying problems dealing with your drug.
> These may be distributed geographically, etc.
> 
> This is all stuff you can simulate in R or even on paper. 
> 
> 
> It sounds like you are already trying to write a label, which may
> be a bit premature (although I defer to the guy from DNA for that, LOL):
> "indicated for use in patients in Western Hemisphere with ...."
> 
> You may have decent luck looking at FDA panel discussion transcripts, search for
> related general stats terms confined to "site:fda.gov"
> 
> 
>>> Thanks for your help...
>>> Al
>>> 
>>> -----Original Message-----
>>> From: Marc Schwartz [mailto:marc_schwartz at me.com]
>>> Sent: Thursday, May 26, 2011 10:54 AM
>>> To: El-Tahtawy, Ahmed
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] predictive accuracy
>>> 
>>> 
>>> On May 26, 2011, at 7:42 AM, El-Tahtawy, Ahmed wrote:
>>> 
>>>> I am trying to develop a prognostic model using logistic regression. I
>>>> built full and approximate models with the use of penalization (design
>>>> package). Also, I tried Chi-square criteria and step-down techniques. Used
>>>> BS for model validation.
>>>> 
>>>> The main purpose is to develop a predictive model for a future patient
>>>> population. One of the strong predictors pertains to the study design
>>>> and would not mean much to a clinician/investigator in a real clinical
>>>> situation, and I have been asked to remove it.
>>>> Can I propose a model and nomogram without that strong but irrelevant
>>>> predictor? If yes, do I need to redo model calibration, discrimination,
>>>> validation, etc., or just have 5 predictors instead of 6 in the
>>>> prognostic model?
>>>> 
>>>> 
>>>> 
>>>> Thanks for your help
>>>> 
>>>> Al
>>> 
>>> 
>>> Is it that the study design characteristic would not make sense to a
>>> clinician but is relevant to future samples, or that the study design
>>> characteristic is unique to the sample upon which the model was
>>> developed and is not relevant to future samples because they will not be
>>> in the same or a similar study?
>>> 
>>> Is the study design characteristic a surrogate for other factors that
>>> would be relevant to future samples? If so, you might engage in a
>>> conversation with the clinicians to gain some insights into other
>>> variables to consider for inclusion in the model, that might in turn,
>>> help to explain the effect of the study design variable.
>>> 
>>> Either way, if the covariate is removed, you of course need to engage in
>>> fully re-evaluating the model. You cannot just drop the covariate and
>>> continue to use model fit assessments made on the full model.
>>> 
>>> HTH,
>>> 
>>> Marc Schwartz
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> 
>> 
>> --
>> "Men by nature long to get on to the ultimate truths, and will often
>> be impatient with elementary studies or fight shy of them. If it were
>> possible to reach the ultimate truths without the elementary studies
>> usually prefixed to them, these would not be preparatory studies but
>> superfluous diversions."
>> 
>> -- Maimonides (1135-1204)
>> 
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> 

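The quadratic dose-response idea in the quoted thread (avg_survival = a + b*d - c*d^2, with "c" depending on hematocrit) is easy to play with numerically. The thread suggests R or paper; the following is only a rough sketch in Python using the standard library. The function c_of_hematocrit and every number in it are invented for illustration and are not taken from any data.

```python
# Illustrative only: all parameter values and the form of c(hematocrit)
# are assumptions, not estimates from any study.

def avg_survival(d, a, b, c):
    """Quadratic dose-response from the thread: a + b*d - c*d^2."""
    return a + b * d - c * d ** 2


def c_of_hematocrit(hct, c_min=0.5, slope=0.002, hct_ref=42.0):
    """Hypothetical curvature term: smallest at a reference hematocrit and
    growing as hematocrit departs from it in either direction."""
    return c_min + slope * (hct - hct_ref) ** 2


def best_dose(b, c):
    """Dose maximizing a + b*d - c*d^2 (derivative b - 2*c*d set to zero)."""
    return b / (2.0 * c)


a, b = 12.0, 4.0   # made-up baseline survival and linear benefit per unit dose
fixed_dose = 6.0   # one dose applied to everyone, whatever their hematocrit

for hct in (28.0, 42.0, 56.0):
    c = c_of_hematocrit(hct)
    print(
        f"hematocrit={hct:4.0f}  c={c:.3f}  "
        f"best dose={best_dose(b, c):.2f}  "
        f"survival at dose {fixed_dose:g}: {avg_survival(fixed_dose, a, b, c):.2f}"
    )
```

With these made-up values, the same fixed dose sits above baseline at the reference hematocrit and below baseline away from it. If hematocrit were distributed differently across regions, a country/region variable could pick up that difference without being causal itself.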

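The point in the quoted thread that a model must be fully re-evaluated once a covariate is removed can be illustrated on simulated data. This is a sketch under loud assumptions: the data are simulated, the names region, x1 and x2 are hypothetical, the effect sizes are invented, and the fit is a plain gradient-ascent logistic regression in standard-library Python rather than the penalized fit from the design package mentioned in the thread.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_predictor(w, x):
    """Intercept w[0] plus the weighted sum of the predictors in x."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, y, steps=600, lr=1.0):
    """Unpenalized logistic regression by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(linear_predictor(w, xi))
            grad[0] += resid
            for j, xj in enumerate(xi):
                grad[j + 1] += resid * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(scores, y):
    """Concordance (c-index): share of event/non-event pairs ranked correctly."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Simulated data: 'region' is the strong design-related predictor and x1 is
# correlated with it. All effect sizes are invented for illustration.
rng = random.Random(2011)
rows, y = [], []
for _ in range(300):
    region = rng.randint(0, 1)
    x1 = region + rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.0, 1.0)
    p_event = sigmoid(-1.5 + 2.0 * region + 0.3 * x1 + 0.3 * x2)
    rows.append((region, x1, x2))
    y.append(1 if rng.random() < p_event else 0)

full_X = [list(r) for r in rows]
reduced_X = [[r[1], r[2]] for r in rows]  # same patients, region column dropped

w_full = fit_logistic(full_X, y)
w_reduced = fit_logistic(reduced_X, y)

auc_full = auc([linear_predictor(w_full, x) for x in full_X], y)
auc_reduced = auc([linear_predictor(w_reduced, x) for x in reduced_X], y)

print("full model    (intercept, region, x1, x2):", [round(v, 2) for v in w_full])
print("reduced model (intercept, x1, x2):        ", [round(v, 2) for v in w_reduced])
print(f"apparent AUC: full={auc_full:.3f}  reduced={auc_reduced:.3f}")
```

The coefficients and the discrimination of the reduced model are its own and have to be estimated afresh. In a setup like this one, a predictor correlated with the dropped variable would typically absorb part of its effect, which is one way previously weak predictors can start to look important. The AUCs printed here are apparent (in-sample) values only; an optimism-corrected validation would need to be rerun for the reduced model as well.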
