[R] cpquery problem
Ross Chapman
ross.chapman at ecogeonomix.com
Thu Aug 4 10:37:50 CEST 2016
Hi Marco
Thank you very much for your helpful advice.
I have tried your suggestion of using method = 'lw' with cpquery() and can
now obtain conditional probabilities.
However, I am still puzzled by the outputs from the predict() and
cpquery() functions.
The network that I am working on has the following coefficients for the
node that I am interested in (ABW):
  Parameters of node ABW (conditional Gaussian distribution)

Conditional density: ABW | EST + TR + FFB + RF
Coefficients:
                        0             1             2
(Intercept)  -0.480612729  -5.834617332   0.809011487
TR            1.857271045   1.584331230   1.964198638
FFB           0.182533645   0.066891147   0.028620951
RF           -0.002822838   0.002155205  -0.001608243
Standard deviation of the residuals:
        0          1          2
1.5140402  1.1764351  0.9675918
Discrete parents' configurations:
   EST
0   K1
1   M1
2   M2
If I run predict() using this fitted network I get ABW results very close
to those expected. For example, for test case 1, I get a predicted ABW of
15.022, which is very close to the actual ABW value for this case
(14.871).
However, running cpquery() using the values for this test case returns a
conditional probability of 0 for all levels of ABW observed in the
training data. For example, the conditional probability returned for an
event where ABW < 15 is 0; similarly, the conditional probability for an
event where ABW lies between the minimum and maximum ABW values observed
in the data is also 0, while the conditional probability of an event
where ABW > 24 (which exceeds all observed values) is 1.
Why does cpquery not return a high conditional probability for an event
which is predicted from the same coefficients?
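
For comparison, the probability that the printed parameters themselves imply can be worked out by hand in base R. The sketch below copies the coefficients and residual standard deviations from the output above; the EST level and the TR, FFB and RF values are hypothetical placeholders (the values for test case 1 are not given here), so the numbers it prints only illustrate the calculation.

```r
# Coefficients and residual SDs copied from the fitted ABW node above;
# columns follow the EST configurations 0 (K1), 1 (M1), 2 (M2).
coefs <- matrix(
  c(-0.480612729, -5.834617332,  0.809011487,
     1.857271045,  1.584331230,  1.964198638,
     0.182533645,  0.066891147,  0.028620951,
    -0.002822838,  0.002155205, -0.001608243),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("(Intercept)", "TR", "FFB", "RF"), c("K1", "M1", "M2"))
)
sds <- c(K1 = 1.5140402, M1 = 1.1764351, M2 = 0.9675918)

# HYPOTHETICAL evidence: placeholder values, not the real test case 1.
est <- "K1"
x <- c("(Intercept)" = 1, TR = 9.5, FFB = 20, RF = 3000)

# Conditional mean of ABW: the linear predictor for the chosen EST level.
mu <- sum(coefs[names(x), est] * x)

# Exact probability of the event ABW < 15 under the conditional Gaussian
# with that mean and the residual SD for the same EST level.
p_below_15 <- pnorm(15, mean = mu, sd = sds[[est]])

print(c(mean = mu, p_below_15 = p_below_15))
```

If the evidence passed to cpquery() fixes all four parents, a sampled estimate for the same event would be expected to land near this analytic value, so a large gap between the two points at the evidence specification or the sampling rather than at the coefficients.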
Many thanks for your assistance with these queries.
Regards
Ross
On Mon, August 1, 2016 7:35 pm, Marco Scutari wrote:
> Hi Ross,
>
>
> On 31 July 2016 at 09:11, Ross Chapman <ross.chapman at ecogeonomix.com>
> wrote:
>
>> I have tried running the cpquery in the debug mode, and found that it
>> typically returns the following for instances where the conditional
>> probability is returned as 0:
>>
>>> event matches 0 samples out of 0 (p = 0)
>>
>> Am I right in understanding that the Monte Carlo sampling has been
>> unable to create any cases that match the query? If so, why would this
>> be if the evidence used is very typical of an average case in the data
>> used to train the network?
>
> Yes, that is what is happening. As to the reason why, I guess that the
> dependencies in the data may not be adequately represented in the fitted
> Bayesian network for some reason. What is apparent is that
> (EST=='y' & TR>9 & BU>15819 & RF>2989) has an associated probability
> low enough that you do not observe any such sample in rejection sampling.
> Now, that being the case, you have three options:
>
> 1) use a much larger "n" with a smaller "batch = 10^6" to generate a
> lot more particles;
>
> 2) switch to likelihood weighting, i.e.
>
> cpquery(fitted, event = (ABW <= 11),
>         evidence = list(EST = 'y', TR = c(9, max(data$TR)),
>                         BU = c(15819, max(data$BU)),
>                         RF = c(2989, max(data$RF))),
>         n = 10^6, method = "lw")
>
> 3) look at the parameters in your fitted network and diagnose why this
> is happening.
>
> Cheers,
> Marco
>
>
> --
> Marco Scutari, Ph.D.
> Lecturer in Statistics, Department of Statistics
> University of Oxford, United Kingdom
>
>
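
The difference between the two sampling approaches discussed in the quoted reply can be sketched in base R without bnlearn. The two-node model below is a hypothetical toy, not the network from this thread, and the code is a simplified illustration of the ideas rather than bnlearn's implementation: rejection sampling keeps only particles that happen to match the evidence, while likelihood weighting fixes the evidence and weights every particle.

```r
set.seed(1)
n <- 1e5

# Toy model (HYPOTHETICAL): X ~ N(0, 1), Y | X ~ N(2 * X, 1).
# Query: P(Y > 8) given X close to 5, which is rare under the model.
# For X = 5 exactly, the answer is pnorm(2).

# Rejection sampling: draw from the joint distribution and keep only the
# particles whose X falls inside the evidence interval.
x <- rnorm(n)
y <- rnorm(n, mean = 2 * x, sd = 1)
keep <- x > 4.9 & x < 5.1
n_kept <- sum(keep)
# With no matching particles the estimate falls back to 0, mirroring the
# "0 samples out of 0 (p = 0)" debug message quoted above.
p_rejection <- if (n_kept > 0) mean(y[keep] > 8) else 0

# Likelihood weighting: fix X at the evidence value, sample only Y, and
# weight each particle by the density of the evidence. With a single
# evidence node that has no parents, the weights are all equal.
x_obs <- 5
y_lw <- rnorm(n, mean = 2 * x_obs, sd = 1)
w <- rep(dnorm(x_obs), n)
p_lw <- sum(w * (y_lw > 8)) / sum(w)

print(c(kept = n_kept, rejection = p_rejection, lw = p_lw, exact = pnorm(2)))
```

Because the evidence interval has a probability of roughly 3e-7 under this toy model, rejection sampling with 1e5 draws will usually keep no particles at all, whereas every likelihood-weighted particle contributes to the estimate.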