Hi Marco,

Thank you very much for your helpful advice.
I have tried your suggestion of using method = "lw" with cpquery() and can now obtain conditional probabilities. However, I am still puzzled by the outputs of the predict() and cpquery() functions.

The network I am working on has the following coefficients for the node I am interested in (ABW):

  Parameters of node ABW (conditional Gaussian distribution)

Conditional density: ABW | EST + TR + FFB + RF
Coefficients:
                        0             1             2
(Intercept)  -0.480612729  -5.834617332   0.809011487
TR            1.857271045   1.584331230   1.964198638
FFB           0.182533645   0.066891147   0.028620951
RF           -0.002822838   0.002155205  -0.001608243
Standard deviation of the residuals:
        0         1         2
1.5140402 1.1764351 0.9675918
Discrete parents' configurations:
   EST
0   K1
1   M1
2   M2

If I run predict() on this fitted network I get ABW results very close to those expected. For example, for test case 1 I get a predicted ABW of 15.022, which is very close to the actual ABW value for this case (14.871). (A by-hand version of this calculation is sketched after the quoted message below.)

However, running cpquery() with the evidence values for this same test case returns a conditional probability of 0 for events covering the whole range of ABW values observed in the training data. For example, the conditional probability returned for the event ABW < 15 is 0; the conditional probability of ABW lying between the minimum and maximum ABW values observed in the data is also 0; yet the conditional probability of ABW > 24, which exceeds all observed values, is 1. (The form of the query I am running is sketched below, with placeholder values.) Why does cpquery() not return a high conditional probability for an event that is predicted from the same coefficients?

Many thanks for your assistance with these queries.

Regards,
Ross

On Mon, August 1, 2016 7:35 pm, Marco Scutari wrote:
> Hi Ross,
>
> On 31 July 2016 at 09:11, Ross Chapman <ross.chap...@ecogeonomix.com> wrote:
>
>> I have tried running the cpquery in the debug mode, and found that it
>> typically returns the following for instances where the conditional
>> probability is returned as 0:
>>
>>> event matches 0 samples out of 0 (p = 0)
>>
>> Am I right in understanding that the Monte Carlo sampling has been
>> unable to create any cases that match the query? If so, why would this
>> be if the evidence used is very typical of an average case in the data
>> used to train the network?
>
> Yes, that is what is happening. As to the reason why, I guess that the
> dependencies in the data may not be adequately represented in the fitted
> Bayesian network for some reason. What is apparent is that
> (EST == 'y' & TR > 9 & BU > 15819 & RF > 2989) has an associated
> probability low enough that you do not observe any such sample in
> rejection sampling. Now, that being the case, you have three options:
>
> 1) use a much larger "n" with a smaller "batch = 10^6" to generate a
>    lot more particles;
>
> 2) switch to likelihood weighting, i.e.
>
>    cpquery(fitted, event = (ABW <= 11),
>            evidence = list(EST = 'y', TR = c(9, max(data$TR)),
>                            BU = c(15819, max(data$BU)),
>                            RF = c(2989, max(data$RF))),
>            n = 10^6, method = "lw")
>
> 3) look at the parameters in your fitted network and diagnose why this
>    is happening.
>
> Cheers,
> Marco
>
> --
> Marco Scutari, Ph.D.
> Lecturer in Statistics, Department of Statistics
> University of Oxford, United Kingdom
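P.S. To make the predict() comparison concrete, here is a minimal sketch of how I understand the conditional mean of ABW to be computed from the coefficients above. The covariate values, and the choice of the EST = "M1" configuration (coefficient column 1), are placeholders rather than the actual test-case values.

## Sketch only: reproduce by hand the mean that predict() uses for ABW.
## Coefficients are taken from column "1" of the output above (EST = M1);
## the covariate values are placeholders, not the real test case 1.
beta <- c(intercept = -5.834617332, TR = 1.584331230,
          FFB = 0.066891147, RF = 0.002155205)
x    <- c(intercept = 1, TR = 9.5, FFB = 120, RF = 3000)  # placeholders

mu <- sum(beta * x)   # linear predictor = point prediction for ABW
mu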
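The cpquery() call that returns 0 has the following general form; the evidence values are again placeholders, and 'fitted' is the bn.fit object from the thread.

library(bnlearn)

## Sketch only: placeholder evidence values in place of the real test case 1.
cpquery(fitted,
        event    = (ABW < 15),
        evidence = list(EST = "M1", TR = 9.5, FFB = 120, RF = 3000),
        method   = "lw", n = 10^6)

With the actual test-case values substituted in, this is the call whose zero probability puzzles me.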
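For completeness, this is how I have read option 1) in the quoted message, i.e. the default rejection sampling with many more particles; the values of n and batch (and the evidence thresholds) are only indicative.

## Sketch only: rejection sampling with a larger n and a smaller batch.
cpquery(fitted,
        event    = (ABW < 15),
        evidence = (EST == "M1" & TR > 9 & FFB > 100 & RF > 2900),  # placeholders
        method   = "ls", n = 10^7, batch = 10^6)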