
Knowledge and data fusion in probabilistic networks

2003

Abstract

Probability theory provides the theoretical basis for a logically coherent process of combining prior knowledge with empirical data to draw plausible inferences and to refine theories as observations accrue. Increases in the expressive power of languages for expressing probabilistic theories have been accompanied by refinements and adaptations of Bayesian learning methods to handle the more expressive constructs. These innovations have established Bayesian learning as a unifying theoretical framework for learning in intelligent systems, and have given rise to practical techniques that are receiving wide application. This paper describes theory and methods for exact and approximate learning of probabilistic theories from a combination of background knowledge and observations. The concepts and methods can be adapted to any knowledge representation framework that can express probability distributions over interpretations of a first-order logic. We focus specifically on methods to learn theories that can be expressed in the Multi-Entity Bayesian Network (MEBN) probabilistic logic. MEBN logic is sufficiently general to represent a probability distribution over interpretations of any set of statements that can be expressed in first-order predicate calculus. Bayesian inference provides both a proof theory for combining prior knowledge with observational evidence to derive plausible conclusions and a learning theory for refining a representation as observational evidence accrues. A formal specification is provided for the MEBN logic. A semantics based on random variables provides a logically coherent foundation for open-world reasoning. The paper describes modifications of standard Bayesian learning methods to handle the repeated structures that occur in MEBN theories. Methods are given for specifying domain knowledge as MEBN fragments with structure and parameter prior distributions.
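The abstract's central claim, that prior knowledge and empirical data combine coherently under Bayes' rule, can be illustrated with the simplest conjugate case. The following is a minimal sketch of our own, not code from the paper: an expert's belief about a binary event is encoded as Beta pseudocounts, and observations refine the posterior exactly as the learning process described above.

```python
# Illustrative sketch (not the paper's method): Bayesian fusion of
# expert knowledge and data via a Beta-Binomial conjugate update.
# Expert belief is encoded as pseudocounts; data shift the posterior.

def beta_binomial_update(alpha, beta, successes, failures):
    """Return the posterior (alpha, beta) after observing counts."""
    return alpha + successes, beta + failures

def posterior_mean(alpha, beta):
    """Posterior probability of the event under Beta(alpha, beta)."""
    return alpha / (alpha + beta)

# Expert prior: event occurs ~70% of the time, worth 10 observations.
alpha, beta = 7.0, 3.0
# Empirical data: only 2 occurrences in 10 trials pull the estimate down.
alpha, beta = beta_binomial_update(alpha, beta, successes=2, failures=8)
print(posterior_mean(alpha, beta))  # 9/20 = 0.45
```

The posterior mean (0.45) sits between the expert's prior (0.7) and the empirical frequency (0.2), weighted by the strength of each source.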

FAQs


How does knowledge-data fusion enhance learning in probabilistic networks?

The paper reveals that integrating expert knowledge with empirical data significantly improves the performance of intelligent agents by allowing them to adapt and refine their models effectively under diverse conditions.

What is the role of MEBN logic in probabilistic reasoning?

MEBN logic extends Bayesian networks by allowing the representation of complex first-order theories, which enhances expressive power and modularity in learning probabilistic models.

How do Bayesian networks learn structure and parameters from data?

The study discusses general methods that learn both the structural elements and parameters of Bayesian networks from observations, allowing for a coherent integration of expert guidance.

What methods facilitate knowledge integration in MEBN learning?

The paper describes techniques such as using prior distributions over structures and parameters based on expert input to enhance the fusion of knowledge and data.
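To illustrate how expert-supplied prior distributions enter learning, here is a hedged sketch (our own construction; the function names are hypothetical) of the Dirichlet-multinomial marginal likelihood that underlies Bayesian structure scores of the kind the paper builds on. Expert input appears as pseudocounts in the Dirichlet prior.

```python
# Hedged sketch (ours, not the paper's): log marginal likelihood of
# observed counts for one multinomial node under a Dirichlet prior
# whose pseudocounts encode expert knowledge.
from math import lgamma

def log_marginal_likelihood(pseudocounts, counts):
    """Log P(data | structure) for a single multinomial node."""
    a = sum(pseudocounts)
    n = sum(counts)
    score = lgamma(a) - lgamma(a + n)
    for ak, nk in zip(pseudocounts, counts):
        score += lgamma(ak + nk) - lgamma(ak)
    return score

# A weak uniform prior fits skewed data (8 vs. 2) better than a
# strong symmetric expert prior, which regularizes the score.
weak = log_marginal_likelihood([1, 1], [8, 2])
strong = log_marginal_likelihood([10, 10], [8, 2])
```

Summing such node scores over a candidate network gives a structure score; stronger expert pseudocounts pull the learned model toward the expert's assessments.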

What challenges arise from incomplete data in MEBN logic?

MEBN logic tackles the bias introduced by non-ignorable missing data through observability modeling and expert assessments to ensure accurate parameter estimates despite data limitations.
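As a toy illustration of why non-ignorable missingness biases estimates, and how an expert-assessed observability model can correct it (again our own construction, not the paper's algorithm): if positive and negative cases are recorded with different probabilities, the naive rate over observed cases is biased, and inverse-observability weighting removes the bias.

```python
# Toy sketch (ours): correcting bias from non-ignorable missingness.
# q_pos and q_neg are expert-assessed probabilities that a positive or
# negative case is actually observed.

def corrected_rate(n_pos_obs, n_neg_obs, q_pos, q_neg):
    """Estimate the true positive rate via inverse-observability weights."""
    est_pos = n_pos_obs / q_pos   # recover expected true positive count
    est_neg = n_neg_obs / q_neg   # recover expected true negative count
    return est_pos / (est_pos + est_neg)

# Positives are seen only half the time: the naive rate 20/100 = 0.2
# understates the corrected rate 40/120 = 0.333...
naive = 20 / (20 + 80)
fixed = corrected_rate(20, 80, q_pos=0.5, q_neg=1.0)
```

Ignoring the observability model here would understate the positive rate by a factor of nearly two, which is the bias the FAQ answer refers to.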
