A brief introduction to neural networks

Lyle Ungar

1997

Abstract

Artificial neural networks are being used with increasing frequency for high-dimensional problems of regression or classification. This article provides a tutorial overview of neural networks, focusing on backpropagation networks as a method for approximating nonlinear multivariable functions. We explain, from a statistician's vantage point, why neural networks might be attractive and how they compare to other modern regression techniques.

KEYWORDS: neural networks, function approximation, backpropagation.

Key takeaways

Neural networks efficiently model large, complex regression and classification problems, such as handwritten ZIP code recognition.
Backpropagation networks use sigmoidal activation functions and require many parameters for flexibility in approximation.
Overfitting remains a challenge; cross-validation is essential for optimizing network architecture and training iterations.
Neural networks lack interpretability, functioning more as black boxes compared to traditional statistical methods.
The paper is a tutorial overview of neural networks written from a statistician's vantage point, focusing on backpropagation networks and how they compare to other modern regression techniques.
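
The takeaways on sigmoidal activations and on choosing the number of training iterations can be illustrated with a minimal sketch. It assumes a single hidden layer of sigmoid units, a linear output, squared-error loss, and batch gradient descent. It also substitutes a single held-out validation split for full cross-validation. The function names, the synthetic sine data, and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(params, X):
    """Forward pass: sigmoid hidden layer followed by a linear output."""
    W1, b1, w2, b2 = params
    return sigmoid(X @ W1 + b1) @ w2 + b2

def train_backprop(X, y, X_val, y_val, n_hidden=5, lr=0.1, n_iter=2000, seed=0):
    """Batch gradient descent on 0.5 * mean squared error.

    Tracks the iteration with the lowest validation error (a hold-out
    stand-in for cross-validating the number of training iterations).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, n_hidden))  # input-to-hidden weights
    b1 = np.zeros(n_hidden)                         # hidden biases
    w2 = rng.normal(scale=0.5, size=n_hidden)       # hidden-to-output weights
    b2 = 0.0                                        # output bias
    best = {"val_mse": np.inf, "iter": 0, "params": None}
    for it in range(1, n_iter + 1):
        # Forward pass on the training set.
        H = sigmoid(X @ W1 + b1)
        err = H @ w2 + b2 - y
        # Backward pass: propagate the error through the sigmoid layer.
        dH = np.outer(err, w2) * H * (1.0 - H)
        g_W1, g_b1 = X.T @ dH / n, dH.mean(axis=0)
        g_w2, g_b2 = H.T @ err / n, err.mean()
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
        # Monitor held-out error and remember the best iteration so far.
        val_mse = float(np.mean((predict((W1, b1, w2, b2), X_val) - y_val) ** 2))
        if val_mse < best["val_mse"]:
            best = {"val_mse": val_mse, "iter": it,
                    "params": (W1.copy(), b1.copy(), w2.copy(), b2)}
    return best

# Synthetic one-dimensional regression problem (assumed for illustration).
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=120)
y = np.sin(x) + 0.1 * rng.normal(size=120)
X = x[:, None]
best = train_backprop(X[:80], y[:80], X[80:], y[80:])
print(best["iter"], best["val_mse"])
```

Even this small network has 16 parameters for one input variable, which hints at why the parameter count grows quickly and why a held-out error estimate is needed to guard against overfitting.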

Figures (9)

A two-layer network with 9 input nodes (the 2's), a bias node, and linear activation functions was started from a random initial point for the weights. The estimated weights are shown in Table 2.
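
A network of this kind can be sketched as follows, reading "two layer" as an input layer connected directly to a single output node. Under that assumption, and with squared-error loss, gradient descent from a random starting point should recover the ordinary least-squares coefficients. The data here are synthetic and the function name `fit_linear_network` is hypothetical; nothing below reproduces the paper's Table 2.

```python
import numpy as np

# Synthetic data with 9 inputs (assumed; not the paper's data set).
rng = np.random.default_rng(0)
n, p = 200, 9
X = rng.normal(size=(n, p))
true_w = rng.normal(size=p)
y = X @ true_w + 1.5 + 0.1 * rng.normal(size=n)

# The bias node is modeled as a constant input equal to 1.
Xb = np.hstack([np.ones((n, 1)), X])

def fit_linear_network(Xb, y, lr=0.05, n_iter=5000, seed=1):
    """Gradient descent on mean squared error with a linear activation."""
    w = np.random.default_rng(seed).normal(size=Xb.shape[1])  # random start
    for _ in range(n_iter):
        grad = Xb.T @ (Xb @ w - y) / len(y)
        w -= lr * grad
    return w

w_net = fit_linear_network(Xb, y)
w_ols, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # closed-form least squares
print(np.max(np.abs(w_net - w_ols)))
```

The printed value is the largest disagreement between the trained network weights and the least-squares solution, which makes the equivalence with linear regression easy to check.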

As a first generalization of standard linear regression, one might consider an additive model of the form:

Regression methods can be characterized either as projection methods, where linear or nonlinear combinations of the original variables are used as predictors (such as principal components regression), or as subset selection methods, where a subset of the original set of input variables is used in the regression (e.g. stepwise regression). Table 4 characterizes several popular methods. Neural networks are a nonlinear projection method: they use nonlinear combinations of optimally weighted combinations of the original variables to predict the response.
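
For orientation, the two model classes named in these captions are commonly written as below. The notation is an assumption based on standard textbook forms, not recovered from the paper: $f_j$ are smooth univariate functions, $\sigma$ is a sigmoidal activation, and $K$ is the number of hidden units.

```latex
% Additive model: one smooth function per original input variable
y = \beta_0 + \sum_{j=1}^{p} f_j(x_j) + \varepsilon

% Single-hidden-layer network: nonlinear functions of weighted
% combinations (projections) of all input variables
\hat{y} = \beta_0 + \sum_{k=1}^{K} \beta_k \,
          \sigma\!\Big(\alpha_{k0} + \sum_{j=1}^{p} \alpha_{kj} x_j\Big)
```

The contrast is that the additive model applies each nonlinearity to one variable at a time, while the network applies each nonlinearity to a weighted combination of all variables, which is what makes it a nonlinear projection method.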
