MULTILAYER PERCEPTRON ARTIFICIAL NEURAL NETWORKS: AN APPROACH FOR LEARNING THROUGH THE BAYESIAN FRAMEWORK
Abstract
The machine learning area has recently gained prominence, and artificial neural networks are among the most popular techniques in this field. Such techniques have a learning capacity that develops during an iterative process of model fitting. The multilayer perceptron (MLP) is one of the earliest network architectures and, for this architecture, backpropagation and its modifications are widely used learning algorithms. In this article, the learning of the MLP neural network is approached from the Bayesian perspective through Markov chain Monte Carlo (MCMC) simulations. The MLP architecture consists of input, hidden, and output layers, with weights connecting the neurons of adjacent layers. The input layer is composed of the covariates of the model; the hidden layer applies activation functions; and the output layer produces the result, which is compared with the observed value to compute the loss function. We analyzed the network's learning on simulated data generated from known weights in order to understand estimation under the Bayesian method. Subsequently, we predicted the price of WTI oil and obtained credible intervals for the forecasts. We provide an R implementation and the datasets as supplementary materials.
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).