A reinforcement agent for threshold fusion

Maryam Shokri; Hamid R. Tizhoosh

doi:10.1016/J.ASOC.2006.12.003

Outline

A reinforcement agent for threshold fusion

Maryam Shokri

2008, Applied Soft Computing

https://doi.org/10.1016/J.ASOC.2006.12.003

visibility

…

description

8 pages

link

1 file

Abstract

Finding an optimal threshold in order to segment digital images is a difficult task in image processing. Numerous approaches to image thresholding already exist in the literature. In this work, a reinforced threshold fusion for image binarization is introduced which aggregates existing thresholding techniques. The reinforcement agent learns the optimal weights for different thresholds and segments the image globally. A fuzzy reward function is employed to measure object similarities between the binarized image and the original gray-level image, and provide feedback to the agent. The experiments show that promising improvement can be obtained. Three well-established thresholding techniques are combined by the reinforcement agent and the results are compared using error measurements based on ground-truth images.

Figures (10)

Fig. 1. Basic components of reinforcement learning.

Fig. 2. Overall structure of the reinforced threshold fusion for image binariza- tion. The selected thresholding techniques deliver their thresholds to the agent. Q-learning [26](Table 1) is applied as the RL algorithm. It is the most widely known and employed reinforcement learning algorithm. We mainly selected it because of its simple implementation and reportedly good performance. An optimal policy must be defined in order to make action selection possible. It enables the agent to map the current states to proper actions. We use the é-greedy policy

Definitions for weight adjustment factor A (constants and intervals are selected heuristically)

Fig. 4. From left to right: original images; corresponding ground truth images; results of Otsu method; results of Kittler method; results of Kapur method; results of proposed technique.

The error measure x (Eq. (7)) for the selected algorithms and their RL fusion

Overall performance (O,) of thresholding techniques based on ranking mea- surement

Fig. 5. Typical curves for accumulated rewards (plots of average reward vs. number of iterations). The average number of iterations for convergence of the algorithm is ~ 63 for illustrated cases. The typical curves for accumulated rewards (plots of average reward versus number of iterations) are presented in Fig. 5. As it can be seen from Table 6, the proposed technique reaches the highest ranking mainly because it has the lowest error in 8 cases and, in contrast to the fused methods, has not been the last choice for any of the tested images.

Average computing times of Otsu, Kittler, Kapur, and proposed technique for presented results (all methods implemented in MATLAB)

References (35)

A. Ayesh, Emotionally motivated reinforcement learning based controller, in: Proceedings of the IEEE SMC, The Hague, The Netherlands, 2004.
H.R. Berenji, Fuzzy Q-learning: a new approach for fuzzy dynamic programming problems, in: Proceedings of the Third IEEE International Conference on Fuzzy Systems, Orlando, FL, June 1994.
D. Bhandari, N.R. Pal, D.D. Majumder, Fuzzy divergence, probability measure of fuzzy events and image thresholding, Pattern Recogn. Lett. 13 (1992) 857-867.
K.R. Castleman, Digital Image Processing, Prentice Hall, Upper Saddle River, New Jersey, 1998, 07458.
S. Gadanho, Reinforcement learning in autonomous robots: an empirical investigation of the role of emotions, Ph.D. Thesis, University of Edin- burgh, Edinburgh, 1999.
P.Y. Glorennec, Fuzzy Q-learning and dynamical fuzzy Q-learning, in: Proceedings of the Third IEEE International Conference on Fuzzy Sys- tems, IEEE Press, Piscataway, NJ, 1994, pp. 474-479.
P.Y. Glorennec, L. Jouffe, Fuzzy Q-learning, in: Proceedings of the Sixth International Conference on Fuzzy Systems, Barcelona, Spain, (1997), pp. 659-662.
R.C. Gonzalez, R.E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, New Jersey, 2002, 07458.
W.D. Gregg, Analog and Digital Communication, John Wiley and Sons, Inc., USA, 1977.
R.W. Hamming, Error detecting and error correcting codes, Bell System Tech. J. 29 (1950) 933-968.
L.K. Huang, M.J.J. Wang, Image thresholding by minimizing the mea- sures of fuzziness, Pattern Recogn. 28 (1995) 41-51.
L. Jouffe, Fuzzy inference system learning bay reinforcement methods, IEEE Trans. Syst. Man Cybern. 28 (1998) 338-355.
L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: a survey, J. Artif. Intell. Res. 4 (1996) 237-285.
J.N. Kapur, P.K. Sahoo, A.K.C. Wong, A new method for gray-level picture thresholding using the entropy of the histogram, Comput. Vis. Graph. Image Process. 29 (1985) 273-285.
J. Kittler, J. Illingworth, Minimum error thresholding, Pattern Recogn. 19 (1) (1986) 41-47.
S.U. Lee, S.Y. Chung, R.H. Park, A Comparative performance study of several global thresholding techniques for segmentation, Comput. Vis. Graph. Image Process. 52 (1990) 171-190.
N. Otsu, A threshold selection method from gray level histogram, IEEE Trans. Syst. Man Cybern. SMC-9 (1979) 62-66.
S.J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, N.J., 1995.
P.K. Sahoo, S. Soltani, A.K.C. Wong, Y.C. Chen, A survey of thresholding technique, Comput. Vis. Graph. Image Process. 41 (1988) 233-260.
B. Sankur, M. Sezgin, Survey over image thresholding techniques and quantitative performance evaluation, J. Electron. Imaging 13 (1) (2004) 146-165.
M. Shokri, H.R. Tizhoosh, Using reinforcement learning for image thresholding, Can. Conf. Electr. Comput. Eng. 1 (2003) 1231-1234.
M. Shokri, H.R. Tizhoosh, Q(l)-based image thresholding, Can. Conf. Comput. Robot Vis. (2004) 504-508.
R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, Mass, 1998.
W. Tsai, Moment-preserving thresholding: a new approach, Comput. Vis. Graph. Image Process. 29 (1985) 377-393.
M.A. Walker, An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system for email, J. Artif. Intell. Res. 12 (2000) 387-416.
C.J.C.H. Watkins, Learning from Delayed Rewards, Cambridge Uni- versity, Cambridge, 1989.
C.J.H. Watkins, P. Dayan, Technical note, Q-learning, Mach. Learn. 8 (1992) 279-292.
H. Yan, Unified formulation of a class of image thresholding techniques, Pattern Recogn. 29 (12) (1996) 2025-2032.
P. Yin, Maximum entropy-based optimal threshold selection using deter- ministic reinforcement learning with controlled randomization, Signal Process. 82 (2002) 993-1006.
J. Yoshimoto, Sh. Ishii, M. Sato, Application of reinforcement learning to balancing of acrobat, IEEE Int. Conf. Syst. Man Cybern. (2003) 516-521.
W. Zhang, T.G. Dietterich, Value function approximations and job-shop scheduling, Workshop on Value Function Approximation in Reinforce- ment Learning at ICML-95, 1995, submitted.
S. Ali, K.A. Smith, On learning algorithm selection for classification, Appl. Soft Comput. J. 6 (2) (2006) 119-138.
I. Ko ´kai, A. Lo ¨rincz, Fast adapting value estimation-based hybrid archi- tecture for searching the world-wide web, Appl. Soft Comput. J. 2 (1) (2002) 11-23.
M. Shokri, H.R. Tizhoosh, M. Kamel, The outline of reinforcement learning technique for e-learning applications, in: S. Pierre (Ed.), E- Learning Networked Environments and Architectures: A Knowledge Processing Perspective, Springer Book Series, 2007.
B. Bhanu, Y. Lin, Object detection in multi-modal images using genetic programming, Appl. Soft Comput. J. 4 (2) (2004) 175-201.

A reinforcement agent for threshold fusion

Sign up for access to the world's latest research

Abstract

Related papers

References (35)

Related papers

Related topics

Cited by