Lecture Notes in Computer Science
[Figure 6: four panels (bird rotate, bird whirl, bee rotate, bee whirl), each plotting prediction error (×10^4) against time.]

Fig. 6. This figure displays the prediction error of the embedding view predictor, the nearest neighbor view predictor, and the iterative reconstruction view predictor for the two trajectories rotate and whirl of the bird and the bee. The error is the sum of absolute differences between the predicted and the actual view. Almost all the time, the embedding view predictor is superior to the other two.

Furthermore, we analyzed the three predictors concerning their ability to predict further views without being updated with actual views, i.e., we simulated an occlusion of the objects. To this end, the three view predictors were again applied to the whirl and rotate trajectories, but this time they had to rely solely on their own predictions from the 10th time step on. The results are shown in Fig. 7. Strikingly, the embedding view predictor is able to reliably predict up to 10 further views, while the other two predictors are only able to predict 2-3 further views. A possible explanation could be the higher ambiguity in the high dimensional appearance space. This is a hint that predicting on the embedding of the appearance manifold in a low dimensional space is more appropriate for tracking the appearance of objects than predicting directly in the high dimensional appearance space.

[Figure 7: four panels (bird rotate, bird whirl, bee rotate, bee whirl), each plotting prediction error (×10^4) against time for the embedding view predictor, the nearest neighbor view predictor, and the iterative reconstruction view predictor.]

Fig. 7. This figure shows the prediction error of the three view predictors applied to the two trajectories rotate and whirl of the bird and the bee. From the 10th time step (view) on, the objects are considered to be completely occluded. This means that the predictors have to rely entirely on their own predictions. It can be observed that the embedding predictor can reliably predict up to 10 further views, while the other two predictors cannot predict more than two to three views.

6 Conclusion

We introduced an approach for predicting views of an object by means of its appearance manifold. By applying Isomap to the various views of an object, the appearance manifold of that object can be extracted and embedded into a lower dimensional space. A change of object appearance corresponds to a trajectory on the appearance manifold as well as on its embedding. By keeping track of the position of the object on the embedded manifold, it is possible to forecast the upcoming appearance. We used an iterative version of the reconstruction idea of LLE in order to map points from the embedding space back into the appearance space, and showed that this maps points from the embedding space only to points on the appearance manifold, i.e., only valid views of the object are predicted. Simulations have shown that following the trajectory (and by doing so predicting views) is less error prone using the embedded manifold than its high dimensional equivalent. Furthermore, we have shown that predicting the appearance for several following time steps is also more accurate using the low dimensional embedding. We want to stress that the introduced approach is
no full-fledged real object tracking system but rather a scheme for predicting complex views. In future work, we want to investigate the possibility of using the simplex method for calculating the reconstruction weights, as it implicitly constrains the weights to a convex combination. Furthermore, we want to analyze our approach with real objects and integrate it into a tracking architecture based on a view prediction and confirmation model, hopefully boosting the performance of the tracker strongly.

References

1. Poggio, T., Edelman, S.: A network that learns to recognize three-dimensional objects. Nature 343, 263–266 (1990)
2. Edelman, S., Buelthoff, H.: Orientation dependence in the recognition of familiar and novel views of 3D objects. Vision Research 32, 2385–2400 (1992)
3. Ullman, S.: Aligning pictorial descriptions: An approach to object recognition. Cognition 32(3), 193–254 (1989)
4. Morency, L.P., Rahimi, A., Darrell, T.: Adaptive View-Based Appearance Models. In: Proceedings of CVPR 2003, vol. 1, pp. 803–812 (2003)
5. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by Locally Linear Embedding. Science 290(5500), 2323–2326 (2000)
6. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
7. Zhang, Z., Zha, H.: Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment. SIAM J. Sci. Comput. 26(1), 313–338 (2004)
8. Elgammal, A., Lee, C.S.: Inferring 3D Body Pose from Silhouettes Using Activity Manifold Learning. In: Proceedings of CVPR 2004, vol. 2, pp. 681–688 (2004)
9. Lim, H., Camps, O.I., Sznaier, M., Morariu, V.I.: Dynamic Appearance Modeling for Human Tracking. In: Proceedings of CVPR 2006, pp. 751–757 (2006)
10. Liu, C.B., et al.: Object Tracking Using Globally Coordinated Nonlinear Manifolds. In: Proceedings of ICPR 2006, pp. 844–847 (2006)
11. Saul, L.K., Roweis, S.T.: Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research 4, 119–155 (2003)
M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 663–672, 2008. © Springer-Verlag Berlin Heidelberg 2008

Inverse-Halftoning for Error Diffusion Based on Statistical Mechanics of the Spin System

Yohei Saika

Wakayama National College of Technology, 77 Noshima, Nada, Gobo, Wakayama 644-0023, Japan
saika@wakayama-nct.ac.jp

Abstract. On the basis of the statistical mechanics of the Q-Ising model with ferromagnetic interactions under random fields, we formulate the problem of inverse-halftoning for error diffusion using the Floyd-Steinberg kernel. Then, using Monte Carlo simulation for a set of snapshots of the Q-Ising model and for a standard image, we estimate the performance of our method based on the mean square error and on edge structures of the reconstructed image, such as the edge length and the gradient of the gray-level. We clarify that, if the parameters are set appropriately, the optimal performance of the MPM estimate is achieved by suppressing the gradient of the gray-level on the edges of the halftone image and by removing a part of the halftone image.

Keywords: Bayes inference, digital halftoning, error diffusion, inverse-halftoning, Monte Carlo simulation.

1 Introduction

For many years, many researchers have investigated information processing, such as image analysis, spatial data, and Markov random fields [1-5]. In recent years, based on the analogy between probabilistic information processing and statistical mechanics, statistical-mechanical methods have been applied to image restoration [6] and error-correcting codes [7]. Pryce and Bruce [8] formulated the threshold posterior marginal (TPM) estimate based on the statistical mechanics of the classical spin system. Sourlas [9,10] pointed out the analogy between error correction with Sourlas codes and the statistical mechanics of spin glasses. Nishimori and Wong [11] then constructed a unified framework for image restoration and error-correcting codes based on the statistical mechanics of the Ising model. Recently, statistical-mechanical techniques have been applied to various problems, such as mobile communication [12].
In the field of print technology, many information processing techniques have played important roles in printing multi-level images with high quality. In particular, digital halftoning is an essential technique for converting a multi-level image into a bi-level image that is visually similar to the original image [13]. Various techniques for digital halftoning have been established, such as the threshold mask method [14], the dither method [15], the blue noise mask method [16], and error diffusion [17,18]. Inverse-halftoning is likewise an important technique for reconstructing the multi-level image from the halftone image [19]. For this purpose, various image restoration techniques [20] have been used for inverse-halftoning. In recent years, the MAP estimate [21] has been applied to both the threshold mask method and the error diffusion method. Recently, the statistical-mechanical method has been applied to the threshold mask method [21-23].

In this article, we present a statistical-mechanical formulation of the problem of inverse-halftoning for the error diffusion method using the maximizer of the posterior marginal (MPM) estimate. This method is based on Bayes inference: by the Bayes formula, the posterior probability is estimated from the model prior and the likelihood. In this study, we use a model prior expressed by the Boltzmann factor of the Q-Ising model and a likelihood expressed by the Boltzmann factor of random fields enhancing the halftone image. Then, using Monte Carlo simulation both for a set of snapshots of the Q-Ising model and for a standard image, we estimate the performance of our method based on the mean square error and on edge structures observed in the original, halftone, and reconstructed images, such as the edge length and the gradient of the gray-level. We investigate the edge structures of the reconstructed image because the dot pattern with complex structures appearing in the halftone image is considered to influence the performance of inverse-halftoning. The simulation clarifies that the MPM estimate works effectively for inverse-halftoning of a halftone image converted by the error diffusion method using the Floyd-Steinberg kernel if we set the parameters appropriately. We also clarify that the optimal performance of our method is achieved by suppressing the gray-level difference between neighboring pixels and by removing a part of the edges that are embedded in the halftone image through the procedure of error diffusion.
Further, we clarify the dynamical properties of the MPM estimate for inverse-halftoning. If the parameters are set appropriately, the mean square error converges smoothly to the optimal value irrespective of the choice of the initial condition. On the other hand, the convergent value of the MPM estimate depends on the initial condition of the Monte Carlo simulation.

The present article is organized as follows. In Section 2, we show the statistical-mechanical formulation of the problem of inverse-halftoning for error diffusion. Then, in Section 3, using Monte Carlo simulation both for a set of snapshots of the Q-Ising model and for a gray-level standard image, we estimate the performance of the MPM estimate based on the mean square error and on the edge structures of the original, halftone, and reconstructed images, such as the edge length and the gradient of the gray-level. Section 4 is devoted to summary and discussion.

2 General Formulation

Here we show a statistical-mechanical formulation of the problem of inverse-halftoning for a halftone image generated by error diffusion using the Floyd-Steinberg kernel. First, we consider a gray-level image $\{\xi_{x,y}\}$ in which all pixels are arranged on the sites of a square lattice. Here $\xi_{x,y} = 0, \ldots, Q-1$ and $x, y = 1, \ldots, L$.
[Figure 1: six image panels, (a)-(f).]

Fig. 1. […] (b) the halftone image converted from (a) by the error diffusion method using the Floyd-Steinberg kernel, (c) the 4-level image reconstructed by the MPM estimate when $h_s = 1$, $T_s = 1$, $h = 1$, $T = 0.1$, $J = 0.875$, (d) the 256-level standard image "girl" with 100×100 pixels, (e) the halftone image converted from (d) by the error diffusion method using the Floyd-Steinberg kernel, (f) the 256-level image reconstructed from (e) by the MPM estimate when $h = 1$, $T = 0.1$, $J = 1.40$.
In this study, we treat two kinds of original images. One is the set of gray-level images generated by a true prior expressed by the Boltzmann factor of the Q-Ising model:

$$\Pr(\{\xi_{x,y}\}) = \frac{1}{Z_s} \exp\left[ -\frac{h_s}{T_s} \sum_{\mathrm{n.n.}} \left( \xi_{x,y} - \xi_{x',y'} \right)^2 \right], \tag{1}$$

where $Z_s$ is the normalization factor and n.n. denotes the nearest neighboring pairs. As shown in Fig. 1(a), we can generate gray-level images with smooth structures, as seen in natural images, if we set the parameters $h_s$ and $T_s$ appropriately. The other is the 256-level standard image "girl", shown in Fig. 1(d).

Next, in the procedure of digital halftoning based on error diffusion, we convert the gray-level image $\{\xi_{x,y}\}$ into a halftone image $\{\tau_{x,y}\}$ that is visually similar to the original gray-level image, where $\tau_{x,y} = 0, 255$ and $x, y = 1, \ldots, L$. The halftone images obtained by the error diffusion method are shown in Figs. 1(b) and (e). The error diffusion algorithm follows the block diagram in Fig. 2 and uses the Floyd-Steinberg kernel in Fig. 3. As shown in these figures, the algorithm proceeds through the image in a raster scan, and a binary decision is made at each pixel based on the input gray-level at the $(x,y)$-th pixel and the filtered errors from the previously thresholded samples. At the $(x,y)$-th pixel, the gray-level $u_{x,y}$ is rewritten into the modified gray-level $u'_{x,y}$ as

$$u'_{x,y} = u_{x,y} - \sum_{(k,l) \in S} h_{k,l}\, e_{x-k,y-l}. \tag{2}$$
Here $\{h_{k,l}\}$ is the Floyd-Steinberg kernel and $S$ is the region that supports the site $(x,y)$ under the Floyd-Steinberg kernel. Then $e_{x,y}$ is the error of the halftone image $\tau_{x,y}$ with respect to the modified gray-level $u'_{x,y}$ at the site $(x,y)$:

$$e_{x,y} = \tau_{x,y} - u'_{x,y}. \tag{3}$$

Here the pixel value $\tau_{x,y}$ of the halftone image is obtained using the threshold procedure

$$\tau_{x,y} = \begin{cases} Q-1 & \left( u'_{x,y} \ge (Q-1)/2 \right), \\ 0 & (\text{otherwise}). \end{cases} \tag{4}$$

Next, using the MPM estimate based on the statistical mechanics of the Q-Ising model, we reconstruct a gray-level image from the halftone image converted by the error diffusion method using the Floyd-Steinberg kernel. In this method, we use a model system expressed by a set of Q-Ising spins $\{z_{x,y}\}$ ($z_{x,y} = 0, \ldots, Q-1$; $x, y = 1, \ldots, L$) located on the square lattice. Inverse-halftoning is carried out so as to maximize the posterior marginal probability:

$$\hat{z}_{x,y} = \arg\max_{z_{x,y}} \sum_{\{z\} \neq z_{x,y}} P(\{z\} \mid \{\tau\}), \tag{5}$$

where the posterior probability is estimated based on the Bayes formula:

$$P(\{z\} \mid \{\tau\}) \propto P(\{z\})\, P(\{\tau\} \mid \{z\}), \tag{6}$$

using the model prior and the likelihood. In this study, we assume a model prior expressed by the Boltzmann factor of the Q-Ising model:

$$P(\{z\}) = \frac{1}{Z_m} \exp\left[ -\frac{J}{T_m} \sum_{\mathrm{n.n.}} \left( z_{x,y} - z_{x',y'} \right)^2 \right]. \tag{7}$$

This model prior is expected to enhance the smooth structures seen in natural images. Then, we assume the likelihood

$$P(\{\tau\} \mid \{z\}) \propto \exp\left[ -\frac{h}{T_m} \sum_{x,y} \left( z_{x,y} - \hat{\tau}_{x,y} \right)^2 \right], \tag{8}$$
which generally enhances the gray-level image

$$\hat{\tau}_{x,y} = \sum_{i,j=-1}^{1} a_{i,j}\, \tau_{x+i,y+j}, \tag{9}$$

where $\{a_{i,j}\}$ is the kernel of a conventional filter. In this study, we use as $\{\hat{\tau}_{x,y}\}$ the halftone image itself, obtained by the error diffusion method using the Floyd-Steinberg kernel. We note that our method corresponds to the MAP estimate in the limit $T \to 0$.
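The MPM estimate of Eqs. (5)-(8) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the text: single-site Metropolis updates with a uniformly drawn proposal, free boundary conditions, and the hypothetical names `mpm_inverse_halftone`, `sweeps`, `burn_in`, and `seed`. The posterior marginal at each pixel is approximated by counting how often each gray level is visited after burn-in.

```python
import numpy as np


def mpm_inverse_halftone(tau_hat, q, J, h, T_m, sweeps=200, burn_in=50, seed=0):
    """Approximate the MPM estimate by Metropolis sampling of the posterior.

    The posterior is proportional to exp(-E / T_m), with
    E = J * sum_nn (z - z')^2 + h * sum (z - tau_hat)^2, as in Eqs. (7)-(8).
    """
    rng = np.random.default_rng(seed)
    tau_hat = np.asarray(tau_hat, dtype=float)
    height, width = tau_hat.shape
    z = rng.integers(0, q, size=(height, width))        # random initial spins
    counts = np.zeros((height, width, q), dtype=np.int64)
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]

    def local_energy(value, y, x):
        # Energy terms that involve the spin at (y, x); free boundaries assumed.
        energy = h * (value - tau_hat[y, x]) ** 2
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < height and 0 <= xx < width:
                energy += J * (value - z[yy, xx]) ** 2
        return energy

    for sweep in range(sweeps):
        for y in range(height):
            for x in range(width):
                proposal = rng.integers(0, q)
                delta = local_energy(proposal, y, x) - local_energy(z[y, x], y, x)
                # Metropolis acceptance rule at temperature T_m.
                if delta <= 0 or rng.random() < np.exp(-delta / T_m):
                    z[y, x] = proposal
        if sweep >= burn_in:
            counts[rows, cols, z] += 1                   # histogram of visited levels

    # Maximizer of the estimated posterior marginal at each pixel, Eq. (5).
    return counts.argmax(axis=2)
```

Calling `mpm_inverse_halftone(tau, q=256, J=1.40, h=1, T_m=0.1)` would correspond to the parameter values quoted for the 256-level image; the number of sweeps needed for convergence is not specified here and would have to be checked empirically.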
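[Fig. 2: block diagram of the error diffusion algorithm, with the labels ξ, threshold, $\tau_{x,y}$, $T_{x,y}$, $E_{x,y}$, and error filter $a_{ij}$.]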
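The error diffusion procedure of Eqs. (2)-(4) can be sketched in Python as follows. The kernel weights 7/16, 3/16, 5/16, and 1/16 are the standard Floyd-Steinberg values and are an assumption here, since the kernel figure is not reproduced in this text; the function name `error_diffusion` and the convention of dropping errors that fall outside the image are likewise illustrative. The sketch pushes each error forward to the unvisited neighbors, which is equivalent to the gathered sum in Eq. (2).

```python
import numpy as np

# Standard Floyd-Steinberg weights (assumed), keyed by the (dx, dy) offset of
# the not-yet-visited neighbor that receives a share of the current error.
FS_WEIGHTS = {(1, 0): 7 / 16, (-1, 1): 3 / 16, (0, 1): 5 / 16, (1, 1): 1 / 16}


def error_diffusion(image, q=256):
    """Convert a Q-level image into a halftone image with levels {0, Q-1}."""
    u = np.asarray(image, dtype=float).copy()   # holds the modified gray-levels u'
    height, width = u.shape
    tau = np.zeros((height, width), dtype=int)
    for y in range(height):                     # raster scan, row by row
        for x in range(width):
            # Threshold rule, Eq. (4).
            tau[y, x] = q - 1 if u[y, x] >= (q - 1) / 2 else 0
            # Error between the halftone value and the modified level, Eq. (3).
            e = tau[y, x] - u[y, x]
            # Subtract the weighted error from later pixels, Eq. (2).
            for (dx, dy), weight in FS_WEIGHTS.items():
                xx, yy = x + dx, y + dy
                if 0 <= xx < width and 0 <= yy < height:
                    u[yy, xx] -= weight * e
    return tau
```

Because each error is redistributed with weights summing to one, the mean gray-level of the halftone image stays close to that of the input, apart from the error lost at the image borders.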