Lecture Notes in Computer Science
Effect of Spatial Attention in Early Vision for the Modulation of the Perception of Border-Ownership

Nobuhiko Wagatsuma, Ryohei Shimizu, and Ko Sakai

Graduate School of Systems and Information Engineering, University of Tsukuba, 1-1-1 Ten-nodai, Tsukuba, Ibaraki, 305-8577, Japan
wagatsuma@cvs.cs.tsukuba.ac.jp, shimizu@cvs.cs.tsukuba.ac.jp, sakai@cs.tsukuba.ac.jp
http://www.cvs.cs.tsukuba.ac.jp/

Abstract. We propose a computational model consisting of mutually linked V1, V2, and PP modules. The model reproduces the effect of attention in the determination of border-ownership (BO), which indicates which side of a contour owns the border. The V2 module determines BO based on the surrounding contrast extracted by the V1 module, which can be influenced by top-down spatial attention from the PP module. We carried out simulations of the model with random-block ambiguous figures to test whether spatial attention alters BO for these meaningless stimuli. To compare these results quantitatively with human perception, we carried out psychophysical experiments corresponding to the simulations. The simulation and psychophysical results showed good agreement in that the perception of BO was flipped when the location of spatial attention was altered. These results suggest that spatial attention is a crucial factor for the modulation of figure direction in meaningless figures, and that the effects of spatial attention in early visual areas are crucial for the modulation of figure direction.

Keywords: Attention, border ownership, figure, early vision, model, psychophysics.

1 Introduction

Humans have a function that focuses on the most important and salient object or location at a given moment, which is known as visual attention. Visual attention not only boosts our perception [1] but also alters the perception of objects, or figures, which is apparent in ambiguous figures such as Rubin's vase [2].
We propose that attention alters the contrast gain in early vision, and that the modified contrast then alters the border-ownership (BO) signals that are essential for the determination of the figure direction [3, 4]. If attention is significant, the perception of figure direction is flipped because of the modulation of the activities of BO-selective neurons. As a result, the perception of the object is changed. In the case of Rubin's vase, for example, we can perceive two objects, the vase and two facing faces, depending on which side we pay attention to.

Visual attention has two distinct modes: spatial attention and object-based attention. Both types of attention have been shown to facilitate human perception in a number of respects [5]. In particular, a recent study has reported that spatial attention alters contrast gain in early visual areas [6], the mechanism of which has been addressed by several modeling works [e.g., 7]. These models focus on the interaction between visual attention and lower visual functions such as contrast sensitivity; however, they cannot explain more complex perception such as figure/ground segregation.

It has been reported that the majority of neurons in monkey V2 and V4 show BO selectivity: their responses change depending on which side of a border owns the contour [8]. Computational works have suggested that BO coding is determined based on the surround suppression/facilitation observed in early visual areas; thus luminance contrast around the classical receptive field is crucial for the determination of BO [3, 4]. These models, however, do not reproduce the perception of BO for ambiguous figures in which BO flips alternately.

These previous studies led us to propose the following hypothesis: spatial attention alters contrast gain in early vision, and the increased contrast then modifies the activities of BO-selective neurons. Based on this hypothesis, we propose a network model consisting of mutually connected V1, V2, and PP modules. Top-down spatial attention from PP alters contrast gain in V1. The change in the contrast signal then modifies the activities of BO-selective neurons in V2, because BO is determined solely from surrounding contrast. We carried out simulations of the model and the corresponding psychophysical experiments to investigate the effect of attention on BO determination. The results of the simulations and the psychophysics show good agreement: the direction of figure was flipped by spatial attention in ambiguous stimuli. In addition, the activities of BO model cells are modified depending on the location of attention when Rubin's vase is presented.
These results suggest that the perception of figure direction is altered when spatial attention acts on the early visual area.

A number of previous studies have reported significant effects of attention in visual areas V2 [9] and V4 [9, 10]; however, we focus on the effects in the early visual area, V1, to investigate bottom-up attention that biases BO-selective neurons. Bottom-up attention seems to be crucial for the determination of BO, specifically for meaningless figures, because the latency of the BO signal is short [8], and the switch of figure is achieved automatically [2]. Needless to say, BO-selective neurons in V2 and V4 might be affected directly by attention; however, it is not straightforward to explain how they would alter the direction of BO. Here, we focus on the effects of attention in the early visual area that modulate afferent BO neurons automatically and rapidly.

Fig. 1. An illustration of the model architecture. This model consists of three modules, V1, V2 and PP, with mutual connections, except for the PP-to-V2 pathway, which is omitted to avoid a direct influence of attention from PP on V2.
2 The Model

In our model, spatial attention alters contrast gain in early vision, and the increased contrast modifies the activities of BO-selective neurons, which may underlie the switch of figure and ground. The model consists of three modules: the V1, V2 and Posterior Parietal (PP) modules, as illustrated in Fig. 1. Top-down and bottom-up pathways mutually link these modules, except for PP to V2. We excluded the PP-to-V2 connection to avoid a direct influence of attention on BO model cells. Each module consists of 100x100 model cells distributed retinotopically. In the absence of external input, the activity of a cell at time $t$, $A(t)$, is given by

$$\tau \frac{\partial A(t)}{\partial t} = -A(t) + \mu F(A(t)), \qquad (1)$$

where the first term on the right side is a decay, and the second term represents the excitatory, recurrent signals among the excitatory model cells. The non-linear function $F$ is given by

$$F(x(t)) = \frac{1}{T_r - \tau \log\left(1 - \frac{1}{\tau x(t)}\right)}, \qquad (2)$$

where $\tau$ is the membrane time constant, and $T_r$ is the absolute refractory period. The dynamics of this equation, as well as appropriate values for the constants, have been widely studied [11].
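As a rough illustration of Eqs. (1) and (2), the following sketch integrates the activity of a single isolated cell with a forward-Euler step. It reads Eq. (2) as F(x) = 1 / (T_r - τ log(1 - 1/(τx))). The parameter values (`tau`, `t_r`, `mu`, `dt`), the function names, and the choice to treat the cell as silent when τx ≤ 1 (where the logarithm is undefined) are all assumptions made for this example, not values or methods taken from the paper.

```python
import math


def response(x, tau=10.0, t_r=1.0):
    """Eq. (2) as read here: F(x) = 1 / (T_r - tau * log(1 - 1 / (tau * x))).

    The logarithm is only defined for tau * x > 1; below that the cell is
    treated as silent (an assumption of this sketch).
    """
    if tau * x <= 1.0:
        return 0.0
    return 1.0 / (t_r - tau * math.log(1.0 - 1.0 / (tau * x)))


def simulate(a0, mu=0.5, tau=10.0, t_r=1.0, dt=0.1, steps=1000):
    """Forward-Euler integration of Eq. (1): tau * dA/dt = -A + mu * F(A)."""
    a = a0
    trace = [a]
    for _ in range(steps):
        a += dt * (-a + mu * response(a, tau, t_r)) / tau
        trace.append(a)
    return trace


if __name__ == "__main__":
    trace = simulate(a0=1.0)
    print(f"A(0) = {trace[0]:.3f}, A(end) = {trace[-1]:.3f}")
```

With these hypothetical constants the recurrent term is weaker than the decay, so the activity of an isolated cell relaxes toward zero in the absence of external input.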
2.1 V1 Module

The V1 module models the primary visual cortex, in which local contrast is extracted from an input stimulus, and spatial attention modulates the contrast gain. The input image, Input, is a 124x124 pixel grayscale image with intensity values ranging between zero and one. The local contrast, $C_{\theta\omega}(x, y, t)$, is extracted by the convolution of the image with a Gabor filter, $G_{\theta\omega}$,

$$C_{\theta\omega}(x, y, t) = \mathrm{Input}(x, y) * G_{\theta\omega}(x, y), \qquad (3)$$

where $x$ and $y$ are spatial positions, and $\omega$ represents spatial frequency. The orientation, $\theta$, was selected from $0$, $\pi/2$, $\pi$ and $3\pi/2$. The extracted contrast is modulated by spatial attention; thus the contrast at the attended location is enhanced. The activity of a model cell in the V1 module, $A^{V1}_{\theta\omega xy}$, is given by

$$\tau \frac{\partial A^{V1}_{\theta\omega xy}(t)}{\partial t} = -A^{V1}_{\theta\omega xy}(t) + \mu F(A^{V1}_{\theta\omega xy}(t)) + I^{V2-V1}_{xy}(t) + I^{V1,E}_{\theta\omega xy}(t) + I_o, \qquad (4)$$

where $I^{V2-V1}_{xy}$ represents the feedback from V2 to V1, $I_o$ is random noise, and $I^{V1,E}_{\theta\omega xy}$ is the local contrast, $C_{\theta\omega}$, modulated by the feedback from PP to V1, $I^{PP-V1}_{xy}$, as given by the following equation [7]:

$$I^{V1,E}_{\theta\omega xy}(t) = \frac{\left(C_{\theta\omega}(x, y, t)\right)^{\gamma I^{PP-V1}_{xy}(t)}}{S^{\delta I^{PP-V1}_{xy}(t)} + \left(\frac{1}{(2I+1)(2J+1)} \sum_{\theta\omega} \sum_{j=-J}^{J} \sum_{i=-I}^{I} C_{\theta\omega}(x+j, y+i, t)\right)^{\delta I^{PP-V1}_{xy}(t)}}, \qquad (5)$$

where $S$ prevents the denominator from becoming zero, and $\gamma$ and $\delta$ are constants. In the V1 module, spatial attention influences contrast gain; therefore the contrast at the attended location is enhanced.

2.2 V2 Module

The V2 module models the BO-selective cells reported in V2 that determine the direction of BO. The activity of the BO model cells is determined based on the surrounding contrast signal extracted by the V1 module, as illustrated in Fig. 2 [3, 4]. Each BO model cell has a single excitatory region and a single inhibitory region. The activity of a BO model cell is modulated based on the location and shape of these surrounding regions. To reproduce a wide variety of BO selectivity, we implemented ten types of BO-left and BO-right model cells with distinct surrounding regions.
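The chain from attention-dependent contrast gain in V1 to the BO decision in V2 can be sketched on a toy stimulus. The example below is a simplified sketch under several assumptions that are not taken from the paper: Eq. (5) is read as I = C^(γa) / (S^(δa) + (local mean of C)^(δa)), with a the PP feedback; only a single orientation and frequency channel is used; the image is zero-padded at its edges; a feedback value of 1.0 stands for "no attention" and 1.5 for "attended"; and one BO-left and one BO-right cell with a single-pixel excitatory and inhibitory region each stand in for the ten cell types. All names and constants are hypothetical.

```python
def attention_gain(contrast, feedback, gamma=2.0, delta=2.0, s=0.1, radius=1):
    """Eq. (5)-style normalization for one channel, with zero padding."""
    height, width = len(contrast), len(contrast[0])
    window = (2 * radius + 1) ** 2  # (2I + 1)(2J + 1) with I = J = radius
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            a = feedback[y][x]
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += contrast[yy][xx]
            local_mean = total / window
            denom = s ** (delta * a) + local_mean ** (delta * a)
            out[y][x] = contrast[y][x] ** (gamma * a) / denom
    return out


# Toy stimulus: a border at the centre column and one contour on each side.
STIMULUS = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
LEFT, RIGHT = (1, 0), (1, 4)  # (row, column) of the two surround regions


def figure_side(attended=None, boost=1.5):
    """Compare a toy BO-left and BO-right cell sitting on the centre border."""
    feedback = [[1.0] * 5 for _ in range(3)]  # 1.0 = no attention (assumed)
    if attended is not None:
        feedback[attended[0]][attended[1]] = boost
    v1 = attention_gain(STIMULUS, feedback)
    left, right = v1[LEFT[0]][LEFT[1]], v1[RIGHT[0]][RIGHT[1]]
    bo_left = left - right    # excitatory region on the left, inhibitory on the right
    bo_right = right - left   # mirrored arrangement
    if bo_left > bo_right:
        return "left"
    if bo_right > bo_left:
        return "right"
    return "ambiguous"


if __name__ == "__main__":
    for condition in (None, LEFT, RIGHT):
        print(condition, "->", figure_side(condition))
```

In this toy setting the symmetric stimulus leaves the two cells tied, and boosting the feedback on one side makes the BO cell preferring that side dominant, which mirrors the qualitative behaviour described for the model.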
Fig. 2. [...] in the excitatory surrounding region enhances the activity of the cell. In contrast, if contrast exists in the inhibitory region, the activity of the BO-left cell is suppressed. A dominant model cell owns this border. In this case, the BO-right cell owns the border.

The activity of a BO-selective model cell is given by

$$\tau \frac{\partial A^{V2,BO}_{xyN}(t)}{\partial t} = -A^{V2,BO}_{xyN}(t) + \mu F(A^{V2,BO}_{xyN}(t)) - \gamma F(A^{V2,inh}(t)) + I^{V1-V2,BO}_{xyN}(t) + I_o, \qquad (6)$$

where $I^{V1-V2,BO}_{xyN}$ represents afferent input from V1. The index $BO$ denotes left- or right-BO selectivity, and $N$ represents the type of BO model cell, distinguished by its surround region. If BO-left model cells are more active than BO-right model cells, the figure is judged to be located on the left side. The third term of the equation represents the input from an inhibitory cell that gathers signals from all model cells in the layer. The activity of an inhibitory V2 model cell is given by

$$\tau \frac{\partial A^{V2,inh}(t)}{\partial t} = -A^{V2,inh}(t) + \mu F(A^{V2,inh}(t)) + \kappa \sum F(A^{V2,BO}_{xyN}(t)), \qquad (7)$$

where $\kappa$ is a constant. This inhibitory cell receives inputs from excitatory neurons in V2, and inhibits these neurons.

2.3 Posterior Parietal (PP) Module

The PP module encodes spatial location, with the aim of facilitating the processing of the attended location. The location of spatial attention is given explicitly in this module, which boosts the contrast gain at that location in the V1 module. The PP module receives afferent inputs from the V1 and V2 modules. The activity of an excitatory model cell in the PP module is given by

$$\tau \frac{\partial A^{PP}_{xy}(t)}{\partial t} = -A^{PP}_{xy}(t) + \mu F(A^{PP}_{xy}(t)) - \gamma F(A^{PP,inh}(t)) + I^{PP,BT}_{xy}(t) + I^{PP,A}_{xy}(t) + I_o. \qquad (8)$$
Here, $I^{PP,A}_{xy}$ represents the strength of attention with a Gaussian shape, which mimics top-down attention, the details of which are beyond the scope of this model. $I^{PP,BT}_{xy}$ represents afferent inputs from the V1 and V2 modules to the PP module; this process can be considered a saliency map based on luminance contrast. When there is no top-down attention, the PP module is activated by afferent signals from V1 and V2. The third term represents the input from an inhibitory PP model cell. The activity of the inhibitory cell is determined from the activities of all excitatory PP cells, as in the case of Eq. (7). The PP module encodes spatial location, and facilitates the processing in and around the attended location in V1. Note that spatial attention does not directly affect BO-selective model cells in the V2 module, because we focus on the effect of spatial attention in early vision, V1.

3 Simulation Results

We carried out simulations of the model with a variety of stimuli, in order to test the characteristics of the model in various situations. Specifically, we investigated whether human perception of the direction of figure is reproduced in ambiguous figures. First, we compare the simulation results with those of corresponding psychophysical experiments for ambiguous, random-block figures (Fig. 3(a)). Second, we present an example of the simulation results for well-known ambiguous figures, specifically Rubin's vase (Fig. 3(b)).

3.1 Simulations and Psychophysics for Ambiguous Block Stimuli

First, we carried out simulations of the model with the block objects illustrated in Fig. 3(a). These block objects are ambiguous figures; either the right-hand or the left-hand object can be perceived as the figure. Fig. 4 shows the simulation results for these ambiguous figures. Black and white bars represent the proportion with which the black or white object, respectively, is perceived as the figure. We carried out the simulation of the model under three conditions: no attention, attending to the left object, and attending to the right object. When the location of spatial attention is changed, the dominant population of BO model cells, either right or left, is switched. This result suggests that the perception of figure direction is altered according to the attended location.

To estimate whether the model reproduces human perception, we carried out psychophysical experiments with settings similar to those of the simulations; the procedure is illustrated in Fig. 5. We presented human subjects with figures identical to those used in the simulations, and measured how they perceived the BO direction. Fig. 6 shows the results of the psychophysical experiment. Subjects tended to perceive the attended object as the figure. This result suggests that spatial attention influences the determination of figure direction in these meaningless block figures. Because a similar tendency was apparent in the simulation results, we tested statistically whether there is a difference in the magnitude of the attention modulation between the model and psychophysics.
Here, we define the magnitude of the attention modulation as

$$m = \frac{\mathrm{black}(\mathrm{attn}(\mathrm{black}))}{\mathrm{black}(\mathrm{attn}(\mathrm{without}))} + \frac{\mathrm{white}(\mathrm{attn}(\mathrm{white}))}{\mathrm{white}(\mathrm{attn}(\mathrm{without}))}, \qquad (9)$$

where $m$ represents the magnitude of modulation, black(x) is the proportion with which the black object is perceived as the figure, and white(x) is the corresponding proportion for the white object. attn(y) denotes the attention condition. There was no significant difference in the modulation magnitude between the model and psychophysics (ANOVA: p = 0.6798).

Fig. 3. Examples of stimuli. (a) Ambiguous, meaningless, random-block figures. Either the black or the white random-block object may be perceived as the figure. (b) Rubin's vase. A grey circle indicates the location and extent of the receptive field of BO model cells.

The magnitude of the attention modulation of the model agrees with that of human perception. This result suggests that covert spatial attention could be a crucial factor for the modulation of figure direction.
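To make the modulation magnitude of Eq. (9) concrete, the sketch below evaluates it as the sum of two ratios: the proportion of "black is figure" responses when attending the black object over the same proportion without attention, plus the corresponding ratio for the white object. The function name, argument names, and the input proportions are illustrative assumptions; the numbers are not data from the paper.

```python
def modulation_magnitude(black_attended, black_baseline,
                         white_attended, white_baseline):
    """Eq. (9): sum of attended-to-unattended ratios for the two objects."""
    if black_baseline <= 0 or white_baseline <= 0:
        raise ValueError("baseline proportions must be positive")
    return black_attended / black_baseline + white_attended / white_baseline


if __name__ == "__main__":
    # Illustrative proportions only; these are NOT values from the paper.
    print(modulation_magnitude(0.75, 0.5, 0.75, 0.5))  # 3.0
    print(modulation_magnitude(0.5, 0.5, 0.5, 0.5))    # 2.0
```

Under this reading, m = 2 corresponds to no effect of attention, since each ratio equals one, and larger values indicate that attending an object raises the proportion with which it is perceived as the figure.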