Lecture Notes in Computer Science


Effect of Spatial Attention in Early Vision for the Modulation of the Perception of Border-Ownership

Nobuhiko Wagatsuma, Ryohei Shimizu, and Ko Sakai

Graduate School of Systems and Information Engineering, University of Tsukuba,

1-1-1 Ten-nodai, Tsukuba, Ibaraki, 305-8577, Japan

wagatsuma@cvs.cs.tsukuba.ac.jp,

shimizu@cvs.cs.tsukuba.ac.jp,

sakai@cs.tsukuba.ac.jp

http://www.cvs.cs.tsukuba.ac.jp/

Abstract. We propose a computational model consisting of mutually linked V1, V2, and PP modules. The model reproduces the effect of attention on the determination of border-ownership (BO), which indicates which side of a contour owns the border. The V2 module determines BO from the surrounding contrast extracted by the V1 module, which can be influenced by top-down spatial attention from the PP module. We simulated the model with random-block ambiguous figures to test whether spatial attention alters BO for these meaningless stimuli. To compare the results quantitatively with human perception, we carried out psychophysical experiments corresponding to the simulations. The two sets of results agreed well: the perceived BO flipped when the location of spatial attention was changed. These results suggest that spatial attention is a crucial factor in the modulation of figure direction in meaningless figures, and that the effect of spatial attention in early visual areas is crucial for this modulation.

Keywords: Attention, border ownership, figure, early vision, model, psychophysics.

1 Introduction

The visual system can focus on the most important and salient object or location at a given moment, a function known as visual attention. Visual attention not only boosts perception [1] but also alters the perception of objects, or figures, which is apparent in ambiguous figures such as Rubin’s vase [2]. We propose that attention alters the contrast gain in early vision, and that the modified contrast then alters the border-ownership (BO) signals that are essential for determining the figure direction [3, 4]. If the effect of attention is strong enough, the perceived figure direction flips because the activities of BO-selective neurons are modulated, and the perception of the object changes as a result. In the case of Rubin’s vase, for example, we perceive either the vase or two facing faces, depending on which side we attend to.

Visual attention has two distinct modes: spatial attention and object-based attention. Both types of attention have been shown to facilitate human perception in a number of respects [5]. In particular, a recent study has reported that spatial attention alters contrast gain in early visual areas [6], and several modeling studies have proposed mechanisms for this effect [e.g., 7]. These models focus on the interaction between visual attention and lower visual functions such as contrast sensitivity; however, they cannot explain more complex perception such as figure/ground segregation. It has been reported that the majority of neurons in monkey V2 and V4 show BO selectivity: their responses change depending on which side of a border owns the contour [8]. Computational studies have suggested that BO coding is determined by the surround suppression/facilitation observed in early visual areas, so that luminance contrast around the classical receptive field is crucial for the determination of BO [3, 4]. These models, however, do not reproduce the perception of BO for ambiguous figures, in which BO flips alternately.

These previous studies led us to propose the following hypothesis: spatial attention alters contrast gain in early vision, and the increased contrast then modifies the activities of BO-selective neurons. Based on this hypothesis, we propose a network model consisting of mutually connected V1, V2, and PP modules. Top-down spatial attention from PP alters contrast gain in V1. The change in the contrast signal then modifies the activities of BO-selective neurons in V2, because BO is determined solely from the surrounding contrast. We carried out simulations of the model and corresponding psychophysical experiments to investigate the effect of attention on BO determination. The results of the simulations and the psychophysics show good agreement: the direction of figure was flipped by spatial attention in ambiguous stimuli. In addition, the activities of BO model cells are modified depending on the location of attention when Rubin’s vase is presented. These results suggest that the perception of figure direction is altered when spatial attention acts in the early visual area.

A number of previous studies have reported significant effects of attention in visual areas V2 [9] and V4 [9, 10]; however, we focus on the effects in the early visual area, V1, to investigate bottom-up attention that biases BO-selective neurons. Bottom-up attention seems to be crucial for the determination of BO, specifically for meaningless figures, because the latency of the BO signal is short [8] and the switch of figure is achieved automatically [2]. BO-selective neurons in V2 and V4 might of course be affected directly by attention; however, it is not straightforward to explain how such effects would alter the direction of BO. Here, we focus on the effects of attention in the early visual area that modulate BO-selective neurons automatically and rapidly.

Fig. 1. An illustration of the model architecture. The model consists of three modules, V1, V2, and PP, with mutual connections, except for the PP-to-V2 pathway, which is omitted to avoid a direct influence of attention from PP on V2.

2 The Model

In our model, spatial attention alters contrast gain in early vision, and the increased contrast modifies the activities of BO-selective neurons, which may underlie the switch between figure and ground. The model consists of three modules, V1, V2, and Posterior Parietal (PP), as illustrated in Fig. 1. Top-down and bottom-up pathways link these modules mutually, except for the pathway from PP to V2. We excluded the PP-to-V2 connection to avoid a direct influence of attention on BO model cells.

Each module consists of 100x100 model cells distributed retinotopically. In the absence of external input, the activity of a cell at time t, A(t), is given by

$$\tau \frac{\partial A(t)}{\partial t} = -A(t) + \mu F(A(t)) \tag{1}$$

where the first term on the right side is a decay, and the second term represents the excitatory, recurrent signals among the excitatory model cells. The non-linear function, F(x), is given by

$$F(x(t)) = \frac{1}{T_r - \tau \log\left(1 - \frac{1}{\tau x(t)}\right)} \tag{2}$$

where $\tau$ is the membrane time constant and $T_r$ is the absolute refractory period. The dynamics of this equation, as well as appropriate values for the constants, have been widely studied [11].
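
As a rough illustration of these dynamics, the sketch below implements a response function of the form $F(x) = 1 / (T_r - \tau \log(1 - 1/(\tau x)))$ and integrates a decay-plus-recurrence equation with forward Euler. The parameter values, the zero output for $\tau x \le 1$, the optional external input, and all function names are assumptions made for this example; none of them is specified in the text.

```python
import math

TAU = 0.01   # assumed membrane time constant (s); not given in the text
T_R = 0.001  # assumed absolute refractory period (s); not given in the text
MU = 0.95    # assumed recurrent scaling constant; not given in the text


def F(x, tau=TAU, t_r=T_R):
    """Response function; the logarithm is only defined for tau * x > 1."""
    if tau * x <= 1.0:
        return 0.0  # assumption: no response below threshold
    return 1.0 / (t_r - tau * math.log(1.0 - 1.0 / (tau * x)))


def euler_step(a, ext_input=0.0, dt=0.001, tau=TAU, mu=MU):
    """One forward-Euler step of tau * dA/dt = -A + mu * F(A) + ext_input."""
    return a + (dt / tau) * (-a + mu * F(a) + ext_input)


def simulate(a0, steps, ext_input=0.0):
    """Integrate the activity from a0 for a fixed number of Euler steps."""
    a = a0
    for _ in range(steps):
        a = euler_step(a, ext_input)
    return a
```

With these illustrative values the response function saturates near 1 / T_R for large inputs, and a sub-threshold activity simply decays toward zero when no external input is supplied.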

2.1 V1 Module

The V1 module models the primary visual cortex, in which local contrast is extracted

from an input stimulus, and spatial attention modulates the contrast gain. The input

image Input is a 124x124 pixel, gray scale image with intensity values ranging

between zero and one.

The local contrast, $C_{\theta\omega}(x, y, t)$, is extracted by the convolution of the image with a Gabor filter, $G_{\theta\omega}$,

$$C_{\theta\omega}(x, y, t) = \mathrm{Input}(x, y) \ast G_{\theta\omega}(x, y) \tag{3}$$

where indices x and y are spatial positions, and $\omega$ represents spatial frequency. Orientation, $\theta$, was selected from 0, $\pi/2$, $\pi$, and $3\pi/2$. The extracted contrast is modulated by spatial attention, so that the contrast at the attended location is enhanced.
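
The contrast extraction step can be sketched as a convolution of the input image with an oriented Gabor kernel. The odd-symmetric (sine-phase) kernel, the Gaussian width, the kernel size, the zero padding, and the half-wave rectification below are all assumptions for illustration; the text only states that a Gabor filter with orientation and spatial frequency is convolved with the image.

```python
import numpy as np


def gabor_kernel(theta, omega, sigma=2.0, size=9):
    """Odd-symmetric Gabor kernel; sigma and size are illustrative assumptions."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction in which luminance changes.
    u = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.sin(2.0 * np.pi * omega * u)


def convolve2d_same(image, kernel):
    """Plain 2-D convolution with zero padding; output has the image's shape."""
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros((h, w))
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out


def local_contrast(image, theta, omega):
    """Half-wave rectified Gabor response, a stand-in for the local contrast."""
    return np.maximum(convolve2d_same(image, gabor_kernel(theta, omega)), 0.0)
```

Because the kernel is odd-symmetric and the response is rectified, a luminance step drives only one of two opposite orientations (for example 0 but not $\pi$), so the sign of the contrast is preserved in the choice of channel.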

The activity of a model cell in the V1 module, $A^{V1}_{\theta\omega xy}$, is given by

$$\tau \frac{\partial A^{V1}_{\theta\omega xy}(t)}{\partial t} = -A^{V1}_{\theta\omega xy}(t) + \mu F\left(A^{V1}_{\theta\omega xy}(t)\right) + I^{V1-V2}_{xy}(t) + I^{V1,E}_{\theta\omega xy}(t) + I_o \tag{4}$$

where $I^{V1-V2}_{xy}$ represents the feedback from V2 to V1, $I_o$ is random noise, and $\mu$ is a scaling constant. The local contrast, $C_{\theta\omega}$, is modulated by the feedback from PP to V1, $I^{V1-PP}_{xy}$, as given by the following equation [7]:

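$$I^{V1,E}_{\theta\omega xy}(t) = \frac{C_{\theta\omega}(x, y, t)^{\gamma I^{V1-PP}_{xy}(t)}}{S^{\delta I^{V1-PP}_{xy}(t)} + \left(\frac{1}{(2I+1)(2J+1)}\right) \sum_{\theta\omega} \sum_{j=-J}^{J} \sum_{i=-I}^{I} C_{\theta\omega}(x+j, y+i, t)^{\delta I^{V1-PP}_{xy}(t)}} \tag{5}$$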

where S in eq. (5) prevents the denominator from becoming zero, and $\gamma$ and $\delta$ are constants. In the V1 module, spatial attention influences contrast gain, so the contrast at the attended location is enhanced.
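
One way to picture this attentional gain is as a divisive normalization in which the PP feedback scales the exponents of the contrast terms. The sketch below assumes that form; the exponent constants, the semi-saturation value, the pooling window, the zero padding, and the convention that a feedback value of 1.0 means no extra attention are all illustrative assumptions, as are the function names.

```python
import numpy as np


def pooled_power(channel, exponent, half_i, half_j):
    """Mean of neighbouring contrasts raised to the centre pixel's exponent."""
    h, w = channel.shape
    padded = np.pad(channel, ((half_i, half_i), (half_j, half_j)))
    total = np.zeros((h, w))
    for di in range(2 * half_i + 1):
        for dj in range(2 * half_j + 1):
            total += padded[di:di + h, dj:dj + w] ** exponent
    return total / ((2 * half_i + 1) * (2 * half_j + 1))


def attended_contrast(contrast, attention, gamma=2.0, delta=2.0, s=0.1,
                      half_i=2, half_j=2):
    """Attention-modulated, normalized contrast.

    contrast:  array (channels, H, W) of non-negative contrast values.
    attention: array (H, W) of positive PP-to-V1 feedback values.
    """
    pooled = np.zeros(attention.shape)
    for channel in contrast:  # pool over all orientation/frequency channels
        pooled += pooled_power(channel, delta * attention, half_i, half_j)
    return contrast ** (gamma * attention) / (s ** (delta * attention) + pooled)
```

With these assumed parameters, raising the feedback at a location increases the normalized response there when the local contrast exceeds the semi-saturation value s, and decreases it when the contrast is below s.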

2.2 V2 Module

The V2 module models the BO-selective cells reported in V2, which determine the direction of BO. The activities of the BO model cells are determined from the surrounding contrast signal extracted by the V1 module, as illustrated in Fig. 2 [3, 4]. Each BO model cell has a single excitatory region and a single inhibitory region. The activity of a BO model cell is modulated depending on the location and shape of these surrounding regions. To reproduce a wide variety of BO selectivity, we implemented ten types of BO-left and BO-right model cells with distinct surrounding regions.

Fig. 2. A mechanism of BO determination [3, 4]. In the case of a BO-right cell, a contrast signal in the excitatory surrounding region enhances the activity of the cell. In contrast, if contrast exists in the inhibitory region, the activity of a BO-left cell is suppressed. The dominant model cell owns the border; in this case, the BO-right cell owns the border.

The activity of a BO-selective model cell is given by

$$\tau \frac{\partial A^{V2,BO}_{xyN}(t)}{\partial t} = -A^{V2,BO}_{xyN}(t) + \mu F\left(A^{V2,BO}_{xyN}(t)\right) - F\left(A^{V2,inh}(t)\right) + I^{V2-V1,BO}_{xyN}(t) + I_o \tag{6}$$

where $I^{V2-V1,BO}_{xyN}$ represents the afferent input from V1. The index BO denotes left or right BO selectivity, and N denotes the type of BO model cell, distinguished by its surround regions. If BO-left model cells are more active than BO-right model cells, the figure is judged to be located on the left side. The third term of the equation represents the input from an inhibitory cell that gathers signals from all model cells in the layer. The activity of the inhibitory V2 model cell is given by

$$\tau \frac{\partial A^{V2,inh}(t)}{\partial t} = -A^{V2,inh}(t) + \mu F\left(A^{V2,inh}(t)\right) + \kappa \sum_{Nxy} F\left(A^{V2,BO}_{xyN}(t)\right) \tag{7}$$

where $\kappa$ is a constant. This inhibitory cell receives inputs from the excitatory neurons in V2 and inhibits these neurons.
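
The surround-based BO decision can be illustrated with a deliberately simplified sketch that omits the temporal dynamics and the global inhibition. It assumes one BO-left and one BO-right cell per border, square surround regions displaced horizontally from the receptive-field centre, and a rectified difference of summed contrast as the afferent drive. The region geometry, the sizes, and all names are hypothetical; the model itself uses ten cell types with distinct surround regions.

```python
import numpy as np


def region_mask(shape, center, offset, half_size):
    """Boolean mask of a square surround region displaced from the CRF centre."""
    mask = np.zeros(shape, dtype=bool)
    r, c = center[0] + offset[0], center[1] + offset[1]
    r0, r1 = max(r - half_size, 0), min(r + half_size + 1, shape[0])
    c0, c1 = max(c - half_size, 0), min(c + half_size + 1, shape[1])
    mask[r0:r1, c0:c1] = True
    return mask


def bo_input(contrast, center, exc_offset, inh_offset, half_size=3):
    """Contrast in the excitatory region minus the inhibitory region, rectified."""
    exc = contrast[region_mask(contrast.shape, center, exc_offset, half_size)].sum()
    inh = contrast[region_mask(contrast.shape, center, inh_offset, half_size)].sum()
    return max(exc - inh, 0.0)


def figure_side(contrast, center, shift=6):
    """Compare one BO-left and one BO-right cell and report the dominant side."""
    right = bo_input(contrast, center, (0, shift), (0, -shift))
    left = bo_input(contrast, center, (0, -shift), (0, shift))
    if left == right:
        return "undetermined"
    return "left" if left > right else "right"
```

In this toy setting, equal contrast on both sides of the border leaves the decision undetermined, and a modest gain increase on one side, such as the one attention is proposed to produce in V1, is enough to tip the decision toward that side.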

2.3 Posterior Parietal (PP) Module

The PP module encodes spatial location, with the aim of facilitating the processing of the attended location. The location of spatial attention is given explicitly in this module, which boosts the contrast gain at that location in the V1 module.

The PP module receives afferent inputs from the V1 and V2 modules. The activity of an excitatory model cell in the PP module is given by

$$\tau \frac{\partial A^{PP}_{xy}(t)}{\partial t} = -A^{PP}_{xy}(t) + \mu F\left(A^{PP}_{xy}(t)\right) - F\left(A^{PP,inh}(t)\right) + I^{PP,BT}_{xy}(t) + I^{PP,A}_{xy}(t) + I_o \tag{8}$$

$I^{PP,A}_{xy}$ represents the strength of attention, which has a Gaussian shape and mimics top-down attention; its details are outside the scope of this model. $I^{PP,BT}_{xy}$ represents the afferent inputs from the V1 and V2 modules to the PP module; this process can be regarded as a saliency map based on luminance contrast. When there is no top-down attention, the PP module is activated by the afferent signals from V1 and V2. The third term represents the input from an inhibitory PP model cell, whose activity is determined from the activities of all excitatory PP cells, as in eq. (7). The PP module encodes spatial location and facilitates the processing in and around the attended location in V1. Note that spatial attention does not directly affect BO-selective model cells in the V2 module, because we focus on the effect of spatial attention in early vision, V1.
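
A Gaussian-shaped attention input over the retinotopic grid can be sketched as follows. The width, peak strength, and function name are assumptions for illustration; the text specifies only that the attention signal has a Gaussian shape and an explicitly given location.

```python
import numpy as np


def gaussian_attention(shape, center, sigma, strength=1.0):
    """Gaussian attention map peaking at `center` (row, col) on a grid of `shape`."""
    rows, cols = np.indices(shape)
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return strength * np.exp(-d2 / (2.0 * sigma ** 2))


# Example: attention directed to the right-hand side of a 100x100 module.
attention_right = gaussian_attention((100, 100), center=(50, 75), sigma=10.0)
```

Moving `center` to the other side of the grid corresponds to switching the attended object between simulation conditions.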

3 Simulation Results

We carried out simulations of the model with a variety of stimuli in order to test the characteristics of the model in various situations. Specifically, we investigated whether human perception of the direction of figure is reproduced for ambiguous figures. First, we compare the simulation results with those of corresponding psychophysical experiments for ambiguous, random-block figures (Fig. 3(a)). Second, we present an example of the simulation results for a well-known ambiguous figure, Rubin’s vase (Fig. 3(b)).

3.1 Simulations and Psychophysics for Ambiguous Block Stimuli

First, we carried out simulations of the model with the block objects illustrated in Fig. 3(a). These block objects are ambiguous figures: either the right- or the left-hand object can be perceived as figure. Fig. 4 shows the simulation results for these ambiguous figures. Black and white bars represent the proportion of cases in which the black or the white object, respectively, is perceived as figure. We simulated the model under three conditions: no attention, attention to the left object, and attention to the right object. Changing the location of spatial attention switches the dominant population of BO model cells between left and right. This result suggests that the perception of figure direction is altered according to the attended location.

To assess whether the model reproduces human perception, we carried out psychophysical experiments with settings similar to those of the simulations; the procedure is illustrated in Fig. 5. We presented human subjects with figures identical to those used in the simulations and measured how they perceived the BO direction. Fig. 6 shows the results of the psychophysical experiment. Subjects tended to perceive the attended object as figure. This result suggests that spatial attention influences the determination of figure direction in these meaningless block figures. Because the same tendency was apparent in the simulation results, we tested statistically whether the magnitude of the attention modulation differs between the model and the psychophysics. Here, we define the magnitude of the attention modulation as follows:

[Eq. (9): m expressed in terms of black(attn(black)), black(attn(without)), white(attn(white)), and white(attn(without)); the operators are not recoverable from the source.] (9)

where m represents the magnitude of modulation, black(x) is the proportion of cases in which the black object is perceived as figure, and white(x) is the corresponding proportion for the white object. attn(y) denotes the attention condition. There was no significant difference in the modulation magnitude between the model and the psychophysics (ANOVA: p = 0.6798). The magnitude of the attention modulation of the model thus agrees with that of human perception. This result suggests that covert spatial attention could be a crucial factor in the modulation of figure direction.

Fig. 3. Examples of stimuli. (a) Ambiguous, meaningless, random-block figures. Either the black or the white random-block object may be perceived as figure. (b) Rubin’s vase. A grey circle indicates the location and extent of the receptive field of the BO model cells.
