Plans or Outcomes: How do we attribute intelligence to others?
Participants were presented with the following prompt: PLEASE READ THE FOLLOWING INSTRUCTIONS CAREFULLY. Your task is to exit the maze by reaching the red square in as few steps as possible. You can move one square at a time, by clicking on the white squares near your character. Blue squares are walls. You cannot see through the walls, so the squares you cannot see yet are black. The exit is equally likely to be hidden behind any of the black squares. At the end of the experiment we will add all steps you took and show you how you did compared to previous results. You have an opportunity to earn a bonus for completing the mazes in fewer steps.

The instructions included an annotated image to disambiguate the meaning of terms like 'wall' and the reference to the agent character. After reading the instructions, participants completed three practice trials and answered a comprehension quiz. The quiz contained two multiple-choice questions (My task is to ... and My bonus will be bigger if I ...). Participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded.

On each trial participants looked for an 'exit' (marked, when visible, as a bright red circle) in a series of mazes, controlling an agent with the mouse. The agent could move one grid square at a time (N, W, S, or E) and had a 360-degree view of the maze limited by walls. A maze started out dark (except for walls), but was uncovered as the participant opened new areas. Participants initially saw the layout of the rooms and the location of the barrier walls, but did not know where the exit was. Participants received a performance-based bonus of up to $1 for finishing all mazes in the fewest steps. To receive the maximal bonus, participants needed to achieve a step cost within 5% of the optimal solution. The bottom 10% of participants, determined offline after we analyzed the responses, received no bonus. At the end of MST Search, participants answered a question about how they made their decisions. The decision times (times to make a move) and the path taken were recorded for each trial.

4.2.2 MST Attribution

Participants rated the intelligence of maze-solving agents, in a procedure identical to Experiment 1. As before, at the end of the task participants were asked how they made their decisions.

4.3 Stimuli

4.3.1 MST Search

There were 12 stimuli, which are shown in Fig. 8.

Figure 8: Stimuli used in MST Search in Experiment 2. Black cells indicate unseen areas. White cells are empty cells already revealed. Grey cells are walls, and are always visible. The agent can move through empty areas, but not through walls. The starting location was always at the top-left corner. The exit was hidden behind one of the black squares. Solid lines indicate the averaged path taken by the majority of participants; the average paths are calculated by taking a step in each cell that is taken by the majority of participants. The average paths were identical to the optimal solution in all mazes but two; in these two cases the optimal solution was the second most common path. Dashed blue lines indicate the second most common path.

Figure 9: Individuals' planning values in MST Search plotted against MST Attribution weights placed on planning (left) and on outcome (right) in Experiment 2. The better participants planned in MST Search, the more they weighted planning, and the less they weighted outcome in MST Attribution.
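The relationship shown in Figure 9 can be checked with simple Pearson correlations between each participant's planning value from MST Search and the attribution weights fitted to that participant in MST Attribution (both estimated as described in the Results below). The following is a minimal illustrative sketch, not the paper's analysis code; the file name and column names are hypothetical, and the planning values and MELM weights are assumed to have been computed already.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical per-participant summary: one row per participant, with their
# planning value from MST Search (z-scored ln(1/tau), see Section 2.3) and
# their fitted MELM random-slope weights from MST Attribution.
df = pd.read_csv("participant_summary.csv")  # columns: planning, w_planning, w_outcome

# Planning quality vs. weight placed on planning (Fig. 9, left panel).
r_plan, p_plan = pearsonr(df["planning"], df["w_planning"])

# Planning quality vs. weight placed on outcome (Fig. 9, right panel).
r_out, p_out = pearsonr(df["planning"], df["w_outcome"])

print(f"planning vs planning weight: r = {r_plan:.2f}, p = {p_plan:.3g}")
print(f"planning vs outcome weight:  r = {r_out:.2f}, p = {p_out:.3g}")
```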
4.3.2 MST Attribution

MST Attribution consisted of 32 trials, identical to Experiment 1.

4.4 Results

4.4.1 MST Search

We measure a participant's planning ability in the same way as we measure an agent's planning in Experiment 1: we infer the decision noise parameter τ required to generate the participant's path and calculate planning values as z-score(ln(1/τ)) (see Section 2.3). We found a significant correlation between participants' planning in MST Search and the weights they placed on planning in MST Attribution, r = .45, p < .001 (see Fig. 9 and the MST Attribution results below), suggesting that both processes use a common mental planning mechanism. We also found a significant anti-correlation between an individual's planning in MST Search and the weights placed on outcome (r = −.32, p < .001). This suggests that participants who were better at planning in MST Search also placed less weight on outcome in MST Attribution.

To further characterize behavior in general, we calculated the average paths by taking a step in each cell that is taken by the majority of participants. The average paths, and the second most common paths, are shown in Fig. 8. The average paths were identical to the optimal trajectory in all mazes but two; in these two cases the optimal solution was the second most common path.

4.4.2 MST Attribution

The distribution of z-scored ratings for each type of agent is shown in Fig. 10. Each triangle represents a participant's mean rating of four agents of a given type.

Figure 10: Intelligence ratings for each type of agent in Experiment 2, following the same conventions as in Experiment 1.

We replicated the MELM analysis from Experiment 1 to assess the effect of planning and outcome on ratings of complete trials. We constructed a null model including only a random effect of participant, and found that it was significantly improved by modeling random slopes for outcome per participant, χ²(1) = 1433.9, p < .001. The null model was also significantly improved by modeling random slopes for planning per participant, χ²(1) = 1437.2, p < .001. Adding random slopes for planning per participant to the outcome-only model resulted in further significant improvement, χ²(1) = 850.4, p < .001, as did adding random slopes for outcome to the planning-only model, χ²(1) = 847.06, p < .001. Mean rating predictions from each model compared to mean human ratings are shown in Fig. 11. The outcome and planning weights fitted to individuals are shown in Fig. 12.

Figure 11: Mean rating predictions from each model compared to mean human ratings in Experiment 2. Error bars indicate 95% confidence intervals. Agent labels are: OL - optimal-lucky, OF - optimal-fair, OUL - optimal-unlucky, PS-R - pseudo-random, SL - suboptimal-lucky, SUL - suboptimal-unlucky, OPT-part - optimal-part, SUB-part - suboptimal-part. The combined (Both) model is the closest to human attributions.

We also replicated the effect of planning on ratings of incomplete trials. We constructed a MELM null model with only a random intercept for participant, and found that it was significantly improved by adding a random slope for planning, χ²(1) = 41.211, p < .001, meaning that on trials where the outcome was not shown, planning alone still had an effect on attributed intelligence.
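The model comparisons reported above are likelihood-ratio tests between nested mixed-effects models. The sketch below shows the shape of one such comparison; it is illustrative rather than the paper's analysis code: the data file and column names are hypothetical, and the random-effects structure is simplified (it adds both a slope variance and an intercept-slope covariance, so its degrees of freedom differ from the 1-df tests reported in the text).

```python
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per (participant, trial), with a
# z-scored intelligence rating and per-trial planning and outcome predictors.
ratings = pd.read_csv("attribution_trials.csv")  # columns: participant, rating, planning, outcome

# Null model: random intercept for participant only.
null_fit = smf.mixedlm("rating ~ 1", ratings,
                       groups=ratings["participant"]).fit(reml=False)

# Richer model: additionally allow a per-participant slope for planning.
plan_fit = smf.mixedlm("rating ~ 1", ratings,
                       groups=ratings["participant"],
                       re_formula="~planning").fit(reml=False)

# Likelihood-ratio test between the nested models (fit by ML, not REML).
# This simplified structure adds two parameters (slope variance and
# intercept-slope covariance); the paper's MELM comparisons are 1-df tests.
lr = 2 * (plan_fit.llf - null_fit.llf)
dof = 2
print(f"chi2({dof}) = {lr:.1f}, p = {stats.chi2.sf(lr, dof):.3g}")
```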
The correlation between the weights placed on planning on incomplete trials and on all trials was significant (r = .73, p < .0001), meaning that participants' planning-based attributions on all trials can be predicted from how they evaluated incomplete trials.

4.4.3 Task Order Effects

We compared the planning and outcome MELM random slope coefficients of participants in Experiments 1 and 2 (Fig. 13). An independent-samples t-test on the planning coefficients, t(166) = −2.68, p = .008, showed that participants' planning weights were higher in Experiment 2 than in Experiment 1. There was no significant difference between the weights placed on outcome in Experiments 1 and 2: an independent-samples t-test comparing outcome coefficients was not significant, t(166) = 1.9, p = .06.

Figure 12: The MELM random slope coefficients for outcome and planning fitted to individuals in Experiment 2. Error bars indicate bootstrapped 95% confidence intervals. The red line visually separates participants who weight planning more than outcome from those who weight outcome more than planning. Both weights are different from zero for all participants.

4.5 Discussion

In the second experiment we replicated the findings of Experiment 1. We additionally found a significant relationship between performance in MST Search and attributing intelligence based on planning in MST Attribution. Specifically, we found that participants' proximity to the optimal planner in MST Search predicted planning-based attributions of intelligence in MST Attribution. This suggests that intelligence attribution depends on cognitive abilities related to planning, as measured by MST Search and the CRT.

MST Search and MST Attribution both require approximating rational planning. The finding that the quality of planning during search correlates with the quality of planning during evaluation suggests that people may use a similar cognitive mechanism to plan for themselves and to evaluate others. If the two mechanisms were unrelated, we would have seen no such correlation. For example, one could do the task by guessing, but expect others to follow the optimal plan. One could also do the task rationally, and yet evaluate others based on outcome. Our results exclude these possibilities, and suggest that there is a dependence between how well people plan and how strongly they expect good planning from others.

Figure 13: MELM random slope coefficients of individual participants in Experiments 1 and 2. Participants in Experiment 2 (who completed the MST Search) were fitted with higher planning coefficients.

We also found that in the second experiment outcome bias was reduced through an increase in the weights placed on planning in MST Attribution after completing MST Search. At the same time, we observed a marginal difference between the outcome weights placed in Experiments 1 and 2 (p = .06), meaning that the weights placed on outcome may have been reduced as well. This suggests that, at least for some participants, the choice of attribution strategy is flexible, and can be influenced by context. In Experiment 3, we further verify this result by presenting MST Attribution and MST Search in a counterbalanced order.

We note that the reduction in outcome bias in MST Attribution could happen by training the participants' planning skills, in addition to increasing the salience of task constraints.
If so, then MST Search performance could be affected by task order in Experiment 3, assuming MST Attribution and MST Search train one another.

Notably, in Experiments 1 and 2 the effect of optimal planning is consistent with real-world expectations: optimal planning minimizes unlucky outcomes, so that when optimal agents are unlucky, their outcome is on average better compared to suboptimal-unlucky agents. In Experiment 3 we designed a new set of stimuli that equalized the number of steps across optimal and suboptimal agents, to preempt the concern that outcome depends on planning, and to potentially increase the discriminating strength of the design. We achieved this by manipulating the starting location of each agent: suboptimal agents tended to start closer to the goal, so that on average they took the same number of steps as optimal agents and differed only in the quality of their planning. As a result of this manipulation, the optimal agents in Experiment 3 "play with a handicap", which means that the aggregate statistics of the stimuli may differ from what subjects expect in real life. By replicating the results of Experiments 1 and 2 in Experiment 3, we show that our findings generalize across environments and are independent of the aggregate outcomes of optimal and suboptimal agents.

5 Experiment 3

In the third experiment our goal was to replicate the results of Experiments 1 and 2, controlling for the total number of steps taken by different types of agents. We compared four types of agents: optimal-lucky, suboptimal-lucky, optimal-unlucky and suboptimal-unlucky, and varied the agent's starting location so that optimal and suboptimal agents always took an equal number of steps. The lucky agents took between 7 and 11 steps. The unlucky agents took between 14 and 23 steps. As a result of this manipulation, we expected to see a more interpretable difference between planning-based and outcome-based attributions. In Experiment 3 we also used a different set of MST Search stimuli, to ensure the generalizability of our MST Search results. Finally, we presented MST Search and MST Attribution in a counterbalanced order, to verify the task order effects observed between Experiments 1 and 2.

5.1 Participants

One hundred and thirty-six participants were recruited online via Amazon Mechanical Turk³, restricted to US participants and to participants who had not taken the previous experiments in this series. Of our initial pool, 14 participants were discarded for failing the instruction quiz, and two failed the verbal-response check. The exclusion procedure was identical to the procedure in Experiments 1 and 2. The analysis thus included 120 participants (47 females, 73 males, median age 34, SD = 9.7).

³ We recruited participants until we had 120 who passed the exclusion criterion.

5.2 Method

The experiment was presented on a computer screen in a web browser, using a JavaScript interface developed in our lab. Participants first read a consent page, on which they provided their age and gender and read a short description of the experiment. Following the consent, participants completed MST Search and MST Attribution (detailed below). The task order was counterbalanced between participants. At the end of the experiment participants answered three CRT questions, identical to those in Experiment 1.

5.2.1 MST Search

Participants searched for an exit from a maze, as in Experiment 2.
After reading the instructions (similar to Experiment 2, except that the subjects did not receive a bonus), participants completed three practice trials and answered a comprehension quiz. As before, participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded. At the end of the task participants answered the question: How did you make your decisions?

5.2.2 MST Attribution

This task was similar to the MST Attribution task used in Experiments 1 and 2. As before, after reading the instructions (the same as in Experiments 1 and 2), participants answered a comprehension quiz. Participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded. Participants then viewed and rated three familiarization examples, followed by 24 more trials. At the end of the task participants answered the question: How did you make your decisions?

Figure 14: Individual planning in MST Search plotted against weights placed on planning and on outcome in MST Attribution in Experiment 3. Error bars indicate 95% confidence intervals.

5.3 Stimuli

5.3.1 MST Search

The search stimuli included 12 mazes of varying difficulty, as in Experiment 2.

5.3.2 MST Attribution

Stimuli were 24 videos of agents searching for the exit in a maze. There were four types of agents: optimal-lucky, optimal-unlucky, suboptimal-lucky and suboptimal-unlucky, shown in 6 different mazes similar to the stimuli used in Experiments 1 and 2, while controlling for path length.

5.4 Results

Of the 120 participants, 60 completed MST Attribution first and 60 completed MST Search first.

5.4.1 MST Search

As in Experiment 2, we assessed the quality of participants' planning as described in Section 2.3. We replicated the results of Experiment 2, finding a significant correlation between an individual's quality of planning in MST Search and the weight they placed on planning in MST Attribution (see Fig. 14 and the MST Attribution results below), r = .24, p = .008. However, unlike in Experiment 2, no significant correlation was observed between individuals' quality of planning in MST Search and outcome weights in MST Attribution (p = .3). The correlations between performance in MST Search and outcome weights in MST Attribution were not significant both for subjects who completed MST Search first (p = .35) and for those who completed it second (p = .37). This confirms the finding of Experiment 2 that participants' planning quality predicts their tendency to attribute intelligence based on planning.

5.4.2 MST Attribution

The distribution of ratings for each type of agent is shown in Fig. 15. Each triangle represents a participant's mean rating of four agents of a given type.
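For reference, the planning measure used in Sections 4.4.1 and 5.4.1 z-scores ln(1/τ), where τ is the decision noise inferred from a participant's recorded path (Section 2.3). The sketch below is a simplified stand-in, not the paper's implementation: the softmax choice rule, the precomputed per-step action values, and all variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import zscore

def neg_log_likelihood(tau, action_values, chosen):
    """Negative log likelihood of the observed moves under a softmax choice
    rule over per-step action values (higher value = better move).

    action_values: list of 1-D arrays, one per step, giving the value of each
                   available move at that step (assumed precomputed from the
                   maze model).
    chosen:        list of indices of the moves the participant actually took.
    """
    nll = 0.0
    for values, choice in zip(action_values, chosen):
        logits = np.asarray(values) / tau
        logits -= logits.max()                         # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum())
        nll -= log_probs[choice]
    return nll

def fit_tau(action_values, chosen):
    """Infer the decision-noise parameter tau for one participant's path."""
    result = minimize_scalar(neg_log_likelihood,
                             bounds=(1e-3, 100.0), method="bounded",
                             args=(action_values, chosen))
    return result.x

# Hypothetical usage: one (action_values, chosen) pair per participant,
# pooled over that participant's trials.
# taus = np.array([fit_tau(av, ch) for av, ch in participant_paths])
# planning_values = zscore(np.log(1.0 / taus))        # z-score(ln(1/tau))
```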