Plans or Outcomes: How do we attribute intelligence to others?
Participants were presented with the following prompt: PLEASE READ THE FOLLOWING INSTRUCTIONS CAREFULLY. Your task is to exit the maze by reaching the red square in as few steps as possible. You can move one square at a time, by clicking on the white squares near your character. Blue squares are walls. You cannot see through the walls, so the squares you cannot see yet are black. The exit is equally likely to be hidden behind any of the black squares. At the end of the experiment we will add all steps you took and show you how you did compared to previous results. You have an opportunity to earn a bonus for completing the mazes in fewer steps.

The instructions included an annotated image to disambiguate the meaning of terms like 'wall' and the reference to the agent character. After reading the instructions, participants completed three practice trials and answered a comprehension quiz. The quiz contained two multiple-choice questions (My task is to ... and My bonus will be bigger if I ...). Participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded.

On each trial participants looked for an 'exit' (marked, when visible, as a bright red circle) in a series of mazes, controlling an agent with the mouse. The agent could move one grid square at a time (N, W, S, or E) and had a 360-degree view of the maze limited by walls. A maze started out dark (except for walls), but was uncovered as the participant opened new areas. Participants initially saw the layout of the rooms and the location of the barrier walls, but did not know where the exit was. Participants received a performance-based bonus of up to $1 for finishing all mazes in the fewest steps. To receive the maximal bonus, participants needed to achieve a step cost within 5% of the optimal solution. The bottom 10% of participants, determined offline after we analyzed the responses, received no bonus. At the end of MST Search, participants answered a question about how they made their decisions. The decision times (times to make a move) and the path taken were recorded for each trial.

4.2.2 MST Attribution

Participants rated the intelligence of maze-solving agents, in a procedure identical to Experiment 1. As before, at the end of the task participants were asked how they made their decisions.

4.3 Stimuli

4.3.1 MST Search

There were 12 stimuli, which are shown in Fig. 8.

Figure 8: Stimuli used in MST Search in Experiment 2. Black cells indicate unseen areas. White cells are empty cells already revealed. Grey cells are walls, and are always visible. The agent can move through empty areas, but not through walls. The starting location was always at the top-left corner. The exit was hidden behind one of the black squares. Solid lines indicate the averaged path taken by the majority of participants; the average paths are calculated by taking a step in each cell that is taken by the majority of participants. The average paths were identical to the optimal solution in all mazes but two; in these two cases the optimal solution was the second most common path. Dashed blue lines indicate the second most common path.

Figure 9: Individuals' planning values in MST Search plotted against MST Attribution weights placed on planning (left) and on outcome (right) in Experiment 2. The better participants planned in MST Search, the more they weighted planning, and the less they weighted outcome in MST Attribution.
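The relationship shown in Figure 9 can be checked with simple Pearson correlations between each participant's planning value from MST Search and the attribution weights fitted to that participant in MST Attribution (both estimated as described in the Results below). The following is a minimal illustrative sketch, not the paper's analysis code; the file name and column names are hypothetical, and the planning values and MELM weights are assumed to have been computed already.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical per-participant summary: one row per participant, with their
# planning value from MST Search (z-scored ln(1/tau), see Section 2.3) and
# their fitted MELM random-slope weights from MST Attribution.
df = pd.read_csv("participant_summary.csv")  # columns: planning, w_planning, w_outcome

# Planning quality vs. weight placed on planning (Fig. 9, left panel).
r_plan, p_plan = pearsonr(df["planning"], df["w_planning"])

# Planning quality vs. weight placed on outcome (Fig. 9, right panel).
r_out, p_out = pearsonr(df["planning"], df["w_outcome"])

print(f"planning vs planning weight: r = {r_plan:.2f}, p = {p_plan:.3g}")
print(f"planning vs outcome weight:  r = {r_out:.2f}, p = {p_out:.3g}")
```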
4.3.2 MST Attribution

MST Attribution consisted of 32 trials, identical to Experiment 1.

4.4 Results

4.4.1 MST Search

We measure a participant's planning ability in the same way as we measure an agent's planning in Experiment 1: we infer the decision noise parameter τ required to generate the participant's path and calculate planning values as z-score(ln(1/τ)) (see Section 2.3). We found a significant correlation between participants' planning in MST Search and the weights they placed on planning in MST Attribution, r = .45, p < .001 (see Fig. 9 and the MST Attribution results below), suggesting that both processes use a common mental planning mechanism. We also found a significant anti-correlation between an individual's planning in MST Search and the weights placed on outcome (r = −.32, p < .001). This suggests that participants who were better at planning in MST Search also placed less weight on outcome in MST Attribution.

To further characterize behavior in general, we calculated the average paths by taking a step in each cell that is taken by the majority of participants. The average paths, and the second most common paths, are shown in Fig. 8. The average paths were identical to the optimal trajectory in all mazes but two; in these two cases the optimal solution was the second most common path.

4.4.2 MST Attribution

The distribution of z-scored ratings for each type of agent is shown in Fig. 10. Each triangle represents a participant's mean rating of four agents of a given type.

Figure 10: Intelligence ratings for each type of agent in Experiment 2, following the same conventions as in Experiment 1.

We replicated the MELM analysis from Experiment 1 to assess the effect of planning and outcome on ratings of complete trials. We constructed a null model including only a random effect of participant, and found that it was significantly improved by modeling random slopes for outcome per participant, χ²(1) = 1433.9, p < .001. The null model was also significantly improved by modeling random slopes for planning per participant, χ²(1) = 1437.2, p < .001. Adding random slopes for planning per participant to the outcome-only model resulted in further significant improvement, χ²(1) = 850.4, p < .001, as did adding random slopes for outcome to the planning-only model, χ²(1) = 847.06, p < .001. Mean rating predictions from each model compared to mean human ratings are shown in Fig. 11. The outcome and planning weights fitted to individuals are shown in Fig. 12.

Figure 11: Mean rating predictions from each model compared to mean human ratings in Experiment 2. Error bars indicate 95% confidence intervals. Agent labels are: OL - optimal-lucky, OF - optimal-fair, OUL - optimal-unlucky, PS-R - pseudo-random, SL - suboptimal-lucky, SUL - suboptimal-unlucky, OPT-part - optimal-part, SUB-part - suboptimal-part. The combined (Both) model is the closest to human attributions.

We also replicated the effect of planning on ratings of incomplete trials. We constructed a MELM null model with only a random intercept for participant, and found that it was significantly improved by adding a random slope for planning, χ²(1) = 41.211, p < .001, meaning that on trials where the outcome was not shown, planning alone still had an effect on attributed intelligence.
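The model comparisons reported above are likelihood-ratio tests between nested mixed-effects models. The sketch below shows the shape of one such comparison; it is illustrative rather than the paper's analysis code: the data file and column names are hypothetical, and the random-effects structure is simplified (it adds both a slope variance and an intercept-slope covariance, so its degrees of freedom differ from the 1-df tests reported in the text).

```python
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per (participant, trial), with a
# z-scored intelligence rating and per-trial planning and outcome predictors.
ratings = pd.read_csv("attribution_trials.csv")  # columns: participant, rating, planning, outcome

# Null model: random intercept for participant only.
null_fit = smf.mixedlm("rating ~ 1", ratings,
                       groups=ratings["participant"]).fit(reml=False)

# Richer model: additionally allow a per-participant slope for planning.
plan_fit = smf.mixedlm("rating ~ 1", ratings,
                       groups=ratings["participant"],
                       re_formula="~planning").fit(reml=False)

# Likelihood-ratio test between the nested models (fit by ML, not REML).
# This simplified structure adds two parameters (slope variance and
# intercept-slope covariance); the paper's MELM comparisons are 1-df tests.
lr = 2 * (plan_fit.llf - null_fit.llf)
dof = 2
print(f"chi2({dof}) = {lr:.1f}, p = {stats.chi2.sf(lr, dof):.3g}")
```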
The correlation between the weights placed on planning on incomplete trials and on all trials was significant (r = .73, p < .0001), meaning that participants' planning-based attributions on all trials can be predicted from how they evaluated incomplete trials.

4.4.3 Task Order Effects

We compared the planning and outcome MELM random slope coefficients of participants in Experiments 1 and 2 (Fig. 13). An independent-samples t-test on the planning coefficients, t(166) = −2.68, p = .008, showed that participants' planning weights were higher in Experiment 2 than in Experiment 1. There was no significant difference between the weights placed on outcome in Experiments 1 and 2: an independent-samples t-test comparing outcome coefficients was not significant, t(166) = 1.9, p = .06.

Figure 12: The MELM random slope coefficients for outcome and planning fitted to individuals in Experiment 2. Error bars indicate bootstrapped 95% confidence intervals. The red line visually separates participants who weight planning more than outcome from those who weight outcome more than planning. Both weights are different from zero for all participants.

4.5 Discussion

In the second experiment we replicated the findings of Experiment 1. We additionally found a significant relationship between performance in MST Search and attributing intelligence based on planning in MST Attribution. Specifically, we found that participants' proximity to the optimal planner in MST Search predicted planning-based attributions of intelligence in MST Attribution. This suggests that intelligence attribution depends on cognitive abilities related to planning, as measured by MST Search and the CRT.

MST Search and MST Attribution both require approximating rational planning. The finding that the quality of planning during search correlates with the quality of planning during evaluation suggests that people may use a similar cognitive mechanism to plan for themselves and to evaluate others. If the two mechanisms were unrelated, we would have seen no such correlation. For example, one could do the task by guessing, but expect others to follow the optimal plan. One could also do the task rationally, and yet evaluate others based on outcome. Our results exclude these possibilities, and suggest that there is a dependence between how well people plan and how strongly they expect good planning from others.

Figure 13: MELM random slope coefficients of individual participants in Experiments 1 and 2. Participants in Experiment 2 (who completed the MST Search) were fitted with higher planning coefficients.

We also found that in the second experiment outcome bias was reduced through an increase in the weights placed on planning in MST Attribution after completing MST Search. At the same time, we observed a marginal difference between the outcome weights placed in Experiments 1 and 2 (p = .06), meaning that the weights placed on outcome may have been reduced as well. This suggests that, at least for some participants, the choice of attribution strategy is flexible, and can be influenced by context. In Experiment 3, we further verify this result by presenting MST Attribution and MST Search in a counterbalanced order.

We note that the reduction in outcome bias in MST Attribution could happen by training the participants' planning skills, in addition to increasing the salience of task constraints.
If so, then MST Search performance could be affected by task order in Experiment 3, assuming MST Attribution and MST Search train one another.

Notably, in Experiments 1 and 2 the effect of optimal planning is consistent with real-world expectations: optimal planning minimizes unlucky outcomes, so that when optimal agents are unlucky, their outcome is on average better compared to suboptimal-unlucky agents. In Experiment 3 we designed a new set of stimuli that equalized the number of steps across optimal and suboptimal agents, to preempt the concern that outcome depends on planning, and to potentially increase the discriminating strength of the design. We achieved this by manipulating the starting location of each agent: suboptimal agents tended to start closer to the goal, so that on average they took the same number of steps as optimal agents and differed only in the quality of their planning. As a result of this manipulation, the optimal agents in Experiment 3 "play with a handicap", which means that the aggregate statistics of the stimuli may differ from what subjects expect in real life. By replicating the results of Experiments 1 and 2 in Experiment 3, we show that our findings generalize across environments and are independent of the aggregate outcomes of optimal and suboptimal agents.

5 Experiment 3

In the third experiment our goal was to replicate the results of Experiments 1 and 2, controlling for the total number of steps taken by different types of agents. We compared four types of agents: optimal-lucky, suboptimal-lucky, optimal-unlucky and suboptimal-unlucky, and varied the agent's starting location so that optimal and suboptimal agents always took an equal number of steps. The lucky agents took between 7 and 11 steps. The unlucky agents took between 14 and 23 steps. As a result of this manipulation, we expected to see a more interpretable difference between planning-based and outcome-based attributions. In Experiment 3 we also used a different set of MST Search stimuli, to ensure the generalizability of our MST Search results. Finally, we presented MST Search and MST Attribution in a counterbalanced order, to verify the task order effects observed between Experiments 1 and 2.

5.1 Participants

One hundred and thirty-six participants were recruited online via Amazon Mechanical Turk³, restricted to US participants and to participants who had not taken the previous experiments in this series. Of our initial pool, 14 participants were discarded for failing the instruction quiz, and two failed the verbal-response check. The exclusion procedure was identical to the procedure in Experiments 1 and 2. The analysis thus included 120 participants (47 females, 73 males, median age 34, SD = 9.7).

³ We recruited participants until we had 120 who passed the exclusion criterion.

5.2 Method

The experiment was presented on a computer screen in a web browser, using a JavaScript interface developed in our lab. Participants first read a consent page, on which they provided their age and gender and read a short description of the experiment. Following the consent, participants completed MST Search and MST Attribution (detailed below). The task order was counterbalanced between participants. At the end of the experiment participants answered three CRT questions, identical to those in Experiment 1.

5.2.1 MST Search

Participants searched for an exit from a maze, as in Experiment 2.
After reading the instructions (similar to Experiment 2, except that the subjects did not receive a bonus), participants completed three practice trials and answered a comprehension quiz. As before, participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded. At the end of the task participants answered the question: How did you make your decisions?

5.2.2 MST Attribution

This task was similar to the MST Attribution task used in Experiments 1 and 2. As before, after reading the instructions (the same as in Experiments 1 and 2), participants answered a comprehension quiz. Participants who answered the comprehension quiz incorrectly proceeded with the experiment, but their responses were discarded. Participants then viewed and rated three familiarization examples, followed by 24 more trials. At the end of the task participants answered the question: How did you make your decisions?

Figure 14: Individual planning in MST Search plotted against weights placed on planning and on outcome in MST Attribution in Experiment 3. Error bars indicate 95% confidence intervals.

5.3 Stimuli

5.3.1 MST Search

The search stimuli included 12 mazes of varying difficulty, as in Experiment 2.

5.3.2 MST Attribution

Stimuli were 24 videos of agents searching for the exit in a maze. There were four types of agents: optimal-lucky, optimal-unlucky, suboptimal-lucky and suboptimal-unlucky, shown in 6 different mazes similar to the stimuli used in Experiments 1 and 2, while controlling for path length.

5.4 Results

Of the 120 participants, 60 completed MST Attribution first and 60 completed MST Search first.

5.4.1 MST Search

As in Experiment 2, we assessed the quality of participants' planning as described in Section 2.3. We replicated the results of Experiment 2, finding a significant correlation between an individual's quality of planning in MST Search and the weight they placed on planning in MST Attribution (see Fig. 14 and the MST Attribution results below), r = .24, p = .008. However, unlike in Experiment 2, no significant correlation was observed between individuals' quality of planning in MST Search and outcome weights in MST Attribution (p = .3). The correlations between performance in MST Search and outcome weights in MST Attribution were not significant both for subjects who completed MST Search first (p = .35) and for those who completed it second (p = .37). This confirms the finding of Experiment 2 that participants' planning quality predicts their tendency to attribute intelligence based on planning.

5.4.2 MST Attribution

The distribution of ratings for each type of agent is shown in Fig. 15. Each triangle represents a participant's mean rating of four agents of a given type.
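For reference, the planning measure used in Sections 4.4.1 and 5.4.1 z-scores ln(1/τ), where τ is the decision noise inferred from a participant's recorded path (Section 2.3). The sketch below is a simplified stand-in, not the paper's implementation: the softmax choice rule, the precomputed per-step action values, and all variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import zscore

def neg_log_likelihood(tau, action_values, chosen):
    """Negative log likelihood of the observed moves under a softmax choice
    rule over per-step action values (higher value = better move).

    action_values: list of 1-D arrays, one per step, giving the value of each
                   available move at that step (assumed precomputed from the
                   maze model).
    chosen:        list of indices of the moves the participant actually took.
    """
    nll = 0.0
    for values, choice in zip(action_values, chosen):
        logits = np.asarray(values) / tau
        logits -= logits.max()                         # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum())
        nll -= log_probs[choice]
    return nll

def fit_tau(action_values, chosen):
    """Infer the decision-noise parameter tau for one participant's path."""
    result = minimize_scalar(neg_log_likelihood,
                             bounds=(1e-3, 100.0), method="bounded",
                             args=(action_values, chosen))
    return result.x

# Hypothetical usage: one (action_values, chosen) pair per participant,
# pooled over that participant's trials.
# taus = np.array([fit_tau(av, ch) for av, ch in participant_paths])
# planning_values = zscore(np.log(1.0 / taus))        # z-score(ln(1/tau))
```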