Granada – Playing IMGAMEs with Groups

The IMGAME team went to Granada, Spain, in early November 2013 to play an Imitation Game on gender. We wanted to find out how well females were able to pretend to be males and males were able to pretend to be females.

Given the potential abundance of participants, we also decided to experiment with a different set-up for Step1. Normally, participants play the three roles – Judge, Pretender and Non-Pretender – on their own. Informed by an interesting paper on expert performance, we wanted to test whether playing in groups had an effect on the Judge questions and Non-Pretender answers that are retained after S1 (see here).

A comparative approach was chosen to find this out. Instead of the usual two Step1 sessions, we played three separate sessions: two ‘individual’ sessions and one ‘group’ session. A total of 40 participants (20 males and 20 females) were involved in the two ‘individual’ sessions, whereas a total of 80 people (40 males and 40 females) participated in the group session. In each of the three sessions, 20 Imitation Games were played, which gave us a total of 40 ‘individual’ Imitation Games and 20 ‘group’ Imitation Games. No ‘special’ arrangements were made for Step2 and Step4, which meant that neither Pretenders (S2) nor Judges (S4) knew whether they were dealing with questions (S2) or dialogue sets (S4) generated by individuals or by groups.

Completing the IMGAME within 5 days was a challenge as it was one of the largest IMGAMEs played to date – we had 120 participants at S1, 540 at S2 and 130 at S4 – but the results are very promising indeed. As the table below shows, the two group games produce the lowest Pass Rates when compared to the respective individual games. The ‘group effect’ is especially pronounced in the IG on ‘Femaleness’, i.e. the game where males pretend to be females.

[Table: group game]

Another way of highlighting the group effect is to compare the Pass Rates that all the Dialogue Sets (DS), whether generated during individual or group play during Step1, produce:
[Chart: group game II]
The chart above shows the results of Dialogue Sets for individual games in grey and for group games in black (the lower the bar, the lower the Pass Rate). The aggregated Pass Rate for individual games is also shown in grey, while the group game Pass Rate is shown in black. As can be seen, group-based DSs produce much lower Pass Rates on average than individual-based DSs, and the resulting mean Pass Rate for groups is strikingly lower than for individuals.
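The comparison behind such a chart can be sketched in a few lines of Python. This is only an illustration: the numbers are made-up placeholders rather than the Granada results, the field names and the helper `mean_pass_rate` are hypothetical, and an unweighted mean over Dialogue Sets is an assumption about how the aggregated Pass Rate is formed.

```python
from statistics import mean

# Hypothetical placeholder values, NOT the Granada results: each Dialogue Set
# carries the Pass Rate it produced, tagged by how it was generated at Step1.
dialogue_sets = [
    {"id": "DS1", "mode": "individual", "pass_rate": 0.50},
    {"id": "DS2", "mode": "individual", "pass_rate": 0.30},
    {"id": "DS3", "mode": "group", "pass_rate": 0.25},
    {"id": "DS4", "mode": "group", "pass_rate": 0.15},
]


def mean_pass_rate(sets, mode):
    """Unweighted mean Pass Rate over all Dialogue Sets generated in `mode`."""
    rates = [ds["pass_rate"] for ds in sets if ds["mode"] == mode]
    if not rates:
        raise ValueError(f"no Dialogue Sets for mode {mode!r}")
    return mean(rates)


individual = mean_pass_rate(dialogue_sets, "individual")
group = mean_pass_rate(dialogue_sets, "group")
print(f"individual: {individual:.2f}, group: {group:.2f}, "
      f"difference: {individual - group:.2f}")
```

A positive difference in this sketch corresponds to the pattern described above, with group-based Dialogue Sets producing the lower mean Pass Rate.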

At this moment – the Granada game is the only one so far in which the group method has been tested – we can only speculate as to why there is a group effect.
One reason could be that putting S1 players in groups enhances their ability to ask challenging questions and provide better answers because they can combine their different experiences. This is quite similar to a pub quiz, where it is good to have a team whose members have different interests so that they can contribute to questions on a variety of topics. Playing in groups might also have a motivating effect, as participants might enjoy playing the game more (as was evident in Granada, where the atmosphere was very ‘lively’ during the group game session compared to the two individual game sessions) and spur each other on. These effects ought to reduce some of the ‘noise’ in the experiment, which enters when S1 Judges ask questions that are not challenging or when S1 Non-Pretenders are not able to answer questions competently. Reduced noise in these elements of the game should lead to reduced Pass Rates. Moreover, while questions and Non-Pretender answers are generated by groups at Step1, Pretenders in Step2 are on their own, which makes their task considerably harder and thus might also contribute to the lower Pass Rates.

Further qualitative analysis of the Step1 data from Granada is needed to find out more about the reasons for the drop in Pass Rates. Such analysis might also shed light on another aspect of the results presented above: the Pass Rates in the ‘Femaleness’ game are generally lower than in the ‘Maleness’ game, which suggests that it is harder for males to pretend to be females than the other way around.


