Amazon Mechanical Turk (MTurk) has become increasingly popular as an online tool for conducting social science research. What are the specific advantages and downsides of using online crowdsourcing tools like MTurk for conducting research? What practical and/or moral dilemmas might emerge in the course of the research process, and what concrete strategies have scientists developed to address them?
Presented as part of the Social Sciences and Data Science event series co-sponsored by the UC Berkeley D-Lab, a panel discussion recorded on October 1, 2021, brought together researchers from diverse disciplines who shared their experiences with the MTurk platform and discussed its social and ethical aspects more generally.
Moderated by Serena Chen, Professor and Chair of Psychology and the Marian E. and Daniel E. Koshland, Jr. Distinguished Chair for Innovative Teaching and Research at UC Berkeley, the panel featured Ali Alkhatib, Interim Director of the Center for Applied Data Ethics at the University of San Francisco; Stefano DellaVigna, Daniel Koshland, Sr. Distinguished Professor of Economics and Professor of Business Administration at UC Berkeley; and Gabriel Lenz, Professor of Political Science at UC Berkeley.
“MTurk has been a huge boon to the social sciences in general, partly because, along with a lot of other online platforms, it has reduced the cost, especially the administrative costs, of running experiments,” Lenz said. “MTurk has lots of issues you all should be aware of. But it’s still been a net positive and helped us understand real-world problems and real-world behaviors.”
Lenz said that researchers should be wary of assuming MTurk provides a representative sample of large populations, though he noted that there may be some predictability in when and how MTurk is not representative, based on what is known about the platform’s “worker” population.
“Demographically, this is not a representative sample of the US population, and you should never treat it that way,” Lenz said. “If you’re hoping to generalize your findings to the US population, don’t. But the argument for it is that it’s a more diverse sample than your typical lab sample.”
Lenz also warned that researchers should be attuned to bias based on “social desirability,” as MTurk survey respondents may not input their honest opinions. And there may also be bias due to workers’ high level of exposure to information about certain topics, such as politics. He recommended using real-world examples, rather than hypotheticals, to encourage more candid responses. “Try to use Mechanical Turk in ways that you know will reflect more on the real world,” Lenz advised. “For example, we always try to ask people about their actual members of Congress when we’re doing studies on voting.”
One of the trade-offs with using a paid survey service such as Mechanical Turk, Lenz noted, is that the more you pay, the more people appear to attempt to cheat or use bots to shortcut the survey process. “You want to pay people more, but you don’t want people trying to do the study many times,” Lenz said. “Everybody struggles with this.”
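One concrete strategy many MTurk researchers use for the repeat-participation problem Lenz describes is a custom “qualification” that screens out workers who have already completed the study. The sketch below, using the AWS boto3 MTurk client in Python, is a minimal illustration of that approach; the qualification name, HIT parameters, and worker IDs are placeholder assumptions, not details from the panel.

```python
import boto3

# Connect to the MTurk requester API. MTurk lives only in us-east-1;
# for testing, pass endpoint_url pointing at the sandbox:
# "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
mturk = boto3.client("mturk", region_name="us-east-1")

# Create a custom qualification marking workers who have already
# participated in this study (name and description are illustrative).
qual = mturk.create_qualification_type(
    Name="completed-study-v1",
    Description="Worker has already completed this study",
    QualificationTypeStatus="Active",
)
qual_id = qual["QualificationType"]["QualificationTypeId"]

# After each batch, tag every worker who submitted an assignment so
# they cannot see or accept the HIT again.
for worker_id in ["A1EXAMPLEWORKER"]:  # placeholder worker IDs
    mturk.associate_qualification_with_worker(
        QualificationTypeId=qual_id,
        WorkerId=worker_id,
        IntegerValue=1,
        SendNotification=False,
    )

# Post the next HIT, requiring that the qualification does NOT exist,
# so previous participants are screened out before they can accept.
hit = mturk.create_hit(
    Title="Short research survey (example)",
    Description="A 5-minute survey (illustrative parameters)",
    Reward="1.00",
    MaxAssignments=100,
    LifetimeInSeconds=86400,
    AssignmentDurationInSeconds=1800,
    Question=open("survey_question.xml").read(),  # placeholder ExternalQuestion XML
    QualificationRequirements=[{
        "QualificationTypeId": qual_id,
        "Comparator": "DoesNotExist",
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    }],
)
print("HIT created:", hit["HIT"]["HITId"])
```

Testing against the sandbox endpoint first, and keeping batch sizes modest, is a common way to balance fair pay against the repeat-participation problem.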
Stefano DellaVigna discussed how MTurk has made it more efficient to replicate studies without a large investment. “It is wonderful to be able to have this quick access to obtain data and evaluate replicability,” DellaVigna said.
He also praised the platform for enabling research during the pandemic, and for allowing graduate students to conduct small-scale studies to gather initial results; he shared an anecdote about a PhD student who came up with a question and ran a study on MTurk in a matter of hours. “It is so empowering and lowers inequality in access to study samples,” he said.
In his talk, Ali Alkhatib from the Center for Applied Data Ethics explained that he is less a user of MTurk than a researcher focused on understanding the workers behind the platform. “I have been studying the crowd workers themselves, and what they are experiencing as they engage with these platforms,” Alkhatib explained.
He noted that researchers should keep in mind the circumstances of the workers on MTurk and similar platforms, who often are struggling to make a living. If the workers are in communication with each other, he added, it may be because “they’re not trying to game the system; they’re just trying to not get stiffed. These workers are highly networked and talking with each other and trying to exchange notes.”
He also explained that researchers should work to build trust in the MTurk community, and gain an understanding of how the platform works before diving in. “Mechanical Turk is very much a community, very much a culture,” he said. “Think of this as a relationship that you try to foster and build and nurture, because these are people, and as much as we would like to think that they pass through and are stateless, the reality is that they are human beings who are just as affected by the research and the treatments that we bring to them as anybody else.”
Alkhatib said that researchers should be “as clear as possible” and “as communicative as possible,” while also trying to be “as humane as possible to the people that we’re working with. It also leads you to a much richer sort of understanding of why you get certain findings or why things don’t necessarily add up.”
“Mechanical Turk is not a panacea,” Alkhatib said. “It doesn’t solve all the problems, but it solves some of them, or it may ameliorate some of them. But we do need to be conscious of how it shifts other problems around as well.”