Julia Berazneva is a PhD candidate at Cornell’s Dyson School and is currently on the job market.
Anybody who has worked with original data has a long list of associated complaints, ranging from missing observations to outliers to poor or non-existent codebooks. But for the most part, nobody questions whether the interviews underlying the data actually took place…
Well… unless they collected the data themselves. Markus Goldstein discusses this very issue in his blog post this summer, citing both his experiences doing fieldwork in Ghana and a recent paper by Arden Finn and Vimal Ranchhod in the context of South Africa. ‘Faking data’ – ignoring the existence of household members, skipping sections, or even whole interviews – does take place for a variety of reasons. And while detecting cheating is necessary, little can be done after the fact. What can we do as empirical researchers to prevent fake data a priori?
This was my question in 2011 as I was leaving for Kenya. I was on my way to conduct a survey of agricultural households, with the goal of improving our understanding of the availability and management of on-farm natural resources and studying how the current practices of western Kenyan farmers affect soil fertility and agricultural productivity. Given that I could not be present during each interview, how could I make sure that the interviews actually took place? Thinking and talking about common practices with my friends and colleagues gave me an idea: we would audio record all the interviews.
The idea was not without its skeptics. Would small farmers in remote areas agree to have their interviews recorded? Would they change their behavior in the presence of audio recorders? A simple field test resolved these concerns. We explained our reasons for audio recording the interviews and asked the households for permission. Our audio recorders (Olympus VN-8100PC Digital Voice Recorders) were small and resembled mobile phones. The rapid spread of mobile phone technology across Sub-Saharan Africa meant that the great majority of our households were at least basically familiar with electronic communication devices (in our sample, over 83% of households owned mobile phones, with two phones per household on average). And the farmers did not mind – all but one household in our sample of almost 350 consented to the use of audio recorders.
These small devices proved remarkably useful. The audio files served as irrefutable proof of the authenticity of household visits: they showed the date, time, and length of each interview. Listening to them confirmed that the interviews took place with the correct households and that all the questions were asked properly. Combined with our hand-held GPS units, which recorded the exact location and time of each household visit, I had no doubt about the authenticity of each visit. In the end, verifying that the interviews took place was not an issue during my fieldwork – I traveled with my enumerators, met all of the farmers in my sample, and trusted my enumerators to do a good job. But the audio recorders provided many unanticipated benefits!
First of all, we used them as a training tool. During the early field-testing and training period, my fieldwork team and I listened to the audio recordings to refine the survey questions and the response categories of pre-coded questions, agree on the interpretation of each question, offer constructive feedback to each team member, and learn from each other. The recordings helped identify problematic areas of the questionnaire, which in the end led to greater consistency in the data collected. The audio files were also useful for identifying and revising incorrect data – missing, illogical, inconsistent, or outlier values – associated with mistakes in interpreting and recording the respondents' answers. Using the audio files to correct errors prior to data entry also reduced the time needed for data cleaning and ensured that the final data set had no observations with missing or nonsensical answers. Finally, the audio files provided an additional and secure method for data storage, archiving, and sharing.
I discuss these and other benefits and costs of audio recording household surveys in this field report published in the Journal of International Development (and I thank my colleague Brian Dillon for encouraging me to write about it). My fieldwork relied on paper-and-pencil interviewing, but audio recording can also benefit computer-assisted data collection. While computer-aided interviewing may decrease the number of mistakes in data entry, it will not necessarily decrease the mistakes associated with interpreting answers and recording them properly in pre-coded questionnaires. Moreover, the costs are low: our 9-month-long project spent an additional $236 in total on three digital voice recorders, several sets of rechargeable batteries, and a battery charger.
I can think of situations in which audio recording interviews is inappropriate – for example, interviews with politicians, or on sensitive topics such as health or gender roles. But for many projects, the use of audio recording can help collect consistently high-quality data at a low cost. And of course, a rigorous study of the impacts of audio recording on respondents’ behavior and answers is needed for this solution to become widely accepted.
Can anonymity be guaranteed if a voice recording of the respondent is taken?
I think validation mechanisms should exist, but not if it comes at a high cost to anonymity.
Ensuring the anonymity of respondents in an audio-recorded interview follows the same procedures as in a computer-assisted or pencil-and-paper interview that collects identifying information. In our case, we explained the data confidentiality procedures, asked for the households' permission to audio record the interviews, and made sure that the audio files, together with hard copies of the interviews, were accessible only to researchers.
I believe there are research questions that require more sensitive information (e.g., topics related to gender, health, or political affiliation), in which case the use of audio recording may make respondents anxious and uneasy. In such cases, researchers need to use other validation mechanisms and reassure their respondents of absolute confidentiality.