The status quo around survey design needs to change. Too often, data analysis is an afterthought to learning and evaluation processes; survey designs are disconnected from the end goals of analysis. Data analysis is not considered until the data is compiled, and someone who has been completely disconnected from the planning and design of the project is suddenly made responsible for its analysis.
This needs to change: improving the capacity of program staff to appreciate data analysis and to design better surveys improves our ability to answer key questions about our work.
This blog will delve into some common problems of survey design and data, and how correcting these issues has the power to greatly improve the data analysis process for peace and development. Yes, leading questions, unbalanced response scales, and other fundamental mistakes are still major problems in survey design. However, those issues have been covered elsewhere (see, for example, http://www.surveysystem.com/sdesign.htm).
Let’s dig into Timeframes and Recall Bias, Rare Events, and Demographic Information.
Common Problem 1: Timeframes, recall and project cycles
Survey questions with a recall component can be difficult to frame. Two issues stand out: recall bias and mismatched project timelines.
A question about ‘non-fundamental’ events over a long time frame will introduce recall bias. Consider two survey questions acting as ‘proxies’ for social cohesion: 1) How many times have you visited another community member’s house in the past year? and 2) How many weddings have you attended in your community in the past year? Asking about the number of weddings over the last year will be more useful than asking about the number of visits, because weddings are significantly easier to remember than visits.

This matters because ‘random’ noise in your data averages out as the sample size grows, but systematic error does not. If a significant number of your respondents cannot accurately recall the number of ‘visits’ and opt not to respond, this non-response bias can have a substantial impact, especially when your sample is relatively modest. More serious still, in a treatment-control study, if your treatment group consistently over- or under-reports on key variables relative to the control group, your study will be fraught with reporting bias and its conclusions will be faulty.
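To make this concrete, here is a small simulation with entirely made-up numbers (the population, visit counts, and response probabilities are all hypothetical) showing how non-response that is correlated with the quantity being measured biases the estimate in a way that a larger sample does not fix:

```python
import random

random.seed(42)

# Hypothetical population: number of neighborly visits in the past year.
population = [random.randint(0, 50) for _ in range(100_000)]
true_mean = sum(population) / len(population)

# Assume frequent visitors recall their visits more readily, so
# respondents with few visits are more likely to skip the question.
responses = [v for v in population if random.random() < 0.3 + 0.01 * v]
observed_mean = sum(responses) / len(responses)

print(f"true mean visits:     {true_mean:.1f}")
print(f"observed mean visits: {observed_mean:.1f}")  # biased upward
```

Because the probability of answering rises with the number of visits, the observed mean overstates the true mean, and collecting more responses under the same mechanism leaves the bias intact.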
Additionally, existing surveys should not be recycled without taking into account the specific requirements of a project. A survey for a project of six months will examine changes over a different timespan than a survey for a project of two years. We need to ensure that we are capturing changes from pre- to post-project; if we ask beneficiaries from the six-month project to reflect on changes that might have happened outside of the project’s timespan, then that data is not relevant to the project. This does not mean that we shouldn’t lean on earlier survey development work, but rather that we need to tailor questions to the specific requirements of the project.
For any timeframe, as survey designers we need to put ourselves in the respondents’ shoes. Would you be able to remember how often you interacted with neighbors in the last six months, or how often you bought a typical consumption good? Add in potential illiteracy or low levels of education and you realize that you need to be careful when asking respondents to recall their past.
Common Problem 2: ‘Rare events’ and variability
Peacebuilders generally operate in the ‘rare events’ space. We might want to measure whether our programming has had an effect on community violence or individual respondents’ violent behaviors. The problem arises when we develop questions about events so rare that we cannot draw statistically meaningful conclusions. For example, consider a community where 5% of the population has contributed to violence. Even in the (highly) unrealistic scenario where all perpetrators answer truthfully to questions about their involvement, the overall sample might only consist of 300 individuals, and the number of ‘positive’ cases will be a mere 15 individuals. In this scenario, there is too little statistical power to draw any conclusions.
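A rough power calculation illustrates the problem. The sketch below is an assumption-laden illustration, not a substitute for a proper design: it supposes a two-arm comparison (say, treatment vs. control communities) where programming halves violent behavior from 5% to 2.5%, and uses the normal approximation for a two-proportion test.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_power(p1: float, p2: float, n_per_group: int,
                   alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation, no continuity correction)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return z.cdf(abs(p1 - p2) / se - z_crit)

# Hypothetical effect: violence drops from 5% to 2.5%.
print(f"power with 150 per arm:   {two_prop_power(0.05, 0.025, 150):.2f}")
print(f"power with 1,500 per arm: {two_prop_power(0.05, 0.025, 1500):.2f}")
```

With 150 respondents per arm (300 total), the power to detect even a halving of violence comes out around 0.2, far below the conventional 0.8 threshold; the sample would need to grow roughly tenfold before such an effect could be detected reliably.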
While the raw count in this example could be interesting in its own right, it is important to think through what the purpose of each and every question is and what type of information we can realistically extract from a variable or set of variables. If the goal in the above scenario is to start unwrapping the ‘characteristics of violence’ or similar lofty goals, either the sample size needs to increase significantly or the question needs to be rephrased (perhaps avoiding behavioral questions altogether in the scenario outlined above).
A related issue with regard to low variability is respondents’ unwillingness to express their true preferences or behavior when asked about sensitive topics. In the example above, where respondents are asked about their history of violent behavior, a very small percentage of actual perpetrators will admit their involvement to an external enumerator. If we want to build a robust, representative measure of actual violent behavior (or support for violent groups, or similar), new techniques that are rapidly being tested and improved need to be adopted (see some of these techniques discussed in this paper by Bryn Rosenfeld, Kosuke Imai, and Jacob Shapiro).
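One such indirect technique is the list experiment (also called the item-count method): a control group reports how many of several innocuous items apply to them, a treatment group sees the same list plus the sensitive item, and the difference in mean counts estimates the sensitive behavior’s prevalence without any individual admitting anything. A minimal sketch of the estimator, using made-up counts rather than real survey data:

```python
from statistics import mean

# Hypothetical item counts from a list experiment.
# Control list: 4 innocuous items. Treatment list: same 4 + sensitive item.
control_counts = [2, 1, 3, 2, 2, 1, 3, 2, 1, 2]
treatment_counts = [2, 3, 2, 2, 3, 1, 3, 2, 2, 3]

# Difference-in-means estimate of the sensitive item's prevalence.
prevalence = mean(treatment_counts) - mean(control_counts)
print(f"estimated prevalence: {prevalence:.2f}")  # → 0.40
```

The design choice here is that no respondent ever reveals which items apply to them, only a total count, which is what makes the question safer to answer honestly; the cost is a noisier estimate, so list experiments typically require larger samples than direct questions.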
Common Problem 3: Demographic Information
Without certain demographic data, large swathes of information cannot be fully utilized. It is pivotal to get a sense of which groups an intervention is reaching and for which groups an intervention is falling flat. Confidentiality concerns should be taken seriously, and these concerns naturally increase when the sample is small and respondents worry that their answers can be traced back to them personally. However, that should not prevent us from asking about essential information, but rather drive us to create strong processes to uphold anonymity and to communicate those processes clearly to our respondents. In my experience, most subjects are more willing to answer personal questions than is assumed a priori. In cases where that isn’t true, subjects are rarely going to get highly offended and storm out; they will instead make it clear that they are not comfortable answering the question, and the survey team can swiftly move on.
There are promising signs for strengthening data analysis in the international development and peacebuilding fields. The link between academia and practitioners is stronger now than just a couple of years ago. The United States Institute of Peace recently hosted a two-day conference on countering violent extremism (CVE) titled ‘Evidence for CVE: Advancing Community-Based Approaches’, with the explicit aim of bringing practitioners and academics together to develop a plan for establishing and supporting a network of place-based researchers who will help plug gaps in our understanding of violent extremism around the world. This and similar initiatives contribute to building stronger evidence about what works and what doesn’t, and to building capacity and understanding in the practitioner community about how all steps of a research or evaluation process are interrelated. Attention to survey design has a profound impact on our ability to learn from our programs. Developing the data analysis plan prior to survey finalization, and including the data analyst at every stage of the process, will improve our ability to learn from the data we collect.
Ruben Grangaard is a senior research analyst supporting the Learning and Evaluation Team at USIP. He joined USIP after a year and a half as an independent consultant and on-site research analyst with Mercy Corps.
Ruben has experience developing research designs, survey designs and conducting data analysis for a broad range of evaluations. Moreover, he has field research experience from India and Bangladesh. Among his research interests are economic development as a tool for peace and drivers of youth violence. His current focus is on how we can better measure ‘effectiveness’ of peacebuilding programs.
Ruben holds an M.A. in International and Development Economics from University of San Francisco and a B.A. in Economics from University of Queensland, Australia.