How to build impact evaluation capacity
To leverage the benefits of impact evaluations, governments need to build their capacity (expertise, data, and funding) to conduct them effectively. Building this capacity can be challenging, even for officials who understand the importance of these studies. But government officials can choose among several approaches to find those that best fit their available resources. These include: developing internal staff; building partnerships with external research entities; making better use of administrative data systems so that researchers can use existing data to conduct impact evaluations; and aligning data policies and funding to support evaluations.
Hire or train staff to facilitate impact evaluations
To complete impact evaluations effectively and more frequently, governments can hire research staff with the requisite skills to conduct these studies, or they can hire new staff or train existing staff to manage impact evaluation contracts. Governments can develop staff capacity in individual agencies and offices, as well as in centralized offices such as the office of a mayor, county commissioner, governor, or legislature.
Some states have hired staff with the requisite skills to conduct impact studies. For example, the Washington Department of Social and Health Services’ Research and Data Analysis (RDA) office has about 70 full-time employees who perform a range of analytical and data management tasks, including impact evaluations. About 70 percent of these staff are funded through time-limited grants and projects, and 30 percent are supported by legislative appropriation. RDA has evaluated innovative pilot programs as well as long-standing projects that had not previously been studied. These evaluations have significantly affected policy and program decisions in the state, such as a 2009 study of a chronic care management practice that led to a scaling up of two of the state’s health projects.11
While Results First researchers found few offices in the 50 states, other than RDA, that regularly conduct impact evaluations, they did identify several offices that conduct other types of evaluations. For example, some legislative audit and research offices, as well as some agency research divisions, perform outcome evaluations that aim to measure the results of state programs or policies but are unable to control for external factors likely to influence those outcomes. Policymakers could work with these offices to identify opportunities to conduct impact evaluations, particularly on high-dollar programs or programs being considered for expansion.
Even where governments contract out impact studies to universities or external evaluation firms, a viable option for jurisdictions with limited resources, Results First researchers found that maintaining a small number of staff knowledgeable in evaluation has substantial benefits. Such staff can, for instance, help select external evaluators, manage evaluation contracts, collaborate on choosing study designs, and assist with data access. These staff may also have a deeper understanding of the data and issues relevant to a program than external evaluators do, and can help facilitate communication and knowledge transfer between external evaluators and program staff.12 Michael Martinez-Schiferl, research and evaluation supervisor at the Colorado Department of Human Services, noted, “Program knowledge is very important to the research aspect. Having internal research staff embedded with program staff promotes collaboration on research and provides opportunities for research staff to develop some program expertise, as there are so many nuances about the program that they couldn’t understand from an outside perspective.”13
NYC Opportunity contracts out its evaluations while maintaining staff to oversee the work. The staff members manage the contracts of the independent evaluation firms, oversee impact studies of the anti-poverty programs and key mayoral initiatives, and work in partnership with city agencies to use the evaluation results to inform decisions to expand, improve, or discontinue programs.14 NYC Opportunity has a dedicated fund from the city to support its staff and external evaluations, but also seeks funding from federal grants and philanthropy to supplement this work.15
In the past 10 years, the organization has launched more than 70 programs, most of which have undergone an evaluation and some of which have become national models for success.16 One example is the City University of New York’s Accelerated Study in Associate Programs, which provides extensive financial, academic, and personal support to working adults who are pursuing an associate’s degree. The program’s first impact evaluation, done in partnership with a research organization, showed promising early results on academic outcomes, including lower drop-out rates and higher total credits accumulated, and a subsequent study demonstrated increased graduation rates.17 Following these successful evaluations, the program is expanding from 4,000 students in fiscal 2014 to 25,000 students in fiscal 2019, and has been replicated in other parts of the country.18
Build partnerships with external research entities to leverage expertise
Governments can help fill evaluation capacity gaps by creating long-term partnerships with external research entities, such as local universities, to perform an evaluation in its entirety or provide technical assistance or training. For example, Brian Clapier, former associate commissioner at the City of New York’s Administration for Children’s Services (ACS), attributed some of his office’s most successful evaluations to partnerships with universities.19
“Based on my experience the research-to-policy gap is a real challenge. One key strategy is to bring in the research partners (often from universities), and have these partners conduct the evaluations. Once the evaluation has occurred, strategically placed public agency staff can bridge the findings of the evaluation to program staff responsible for the policy development.”
This approach proved successful when, in 2012, ACS contracted with Chapin Hall at the University of Chicago to implement and test a pilot child welfare program to promote healthy families and child well-being. The study found that participants in two of the program’s interventions, KEEP (Keeping Foster and Kin Parents Supported and Trained) and Parenting Through Change, were 11 percent more likely than those in a comparison group to achieve permanency.20 Based on the results, ACS continued funding both programs.
Results First researchers identified several government-university partnerships. Some perform impact evaluations for programs spanning a range of policy areas, while others evaluate the same program over a period of time. For instance, state and county agencies in Wisconsin frequently partner with the University of Wisconsin Population Health Institute to perform evaluations on a range of human services programs, including behavioral health, child welfare, juvenile justice, and health.21 The New York State Office of Children and Family Services, on the other hand, has partnered with the Center for Human Services Research at the University at Albany since 1995 to perform multiple evaluations, including impact studies, of one child welfare program, Healthy Families New York.22
Several jurisdictions have joined forces with policy labs—typically housed in universities—to help them design and test the effectiveness of government programs. For example, the Colorado Evaluation and Action Lab is a new government-research partnership that will help state officials to evaluate public programs and policies.23
Government leaders also partner with individual researchers, rather than research organizations, when those researchers’ interests and skills align with a particular policy or program they want to evaluate. South Carolina’s Department of Health and Human Services, with the help of the Abdul Latif Jameel Poverty Action Lab (J-PAL), a research center at the Massachusetts Institute of Technology, paired up with health-focused researchers at Northwestern University to develop a randomized evaluation of assignment to Medicaid managed care plans.24
Policymakers should pair these research partners with trained government staff in the offices that operate (or oversee providers who operate) the programs being evaluated.
Make better use of existing administrative data systems to reduce impact evaluation costs
Most governments maintain reporting systems that collect administrative data, such as criminal arrest or education records, which could be used to help conduct impact evaluations at a lower cost than collecting this information from scratch.25 For example, New Mexico’s Department of Corrections conducted a quasi-experimental design (QED) study of a substance use disorder program using administrative data from three state correctional offices. With a small evaluation budget, the department was able to successfully answer policymakers’ questions about whether a program affected recidivism or substance use disorder relapse rates.26
Some states, such as Washington and South Carolina, have built sophisticated data warehouses that link data across multiple agencies and can be used for performing evaluations. For example, the South Carolina Revenue and Fiscal Affairs Office (RFA) operates an integrated data warehouse that receives copies of agency databases used for program administration and research.27 While the originating agencies maintain control over the use of the data, the RFA provides guidance on sharing and usage agreements to help researchers access the information to evaluate programs.28 A new Pay for Success project focused on expanding an evidence-based family support and coaching program in the state will use data from the warehouse to evaluate the program’s impact and calculate its return on investment.29
Washington state’s RDA also uses administrative data for most of its impact evaluations, which enables the agency to conduct more frequent evaluations on a wide range of programs. “We can knock out a quasi-experimental evaluation, assuming there’s no new data to collect, in a relatively short time and at a fraction of the cost of doing an external evaluation,” said health economist David Mancuso.30
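To make the idea of a low-cost quasi-experimental comparison concrete, the following is a minimal sketch in Python. It is not the method used by RDA or the New Mexico department; it assumes two hypothetical administrative data sets (program enrollment records and outcome records from another agency) that can be linked by a shared person identifier, and it compares outcome rates for participants and non-participants within the same risk group. All field names, records, and the `stratified_effect` function are invented for illustration.

```python
from collections import defaultdict

# Hypothetical enrollment records from a program agency (invented data).
enrollment = [
    {"person_id": 1, "participant": True, "risk": "high"},
    {"person_id": 2, "participant": True, "risk": "high"},
    {"person_id": 3, "participant": False, "risk": "high"},
    {"person_id": 4, "participant": False, "risk": "high"},
    {"person_id": 5, "participant": True, "risk": "low"},
    {"person_id": 6, "participant": True, "risk": "low"},
    {"person_id": 7, "participant": False, "risk": "low"},
    {"person_id": 8, "participant": False, "risk": "low"},
    {"person_id": 9, "participant": True, "risk": "high"},  # no outcome record
]

# Hypothetical outcome records from a second agency:
# person_id -> 1 if the person was rearrested, 0 otherwise.
outcomes = {1: 0, 2: 1, 3: 1, 4: 1, 5: 0, 6: 0, 7: 0, 8: 1}


def stratified_effect(enrollment, outcomes, stratum_key):
    """Difference in outcome rates (participants minus comparison group),
    computed within each stratum and averaged using the number of linked
    participants in each stratum as weights."""
    groups = defaultdict(lambda: {True: [], False: []})
    for record in enrollment:
        person_id = record["person_id"]
        if person_id not in outcomes:
            continue  # records that cannot be linked are left out
        groups[record[stratum_key]][record["participant"]].append(
            outcomes[person_id]
        )

    weighted_sum = 0.0
    total_participants = 0
    for group in groups.values():
        treated, comparison = group[True], group[False]
        if not treated or not comparison:
            continue  # no within-stratum comparison is possible
        difference = (
            sum(treated) / len(treated) - sum(comparison) / len(comparison)
        )
        weighted_sum += difference * len(treated)
        total_participants += len(treated)

    if total_participants == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total_participants


effect = stratified_effect(enrollment, outcomes, "risk")
print(f"Estimated change in rearrest rate: {effect:+.2f}")
```

A real study would need far more records, a defensible comparison group, and controls for other differences between the groups; the sketch only shows why linked administrative data can stand in for new data collection.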
Both federal and private entities are creating funding opportunities to help state and local governments use administrative data to conduct low-cost randomized controlled trials (RCTs). (See Appendix A for a list of funding sources.)
Align data policies and funding to support evaluation
Two key obstacles to conducting impact evaluations are accessing the data necessary for the study and finding the resources to fund it. Yet state and county officials have found creative ways to mitigate these seemingly formidable challenges.
To generate new evidence on what works, researchers need access to government data, but service providers and government agencies may be hesitant to share data due to privacy issues or concerns over how the data might be used. Government leaders can alleviate these sensitivities and facilitate information access by developing sharing and usage agreements that outline the purpose of the data sharing, how it will be shared with the public, and privacy protections.31
For example, the Colorado Department of Education established a data-sharing and usage agreement with Mathematica Policy Research to allow Mathematica to access the department’s administrative data to conduct an impact evaluation of a new charter school program’s effects on education achievement.32 The agreement outlines the types of data to be shared, as well as the responsibility of the requestor to use the data only for the purposes outlined in the agreement (in other words, the impact study), to secure and later destroy the shared data, and to share analyses with the department prior to publication. Because of the shared data, Mathematica could perform an impact evaluation that showed state officials that taxpayer investments in the program had positive impacts on reading and math skills at the elementary, middle school, and high school grade levels.33
States and localities can finance impact evaluations by setting aside internal funding for the studies, allowing governments to select programs in most need of an evaluation rather than being subject to the priorities of external funders. Results First researchers identified several ways governments are setting aside funding for rigorous evaluations, including through legislation, grants, and budget allocations.
For example, California passed a law in 2014 that sets aside funds to award contracts to recipients who agree to partner with an independent evaluator to assess the effectiveness of programs funded through the contracts.34 Three counties received $1.25 million to $2 million in 2016 to implement and evaluate selected social services programs.35 Though the law does not require recipients to conduct an impact evaluation, Alameda County is performing an RCT of a life coaching and mentoring services program aimed at reducing recidivism and increasing employment.36
Some state and local governments dedicate funds to support staff who oversee or conduct impact evaluations. Washington state’s RDA receives approximately 30 percent of its funding from a legislative appropriation that includes support for research staff who manage evaluations.37 RDA supplements this funding with federal grants, including Medicaid and a Substance Abuse and Mental Health Block Grant.
Even when state and local governments build impact evaluation staff or set aside funds to support these studies, additional federal and private funding can help fill remaining capacity and funding gaps.
The federal government has provided several competitive and formula grant opportunities. For example, the Institute of Education Sciences released a request for applications in 2017 for low-cost RCTs or QEDs of education interventions.38 The U.S. Department of Health and Human Services has also provided grants that included funding to evaluate child welfare and teen pregnancy interventions.39 While these opportunities provide substantial support for impact evaluations, they should not be a substitute for using existing government resources to support this work; many are one-time grants that limit support to one study, and some target programs or policies that might not be an area of need in a particular jurisdiction.
Even jurisdictions that have never completed an impact evaluation have opportunities to start building this capacity through external sources. For instance, in 2016 J-PAL launched the State and Local Innovation Initiative to help jurisdictions perform randomized studies of social programs,40 with eight jurisdictions chosen to participate in the first two rounds.41 In addition to funding, each will receive technical support and custom trainings to expand its internal capacity to create, use, and share rigorous evidence. (See Appendix A for more information on funding opportunities.)
How to select programs for an impact evaluation
While state and local governments have demonstrated the value of impact studies to assess programmatic investments, it is not practical (or even necessary) to rigorously evaluate every program. Decision-makers can identify and prioritize which programs to study by considering four key questions:
- Does the program have an evidence base? To identify programs that could benefit from an impact evaluation, governments can inventory the programs they fund and determine which ones have evidence supporting their effectiveness. National research clearinghouses—which review and aggregate impact evaluations in order to rate programs by their level of evidence of effectiveness—can help determine if local programs have existing evidence. Governments can prioritize evaluations for programs that do not have strong evidence of their effectiveness.
- Has the program been properly implemented? Poorly implemented programs are less likely to achieve the outcomes that leaders and residents expect, which would impair the results of an impact assessment.
- Are the right data available for an impact study? To conduct an impact study, evaluators need access to the right kinds of data. If the data are owned by other parties (e.g., another agency or program provider) or do not exist, governments should consider the feasibility of getting the data, which could entail developing data-sharing agreements or spending additional funds and time to collect new data.
- Does the program serve a significant number of people and/or is it a large budget item? Programs that serve more clients and/or cost more typically have a larger impact on a government’s budget than those that serve fewer clients or cost less, and may be more attractive options for an impact evaluation.
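The four questions above can be sketched as a simple screening step over a program inventory. This is a minimal illustration in Python, not a tool described in the report; the `Program` fields, the sample programs, and the client and budget thresholds are all hypothetical, and a real jurisdiction would set its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    has_strong_evidence: bool  # e.g., rated by a research clearinghouse
    well_implemented: bool
    data_available: bool
    clients_served: int
    annual_budget: float


def evaluation_candidates(programs, min_clients=500, min_budget=1_000_000):
    """Return programs that lack strong evidence, are well implemented,
    have usable data, and are large by clients or budget, ordered from
    largest to smallest budget. Thresholds are illustrative assumptions."""
    ready = [
        p
        for p in programs
        if not p.has_strong_evidence
        and p.well_implemented
        and p.data_available
        and (p.clients_served >= min_clients or p.annual_budget >= min_budget)
    ]
    return sorted(ready, key=lambda p: p.annual_budget, reverse=True)


# Invented inventory for illustration only.
inventory = [
    Program("Job coaching", False, True, True, 1200, 2_500_000),
    Program("Reentry mentoring", False, True, True, 300, 4_000_000),
    Program("Home visiting", True, True, True, 5000, 6_000_000),
    Program("Youth tutoring", False, False, True, 900, 1_500_000),
    Program("Small pilot", False, True, True, 40, 50_000),
]

candidates = evaluation_candidates(inventory)
print([p.name for p in candidates])
```

Programs that are screened out here are not dropped from consideration; they fall into the group that can be monitored through outcome tracking and implementation reviews until they are ready for an impact evaluation.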
Decision-makers may find that some of their untested programs are not good candidates for an impact evaluation. In that case, they can take other steps to ensure these programs are generating positive results, such as tracking outcomes of participants and reviewing implementation to identify any issues with operation and delivery. Decision-makers can review these programs again at a later time to determine if they have become evaluation-ready.
Policymakers care about funding what works, and impact evaluations are an important tool that can be used to inform better, data-driven decisions. Impact evaluations provide critical information on program effectiveness, which policymakers can consider when making decisions about when to scale up, scale back, or make adjustments to a particular program or initiative.
By building their jurisdiction’s capacity to evaluate untested programs, policymakers can hold themselves accountable to the public and ensure that public dollars are directed to those programs that yield positive results. While challenges still exist for governments seeking to regularly evaluate their programs, new technology and opportunities to leverage impact evaluation expertise through partnerships or grants have made it more feasible than ever for state and local governments to conduct rigorous evaluations of local programs. By carefully prioritizing which programs are ripe for impact evaluations, governments can make the most of their resources and fill in gaps in knowledge about which programs are working and which are not.