Introduction
The National Institutes of Health’s Institutional Development Award (IDeA) program was established to build capacity and enhance research in states identified as having historically low levels of NIH funding. Twenty-three states and Puerto Rico are eligible to pursue competitive biomedical infrastructure support grants funded through the National Institute of General Medical Sciences (NIGMS). IDeA supports Centers of Biomedical Research Excellence, IDeA Networks of Biomedical Research Excellence, and, beginning in 2012, IDeA-Clinical and Translational Research (IDeA-CTR) programs. IDeA-CTRs (referred to hereafter as CTRs) are particularly focused on building statewide and regional capacity to conduct biomedical research.
As of 2024, there are 14 active CTR grants. All CTRs have a set of required program “cores” that provide the required component activities of these grants. A Tracking and Evaluation component (TEVAL) is required, either as a standalone core or embedded within the Administrative Core. TEVAL units are responsible for internal support of the CTR, collecting data to facilitate continuous improvement and measure impact, and for external compliance with NIGMS reporting requirements. For more than a decade, CTR evaluators have not only provided data to report on and inform their own CTR’s functioning, but have also collaborated with other translational research evaluators to share approaches, instruments, and best practices. The National CTR Evaluators Network facilitates connections and collaborations among CTR evaluators across the nation. The Network meets quarterly to collectively tackle challenges evaluators face, share innovative approaches used by translational research evaluators, and plan cross-CTR efforts and dissemination of lessons learned. In addition, the American Evaluation Association’s Translational Research Evaluation Topical Interest Group has been instrumental in connecting evaluators, particularly CTR and CTSA (Clinical and Translational Science Award) evaluators. NIH funding of both CTRs and CTSAs has enabled the evaluation field to advance its understanding, approaches, and tools for evaluating capacity-building activities, infrastructure change programs, and multi-institutional systemic change initiatives.
We share findings from a collaboration among the Delaware ACCEL CTR, the Rhode Island Advance CTR, and the Nebraska-led Great Plains CTR focused on understanding the practices CTR evaluators use to improve the quality of their own evaluation work and the challenges they identify in those efforts. Specifically, the collaboration studied how CTR evaluators evaluate their own evaluations, a practice known as meta-evaluation, and how meta-evaluation informs their evaluation practice. In addition, we investigated the challenges CTR evaluators face in their evaluation planning and implementation.
“Meta-evaluation,” coined by Scriven [Reference Scriven1,Reference Scriven2,Reference Scriven3], refers to any effort to use evaluation methods to improve and/or ensure the quality of an evaluation. Stufflebeam [Reference Stufflebeam4] elaborated and systematized many aspects of meta-evaluation in his ground-breaking work, promulgating program evaluation standards widely employed for meta-evaluation [Reference Yarbrough, Shulha, Hopson and Caruthers5]. Stufflebeam’s work also prefigured more recent formative approaches to meta-evaluation with attention to an array of potential problems for evaluations to anticipate, diagnose, and address. Recent applications of the term have focused on formative processes extending over time, including both internal evaluator self-evaluative models [Reference Harnar, Hillman, Endres and Snow6] and external evaluation models [Reference Sturges and Howley7]. This study focused on internally-driven meta-evaluation, referred to as internal formative meta-evaluation (IFME), and asked evaluators about the methods they used to define and measure the quality of their own local work, including any relevant evaluation standards employed in this process, such as standards for utility and accuracy [Reference Yarbrough, Shulha, Hopson and Caruthers5]. Benefitting from prior surveys conducted by CTSA evaluators [Reference Hoyo, Nehl and Dozier8,Reference Kane, Alexander, Hogle, Parsons and Phelps9,Reference Patel, Rainwater, Trochim, Elworth, Scholl and Dave10], we sought to capture the local processes used for improving evaluations. Our study also aimed to identify the generalizable challenges that made these evaluations difficult and the strategies that helped to improve them, following a developmental scheme for structuring questions around the stages of the evaluation process in a CTR context. We intend these findings to be helpful not only to translational research evaluators but also to any evaluator who seeks to improve their practice through meta-evaluation and lessons learned from within the evaluation community.
Methods
Research questions guiding the development of the survey instrument included:
1) To what extent do IDeA-CTR evaluators use meta-evaluative practices and how does meta-evaluation inform their evaluation?
2) What challenges do IDeA-CTR evaluators face in their evaluation planning and implementation?
To address these questions, evaluators from the Delaware ACCEL CTR, Rhode Island Advance CTR, and Nebraska-led Great Plains CTR developed a framework based on prior surveys in adjacent fields, the evaluation standards, and the evaluation lifecycle. This framework included four areas: CTR characteristics and evaluation resources, primary evaluation users, meta-evaluative practices, and evaluation challenges. The first area focused on characteristics of the CTR and a description of its evaluation resources; items for this component of the survey were drawn from the 2021 CTSA Evaluators’ Survey [Reference Hoyo, Nehl and Dozier8]. Because identifying the intended users of an evaluation is critical to understanding evaluation focus, usefulness, influence, and use of findings, and because evaluation usefulness and use are fundamental to evaluation quality [Reference Patton and Campbell-Patton11], the framework also included questions to understand the primary intended users of evaluation services and products, the ways in which evaluators communicated with their primary users, and how evaluation had influenced decision-making within the CTR.
The third and fourth areas of the framework focused on evaluation practices, including how CTR evaluators examined their own practice as well as asking evaluators to reflect on the challenges they encountered during each evaluation phase. Items were framed based on the evaluation standards [Reference Yarbrough, Shulha, Hopson and Caruthers5] and the phases of the embedded evaluation model (define, plan, implement, interpret, inform, and refine) [Reference Giancola12].
Within this framework, the survey instrument comprised five sections and included both closed-ended and open-ended items. The first section asked respondents their role within the CTR, the year of their initial award, the number of member organizations in their CTR, the total number of TEVAL staff, and the total number of TEVAL full-time equivalents (FTEs). The second section focused on evaluation users. Respondents were asked about their primary intended user as well as other intended users; they were also asked about the style of meetings they held with intended users and the extent to which evaluation data had influenced changes or improvements within their CTR. Section three asked about meta-evaluation practices; in particular, evaluators were asked whether they had conducted a meta-evaluation to judge the quality of their evaluation. For those who had, data were collected on when the meta-evaluation occurred, the type of meta-evaluation, the sources of evidence used, the evaluation standards applied, to whom the results were reported, and how the findings were used. The fourth section asked about challenges TEVAL cores had faced during the different phases of the evaluation, from define to refine, following a CTR-context-specific sequential path, and how the challenges experienced had affected their evaluation. Finally, section five gave respondents an opportunity to share recommendations they would give other evaluators to help them improve the quality of their evaluations and to identify areas of training that would help them improve their evaluation practice.
The survey was tested and refined through cognitive interviews and piloting within the three-CTR collaboration to reduce measurement error. In addition to testing to minimize measurement error, the Total Survey Error Framework [Reference Groves, Fowler, Couper, Lepkowski, Singer and Tourangeau13] was used to examine potential errors in coverage, sampling, and nonresponse. Coverage and sampling error were not of concern because the survey was a census of the target population. Nonresponse error was mitigated by sending the survey notification to both evaluation leads and assistant leads, following up separately to ensure each evaluation lead received the survey, and using the Dillman method [Reference Dillman14] to increase response rates. A copy of the instrument is included in the supplementary materials.
At the time of the survey, there were 12 funded CTRs. The survey was administered through the Qualtrics survey platform to the lead or assistant lead evaluator of each of the 12 CTRs. To avoid duplicate responses, each CTR was asked to have one person complete the survey. In addition to the initial email, multiple follow-up reminders were sent, patiently and persistently. No incentives were provided for completing the survey; however, respondents were assured findings would be shared post-survey.
Survey data were exported from Qualtrics. Closed-ended items were analyzed in SPSS using basic descriptive statistics. Open-ended responses were coded, categorized, and analyzed for themes using Dedoose. Quantitative results and themes emerging from the qualitative data are detailed in the Findings section.
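Although the analyses themselves were conducted in SPSS and Dedoose, the following minimal sketch in Python illustrates the kind of basic descriptive summary (frequencies and percentages) applied to a closed-ended item. The column names, response bands, and individual row values are hypothetical, constructed only to be consistent with the aggregate FTE counts reported under Findings; this is not the study dataset or the authors’ actual analysis code.

```python
# Illustrative sketch only: a basic frequency/percentage summary of one
# closed-ended item, analogous to the descriptive statistics run in SPSS.
# The item name, band labels, and row values are hypothetical.
import pandas as pd

# Hypothetical export of one closed-ended item (one row per responding CTR)
responses = pd.DataFrame({
    "ctr_id": list(range(1, 13)),
    "teval_fte_band": ["<1.5", "<1.5", "<1.5", "1.5-2.5", "<1.5", "1.5-2.5",
                       "<1.5", ">2.5", "1.5-2.5", "<1.5", "<1.5", "1.5-2.5"],
})

# Frequency counts and percentages for the item
freq = responses["teval_fte_band"].value_counts()
pct = (freq / len(responses) * 100).round(1)
print(pd.DataFrame({"n": freq, "percent": pct}))
```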
Findings
Evaluators from 12 CTRs responded to the survey, a response rate of 100%. However, some respondents did not answer every item, so item-level response rates vary.
CTR characteristics and evaluator resources
Ten respondents identified as the director of evaluation, while two were assistant directors. CTR respondents ranged from early awardees (2012) to relatively recent awardees (2020 and 2021). The number of member organizations comprising each CTR ranged from 3 to 17, with a median of 6.5. Of those who responded to this item, four CTRs had 5 or fewer member organizations, three had between 6 and 10, and three had 11 or more. Evaluators were asked the number of people and the total FTEs in their TEVAL Core. The number of evaluators ranged from 3 to more than 5, with a median of 4. Most evaluators (7/12; 58.3%) reported having less than 1.5 FTEs dedicated to evaluation, while one respondent indicated more than 2.5 FTEs.
Intended users
All respondents reported the CTR principal investigator (PI) as an intended user of the evaluation. Other frequently reported intended users included the executive or steering committee, the external advisory committee (EAC) or internal advisory committee (IAC), NIH, other core leads or members, community members, administrative staff, researchers within their CTR, and the evaluation community outside of their CTR. Most evaluators reported that CTR leadership (PI and core leads) were their primary intended users, while one respondent identified NIH as their primary intended user and another identified administrative staff.
Most evaluators (9/12; 75.0%) reported that their predominant style of meeting with the CTR PI was a formal standing meeting, while three (25.0%) indicated their predominant style was ad-hoc meetings. Both formal standing meetings and ad-hoc meetings were also used frequently to communicate with administrative staff and core leads.
When asked about the extent to which evaluation data had influenced the functioning of the core components of the CTR, five evaluators (41.7%) said evaluation data had a substantial influence, six (50.0%) said it had a moderate influence, and one (8.3%) reported a minimal influence. In terms of resource allocation, four evaluators (33.3%) said evaluation data had a moderate influence, seven (58.3%) reported a minimal influence, and one (8.3%) indicated no influence. Further, one-half of respondents (6/12; 50.0%) reported evaluation data had influenced the restructuring of major activities moderately or substantially, while the other half said it had no or minimal influence. At the same time, three-quarters (9/12; 75.0%) said evaluation data had influenced the refinement of minor activities either moderately or substantially; three (25.0%) reported the influence was minimal or not at all.
Meta-evaluative practices
CTR evaluators were asked if their evaluation core had engaged in meta-evaluative activities. One evaluator answered that they were not sure, while four respondents said they had not engaged in meta-evaluation. Three of these evaluators said they were interested in conducting a meta-evaluation in the future.
Evaluators from seven of the 12 CTRs (58.3%) indicated they had engaged in meta-evaluation, with six explicitly calling the activity a meta-evaluation and one describing routinely seeking feedback for quality improvement without using the term. Five used internal evaluators and two used external evaluators, with one respondent having conducted a meta-evaluation using both internal and external evaluators. The most useful sources of evidence for the meta-evaluation were evaluation reports/briefs and stakeholder surveys or interviews. The seven respondents who had conducted meta-evaluations were also asked about the evaluation standards [Reference Yarbrough, Shulha, Hopson and Caruthers5] they had applied in their meta-evaluation. As seen in Table 1, six evaluators reported using the Utility standard and five used the Feasibility standard. The Accuracy and Accountability standards were each applied by four CTRs. All standards were used by at least three of the seven evaluators.
Table 1. Use of standards for meta-evaluation (n = 6)

Four evaluators reported they conducted their meta-evaluation in the prior year, while one had conducted it 1–2 years before the survey and another 3–4 years before the survey. One evaluator did not respond to this item. Results were reported in various ways; some evaluators published or presented the results, while others used them for internal reporting to governance. Likewise, the audiences included other evaluators (reached through national evaluation conferences or evaluation publications) and leadership or governance within their CTR.
Responding to a question regarding how their meta-evaluation findings had been used, the predominant responses were to validate the utility of evaluation (“to demonstrate ways tracking and evaluation has influenced the CTR”) and to improve evaluation processes and products going forward (“used internally to inform our methods and processes”).
Evaluation challenges
All respondents were asked about challenges they had experienced during different phases of the evaluation, from initial design to informing primary users and use of evaluation findings. Response options included: major challenge, requiring unanticipated time or expense; moderately challenging, requiring a shift in focus or direction; somewhat challenging, beyond what was expected; minimally challenging, but no more than what was expected; and not a challenge. Half or more of evaluators found three areas more challenging than they had expected (i.e., either a major challenge, moderately challenging, or somewhat challenging): 1) developing a feasible and useful system for collecting, organizing, tracking, and reporting data; 2) developing metrics for project-wide impacts; and 3) fostering realistic expectations of systemic change impact attribution to the CTR (see Figure 1). In fact, three evaluators (25.0%) said that developing a feasible and useful system for collecting, organizing, tracking, and reporting data was a major challenge requiring unanticipated time or expense; two evaluators said that fostering realistic expectations of systemic change impact attribution to the CTR was a major challenge.

Figure 1. Percent of evaluators reporting a challenge as greater than expected.
The least challenging areas for evaluators were engaging in routine reporting to Core leadership and developing a logic model for their CTR, with all respondents reporting these areas as either not a challenge or minimally challenging. Other less challenging areas included engaging in routine reporting to CTR leadership, facilitating/encouraging use of evaluation reports and products by CTR leadership, and participating in grant renewal collaboration. See Figure 1 for a full list of challenges and the percent of evaluators reporting each as greater than expected (i.e., at least somewhat challenging).
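To make the “greater than expected” grouping used in Figure 1 concrete, the sketch below shows how five-point challenge ratings can be collapsed into the proportion rated at least somewhat challenging. This is an illustrative example only; the item name and the individual ratings are hypothetical and are not drawn from the study data.

```python
# Illustrative sketch only: collapsing hypothetical five-point challenge
# ratings into the "greater than expected" grouping (somewhat, moderately,
# or major challenge) summarized in Figure 1.
import pandas as pd

greater_than_expected = [
    "somewhat challenging",
    "moderately challenging",
    "major challenge",
]

# Hypothetical ratings for one challenge item across 12 respondents
ratings = pd.Series(
    ["major challenge"] * 3
    + ["moderately challenging"] * 2
    + ["somewhat challenging"] * 2
    + ["minimally challenging"] * 3
    + ["not a challenge"] * 2,
    name="data_system_challenge",
)

share = ratings.isin(greater_than_expected).mean() * 100
print(f"{share:.1f}% rated this item as more challenging than expected")
```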
In addition to rating challenges ranging from defining the program using a logic model to encouraging use of evaluation findings, evaluators provided open-ended comments about situations they found particularly challenging and how they responded. Two primary themes emerged from the data: data infrastructure and staff turnover. The situations described frequently reflected the challenges experienced in building and maintaining a robust data infrastructure. These included setting up data systems to collect and store data, aggregating data across multiple data sources, and accurately capturing accomplishments and impacts. Evaluators responded to these challenges by continuously searching for methods to improve tracking or by managing expectations regarding what could and could not be provided with current systems.
Another challenge emerging from the qualitative responses was staff turnover. Turnover of CTR leadership and of evaluation staff were both identified as challenges. Respondents had not found methods to overcome this challenge and instead postponed projects, sometimes indefinitely. See Figure 2 for a summary of comments.

Figure 2. Qualitative comments on challenges and responses.
Evaluator recommendations
In addition to challenges, evaluators were asked what areas they would like to learn more about to improve the quality of their evaluation and what recommendations, based on their experience, they would give to other CTR evaluators to help them improve evaluation quality.
Evaluators indicated they were interested in training in three areas: measuring impact, engaging leadership, and innovative evaluation topics. Regarding impact measurement, respondents were interested in learning more about ways to measure impact at multiple levels, including clinical practice, institutional change, and community health. See Figure 3 for a summary of comments.

Figure 3. Training interests to improve evaluation practice.
Evaluator advice to other CTR evaluators focused primarily on communication. In particular, respondents recommended communicating regularly with primary users about evaluation activities and findings and engaging evaluation users in the evaluation process. Other recommendations for improving evaluation practice included taking the time necessary to plan the evaluation, continually monitoring data quality, taking a systems approach to evaluation, allocating adequate time for data collection, and utilizing lessons learned from the evaluation community. See Figure 4 for a summary of comments.

Figure 4. Recommendations for improving evaluation practice.
Lastly, respondents were asked what advice they would give to new CTR evaluators. Their advice focused on feasibility, collaboration, impact, and CTR leader buy-in. See Figure 5 for a summary of comments.

Figure 5. Advice for new CTR evaluators.
Discussion
This study provides new insights into how CTR evaluators across the nation have examined the quality of their own evaluations and into the successes and challenges their evaluation programs have faced as they provide both formative and summative perspectives on the development of infrastructure and a prepared workforce for clinical and translational research.
It is clear from the descriptions of their work that most CTR evaluators see themselves as internal evaluators, reporting to project leadership as their primary user, and participating in regular meetings with various types of project staff. The complexity of the organizational systems within which they work is conveyed by the number of organizations linked together for these projects and the diverse kinds of groupings involved in guidance (steering and executive committees, administrative staff, core leadership, IACs, EACs), all of which serve in some capacity as clients for evaluation activities and reports – along with the funder, NIGMS. The small number of evaluation FTEs suggests the inherent challenges these units face as their complex projects are planned, implemented, scaled up, and adjusted to effectively respond to both internal and external conditions affecting project success over time.
The survey results reveal that most CTR evaluation teams recognize the value of stepping back to examine their own effectiveness, with over half having already conducted a meta-evaluation of their own evaluative practices using internal and/or external evaluators. Another quarter of the teams reported intending to conduct meta-evaluations in the future. We hope that our findings increase attention to the value of conducting meta-evaluations for formative purposes and encourage more evaluators to adopt meta-evaluative processes.
The meta-evaluations conducted by CTR evaluators were reportedly based on well-established program evaluation standards. Reflecting on the relative use of each of the five standards, we conclude that for internal evaluation, the priority focus on utility of internal reports – evaluation feedback directed to the component activities of the program and the project-wide leadership coordinating and funding those activities – is essential. Philibert and colleagues [Reference Philibert, Fletcher, Poppert Cordts and Rizzo15] treat the extent of internal use of their CTR evaluation recommendations as the meaningful metric for their IFME, demonstrating the centrality of this concern. With many data streams to identify, collect, and integrate, and complex systemic structures with which to collaborate in doing so, the feasibility standard is also essential for managing limited resources (few FTEs) to efficiently generate impactful evaluation outputs. Accuracy and accountability are also quite likely to be attended to, and the fact that respondents were conducting meta-evaluations would itself constitute attention to accountability. Propriety may understandably seem to have less urgency in programs where most of the social contexts and direct contacts for the evaluation work are in academic and health institutional settings with MDs and PhDs as both program delivery staff and intended targets for program interventions. These professionals may seem to need less protection and sensitivity, but the propriety standard deserves more evaluator attention in future meta-evaluations, as it is better to uncover frictions and sensitivities within and between the many “cultures” in these programs before an explosion – or subtle resistance to compliance – undoes utility and accuracy.
The study also surfaced useful insights into challenges inherent in CTR evaluation and into evaluators’ recommendations for others experiencing similar challenges. A common theme was that building data systems proved considerably more challenging than expected. Developing feasible and useful internal data systems for tracking and reporting data, including managing multiple data sources and data from multiple institutions, was reported as one of the biggest challenges for CTR evaluators. Commercially produced systems and published accounts of potential solutions [Reference He, Sampson and Obeid16,Reference Wood and Campion17] are now informing CTR sites’ considerations, but these solutions were not yet widely known or implemented at the time of our survey. Recommendations from our respondents suggest the frustration associated with seeking a single all-encompassing data system capable of executing every needed task. Some advised scaling back expectations for integrative systems and identifying practical solutions for each data need, while others were still looking for the holy grail.
Evaluators also shared challenges related to measuring project-wide impact. Developing metrics to assess impacts was reported as a challenge by more than half of responding CTR evaluators. This is a critical finding given the expectation that NIGMS-funded CTR centers should produce shared learnings and generalizable knowledge to accelerate progress in developing translational science infrastructure.
The CTSA-developed Translational Science Benefits Model (TSBM) [Reference Luke, Sarli and Suiter18] provides helpful constructs and metrics for the challenge of measuring and communicating project impact. However, the context is different for CTRs, where impacts logically focus on the structures and workforce competencies necessary to bring about the longer-term programs of research needed to realize TSBM benefits in areas such as statewide health policies, patents, and significant epidemiological effects on disease incidence and prevalence. That is, for CTRs, infrastructure impacts and systemic institutional change are foundational elements of an ecosystem that enables innovative research producing policy, economic, community, and clinical impacts.
A third frequently endorsed challenge was fostering realistic expectations among internal stakeholders of what system change impacts can be attributed to the CTR. The resources and scope for evaluation in CTR projects emphasize the tracking function at the core level, and more powerful cause-detecting designs may call for cross-hub collaborations. The value of cross-hub collaboration to produce more rigorous and compelling evaluation results has been actively promoted in CTSA-generated recommendations for many years [Reference Hoyo, Nehl and Dozier8,Reference Patel, Rainwater, Trochim, Elworth, Scholl and Dave10,Reference Trochim, Rubio and Thomas19], but is just emerging as a real possibility for CTR evaluators, with our survey as an early effort in that direction. Practical advice from our respondents emphasized the need to be realistic in discussions with the various CTR stakeholders about what can be accomplished in program evaluation as well as the limitations of findings produced under imperfect conditions (e.g., incomplete data). One common theme for how meta-evaluation findings were used points to the same concern – how to promote the value of internal formative feedback processes with local leadership and components of the CTR that use and benefit from evaluation data.
Evaluators also shared that ongoing communication with stakeholders is critical to improving the quality of evaluations as well as stakeholder understanding and receptivity. Consistent with the internal evaluation frame implied by the identification of primary clients within the local site, a participatory approach was advocated.
Finally, evaluators offered advice to new CTR evaluators that is both pertinent to the challenges in CTR evaluation and relevant to evaluators across a broad range of domains. In particular, respondents reminded others in similar contexts to focus their limited resources by measuring what matters; work to develop and maintain strong, trusting relationships with project leadership; and utilize the vast knowledge within our evaluation community to continue to improve our practice.
An important facet of our findings is their relationship to CTSA evaluator survey findings and the implications of this relationship for the future of CTR evaluation. As noted, a number of our descriptive items were drawn from a CTSA evaluator survey [Reference Hoyo, Nehl and Dozier8], most recently conducted in 2021, and although the size, objectives, and age of CTSA grant-funded programs differ from CTR circumstances in important ways, comparison is still relevant for understanding the implications of our results. The CTSA survey response rate was impressive, with evaluators from 51 hubs responding, representing 96% of the total population; that compares to the 100% rate from our much smaller base of 12 sites. The CTSA sites have been receiving their NIH funding for much longer, with 75% established by 2010 compared to the earliest of ours established in 2012. Comparison of the number of partnering organizations on the grant suggests somewhat larger groups for CTR sites (3–17 organizations, with a median of 6.5) in comparison with 4–5 organizations for the CTSAs. The budgets for CTSAs are much larger; they are directed at settings with well-established, highly productive biomedical research enterprises; their evaluation component has been merged with administration for far longer; and they have been formally organized as a cross-hub policy-focused entity. Thus, the CTSA comparison is relevant but distinctive, and that is apparent in the focus of their survey’s objectives and recommendations, which attend primarily to how the national network of CTSA hubs can benefit from improved systemic integration and collaborative use of evaluation across the hubs. However, it is confirmatory to see that the best practices recommended by Hoyo and colleagues [Reference Hoyo, Nehl and Dozier8] connect directly with challenges our evaluators reported experiencing: data reporting systems and credibility with project leadership. As well, Trochim and colleagues’ [Reference Trochim, Rubio and Thomas19] earlier set of recommendations for CTSA evaluation has very useful elements directed at the individual site level along with their national system-wide attention. For example, they call for a collaborative/participatory evaluation approach within hubs; a central focus on utilization of evaluation results; recognition that although tracking is an important function of evaluation, measurement of impacts is essential; and recognition of the incredible resource of the cross-grant evaluation community.
Conclusions
Findings from quantitative and qualitative responses tell a compelling story of the challenges in translational research evaluation and how evaluators have worked to address them, providing generalizable knowledge for CTR evaluators and for the field of program evaluation across a wide range of domains. Major points with broad application include: (1) pay careful attention to the primary role of tracking and reporting for formative improvement; (2) identify realistic, measurable project-wide impacts to be documented over time; (3) maintain active, responsive communication with internal stakeholders; and (4) appreciate and utilize the broad national network of evaluators dealing with the same or similar challenges to the ones you face.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/cts.2025.10121.
Acknowledgements
We would like to thank the members of the NIH National CTR Evaluators’ Group for their participation in the survey.
Author contributions
Sue Giancola: Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Writing-original draft, Writing-review & editing; John Stevenson: Conceptualization, Methodology, Project administration, Writing-review & editing; Ingrid Philibert: Conceptualization, Methodology, Writing-review & editing.
Funding statement
This work was partially supported by the National Institute of General Medical Sciences of the National Institutes of Health (grant numbers U54-GM104941 [PI: Hicks], U54-GM115677 [PI: Rounds], and U54-GM115458 [PI: Rizzo]).
Competing interests
The authors declare none.