Expectations for methodology and translation of animal research: a survey of health care workers

Background: Health care workers (HCW) often perform, promote, and advocate use of public funds for animal research (AR); therefore, an awareness of the empirical costs and benefits of AR is important for HCW. We aimed to determine what HCW consider acceptable standards of AR methodology and of the translation rate to humans.
Methods: After development and validation, an e-mail survey was sent to all pediatricians and pediatric intensive care unit nurses and respiratory therapists (RTs) affiliated with a Canadian University. We presented questions about demographics, methodology of AR, and expectations from AR. Responses of pediatricians and nurses/RTs were compared using Chi-square, with P < .05 considered significant.
Results: Response rate was 44/114 (39%) for pediatricians and 69/120 (58%) for nurses/RTs. Asked about methodological quality, most respondents expect that: AR is done to high quality; costs and difficulty are not acceptable justifications for low quality; findings should be reproducible between laboratories and strains of the same species; and guidelines for AR funded with public money should be consistent with these expectations. Asked about benefits of AR, most thought that there are sometimes/often large benefits to humans from AR, and disagreed that “AR rarely produces benefit to humans.” Asked about expectations of translation to humans (of toxicity, carcinogenicity, teratogenicity, and treatment findings), most: expect translation >40% of the time; thought that misleading AR results should occur <21% of the time; and that if translation was to occur <20% of the time, they would be less supportive of AR. There were few differences between pediatricians and nurses/RTs.
Conclusions: HCW have high expectations for the methodological quality of, and the translation rate to humans of findings from AR. These expectations are higher than the empirical data show has been achieved. Unless these areas of AR significantly improve, HCW support of AR may be tenuous.


Background
Biomedical animal research (AR) involves some harm to sentient animals including distress (due to confinement, boredom, isolation, and fear), pain, and early death [1][2][3]. AR is said to be morally permissible because the balance of these costs (harms to the animals) and benefits (to human medical care, quality of life, and survival) is favorable [4]. It is generally assumed that the benefits to human medicine are great [5]. An awareness of the empirical costs and benefits of AR is an important issue in medicine for several reasons. Health care workers (HCW) often perform (and are expected to perform) AR, promote AR directly with trainees and indirectly as role models, and advocate for use of public funds (from granting agencies and charitable foundations) toward medically related AR.
Since most AR is funded by public money through government and charitable granting agencies, it is important to know the public perception of, and the level of public support for AR. Surveys of the public find that the majority are 'conditional acceptors' of AR; they accept the practice because of the promise of cures and treatments for life-threatening and debilitating human diseases, so long as animal welfare is at least minimally considered and protected [41]. To our knowledge, no survey has asked for the details of this conditional acceptance of AR. In this survey we ask HCW directly what the minimal acceptable standards in AR methodology might be, and what the minimal acceptable translation rate of AR to human treatments might be. This is important in order to determine how strong the support is for the empirical practice of AR, and how AR could be improved to increase the level of support. We found that HCW have high expectations for the methodological quality of, and the translation rate to humans of findings from AR.

Questionnaire administration
All pediatricians and pediatric intensive care unit nurses and respiratory therapists (RTs) who are affiliated with one Canadian University were e-mailed the survey using an electronic, secure, survey distribution and collection system (REDCap, Research Electronic Data Capture) [42]. A cover letter stated that "we very much value your opinion on this important issue" and that the survey was anonymous and voluntary. We offered the incentive that if the response rate was at least 70% we would donate $1000 to the Against Malaria Foundation or the PICU Social Committee. Non-responders were sent the survey by e-mail at 3-week intervals for 3 additional mailings.

Questionnaire development
We followed published recommendations [43]. To generate the items for the questionnaire, we searched Medline from 1980 to 2012 for articles about the methodology and translation of AR. This was followed by collaborative creation of the background section and questions for the survey by the authors. Content and construct validation were done using a table of specifications filled out by experts including two ethics philosophy professors, and two pediatricians. Face and content validation were done by pilot testing of the survey by non-medical, university-educated lay people (n = 9), pediatricians (n = 2), pediatric intensive care nurses (n = 2), and an ethics professor (n = 1). Each pilot test was followed by a semi-structured interview by 1 of the authors to ensure clarity, realism, validity, and ease of completion. A published clinical sensibility tool was used for the expert and pilot testing [43]. After minor modifications, the survey was approved by all the authors.

Questionnaire content
The background section stated: "In this survey, 'animals' means: mammals, such as mice, rats, dogs, and cats. It has been estimated that over 100 million animals are used in the world for research each year. There are many good reasons to justify AR, which is the topic of this survey. Nevertheless, some people argue that these animals are harmed in experimentation, because their welfare is worsened. In this survey, 'harmful' means such things as: pain, suffering (disease/injury, boredom, fear, confinement), and early death. This survey is about how AR should be performed. We value your opinion on the very important issue of the methodology of AR." We presented demographic questions, 15 questions that asked respondents "about the methods of AR that are commonly discussed by animal researchers", 4 questions that asked the respondent "to consider what you think the benefits to humans are as a result of AR", and 8 questions that asked the respondent "for your opinions about what you expect from AR paid for with public funds (for example, funding by government using tax dollars, or charitable foundations using donations)." Response choices included scales of "strongly agree, agree, undecided, disagree, strongly disagree", "nearly always, often, sometimes, not often, almost never", and "5-20%, 21-40%, 41-60%, 61-80%, over 80%" depending on the type of question. All the questions are shown in Tables 1, 2, 3, and 4.

Ethics approval
The study was approved by the health research ethics board 2 of our university (study ID Pro00039590) and return of a survey was considered consent to participate.

Statistics
The web-based tool (REDCap) allows anonymous survey responses to be collected and later downloaded into an SPSS database for analysis. The proportions of respondents with different answers were expressed as percentages. The responses of the two predefined groups, pediatricians and pediatric intensive care unit nurses/RTs, were compared using the Chi-square statistic, with P ≤ .05 after Bonferroni correction for multiple comparisons considered significant.
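The comparison described above can be illustrated with a minimal Python sketch. This is not the authors' SPSS analysis: it assumes each question has been collapsed into a 2 × 2 table (group by agree/other), whereas the survey used five response categories, and the counts and p-values shown are hypothetical. The function names `chi_square_2x2` and `bonferroni_significant` are illustrative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)), so only the standard library is needed.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one response")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def bonferroni_significant(p_values, alpha=0.05):
    """Multiply each p-value by the number of comparisons (capped at 1)
    and flag those whose adjusted value is still <= alpha."""
    m = len(p_values)
    return [min(1.0, p * m) <= alpha for p in p_values]


# Hypothetical counts (NOT from the survey): rows are the two groups,
# columns are "agree" versus "any other response".
stat, p = chi_square_2x2(30, 20, 20, 30)
print(f"chi-square = {stat:.2f}, unadjusted p = {p:.4f}")

# Hypothetical unadjusted p-values for three questions.
flags = bonferroni_significant([0.001, 0.02, 0.04])
print(flags)
```

Under these assumptions, a question is reported as significant only if its p-value, multiplied by the number of questions compared, remains at or below .05. The number of comparisons actually used for the correction is not stated in the text.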

Pediatricians
Demographics
Forty-eight responded, but only 44/114 (39%) gave responses to more than the demographic questions. Demographics are given in Table 1.
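As a worked check of the response-rate arithmetic, the short Python sketch below recomputes the percentages from the counts reported in this paper (44/114 pediatricians, 69/120 nurses/RTs) and compares them with the 70% incentive threshold described under questionnaire administration. The pooled rate is derived here by simple addition and is an assumption, since the text does not report it; the helper name `percent_rounded` is illustrative.

```python
def percent_rounded(numerator, denominator):
    """Percentage rounded half up, using integer arithmetic to avoid
    floating-point edge cases such as 57.5 being stored as 57.4999..."""
    return (200 * numerator + denominator) // (2 * denominator)


# Counts reported in the paper: (usable responses, surveys sent).
responses = {"pediatricians": (44, 114), "nurses/RTs": (69, 120)}

rates = {group: percent_rounded(r, n) for group, (r, n) in responses.items()}

# Pooled rate across both groups (derived here, not reported in the text).
total_responses = sum(r for r, _ in responses.values())
total_sent = sum(n for _, n in responses.values())
overall = percent_rounded(total_responses, total_sent)

INCENTIVE_THRESHOLD = 70  # percent, from the cover letter incentive

for group, rate in rates.items():
    print(f"{group}: {rate}%")
print(f"overall: {overall}% (threshold met: {overall >= INCENTIVE_THRESHOLD})")
```

The computed group rates match the 39% and 58% reported in the text.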

Expectations regarding methodology of AR
The majority of respondents agreed that: anesthetic use should be monitored during surgery (100%), pain should be monitored after this surgery even over-night (91%), and experimenters in a research study should have similar training on the procedures involved (97%) (Table 2). The majority disagreed that it is acceptable: to use less humane methods of euthanasia to reduce costs or improve results (82% or 52% respectively), to use animals when alternatives are available (73%), to do an animal experiment without a systematic literature review (100%), and to do an animal experiment using suboptimal methods (including randomization, blinding, and primary outcome specification) in order to save costs (82-93%). Only a minority of respondents agreed that failed animal models of a disease should continue to be used (30%), or that stressed animals should be used (37%). Finally, the majority agreed that guidelines consistent with these responses should be required for publicly funded AR (95%).

Perceptions of human benefits from AR
Most respondents believe that discoveries from AR sometimes or often lead to a treatment for human disease directly (77%) or indirectly (84%), and that researchers sometimes or often claim large benefits from AR (91%) (Table 3). The majority did not agree (84%) with the statement that "AR rarely produces benefits to humans."

Expectations for translation to humans from AR paid for with public funding
The majority of respondents think that drugs tested on animals should correctly predict the following for humans at least 41% of the time: adverse reactions (69% of respondents), disease treatment (62% of respondents), carcinogenicity or teratogenicity (74% of respondents), and treatment of stroke, severe infection, cancer, brain or spinal cord injury (59% of respondents). The majority also expected that replication of AR findings in second laboratories or other strains of the animal should occur at least 61% of the time (95% and 68% of respondents respectively). The majority agreed that misleading (in terms of human benefit and/or harm) animal experiments should occur at most 40% of the time (86% of respondents). Finally, when asked to "assume drugs studied in animals accurately predict effects in humans less than 20% of the time. If this were true, it would significantly reduce your support for animal research", 40% disagreed (Table 4).

Nurses/RTs
Expectations regarding methodology of AR
The majority of respondents agreed that: anesthetic use should be monitored during surgery (98%), pain should be monitored after this surgery even over-night (96%), and experimenters in a research study should have similar training on the procedures involved (96%) (Table 2). The majority disagreed that it is acceptable: to use less humane methods of euthanasia to reduce costs or improve results (87% or 81% respectively), to use animals when alternatives are available (88%), to do an animal experiment without a systematic literature review (96%), and to do an animal experiment using suboptimal methods (including randomization, blinding, and primary outcome specification) in order to save costs (87-95%). Only a minority of respondents agreed that failed animal models of a disease should continue to be used (27%), or that stressed animals should be used (19%). Finally, the majority agreed that guidelines consistent with these responses should be required for publicly funded AR (91%).

Perceptions of the benefits to humans from AR
Most respondents believe that discoveries from AR sometimes or often lead to a treatment for human disease directly (84%) or indirectly (88%), and that researchers sometimes or often claim large benefits from AR (97%) (Table 3). The majority did not agree (87%) with the statement that "AR rarely produces benefits to humans."

Expectations for translation to humans from AR paid for with public funding
The majority of respondents think that drugs tested on animals should correctly predict the following for humans at least 41% of the time: adverse reactions (85% of respondents), disease treatment (82% of respondents), carcinogenicity or teratogenicity (89% of respondents), and treatment of stroke, severe infection, cancer, brain or spinal cord injury (88% of respondents). The majority also expected that replication of AR findings in second laboratories or other strains of the animal should occur at least 61% of the time (92% and 83% of respondents respectively). The majority agreed that misleading (in terms of human benefit and/or harm) animal experiments should occur at most 40% of the time (84% of respondents). Finally, when asked to "assume drugs studied in animals accurately predict effects in humans less than 20% of the time. If this were true, it would significantly reduce your support for animal research", only 6% disagreed (Table 4).

Differences between pediatricians versus nurses/RTs
There were few statistically significant differences. Nurses more often responded that drugs for stroke, severe infection, cancer, brain or spinal cord injury should work in humans. Nurses were more uncertain whether AR "rarely produces benefits to humans", and would be less supportive of AR if it accurately predicted effects in humans <20% of the time.
Table note: There were no statistically significant differences in responses between pediatricians and nurses/RTs on any of these questions.
Table note: There was a statistically significant (p < 0.001) difference in response between pediatricians versus nurses/RTs to the question "Some people argue that animal research rarely produces benefits to humans. Do you agree that this is likely?"

Discussion
There are several important findings from this survey. First, most HCW respondents expect that AR is done with high methodological quality, and that costs and difficulty are not acceptable justifications for lower quality. Most expect that guidelines for AR funded with public money should be consistent with these expectations. Second, most respondents thought that there are either sometimes or often large benefits to humans from AR. Most disagreed that "AR rarely produces benefit to humans." Third, most respondents expect that AR findings should translate to humans at least 41% of the time, with many expecting this at least 61% of the time. This includes AR findings of adverse events (toxicity), carcinogenicity and teratogenicity, and disease treatments. The majority thought misleading AR results should occur no more often than 20% of the time. If translation from AR to humans was to occur <20% of the time, most would be less supportive of AR. Finally, most respondents expect that AR findings should be reproducible between laboratories and between strains of the same species.
There are important implications of these findings for public and HCW acceptance of AR (Table 5). Previous public surveys have generally asked only whether people support AR for human benefit, and have not asked people to evaluate the details of their expectations of AR. For example, the Eurobarometer asks whether "scientists should be allowed to experiment on animals like dogs and monkeys if this can help sort out human health problems"; in 2010, 44% of Europeans responded 'agree' and 37% 'disagree' [44]. This support for AR was linked with "greater appreciation of the contributions of science to the quality of life" and "an omnipotent vision of science" [45]. In the UK, the 2012 Ipsos MORI survey determined that most (85%) are 'conditional acceptors' of AR; people accept AR "so long as it is for medical research purposes", "for life-threatening diseases", "so long as there is no unnecessary suffering", or "where there is no alternative", considering AR as a "necessary evil" for human benefit [41]. In the United States, the 2011 Gallup Values and Beliefs survey found that when asked whether medical testing on animals is morally acceptable or morally wrong, 43% (and 54% of young adults 18-29 yr old) responded 'morally wrong' [46]. In a survey in Sweden including patients with rheumatoid arthritis and scientific expert members of research ethics boards, most respondents agreed to AR for at least some type of biomedical research. Support was highest for AR into "fatal diseases" (83.1%), and diseases with "insufficient treatment options" (82.1%) [47]. In a UK survey of scientists promoting AR, lay public, and animal welfarists, the support for AR (on a Likert scale of 7) was 5.33 (1.46), 3.57 (1.70), and 1.48 (0.87) respectively. Scientists and lay public supported animal use only for "medical research", and not for dissection, personal decoration, or entertainment [48]. These surveys suggest people support AR on the understanding that it is necessary to provide significant benefit for humans with severe diseases, and is done to high ethical standards. However, none asked for the level of detail sought in our survey.
Table note: There was a statistically significant (p < 0.001) difference in response between pediatricians versus nurses/RTs to the two questions: "Drugs that work well in animals with stroke, severe infection, cancer, brain or spinal cord injury should work in humans at least what percent of the time?" and "Assume drugs studied in animals accurately predict effects in humans less than 20% of the time. If this were true, it would significantly reduce your support for animal research."
Some qualitative research also suggests there is conditional public acceptance of AR based on a utilitarian analysis of costs (to animals) and benefits (to humans) [49,50]. This conditional acceptance is usually based on the assumption that regulation has assured AR is to high animal welfare standards, of high scientific validity and merit (i.e., high quality research, leading to human benefit and cures), and that there are not alternative research methods [49][50][51]. Scientists understand this role of regulation as leading to societal acceptance of AR, and see regulation as legitimating AR practice [51][52][53]. However, our survey suggests that this trust in regulation may be misplaced, because regulation does not result in AR that meets HCW expectations for animal welfare, methodological quality, human benefit, or rates of translation to human medicine and cures (Table 5). Moreover, these studies showed that the public is far less accepting of the use of genetically modified animals in research, based on a deontological approach where this AR is seen as 'wrong' [49,50]. We did not ask about the common use of genetically modified animals in AR, and therefore may have underestimated HCW expectations of AR.
There are two main explanations for the poor predictive accuracy of AR for humans. First, it is possible that the poor methodological quality of AR has resulted in a biased literature that has led to many human trials based on inappropriate data. Second, it is possible that animal models are not good 'causal analogical models'; not useful to extrapolate findings to humans because there are major causal disanalogies between species [54,55]. Animal models are based on this reasoning: when an animal model is similar to the human with respect to traits/properties a,b,c [e.g. fever, hypotension, and kidney injury in sepsis], and when the animal model is found to have property d [e.g. response to protein-C treatment], then it is inferred that the human also likely has property d. This 'causal analogy' assumes that there are few causal disanalogies: few properties e,f,g that are unique to either the animal or human and that interact causally with the common properties a,b,c. However, animals are evolved complex systems; they have a myriad of interacting modules at hierarchical levels of organization [56]. As a result of this complexity, animals have emergent properties [e.g. animal traits/functions, like property d] that are dependent on initial conditions [e.g. gene expression profiles, the context of the organism, like properties a,b,c, and e,f,g]. In complex systems [e.g. animals], very small differences in initial conditions [e.g. properties e,f,g specific to a species/strain] can result in dramatic differences in response to the same perturbation [e.g. drug, treatment, or disease leading to property d] [54][55][56][57][58]. There is much empirical data finding major causal disanalogies between animal species: differences in gene expression at baseline and in response to perturbations, and in disease susceptibilities [59][60][61][62]. Thus, complexity science suggests there may be an in-principle limitation for AR to predict human responses. Our survey suggests that these competing explanations must be sorted out to determine whether translation can meet public expectations in weighing the costs and benefits of AR.

Table 5
Compatible with recommendations of recent guidelines from the UK, USA, and Canada [63][64][65].
Studies have found poor reporting of animal welfare, including poor attention to pain control, and not using the most acceptable methods of euthanasia [11,12].
AR may need to be of much higher animal welfare quality in order to maintain public and HCW support.
AR is done using the best known methods: high standards of methodological quality. b
Compatible with recommendations of recent guidelines from the UK, US, and Canada [63][64][65].
AR may need to be of much higher methodological quality in order to maintain public and HCW support.
AR often produces benefit to humans.
Press releases by academic medical centers often promote AR, and most claim relevance to human health without caveats about extrapolating results to people [66]. Of published basic research papers, 0.004% led to the development of a clinically useful class of drugs [67].
Most HCW may not be aware of the literature regarding translation of AR.
AR may need to be much better at predicting human responses to drugs and disease in order to maintain public and HCW support.
AR: animal research; HCW: health care workers.
a For example, monitoring and titration of anesthesia, monitoring and titration of pain control even over-night, using the most humane known methods of euthanasia, avoiding stressed animals, and using the fewest number of animals possible.
b For example, performing a systematic literature review to inform study design, using optimal design including randomization and blinding, attention to training of staff, and to choosing models that have shown translation of findings to humans.
c For example, most think translation rate should be over 40%, that misleading results for humans should occur no more than 20% of the time, and that if this was not the case their support for AR would be significantly reduced.
This study has several limitations. Response rates for pediatricians and nurses/RTs were 39% and 58% respectively; thus we cannot rule out biased participation in the survey. Statements presented needed to be short and concise, and this may have left out important details that would have influenced the understanding of and response to the text. The moderate sample size from one University limits the generalizability of our results. Nevertheless, this is the first survey we are aware of that asks any group to consider not just whether they support AR but, in detail, their expectations for the methodology and translation of AR. Strengths of this study include the rigorous survey development process, and the inclusion of the most common critiques of the empirical practice of AR. Future study should determine the generalizability of our results.

Conclusion
We found HCW respondents had high expectations for the methodological quality of AR, and the translation of findings from AR to human responses to drugs and disease. These expectations are far higher than the empirical data show has been achieved. This disconnect between HCW expectations of AR and the empirical reality of AR suggests that if HCW were better informed they would likely withdraw their conditional support of AR. Improved methodological quality is an achievable goal if this is prioritized by researchers, reviewers, editors, and funders. Whether methodologically optimal AR can achieve better human translation to meet HCW expectations is an open question.