Ethical and social issues arising from genetic research involving human populations
Research findings are often reported with overgeneralized descriptors such as “Asian” or other continental terms. In reality, samples are taken from much more discrete groups or from specific, identifiable geographical regions. This tendency may result from several forces. Researchers sometimes attempt to draw more general conclusions than the data can support. In other cases, researchers may feel that without broad terms it is difficult to gain recognition in peer review for publication or to obtain a grant. The lack of education and training for scientists and researchers on the use of descriptors and the associated problems seems to be another cause of overgeneralization. Indeed, at the workshop, some participants described receiving such pressure to generalize from both research institutions and publishers. At present there are no data that map the extent of this phenomenon, and systematic investigation, both quantitative and qualitative, is therefore needed. The following examples demonstrate how, from the perspective of genetic research done in Asia, the inappropriate use of population descriptors could cause confusion and social controversy.
Although the term “Mongoloid” is rarely used today in North America and Europe, the situation is different in Japan and some other regions of Asia. Our preliminary analysis found 113 PubMed records containing the term “Mongoloid” in the title or abstract among papers published between 2004 and 2013, with no sign that its use is decreasing. However, even among researchers, there is little awareness of the issues and little consistency in use, and the term’s meaning can vary significantly depending on context [9–11]. Some researchers use it to designate a population in one region or in several, including Eastern Asia, Southeast Asia, or indigenous peoples in North America [12]. For others, it refers only to East Asians, or is a synonym for the more generic “Asian” [13, 14]. Moreover, the term was used in the past to refer to individuals with Down’s syndrome. In general, despite its continuing use, the term is problematic both because of uncertainty about the population referred to and because of its controversial past use [15, 16].
Another example of the challenges associated with population descriptors is the frequent use of the terms European, African, and Asian. These continental terms are tremendously broad in scope. At the Tokyo meeting, for example, it was noted that even among Japanese researchers there was no shared understanding of which populations should be considered “Asian.”
More importantly, these terms can, in some contexts, be interpreted as referring to white, black, and Asian, the three classic, socially constructed “races.” A great deal of academic work continues to highlight the degree to which these broad “racial” categories are, in reality, social constructs [17–19]. Although we should not overlook the correlation between “race” and socio-economic inequality involving factors such as health care and medical care, such discussion has usually arisen within the context of some North American and European societies. Outside these societies, however, the divergence between samples and population descriptors is also problematic. When samples labeled “European”, “African”, or “Asian” are in fact taken from a limited number of groups, without taking into account the significant diversity within each region, such broad terms are unlikely to have any scientific meaning, at least from the perspective of genetics at the global level [20, 21]. Moreover, the research results may be taken as supporting the classic “racial” categories, with any discovered “differences” misinterpreted as genetically determined “racial differences.”
The importance of the distinction between race and ethnicity cannot be overemphasized, as the latter pays close attention to (presumably) shared cultural factors such as language, diet, and religion [22]. When the contribution of environmental as well as genetic factors to diversity within each continental region is considered, the scientific validity of using such broad terms to describe samples becomes even more questionable.
In contrast to the above tendency to prefer broad terms, an influential study based on genome-wide 50K SNP data revealed detailed patterns of genetic differentiation within “Asians” [23]. The genetic ancestry of most populations was associated with ethnic and linguistic affiliations. Along the same lines, an analysis of 7,003 individuals from across Japan revealed regional variation within the “Japanese” population. At one level, in a principal component analysis (PCA) plot based on genome-wide 140K SNP data, most individuals fell into two main clusters, one consisting of individuals sampled in mainland Japan and the other of those sampled in Okinawa. On closer inspection, even among mainland Japanese, statistically meaningful genetic differentiation was found among individuals from different regions, such as Tohoku, Kanto, Kinki, and Kyushu [24].
The above study highlights that even populations traditionally presumed to be highly homogeneous may show local genetic differentiation, which makes broader population terms less scientifically or clinically relevant. Researchers should strive to select terms that, as far as possible, reflect the sample population and the nature of each study. Because genetic subpopulation structure is still largely unknown, sampling without considering the specifics of the subject population could produce false-positive associations for disease risk alleles. In addition, differences in whole-genome sequences between individuals belonging to different populations should not be overgeneralized and misinterpreted as population differences.
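The way unrecognized subpopulation structure can produce a false-positive risk allele can be illustrated with a small worked example. The sketch below is not drawn from any of the studies cited here: the subgroup names, the allele counts, and the helper names (`AlleleCounts`, `pool`, `mantel_haenszel_or`) are all hypothetical and chosen only for illustration. It assumes two subgroups that differ both in allele frequency and in the proportion of cases sampled, and in which the allele has no association with disease within either subgroup. It then compares the crude odds ratio obtained by pooling the samples under one broad label with a stratified (Mantel-Haenszel) estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCounts:
    """2x2 allele-count table for one group (hypothetical numbers only)."""
    case_risk: int      # risk alleles observed in cases
    case_other: int     # other alleles observed in cases
    control_risk: int   # risk alleles observed in controls
    control_other: int  # other alleles observed in controls

    @property
    def total(self) -> int:
        return (self.case_risk + self.case_other
                + self.control_risk + self.control_other)

    def odds_ratio(self) -> float:
        """Crude allelic odds ratio, cases versus controls."""
        return (self.case_risk * self.control_other) / (
            self.case_other * self.control_risk)


def pool(strata):
    """Merge subgroups as if they were one population under a broad label."""
    return AlleleCounts(
        case_risk=sum(s.case_risk for s in strata),
        case_other=sum(s.case_other for s in strata),
        control_risk=sum(s.control_risk for s in strata),
        control_other=sum(s.control_other for s in strata),
    )


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio, which adjusts for subgroup membership."""
    numerator = sum(s.case_risk * s.control_other / s.total for s in strata)
    denominator = sum(s.case_other * s.control_risk / s.total for s in strata)
    return numerator / denominator


# Hypothetical subgroup A: allele frequency 0.10 in both cases and controls,
# and few cases sampled (100 cases, 400 controls; two alleles per person).
region_a = AlleleCounts(case_risk=20, case_other=180,
                        control_risk=80, control_other=720)

# Hypothetical subgroup B: allele frequency 0.50 in both cases and controls,
# and many cases sampled (400 cases, 100 controls).
region_b = AlleleCounts(case_risk=400, case_other=400,
                        control_risk=100, control_other=100)

strata = [region_a, region_b]
pooled_or = pool(strata).odds_ratio()
stratified_or = mantel_haenszel_or(strata)

print(f"Subgroup A odds ratio: {region_a.odds_ratio():.2f}")
print(f"Subgroup B odds ratio: {region_b.odds_ratio():.2f}")
print(f"Pooled odds ratio:     {pooled_or:.2f}")
print(f"Stratified odds ratio: {stratified_or:.2f}")
```

With these illustrative counts the odds ratio is 1.0 within each subgroup, yet pooling gives an odds ratio of about 3.3, an apparent "risk allele" created entirely by the sampling design. The stratified estimate returns to 1.0, which is one reason precise, sample-level descriptors matter for analysis as well as for reporting.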
Through our dialogue, it became apparent that the ways in which descriptors are selected sometimes differ between specialized fields. For example, researchers in physical/biological anthropology have a relatively long history of population genetic studies of the local residents from whom they obtain samples, and they accumulate information on various populations from the perspective of long-term human evolution. Medical studies, on the other hand, are more concerned with how genetic studies can contribute to the diagnosis and treatment of diseases. Disease gene surveys often take samples from patients at hospitals without controlling for such factors as current place of residence or generational continuity in each place. Such disciplinary differences in research purposes and methods have sometimes created different understandings of, and varying levels of attention to, the issue of population description. This is one example of why dialogue between scholars in different disciplines is indispensable in considering appropriate population descriptors.
There has been growing discussion of the “co-production” of knowledge through the interplay between science and society [25]. The popular press is often blamed for the use of inappropriate or imprecise terms in the context of population genetic studies, whereas many scientists may believe that they take adequate precautions when describing study samples, defining populations, and presenting discussions based on their research results. However, evidence indicates that imprecise and less-than-ideal descriptors are introduced throughout the research communication process [26]. If these descriptors are not carefully chosen, they create the potential for confusion both within the scientific community and in wider society, leading to research inefficiencies and various social, ethical, and clinical problems [7].
What, then, would be a more desirable way to describe populations under study? The key is to use population descriptors that are scientifically valid for the particular study. As a first step, we recommend the use of population descriptors with more specific characteristics, such as geographical location and ethnic labeling, as previously attempted, albeit imperfectly [27], by research initiatives like the International HapMap Project [28]. This recommendation is based on the fact that various studies demonstrate a strong correlation between genetic distances and distances based on geography as well as ethnic affiliations [23, 29]. This is, we believe, a better solution, but not a final one. Even when scientists choose more specific terminology, they still have to explain the rationale behind the descriptors and the rules they employ in selecting the samples and defining the population.
Finally, the importance of education for undergraduate and graduate students, as well as for young trainees in human genetics and medicine, cannot be overemphasized. There is an urgent need to develop appropriate curricula that incorporate these ethical and social issues, in order to effectively change the awareness of scholars and practitioners in the near future.