The development of a brief and objective method for evaluating moral sensitivity and reasoning in medical students

Background Most medical schools in Japan have incorporated mandatory courses on medical ethics. To date, however, there is no established means of evaluating medical ethics education in Japan. This study aims 1) to develop a brief, objective method of evaluating moral sensitivity and reasoning; 2) to administer a test battery combining the PIT and the DIT to medical students who are either currently in school or who have recently graduated (residents); and 3) to investigate changes in moral sensitivity and reasoning between school years among medical students and residents. Methods Questionnaire survey: Two questionnaires were employed, the Problem Identification Test (PIT) for evaluation of moral sensitivity and a portion of the Defining Issues Test (DIT) for moral reasoning. Subjects consisted of 559 medical school students and 272 residents who had recently graduated from the same medical school, located in an urban area of Japan. Results PIT results showed an increase in moral sensitivity in 4th and 5th year students followed by a decrease in 6th year students and in residents. No change in moral development stage was observed. However, DIT results showed a gradual shift in moral decision-making concerning euthanasia between school years. No significant correlation was observed between the PIT and DIT questionnaires. Conclusion This study's questionnaire survey, which incorporates both PIT and DIT, could be used as a brief and objective means of evaluating medical students' moral sensitivity and reasoning in Japan.


Background
Most medical schools in Japan have incorporated mandatory courses on medical ethics [1]. Course objectives typically include increasing students' understanding of ethical norms and resolving ethical dilemmas in clinical settings. To date, however, there is no established means of evaluating medical ethics education in Japan.
Concurrent with the "coming of age" of medical ethics [2], a variety of measures have been developed to evaluate pedagogical methodology, including standardized tests, subjective reports and clinical vignettes [3][4][5]. Some studies have found that course structure, design and curriculum influence the degree to which students' ethical reasoning skills change during courses. In particular, a great deal of attention has focused on how to evaluate a student's ability to recognize and assess the ethical problems encountered in clinical practice and research.
Rest and colleagues constructed the four-component model in order to develop an approach to evaluating moral development [6]. This model describes the psychological process of moral behavior in four steps: moral sensitivity, moral reasoning, moral commitment (decision) and moral action. Based on this four-component model and Kohlberg's theory, Rest and colleagues developed the Defining Issues Test (DIT) as an instrument to evaluate the relative degree of moral reasoning. The DIT is a multiple choice, group-administered computer-scored measure [6,7]. Yamagishi has translated and adapted the DIT to the Japanese context [8].
The DIT has been described as the best tool available for measuring moral reasoning in the ethics field; it is a reliable and valid instrument for measuring moral development [9][10][11]. Despite its widespread use, the DIT was not specifically designed for medical ethics. Accordingly, the original DIT as it stands may not be well suited to medical ethics education.
In light of this, Hebert and colleagues developed a questionnaire to measure medical students' and professionals' ethical sensitivity [12,13]. Nishimura and colleagues later translated and adapted this questionnaire to the Japanese clinical setting, currently referred to as the Problem Identification Test (PIT) [14].
With the aim of developing a fitting evaluation measure for moral sensitivity and reasoning, we designed a test battery that incorporates the Japanese version of the ethical sensitivity test (PIT) and the two most relevant vignettes to medical ethics from the DIT. It aims to measure the first two steps of the four-component model, moral sensitivity and moral reasoning [6].
This research is the first attempt to develop a systematic evaluative survey for medical ethics education among medical school students and residents in Japan. This multi-step research project was designed with the following objectives: 1. To develop a brief, objective method of evaluating moral sensitivity and reasoning, 2. To administer a test battery combining the PIT and the DIT to medical students who are either currently in school or who have recently graduated (residents), 3. To investigate changes in moral sensitivity and reasoning between school years among medical students and residents.

Methods
Subjects consisted of 559 medical school students from one urban medical school (86 first year students, 67 second year students, 100 third year students, 102 fourth year students, 95 fifth year students, and 109 sixth year students) and 272 residents who had graduated from the same medical school within the previous three years, for a total of 831 subjects. Medical schools in Japan are six years in length and the vast majority of students enter after high school. This medical school's second year curriculum includes a short course on the "introduction to medicine", which consists of discussion and lectures on medical ethics. Bedside learning starts in the fourth year.
We used two self-administered questionnaire tests, the Problem Identification Test (PIT) [Appendix 1, See Additional File 1] and the first two vignettes of the Japanese version of the DIT [Appendix 2, See Additional File 1]. The reliability and validity of both questionnaires have been previously examined [8,14]. The questionnaires were sent to subjects by mail with a cover letter stating that the survey was being conducted for research and that participation was voluntary. The subjects were asked to fill out the two questionnaires within 25 minutes. Responses were mailed back anonymously. This survey was conducted in February of 1996.
Statistical analysis was performed using SPSS 10.0J. We employed Pearson's correlation coefficients, chi-square tests and one-way ANOVA followed by Tukey's test; all with a level of significance of 0.05.
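For readers unfamiliar with the three statistics named above, the following is a minimal illustrative sketch in plain Python. It is not the SPSS procedure used in this study; the function names are our own, any inputs are hypothetical, and p-values and Tukey's post-hoc comparisons are deliberately omitted because they require reference distributions that a statistics package would supply.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, e.g. one row per
    school-year group and one column per response option (hypothetical layout).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; `groups` is a list of score lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

In this sketch, each school-year group would contribute one list of scores to `one_way_anova_f` and one row of response counts to `chi_square_statistic`; the resulting statistics would then be compared against the F and chi-square distributions at the 0.05 level.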

PIT
The PIT is based on Hebert and colleagues' approach [12,13]. The Japanese version was adapted to the context of Japanese clinical settings; the original four vignettes were condensed to three, consisting of 1) a Jehovah's Witness who refuses blood transfusion, 2) treatment of a premature infant, and 3) treatment of a terminal patient. The PIT presents these vignettes to subjects and asks them to list all ethical issues related to each case. Instructions emphasize that subjects should only list the ethical issues relevant to each vignette and should not explain how to deal with each case. Each vignette is scored according to the number of issues identified; this number is taken as an indication of problem identification ability. Each vignette encompasses three domains: A) autonomy and patients' rights; B) beneficence and nonmaleficence; and C) justice and contextual features. Key phrases comprising each domain's scoring standards are stated in the appendix. The maximum number of points for each vignette is three each for domains A and B and four for domain C; and 10 for the questionnaire in total.
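The scoring rule described above can be sketched as follows. This is an illustrative assumption rather than the scoring key itself, which is given in the appendix: we assume a rater has already matched a respondent's free-text answers against the key phrases and counted the credited issues per domain, and that the stated maxima act as caps. The names `DOMAIN_MAX` and `score_vignette` are hypothetical.

```python
# Maximum credited issues per domain for one vignette, as stated in the text.
DOMAIN_MAX = {"A": 3, "B": 3, "C": 4}


def score_vignette(issues_by_domain):
    """Return per-domain scores and their sum for one vignette.

    `issues_by_domain` maps a domain letter to the number of issues a rater
    credited in that domain (hypothetical input format). Counts above the
    domain maximum are capped; missing domains score zero.
    """
    scores = {}
    for domain, cap in DOMAIN_MAX.items():
        credited = issues_by_domain.get(domain, 0)
        scores[domain] = min(credited, cap)
    scores["total"] = sum(scores[d] for d in DOMAIN_MAX)
    return scores
```

For example, a respondent credited with two issues in domain A and one in domain C would receive a vignette score of three under these assumptions.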

DIT
The Japanese version of the DIT consists of six vignettes. Each vignette has 11-12 domains considered to be necessary to solve an ethical dilemma. In this survey, we implemented the first two vignettes. These two are considered to be the most relevant to medical ethics: 1) whether or not to steal medicine in order to save one's wife's life, and 2) euthanasia for a terminal patient who is experiencing great pain. The DIT is filled out by the subject as follows: 1) first, the subject chooses the most suitable action (decision); 2) he or she then rates the reasons for that action by degree of significance; and 3) lastly, the subject lists the four most significant reasons in order. The DIT was scored according to the instructions stated in Appendix 2 (See Additional File 1). DIT scores provided two values: moral development stage and DP values. DP values (DP2, DP3, DP4, DP4.5, DP5) correspond to each moral development stage (e.g. stage 3 = DP3). As described in Appendix 2 (See Additional File 1), calculated DP values reflect the percentage of respondents in each stage of moral development within their own particular school year. Accordingly, the sum of DP values for each school year of medical students or among residents is 100%. Calculated DP values provide a lens to better distinguish trends between moral development stage and school years. DIT analysis produced results concerning 1) the change in decision-making between enrolled medical students and residents, 2) the moral development stage (moral reasoning), and 3) DP values; refer to Appendix 2 (See Additional File 1).
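Read literally, the description of DP values above amounts to a within-group percentage breakdown by stage. The following minimal sketch illustrates that reading only; the authoritative procedure is the one in Appendix 2, and the stage labels, function name and input format here are assumptions made for illustration.

```python
from collections import Counter

# Stage labels matching DP2, DP3, DP4, DP4.5 and DP5 in the text.
STAGES = ["2", "3", "4", "4.5", "5"]


def dp_values(stages_in_group):
    """Percentage of respondents at each stage within one school-year group.

    `stages_in_group` is a list with one stage label per respondent
    (hypothetical input format). The returned percentages sum to 100.
    """
    counts = Counter(stages_in_group)
    n = len(stages_in_group)
    return {f"DP{stage}": 100.0 * counts[stage] / n for stage in STAGES}
```

Computing `dp_values` separately for each school year and for residents would yield one row of percentages per group, which is the form in which trends across school years can be compared.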

Results
PIT scores are shown in Figure 1. A significant difference between groups was seen in Domain B (beneficence and nonmaleficence). Scores remained constant across first, second and third year students, then rose significantly (p < 0.05) in fourth and fifth year students. However, scores dropped in sixth year students and residents. A similar trend was apparent in the total group (p < 0.1). There was no statistically significant change between groups in scores associated with Domains A and C.
The DIT also indicated several statistically significant differences between groups. In Vignette 1, which pertains to the stealing of medicine in order to save one's wife, first year students had the highest proportion of "it is better to steal" responses (54.0%). This percentage decreased gradually with each succeeding group, reaching 20.3% among residents. As seen in Figure 2, this trend was mirrored by an increase in the contrary response.

[Figure: PIT scores by school year]
In Vignette 2, which pertains to euthanasia for a terminal patient who is experiencing great pain, first year students had the highest proportion of "it is better to prescribe" responses (48.0%). This percentage decreased gradually with each succeeding group, to 26.4% among sixth year students and 25.7% among residents. Conversely, the opposite trend was seen for the response "it is better not to prescribe": 20.0% of first year students gave this response, and the proportion gradually increased to 52.7% among residents. As shown in Figure 3, the number of subjects who responded "unsure" was consistent between groups. Chi-square tests showed a significant difference between groups in both Vignette 1 (p < 0.01) and Vignette 2 (p < 0.05).
Table 1 displays the results of moral development stage for each vignette and in total. No statistically significant differences were observed. DP values are labeled according to stage (DP3 signifies stage 3 development). Significant differences were observed in DP3 and DP4. As shown in Table 2, DP3 values decreased with school year. Conversely, DP4 values increased with school year, as did DP4.5 values (not statistically significant). There were no apparent differences in DP2 and DP5 values between groups.
Table 3 shows correlation coefficients between PIT scores and DIT stages among the entire sample. The correlation was low and not statistically significant. This correlation analysis was also performed within each school year group; no significant correlation was found (data not shown).

Discussion
The present study serves as the first exploratory trial of a systematic evaluation of medical ethics education in Japan. This study's test battery combining the PIT and the DIT, which measures the first two steps of the four-component model (moral sensitivity, assessed here as problem identification, and moral reasoning), could serve as an objective and brief method for assessing courses' varying designs, methods, and curricula.

Concerning the significance of combining the PIT and DIT
As indicated in the Background, the combination of the two tests is conceptually valid since they are theoretically measuring different aspects of Rest's four-component model. Calculations of correlation coefficients between PIT and DIT scores found no items to be significantly correlated. Our finding of no significant correlation may lend additional support to the hypothesis that the DIT and PIT each measure different variables. Nonetheless, this lack of correlation is possibly related to the fact that the PIT was designed for medical settings while the DIT was originally created without such specificity. Further validation studies may be needed.

Concerning the results of the PIT
The PIT results described a significant increase in fourth and fifth year medical students for Domain B and a decrease among sixth year students and residents. This trend of decreasing values amongst graduates is consistent with research previously conducted [12,15]. The decrease in moral sensitivity is likely to arise from residents being too busy to think about ethics and sixth year students being too busy preparing for the national exam. While these findings may be similar to previous studies, we propose that a positive interpretation may be possible. For example, as subjects accumulate clinical experience, they begin to sense and intuitively resolve ethical problems without identifying them as, per se, ethical dilemmas. That is, students begin to react to so-called ethical problems in an ethically correct manner without having to second-guess. An exemplary case is that of informed consent; residents may no longer consider it an ethical dilemma.
The increase in PIT scores among fourth and fifth year students suggests that the onset of bedside learning during one's fourth year has an effect on students' ethical awareness. Although students enroll in a medical ethics course at the end of their second year, we surmise that the course's teachings may be better understood once students begin to attend to patients. Further research is needed concerning possible factors to this increase in moral sensitivity among mid-year medical students.

Concerning the results of the DIT
Decision-making in Vignette 2 changed significantly between school years. This change may reflect a more passive attitude regarding euthanasia as a result of students' and residents' practical experience with clinical medicine. Decision-making in Vignette 1 also showed a significant gradual change between school years (age groups). Overall, our results showed that moral development stage did not differ between age groups; this finding is consistent with previous studies [16][17][18][19].
A significant difference in DP3 and DP4 values was recognized across school years. Kohlberg's theory, which contends that moral development increases with age, may be able to explain this divergence in DP3 and DP4 values among respondents. Kohlberg's work, and thus the theories upon which the DIT is based, have been widely criticized and discussed [20]. Kohlberg largely equates moral reasoning with justice reasoning. The works of Noddings and Gilligan indirectly draw attention to this distinction by emphasizing an ethics of care in contrast to an ethics of justice in accounting for morality [21,22]. In short, while moral reasoning is applicable to the milieu of medical ethics, justice reasoning may not be.
Several researchers have criticized Kohlberg's notions on the grounds that their justice-laden framework is ill-suited to the Japanese cultural background, where interpersonal relationships are highly valued [8]. Accordingly, an environment where interpersonal relationships and consideration of peripheral circumstances are prioritized over reasons of justice gives Kohlberg's 3rd and 4th stages of moral development greater weight than the 5th and 6th stages.
In light of the above, we surmise that DIT results regarding decision-making carry more significance than those results pertaining to simple moral development stage and DP values. While the original DIT may be able to assess moral reasoning in the context of medical ethics to some degree, we contend that changes in subjects' moral thoughts (decision-making) can be evaluated by using the two most relevant vignettes.

Limitations of the present study
Interpretation of results is to some extent limited by the hypothetical character of the scenarios and by the sampling of students and residents affiliated with only one medical school located in an urban area of Japan. Additionally, the response rate for residents was low. This may be in association with respondents' level of interest regarding ethical issues. Further comparative studies are needed between residents in order to investigate this possible factor.
As recognized by Hebert and colleagues, the PIT survey may be incapable of evaluating other aspects of morality including attitudes, skills, facts and formal knowledge [13]. This test battery examines only the first two steps of the four-component model. Further research to develop measures for the other two components is necessary. Lastly, this study is limited by a quantitative approach [23,24], and is cross-sectional rather than longitudinal in design [19,25].

Conclusion
This study has utilized both the PIT and DIT with the aim of developing an objective and brief method for evaluating medical students' moral sensitivity and reasoning. No significant correlation was found between PIT scores and DIT stages. PIT results demonstrated that Domain B scores (beneficence and nonmaleficence) significantly increased in fourth and fifth year students, then dropped again in sixth year students and in residents.
Although changes in moral development stage were not statistically significant, DIT results highlighted substantial differences in decision-making (i.e. euthanasia, theft of medications) between school years.