
“That’s just Future Medicine” - a qualitative study on users’ experiences of symptom checker apps

Abstract

Background

Symptom checker apps (SCAs) are mobile or online applications for lay people that usually have two main functions: symptom analysis and recommendations. SCAs ask users questions about their symptoms via a chatbot, give a list of possible causes, and provide a recommendation, such as seeing a physician. However, it is unclear whether the actual performance of an SCA corresponds to the users’ experiences. This qualitative study investigates the subjective perspectives of SCA users to close the empirical gap identified in the literature and answers the following main research question: How do individuals (healthy users and patients) experience the usage of SCAs, including their attitudes, expectations, motivations, and concerns regarding their SCA use?

Methods

A qualitative interview study was chosen to clarify the relatively unknown experience of SCA use. Semi-structured qualitative interviews with SCA users were carried out by two researchers in tandem via video call. Qualitative content analysis was selected as the methodology for the data analysis.

Results

Fifteen interviews with SCA users were conducted and seven main categories were identified: (1) Attitudes towards findings and recommendations, (2) Communication, (3) Contact with physicians, (4) Expectations (prior to use), (5) Motivations, (6) Risks, and (7) SCA-use for others.

Conclusions

The aspects identified in the analysis emphasise the specific perspective of SCA users and, at the same time, the immense scope of different experiences. Moreover, the study reveals ethical issues, such as relational aspects, that are often overlooked in debates on mHealth. More empirical and ethical research is needed, as awareness of the subjective experience of those affected is an essential component of the responsible development and implementation of health apps such as SCAs.

Trial registration

German Clinical Trials Register (DRKS): DRKS00022465. 07/08/2020.


Background

Imagine waking up with a sore throat and other symptoms of a cold. You pick up your smartphone and open the symptom checker app (SCA) that a friend just recommended. The chatbot asks you some questions about your symptoms, including their duration, the intensity of your pain, and other relevant factors. As you lie in your bed, you click through the questions, and after a few minutes, the SCA provides a list of possible causes and a recommendation to stay home for now. While many people have likely had similar experiences in using such apps, little is currently known about the subjective experiences of SCA users, including their expectations, motivations, or concerns.

SCAs are mobile or online applications that allow lay people to identify possible causes of their symptoms (symptom analysis) and provide recommendations on whether to seek health care (self-triage). Most SCAs are freely available. Typically, an SCA prompts the user to answer a series of questions about their symptoms through a structured questionnaire or chatbot interface. ‘Chatbots are systems that are capable of conversing with users in natural language in a way that simulates the interaction with a real human’ [1]. In the case of SCAs, the app asks questions via text that the user answers either by selecting predefined choices or in a free-form manner. Strictly speaking, SCAs are not fully ‘conversational agents’ but rather a kind of structured questioning tool. Once the information is collected, the app generates a list of possible causes along with recommendations, such as seeing a physician in a timely manner or staying at home. The introductory example illustrates that lay people can easily use SCAs to assess their symptoms at home, without having to consult medical professionals. SCAs are part of the increasing global trend of online health information-seeking behaviour [2]. The search for information on health and illness outside the traditional health care system that has emerged with the internet has been critically examined, for example as an individual ‘lifestyle’ issue [3], with regard to its influence on the patient-physician relationship [4], or with regard to the underlying models of information-seeking behaviour [5]. Despite the vast amount of literature on mobile or online health applications in general, there is a lack of (empirical) research focused specifically on SCAs [6].

There are few empirical studies specifically on SCAs, in particular on their diagnostic and triage accuracy. A 2015 study showed that symptom checkers had deficits in symptom assessment and triage advice [7]. Seven years later, another study that refers back to Semigran et al. showed that the triage performance of SCAs did not improve on average and even decreased in two examples [8]. However, SCAs are very diverse. One study compared the accuracy of two SCAs in diagnosing inflammatory rheumatic diseases with physician diagnosis and concluded that the diagnostic accuracy of SCAs is limited [9]. A study by Fraser et al. investigated the diagnostic and triage accuracy of a symptom checker used by patients at an emergency department [10]. The symptom checker’s diagnoses were compared with those of emergency physicians and the final diagnoses from the emergency department. This study concluded that SCAs can provide acceptable diagnostic accuracy for patients with various urgent conditions [10]. One study by Gilbert et al. compared the extent of disease coverage, the accuracy of the suggested diseases, and the appropriateness of the recommendations from diverse SCAs with general practitioners (GPs) [11]. In this study, no SCA outperformed the GPs. Although SCAs promise to improve diagnostic processes, reduce misdiagnosis, and guide patients more effectively through health care systems, there is currently little evidence to support these claims.

In addition, people’s motivations for using SCAs and whether they find them useful have been little studied. Furthermore, it is unclear whether people use SCAs to supplement medical advice or as a substitute for in-person physician visits. A pilot study by Miller et al. examined the usability, acceptability, and utility of an SCA in primary care [12]. In this study, the symptom checker was evaluated as very user-friendly and acceptable by patients in primary care [12]. One study by Meyer et al. examined patients’ experiences using an online symptom checker [13]. Despite ongoing concerns and little empirical evidence about the accuracy of SCAs, patients in the study by Meyer et al. perceived the symptom checker as a useful tool for diagnosis [13]. Further studies investigated the perspectives of young adults [14] and older adults [15] on symptom checkers, as well as perspectives in specific contexts, for example in rheumatology [9, 16] or at emergency departments [10]. It is crucial to know and understand the users’ perspectives on SCA use, as these insights can show how SCAs can be improved and implemented in a responsible way.

There are sporadic debates on the ethical and social aspects of SCAs. It is debated, for example, who should have access to this new technology, whether these apps might empower users, and whether they are an acceptable substitute for a medical consultation. It is also debated whether SCAs lead to over- or under-triage and whether (legal) regulation is needed [6]. While the ethical debate about SCAs might be sporadic, there is a growing ethical debate surrounding mHealth and health apps in general [17,18,19,20], and it should be remembered that SCAs are situated within this context. Although a wide variety of perspectives play a role in these ethical debates, one of the most important perspectives, that of the users, remains underrepresented. The discussions on the ethical implications of SCAs are often poorly supported by empirical data. Furthermore, the subjective perspective of the users and the ethical aspects are not yet interlinked.

This qualitative study investigates the subjective perspectives of SCA users in order to address the ethical and social aspects of SCA use. The aim is to bridge the empirical gap in the existing literature by answering the following main research question: How do lay persons (healthy users and patients) experience the usage of SCAs, including their attitudes, expectations, motivations, and concerns regarding their SCA use? The study focuses on SCAs that are designed for lay persons (healthy users and patients) rather than for healthcare professionals and that do not focus exclusively on specific types of illness.

Methods

Study design

This qualitative study is part of the project CHECK.APP [21]. This project investigates the impact of SCAs at different levels of the healthcare system (micro, meso, and macro level), from different disciplines (ethics, law, sociology, social medicine, occupational health medicine, and health services research), and from multiple perspectives ((non-)users, general practitioners, and healthcare experts) [21]. The CHECK.APP project has distinct but overlapping study phases, starting with a comprehensive literature review on the ethical, legal, and social aspects of SCAs [6], followed by a representative survey on the usage of SCAs in Germany, a self-observational diary study combined with qualitative user interviews, and lastly, qualitative interviews with GPs and healthcare experts. This paper focuses on the data from the individual user interviews, highlighting users’ experiences, expectations, and assessments.

The results of the preceding study phases (review, survey, and diary study) influenced the objectives and design of the qualitative interview study. Based on the empirical research gap identified by the scoping review [6], an explorative study approach was chosen. The lack of empirical data on subjective user perspectives in the literature on SCAs informed the research objective: to gain a first-hand understanding of those using SCAs. Accordingly, an explorative qualitative design, namely a qualitative semi-structured interview study, was chosen to clarify the relatively unknown experience of SCA use.

The main research question is: How do individuals using SCAs experience their usage? This overarching question is divided into five sub-questions: What are the users’ expectations of SCAs? How do users assess the findings and recommendations of the SCA? How do the users perceive the communication with the app? How do they perceive the role of the app in the context of a physician’s visit? What risks do users notice when using SCAs? These sub-questions were derived from the research within the CHECK.APP project, i.e., from previous studies and regular discussions within the research group. According to these five sub-questions, the framework for the interview study and the subsequent analysis was structured around five themes: (i) users’ expectations regarding SCA use; (ii) assessment of findings and recommendations; (iii) communication; (iv) role of the app in the context of a physician’s visit; (v) risks.

Based on these five themes, a semi-structured qualitative interview guide was developed in a stepwise process. In the first step, tentative questions for the interviews were formulated using the results of the literature review and the preceding survey. In particular, the participants’ notes from the diary study were used as a basis for developing the interview questions. In the second step, all collected questions were checked for their suitability, e.g., whether they were relevant to the objectives of the study and the main research question. In the last step, the relevant questions were sorted and grouped according to the five themes. The resulting interview guide had around 10 questions and a planned duration of about an hour. The guide starts with open-ended, more narrative questions and ends with more specific questions on normative issues (see Supplement 1: Interview Guide English). Two pilot tests were conducted in December 2021. These two pilot interviews were included in the final analysis, as they resulted in only minor modifications to the interview guide. The final semi-structured qualitative interview guide was then applied in the interviews with SCA users.

Sample and recruitment

The CHECK.APP project used different recruiting tools to obtain a representative sample. For the first project phase, German citizens were contacted via mail by an external partner to participate in the representative survey. Additional recruiting was conducted via mailing lists of the University Medicine Tübingen and via social media. Inclusion criteria for participation in the CHECK.APP project were the ability to give consent and German language skills of at least level B1 of the Common European Framework of Reference for Languages.

For the diary study, the sample was restricted to participants who had previous experience with SCAs, specifically with the symptom checker Ada [22]. Ada was used in the qualitative parts of the study as an example of an SCA, as it is one of the most popular SCAs in Germany. The participants for the individual interviews were recruited from the previous diary study. The content of the diary-based self-observation, SCA usage behaviour, medical indication, and socioeconomic factors were used as criteria for sampling. The sample size was determined based on the 5D model of information power by Malterud et al. [23].

All study participants were informed orally and in written form about the research goals of the CHECK.APP project, the study procedure (and its parts), and their rights. The participants received financial compensation for participating in the individual interviews and a subsequent member check.

Data collection and analysis

Due to the COVID-19 pandemic, the semi-structured qualitative interviews were carried out via video call (Zoom Meetings). The interview study was conducted by two male and two female researchers (RM, MK, RK, and one assistant), who worked in the CHECK.APP project at the time of the study and are trained in qualitative methods. The interviewers had different credentials (PhD and MD) and various disciplinary backgrounds (Medicine, Philosophy/Ethics, and Social Sciences). Each interview was led by two researchers from different disciplines in tandem. While one researcher asked the questions, the other researcher took notes. The researchers discussed the interviews after their completion and in regular project meetings. All interviews were audio-recorded and transcribed verbatim by an external partner.

The interview transcripts were randomly distributed among three researchers (RM, MK, and one assistant), who analysed the transcripts using content-analytical procedures. The methodology selected for the data analysis was qualitative content analysis according to Mayring [24]. This data analysis technique provides a systematic way of reducing and synthesising a wide range of data. Its central idea is to assign categories to text passages through a qualitative-interpretative act. The technique follows a systematic procedure and strict content-analytical rules, combining deductive and inductive category development. The aim was to condense the large amount of material and filter out the essential contents via reduction and progressive generalisation.

The three researchers worked through the interview transcripts with the aid of the software program MAXQDA and with the previously developed, deductively obtained system of the five themes: (i) users’ expectations regarding SCA use; (ii) assessment of findings and recommendations; (iii) communication; (iv) role of the app in the context of a physician’s visit; (v) risks. Before starting the analysis, the units of analysis and the level of reduction and generalisation were defined. In the first step, the researchers independently condensed text passages relevant to the research questions into simplified short paraphrases (paraphrasing). In the second step, the contents of the paraphrased passages were generalised at the previously defined level of abstraction (generalisation). In the third step, the generalised passages were reduced and summarised into central categories (reduction). While the researchers discussed preliminary results, uncertainties, and emerging questions during the coding process, each researcher worked through the transcripts individually. In the last step, all developed categories were summarised by one researcher (RM) into a final category system, which was then checked back against the original material and through discussions among the CHECK.APP researchers.

Coder influence was controlled for by involving three researchers (RM, MK, and one assistant) with different professional backgrounds in the data interpretation and through repeated discussions within the CHECK.APP project during the analysis. Additionally, midway through the analysis, a digital member check was conducted with participants of the CHECK.APP study to test the validity of the results. During the member check, we presented our interim results to the interview partners and asked them for their feedback. The member check served as a tool to validate partial results of the interview analysis while checking the relevance of the results for the people concerned. Internal methods workshops in the CHECK.APP project and an external advisory board served as further tools for quality control.

The study was conducted in accordance with the Declaration of Helsinki [25]. Ethical approval was obtained from the ethics committee of the University of Tübingen. Research ethics requirements, such as informed consent and data protection, were carefully considered. The reporting of this qualitative study is oriented towards the COREQ checklist for reporting qualitative studies [26].

Results

Fifteen interviews were conducted between January and March 2022 in Germany via video call. The interviews lasted an average of 46 min, ranging from 35 to 76 min (median 41 min). The participants’ ages ranged from 20 to 69 years, with most participants between 20 and 29 years. Nine interviews were conducted with male SCA users and six with female users. The recruiting tools did not lead to the inclusion of trans- or intersex persons in the sample. Most of the participants were well educated (‘Abitur’) and lived in rural areas. The participants’ characteristics are shown in Table 1.

Table 1 Participants’ Characteristics

We analysed all fifteen interviews. During the coding process, the framework with the five predefined themes was modified and expanded to seven categories (in alphabetical order): (1) Attitudes towards findings and recommendations, (2) Communication, (3) Contact with physicians, (4) Expectations (prior to use), (5) Motivations, (6) Risks, and (7) SCA-use for others. The distinction between attitudes, expectations, and motivations had been developed deductively. The differences, for example between an attitude and a motivation, were not easily discernible in the interview material. Expectations can be understood as a transverse category, because they are implicitly reflected in all categories. Nevertheless, we were explicitly interested in the users’ expectations prior to their SCA use, as opposed to their attitudes after their SCA use. Expectations, in our understanding, refer to the app (e.g., “The SCA should do this or that…”). By motivation, we understood statements that refer to the users themselves (e.g., “I use SCA to….“).

During the process of categorisation, up to five levels of subcategories were identified and a total of 1052 codes were assigned. Table 2 shows all categories and subcategories up to the fifth level, as well as the frequency of their overall occurrence. Selected study results will be presented in the following sections according to the seven main categories and, subsequently, selected aspects will be discussed.

Table 2 Categories and Subcategories

Attitudes towards findings and recommendations

The study revealed that those using SCAs assess the findings (in the following we use “findings” and “symptom analysis” synonymously) and triage-advice provided by the app (in the following “recommendations”) very heterogeneously, depending, for example, on their chosen criteria for assessment. This first main category “Attitudes towards findings and recommendations” encompasses seven subcategories (in alphabetical order): 1.a) Adherence (to recommendations), 1.b) Advantages of SCA use, 1.c) Criteria for assessing findings, 1.d) Disadvantages of SCA use (as compared to other sources), 1.e) Effects on users’ perceptions, 1.f) (No) perception of findings as a diagnosis, 1.g) Personal factors influencing assessment.

Regarding the subcategory 1.a) Adherence (to recommendations), the interview participants described both behaviours: following or not following the app’s advice. Whether the recommendations were followed (or not) depended on the given symptoms, the listed diseases, one’s own experience (e.g., a family history of the disease), the assumed probability that the findings were correct, and one’s own assessment. Reasons for following the recommendations were personal reasons (such as time resources), individual suffering (e.g., current pain), the urgency and severity of the listed diseases, the plausibility of the SCA results, and fear (e.g., of worsening symptoms). Reasons for not following the recommendations were also personal reasons, including long waiting times for in-person appointments, a lack of confidence in the SCA results, a “wait and see” attitude, and perceived implausibility, incorrectness, or unspecificity of the SCA results. A lack of acuteness and the absence of an emergency were also stated as reasons for not following the SCA recommendations. For example, one participant said that she would have disregarded the app’s recommendation had she been suffering from tonsillitis. The reason she gave was that tonsillitis, not being life-threatening in her view, did not warrant a visit to a physician or an emergency room. Consequently, she cited the lack of necessity as her reason for not following the app’s advice.

“In the case of tonsillitis, it is nothing life-threatening. I then have pain in my throat that most likely needs to be treated with antibiotics, but I don’t need to see a doctor or someone who treats me acutely or initiates any emergency measures that I would urgently need in the event of a stroke or heart attack or something. And that’s why I didn’t follow the app’s recommendation for the tonsillitis, because I knew it wasn’t necessary.” (NT 39)

Regardless of whether the recommendations were followed or not, they were considered primarily as an orientation or guidance by the SCA users.

In the interviews, several positive aspects of using SCAs were mentioned, which we summarised under subcategory 1.b) Advantages of SCA use. Here, the participants described confirmation and reassurance through app use. The support the SCA provided regarding decisions and further actions was valued. The expertise and seriousness of the SCA were emphasised, in particular compared to simple googling. Furthermore, information, orientation, and the naming and frequencies of possible causes were highlighted as further positive aspects. Some participants described general satisfaction and confidence with the SCA results and emphasised their reliability. The timeliness of the query and its usefulness as a reminder and for documentation were also described.

In the subcategory 1.c) Criteria for assessing findings, we collected the criteria that users relied on to assess the list of possible causes. Most participants reported that they did not rely only on the list and information provided by the SCA but also turned to other sources of information. Other digital search tools (such as Google), other apps, and third parties (such as friends or family members) were mentioned. One participant reported, for example, that she compared the findings of the SCA with her own family medical history. In addition, many participants reported that they asked for further verification and specification by physicians to confirm (or reject) the symptom analysis of the SCA. For example, one participant regarded the findings of the SCA as a kind of recommendation that should be clarified with a physician, especially in urgent cases. In this user’s assessment of the findings, the physician would provide more certainty than the SCA.

“[…] with the app, it’s more, I would say, like a recommendation for a diagnosis. So, if it were something really serious, I would definitely go to the doctor afterwards and clarify it with him, maybe also show him what the app said. Then he’ll probably make his own medical checks, but I think that in the end, it’s to a certain extent the same, but I wouldn’t put the same certainty [on the app results], as when I go to an expert.” (NT 43)

The plausibility of the findings provided by the SCA was an important factor for the users in their assessment of the app. Participants were more likely to accept results that seemed plausible to them, but were sceptical about the app’s findings and recommendations if the listed causes or conditions were unknown to them or had no connection to their individual experience. Consistency with their knowledge, personal experience and the current individual situation served as further criteria for the plausibility of SCA findings.

“If you had received a diagnosis that was completely different from anything you had heard before, I probably would have used other online tools first and looked for diagnoses to see if something like that even made sense or why it came up. Whenever something came up that was plausible to me, I took it seriously […] if it was consistent. My personal attitude: sceptical towards recommendations if they don’t match other things, or if you’ve never heard them before. Then online search or in other apps, whether this makes any sense.” (NT 47)

Another criterion for users in assessing the app was their own (body) sensation. The interview participants emphasised their own abilities and feelings in assessing the SCA’s findings. One participant, for example, said that he “would put [his] body feeling above the app results” (NT 21). Awareness of one’s own body, trust in one’s own abilities, common sense, and gut feelings were cited by the participants as criteria for assessing the SCA’s findings.

All interview passages in which the app’s symptom analysis and recommendations were described negatively were summarised in the subcategory 1.d) Disadvantages of SCA use (as compared to other sources). Some participants expressed concerns that the SCA findings could be negatively influenced and biased by their own user behaviour. One participant, for example, reported that he tried to use the SCA with the utmost neutrality. He aimed to avoid letting his prior knowledge sway the questions and, therefore, the outcome of the SCA.

“I tried to use the app as neutrally or as unbiased as possible, even if I already had something in my mind. Nevertheless, I tried to answer the questions as they come without being influenced by it, simply because, with my previous knowledge, I don’t want to steer the questions in a certain direction that I like, perhaps, or the way I think it might be.” (NT 44)

Like this participant, many users worried that their inputs could inadvertently influence or “determine” the questions from the chatbot and, accordingly, the findings. SCA users’ emotions, assumptions, and incorrect knowledge were seen as aspects that could lead to incorrect input. According to the users, instead of providing neutral input, users could (more or less unconsciously) influence the SCA’s final findings through their answers and choices. In addition to this concern, the participants also directly criticised the SCA’s findings. It was criticised that already known diagnoses were not recognised by the SCA; concrete examples were migraine, asiderosis, and diverse skin rashes. Further critiques described the SCA’s findings as implausible or one-sided. The participants also complained about the functional logic of the SCA, specifically about its lack of transparency, for example when they could not understand the order of the questions in the chat. Some recommendations of the app were evaluated as too extreme (e.g., going to the emergency room). Some participants referred in this context to the risk aversion inherent in the SCA. Another disadvantage was seen in the “stand-alone character” of the SCA, e.g., that the SCA is not connected with other information services or tools. Further criticisms included an increase in mental stress, the lack of further information on self-treatment options, and a perceived lack of objectivity and reliability of the app’s findings, which in turn undermined trust in them. Some participants found the results to be too nonspecific and the probabilities displayed to be unhelpful, leaving too much room for interpretation.

Some participants also reported effects on their own (body) perceptions through SCA use. These effects were summarised in the subcategory 1.e) Effects on users’ perceptions. One participant described being more aware of his own body and being more reflective about it as a result of SCA use. Another participant said that the use of the app brought some aspects to the forefront that she would not otherwise have paid attention to. In contrast, some participants described that SCA use had no effect on their perception of their own body feelings or symptoms.

In the interviews, there were differing statements about whether the SCA findings could be understood as a diagnosis or not. We summarised these statements in the subcategory 1.f) (No) perception of findings as a diagnosis. Some participants called the app’s findings a diagnosis, saying it would be the same as or at least equivalent to a medical diagnosis. In contrast, other participants stressed that the SCA findings could not be understood as diagnoses and emphasised the differences compared with both their own assessment and a physician’s diagnosis. One participant described, for example: “It was not a diagnosis for me because it did not match my feelings […] I wouldn’t take it seriously without asking a doctor.” (NT 39) The participants expressed opposing views and many uncertainties on this topic. Often, they had no clear opinion and described it rather as a gradual distinction or continuum (e.g., “It is not a complete diagnosis” (NT 31), “to some extent it is a diagnosis” (NT 19)). Whether the SCA’s findings were perceived as a diagnosis depended on various aspects, for example on the severity of the symptoms and the urgency of the treatment. The participants referred to the physician’s judgement, the plausibility of the results, their own abilities, and other sources of information. Very often, the participants perceived the app’s findings as a hint or an impulse towards the “correct” diagnosis, whereby the “correct” diagnosis was understood as one made by a physician. One participant stated, for example: “The SCA gives me an impulse to say, maybe it could go in that direction or that. But at the doctor’s, it’s just that I expect a very specific correct diagnosis.” (NT 21) The judgement by a physician and the specificity of this judgement were seen as prerequisites for a “correct” diagnosis. In addition, the physician’s diagnosis was often assumed to have a certain authority that was not doubted. For example, one participant described the diagnosis as something conclusive: “For me a diagnosis is such a final thing, so when someone goes to the doctor and gets a diagnosis, you don’t doubt it, that’s just the way it is.” (NT 43) Aspects of a complete or correct diagnosis were the inclusion of other examinations (e.g., physical examination or laboratory procedures) and subsequent steps (e.g., treatment options, referrals to specialists, or medical prescriptions). The participants emphasised that neither can be provided by the SCA.

The participants described the influence of personal factors on their assessment of the app. These statements were summarised in the subcategory 1.g) Personal factors influencing assessment. The participants explained that personal traits, such as a timid character, can influence how users assess the app’s findings and respond accordingly. An example often given in the interviews was that an anxious person would evaluate the app’s findings differently than a risk-taking person and, accordingly, behave differently. The term “hypochondria” often came up in this context. Another example was that the personal patient history can also have an impact on how individual users respond to the app’s findings and recommendations.

Communication

In the second main category “Communication” we collected all interview passages that referred to the communication of the SCA users. This covers the interaction between users and their SCA, but also how the SCA influenced the users’ communication with others, for example with physicians. This main category has four subcategories (alphabetical order): 2.a) Advantages, 2.b) Comparison of communication with a physician, 2.c) Disadvantages, and 2.d) Modes of communication.

All positive aspects described by the participants when asked about their communication with the SCA were summarised in the subcategory 2.a) Advantages. In some interviews, a correlation between good communication with the SCA and a general affinity for technology was noted. The participants reported that the SCA made it easier to deal with unpleasant topics, for example body weight or sexually transmitted diseases. In addition, it was emphasised that the SCA went into detail and could visualise or localise symptoms, for example on an image of the human body in the app. The participants reported that the SCA was impartial, open, and simple to use. Some participants perceived the chat as similar to a human dialogue and felt “noticed” by the app. When describing how the SCA can affect their communication with physicians, participants reported positive as well as negative aspects.

“[…] when I have symptoms, I always look on the internet, in the app or somewhere else. And then I have an idea, what it could be. And then, of course, I go to the doctor with a certain expectation and see if this is confirmed. So, I think that [the SCA] has an influence on the conversation with the doctor because I emphasise certain symptoms more than others. And I can describe my health more accurately than if I hadn’t searched before. Maybe that’s good in some cases because it leads to a quicker result. But maybe it is also negative because I lead the doctor on a wrong track […].” (NT 06)

Like this SCA user, several participants described that using the SCA could lead to the development of expectations that could influence their interaction with physicians. The positive effects would be, on the one hand, better information and a more detailed description of the patient’s symptoms. On the other hand, a potential negative effect could be an unintentional influence on the physician’s judgement.

Many participants compared their communication with the SCA to their communication with a physician. These statements were summarised in the subcategory 2.b) Comparison of communication with a physician. The participants described their dialogues with physicians as more personal and interactive: physicians would listen to them, take their needs seriously, respond to what they say, and ask further questions. Furthermore, physicians can recognise facial expressions and gestures and question what the users say. In contrast, the communication with the SCA was not described as a dialogue, but rather as impersonal or non-human because of its “yes” or “no” questions.

“If you ask me whether it feels as if you’re in a dialogue, then I have to say quite clearly: No. It’s just such a sequence of yes and no questions, which of course also restricts the diagnostic capability, I would say, but, yes, … I think we’re technologically not advanced enough yet to make it better or convincingly human.” (NT 24)

At the same time, the communication with the SCA was reported as rich in details, more comprehensive, and more neutral. In particular, the interview style of the app was perceived as positive by some participants.

“I think this interview process is quite good. In part, you feel very well perceived, simply because of the questions. And if you compare that with a doctor, well, maybe a doctor, maybe a family doctor knows you. Maybe he knows more about you. But otherwise, very detailed questions are asked by the app, which a doctor would probably not ask. At least that’s how it seems to me.” (NT 07)

The many detailed questions of the SCA were described as positive in comparison to physicians, as they would, among other things, help the SCA user to feel perceived. Despite these positive aspects, various negative aspects regarding the communication with the SCA were reported. These statements were categorised in the subcategory 2.c) Disadvantages. In addition to general limitations of communication with an app, the participants criticised that there is no human counterpart and thus no real dialogue with the app or chatbot. Some participants were irritated by the anthropomorphising features of the chatbot and expressed personal uncertainties in this context.

“Sometimes I think it’s a little bit too much, as the app is trying to replicate a person or a real diagnostic conversation, I’d say. Because sometimes, it’s like you would say something in a conversation. I always find that a bit strange with apps, because it’s just an algorithm that calculates something. There’s no one, but it doesn’t really bother me. I think it just seems a bit more friendly. People are probably more willing to answer questions.” (NT 13)

While some users found the SCA’s attempt to mimic a personal conversation slightly disconcerting, it did not discourage future use. Some saw the purpose of this imitation as motivating SCA users to answer the app’s questions. Some participants perceived the communication with the SCA as too technical and text-heavy. The chat’s tediousness and lack of focus were criticised. Some participants spoke of the SCA as a gimmick and described their playful mode of interaction with the app. Most participants reported truthful communication with the app, meaning that they entered truthful information about themselves (e.g., age) and their symptoms.

“I found it quite practical, because things are asked that would perhaps be a bit more unpleasant to answer, at first, if a person would ask them to me. With the app, I’m sitting in front of it alone and can simply choose. That’s why it was easier for me to answer more truthfully.” (NT 42)

In some cases, as in this quote, it was described as easier to answer the SCA’s questions truthfully than questions from a person, especially when it came to unpleasant topics. Being alone with the app seemed to facilitate a greater level of honesty among some users.

Contact with physicians

The third main category, “Contact with physicians”, includes all interview passages concerning contact or a link between the SCA users and physicians. This main category has four subcategories: 3.a) Comparison SCA – physician, 3.b) Decisions regarding medical consultation, 3.c) Role of SCAs at physician visits, and 3.d) SCA as recommendation.

In the subcategory 3.a) Comparison SCA – physician, we collected all statements in which SCAs were compared with physicians. In this category, advantages of physician visits were described, such as the personal, individual, and comprehensive view of physicians. Most participants considered physicians to be professionals. In their view, and in contrast to SCAs, physicians could give reliable diagnoses, prognoses, and prescriptions. In addition, they could recommend subsequent treatment options.

“Well, the doctor is of course professional, he gives more individual information. Especially the family doctor, he knows me. He also knows if there are any pre-existing diseases. And he can of course give a more precise diagnosis, or more precise information about what it could actually be. And above all, how do we treat it? Of course, I can only find that out from my doctor, I didn’t find that out from the SCA. […] in principle I would say that [the SCA] definitely does not replace a visit to the doctor.” (NT 19)

The participants emphasised the relevance of the physical examination and underlined that an app cannot provide such an examination. In contrast, the participants also pointed out advantages of SCA usage. Some participants highlighted that the SCA could not reject a question or request, in contrast, for example, to the long waiting times for a physician visit. Other participants highlighted that the SCA could ask questions in more detail and had a more comprehensive view. Still others emphasised that the SCA was more objective and practical. Even though many advantages were mentioned, the participants emphasised that the SCA was not a substitute for medical care from a trained and licensed physician. In this context, the trustworthiness of physicians and of the SCA played a role.

In the subcategory 3.b) Decisions regarding medical consultation, we collected all statements from the participants deliberating about whether to seek a medical consultation. The participants mentioned various reasons for getting in contact with physicians. Besides their own assessment, acute, very severe, or unknown symptoms as well as suffering were occasions for them to consult a physician. The confirmation or clarification of symptoms by medical professionals and the notification of diseases were further reasons.

“For example, if I now had some symptoms, which really concerns me very strongly, where I now would have thought to myself: Hmm, maybe it would be better now [to go to the doctor]. And then I would have looked in the app and the app would confirm me in this thought, so then the visit to the doctor would be worthwhile.” (NT 37)

Some participants saw the SCA as confirmation of their intention to see a doctor. In general, the participants saw the SCA as a decision-making aid prior to medical consultation.

The third subcategory, 3.c) Role of SCAs at physician visits, includes all interview passages concerning the role of the SCA at physicians’ visits. Various roles were stated throughout the interviews. Many participants regarded the SCA as a useful reminder or tool for documenting their symptoms. The participants described both confirmations of SCA results by physicians and discrepancies between the SCA and physicians. In this context, the influence of the SCA on the user-physician relationship and references to the SCA during the physician’s consultation were addressed by the participants. Some participants described difficulties mentioning the SCA to physicians, while others expressed no concerns. When they did mention it, some participants reported a lack of knowledge and astonishment as reactions from the medical professionals.

“He [the doctor] found it very good that I had already used this app. But he didn’t know about it until then. I don’t know if he looked at it once at some point. He was quite surprised that something like that exists. […] he said: Yes, that’s just future medicine. But in any case, it didn’t have a negative influence. Whether it was positive? I would simply say it was neutral. He thought it was good that I read through the app. Of course, he also said that it will not replace his diagnosis. But, yes. It was in no way negative.” (NT 39)

In general, the participants experienced both positive and negative reactions. When asked what reactions they would wish for, the participants mentioned more discourse and collaboration with the medical professionals. In cases of differences between what the physician says and what the SCA outputs, many participants would trust the physician more than the SCA. Knowing each other plays a role in this assessment, and again the unquestioned nature of the information provided by the physician was mentioned.

“Assuming there is a difference between what the doctor tells me and what the SCA tells me, I would ask more questions based on the information I have from the SCA. […] So I guess my trust in the doctor is relatively high. I’ve known him for a relatively long time and so on. So, I don’t know, I would definitely ask [the doctor], maybe something will come out of it. But I think if it were different, I probably wouldn’t doubt it [….].“ (NT 04)

A few participants said that they would not mention, or would even conceal, their SCA use from their physicians. Reasons for hiding the SCA were expected negative reactions and different treatment. One participant, for example, described his concern that the physician would treat him differently as soon as she found out about his SCA use.

“I haven’t mentioned to any doctor that I’ve already tried this with the app […] because I assume that the doctor then deals with the patient in a completely different way. Well, not every doctor, but maybe some. Because he then [would say] “If you already ask Doctor Google, why then you come to me, then you know that already better than I do”. I don’t want to give that impression, I simply let the doctor do. […] but, as I said, I didn’t mention the use of the app anywhere, not to any doctor.” (NT 19)

Other concerns included rejection and having to justify the visit to the physician. To prevent this, the participants would conceal their SCA use. Another reason for not mentioning their SCA use was its perceived irrelevance, meaning, for example, that physicians would do their own medical checks regardless of the SCA results.

Some participants used the SCA in advance, for example to prepare for a visit. Others used it afterwards, to review or check the visit’s results. A number of participants regarded the SCA simply as a supplement to the physician’s consultation. However, regardless of the specific use, the participants wished for more cooperation between users/patients and medical professionals in this context. A few participants used the SCA because it was recommended to them by medical professionals. These statements were collected in the subcategory 3.d) SCA as recommendation.

Expectations (prior to use)

The main category “Expectations (prior to use)” covers all interview statements in which the participants described their expectations before their SCA use. This main category has twelve subcategories: 4.a) Adjustment of expectations (through SCA use), 4.b) Aggregation of symptoms, 4.c) Backup, 4.d) Confirmation/reassurance, 4.e) “False” expectations (from other users), 4.f) First aid/orientation, 4.g) Helpful with little things, 4.h) Knowledge/expertise, 4.i) Little or no use, 4.j) Overcoming barriers, 4.k) Savings, and 4.l) Showing alternatives.

The subcategory 4.a) Adjustment of expectations (through SCA use) covers the participants’ idea that SCA users need to adjust their expectations to what the SCA can actually deliver. Beyond that, the participants described various expectations. One expectation was that the SCA could aggregate several (trivial) symptoms over a long period into one concrete diagnosis. Statements in this direction were summarised in the subcategory 4.b) Aggregation of symptoms. Another expectation was that the SCA could serve as a backup option when, for example, doctors’ offices are closed or the user’s family doctor is on vacation. These expectations were collected in the subcategory 4.c) Backup. The subcategory 4.d) Confirmation/reassurance includes all text passages in which the participants described the confirmation of their previously held assumptions as one of their expectations.

Several participants spoke not about their own expectations but about “false” expectations of other users. For example, some participants were concerned that other users might mistakenly expect a diagnosis from the SCA. Because of that concern, these participants argued for a general scepticism towards the SCA, pointing to the limits of SCAs. These worries were grouped into the subcategory 4.e) “False” expectations (from other users). By contrast, first aid and orientation were appropriate expectations from the perspective of many participants; these were collected in the subcategory 4.f) First aid/orientation. Concrete expectations in this subcategory were the assessment of the actual risk (“It’s just a safeguard.” (NT 44)), decision support prior to medical contact, and the early detection and prevention of diseases. Expectations regarding trivial symptoms were summarised in the subcategory 4.g) Helpful with little things. This subcategory covers the expectation that SCAs could help the user especially with minor symptoms. Another expectation was precise information and expert knowledge, in particular when comparing SCAs with internet search tools such as Google. These expectations were summarised in the subcategory 4.h) Knowledge/expertise. Despite the expectations of precise information and specialist knowledge, the SCAs were at the same time differentiated from physicians. One participant stated, for example, that she did not expect the SCA to make decisions or replace a physician’s appointment.

“I didn’t use the app expecting it to have the last word. In other words, I didn’t expect it to be able to replace a visit to the doctor if things were really serious.” (NT 06)

A few participants denied having expectations before their SCA usage. They reported no SCA use in their daily lives because they had no symptoms or complaints and, to that extent, no need for it. Others reported that they had symptoms but were already aware of them or of the underlying disease. Still others had very low expectations and a general scepticism as to whether SCAs can work at all. In this context, one participant referred to the personal contact with the physician:

“I think my expectations were so low because I was thinking: This can’t work. How should this replace a personal medical assessment? Of course, it can’t do that, because of course you also have a doctor with whom you can talk a bit more, also with regard to the treatment afterwards. The app doesn’t do that, nor should it.” (NT 39)

Another expectation was that SCAs could help anxious or shy people to get in contact with medical professionals. These expectations were collected in the subcategory 4.j) Overcoming barriers. The subcategory 4.k) Savings summarises all expectations that SCA usage could lead to personal savings; examples here were long distances to physicians and long waiting times. The subcategory 4.l) Showing alternatives collects all statements by the participants that SCAs might show further possibilities and alternatives to already existing knowledge and findings (“type it in again, maybe something else will come up” (NT 31)).

Motivations

The main category “Motivations” was one of the categories derived directly from the interview material and covers all interview statements concerning the participants’ motivations to use SCAs. This category has five subcategories (in alphabetical order): 5.a) Apprehension, 5.b) Avoid physician visits, 5.c) Benign diseases, 5.d) Curiosity/interest, and 5.e) “Testing” SCAs. The first subcategory, 5.a) Apprehension, covers all statements indicating that fear was the motivation to use SCAs. This includes, for example, the users’ anxiety that the symptoms or the disease could worsen. Another motivation for using SCAs was to avoid physician visits. This motivation was summarised in the subcategory 5.b) Avoid physician visits. It covers visits considered unnecessary by the participants as well as situations such as being abroad. Some used SCAs for everyday complaints or “trivialities”, summarised in 5.c) Benign diseases. One participant, for example, described the appeal of the SCA as being able to raise and reflect on trivial things that would otherwise not be discussed.

“I mean, [the symptoms] are somehow trivial. I thought that was the attraction of the app, that the things that otherwise go by the board, […] that you then somehow reflect these trivial things more.” (NT 47)

Others used it just out of curiosity, collected in the subcategory 5.d) Curiosity/interest.

“I was just curious to see how advanced the technology is, that they could do something like that. Somehow it goes a bit in the direction of a “virtual doctor”. A bit like that. And I was simply interested: How good the algorithm is, so I downloaded it and tried.” (NT 42)

As in this quote, curiosity often referred to the technology of the SCA, the algorithms behind it, or technical progress in general. In this context, participants also reflected on a virtual future. Other participants used the SCA to test it; these statements were collected in 5.e) “Testing” SCAs. This means that the users checked whether the SCA works well or at all.

Risks

The sixth category, “Risks”, summarises the concerns of the participants in the context of their SCA use and has four subcategories: 6.a) No concerns, 6.b) Subsequent assignment of responsibilities, 6.c) Concerns, and 6.d) Sense of security. The participants’ perceptions of potential risks associated with SCAs varied widely. While some participants indicated various risks, other participants had no concerns at all. The latter was summarised in the subcategory 6.a) No concerns. Here, the participants denied having concerns, for example regarding their (health) data. One argument raised by the participants was their own laziness. Data protection regulations, such as the GDPR, and the reputable impression made by SCAs were described as further reasons not to worry. In addition, potential concerns about data protection were put into perspective by referring to larger problems of data misuse on the internet and to the participants’ use of other apps.

“For me, this data protection is a bit overdone […] I don’t care if Mr. Zuckerberg knows, for example, that I have asthma, because he can’t do anything with that. If it makes him happy, then he should know. I didn’t question that. Of course, I also read that … You can already see that in the installation. Well, I’m not someone who reads through ten pages of something here, that’s too exhausting for me.” (NT 39)

In the case of incorrect SCA findings and recommendations, the participants took different positions on the question of responsibility. A few participants saw the responsibility as lying with the SCA.

“It’s also the responsibility of the SCA. So, on the one hand, of course, there is the danger that it causes unnecessary panic in people, but on the other hand …. if the SCA says “heartburn is harmless” and people stay at home and the next day they are dead or have something serious because they just failed to go to the doctor in time. And that’s a difficult balancing act”. (NT 06)

Although some users considered the SCA to be responsible, they regarded this attribution as a difficult decision. On the one hand, the SCA could be too cautious and thus cause fear among users; on the other hand, if it warns too little, this could have serious consequences for the user. Most participants saw the reason for, and therefore the responsibility for, incorrect SCA findings and recommendations in incorrect input by the app users. The participants reasoned that the chatbot’s questions were not understood correctly by the users, that their data input was incorrect or dishonest, or that users exaggerated or understated their symptoms.

“If a wrong input is made, it could of course be that someone, if he is not honest, won’t go to the doctor, but has something life-threatening, and then - let’s take the worst-case – pass away a day later, because the app suggested something harmless, and he actually has something serious, because he wasn’t honest. So, everyone has to be aware that if they don’t give honest information, they can’t expect a correct answer.” (NT 19)

Although most participants saw the responsibility on the side of the SCA users, they had difficulty positioning themselves. All statements in this direction were collected in the subcategory 6.b) Subsequent assignment of responsibilities.

The subcategory 6.c) Concerns collects all descriptions by the participants of risks and concerns in the context of their SCA usage. The participants described a wide variety of concerns. Missing aspects or questions during the data query, such as the intake of medication, were named, which could lead to incorrect findings and, subsequently, incorrect user behaviour. Mental distress and symptom intensification through SCA usage were further concerns described by the participants. Examples of mental distress include nervousness, impatience, or anxiety during SCA use. One participant described, for example, that an already existing anxiety could intensify while using the SCA.

“But if you now, for example, just now, with this rash don’t know what that could be and are already a bit scared, then you are already a bit nervous with the app and think so: Okay, okay, what could come out? What could it be?” (NT 41)

Some participants worried about uncritical use by app users and possible subsequent self-treatment that would not be embedded in a medical or professional relationship. In this context, some participants questioned whether SCA users place too much trust in the app.

“Maybe people trust the app too much. At the end of the day, it’s still an app and, yes, it doesn’t actually replace a doctor’s visit. I think that it might also be a risk that you rely on it too much.” (NT 41)

The participants argued that the risk-averse recommendations of the SCA could lead to unnecessary consumption of limited healthcare resources (over-triage).

“I think the app is very cautious. I have the feeling that the app sends you quickly to the doctor […] if there are now people who question the app’s results less, then the healthcare system, hospital, doctor, could be overloaded with things that are not necessarily worth treating or were not questioned. Because I simply saw for myself, yes, the app is a bit more cautious.” (NT 44)

In this context, the SCA was frequently described as overly cautious or excessively hasty in its judgment. Such a tendency could present challenges for healthcare systems, particularly concerning resource constraints and system overload. These issues could be exacerbated if SCA users do not engage critically with the app’s recommendations.

“I just thought that this one reaction or the one response from the app was really exaggerated. And I think it’s questionable whether you send people like that to emergency rooms and use resources that we all know are very limited.” (NT 39)

At the same time, the participants named forgoing necessary care as a potential risk for SCA users (under-triage). Life-threatening situations were repeatedly mentioned as an example.

“Situations can also arise where someone doesn’t go directly to the doctor, even though it really could be life-threatening and they urgently need medical help. So that is of course the reverse side of it.” (NT 39)

As this quote exemplifies, participants were aware of both sides. Although the participants mentioned many risks of SCA use, they generally felt safe using SCAs. All descriptions by the participants in this direction were summarised in the subcategory 6.d) Sense of security.

SCA-use for others

The main category “SCA-use for others” was also derived directly from the interview material and has three subcategories: 7.a) Addressees, 7.b) Problems, and 7.c) Unproblematic. The participants reported throughout the interviews that they used SCAs not only for themselves but also for others. Family members in particular, such as partners and their own children, but also friends were mentioned. The participants had entered the symptoms of others into their SCA to obtain findings for them or had used the SCA together with them. For example, one participant reported that since they had more than one child and children very often have complaints, they could check with the SCA whether physician visits were necessary or not (NT 21). A further group of addressees were technology-critical people. One participant described, for example, that she had used the SCA for someone who was sceptical about using smartphones and apps in general (NT 42). These different addressees were collected in the subcategory 7.a) Addressees. Some participants mentioned problems regarding privacy when entering personal data and symptoms of others into the app. For example, one participant described the use of the SCA as appropriate for a family member, but rather inappropriate for a person she did not know well, due to the perception of (health) data as private.

“It was my mom, so it was very familiar. If I had done it, I think, for someone I didn’t know well, I think I would have been uncomfortable. And I think I would put the smartphone in his hand and say, ‘You type. I don’t need to see exactly what you’re typing in.’” (NT 42)

In addition to concerns about privacy, participants raised questions of responsibility regarding SCA results and the subsequent recommendations made for others.

“I was very unsure when I made recommendations for others. So, I used the SCA for other people … because then that happened somehow on my recommendation. And I then somehow had the feeling, if a doctor’s visit is not recommended, and there is still something serious, then […] I have done something wrong.” (NT 42)

Although the users made recommendations for others based on SCA findings, they expressed uncertainties in this regard. These uncertainties related primarily to whether it was acceptable to make a recommendation for others at all: if the third party were to suffer harm as a result of the recommendation, the responsibility would lie with the person who had made it, and this did not feel comfortable to the SCA users. The concerns of the participants regarding their SCA use for others were collected in the subcategory 7.b) Problems. While some participants assessed SCA use for others as problematic, others saw this use as unproblematic. The latter statements were summarised in the subcategory 7.c) Unproblematic.

Discussion

This qualitative study shows the broad scope of subjective experiences of SCA users. The study provides an overview of the various topics that users found relevant and addressed in relation to their SCA use. Thus, a wide spectrum of subjective experiences, including attitudes, expectations, motivations, and concerns, is presented. The results demonstrate not only the broad spectrum of experiences but also inconsistent aspects. They show neither uniform expectations nor a consensus on the positive and negative effects of SCAs. Instead, there are very different assessments of this new technology. Contradictory statements can be found across the categories, and sometimes even within one interview. Often there is not a single concrete evaluation or specific argument, but a continuum of statements. These different and sometimes contradictory evaluations may be due, among other things, to the different individual situations of the users. This shows that new technologies such as SCAs are very difficult to assess in general terms; instead, the individual, social, and cultural contexts must be taken into account.

More attention should also be paid to the structural level and the situatedness of these apps, as mobile health technologies may be discriminatory or worsen structural injustices [27]. Health apps can bring new challenges for health equity, as they may not be designed for everyone who could benefit from their use, for example due to digital literacy or language. In addition, power imbalances, e.g. (implicit) racism and sexism, can influence the design of health apps. For a comprehensive ethical analysis, it is therefore important to consider the specific social context and the structural-social processes in which such health apps are used. Although these structural dimensions were less discussed by the participants in our study, there is a growing discussion in the ethical literature on the systemic and structural problems surrounding the use of digital health technologies [27,28,29]. This debate needs to be better linked to the specific requirements of SCAs, meaning, for example, that the content and the functions of SCAs should be evaluated through the lens of structural dimensions such as structural or epistemic injustice.

Another important concern with AI-based systems, such as health apps, is algorithmic bias. There are many sources and different types of algorithmic bias [30]. A more specific understanding of bias, in the sense of discrimination against certain persons or groups, is widely recognized in the scientific literature on AI, including much of the health literature [31]. Interestingly, the participants did not discuss algorithmic bias, but rather bias introduced by their own user behaviour. That the interviewed persons in our study did not mention algorithmic bias as a concern may suggest that the importance of this issue for researchers does not necessarily translate to groups in (clinical) practice. Another interpretation could be that the participants were not aware of this issue. Further research would be important to understand how bias in the context of AI and health is perceived and categorized by users of mobile health technologies.

However, the contexts of the users and the range of their experiences have received little attention in the existing discussions. A recent review showed the absence of comparable empirical data in the debates on SCAs [6]. Further studies criticised the missing real-world data in these debates [e.g., 10; 12; 13]. These critiques indicate that the existing debates might not reflect the diverse users’ views on SCAs. This study, in contrast, shows the plurality of the users’ experiences. This has implications for the ethical, social, and legal debates on SCAs. It also has implications for empirical research: if absolute categories do not reflect the diversity of opinions and the same factor can be regarded as either negative or positive depending on the users’ context, questionnaire or interview guideline design needs to take this into account. To gain a better understanding of the complexity of user experiences with digital innovations such as SCAs, context-related investigations are essential to identify the factors that led some users to consider the usage of SCAs beneficial, while others evaluated it more negatively.

The limited consideration of SCA user experiences in the existing discussions also opens up a debate about patient knowledge and epistemic injustices within digitized healthcare contexts. Epistemic injustice refers to situations where individuals or groups are harmed in their capacity as knowers [32]. Epistemic injustice can manifest in different ways. Common forms are downgrading certain persons’ testimonies and interpretations or their ability to contribute to knowledge, or not listening to them [32]. Patients’ testimonies and interpretations are often dismissed as irrelevant, too emotional, or time-consuming, and are often ignored, rejected, or subordinated to the authority of healthcare professionals [33, 34]. Regarding digitized healthcare, there are concerns about epistemic injustice, particularly when the knowledge of marginalized or vulnerable patient groups is dismissed by technology systems or in the development of these systems, because digital technologies can reinforce existing epistemic inequalities and injustices in the health sector [35,36,37]. Whether SCAs promote or prevent epistemic injustices is a further interesting question. The point here, however, is that the experiences of users and patients should be given more attention in debates about SCAs. As the digitization of healthcare continues to advance, more ethical and social research is needed to address these issues and to ensure that patient knowledge is included in the debates and the development of digitized healthcare.

The users’ subjective experiences are one step in assessing whether the promises of SCAs contributing to better healthcare (systems) are being fulfilled or not. Since the views in our interviews diverge widely, it is difficult to conclude whether the expectations of the users are met by the SCA. Again, it depends strongly on the individual user’s prior experience, knowledge, and situation. In addition, satisfaction with a health tool does not necessarily mean that the tool is helpful or beneficial, but only that it is likely to continue to be used. Satisfaction can thus be seen as a prerequisite for realising potential benefits of SCAs. Whether SCAs are helpful or beneficial in the medical or clinical context, however, is another question. Although SCAs promise to advance diagnostic practices, diminish misdiagnosis, and guide patients through healthcare systems more effectively, there is currently little empirical evidence to support this. Nevertheless, empirical evidence on expectations and satisfaction is indispensable for ethical evaluation, because low satisfaction can lead to the application not being used, whereas high satisfaction can lead to the application being overused. Further qualitative studies are required to gain a better understanding of the underlying behavioural and motivational processes outlined in this study. At the same time, large-scale quantitative studies would be necessary to achieve greater representativeness and breadth. However, to determine whether SCAs can benefit patients and healthcare systems, it is primarily necessary to determine the diagnostic and triage accuracy of SCAs, the effects of SCAs on patients’ care-seeking behaviour, and the safety and appropriateness of those decisions. Initial empirical studies [e.g., 7; 8; 10; 38] and guidelines for evaluating symptom checkers already exist [e.g., 39], but need to be further developed.

The study reveals that the users place their own judgement, as well as the physicians’ opinion, above the app findings. This result contrasts with the often implicit but underlying question in the debates of whether SCAs could replace medical professionals or even lead to a devaluation of the professions [e.g., 38; 40; 41; 42]. This implicit concern in the literature is not confirmed in our qualitative study. This again highlights that the debates on SCAs have so far been conducted without the views of the users. Further ethical, social, and legal debates on SCAs should therefore incorporate the users’ perspective to a greater extent. Further qualitative research should go into even more detail to allow for more nuanced debates. For example, if the concern of replacement is stated in the literature, it needs to be examined more precisely which user groups are referred to. More empirical research would be helpful to understand whether user characteristics have an impact on SCA use, specifically whether these characteristics influence how users rank the app findings compared to their own evaluation or that of a physician. In addition, since SCA users decide whether or not to accept the app’s findings, (e)health literacy seems to be a crucial factor in these decisions. Further research should investigate the role of (e)health literacy in these decision-making processes. In a study by Miller et al. [12], most SCA users responded that using SCAs would not change their decisions about what to do next (e.g., go to a GP). Another study by Fraser et al. [10] discussed the potential that patients would change the urgency level of care they sought based on SCA findings. It would be interesting to investigate whether the confidence to put one’s own opinion above the SCA findings is related to user characteristics, such as (e)health literacy, and to what extent ongoing use of the app would influence this self-confidence.

The category “SCA-use for others” evolved in the analysis process. App use for third parties or within specific social contexts (friends, spouse, family) is not discussed in the literature on SCAs. Even in the broader debates around health apps, most considerations focus on the individual app user, and relational aspects are seldom considered [43]. The examples of app use for others in our interview study therefore raise hitherto unexplored ethical aspects. Unsolicited SCA use for third parties, for example for one’s own children or partner, can be further discussed from a paternalistic point of view. In addition, ethical questions regarding autonomy and privacy arise because sensitive personal information, such as health data, is needed for the SCA input. In the interviews, the participants also described situations in which individuals used the SCA together because they preferred to talk about their symptoms to their friend or partner rather than to a physician, for example out of shame. In these cases, normative questions arise about privacy, intimacy, the physician’s role, and trustworthiness. A different example is when the SCA user replaces the physician, i.e., makes a recommendation for a third person via the SCA. This case also raises questions about the physician’s role, medical authority, and responsibility. Health apps such as SCAs can provide inaccurate or incomplete information, which may lead to false recommendations. Who is responsible if a friend or a parent uses the SCA to make a recommendation that leads to harmful behaviour by a third individual? The examples from the interviews show that SCA use for others is a complex theme that raises many ethical questions.

Strengths and limitations

The qualitative and explorative study design affords a broad understanding of the subjective experiences of SCA users. The strength of the analysis lies in its wide scope and detail. The semi-structured qualitative interviews and their conduct in tandem allowed for this breadth. Ongoing discussions of the analysis within a research group of different disciplinary backgrounds helped to investigate the study subject from different perspectives. In addition, a member check was used to verify the relevance of the results for the people concerned. Internal methods workshops in the CHECK.APP project and an external advisory board served as further tools to validate the results of the interview analysis.

Various recruitment tools were used in the overall project to obtain a representative sample. In the first phase of the project, however, only German citizens were contacted. Although further recruitment was carried out (e.g., via social media), only German-speaking people were included in the sample. It would be interesting to compare interviews with participants from other cultural contexts with those of the present study. Although different recruiting tools were used to ensure the inclusion of participants of differing gender, at different stages of life, and with varying experiences, most participants were between 20 and 29 years old and well educated. The results of our study should be seen against the background that young, well-educated (seemingly cisgender) users made up a large part of the study sample. For example, the participants’ positive evaluation of their own skills and their preference for their own assessment over the SCA findings might be different in a more diverse sample. As there is currently too little empirical data about SCA user characteristics, it is difficult to say whether our sample is representative of typical SCA users [44]. Further studies should investigate the extent to which social categories such as age, gender, ethnicity, and education of users are related to the use of SCAs.

The sample included only participants who had already experienced the symptom checker Ada. The participants might have reported other aspects and experiences if they had also talked about other symptom checker apps. Ada was used in the study as an example of a SCA, as it is one of the most popular SCAs in Germany. It may be that some study participants rated the app more positively because it was being researched within an independent academic institution that they trust. At the same time, it is possible that some participants were reluctant to make critical comments about the Ada app because our independence may not always have been clear to them. Although we repeatedly indicated (in written and oral form) that we were conducting research independently of Ada, this may not always have been salient to the study participants. In addition, the health status of the participants may have had an impact on the interview results. Depending on whether the participants were in a phase of illness or not, they might evaluate the SCA differently. Moreover, the pandemic situation also might have had an influence on the study results. For example, it was not possible to conduct the interviews in person. Instead, we conducted the interviews via video call. The nuances of body language and other nonverbal cues associated with face-to-face interaction may be lost over video calls, and trust may be more difficult to establish. In addition, the tandem interview design may have influenced the interviewed persons. It could be that the interviewed persons felt less comfortable talking about their experiences because two researchers were present.

Finally, there were some difficulties in the analysis and interpretation of the data. The clear definition of the search units, for example “motivations” and “expectations”, emerged as a major difficulty within the current study. There are no strict criteria in the research literature for what counts as a motivation and what counts as an expectation, or for how these aspects can clearly be distinguished from each other. In order to resolve this situation and to obtain the most comprehensive picture possible, we used broad definitions and allowed multiple coding in the coding process. The chosen method allows the coexistence of attitudes, expectations, motivations, and concerns. Further empirical research would be important to better understand these distinctions in the context of health app use and how these categories are connected with each other.

Conclusion

The present qualitative study investigates the subjective perspectives of SCA users, in particular their expectations and assessments of SCA usage. The aspects identified in the analysis demonstrate the immense scope of different experiences and evaluations. In addition, the study shows the users’ uncertainty regarding the advantages and disadvantages of SCA use and the corresponding ethical implications. The interviews also emphasise an ethical dimension of SCAs that has scarcely been discussed in the literature: the use for third parties. The interviews show that SCA use for others is a complex theme that raises various ethical aspects such as autonomy and paternalism, intimacy and privacy, as well as trustworthiness. For example, the triangle between the recommendation of the SCA, the SCA owner, and a third person raises questions of responsibility. Furthermore, the physician’s role in this constellation is not clear. These normative relational issues still seem to be underexposed in the literature on health apps, particularly on SCAs. More attention should be paid to relational aspects in the ethical debates on mHealth, but also in the development and distribution processes of health apps such as SCAs. It is important for app developers, providers, and regulators to be aware of these relational ethical issues and to take steps to address them in order to ensure that health apps are developed and used in a responsible way that protects the rights of the individual user, but also of third parties involved.

Data availability

The interview material generated and analysed during the current study is not publicly available due to privacy issues. The first author can be contacted for access to the raw data analysed in the study. Contact details: regina.mueller@uni-bremen.de.

Abbreviations

GP: General Practitioner

SCA: Symptom Checker App

References

  1. Safi Z, Abd-Alrazaq A, Khalifa M, Househ M. Technical aspects of developing chatbots for medical applications: scoping review. J Med Internet Res. 2020;22(12):e19127. https://doi.org/10.2196/19127.

  2. Jia X, Pang Y, Liu LS. Online Health Information seeking behavior: a systematic review. Healthc (Basel). 2021;9(12):1740. https://doi.org/10.3390/healthcare9121740.

  3. Lewis T. Seeking health information on the internet: lifestyle choice or bad attack of cyberchondria? Media Cult Soc. 2006;28(4):521–39. https://doi.org/10.1177/0163443706065027.

  4. Tan SS, Goonawardene N. Internet Health Information seeking and the patient-physician relationship: a systematic review. J Med Internet Res. 2017;19(1):e9. https://doi.org/10.2196/jmir.5729.

  5. Johnson JD. Health-related information seeking: is it worth it? Inf Process Manag. 2014;50(5):708–17. https://doi.org/10.1016/j.ipm.2014.06.001.

  6. Müller R, Klemmt M, Ehni HJ, Henking T, Kuhnmünch A, Preiser C, Koch R, Ranisch R. Ethical, legal, and social aspects of symptom checker applications: a scoping review. Med Health Care Philos. 2022;25:737–55. https://doi.org/10.1007/s11019-022-10114-y.

  7. Semigran HL, Linder JA, Gidengil C, Mehrotra A. Evaluation of symptom checkers for self diagnosis and triage: audit study. BMJ. 2015;351:h3480. https://doi.org/10.1136/bmj.h3480.

  8. Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage accuracy of Symptom Checker apps: 5-Year follow-up evaluation. J Med Internet Res. 2022;24(5):e31810. https://doi.org/10.2196/31810.

  9. Knitza J, Mohn J, Bergmann C, Kampylafka E, Hagen M, Bohr D, Morf H, Araujo E, Englbrecht M, Simon D, Kleyer A, Meinderink T, Vorbrüggen W, von der Decken CB, Kleinert S, Ramming A, Distler JHW, Vuillerme N, Fricker A, Bartz-Bazzanella P, Schett G, Hueber AJ, Welcker M. Accuracy, patient-perceived usability, and acceptance of two symptom checkers (Ada and Rheport) in rheumatology: interim results from a randomized controlled crossover trial. Arthritis Res Ther. 2021;13(1):112. https://doi.org/10.1186/s13075-021-02498-8.

  10. Fraser HSF, Cohan G, Koehler C, Anderson J, Lawrence A, Pateña J, Bacher I, Ranney ML. Evaluation of Diagnostic and Triage Accuracy and Usability of a Symptom Checker in an Emergency Department: Observational Study. JMIR Mhealth Uhealth. 2022;10(9):e38364. https://doi.org/10.2196/38364.

  11. Gilbert S, Mehl A, Baluch A, Cawley C, Challiner J, Fraser H, Millen E, Montazeri M, Multmeier J, Pick F, Richter C, Türk E, Upadhyay S, Virani V, Vona N, Wicks P, Novorol C. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. BMJ Open. 2020;16(12):e040269. https://doi.org/10.1136/bmjopen-2020-040269.

  12. Miller S, Gilbert S, Virani V, Wicks P. Patients’ utilization and perception of an Artificial intelligence–based Symptom Assessment and advice technology in a British Primary Care Waiting Room: exploratory pilot study. JMIR Hum Factors. 2020;7(3):e19713. https://doi.org/10.2196/19713.

  13. Meyer AND, Giardina TD, Spitzmueller C, Shahid U, Scott TMT, Singh H. Patient perspectives on the usefulness of an Artificial intelligence–assisted Symptom Checker: cross-sectional survey study. J Med Internet Res. 2020;22(1):e14679. https://doi.org/10.2196/14679.

  14. Aboueid S, Meyer S, Wallace JR, Mahajan S, Chaurasia A. Young adults’ perspectives on the Use of Symptom checkers for self-triage and Self-Diagnosis: qualitative study. JMIR Public Health Surveill. 2021;7(1):e22637. https://doi.org/10.2196/22637.

  15. Luger TM, Houston TK, Suls J. Older adult experience of online diagnosis: results from a scenario-based think-aloud protocol. J Med Internet Res. 2014;16(1):e16. https://doi.org/10.2196/jmir.2924.

  16. Knitza J, Muehlensiepen F, Ignatyev Y, Fuchs F, Mohn J, Simon D, Kleyer A, Fagni F, Boeltz S, Morf H, Bergmann C, Labinsky H, Vorbrüggen W, Ramming A, Distler JHW, Bartz-Bazzanella P, Vuillerme N, Schett G, Welcker M, Hueber AJ. Patient’s perception of digital symptom assessment technologies in rheumatology: results from a multicentre study. Front Public Health. 2022;10:844669. https://doi.org/10.3389/fpubh.2022.844669.

  17. Schmietow B, Marckmann G. Mobile health ethics and the expanding role of autonomy. Med Health Care Philos. 2019;22:623–30. https://doi.org/10.1007/s11019-019-09900-y.

  18. Albrecht UV, Fangerau H. Do ethics need to be adapted to mhealth? A plea for developing a consistent framework. World Med J. 2015;61(2):72–5.

  19. Sharp M, O’Sullivan D. Mobile Medical apps and mHealth devices: a Framework to build medical apps and mHealth devices in an ethical manner to Promote Safer Use - A Literature Review. Stud Health Technol Inf. 2017;235:363–7.

  20. Groß D, Schmidt M. E-Health und Gesundheitsapps aus medizinethischer Sicht [E-health and health apps from a medical ethics perspective]. Bundesgesundheitsbl. 2018;61:349–57. https://doi.org/10.1007/s00103-018-2697-z.

  21. Wetzel AJ, Koch R, Preiser C, Müller R, Klemmt M, Ranisch R, Ehni HJ, Wiesing U, Rieger MA, Henking T, Joos S. Ethical, legal, and Social Implications of Symptom Checker Apps in Primary Health Care (CHECK.APP): protocol for an interdisciplinary mixed methods study. JMIR Res Protoc. 2022;16(5):e34026. https://doi.org/10.2196/34026.

  22. Ada. https://ada.com/app/. Accessed 02 May 2023.

  23. Malterud K, Siersma VD, Guassora AD. Sample size in qualitative interview studies: guided by Information Power. Qual Health Res. 2016;26(13):1753–60. https://doi.org/10.1177/1049732315617444.

  24. Mayring P. Qualitative Content Analysis. Theoretical Foundation, Basic Procedures and Software Solution. Klagenfurt, Austria, 2014. https://nbn-resolving.org/urn:nbn:de:0168-ssoar-395173. Accessed 02 May 2023.

  25. World Medical Association (WMA). 2013. Declaration of Helsinki. Ethical Principles for Medical Research Involving Human Subject. https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/. Accessed 02 May 2023.

  26. Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007;19(6):349–57.

  27. Introduction to ethics of mHealth. https://about-mhealth.net/ethics-of-mhealth/introduction-to-ethics/. Accessed 15 Nov 2023.

  28. Sauerborn E, Eisenhut K, Ganguli-Mitra A, Wild V. Digitally supported public health interventions through the lens of structural injustice: the case of mobile apps responding to violence against women and girls. Bioethics. 2022;36(1):71–6. https://doi.org/10.1111/bioe.12965.

  29. Brewer LC, Fortuna KL, Jones C, Walker R, Hayes SN, Patten CA, Cooper LA. Back to the future: Achieving Health Equity through Health Informatics and Digital Health. JMIR Mhealth Uhealth. 2020;8(1):e14512. https://doi.org/10.2196/14512.

  30. Danks D, London AJ. Algorithmic bias in autonomous systems. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI); 2017.

  31. Rajpurkar P, Chen E, Banerjee O, et al. AI in health and medicine. Nat Med. 2022;28:31–8. https://doi.org/10.1038/s41591-021-01614-0.

  32. Fricker M. Epistemic injustice. Power and the ethics of knowing. Oxford University Press; 2007.

  33. Carel H, Kidd IJ. Epistemic injustice in healthcare: a philosophical analysis. Med Health Care Philos. 2014;17(4):529–40. https://doi.org/10.1007/s11019-014-9560-2.

  34. Kidd IJ, Carel H. Epistemic injustice and illness. J Appl Philos. 2017;34:172–90. https://doi.org/10.1111/japp.12172.

  35. Pozzi G. Testimonial injustice in medical machine learning. J Med Ethics. 2023;49(8):536–40. https://doi.org/10.1136/jme-2022-108630.

  36. Durán JM, Jongsma KR. Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI. J Med Ethics. 2021;47:329–35.

  37. Grote T, Berens P. On the ethics of algorithmic decision-making in healthcare. J Med Ethics. 2020;46:205–11.

  38. Hill MG, Sim M, Mills B. The quality of diagnosis and triage advice provided by free online symptom checkers and apps in Australia. Med J Aust. 2020;212(11):514–9.

  39. Fraser HSF, Coiera E, Wong D. Safety of patient-facing digital symptom checkers. Lancet. 2018;392(10161):2263–4. https://doi.org/10.1016/S0140-6736(18)32819-8.

  40. Lupton D, Jutel A. ‘It’s like having a physician in your pocket!’ A critical analysis of self-diagnosis smartphone apps. Soc Sci Med. 2015;133:128–35.

  41. Fiske A, Buyx A, Prainsack B. The double-edged sword of digital self-care: physician perspectives from Northern Germany. Soc Sci Med. 2020;260:1–10.

  42. Merz S, Bruni T, Bondio M. Diagnose-Apps: Wenig Evidenz [Diagnosis apps: little evidence]. Deutsches Ärzteblatt. 2018;115(12):522–4.

  43. Müller R, Kuhn E, Ranisch R, Hunger J, Primc N. Ethics of sleep tracking: techno-ethical particularities of consumer-led sleep-tracking with a focus on medicalization, vulnerability, and relationality. Ethics Inf Technol. 2023;25(4). https://doi.org/10.1007/s10676-023-09677-y.

  44. Wetzel AJ, Klemmt M, Müller R, et al. Only the anxious ones? Identifying characteristics of symptom checker app users: a cross-sectional survey. BMC Med Inf Decis Mak. 2024;24:21. https://doi.org/10.1186/s12911-024-02430-5.

Acknowledgements

We thank Marie-Theres Steffen for their help in organising and conducting the interviews. We also thank Laura Hessel for their help in analysing the interviews. In addition, we thank Christine Preißer for their advice, in particular, regarding methodological questions in the interview study, and Oliver Feeney for the language check of the manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL. The project is part of the joint research project “CHECK.APP” and is fully funded by the German Federal Ministry of Education and Research (Grant No. 01GP1907A). The funding body played no role in the design of the study and collection, analysis, interpretation of data, and in writing the manuscript.

Author information

Contributions

RM drafted the manuscript and managed the writing process. RM, MK and RK conducted the interviews. MK and RM analysed the transcripts. The results were discussed among RM, MK, RK, HE, TH, EL, UW, and RR. RM, MK, RK, HE, TH, EL, UW, and RR collaborated in revising the draft manuscript and all authors approved the final version. HE, TH, UW and RR were involved in the planning of the project from which this article derives and supervised the work.

Corresponding author

Correspondence to Regina Müller.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki. Ethical approval was obtained from the ethics committee of the University of Tübingen (ID: 464/2020BO). Research ethics requirements, such as informed consent and data protection, were carefully considered. Written informed consent was obtained from all participants.

Consent for publication

Not Applicable.

Competing interests

None declared.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: CHECK.APP User Interview Guide (English)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Müller, R., Klemmt, M., Koch, R. et al. “That’s just Future Medicine” - a qualitative study on users’ experiences of symptom checker apps. BMC Med Ethics 25, 17 (2024). https://doi.org/10.1186/s12910-024-01011-5
