Human versus chatbot: Understanding the role of emotion in health marketing communication for vaccines (Tsai, Psychology & Marketing)

1 INTRODUCTION

As one of the most influential technologies changing our everyday lives (Moriuchi, 2021), artificial intelligence (AI) has been increasingly incorporated into the marketing process, including data mining for consumer insights and automated yet personalized advertising creation (H. Li, 2019). To advance the emerging research on AI in marketing, this study examines the potential of chatbots, one of the most popular and easily accessible AI applications (Wright, 2020), in health marketing communication for vaccines.

Across various industries, from online learning to e-commerce, chatbots have been embraced as the highly anticipated next step in digital evolution (Dudharejia, 2017). A chatbot is an AI-powered, automated, yet personalized, virtual assistant. Chatbots provide unprecedented opportunities for businesses to automate one-on-one interactions to engage individual consumers in private, intimate conversations. While chatbots are still a nascent technology, some estimate that they will handle 85 percent of customer-service interactions within a few years (Hinds, 2018). Not only are bots able to offer cost-efficient 24/7 service that can be easily embedded in various communication channels, from websites to mobile apps, but they can also calmly handle complaints in even the most tense customer service situations. Because of these advantages, chatbots have been advocated as a powerful tool to develop conversational marketing campaigns (Cancel & Gerhardt, 2019; Sotolongo & Copulsky, 2018).

This study examines consumer–chatbot interaction within the context of health marketing communication, one of the fastest growing fields for chatbot technology (Yun et al., 2021). Most recently, as the COVID-19 pandemic has spread across the globe, organizations and companies worldwide have launched chatbots to provide up-to-date information about rapidly shifting situations (Price, 2020). For example, the US Centers for Disease Control and Prevention’s COVID-19 chatbot has answered over a million inquiries per day from users in the United States (Smith, 2020), a feat that would be impossible, or at the very least extremely costly and labor-intensive, if such online support were administered by human representatives.

Despite the rapid adoption of chatbots in various industries, there are limited empirical data regarding the factors influencing consumer response to chatbots. Ling et al.’s (2021) review of the emerging conversational agent research in marketing identified various types of variables, from usage considerations (e.g., perceived usefulness, perceived enjoyment) to bot characteristics (e.g., anthropomorphism, empathy) and user traits (e.g., personal innovativeness), that can impact consumers’ attitudes and evaluations. Specific to this study’s focus on the emotional context of consumer–chatbot interaction, prior studies have primarily addressed the usage considerations of utilitarian/cognitive versus hedonic/arousal factors (Sivaramakrishnan et al., 2007; Zarouali et al., 2018) driving consumer attitude toward the bot. However, important questions regarding the usefulness of chatbots in emotion-laden settings, such as placating angry customers or reassuring anxious consumers, have not been sufficiently analyzed, even though affect is critical in interpersonal communication (Konijn & Van Vugt, 2008). In health marketing communication contexts where the chatbot needs to address sensitive issues such as mental or sexual health, consumer–chatbot interactions can be laden with embarrassment, anxiety, fear, and even frustration and anger. Research on the role of affect in influencing chatbot communication outcomes is thus imperative.

This study aims to advance the emerging field of consumer–chatbot interaction by assessing the affective influence on consumer evaluation of chatbot versus human brand representatives. A lab experiment was conducted using a 2 (brand representative identity: chatbot vs. human) × 3 (affect elicitation: embarrassment, anger, neutral) design in the context of discussing the sensitive topics of a sexually transmitted disease (i.e., human papillomavirus, HPV) and vaccines. Specifically, we examined whether the perceived human versus chatbot identity of an HPV vaccine company’s virtual representative makes a difference in consumer response across situations in which consumers may be angry, embarrassed, or emotionally neutral. By evaluating the emotional contexts in which chatbots might be perceived as effectively as, worse than, or more effectively than human representatives, this study seeks to provide much-needed insights on whether and when human representatives may, or importantly may not, be replaced by chatbots.

2 LITERATURE REVIEW

2.1 Perceived agency and social influence

Human perceptions, actions, and experiences are shaped by the actual, perceptual, and implied presence and attributions of others (Allport, 1985). Recent breakthroughs in AI and robotics have transformed our understanding of the source and process of such social influences. For example, Kim and Sundar (2012) argue that when interacting with chatbots, people can sense the presence of another intelligent being and respond to it as a meaningful source of interaction. While research has widely recognized the social influence of chatbots (e.g., Adam et al., 2020; Liu & Shyam Sundar, 2018), the crucial question lies in whether and how chatbots may exert similar levels and types of social influence as compared to humans.

The concept of agency, defined as “the extent to which individuals perceive virtual others as representations of real persons” (Blascovich, 2002, p. 130), is essential to understanding the potential differences in how users respond to computer-controlled agents and human-controlled avatars (Nowak & Biocca, 2003). Although extensive research has explored the social influence of perceived agency (agents vs. avatars), the findings are mixed. Some scholars argue that users make distinct attributions to virtual characters based on whether they believe the character is being controlled by a computer or by another person (e.g., Blascovich, Loomis, et al., 2002). Computer agents are viewed as more depersonalized, unfeeling, and mechanical than human avatars (Fox et al., 2015). Consequently, even when computer agents function objectively as well as human avatars, users will be more susceptible to the social influences of a perceived human than a computer. This line of research has shown that users tend to be more open, agreeable, and extroverted, and use less profanity, longer words, and positive emotion words when communicating with perceived humans rather than chatbots (Hill et al., 2015; Mou & Xu, 2017). With a chatbot, they are less likely to establish common ground and sustain personal relationships during the conversation (Corti & Gillespie, 2016). Human avatars are also more persuasive than computer agents (Blascovich, 2002).

In contrast, other studies have shown that computer agents can be received as favorably as human avatars. The computers as social actors (CASA) literature argues that when computers mimic humans by demonstrating social cues, people automatically and unconsciously act as though they were interacting with other humans, rather than cognitively determining how they should respond differently (Nass & Moon, 2000; Reeves & Nass, 1996). Although users are aware that a computer agent is neither human nor deserving of human-like treatment, they respond to it as a real person, without consciously intending to do so (Carolus et al., 2019; Reeves & Nass, 1996). For instance, Nowak and Biocca (2003) found that avatars and agents evoke similar levels of perceived copresence, social presence, and telepresence. In fact, being ostracized by a computer agent when playing Cyberball can generate negative effects on self-esteem, sense of belonging, and meaningful existence that are comparable to being ostracized by a human partner (Zadro et al., 2004). Particularly relevant to the study focus of affect in consumer–chatbot interaction, Ho et al. (2018) observed that participants’ emotional self-disclosure to a perceived chatbot partner generated emotional, relational, and psychological benefits similar to those with a human partner.

Given these inconsistent findings, it is important to identify the factors enhancing or diminishing the agency effect to illuminate when chatbots may perform equally well as, or differently from, human representatives in marketing communications. Empirical evidence suggests that certain factors can override the agency effect, such as media richness (e.g., photorealistic virtual representation; Fox et al., 2009) and behavioral realism, that is, the degree to which virtual characters appear to behave as they would in the physical world (e.g., message contingency and situational appropriateness; Blascovich, Loomis, et al., 2002). That is, even when users perceive the virtual character as a computer agent, the character’s photorealistic representation or high behavioral realism can induce social influence that is comparable to humans’. Agency effects appear to be amplified in competitive and cooperative tasks, but not in neutral ones (Fryer et al., 2017; Shechtman & Horowitz, 2003). Bailenson et al. (2003) also observed that the agency effect is more prominent for high-level responses (e.g., meaningful conversations), but relatively muted for automatic, low-level responses that do not involve conscious attributions about the identity of the virtual interactant.

Despite these important insights, prior research has not adequately examined the role of affect in driving the agency effect. To expand the limited research on the influence of affect in human–computer communication beyond the topic of emotional disclosure examined in Ho et al.’s (2018) study, this study compares how different affects (i.e., anger vs. embarrassment) may impact the usefulness of chatbots relative to human representatives. Given that our study focuses on high-level responses (i.e., meaningful conversations; Bailenson et al., 2003) and affective influences that have been widely recognized as essential factors in human–human communication (Robinson & el Kaliouby, 2009), we adopted the agency effect framework to compare the social influence of a perceived chatbot versus human brand representative. Specifically, this study evaluates the agency effect using self-reported measures of interaction satisfaction, perceived usefulness of the virtual representative, and compliance intention to receive the HPV vaccine, as well as behavioral, linguistic measures of self-disclosure, elaboration, and consumer interest during the conversation with the virtual representative.

2.2 Affect in human–machine communication

Prior studies highlight the disadvantage of chatbots as inherently logic-driven, mechanical, and depersonalized computer programs, thus inferior to humans in handling emotional tasks (Madhavan et al., 2006). However, Walter et al. (2014) observed that emotions experienced during human–machine interactions can be as complex and diverse as those in human–human interactions. Moreover, recent research on human–machine communication suggests that the effectiveness of chatbots on user response may depend on the particular communication context. This is particularly true in conversations that involve sensitive topics. Lucas et al.’s (2014) study on using virtual characters in medical interviews found that participants who believed they were interacting with a computer reported higher perceptions of system usability, reduced impression management, lower resistance to self-disclosure, and, more importantly, were more forthcoming about the sadness they were experiencing. In related work, Kang and Gratch (2010) found that those with social anxiety are more likely to disclose intimate information when responding to prerecorded questions by a virtual human avatar than by an actual human recorded on video.

These findings indicate that affective influence is an important consideration in human–machine interaction, demanding more empirical investigation of the influence of different affects. Recognizing the increasing use of chatbots in customer service, including handling angry consumer complaints, as well as in health marketing communications that often involve sensitive topics, the present study focuses on two discrete emotions: anger and embarrassment. We argue that when users experience a particular emotion during an interaction, that affect can differentially impact chatbots’ persuasive outcomes. To illustrate, anger is a physiologically arousing state often triggered when one’s goals are compromised (Lazarus & Lazarus, 1994) and attributions of responsibility are placed on another party (Tangney et al., 2007). One common way to alleviate anger is through venting (Fisher & Shapiro, 2005). Angry individuals may prefer venting to an actual human to feel that their frustration has been heard and understood. In contrast, embarrassment is a self-conscious emotion conceptually defined as “an aversive state of mortification, abashment, and chagrin that follows public social predicaments” (Miller, 1995, p. 322). Thus, when experiencing embarrassment, people may prefer to interact with a perceived nonjudgmental conversational partner, such as a chatbot. Below, we detail how people’s evaluation of chatbot versus human brand representatives can vary based on whether they are experiencing embarrassment or anger.

2.3 Anger, venting, and perceived understanding

Anger has been recognized as a powerful predictor of response to failed service encounters and product dissatisfaction (Funches, 2011). People resort to venting to regulate their anger so often that venting has been recognized as a common means of catharsis (Fisher & Shapiro, 2005). In fact, it has been suggested that when angry customers contact customer service, the first thing they want is to vent (Forbes, 2013). According to Pennebaker’s (1993) cognitive processing model, the act of disclosing and describing negative emotions through venting can transform these experiences from an affective to a cognitive form, thereby decreasing the intensity and dominance of such negative emotions (Lieberman et al., 2007; Pennebaker & Chung, 2007). Notably, research on venting has conceptualized it as an interaction-based anger regulation strategy and underscored the importance of interpersonal dynamics in venting interactions (Hareli & Rafaeli, 2008). Despite the interactive and reciprocal advantages of chatbots, no research to date has evaluated chatbots as partners in venting interactions.

Regarding brand response to consumers’ venting of negative emotions, the public relations and conflict management literature highlights brand communications expressing remorse as a key strategy to assuage consumers’ anger (Patel & Reinsch, 2003). For example, expressing mortification (confessing guilt and asking for forgiveness) as a corporate crisis response is more effective than providing instructive information in improving organizational reputation and stakeholders’ supportive intention (Benoit & Drew, 1997; Coombs, 1999). However, the efficacy of chatbots as brand representatives for communicating remorse to alleviate consumers’ negative emotions has not yet been explored.

Based on the social psychology literature on people’s need to be understood by others, chatbots, which are commonly perceived as unfeeling computer programs, are likely unsuitable partners in venting interactions and poorly equipped to communicate genuine remorse. People’s need to be understood is a powerful driver in interpersonal interactions (Finkenauer & Righetti, 2011). Feelings of being understood by another person not only bring a sense of self-validation but also psychological connections with others (Reis et al., 2017). Such feelings of being understood can positively impact attitude and susceptibility toward the interactant (Leary, 2010). In their study on self-disclosure with human versus bot interlocutors, Ho et al. (2018) emphasized the importance people place on being understood in social interactions and proposed that the psychological benefits of intimate disclosures are only likely to be realized when the conversational partner is believed to “have sufficient emotional capacity to truly understand the disclosure, which chatbots inherently cannot” (p. 713).

Related to the need to be understood, Corti and Gillespie (2016) reported that people are less likely to repair and correct misunderstandings when communicating with a perceived computer agent. Luo et al. (2019) observed that users are relatively curt when conversing with a perceived computer agent due to the perception of computers as less empathetic, despite the objective competence of computers. Von der Pütten et al. (2010) also reported that users interacting with a computer agent reported more negative and low-dominance feelings (e.g., afraid, ashamed) than those interacting with a human avatar, which may suggest emotional withdrawal induced by perceptions of interacting with an unfeeling machine.

Similarly, when angry consumers feel the need to vent their frustration and negative emotions to a brand representative, they may not favor chatbots, viewing them as incapable of feeling genuine remorse for the emotional stress and turmoil caused by the company. As venting is an interaction-based anger regulation strategy, consumers may choose not to vent to a chatbot, or to vent less; consequently, the intensity of their anger will persist. In fact, consumers may feel even more upset when they have to explain negative emotions to an unfeeling computer program, leading to negative evaluations of the chatbot and the company it represents. Additionally, given the importance of seeking a heartfelt apology as another coping strategy for regulating anger (Worthington, 1998), annoyed customers may view the apology and remorse communicated by chatbots as pre-programmed, automated reactions instead of sincere apologies based on an empathetic understanding of their feelings. In this way, although human representatives and chatbots may be equally effective in providing help and solutions to angry consumers, chatbots may nonetheless be viewed as unable to understand and empathize with the user’s frustration and anger, resulting in more negative user response. Hence, we propose the following hypothesis:

  • H1:

    Participants in the anger elicitation condition who perceive the brand representative as a chatbot will (a) have lower interaction satisfaction, (b) perceive the representative as less useful, and (c) report lower compliance intention to receive the HPV vaccine than those in the non-anger conditions.

2.4 Embarrassment and fear of negative judgment

Research on health communication suggests that people tend to avoid face-to-face conversations on embarrassing, sensitive, or stigmatized topics (Redston et al., 2018; Romeo, 2016; Skinner et al., 2003). Towle et al. (2005) suggest that embarrassment is a major barrier for young people in discussing sexual health with their physicians. Redston et al. (2018) argue that the anonymity of online health information is particularly appealing to those seeking information on embarrassing health problems. Indeed, Skinner et al. (2003) reported that adolescents use the Internet to seek information on sensitive health concerns, including sexual behaviors and activities, as they are hesitant to approach practitioners about such embarrassing topics. Anecdotal evidence reported in news media similarly suggests that people tend to be more honest in explaining embarrassing symptoms to a computer or robot than to a nurse or doctor (Sydell, 2018). Likewise, AI counselors have helped Syrian refugees deal with mental health issues that are stigmatized in their community (Romeo, 2016). Beyond the topic of personal health, Katsanis (1994) indicates that when consumers purchase “unmentionable products” that are embarrassing, unhealthy, or sexually oriented, embarrassment is involved in all stages of buying and consumption (see also Dahl et al., 2001). Hence, it is important to understand how embarrassment may impact consumer evaluations of chatbots versus human representatives in conversations about sensitive products or services.

The social evaluation model of embarrassment predicts that embarrassment is threatening because observers or audiences may form an unfavorable impression (Withers & Vernon, 2006). Takahashi et al. (2004) further argue that embarrassment is a social emotion that is interaction-based, compared to other moral emotions such as guilt. Even when people do experience embarrassment in private, it is triggered by vivid imaginations of how others might judge them (Tangney et al., 1996). Although human avatars are typically associated with a stronger sense of social presence, that is, the perception that another individual is in the environment (Lombard & Ditton, 1997), which can make them more influential than chatbots, this may backfire in the context of discussing embarrassing topics and make people less engaged in the conversation. Related to social presence in computer-mediated communication, research has documented that in offline settings, when people have to perform unfamiliar or complicated tasks, the presence of others can generate social inhibition effects that impair performance due to heightened evaluation apprehension (Geen, 1983). By the same token, in the context of discussing embarrassing topics that can trigger evaluation apprehension, a strong consciousness of the presence of others may negatively influence the communication experience and outcome.

As Weisband and Kiesler (1996) suggest, “The possibility that people would tell an impartial machine personal or embarrassing things about themselves, without fear of negative evaluation, has been raised since the first uses of computers for communication” (p. 3). Emerging evidence indicates that chatbots may be an ideal tool to engage consumers in conversations that may be too embarrassing to have with another person due to evaluation apprehension. Hoyt et al. (2003) observed that participants who were told that virtual observers were human avatars performed a task more poorly than those performing alone or in the presence of perceived computer agents. Particularly relevant to our study’s setting of sensitive health topics, Nadarzynski et al.’s (2019) interview study suggests that chatbot acceptability may be higher for stigmatized health issues because chatbots offer greater anonymity. Lucas et al. (2014) observed that participants who believed they were interacting with a computer agent reported lower evaluation fears and impression management, and displayed their sadness more intensely. The researchers explained that because people believe that pre-programmed chatbots lack the ability to think independently, they believe that chatbots do not judge them. Thus, people may be more comfortable discussing embarrassing or stigmatizing information with a chatbot than with another person (Lucas et al., 2014). Based on the literature on evaluation apprehension and social inhibition effects driven by the presence of others, we posit the following hypothesis:

  • H2:

    Participants in the embarrassment elicitation condition who perceive the brand representative as a chatbot will (a) have higher interaction satisfaction, (b) perceive the representative as more useful, and (c) report higher compliance intention to receive the HPV vaccine than those in the non-embarrassment conditions.

Furthermore, Fox et al.’s (2015) meta-analysis suggests that objective, behavioral measures (e.g., number of words, interpersonal distance) may be more sensitive than subjective measures in capturing how differently users respond to chatbots versus human avatars. For instance, Hill et al. (2015) found that their participants, on average, used 30 times more profanity in human–chatbot conversations than in human–human conversations and typed fewer words per message to a chatbot than to a human avatar. Therefore, this study pays special attention to the potential behavioral differences in how consumers communicate with computer agents versus human avatars under different emotional contexts. On the one hand, people might be less concerned about the impression they make and engage in more self-disclosure with a chatbot when feeling embarrassed. The associated behaviors could include expressing more concerns regarding health issues, sharing more personal information, asking more questions, and providing more elaborate answers when asked. On the other hand, when the interaction is motivated by angry venting, explaining one’s emotion to a pre-programmed chatbot that is incapable of offering empathy might be seen as a fruitless effort, thereby leading to fewer typed words and sentences. Given the paucity of research on how affect influences the agency effect, the following research question is posed:

RQ: What are the potential linguistic differences in participants’ conversations with human versus bot representatives when anger and embarrassment are elicited?

3 METHOD

To test the hypotheses, a 2 (perceived agency: chatbot vs. human) × 3 (affect elicitation: embarrassment, anger, neutral) between-subjects lab experiment was conducted. As Fox et al. (2015) point out, it is the perceived identity of the virtual character that drives its effectiveness, not the actual one. We therefore adopted the Wizard of Oz method commonly employed in prior studies comparing human avatars to chatbot agents (e.g., Go & Sundar, 2019; Ho et al., 2018). In this approach, a human confederate chats with participants in both the human and chatbot conditions, while half the participants are told that the virtual representative is an AI-powered chatbot. In our study, a doctoral student in health communication went through extensive training to act as the confederate, chatting with participants in a straightforward communication style and a neutral tone without being overly friendly or empathetic. In the experiment, the confederate remained unaware of the condition (i.e., chatbot or human) to which each participant was assigned, thus maintaining a double-blinded design. A communication protocol (see Appendix 1) was developed through multiple rounds of online training with other graduate assistants and the confederate’s personal contacts who were not involved in the current study. To ensure the consistency of the conversations, the confederate strictly followed the same protocol in communicating with all participants. The study protocol was approved by the institutional review board (IRB) at the authors’ institution.

3.1 Stimuli design and manipulation check

To enhance the external validity of the study findings, six versions of experimental stimulus websites were created for Gardasil, the most well-known HPV vaccine brand, which holds a 93% market share (Trefis Team, 2019). The stimulus websites provided information about the sexually transmitted disease and included a health survey and a live chat window for communicating with a Gardasil customer representative. We used the chat interface of tawk.to, an online chat platform that can be embedded in websites for real-time chats between the confederate and participants.

Affect elicitation was manipulated through a health survey that participants were required to complete before they were instructed to chat with the brand representative, ask any questions about HPV and the vaccine, and provide feedback on the website. In the pre-chat health survey, embarrassment was elicited through a series of questions on sexual behavior and preference, including questions on orgasm, oral and anal sex, sexual partners, and sexual frequency. In the anger condition, the health survey did not include the embarrassing sexual behavior questions but instead covered general physical health behaviors such as exercising and receiving flu shots. Feelings of anger were evoked by designing the survey to appear to be malfunctioning such that the online questionnaire could not record answers, requiring participants to fill out the same survey three times. In the control condition, where no embarrassment or anger was elicited, the survey used the same general, nonsensitive questions as in the anger condition, without malfunctioning.

A pretest was conducted among 127 participants recruited from a medium-sized southeastern university in the United States. Participants were asked to rate how embarrassed and angry they felt when filling out the online survey. This is an important step, as affect elicitation in a lab setting can be difficult. Conventionally used stimuli are often not strong enough to evoke a typical affective response, resulting in lower responses to self-report inventories (Tassinary & Cacioppo, 1992). Other factors may also lead to reduced or inconsistent scores on self-report emotion inventories. For example, emotional expressivity is often more a function of whether one perceives an audience than of an actual underlying state; thus, the presence of researchers may limit expressivity (Fridlund, 1991). Similarly, participants may under-report negative affect in an effort to respond in socially desirable ways (Amodio et al., 2007). As a final example, cultural and gender differences in the normativity and acceptability of emotional expressivity may further influence participants’ emotional ratings (Brody & Hall, 2010; Gross & John, 1995). For these reasons, it is of critical importance to pretest experimental conditions to verify differences in emotional expressivity. Anger was measured by six items: angry, annoyed, aggravated, mad, frustrated, and outraged. Embarrassment was measured by five items: embarrassed, uncomfortable, awkward, self-conscious, and humiliated. We also asked participants to describe how ashamed, anxious, disgusted, and joyful they felt when filling out the survey. All items were measured on a 7-point scale. The results showed that participants in the anger condition reported a higher level of anger (F(2,124) = 80.100, p < .001; M = 3.67, SD = 1.61) compared to those in the embarrassment condition (M = 1.25, SD = .65) and control condition (M = 1.16, SD = .48). Similarly, participants in the embarrassment condition reported a higher level of embarrassment (F(2,124) = 14.46, p < .001; M = 2.91, SD = 1.47) than those in the anger (M = 1.94, SD = 1.51) and control (M = 1.43, SD = .83) conditions. No significant differences were found in the other four affects: ashamed (F(2,124) = .158, p = .854), anxious (F(2,124) = .467, p = .628), disgusted (F(2,124) = .420, p = .658), or joyful (F(2,124) = .792, p = .455). Given the difficulty of eliciting affect in lab settings and the type of stimuli involved (either a malfunctioning or an embarrassing questionnaire, both relatively mild emotional elicitations), the low to moderate levels of self-reported affect were expected and appropriate. Therefore, the affect manipulation was deemed successful.
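
For readers who wish to reproduce this type of manipulation check, the analysis reduces to a one-way ANOVA per affect rating across the three conditions. The following is a minimal sketch, not the authors' actual analysis script; the file name (pretest.csv) and column names (condition, anger_score) are hypothetical placeholders.

```python
# A minimal sketch of the pretest manipulation check: a one-way ANOVA
# comparing self-reported anger across the three affect conditions.
# File and column names are hypothetical, not from the study's materials.
import pandas as pd
from scipy import stats

df = pd.read_csv("pretest.csv")  # hypothetical: one row per pretest participant

# Split the averaged anger scale scores by condition (anger, embarrassment, control)
groups = [g["anger_score"].to_numpy() for _, g in df.groupby("condition")]
f_val, p_val = stats.f_oneway(*groups)

df_between = len(groups) - 1
df_within = len(df) - len(groups)
print(f"Anger: F({df_between},{df_within}) = {f_val:.3f}, p = {p_val:.3f}")

# Cell means and SDs, analogous to the reported M and SD values
print(df.groupby("condition")["anger_score"].agg(["mean", "std"]).round(2))
```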

3.1.1 Participants and procedure

The main study participants were recruited from the same university. Due to the increased HPV vaccination rate among the college-aged population (Boersma & Black, 2020) and to recruit both vaccinated and unvaccinated students, we included a deception in the experiment. On the study websites, we planted misinformation stating that people who have and have not been previously vaccinated should receive an additional booster shot to protect them from “newly discovered HPV strains.” After consenting to the study, participants were randomly assigned to one of the six conditions. They first read through the website with information on HPV and filled out the questionnaire designed to elicit embarrassment, anger, or no affect. After they filled out the survey, they were instructed to chat with a Gardasil customer representative, presented as either a person or an artificial intelligence agent, to give feedback about the website and its content and to ask any questions they had about HPV and the vaccine. Information regarding the identity of the brand representative as a chatbot or a real person was explicitly highlighted on the live chat webpage. Participants were instructed to tell the Gardasil representative when they no longer had any questions to end the conversation, and the representative would then provide the link to the final survey regarding their interaction with the representative. Participants first indicated their affective state at the beginning of the final survey and then answered questions regarding their perception of and interaction with the Gardasil representative. At the end of the experiment, all participants were debriefed about the study purpose and manipulation, including the deception about the new HPV strains and the booster shot and why the deception was necessary. In the main study survey, participants had to answer whether they talked to a chatbot or a human representative as an attention check question. Thirty-two participants whose answers did not match their assigned conditions were removed from the analysis. After removing participants who failed the attention check or gave poor quality responses (i.e., straight-lining), a final sample of 142 participants was included for data analysis.

3.1.2 Measures

All the study measures were adopted from the literature and adapted to fit the study context. This study focused on the key communication outcomes of interaction satisfaction, perceived usefulness of the virtual representative, and compliance intention to get the vaccine. Specifically, interaction satisfaction was adapted from Horzyk et al. (2009) and measured on a 5-point Likert scale ranging from strongly disagree to strongly agree (α = .896), with items including “I enjoyed my interaction with the representative” and “The representative was very helpful.”

Perceived usefulness was measured on a 7-point Likert scale adapted from Duijst (2017) (1 = strongly disagree to 7 = strongly agree; α = .870), with items including “Because of this representative, I can quickly find out what I need” and “This representative is useless” (reverse-coded).

Participants were asked to indicate their intention to receive the HPV vaccine. Three statements, such as “I intend to get the HPV vaccine in the next 12 months,” were measured on a 5-point Likert scale ranging from strongly agree to strongly disagree (α = .883). In addition, participants’ affective state was measured using the PANAS scale (Thompson, 2007).

Two control variables, technological self-efficacy (Hopp & Gangadharbatla, 2016) and preference for online social interaction (Caplan, 2003), were also measured. Hopp and Gangadharbatla (2016) introduced the concept of technological self-efficacy (TSE), conceptualized as “the belief in one’s ability to perform a technologically sophisticated new task” (p. 117); they found that interaction attitude was associated with brand attitude when TSE was high. TSE was used in this study as a covariate and measured by three items (α = .950), including “I usually teach myself the latest technological tools;” “I feel confident explaining to others how to use the latest technology;” and “I am fairly confident in using advanced features of a given technology.” Moreover, people have different preferences regarding online social interaction: some view online interactions as safer (e.g., greater anonymity) or more convenient (e.g., requiring less interpersonal management), whereas others prefer face-to-face interactions (Caplan, 2003). To control for these personal preferences, we adopted six items from Caplan (2003) (α = .893), including “I prefer online social interaction over face-to-face interaction” and “I am happier being online than I am offline.” However, neither covariate significantly affected the outcomes; these two variables were thus removed from the analysis.
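
The reliability coefficients reported for these multi-item scales (e.g., α = .950 for TSE) are Cronbach's alphas. Below is a minimal sketch of how such a coefficient is computed, assuming hypothetical item-level data rather than the study's actual responses.

```python
# A minimal sketch of Cronbach's alpha for a multi-item scale:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    k = items.shape[1]                         # number of scale items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical responses to the three TSE items (7-point scale)
tse = pd.DataFrame({
    "tse1": [5, 6, 7, 4, 6],
    "tse2": [5, 7, 7, 4, 5],
    "tse3": [4, 6, 7, 5, 6],
})
print(round(cronbach_alpha(tse), 3))
```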

To answer the research question regarding the linguistic differences, we also coded and analyzed participants’ chat transcripts to assess the key linguistic characteristics identified in previous studies. We coded participants’ answers to the representative’s questions, including “how likely do you think you are to get HPV?” to assess if they disclosed concerns, anxiety, or fear, and “do you want to find out where the nearby clinic is?” as a measure of consumer interest. Following Z. Li and Rau (2019), we counted the number of requests or questions asked by the participants that are separate from their answers to the representative’s questions, including the number of requests for more information (Bickmore et al., 2009). As prior research suggests that users tend to provide shorter, simpler, less elaborate responses to computers than to humans, we also coded the number of meaningful or detailed answers. For example, when asked “is there anything that we need to improve about the survey?” responses such as “it was very clear, easy to follow” were coded as meaningful, and responses such as “no, thank you,” “it is okay,” and “yes, please” were not.

Beyond participants’ responses to the virtual representative’s questions, we further coded the number and types of unsolicited, voluntary responses, including positive or negative feedback about the representative and the information provided (e.g., “I did not know about the age limit for the vaccine. The information you provided is really helpful!”). In addition, we measured participants’ verbal behavior as the total number of sentences they typed during the conversations. Linguistic variables, including the number of shorthand expressions, emoticons, capitalized words, and discourse fillers (e.g., aww, phew, gosh, uh-uh, oh, and expressions such as “Okay…”), were also coded. Lastly, we coded whether participants volunteered personal information to the representative (e.g., “I am married and I practice safe sex. I am in a monogamous relationship”) or disclosed concerns (e.g., expressing feelings of fear or anxiety). Three additional graduate students who were unaware of the study purpose were trained as coders. Chat transcripts from disqualified participants were used as the training data set. Intercoder reliability was calculated using Perreault and Leigh’s (1989) formula. The index of reliability (Ir), which represents the agreement coefficient among coders, ranged from .70 to .96 for the nominal variables.
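
Perreault and Leigh's (1989) index adjusts the observed proportion of intercoder agreement for the number of coding categories. A minimal sketch of the computation, using illustrative counts rather than the study's actual judgment data:

```python
# A minimal sketch of Perreault and Leigh's (1989) index of reliability:
# Ir = sqrt((Fo/N - 1/k) * k / (k - 1)) when observed agreement Fo/N
# exceeds chance (1/k), and 0 otherwise. Counts below are illustrative.
import math

def perreault_leigh_ir(agreements: int, judgments: int, k: int) -> float:
    p_obs = agreements / judgments  # observed proportion of intercoder agreement
    if p_obs <= 1 / k:
        return 0.0  # agreement at or below chance level
    return math.sqrt((p_obs - 1 / k) * k / (k - 1))

# Example: two coders agree on 130 of 142 binary judgments (k = 2 categories)
print(round(perreault_leigh_ir(130, 142, 2), 2))  # ~0.91
```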

4 RESULTS

Hypotheses 1 and 2 predicted an interaction between affect and perceived agency on the experiential, perceptual, and behavioral outcomes. To test these hypotheses, a 3 × 2 ANOVA was run for each of the dependent variables, including interaction satisfaction, perceived usefulness of the online representative, intention to receive the HPV vaccine, and verbal reactions to the representative.
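
For concreteness, each of these tests is a 3 (affect) × 2 (perceived agency) factorial ANOVA on one dependent variable. The following is a minimal sketch of one such model, assuming hypothetical column names (affect, agency, satisfaction) rather than the study's actual data file.

```python
# A minimal sketch of the 3 (affect) x 2 (perceived agency) factorial ANOVA
# on one outcome (interaction satisfaction). File and column names are
# hypothetical placeholders.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("main_study.csv")  # hypothetical: one row per participant

# Sum-to-zero contrasts so Type III sums of squares match SPSS-style output
model = ols("satisfaction ~ C(affect, Sum) * C(agency, Sum)", data=df).fit()
print(sm.stats.anova_lm(model, typ=3))  # main effects and the interaction term
```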

Hypothesis 1a predicted that the interaction experience would be less satisfactory among participants who were exposed to anger elicitation and thought they interacted with a chatbot than among those in the non-anger conditions. There was a significant interaction effect of elicited affect and perceived agency on interaction satisfaction (F(2,136) = 5.815, p = .004, partial η2 = .079; see Figure 1). The results revealed that the anger/chatbot condition generated significantly lower interaction satisfaction than the control/chatbot condition (Mdiff = −.721, SE = .200, p < .001) and the embarrassment/chatbot condition (Mdiff = −.853, SE = .184, p < .001). That is, when interacting with a chatbot, participants in the anger condition rated their interaction experience the least satisfactory compared to those in the embarrassment and control conditions. Additionally, within the anger conditions, participants who talked to a perceived human representative reported significantly higher interaction satisfaction than those who interacted with a perceived chatbot (F(1,136) = 9.136, p = .003, partial η2 = .063; Mdiff = .562, SE = .186). Therefore, H1a is supported. It is important to note that when participants thought they talked to a human representative, there was no significant effect of affect elicitation on interaction satisfaction (F(2,136) = .328, p = .721, partial η2 = .005). When interacting with a perceived human representative, participants in the anger condition reported satisfaction levels similar to their counterparts in the embarrassment and control conditions. In other words, regardless of the emotional context, participants reported similar interaction satisfaction when interacting with a perceived human representative.

Figure 1. Interaction between emotional elicitation and perceived agency on interaction satisfaction

Regarding the influence of embarrassment, H2a predicted that interaction satisfaction would be higher among participants in the embarrassment condition who thought they interacted with a chatbot than those in the non-embarrassment conditions. The results demonstrated that the embarrassment/chatbot condition generated a similar level of interaction satisfaction as the control/chatbot condition (Mdiff = .132, SE = .202, p = .514). Additionally, no agency effect was observed in the embarrassment condition (F(1,136) = 1.409, p = .237, partial η2 = .010) or the control condition (F(1,136) = 1.345, p = .248, partial η2 = .010). Chatbot and human representatives were rated comparably in terms of interaction satisfaction in the embarrassment and control conditions. Therefore, H2a is not supported.

We also tested the interaction effect between elicited affect and perceived agency on perceived usefulness (H1b and H2b) and intention to receive the HPV vaccines (H1c and H2c). The interaction effect was not significant on perceived usefulness (F(2,136) = 1.611, p = .204, partial η2 = .023; Figure 2). Similarly, no significant interaction effect was observed on the participants’ intention to receive the HPV vaccines (F(2,136) = 1.803, p = .169, partial η2 = .026; Figure 3). The result patterns suggest that the perceived identity of the online representative as human or chatbot did not affect the representative’s perceived usefulness or participants’ behavioral compliance intention of receiving the booster shot, regardless of the affect elicited. Hence, H1b, H2b, H1c, and H2c were not supported.

Figure 2. Interaction between emotional elicitation and perceived agency on perceived usefulness

Figure 3. Interaction between emotional elicitation and perceived agency on behavioral intention

Recognizing the difficulty of eliciting discrete emotions, we conducted a secondary interaction analysis using the actual levels of anger self-reported by participants. Specifically, moderation analyses using Hayes’s (2017) PROCESS macro for SPSS were conducted with only the participants in the anger condition. Perceived agency (chatbot vs. human) was entered as the predictor variable, anger was entered as a moderator, and the outcome variables were: (a) interaction satisfaction, (b) perceived usefulness, and (c) intention to receive the HPV vaccine. A separate moderation analysis was conducted for each outcome variable. The relationship between the chatbot condition and the outcomes was assessed at three levels of anger: low (16th percentile), moderate (the median score), and high (84th percentile). In line with the aforementioned findings, there was a significant interaction between perceived agency and experienced anger on interaction satisfaction. Although there was no significant difference between the chatbot and human conditions when anger was low (coeff. = .323; SE = .239; p = .184), there were significant differences when anger was moderate (coeff. = .419; SE = .189; p = .032) and high (coeff. = .662; SE = .297; p = .030). These results suggest that although participants tended to prefer humans over chatbots, this difference was statistically significant only among those reporting moderate and high levels of anger (see Figure 4). On the other hand, there were no significant differences for any outcome variable in the embarrassment condition when perceived embarrassment was entered as a moderator.

Figure 4. Interaction between perceived agency and anger on interaction satisfaction
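
For readers unfamiliar with this probing procedure, the pick-a-point approach behind PROCESS Model 1 can be approximated with an ordinary OLS interaction model, evaluating the conditional agency effect at the 16th, 50th, and 84th percentiles of anger. Below is a rough sketch under hypothetical column names, not the authors' SPSS syntax.

```python
# A minimal sketch of probing the perceived agency x anger interaction
# (cf. Hayes's PROCESS Model 1). Column and file names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("anger_condition.csv")  # hypothetical: anger-condition subset
df["human"] = (df["agency"] == "human").astype(int)  # 0 = chatbot, 1 = human

m = smf.ols("satisfaction ~ human * anger", data=df).fit()
cov = m.cov_params()

for q, label in [(0.16, "low"), (0.50, "moderate"), (0.84, "high")]:
    a = df["anger"].quantile(q)
    # Conditional effect of human (vs. chatbot) at this level of anger
    eff = m.params["human"] + m.params["human:anger"] * a
    se = np.sqrt(cov.loc["human", "human"]
                 + (a ** 2) * cov.loc["human:anger", "human:anger"]
                 + 2 * a * cov.loc["human", "human:anger"])
    print(f"{label} anger ({a:.2f}): effect = {eff:.3f}, SE = {se:.3f}")
```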

Regarding the research question exploring the linguistic differences in participants’ conversations with human versus bot representatives when anger, embarrassment, or no affect is elicited, the content analysis of the conversations showed that participants generally provided more elaborate responses to the human than to the chatbot. A factorial ANOVA was conducted to compare the effects of perceived agency and affect elicitation on the total number of sentences participants typed. A significant interaction was found on the total number of sentences (F(2,136) = 3.351, p = .038, partial η2 = .047). That is, when people thought they were talking to a human, the embarrassment condition generated significantly fewer typed sentences than the anger condition (F(2,136) = 10.008, p < .001, partial η2 = .128; Mdiff = −6.226, SE = 1.417) or the control condition (Mdiff = −3.907, SE = 1.456, p = .008). However, when people thought they were talking to a chatbot, no differences in the number of typed sentences were found across the three conditions.

In addition, when embarrassed, participants were more likely to disclose their concerns about HPV risks to the human representative (F(1,46) = 4.591, p = .037, M = .85, SD = .988) than to the chatbot (M = .32, SD = .723). Similarly, they provided more elaborate answers (F(1,48) = 7.152, p = .01) to the human representative (M = 3.14, SD = 2.651) than to the chatbot (M = 1.62, SD = 1.321), and were more likely to express interest in knowing the location of nearby clinics where they could receive the booster shot (F(1,48) = 4.64, p = .036) to the human representative (M = .14, SD = .359) than to the chatbot (M = .00, SD = .00). No differences were found for the other behavioral outcomes, such as the number of questions asked, the amount of voluntary sharing of personal information and feedback, and the linguistic characteristics (e.g., number of shorthand expressions, emoticons, capitalized words, and discourse fillers). When anger was elicited, no significant differences were found between participants’ conversations with the perceived human and chatbot representatives.

5 DISCUSSION AND CONCLUSION

Our study is one of the first empirical investigations exploring how emotional context may lead chatbots to perform worse than or on par with human avatars. Through a 2 × 3 between-subjects lab experiment, this study examined whether the perceived human versus chatbot identity of a brand representative impacted consumer response across situations in which anger, embarrassment, or no affect was elicited. The unexpected results, based on both subjective self-report and objective behavioral measures, highlight the complex emotional dynamics in consumer–chatbot communication. By comparing the different affective conditions, we found that the effectiveness of a chatbot may depend on the emotional context. Results revealed that interaction evaluation of the chatbot was significantly lower in the anger condition than in the embarrassment or control condition. This pattern did not hold for the human representative, as participants’ assessments did not vary across the affective conditions.

It is important to note that the only agency effect confirmed was that when anger was evoked, participants reported significantly lower interaction satisfaction with a chatbot than with a human representative. Contrary to our predictions, no significant differences in the other self-reported variables were observed between the chatbot and human representatives across the affective conditions. Even when participants experienced embarrassment, they responded similarly to chatbots and human representatives, rather than favoring chatbots as hypothesized. Overall, the chatbot was found to perform comparably to the human representative in terms of perceived usefulness and ability to influence participants’ compliance intention of receiving the vaccine in all affective conditions. Although chatbot interaction satisfaction was significantly lower than satisfaction with a human representative in the anger condition, chatbots achieved an equivalent level of interaction satisfaction in the embarrassment and control conditions.

Interestingly, participants’ behavioral reactions contradicted their subjective responses. While participants’ self-reported interaction satisfaction and perceived usefulness toward the human representative did not differ across affective conditions, the number of sentences they typed did: they typed significantly fewer sentences to the human representative in the embarrassment condition. This finding is consistent with prior research on patient self-disclosure during medical interviews, which shows that patients often withhold personal information due to fears of judgment by healthcare providers (e.g., Barglow, 2005). It is likely that when users experience embarrassment while talking to a perceived real person, evaluation apprehension may interfere with their online interactions, leading to fewer typed sentences. However, participants were more likely to disclose concerns about HPV risks, offer more detailed answers, and express interest in knowing the nearby clinic location to the human representative than to the chatbot. Contrary to prior research suggesting that people might disclose more intimate, personal, or embarrassing information to machines than to humans (e.g., Lucas et al., 2014), the study participants were not more likely to disclose personal information to chatbots. It is possible that the moderate level of embarrassment elicited by the health survey was not sufficient to engender the agency effect on such self-disclosure. Overall, despite the fact that chatbots and human representatives were perceived similarly in terms of usefulness, the behavioral results suggest that human representatives are more effective than chatbots in soliciting detailed feedback and potentially more honest self-disclosure about concerns and anxieties, even when participants feel embarrassed.

5.1 Theoretical implications

Taken together, the study results advance the emerging field of consumer–chatbot interaction that has primarily focused on the cognitive dimensions (e.g., cognitive absorption; Balakrishnan & Dwivedi, 2021) by illuminating the under-researched affective factors that can diminish the effectiveness of chatbots as compared to human representatives. The perceived identity of the brand representative as a chatbot or a human only mattered for the anger-elicited context and only for interaction satisfaction. Additionally, in embarrassing contexts, chatbots were less effective than human avatars in soliciting self-disclosure and elaborate answers via text-based conversations. The study findings thus expand prior research comparing the agency effect in competitive or cooperative gaming tasks and neutral, non-gaming ones (Fryer et al., 2017; Shechtman & Horowitz, 2003) by highlighting the importance of emotional contexts.

In addition, our results showed that the type of evaluation outcomes may be another moderator of the agency effect; agents and avatars may exert different impacts on experience-oriented evaluations, such as interaction satisfaction, but their impacts are not significantly different in terms of utilitarian considerations, such as perceived usefulness. By identifying this moderator, this study enriches theoretical knowledge of how the perceived agency of virtual characters may generate different types of perceptual influences. The findings further reveal the dissimilar utility of subjective and objective measures for capturing the agency effect in different emotional contexts. The results are consistent with Yun et al.’s (2021) study in which the objective neural evidence based on functional magnetic resonance imaging (fMRI) documented significant agency effects while self-reported intention measures did not. It is thus imperative for future comparative studies to include both objective and subjective measures to provide a more nuanced and comprehensive understanding of the different social influences of computer agents and human avatars in marketing communications.

Notably, the majority of the self-reported and verbal behavioral data rejected the agency effect: perceived agency did not make a difference on most communication outcomes across the affective conditions. The overall result patterns support the CASA paradigm, which posits that people intuitively react to computers as they do to other people, particularly when the computer agents exhibit the ability to interact (e.g., using natural speech; Nass & Moon, 2000; Nowak & Rauh, 2005). Ho et al. (2018) reported that chatbots work as well as human conversational partners in providing emotional, relational, and psychological benefits through conversations that induce emotional disclosure. In Von der Pütten et al.’s (2010) experiment, which used a computer agent instead of a human confederate to compare the effects of perceived agency, they similarly found no significant differences in a variety of outcomes, including emotional response, perceived social presence, felt rapport, mutual understanding, and self-disclosure. Our results thus provide additional support for the CASA framework with considerations of elicited affect in the context of health marketing communication for vaccines.

It is important to note that scholars argue that for either chatbots or human avatars to exert social influence, projecting social presence is vital (Fox et al., 2015). Behavioral realism can greatly enhance social presence (Von der Pütten et al., 2010), to the extent that high behavioral realism may override the effect of perceived agency (Blascovich, Loomis, et al., 2002; Fox et al., 2015). As this study employed a human confederate, the perceived behavioral realism conveyed through natural speech and message contingency was likely invariably high, thus contributing to the similar effects of the human representative and the chatbot. More importantly, our study primed the affects of embarrassment and anger through a health survey. Due to ethical concerns, participants were not required to speak to the representative about their answers to the embarrassing questions, nor did the representative ask about any of the sensitive sexual behavior questions included in the survey. This could have made participants less concerned about being judged negatively by the perceived human representative, contributing to the absence of the agency effect. Overall, empirical data remain scarce regarding the usefulness of chatbots for easing conversations about embarrassing topics and encouraging disclosure of intimate, sensitive information. In their study on social anxiety and self-disclosure in computer-mediated interactions, Kang and Gratch (2010) compared user responses to prerecorded video messages from a real human and a virtual human avatar. In an interview setting in which the study participants were asked to answer embarrassing questions regarding their sexual fantasies, people revealed more, and more intimate, information about themselves to the avatar video than to the real-human video. However, the authors compared real versus virtual human videos and did not compare agents and avatars. Lucas et al.’s (2014) study on medical interviews observed that framing the virtual interviewer as a computer increased participants’ displayed sadness and lowered their self-reported level of impression management. However, the authors analyzed expressed sadness as an objective measure, not the amount of self-disclosure. More empirical studies are needed to evaluate the agency effect during embarrassing, potentially humiliating conversations with chatbots and human avatars and its effect on self-disclosure of intimate, sensitive information.

The finding that participants were more likely to voluntarily express concerns about HPV to a human representative than to a chatbot may suggest skepticism or distrust of pre-programmed chatbots’ ability to understand their worries or anxieties. Unlike disclosing embarrassing or highly sensitive information, emotional disclosure of concerns and worries may resemble venting one’s anger in that it is an interaction-centric affect regulation strategy that relies on being understood. Similarly, the greater perceived empathy of the human representative may have induced a higher level of compliance in terms of offering more elaborate answers and expressing interest in learning where the vaccine could be received.

5.2 Strategic implications

Based on our empirical findings, this study offers important strategic guidelines to help marketers capitalize on the advantages of chatbots and recognize situations in which it is essential to use human representatives. In particular, AI-powered chatbots have been rapidly adopted in various industries to supplement or even replace human representatives for online customer service. However, as managing angry customers’ complaints and allowing customers to vent is a common and important part of customer service, this might not be an appropriate use of chatbots. Chatbots’ lower interaction satisfaction ratings compared to human representatives in the anger condition illustrate the imperative of employing humans to better satisfy consumers’ anger regulation needs. This is particularly important in industries where interaction satisfaction with a company’s representative is vital, such as insurance and financial services. Additionally, results based on verbal reactions indicate that human representatives were able to solicit more detailed answers regarding participants’ evaluations of the website and the information presented, as well as more disclosures of concerns. Therefore, when collecting honest feedback is the goal, human representatives appear to be the better choice.

At the same time, the results confirmed that a chatbot can be as competent as a human representative in terms of perceived usefulness and ability to influence compliance intention, even when consumers experience embarrassment or anger. Particularly relevant to the current pandemic crisis, which has dramatically shifted the majority of our everyday interactions from face-to-face to virtual, a chatbot can provide cost-efficient, large-scale, 24/7 assistance to meet consumers’ information-seeking needs. However, for services like psychological therapy or counseling, which require feelings of being understood, genuineness, and empathy (Wampold, 2015), our results suggest that people may put less effort into self-disclosure with a chatbot even when they experience embarrassment. Thus, to build meaningful relationships with users and patients, human counselors are still irreplaceable. Taken together, when handling certain situations, such as encouraging self-disclosure or solving complex technical issues, a human agent might be the better solution. On the other hand, chatbots can handle certain other tasks with ease, such as answering non-conversational, straightforward questions that elicit simple responses. The best solution is to integrate chatbots and humans to create a seamless service for users.

It is important to emphasize that, with its focus on the effect of the representative’s perceived identity across varied emotional contexts, this study did not manipulate or test which chatbot design and communication factors can enhance chatbot effectiveness. Research suggests that anthropomorphic cues, such as a female gender, can improve consumer acceptance (e.g., Borau et al., 2021). Given the prevalent gender stereotype of women being warmer and more empathetic than men (Löffler & Greitemeyer, 2021), a female chatbot may be more effective than a male bot in addressing consumers’ emotional needs, particularly for encouraging intimate, honest self-disclosures. Future studies should therefore evaluate different chatbot design features to provide evidence-based guidelines for capitalizing on the potential of chatbots to enhance interpersonal communication effects.

Several limitations of this study should be recognized when interpreting the research findings and addressed in future research, including the small sample size based on a convenience sample of college students. We recruited both vaccinated and unvaccinated students for a fictitious HPV booster vaccine; future studies should consider consumers’ prior experience or familiarity with the product, such as controlling for differences between vaccinated and unvaccinated participants. Fictitious brands may be used in future experiments to eliminate the influence of prior brand attitudes. We are on the cusp of a transformational moment in marketing in which AI applications like chatbots for customer support and robo-advisers for financial planning are changing how consumers interact with brands and businesses. As we move into this new era, it is important to obtain a better understanding of human–computer communication in various consumption domains beyond the health marketing context examined in this study. Although the literature on eliciting anger is well established, there are fewer studies on the purposeful elicitation and measurement of embarrassment. Mean levels of embarrassment in the present study were somewhat low, despite the private and sensitive questions about sexual practices in the embarrassment condition. Future research should develop and refine a protocol for eliciting embarrassment to evaluate agency effects. Similarly, more empirical research is needed to evaluate affective influences beyond anger and embarrassment.