Can immersive media be more engaging than traditional media?

This post summarizes a study I did at Vertebrae, Inc. last year: Interactivity and Realism as Determinants for Engagement: A Comparison Between Immersive and Traditional Media Effects.

Most successful advertising relies on a storytelling format and persuades in one of two ways: via modeling, where consumers see a behavior that leads to failure or success and then avoid or imitate that behavior, or via Pavlovian conditioning, where consumers associate a brand with an affective state.

Immersive media are not the best media for storytelling. Immersive media invite users to interact with the content, and thus to change it. That’s part of the reason why AR and VR are so much fun! Interactivity requires content creators to foresee and restrict what the user can do, transforming an experience from a story to something more like a game.

Can immersive media still persuade? Yes. The storytelling format is not necessary. Stories don’t persuade because they are stories. Stories persuade when they engage our attention and when they lead us to reflect on the consequences of imitating a modeled behavior. Stories also persuade via Pavlovian conditioning, where learning occurs as a consequence of associating a cue from a message with a response (salivating at the ringing of a bell, in Pavlov’s famous experiment). Take, for instance, branded toy merchandise from fast food franchises. The “Star Wars” logo doesn’t make your burger any better, but the experience of eating that burger becomes better because it reminds you of the emotions you experienced while watching the Star Wars movies. When the mind is primed to react in a certain way, it will probably react in that way.

To persuade, messages only need to be engaging and then lead to learning via either modeling or Pavlovian conditioning. The question then becomes: under which circumstances can immersive media be more engaging than traditional media?

The actual study

Vertebrae designed four extremely simple messages, all of them presenting the same article (a camera, a chair, or a tennis shoe) in four different media: 2D pictures, video, 3D manipulable objects, and an augmented reality (AR) experience. The purpose was to compare engagement and emotional responses across media.

Participants were exposed to two pairs of simulated shopping experiences, each one in a different medium from a different store. The “Blue Store” used 2D photos or video; the “Green Store” used 3D manipulable objects or AR. In total, there were four different combinations: 2D v 3D, 2D v AR, video v 3D, and video v AR. The order in which the articles were presented and the type of articles were varied to prevent fatigue.
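To make the design concrete, here is a minimal sketch of how the four store pairings fall out of crossing the two stores. The rotation of article order is only an illustration of one way to vary presentation; the function name `article_order` and the rotation scheme are my assumptions for this sketch, not the procedure actually used in the study.

```python
from itertools import product

BLUE_STORE = ["2D", "video"]   # traditional media
GREEN_STORE = ["3D", "AR"]     # immersive media
ARTICLES = ["camera", "chair", "tennis shoe"]

# Every Blue Store medium paired with every Green Store medium.
conditions = list(product(BLUE_STORE, GREEN_STORE))


def article_order(participant_index):
    """Hypothetical rotation: shift the article list by one per participant."""
    shift = participant_index % len(ARTICLES)
    return ARTICLES[shift:] + ARTICLES[:shift]


for blue, green in conditions:
    print(f"{blue} v {green}")
```

Crossing two media per store gives exactly the four combinations listed above, and rotating the starting article keeps any one article from always coming first.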

I measured participants’ galvanic skin responses (GSR) while they interacted with each experience. GSR uses a small sensor to measure changes in the skin’s electrical conductivity caused by increased sweating. Part of an emotional response is a set of bodily changes (arousal) that cause you to sweat a little. The stronger your emotional response, the stronger your arousal and the larger the change in skin conductivity. GSR cannot tell you which emotion someone is feeling, but it can provide a metric that serves as a proxy for the intensity of an emotion. I called this metric an “arousal score.” I expected that higher arousal scores would correlate positively with engagement and media preference.
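For readers who want a feel for what such a metric could look like, here is a rough sketch. It assumes the score is the mean skin conductance during an experience minus the mean during a resting baseline, which is one common way to summarize GSR but is my assumption here, not necessarily how the arousal score in the study was computed. The function name and the sample readings (in microsiemens) are made up for illustration and are not study data.

```python
from statistics import mean


def arousal_score(baseline_us, experience_us):
    """Assumed definition: mean conductance during the experience
    minus mean conductance at rest (both in microsiemens)."""
    return mean(experience_us) - mean(baseline_us)


# Illustrative readings only, not data from the study.
baseline = [2.0, 2.0, 2.0]
experience = [2.5, 3.0, 3.5]

score = arousal_score(baseline, experience)
print(score)
```

A larger positive score means a bigger rise in conductivity relative to rest, which is read as a more intense emotional response, whatever the emotion is.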

After each experience, I asked participants how engaging they found it. At the end, I also asked participants which store they liked best: the “Blue Store,” which presented articles in traditional media, or the “Green Store,” which presented articles in immersive media. As you might have guessed, most people preferred the “Green Store.”

Differences in self-reported levels of engagement told a similar story. On average, participants reported being more engaged by immersive media experiences than by traditional media experiences.

GSR told the same story. The interesting part, however, was that the increase in arousal for AR relative to 2D and video was much higher than the increase for 3D relative to traditional media; for 3D, the differences were not significant.

That meant that even if people reported preferring 3D over traditional media, that preference didn’t produce emotional responses as strong as those produced by AR. That is, interacting with an AR experience felt WAY COOLER than watching 2D pictures or a video, while interacting with a 3D experience felt just a little bit better than watching 2D pictures or videos.


We expected 3D and AR to be more engaging than traditional media because of the amount of information that immersive media provide to the user. Experts agree that “AR provides more 3D product information, in different colors and styles, which enhances users’ perception of reality” (Poushneh & Vasquez-Parraga, 2017, p. 233). “AR’s ability to offer the consumer multiple viewing angles and other three-dimensional features clearly offers substantially more potential for the delivery of rich sensory information than traditional means of online advertising” (Hopp & Gangadharbatla, 2016, p. 114). However, we did not expect 3D to be only a little bit more engaging than traditional media; we expected it to be almost as engaging as AR.

3D articles actually presented the possibility for MORE viewing angles than AR. You couldn’t see the bottom of the objects with AR. If rich physical information made all the difference, 3D would have felt cooler than AR, which was not the case.

The difference had less to do with information about the product than with the fact that AR felt real. There was a real-life background, and the viewing angle changed as the participants moved.

However, realism was not precisely what made AR feel cooler. What made AR feel cool was that participants did not expect AR to look real. The novelty was what made the experience cool. When I compared the arousal scores of people with low frequency of exposure to AR to the arousal scores of people with high frequency of exposure to AR, I found a big difference.

What does this mean? Does it mean that once AR becomes ubiquitous AR will lose all its appeal?

Nope. It means that the medium can only do so much. The message must be engaging by itself and use the medium’s affordances properly.

The messages I used were all equally boring: look at 2D pictures, watch a video, play with a 3D object, or play with an AR experience. Participants were asked to pretend they were shopping, but it was a boring experience. Their goal was to make the $20 we offered in exchange for their time and get out of the lab as soon as possible. However, when confronted with AR, the participants’ goals changed. What was this sorcery? AR consists of digital images, but AR does not behave like traditional digital images. AR behaves like the real thing, like a real camera or like a real chair! The viewing angles changed as the participants moved!

The AR experiences motivated exploratory behavior. Participants who expected AR to be as boring as traditional digital images continued playing with the AR experiences until their expectations changed and their brains learned to predict that AR experiences look real.

The dopamine prediction error hypothesis offers an explanation of how this may come to happen. The neurotransmitter dopamine serves several functions, but a very important one is to motivate us to seek rewards. Dopamine is not a reward per se. Dopamine gives us the extra push we need to achieve a reward. Dopamine translates to feelings of euphoria, which may feel good, but real pleasure comes from endorphins (that’s material for another post). In brief, what dopamine does is motivate.

The dopamine prediction error hypothesis proposes that the firing of the dopaminergic neurons in the mesolimbic pathway is proportional to a cost-benefit valuation of attaining a goal. Dopaminergic neurons in the mesolimbic pathway will fire more frequently in response to cues that announce unexpected rewards, fire less in response to cues that announce predicted rewards, and fire at an almost null rate in response to cues that announce a lack of reward. The prediction error is the difference between what one now expects to receive, based on sensory cues, and what one originally expected, based on experience. The higher the prediction error, the higher the release of dopamine and thus the weaker the inclination to disengage attention.
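The idea can be sketched in a few lines of code. This is a toy illustration under my own assumptions, not a model fitted to the study: it treats the prediction error as the signalled reward minus the expected reward, and it updates the expectation with a simple delta rule (in the spirit of Rescorla-Wagner learning). The function name, the learning rate, and the reward values are all hypothetical.

```python
def prediction_errors(cued_reward, initial_expectation, learning_rate, n_exposures):
    """Return the prediction error at each exposure under a simple delta rule."""
    expectation = initial_expectation
    errors = []
    for _ in range(n_exposures):
        # Prediction error: what the cue signals minus what experience led us to expect.
        error = cued_reward - expectation
        errors.append(error)
        # Experience nudges the expectation toward what was actually signalled.
        expectation += learning_rate * error
    return errors


# A participant who expects nothing special (0.0) meets an experience worth 1.0.
errors = prediction_errors(
    cued_reward=1.0, initial_expectation=0.0, learning_rate=0.5, n_exposures=4
)
print(errors)  # [1.0, 0.5, 0.25, 0.125]
```

The error is largest on the first exposure and shrinks with each repetition, which matches the pattern described here: the surprise fades as the brain learns what to expect.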

For instance, imagine that you are out and about wanting to buy a cup of coffee. You expect coffee to cost about $2. There is a coffee shop a few meters away where a cup of coffee costs $12.95. Would you buy it there? Probably not, right? You won’t be too motivated to go in there. You know that if you walk one block south, you can get a cup of coffee for $2. Now, before you start walking there, someone tells you that if you walk two blocks north, you can get a coffee for free at another coffee shop, but that coffee shop closes at noon, and it’s 11:55 AM. Chances are, your VTA (ventral tegmental area) will start releasing dopamine to give you the extra motivation you need to speed up your pace and make it in time to get the free cup of coffee. Why? You were ready to pay $2; you did not expect free coffee!

The costs of engaging attention are opportunity costs rather than actual cognitive effort, and they must be equal to or lower than the expected rewards. Likewise, the perceived value of a reward refers to its perceived utility rather than to its objective value (indicated here by the sigmoidal blue curve). What the dopamine prediction error hypothesis suggests, then, is that when a task seems too difficult or too easy, the cost-benefit valuation of persistence will be negative. The sweet spot of engagement is when the utility of remaining engaged is perceived as higher than the utility of engaging with anything else.

What does this mean with regard to AR?

If AR was more engaging because of its novelty, will AR lose all its appeal as it becomes ubiquitous? No. The lesson here is that persuasive messages cannot rely on the affordances of a medium alone. Immersive media have advantages that traditional media do not, but just as movies are not automatically more engaging than books, even if they can convey much more information in one second, immersive media will only be more engaging than traditional media when the message itself is more engaging. Just like not every movie will be an engaging one, not every AR experience will be an engaging one.

To be engaging, messages must provide users with goals, challenge, and rules. Goals establish the measure against which stimuli are emotionally categorized. Without goals that promise a reward, there cannot be emotional responses. In the case of this study, the reward of interacting with AR was simply to understand how AR behaves. Challenge prevents users from immediately achieving a goal, forcing them to engage their attention and formulate plans of action aimed at achieving that goal. Rules provide a baseline from which users can predict the consequences of their actions (or the actions of others, in the case of a story) and allow for adjustments in behavior. The combination of these three elements provides the necessary motivation to remain engaged.

Increasing the sensation of presence can help, but can only do so much. Because the experiences in this study were so simple (no intrinsic goals, no real challenge) they offered no intrinsic value to the participants, except in those cases where the experiences were novel.

In other words, it is not what the experience can provide, but what the user can get from it.