Mobile Virtual Reality as an Educational Platform: A Pilot Study on the Impact of Immersion and Positive Emotion Induction in the Learning Process

The purpose of this study is to evaluate the influence of emotional induction and level of immersion on knowledge acquisition and motivation. Two conditions were used for immersion modulation: a high immersive condition, which consisted of the viewing of educational content through a head-mounted-display; and a low immersive condition, which was achieved through direct viewing on a tablet. The emotional conditions, created through video simulation, consisted of a positive versus neutral mood induction procedure. The participants were 56 high school students enrolled on a social science course. The results indicate a significant effect of the positive emotion/high immersive condition in knowledge acquisition while positive emotion induction had a positive effect on the interest subscale of the motivation assessment tool used for both immersive conditions.


INTRODUCTION
The popularisation and accessibility of mobile Virtual Reality (mVR) technology in the coming years is likely to have a significant impact on educational contexts and the overall development of students as lifelong learners. Proposed by Motiwalla (2007) as a new technological approach for teaching, there are extensive opportunities for both traditional and distance education in the design of fully immersive experiences with high-quality visualisation and in combining them with the advanced interactive capabilities and connectivity offered by modern smartphones. As Jerald (2015) notes, Virtual Reality (VR) "has turned a corner, transitioning from a specialized laboratory instrument available only to the technically elite, to a mainstream mode of content consumption available to any consumer" (Jerald, 2015).

LITERATURE REVIEW
Although research in the area of virtual reality technology has been ongoing for many years, its actual use and implementation in experiments in educational contexts are extremely limited, partly due to the high cost of this technology prior to the arrival of smartphone-based solutions (Google Cardboard was introduced in June 2014 and the first version of the Samsung Gear VR became available in December 2014). This is one of the reasons why Merchant et al. (2014) focused their meta-analysis on "desktop VR" (3D visualisations on a computer screen). They justified this in the light of the many practical concerns and limitations that restricted the widespread use of true Virtual Reality technology in educational settings. One of the many reasons why this technology was beyond the reach of schools was that it was not financially feasible. In addition, users were found to experience significant physical and psychological discomfort when using previous generations of VR hardware.

True Educational Virtual Environments (EVEs) provide an immersive experience (Slater, 2009), contextualize content and support problem-solving inside the virtual environment (Tzuriel, 2000). In this context, we understand "immersion" in line with Slater's (1999) view, which treats it as an objective, measurable aspect of a Virtual Reality system (for example, the field of view could serve to compare two VR systems in terms of immersion). It represents the extent to which the system produces a surrounding environment, disconnecting us from the 'real world' and providing a sharp panoramic vision of the virtual environment.
Other elements that can improve the effectiveness of EVEs have been inspired by video games in which events are experienced by the viewer (Bavelier et al., 2012). The visual complexity of videogames and the quantity of stimuli are factors that must also be considered (Bavelier et al., 2012). In addition, interaction with the individual's whole body and multisensorial inputs can also increase learning effectiveness (Fowler, 2014). Dalgarno and Lee (2010) also see the interaction capabilities and the high level of realism (both in the environment and in user actions) as strengths. Immersion in a digital environment can improve learning in three ways: it can provide multiple perspectives, it can contextualise the environment and it can help the transferability of learned material (Dede, 2009).
Most previous applications of EVEs have centred on mathematics and sciences (Mikropoulos & Natsis, 2011). One explanation is that immersive technologies make it easier to understand abstract concepts. Examples include chemistry experiments with secondary school students (Bell & Fogler, 1995), specific educational content for mathematical concepts with avatars personifying teachers and students (Taxén & Naeve, 2001), and, in physics, a study about mass gravity in the solar system (Civelek et al., 2014). The results showed higher comprehensibility, achievement and retention over time.
There are a limited number of works applying this technology to the social sciences. It has been used to expose students to places and situations they could not possibly experience in real life, for example, the solar system (Bakas & Mikropoulos, 2003), and to let them take care of, and be responsible for, the life cycle of a plant (Roussos et al., 1999). Ecology is another area which is open to the use of virtual environments. Wrzesien and Alcañiz (2010) compared an immersive environment with a non-immersive one using a sample of primary students. The results were not significant, although the users described the immersive environment as being more enjoyable than the non-immersive one. Rupp et al. (2016), with the aim of evaluating the influence of immersion technology on expectation and degree of information recall among university students, conducted a study comparing a NASA film viewed on a smartphone, a Google Cardboard and the Oculus Rift DK2. In this specific case, the use of the higher immersion hardware by subjects with higher expectations resulted in fewer details of the video being remembered. However, the study did highlight that the video was, in fact, inappropriate for the purpose as it included too many distractors.
Moreno's (2006) framework for the Cognitive-Affective Theory of Learning with Media postulates that the multimedia learning process is mediated by the learner's mood. Hascher (2010) provided a good overview of the state of the art at that point on the interaction of learning and emotion, proposing a general framework for theory and research in the field and showing the complexity of the topic. She emphasises that, despite the evidence of the positive effects of positive mood and emotions in the learning process, additional research is needed to advance the understanding of the complex relationship between emotion and learning.

Contribution of this paper to the literature
• This is the first article to evaluate the influence of emotional induction and level of immersion on knowledge acquisition and motivation using mobile virtual reality hardware.
• The results indicate a significant effect of the positive emotion/high immersive condition on knowledge acquisition while positive emotion induction had a positive effect on participants' interest in both immersive conditions.
• Positive emotion is a stronger modulator than immersive condition for both knowledge acquisition and motivation, although the high immersive condition increases this positive effect.

Many works have contributed to the study of the relationship between learning and emotion. For example, Brand et al. (2007) provide findings showing that both positive and negative moods may hinder or promote information processing. In a first experiment, participants in a negative mood solved the transfer tasks poorly. In a second experiment, mood affected performance if it was induced before the learning phase; participants in a negative mood needed more attempts to reach the level required in the experiment. In other work, Park et al. (2015) demonstrated that learners with positive emotional states show better learning outcomes. Liew and Su-Mae (2016) conducted an experiment on learning a basic programming algorithm; the results revealed that negative mood enhanced intrinsic motivation and germane load, while reducing learning transfer. Nadler et al. (2010) induced positive, neutral, and negative moods in subjects learning either a rule-described or a non-rule-described category set. Subjects in the positive-mood condition performed better than subjects in the neutral- or negative-mood conditions in classifying stimuli from rule-described categories. Positive mood also affected the strategy of subjects who classified stimuli from non-rule-described categories.
These previous works show that an important factor to consider in the development of virtual learning experiences is the "emotional feeling" that can be generated through interaction with the virtual environment. Learning tasks should develop positive emotions. It is important to note that information transmitted by the senses is easily retained in the limbic system, which is connected to the frontal cortex, both of which are involved in emotion.
Positive emotion stimulates curiosity, heightens attention and arouses interest in the topic being learned. The absence of emotion has consequences for learning and knowledge retention in academic life (Mora, 2013).
In order to contribute to a deeper understanding of the effects of immersion levels and emotional induction on the learning process when students are experiencing a learning activity inside an Educational Virtual Environment, the present study will analyse two experimental conditions: level of immersion (low/high) and emotional induction (neutral/positive) to evaluate their influence on short-term (knowledge acquisition) and medium-term knowledge retention (in our case, a week after conducting the experiment). The high immersion (HI) condition was obtained by creating sensory isolation from the surroundings via a head-mounted display (HMD), while the low immersion (LI) condition was achieved using a tablet. There were no substantial differences in the educational content presented in both conditions, either in terms of navigation or in the interaction interfaces. Therefore, we manipulated only one of the conditions that has traditionally proved to be necessary for increasing immersion to "remove the participant from the external world through self-contained plots and narratives" (Slater & Wilbur, 1997). Secondly, as previously described, we examined the effect of positive emotional induction and motivation on learning.

Subjects
The experimental sample included 56 students (23 girls and 33 boys) aged 14 to 16 years, recruited from two private schools in Valencia (Spain), both of which use the same pedagogical approach. All the participants were in the same year at secondary school and had a history of academic failure. They were all from the same socioeconomic level. They had a maximum of 55 minutes to complete the entire activity (including questionnaires, emotional induction procedures and interaction with the educational content). The experiment was conducted during school hours, mainly in the mornings.
Participants' parents were provided with written information about the study and were required to give written consent for their children to take part. Only the parents completed the written consents. However, the teachers and the parents explained the activity to the participants.

Psychological Assessment and Emotional Induction
The following questionnaires were presented to each participant:
• Knowledge Questionnaire (KQ): An ad hoc ten-item questionnaire was created by teachers to measure knowledge of the educational content.
• Self-Assessment Manikin (SAM): This is a well-validated, non-verbal questionnaire assessing the three affective dimensions: valence (positive or negative feeling caused by a situation: 1 = unhappy to 5 = happy), arousal (psychological posture of a person when faced with a condition: 1 = excited to 5 = sleepy) and dominance (measure of personal control: 1 = controlled to 5 = submissive) (Bradley & Lang, 1994).
• Intrinsic Motivation Inventory (IMI): This is a well-validated questionnaire assessing the intrinsic motivation related to a specific activity. The questionnaire used a Likert scale (1 = disagree; 5 = agree) and consisted of three subscales: competence (5 items), interest (5 items) and effort (4 items) (Ryan & Deci, 2000). The post-test and pre-test questionnaires have similar items. These items assess Competence using statements such as "I think I was good in making/playing this game" or "I am satisfied with my performance while making/playing the game". Interest was assessed with statements such as "I think school is quite enjoyable" or "I think school is fun" and Effort with items such as "I put much effort into school" or "It was important to me to do well in making/playing this game".
For emotional induction, we selected two movie clips, short film scenes with happy (3.56 minutes) and neutral (2.19 minutes) content. The happy scene was a snippet from "Singing in the Rain". Specifically, a man is singing and dancing in the street. For the neutral content we used a snippet with a man driving a van along a road. Both snippets have been validated by Baños et al. (2004).

Experimental Design
A 2x2 factorial design was applied. Two factors were considered: emotional induction (with two levels, positive and neutral) and immersion (with two levels, low and high). Participants were randomly assigned to one of the four experimental groups. Each of the four groups had 14 subjects to cover the four possible combinations of the factors.
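The randomisation procedure itself is not described, so the following is only a minimal sketch of one way to obtain four equal groups of 14 from 56 participants. The function name `assign_balanced`, the participant identifiers and the seed are hypothetical and are not taken from the study.

```python
import random
from collections import Counter
from itertools import product


def assign_balanced(participant_ids, seed=None):
    """Randomly assign participants to the four cells of a 2x2 design
    (emotional induction x immersion) with equal cell sizes."""
    cells = list(product(("positive", "neutral"), ("high", "low")))
    if len(participant_ids) % len(cells) != 0:
        raise ValueError("sample size must be a multiple of the number of cells")
    per_cell = len(participant_ids) // len(cells)
    # Build one slot per participant so that every cell appears per_cell times,
    # then shuffle the slots to randomise who lands in which cell.
    slots = [cell for cell in cells for _ in range(per_cell)]
    rng = random.Random(seed)
    rng.shuffle(slots)
    return dict(zip(participant_ids, slots))


# Hypothetical participant identifiers 1..56.
assignment = assign_balanced(list(range(1, 57)), seed=0)
cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

Shuffling a pre-built list of slots, rather than drawing a cell independently for each participant, is what guarantees the equal group sizes reported here.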
The presentation order of each exposure condition (high and low immersive), as well as the order of appearance of each emotional stimulus category (emotional induction), in the two different conditions, was counterbalanced for each group. The participants completed a Knowledge Questionnaire (KQ) one week before the experiment. They were also asked to complete the IMI and the Self-Assessment Manikin (SAM) test (working baseline).
The experimental session started with the SAM questionnaire to measure the baseline, followed by the emotional induction. Then the SAM questionnaire was administered again to measure the effect of this induction. Participants were told to freely examine the educational content and undertake activities related to both learning environments (head-mounted display and tablet).
To measure the variation of each exposure condition, at the end of the experiment subjects completed the Intrinsic Motivation Inventory, the Self-Assessment Manikin and the Knowledge Questionnaires (short-term knowledge retention). Finally, after one week, participants again completed the Knowledge Questionnaire to measure their medium-term knowledge retention.

Educational Content
Two educational apps were developed on the Android platform with almost exactly the same learning experience. The only difference was that one app was installed on an Android-powered tablet and the other was installed on a Samsung Gear VR headset powered by a smartphone. The activity was composed of 2D multimedia content in which students were guided through the app by a narrator.
At the beginning of the experiment, the narrator explained the task to the participants and they were shown a world map. Then the participants were asked to observe, focus their attention on, and prepare themselves to go on a trip in which a series of geographically-related topics were presented. The participants were exposed to the content for 8 minutes.
The topics explained in the application were: 1. The birth of agriculture. The audio narration explained where agriculture first appeared while this information was represented graphically to show its geographical location.
4. Population distribution and the evolution of the phenomenon. The principal aim of this module was to highlight the factors (physical, economic and demographical) which influence population distribution. The narrator pointed out the most highly populated areas in the world, and the images of these countries were highlighted with a different colour and with relevant photographs of individuals (Figure 2).
5. Why we cannot live in some places? The principal aim of this topic was to describe the reasons that lead people to live in specific geographical areas.
6. Population movements. This module taught students the formulae of births, deaths and the rate of natural increase. It studied population movements, their causes and consequences. Furthermore, in this section, the population pyramid concept was explained (Figure 3).
After each learning module, participants had to answer some questions in one of two ways:
• High immersion condition: students used a red point in the middle of the virtual environment. This could be moved by turning the head towards the desired location to give the correct answer. Alternatively, an area of a map could be highlighted by putting a finger on the touchpad on the right side of the headset (Figure 4).
• Low immersion condition: students had to choose the correct answer by touching the tablet screen (Figure 5).
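The exact formulae presented in the population movements module are not reproduced in the text. As an illustration only, the sketch below uses the conventional demographic definitions: crude birth and death rates expressed per 1,000 inhabitants, and the rate of natural increase as their difference. The function names and the population figures are made up for the example.

```python
def crude_rate(events, population, per=1000):
    """Number of events (births or deaths) per `per` inhabitants in a year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * per / population


def natural_increase_rate(births, deaths, population, per=1000):
    """Crude birth rate minus crude death rate, per `per` inhabitants."""
    return crude_rate(births, population, per) - crude_rate(deaths, population, per)


# Hypothetical yearly figures for an imaginary region.
births, deaths, population = 12_000, 9_000, 1_000_000
birth_rate = crude_rate(births, population)
death_rate = crude_rate(deaths, population)
increase = natural_increase_rate(births, deaths, population)
print(birth_rate, death_rate, increase)
```

A positive rate of natural increase means that births exceed deaths; migration is deliberately left out of this quantity.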

Hardware
The hardware used in the experiment was a Samsung Gear VR headset equipped with a Samsung Galaxy Note 4 smartphone featuring a 5.7 inch Quad-HD display (2560 x 1440 pixels) with 515 dpi resolution and an Android tablet featuring a 10-inch touch screen.
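The quoted pixel density can be cross-checked from the resolution and the screen diagonal. The short sketch below assumes square pixels and treats the 5.7 inch figure as the exact diagonal; the function name is illustrative.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density along the diagonal, assuming square pixels."""
    return math.hypot(width_px, height_px) / diagonal_in


# Values quoted for the Samsung Galaxy Note 4 display.
ppi = pixels_per_inch(2560, 1440, 5.7)
print(round(ppi))  # prints 515
```

The result agrees with the 515 dpi stated for the smartphone display.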

RESULTS
The analyses were performed using SPSS version 22.0 (Statistical Package for the Social Sciences for Windows, Chicago, IL) for PC. Independent tests were conducted to verify the baseline homogeneity of the sample in terms of age. Since the sample was characterised by statistically significant baseline differences, psychological differences were calculated in the Intrinsic Motivation Inventory (IMI) and SAM data measured after exposure compared to the corresponding baseline. Next, mixed ANOVAs (and ANCOVAs) were conducted to test whether the psychological responses changed according to the exposure condition (High Immersive or Low Immersive), or the type of emotional induction (Positive or Neutral). The level of significance was set at α = .05.

Intrinsic Motivation Inventory (IMI)
Cronbach's alphas for the pre- and post-test questionnaires were calculated to evaluate the internal consistency of the competence, effort and interest scales of the Intrinsic Motivation Inventory. As can be seen in Table 1, subscale homogeneity was assessed using the corrected item-total correlation. The internal consistency for the pre- and post-test competence scales was found to be good (.84 and .78, respectively). For the pre-test effort questionnaire, reliability was good (.85), while for the post-test effort questionnaire it was poor (.59). For the interest questionnaire, the reliability of neither the pre- nor the post-test questionnaire was acceptable (.41 for the pre-test and .43 for the post-test). In both interest questionnaires, the internal consistency was found to be good (.81 for both pre- and post-test questionnaires) when the "I think school is boring" item in the pre-test questionnaire and the "I think doing this activity is boring" item in the post-test questionnaire were eliminated. Therefore, these items were not included in the follow-up analysis.
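For readers unfamiliar with the statistic, the following is a minimal sketch of how Cronbach's alpha and an "alpha if item deleted" check can be computed with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The analyses reported here were run in SPSS; the function names and the tiny data set below are hypothetical and serve only to show how dropping one inconsistent item can raise alpha.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("at least two items are required")
    item_columns = list(zip(*responses))
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(col) for col in item_columns)
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


def alpha_if_item_deleted(responses, item_index):
    """Alpha recomputed after removing the item at `item_index`."""
    reduced = [row[:item_index] + row[item_index + 1:] for row in responses]
    return cronbach_alpha(reduced)


# Hypothetical scores: items 1 and 2 agree, item 3 does not follow them.
scores = [
    [1, 1, 5],
    [2, 2, 1],
    [3, 3, 4],
    [4, 4, 2],
    [5, 5, 3],
]
alpha_all = cronbach_alpha(scores)
alpha_without_third = alpha_if_item_deleted(scores, 2)
print(round(alpha_all, 2), round(alpha_without_third, 2))
```

In this toy example the third item lowers the scale's consistency, which mirrors the decision above to drop the "boring" items from the interest subscale.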

Figure 5. Low immersion condition
We carried out Pearson correlations to investigate the measurement stability of the scales. The pre-test questionnaires aimed at measuring general intrinsic motivation at school while the post-test questionnaires aimed at measuring lesson-specific intrinsic motivation; thus the latter measure diverged from the former. None of the competence, effort or interest post-test scales had significant correlations with their respective pre-test scales (r = .20, r = -.83, r = .14, and p > .05, respectively). Due to this divergence, we examined the differences between the pre-test and post-test scores on the intrinsic motivation scales for the four conditions using a mixed ANOVA with the evaluation time (pre- and post-test) as a repeated measure, and the emotional induction and learning environment as factors.
For the competence scale, we found only one main effect, that of evaluation time, F(1,52) = 45.02, p < .001, ηp² = .46. Students rated themselves as less competent in the pre-test questionnaire about school (Mean ± Standard Deviation: 3.13±0.85) than in the post-test questionnaire about the specific lesson (3.99±0.58). All other main and interaction effects were non-significant (p > .05).
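As a reading aid, partial eta squared can be recovered from a reported F statistic and its degrees of freedom through the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below applies it to the competence-scale effect reported above; the function name is illustrative and this is not a description of how the values were produced in SPSS.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of evaluation time on competence: F(1,52) = 45.02.
eta_time = partial_eta_squared(45.02, 1, 52)
print(round(eta_time, 2))  # prints 0.46
```

The same identity can be used to check the other effect sizes reported in this section against their F values.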
Students rated their effort in the specific lesson (4.35±0.81) as higher than their general effort at school (3.91±0.51), F(1,52) = 13.23, p = .001, ηp² = .20. The interaction of evaluation time and level of immersion was marginally significant, F(1,52) = 3.88, p = .07, ηp² = .07. Pairwise comparisons yielded significant differences in rated effort between pre- and post-test scores (3.85±0.85 and 4.37±0.44), but only in the high immersion group, p < .001. For the low immersion group, these scores were 3.96±0.78 and 4.30±0.57, respectively (p > .05). All the other main and interaction effects were non-significant, p > .05.
Students were more interested in the map activity (3.98±0.70) than in school (3.31±0.82), F(1,52) = 20.08, p < .001, ηp² = .28 (see Figure 6). We also found that the interaction of evaluation time with emotional induction was marginally significant, F(1,52) = 3.37, p = .072, ηp² = .06. In the pre-test scores about school there was no difference between the positive and neutral groups (3.39±0.89 vs. 3.79±0.75, respectively), p > .05, but in the post-test scores about the activity, interest was higher in the positive group than in the neutral group (4.17±0.59 vs. 3.23±0.75), p = .041. None of the other main and interaction effects were significant (p > .05).

Self-Assessment Manikin (SAM) Test
To measure the students' emotional responses, we analysed the valence scale of the SAM test. Because the baseline test scores (before the emotional induction) were correlated with the pre-test scores (after the emotional induction and before the lesson, r = .44, p = .001) and post-test scores (after the lesson, r = .38, p < .001), this baseline was used as a covariate in a subsequent mixed ANCOVA carried out to investigate the effect of the emotional induction in conjunction with the different levels of immersion. We used the valence SAM scores of each evaluation as repeated measures and the type of emotional induction and level of immersion as factors. The baseline effect was the only significant main effect, F(1,51) = 13.73, p < .001, showing that the pre-induction scores predicted the subsequent scores. When this effect was removed, we found that the level of immersion × emotional induction interaction was marginally significant, F(1,51) = 3.94, p = .053, ηp² = .07. Students' scores were higher in the positive group (7.68±1.70) than in the neutral group (6.86±1.90), but only in the high immersion group (pairwise comparisons, p = .021; for the low immersion group, the scores were, respectively, 6.04±2.53 vs. 6.96±1.23, p > .05). Moreover, the evaluation × level of immersion and the evaluation × emotional induction effects were found to be significant, F(1,51) = 6.98, p = .011, ηp² = .12, and F(1,51) = 19.01, p < .001, ηp² = .27, respectively. Valence scores were higher in the high immersion group (7.54±1.73) than in the low immersion group (6.25±2.07), but only after the lesson (post-test scores, pairwise comparisons, p = .017); there was no difference between them (7.00±1.92 vs. 6.75±1.99) after the emotional induction (pre-test scores, pairwise comparisons, p > .05).
Valence scores were also higher for the positive induction group (7.29±2.17) than for the neutral induction group (7.14±1.41) after the emotional induction (pairwise-comparisons, p = .013), although this difference disappeared after the lesson (6.43±2.36 vs. 6.46±1.62, p > .05) (Figure 7).

Knowledge Acquisition
In order to determine whether changing the level of immersion and emotional induction yielded differences in terms of knowledge acquisition, we analysed the gains and losses of knowledge between the Knowledge Questionnaire scores from the pre-test, the immediate post-test and the delayed post-test. The pre-test Knowledge Questionnaire scores were the baseline for the immediate post-test ones, and the immediate post-test Knowledge Questionnaire scores were the baseline for the delayed post-test scores. To achieve this, we carried out a mixed ANOVA including level of immersion and emotional induction as factors, and the knowledge gain at each post-test time as a repeated measure with two levels. The first level was the immediate post-test Knowledge Questionnaire scores minus the pre-test scores (immediate knowledge gain), and the second was the delayed post-test Knowledge Questionnaire scores minus the immediate post-test ones (delayed knowledge gain).
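The two gain scores defined above can be sketched as follows. This is only an illustration of the difference scores that feed the mixed ANOVA, not the analysis itself; the record layout, the function name and all scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_gains_by_group(records):
    """Mean immediate and delayed knowledge gain per (emotion, immersion) group."""
    gains = defaultdict(lambda: {"immediate": [], "delayed": []})
    for r in records:
        group = (r["emotion"], r["immersion"])
        # Immediate gain: immediate post-test minus pre-test.
        gains[group]["immediate"].append(r["post"] - r["pre"])
        # Delayed gain: delayed post-test minus immediate post-test.
        gains[group]["delayed"].append(r["delayed"] - r["post"])
    return {g: (mean(v["immediate"]), mean(v["delayed"])) for g, v in gains.items()}


# Made-up Knowledge Questionnaire scores (0-10) for four imaginary students.
records = [
    {"emotion": "positive", "immersion": "high", "pre": 4, "post": 8, "delayed": 7},
    {"emotion": "positive", "immersion": "high", "pre": 5, "post": 7, "delayed": 7},
    {"emotion": "neutral", "immersion": "low", "pre": 4, "post": 5, "delayed": 4},
    {"emotion": "neutral", "immersion": "low", "pre": 6, "post": 7, "delayed": 5},
]
group_means = mean_gains_by_group(records)
print(group_means)
```

A negative delayed gain indicates that part of what was learned in the session was lost over the following week.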

Effect of Emotional Induction
We found a stronger effect of positive than of neutral emotional induction in the following assessments:
1) Participants' interest after the lesson was higher in the positive group than in the neutral one.
2) Students' valence scores were higher for the positive induction group than for the neutral induction group after the emotional induction but not after the lesson. In addition, valence scores were higher in the positive group than in the neutral but only for the high immersion group.
3) Finally, the knowledge gain was higher among students who received positive emotional induction compared to students in the neutral induction group.

DISCUSSION
Regarding the first objective of this study (the influence on short-term knowledge retention of high immersion versus low immersion states), we can conclude that the immersive condition influences knowledge retention when delivering educational content. In the short term, participants have better retention when there is positive emotional induction and high immersion. The statistical analysis showed increased medium-term learning in the high immersion condition.
These results are in line with the previous work of Kort et al. (2001), which highlighted the existence of interaction between emotion and learning. Their research focused on student emotions during the learning process. Feelings of amazement, satisfaction, curiosity, hope, and inquiry were identified as good emotions which were associated with a higher level of learning. Our results also agree with Reschly et al. (2012), who studied the impact of positive emotion on student engagement. We can conclude that positive emotion increases engagement, and this can be seen as a multidimensional construct related to academic improvement, as proposed by Lyubomirsky et al. (2005).
Regarding the second main aim of this experimental work (analysing whether manipulating an emotion affects participant motivation), a psychological assessment analysis revealed a significant difference in the interest subscale between the pre-tests and post-tests in the high immersion condition.
This result is also aligned with previous works. For example, Turner, Meyer, and Schweinle (2003) analysed the learning process based on three elements: cognition, emotion and learning. They observed that emotion was an essential component for student motivation. This was also noted by Tüzün et al. (2009), when they compared student motivation in a primary school using a game-based learning approach as against a traditional schoolroom based approach. They observed that students demonstrated statistically significant higher intrinsic motivation and lower extrinsic motivation when learning in the game-based environment. Wrzesien and Alcañiz (2010) observed a similar effect in a study of the learning of natural sciences and ecology in a primary school. One group of students carried out a learning activity in an immersive environment while the other group used a 2D representation. Results showed that the students using the immersive environment were more satisfied with their learning experience. The authors concluded that immersive environments have the ability to improve students' intrinsic motivation.
Finally, considering the third objective of this work (determining whether positive emotion helps students retain educational content), we observed that valence scale values were higher when there was positive emotional induction and high immersion. This statistically significant difference was found after the task but not after the emotional induction, meaning that high immersion can be used as a tool to enhance the influence of emotions. This is an interesting fact, especially when considered in conjunction with the ideas of Dirkx (2001) about the importance of emotions as elements that can either impede or motivate learning.
There are some limitations to the study that should be highlighted. Although there were 56 participants, each experimental condition was covered by only 14 subjects (2x2 design). More significant interactions between the experimental conditions would probably have been detected in a larger sample. It is also worth mentioning that the emotional induction procedure applied should be analysed in greater detail. There are factors related to sociocultural background and the age of the subjects that could have an impact on the effectiveness of the induction procedure. This would require more specific fine-tuning of the process to select the most relevant film to obtain the desired emotional induction.