Sascha Rushing
Supervisor:Prof. Gudrun Klinker
Advisor:Dipl.-Inf. Univ. David A. Plecher, M.A.
Submission Date:15.11.2018


Immersion has become a significant term for describing and characterizing the experience created by video games. Virtual reality offers additional unique properties towards creating an immersive game, as the head mounted display seals off the player from reality completely and intuitive camera controls are provided. Given the necessary financial commitment for high end VR systems and the low performance of mobile VR, this thesis elaborates on an approach of combining a PC and smartphone to create an affordable yet high quality VR system, called SaMaXVR. The smartphone is placed inside a low cost HMD, as it is done in the GoogleCardboard approach, sending the PC the user’s head orientation data. Updating the virtual cameras accordingly, the PC is responsible for generating the VR content and calculating any other game specific actions. The resulting image is streamed back to the smartphone where it is displayed to the player. In order to prevent cyber sickness, low latency is crucial for virtual reality games. Therefore all steps of the pipeline have to be optimized for maximum efficiency and speed. Additionally, the thesis analyzes and compares current definitions of immersion for video games. Implementing the requirements set by these definitions of immersion, a VR racing game utilizing the SaMaXVR technology is evaluated in regards to immersion in the virtual reality context. Ideas and proposals are given on how immersion can be further improved by using virtual reality in contrast to traditional gaming on PC monitors and TV screens. Benchmarks prove SaMaXVR to be a viable solution for seated VR games.


Virtual reality offers unique characteristics for creating immersive and capturing gaming experiences. As the head-mounted-display seals of the player’s vision and feeling of presence in the real world, he is completely exposed to the virtual environment. This leads to the visual cues only focusing on the virtual content in contrast to conventional gaming on a monitor or TV screen, where the real world is still visible in the peripheral vision. Additionally virtual reality provides highly intuitive camera control, as the user can literally look around instead of using peripheral hardware such as a computer mouse or gamepad. In order to experience high quality desktop VR, additional hardware is needed. Given the financial strain of these devices for gamers, the acquisition of such hardware is not possible for everyone, or creates a higher threshold towards involvement. As virtual reality hardware is rather tough to access without having a system installed at home, exposure to gamers that have not yet experienced virtual reality applications proves to be challenging. A more accessible approach, namely virtual reality using mobile devices such as smartphones, allows for an easier and cheaper point of contact. Given the hardware constraints of mobile devices, the visual quality and computational capabilities are limited in comparison to desktop computers. Another drawback of smartphones is the restricted interaction they provide in VR, such as tilting the head or gaze-control. In terms of immersive gaming virtual reality, as a technology, offers promising characteristics in regards of immersive experiences, yet immersion is very dependent on the design and content of a game. Common definitions of immersion and presence by game researchers show how immersion in video games can be analyzed and classified, helping to understand the parameters of immersion and therefore how to manipulate and increase it. This thesis examines if the combination of a PC and smartphone can create a cost-efficient, yet high quality VR gaming experience with a high degree of immersion. The PC will be used as the main computational source, whereas the smartphone will be utilized as a head-mounted-display similar to the GoogleCardboard approach. Connected via WiFi, the smartphone will be sending head orientation data to the PC, which in turn will render the current frame with the updated camera rotation and send it back to the smartphone in the HMD. Due to the fact that the smartphone only provides rotational data, roomscale applications are not possible. As Oculus’ Vice President of Content Jason Rubin stated at the E3 in 2018, a significant percentage of VR users would rather sit down then use the room-scale option[1], making this approach still very viable. Analyzing and examining immersion as a concept of increasing gaming experiences, a racing game will be implemented applying these results with focus on how to increase immersion even further in the VR context. Suggestions will be given on UI and interaction design, laying a foundation for future research in the field of VR immersion and presence.



The basic idea of the approach presented in this thesis, called SaMaXVR, is the use of a smartphone as a HMD for virtual reality gaming. The VR application itself is executed by a powerful gaming PC, streaming the rendered frames to the mobile device.

[2] [3]

Compared to PC and console VR systems, SaMaXVR does not require additional expensive hardware. The high end HMD is replaced by the smartphone using a GoogleCardboard approach. This lowers the financial strain massively as well as the threshold of involvement for gamers new to virtual reality. Users do not have to set up additional hardware for tracking, at the cost of only supporting 3DOF, compared to the room-scaled 6DOF approach of the HTC Vive and Oculus Rift. With no additional hardware necessary, the availability of VR by using SaMaXVR is much higher compared to PC and console VR. Furthermore, SaMaXVR is a wireless approach out of the box. For the Oculus Rift and HTC Vive a spate module by TPcast allows for a wireless connection between the HMD and the PC. Not only does this add extra cost on the gamers end, but also more hardware, as this technology uses a router of its own. SaMaXVR on the other hand utilizes any WiFi enabled router. It is important to mention that higher throughput positively influences the overall latency of the approach.

The figure above displays the three best configurations tested in a 60 seconds benchmark. By comparing the average Round-Trip-Time it becomes clear, that the codec used in the encoding phase makes a huge difference. Reason one being the time of the encoding step itself. Reason two, the transmission time of the encoded frames, which heavily depends on the size of them. The larger theses frames are, the longer the transmission takes, resulting in an increase of the overall RTT. The encoded frames using the Nvenc H.265 are the smallest with an average of 21kB, compared to 31,5 kB of the Nvenc H.264, yet their encoding times are virtually the same. Solely the size difference of the encoded frames leads to an RTT difference of almost 2 ms in favor of the H.265. The most quality efficient configuration, capturing the screen using DX11 and encoding using the hardware accelerated Nvenc H.265, measured an average RTT, from acquiring the head orientation to rendering the corresponding frame on the smartphone, of 27.36 ms.

Immersive Gaming:

Reviewing and grading a video game is a rather subjective task. One might find seemingly objective attributes such as graphical quality and visual realism or the average hours of gameplay. Yet not even these objective characteristics declare a game to be a success or a failure, as the wildly popular multiplatform game Minecraft has shown. In spite of having rather basic and plain visuals, the game has sold over 144 million copies, as of December 2017.[4] Comparing more subjective attributes such as storyline and gameplay is even more difficult, as they can be viewed very differently by different people. The important thing is that all aspects of the game have to be coherent and create a memorable experience.[5] Words often used to classify the ability of a game to produce such intense experiences are flow and immersion.

There are several approaches to defining immersion. On the one hand, immersion can be view as a series of states with varying intensities of involvement, as introduced by Brown and Cairns. On the other hand, immersion can be defined as different dimensions all responsible to creating a unique game experience, as shown by the SCI-model of Ermi and Mäyrä. Ernest W. Adams presented another approach of identifying different kinds of immersion closely linked to specific game genres. Even though the exact definitions may vary from researcher to researcher and gamer to gamer, the gaming domain as a whole has accepted immersion to be a very defining and important criteria for video games. 


The goal of the demo game "VRacing" is to create a highly immersive virtual reality experience, using the SaMaXVR technology, with a strong feeling of presence. In pursuit of achieving this goal, the game design was examined as to how the requirements set by the various definitions of immersion and presence are met. 


UI elements have been fully integrated into the world, completely ridding the game of overlay interface to not break immersion.

The current lap time is displayed on the monitor integrated in the steering wheel.

The garage is a three dimensional substitute for a classic main menu. Driving the car out of the garage will trigger the start of the race. The TV screens hanging on the walls show the currently best driven time. Setting a new track record will update the time shown on the TV. Doing so, integrates this piece of information into the virtual world, instead of displaying it as an overlay menu, not connected to the three dimensional space at all. 


Immersion has become one of the defining terms for evaluating and characterizing video games. Even though various game researchers have provided different approaches of defining how immersion is achieved, they all emphasize on the importance of immersion for a successful game. This thesis provides an analysis of the current state of immersion definitions for video games and how it can be measured. Virtual reality provides unique characteristics for improving immersion and the feeling of presence from a technology standpoint, yet the content and gameplay of the game is very important. The demo game VRacing applied these definitions in pursuit of creating an immersive gaming experience, focusing on aspects specifically relevant in the VR context. Ideas for improving immersion by UI and gameplay design were given. In order to experience high quality virtual reality, gamers have to commit to acquiring additional VR hardware, as smartphone VR is very much limited given the hardware and interaction constraints of mobile devices. The approach presented in this paper combines a smartphone and PC in order to create an immersive high quality VR experience. Different methods used in the development process have been outlined, evaluated and compared for each step of the SaMaXVR pipeline. The benchmarks have shown that a main bottleneck is the wireless transmission of the encoded frames from the PC to the mobile device. This step is directly impacted by the size of said frames, which heavily depend on the codec used in the encoding stage. SaMaXVR drastically lowers the necessary costs of playing VR games at home and has the ability of increasing exposure and affordability of VR for gamers. The benchmarks have proven this approach to be a viable solution for seated VR games. The most quality efficient configuration, capturing the screen using DX11 and encoding using the hardware accelerated Nvenc H.265, measured an average RTT, from acquiring the head orientation to rendering the corresponding frame on the smartphone, of 27.36 ms.








[ PDF (optional) ] 

[ Slides Kickoff/Final (optional)] 

  • No labels