Alexis Smith

Using Sound Design, Unity, C# and FMOD to create a VR Soundscape.





Intro.

For this project I constructed a virtual environment paired with binaural audio, combining technical and creative proficiency through audio integration via Unity, FMOD and C# audio programming. This is balanced with synth and string compositions in a virtual 4.1 format (fig. 1) and user-interchangeable ambiences. The project's focal point is ‘Lighthouse’ (fig. 2), a poem written and spoken by a friend, Kaitlin Harrison-Moore. Mixed binaurally, the poem adds narrative to the environment and aims to evoke emotion in its users. Taken together, the project considers audio, visuals and interactivity, resulting in a multimedia art piece. It showcases acousmatic audio, creating the illusion of a 3D space through two audio channels (BBC, 2020). Initially drawing inspiration from lockdown, the project aimed to offer a small break from stay-at-home restrictions and give users a sense of freedom and relaxation.



Figure 1 - The layout of my virtual 4.1 setup in the Unity Game Engine.




Figure 2 - The Poem Lighthouse by Kaitlin Harrison-Moore.




Planning and Research.

Considering the themes and intentions of the project was vital to establishing a clear direction and concept. Initially, it was heavily focused on being an escape from stay-at-home restrictions, highlighting calmness and well-being. However, further research into projects already in the field changed the direction of my project.


When looking at some of my personal favourite games, ‘Dear Esther’ and ‘Firewatch’, I wanted to have a go at incorporating a narrative into the environment. Developed by The Chinese Room (2012), ‘Dear Esther’ is driven by story and immersion rather than game mechanics, featuring a single narrator and depending on sound as much as on visuals (Eurogamer, 2012). ‘Firewatch’, developed by Campo Santo (2016), has similar concepts and also uses Unity, further confirming the game engine as an effective platform for this project. With a detailed mystery plot for the user to uncover, it takes the user on an emotional journey (ibid). Both show how effective this format is for immersive storytelling.


Beyond video games, I found two art projects that use virtual reality (VR) inspiring. ‘Dream’ combines theatre, audio and technology and is described as the ‘future of live performance’ (MIF, 2020). It combines VR and real-time motion-capture technology (Dream, 2020) for immersive productions. ‘To The Moon’ (2018) is a multimedia VR art piece that merges “Movement, physical engagement, language, music and most of all being able to disappear into something” (ibid.). It has striking visuals and a strong narrative. Both projects made me consider the artistry of the experience and not just the basic premise. By trying to incorporate a narrative and an emotional journey, I felt the project became more fulfilling creatively.


These projects show engagement between audio and a virtual listening space. I aim to achieve this by augmenting the immersion of the binaural audio with creative audio production techniques and interactive visual feedback. These components will lead to the user being more engaged and more immersed in the experience. Adding a layer of user interaction to a binaural VR experience results in a unique experience each time. I plan to execute this by amalgamating binaural audio and 3D audio sources in a virtual 4.1 setup that reacts to user rotation. The listener's perspective can rotate whilst pinned in the centre, creating a unique listening space between the headphones in a binaural VR environment. This also allows Unity itself to be used as an audio mixing tool alongside the processing completed in my DAW (Reaper).
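The idea of a listener pinned in the centre who can only rotate can be sketched in a few lines. This is a minimal illustration in Python, not the project's Unity C# code; the source names and their angles are assumptions made for the example, since the exact placement is only shown in the figures.

```python
# Assumed layout: four sources around a listener fixed at the origin.
# Azimuths are in degrees; 0 is straight ahead and positive is to the right.
# These names and angles are hypothetical, chosen only for illustration.
SOURCE_AZIMUTHS = {
    "front_left": -45.0,
    "front_right": 45.0,
    "rear_right": 135.0,
    "rear_left": -135.0,
}


def relative_azimuth(source_azimuth, listener_yaw):
    """Angle of a source relative to where the listener faces, wrapped to [-180, 180)."""
    return (source_azimuth - listener_yaw + 180.0) % 360.0 - 180.0


def relative_layout(listener_yaw):
    """Where every source sits from the listener's point of view after a rotation."""
    return {
        name: relative_azimuth(azimuth, listener_yaw)
        for name, azimuth in SOURCE_AZIMUTHS.items()
    }


if __name__ == "__main__":
    # Turning 90 degrees to the right moves the front-right source to the front left.
    print(relative_layout(90.0))
```

Because the listener never moves, only these relative angles change, so rotating the head effectively re-pans the whole composition.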


A Head-Related Transfer Function (HRTF) is “how an ear receives sound from a sound source” (3DIO, 2019). It describes how the frequency profile of a sound changes due to the unique characteristics of the listener (ibid). I can manipulate spatial composition, including the HRTF, with The Ambisonic Toolkit (2020) (fig. 3). Going beyond sound placement, this technique allows for nuanced audio manipulation that affects how the spoken word is experienced.


Figure 3 - The Ambisonic Toolkit for Reaper - HRTF.


To support my themes of wellness and relaxation, I researched which audio to select. The 2012 research paper ‘Audio and Visual Selections for Computer Systems that Support Wellness’ presents nature sounds such as birdsong and rain as the most relaxing (fig. 4). The results of my own short peer survey support this study, with elements of nature at the forefront of the answers. Presenting immersive binaural nature environments is therefore important to the project's success. An effective way to record these is a ‘sound walk’, which “is any excursion whose main purpose is listening to the environment” (Victoria, 1974). These walks consist of keynotes, soundmarks, sound signals, sound objects and acousmatics (Schafer, 1977). I used my Zoom H5 microphone to record on location.

This links to soundscape ecology in relation to music. ‘Soundscape’ is used to “describe the relationship between a landscape and the composition of its sound” (Bryan, 2011). Krause (1987) introduced terms to categorise these sounds: biophony for sounds created by organisms, geophony for non-biological nature sounds such as wind, and anthrophony for human-created sounds (ibid). I included all three in my project for a diverse arrangement. I then considered the ‘affordances’ of these sounds “to clarify how the same objective environment can furnish different meanings for different perceivers” (Krueger, 2014). This is what led to the idea of a virtual 4.1 arrangement in VR, which partially overcomes the lack of equipment access due to covid.

The composition it presents aims to be ‘Furniture Music’ (Nyman, 1973), “which will be part of the noises of the environment” and fill the heavy silences between the spoken word (ibid), playing a supporting role rather than being a focal point. It also takes inspiration from Brian Eno’s ambient music, where the “sound is more important than the notes themselves” (Melchior, 2020).




Figure 4 - Audio and Visual Selections for Computer Systems that Support Wellness - Most Relaxing Sounds.


Project Overview.

I mixed this project on my Sennheiser HD280 headphones. With binaural/ambisonic mixing I am aiming to create the illusion of a 3D space with only two channels of audio, making acousmatic audio specifically designed for playback on headphones (Waves, 2017). The Ambisonic 4-channel B-format can be decoded to any speaker array; its channels can be simplified to left-right, front-back, centre and up-down (ibid). Physically, my project has only two channels of audio, but virtually it has a 4.1 setup. This helps achieve my goals technically and creatively by producing a multifaceted project with immersive audio processing. For future projects, I would love to expand on the initial ideas I have experimented with here.
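To make the four B-format channels more concrete, here is a minimal Python sketch of how one mono sample could be encoded into them. It assumes the traditional (FuMa-style) first-order weighting, where W is the omnidirectional channel (the "centre" in the simplification above), X is front-back, Y is left-right and Z is up-down. Other conventions such as AmbiX order and scale the channels differently, and this is not the Ambisonic Toolkit's own processing.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumes the traditional weighting with W scaled by 1/sqrt(2).
    Azimuth is measured anticlockwise from straight ahead, so +90 is hard left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional ("centre")
    x = sample * math.cos(az) * math.cos(el)   # front-back
    y = sample * math.sin(az) * math.cos(el)   # left-right
    z = sample * math.sin(el)                  # up-down
    return w, x, y, z


if __name__ == "__main__":
    # A source straight ahead only excites W and X.
    print(encode_b_format(1.0, 0.0))
    # A source hard left moves its energy from X to Y.
    print(encode_b_format(1.0, 90.0))
```

A decoder then combines these four channels for a given playback layout, which is why the same B-format material can feed headphones or a speaker array.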


However, binaural audio has drawbacks: moving the head changes the perception, listeners have varying HRTFs that alter the sound, headphone EQs differ and so alter the mix, and visual cues can sometimes be missing (Rosas, 2018). A further implication of this binaural set-up is the lack of clarity of sounds placed behind the listener, and “with a phantom centre, the cross-talk terms give rise to a less stable centre image as well as a loss of clarity” (Kyriakakis, 1998).


I created the virtual environment in Unity using free assets. The visuals are intended to give users a break from covid restrictions and a sense of freedom, and they let users engage with the audio via the virtual listening space. For this project I used Unity itself as a mixing tool; the audio hierarchy is shown in fig. 5. I was able to mix individual and grouped audio sources using Unity’s mixer (fig. 6), creating a balanced mix.

I am happy with how the visuals came out, but they could be improved with more obvious indicators for the 4.1 audio sources (currently the rocks on the ground) and an environment that adapts to the mood of the poem. This would result in a less static and more engaging experience for users.


Figure 5 - My Unity Audio Hierarchy.




Figure 6 - My Unity Project Mixer.


The focal point is a poem called ‘Lighthouse’, written by a friend. Due to covid, I was not able to record in the studio, but I instructed them on how to record at the best quality on their phone via ‘Dolby On’. However, the recording had a high noise floor and a metallic resonance. To clean up the sound I applied iZotope’s RX de-noise, subtractive EQ and a conversion to mono (Splice, 2021). To further clarify the speech I boosted at 3,000 Hz to improve intelligibility, removed sibilance and used compression to add definition (Huff, 2021). For creative binaural processing, I used reverse reverb: I reversed a phrase and applied Valhalla Shimmer (fig. 7), then recorded the result onto another channel and reversed it again, which creates a smooth lead-in. Done multiple times and then panned binaurally, this created an immersive effect (heard at 01.26). I intended the mixing to reflect the words spoken. At 00.45 there are references to shifting tides, which I mimicked with Sound Toys’ Crystalliser (fig. 8).
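The reverse reverb steps described above (reverse, add reverb, reverse again) can be sketched in Python. The reverb here is only a stand-in, a single feedback comb filter with made-up parameters, and is far simpler than Valhalla Shimmer. It is used purely to show why the tail ends up in front of the sound.

```python
def simple_reverb(signal, delay=3, feedback=0.6, tail=12):
    """Stand-in reverb: one feedback comb filter, padded so the tail can ring out.

    The delay, feedback and tail values are arbitrary illustrative choices.
    """
    out = list(signal) + [0.0] * tail
    for n in range(delay, len(out)):
        out[n] += feedback * out[n - delay]
    return out


def reverse_reverb(signal, **reverb_settings):
    """Reverse the signal, reverberate it, then reverse the result again.

    After the second reversal the decaying tail plays backwards, so it
    swells up towards the original sound instead of trailing after it.
    """
    wet_reversed = simple_reverb(signal[::-1], **reverb_settings)
    return wet_reversed[::-1]


if __name__ == "__main__":
    click = [1.0, 0.0, 0.0, 0.0]
    # The echoes grow in level until they reach the original click.
    print(reverse_reverb(click))
```

In the sketch the output is longer than the input because the tail now sits before the phrase, which is the smooth lead-in heard in the mix.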


Figure 7 - Valhalla Shimmer Plugin Used for Reverse Reverb Effect.


Figure 8 - Sound Toys’ Crystalliser.


I used FMOD to implement the spoken word (fig. 9). Whilst I could have done this via Unity, I wanted to push myself and have a go at the basics of FMOD. This method let me implement the spoken word as I intended. If I were to do this again, I would use more powerful binaural effects that sit higher in the mix for the spoken word and venture further into the capabilities of FMOD.


Figure 9 - My FMOD sessions for this project.


For the synth composition, there are three layers. The first is a smooth drone on FXpansion’s ‘Strobe 2’ (fig. 10) that fulfils the theme of relaxation and loops the chord sequence shown in fig. 11. I split the sequence of four-note chords between four mono channels in Ableton, with a fifth channel for all frequencies below 400 Hz to emulate a sub speaker for the centre. To create a seamless loop, when rendering I made sure to capture the full tail and edit it into the start of the loop (fig. 12). Figures 13 and 14 show how the channels were then arranged as 3D audio sources in Unity and how their listening ranges overlap. I followed this process for the next two layers: a ‘bubbly’ textured synth using ‘Strobe 2’ for binaural interest, and a string arrangement using Spitfire Audio’s BBC Symphony Orchestra plugin. Once implemented, I experimented with the doppler level; the doppler effect is “the changes in frequency of a sound wave produced by a moving source with respect to an observer” (Bettex, 2010). I wanted it to be prominent enough for the placement to be noticed, but not unnaturally so. However, this is a virtual emulation of a 4.1 system and is not acoustically true to life, so the effect can be exaggerated. Looking back, I think I could have made it more noticeable. With the previously mentioned inspirations of furniture and ambient music, I feel the composition was successful at interpreting the genre in the context of this project. However, the full effect is not felt through a screen recording, which is a drawback.
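The fifth channel, which keeps only the content below 400 Hz, is essentially a low-pass filter. The Python sketch below uses a simple one-pole low-pass as an assumption; the filter actually used in the DAW is not stated and was probably steeper. The helper names and test tones are hypothetical.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """Gentle low-pass: each output moves a fraction of the way towards the input."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for x in signal:
        y += a * (x - y)
        out.append(y)
    return out


def sine(freq_hz, sample_rate, length):
    """Test tone used to check what the filter lets through."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(length)]


def peak(signal):
    return max(abs(s) for s in signal)


if __name__ == "__main__":
    sample_rate = 48000
    low = one_pole_lowpass(sine(100.0, sample_rate, 4800), 400.0, sample_rate)
    high = one_pole_lowpass(sine(4000.0, sample_rate, 4800), 400.0, sample_rate)
    # The 100 Hz tone passes almost untouched; the 4 kHz tone is heavily reduced.
    print(peak(low[2400:]), peak(high[2400:]))
```

Feeding the full chord through a filter like this gives a sub-style channel, while the four mono channels carry one chord note each.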


Figure 10 - FXpansion’s Strobe 2 Synth.



Figure 11 - The Chord Sequence Used for this Project.



Figure 12 - How I edited seamless loops.
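The loop edit described for the synth layers, capturing the full tail and editing it into the start of the loop, can be sketched as follows. This is a minimal Python illustration with made-up sample values, not the actual edit shown in figure 12.

```python
def wrap_tail_into_start(rendered, loop_length):
    """Mix everything rendered after the loop point back onto the loop's start.

    `rendered` is the loop body followed by its release or reverb tail.
    The returned list is exactly `loop_length` samples long and can repeat
    without the tail being cut off at the loop point.
    """
    body = list(rendered[:loop_length])
    tail = rendered[loop_length:]
    for i, sample in enumerate(tail):
        # The modulo also handles a tail that is longer than the loop itself.
        body[i % loop_length] += sample
    return body


if __name__ == "__main__":
    # Four-sample loop followed by a two-sample tail.
    rendered = [1.0, 0.0, 0.0, 0.0, 0.5, 0.25]
    print(wrap_tail_into_start(rendered, 4))  # [1.5, 0.25, 0.0, 0.0]
```

On the first pass the start of the loop already contains the tail of an imagined previous pass, which is what makes every repeat sound continuous.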



Figure 13 - How audio sources were placed in 4.1 format.



Figure 14 - How the audio source ranges overlap with each other.


To add another layer of immersion I wanted to make the ambience interchangeable. I selected three options based on my research and programmed the ambience to change when the user presses the keys 1, 2 or 3. Whilst I have been teaching myself programming for over a year, this was my first time creating my own audio programming script, which is shown and annotated in figure 15. I adapted the program from Audioplug (2015). The program works well and changes seamlessly between ambiences at any point. The audio sources were set to 2D and looped (fig. 16). I sourced the ambiences from Soundly; outside of covid, however, it would be better to record my own with my Zoom H5 and NTG4+. I added a canvas overlay in Unity with text instructing the user on how to change the ambience (fig. 17). Whilst this is suitable for someone who already knows the project, I think the project would have benefited from a separate menu scene with instructions that fully explain it before starting.
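The key-switching behaviour can be modelled independently of Unity. The sketch below is a Python model of the logic only, not the C# script shown in figure 15. The class name, the ambience names and the choice to start with nothing playing are all assumptions made for illustration.

```python
class AmbienceSwitcher:
    """Keeps track of which single ambience should be playing."""

    def __init__(self, ambience_names):
        # Map the keys "1", "2", "3", ... to the ambiences in order.
        self.by_key = {str(i + 1): name for i, name in enumerate(ambience_names)}
        # Assumption: nothing plays until the user presses a key.
        self.current = None

    def press(self, key):
        """Switch to the ambience bound to `key`; ignore any other key."""
        self.current = self.by_key.get(key, self.current)
        return self.current


if __name__ == "__main__":
    # Placeholder names; the three real ambiences are not listed in the text.
    switcher = AmbienceSwitcher(["ambience_a", "ambience_b", "ambience_c"])
    for key in ["2", "x", "1"]:
        print(key, "->", switcher.press(key))
```

In an engine, each change of `current` would stop the previously looping source and start the newly selected one, so that only one ambience is heard at a time.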




Figure 15 - My audio programming script for the ambiences.



Figure 16 - Showing I have set my ambience audio sources to 2D and Looped.



Figure 17 - My canvas with text to instruct the user.




Conclusion.

Overall, I'm happy with how the project came out. I think the core idea and execution are there. Throughout the project, I aimed to think outside the box and experiment with new ideas, pushing the boundaries of my own skill set and creativity. I successfully used FMOD and C# programming for the first time.

However, there is a lot that could be developed. The immersion would be improved if the environment adapted to the mood of the poem. When mixing the poem I could have incorporated live performance style automation to enhance the flow and character. It is unfortunate I was not able to upload a functioning WebGL build for the users to experience the project first-hand. I aim to do this with future projects.

Ultimately, this project is a beta version of its true potential and has given me inspiration for future projects to expand on the initial ideas executed here. I’m looking forward to where this initial idea will lead me!




Reference List.

3DIO (2019) What is HRTF?. Available at: https://3diosound.com/blogs/learn-about-sound/what-is-hrtf#:~:text=HRTF%20is%20a%20phenomenon%20that,like%20a%20rapidly%20expanding%20sphere (Accessed: 05/05/2021)


A Trip to the Moon. Available at: https://channel.louisiana.dk/video/laurie-anderson-hsin-chien-huang-a-trip-to-the-moon (Accessed: 05/05/2021)


Audioplug (2015) Unity - Play a sound from a key. Available at: https://www.youtube.com/watch?v=Yb3isH6j-iU&t=38s&ab_channel=Audioplug (Accessed: 05/05/2021)


BBC (2020) Binaural Sound. Available at: https://www.bbc.co.uk/rd/projects/binaural-broadcasting (Accessed: 05/05/2021)

Bettex, M. (2010) Explained: The Doppler Effect. Available at: https://news.mit.edu/2010/explained-doppler-0803 (Accessed: 05/05/2021)


Bryan C. Pijanowski, Luis J. Villanueva-Rivera, Sarah L. Dumyahn, Almo Farina, Bernie L. Krause, Brian M. Napoletano, Stuart H. Gage, Nadia Pieretti, Soundscape Ecology: The Science of Sound in the Landscape, BioScience, Volume 61, Issue 3, March 2011, Pages 203–216, https://doi.org/10.1525/bio.2011.61.3.6


Campo Santo (2016) Firewatch. Available at: http://www.firewatchgame.com/ (Accessed: 05/05/2021)


Dream (2020) Dream. Available at: https://dream.online/ (Accessed: 05/05/2021)


E. R. Hoeg, L. J. Gerry, L. Thomsen, N. C. Nilsson and S. Serafin, "Binaural sound reduces reaction time in a virtual reality search task," 2017 IEEE 3rd VR Workshop on Sonic Interactions for Virtual Environments (SIVE), 2017, pp. 1-4, doi: 10.1109/SIVE.2017.7901610.

Eurogamer (2012) Dear Esther Review. Available at: https://www.eurogamer.net/articles/2012-02-14-dear-esther-review (Accessed: 05/05/2021)


Huff, C. (2021) Eight Tips for Improving Clarity in Speech. Available at: https://www.behindthemixer.com/eight-tips-for-improving-clarity-in-speech/#:~:text=Boost%20in%20the%20mid%2Drange,control%2C%20use%20a%20wide%20bandwidth (Accessed: 05/05/2021)


Kearney, G. (2020) Binaural audio for virtual and augmented reality. Available at: https://www.york.ac.uk/research/impact/binaural-sound/ (Accessed: 05/05/2021)


Krause, B. (1987) Bioacoustics, habitat ambience in ecological balance. Whole Earth Review, 57, pp. 14–18.

Krueger, J. (2014) Affordances and the musically extended mind. Frontiers in Psychology, 4, p. 1003. Available at: https://www.frontiersin.org/article/10.3389/fpsyg.2013.01003


Kyriakakis, C. (1998) Fundamental and Technological Limitations of Immersive Audio Systems. Available at: https://www.researchgate.net/publication/2985343_Fundamental_and_Technological_Limitations_of_Immersive_Audio_Systems (Accessed: 21/02/2021)


MIF (2020) Dream. Available at: https://mif.co.uk/whats-on/dream/ (Accessed: 05/05/2021)


Melchior, C. (2020) Ambient Music, Beginnings and Implications. Available at: http://music.hyperreal.org/epsilon/info/melchior.html (Accessed: 05/05/2021)


Nyman, M. (1973). Cage and Satie. The Musical Times, 114(1570), 1227-1229. doi:10.2307/954719


Picking, Rich & Cunningham, Stuart. (2012). Sounds Relaxing-Looks Cool: Audio and Visual Selections for Computer Systems that Support Wellness. International Journal of Ambient Computing and Intelligence. 4. 40-53. 10.4018/jaci.2012010104. Available at: https://www.researchgate.net/publication/262357926_Sounds_Relaxing-Looks_Cool_Audio_and_Visual_Selections_for_Computer_Systems_that_Support_Wellness (Accessed: 05/05/2021)


Rosas, C. (2018) Binaural microphones characterisation and soundscapes recording. Available at: https://www.euronoise2018.eu/docs/papers/418_Euronoise2018.pdf (Accessed: 05/05/2021)


Schafer, R.M. (1977) The Tuning of the World. Rochester, Destiny Books.

Splice (2021) How to make your phone recordings sound good. Available at: https://splice.com/blog/make-phone-recordings-sound-good/ (Accessed: 05/05/2021)


The Ambisonic Toolkit (2020) Tools for Soundfield-kernel Composition. Available at: https://www.ambisonictoolkit.net/ (Accessed: 05/05/2021)


The Chinese Room (2012) Dear Esther. Available at: https://www.thechineseroom.co.uk/games/dear-esther (Accessed: 05/05/2021)


To the Moon (2018) Laurie Anderson & Hsin-Chien Huang

Victoria B. C, (1974) Sound Heritage, Volume III Number 4. Available at: https://www.hildegardwesterkamp.ca/writings/writingsby/?post_id=13&title=soundwalking (Accessed: 05/05/2021)


Waves (2017) Ambisonics Explained: A Guide for Sound Engineers. Available at: https://www.waves.com/ambisonics-explained-guide-for-sound-engineers#:~:text=Ambisonics%20is%20a%20method%20for,%2Ddimensional%20360%2Ddegree%20audio.&text=The%20most%20popular%20Ambisonics%20format,a%20complete%20sphere%20of%20sound. (Accessed: 05/05/2021)


