It was clear to me from the very beginning that an audio middleware such as Wwise would be very useful for this project. Although it was not known at the start which technical platform the project would be built on, I anticipated that a game engine such as Unity or Unreal would probably be the choice. That turned out to be the case when we heard that the Nantes group was building their EEG demos in Unity. This was good news, as I already had experience using Wwise with Unity, although I was also mentally prepared to start learning a new environment such as Unreal.
There were two good reasons to use Wwise. Firstly, the 3D audio stems created in Reaper were in 2nd order Ambisonics, which Unity does not support. They could have been downmixed to 1st order Ambisonics, but that would have meant less accurate spatialisation. I don't know about FMOD, but Wwise supports Ambisonics up to 3rd order. I had also recently been using Wwise in our experimental VR game FOLD, so it was a natural choice to continue with it.
Secondly, an audio middleware makes it relatively easy to create interactive music and sound effects compared to scripting everything from scratch in something like Pure Data, Max or SuperCollider.
In a nutshell, I used Wwise for:
- playing back Ambisonic and headlocked stereo stems exported from Reaper
- constructing and playing back the interactive music
- creating and playing back the introduction sequences
At the start I struggled to get the Ambisonic files to work properly, as the software's manufacturer Audiokinetic does not provide comprehensive instructions in their online knowledge base. Only after some googling did I find a post in Audiokinetic's user forum where the issue was tackled. Based on that I managed to get things working, and later I wrote my own instructions.
After editing and mixing the sounds for the videos in Reaper, I exported them as:
- a 9-channel 2nd order Ambisonic stem with AmbiX encoding
- a 2-channel stereo stem for headlocked playback
Because the videos were played back in Unity using its own video player but the audio came from Wwise, we needed a way to sync the two. With a tip from MediaLab student Jukka Eerikäinen, I made a script in Unity that reads the elapsed time of the video and, after 0.5 seconds, triggers the Wwise event that starts the playback of the Ambisonic stem and the headlocked stereo track together with the interactive music. The audio tracks were of course 0.5 seconds shorter at the beginning than the video clips, which meant that there had to be a one-second black header in the video. In our tests this worked perfectly, and I hope it will continue to do so as the project goes on.
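A minimal sketch of how such a sync script could look, assuming Unity's VideoPlayer component and the Wwise Unity integration's AkSoundEngine API; the event name and the field values are placeholders rather than the actual project settings:

```csharp
using UnityEngine;
using UnityEngine.Video;

// Waits until the video has played for a set time, then fires the Wwise event
// that starts the Ambisonic stem, the headlocked stereo track and the
// interactive music. A sketch only; the event name below is hypothetical.
public class VideoAudioSync : MonoBehaviour
{
    public VideoPlayer videoPlayer;                       // Unity's own video player
    public string wwiseEventName = "Play_Scene1_Audio";   // placeholder Wwise event name
    public double triggerTime = 0.5;                      // seconds of video before audio starts

    private bool audioStarted = false;

    void Update()
    {
        // VideoPlayer.time is the elapsed playback time in seconds.
        if (!audioStarted && videoPlayer.isPlaying && videoPlayer.time >= triggerTime)
        {
            // PostEvent triggers the Wwise event on this GameObject.
            AkSoundEngine.PostEvent(wwiseEventName, gameObject);
            audioStarted = true;
        }
    }
}
```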
During the very last weeks the video playback and audio sync became a bit more complex, as the director decided to divide both films in two so that there would be space in between for instructions and animation to be added later. Christophe sliced the videos in two, and I needed to export two sets of stems from Reaper and make sure that the 0.5-second header was always there and that the interactive music trigger points were in the right place.
The students at the University of Nantes, under the supervision of Professor Vigier, had been working on the EEG algorithms for the whole spring. They had also created a preliminary Unity project for running the whole of Émotive VR with its multiple segments. Together with the director Marie-Laure and Christophe, I made a couple of visits to Nantes during the internship to coordinate how the whole project should be constructed in Unity.
During the last weeks my role in Nantes was to install and integrate Wwise into the Unity project and to advise one of the students, the Korean intern Hye jeong Lee, on using the Wwise sound events and soundbanks. The process took two days, and there were minor problems along the way, some of which I had to try to solve remotely from Helsinki after I had returned home.
The biggest problems were some individual sound elements not playing at all and music looping incorrectly. Apparently this was related to Wwise's memory consumption: the default "Lower Pool Size" setting tends to be too low, and increasing it made most of the problems go away. There were also some communication misunderstandings, but nothing serious.
The original Wwise soundbank also became huge, as I decided to keep the 9-channel Ambisonic files uncompressed in order to save CPU. I don't know whether that was necessary, but I wanted to be on the safe side. A single big soundbank, however, turned out to be difficult to manage, especially when I needed to update individual sound events. I therefore split the Wwise project into several smaller banks that were easier to update via Dropbox or WeTransfer. I made sure to keep certain scenes in the same bank when I wanted some sounds or music to continue over a scene change.
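For loading and unloading the smaller banks from Unity, something along these lines could be used. This is only a sketch assuming the Wwise Unity integration's AkBankManager helper (banks can just as well be loaded with AkBank components in the scene), and the bank name is hypothetical:

```csharp
using UnityEngine;

// Loads the soundbank a scene needs when the scene starts and unloads it when
// the scene is torn down. A sketch; "Scene1_Bank" is a hypothetical bank name.
public class SceneBankLoader : MonoBehaviour
{
    public string bankName = "Scene1_Bank";

    void Awake()
    {
        // Load the bank synchronously; no decoding or saving of decoded data.
        AkBankManager.LoadBank(bankName, false, false);
    }

    void OnDestroy()
    {
        AkBankManager.UnloadBank(bankName);
    }
}
```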
In the end everything worked fine, at least according to the emails between me and Hye jeong. Of course, I haven't seen or heard the project since I left France, so I have to take her word for it.
There is still at least one risk factor: in Wwise I created RTPC parameters for some of the audio bus levels, such as music and sound effects. The idea was that I could use them myself to quickly adjust the audio mix while testing the project in its demo state. However, when I left France at the end of April the Unity project was still far from demoable, with many elements missing, and there was no chance to test it properly. The risk with the global volume controls is that someone may adjust them accidentally, or forget to reset them after testing something. Somebody might also want to change some of the volumes deliberately, which is fine by me, but I would of course like to have a say in that kind of decision (especially if my name will appear in the end credits).
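As a rough illustration, such RTPC-based bus levels could be driven from Unity roughly like this, assuming the integration's AkSoundEngine.SetRTPCValue call; the RTPC names here are placeholders for the actual parameters I created:

```csharp
using UnityEngine;

// Simple mix helper that sets global RTPCs controlling the bus levels, exposed
// in the inspector for quick tweaking during demos. A sketch; the RTPC names
// "Music_Volume" and "SFX_Volume" are hypothetical.
public class DemoMixControls : MonoBehaviour
{
    [Range(0f, 100f)] public float musicVolume = 80f;
    [Range(0f, 100f)] public float sfxVolume = 80f;

    void Update()
    {
        // Setting an RTPC without a GameObject applies it globally.
        AkSoundEngine.SetRTPCValue("Music_Volume", musicVolume);
        AkSoundEngine.SetRTPCValue("SFX_Volume", sfxVolume);
    }
}
```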
Using version control such as Git would have been convenient and would have made distributed teamwork much easier. However, it was a new concept to most of the project members, and was perhaps seen as extra work, so it was never adopted.