Audio Augmented Reality: Concepts, Technologies and Narratives

Audio augmented reality (AAR) is a fascinating medium and concept where virtual sounds are merged with reality. My book, Audio Augmented Reality: Concepts, Technologies and Narratives, published in May 2025, is an introduction to this field and concept that has existed for a long time but is yet to be recognised and developed. AAR has huge possibilities in storytelling and artistic expression, while it is already being used to help users with practical tasks like navigation and situational awareness. Technically, there are many approaches to realising AAR, from hidden loudspeakers to complex head-tracking headphone systems.

Although AAR has been researched at least since the 1990s, and used in some forms even before that, my book appears to be the first publication attempting to cover the subject comprehensively. While exploring it from various angles, the focus is on the narrative use of AAR.

Unfortunately, I can't share the book here for you to read, but you can buy it at
Routledge
Bookshop.org
Kobo
eBooks.com
or any other major online bookstore, or hopefully find it in your local university library (if not, file a request!).

Content

The book has ten chapters:

1. Introduction

This brief introductory chapter starts with a vision of the year 2045 where personal and ubiquitous AAR devices are used in various situations to improve the quality of life. The story works as a backdrop for the introduction chapter as well as the whole book, envisioning some possibilities that this new technology enables. The chapter continues by introducing the key paradigms of the audio augmented reality framework: merging virtual sounds with the real environment, augmenting hearing, manipulating personal soundscapes, and connecting our lives to digital ecosystems. The chapter introduces the book's designed bias towards the medium's narrative use and possibilities while presenting the underlying theme of gathering all the various phenomena under the same umbrella concept of audio augmented reality.

2. Nature of AAR

The second chapter presents AAR as a multi-faceted medium, technology, and framework encompassing a wide range of possibilities for enhancing and reshaping our perception of reality. It frames AAR as a subset of augmented reality (AR) while embracing the concept of mixed reality (MR) as the perfect blend of virtual sounds and reality. The chapter suggests several application domains and introduces some of the key concepts related to the topic. These include virtual sounds and plausibility, location dependency, sonification and sensory augmentation, accessibility, six degrees of freedom, and acousmatic sounds. Binaural and soundfield-based auditory display options for AAR are discussed.

3. Reality, presence and interactivity

Reality, sense of presence, and interactivity are fundamental concepts in audio augmented reality. This chapter examines these concepts through both theoretical perspectives and practical examples. Given the book's focus on AAR as a narrative medium, interactive storytelling is explored in detail. Additionally, the importance of embodied movement in creating a sense of immersion and serving as an interaction modality is discussed. The chapter also introduces acoustic ecology, along with the concepts of soundscapes and audio walks, as essential for understanding broader applications of AAR, such as environmental soundscape design. Finally, the role of environmental sounds as an integral part of AAR experiences is highlighted.

4. From echoes to audio walks

After setting the backdrop, I continue tracing the history of auditory augmentation, from the echoes heard on the walls of prehistoric caves to talking dolls, and from animatronic animals to audio walks juxtaposing personal memories and places. Key technical milestones, such as the advent of binaural audio, are highlighted as foundational to modern AAR. Examples of loudspeaker-based augmentation in attractions, public spaces, museums, and cars are discussed, alongside headphone-based experiences, namely linear audio guides and walks. The chapter also examines two locative audio walks utilising radio transmitters and receivers as an early alternative to the satellite-based geolocated applications of the 21st century.

5. Towards mixed realities

This chapter explores a range of AAR applications and experiences from the 1990s to the present day. These serve as milestones and examples of the medium's evolution towards auditory mixed reality, where computer-mediated virtual sounds seamlessly integrate with the real world. The chapter maintains a throughline focused on the relationship with reality while presenting examples of diverse approaches. These include exhibitions with mainly object-based augmentations, site-specific experiences, the reskinning of existing environments, and narratives set in fabricated surroundings. Other examples feature navigational and situational awareness systems, multi- and single-authored geolocated audio platforms and experiences, and AAR games.

6. Spatial hearing and virtual audio

This compact chapter introduces the basic concepts of spatial hearing and virtual audio rendering. The fundamentals of spatial hearing are discussed from the psychoacoustic perspective. The chapter revisits channel-based, object-based, and scene-based paradigms in AAR production and runtime. The spatialisation process of object-based audio is explored, covering positional coordinates, room impulse responses for auralisation, and binaural decoding. Methods for obtaining room impulse responses—simulation, measurement, and estimation—are discussed alongside their strengths and limitations. The chapter concludes with an analysis of externalisation quality, highlighting the importance of spatial cues, acoustic congruence, and head-tracking for achieving realistic and immersive auditory experiences.

7. Technical components

This chapter examines the technical components of a typical wearable, binaural audio augmented reality system. It provides a detailed review of various sensors and tracking methods, including optical tracking, radio-based technologies, inertial measurement units, and emerging technologies. Additional interactional inputs, such as biodata integration via smartwatches, are also explored. The roles of the computer, software, audio interface, and authoring tools are briefly introduced. A detailed examination of headphone types is included, as they serve as the primary auditory display for binaural applications. The chapter discusses the critical impact of motion-to-sound latency on interaction and the plausibility of virtual audio rendering, concluding with an overview of new immersive audio standards.

8. Narrative design considerations

This chapter delves into narrative design considerations in AAR, acknowledging constraints of practical factors such as skills, interests, budgets, and stakeholder priorities. The discussion spans multiple aspects, including the medium's narrative potential, the choice of technical platform, the experience's relationship with its environment, the focus of the augmentations, and the listener's role. It also addresses technical priorities and practical issues, such as the use of secondary interfaces and ensuring accessibility for individuals with reduced vision or hearing. Finally, the chapter examines how the level of realism is not merely an aesthetic choice but also a factor that can significantly impact the comprehensibility of the narrative.

9. Narrative techniques and concepts

AAR is a unique medium with its own, still evolving narrative language. This chapter introduces several narrative techniques and concepts identified as characteristic of the medium. These techniques draw largely upon the use of spatial virtual audio, the interplay between real and virtual elements, and interactivity based on the user's location and movements. The discussion is supported by references to various AAR experiences presented earlier in this book, as well as other projects, research literature, and the author’s professional experience. This chapter aims to inspire content creators and provide a framework for analysing existing narrative experiences.

The narrative techniques and concepts explored in this chapter are:

Locative audio: Acousmatic sounds triggered and manipulated based on user's location
Attachment: Virtual sound appears as emanating from a real-world object
Object congruence: Fully plausible contextual alignment of an object and its sound
Partial congruence: Contextual alignment of an object and its sound with elements that are slightly tangential
Incongruence: A contextual mismatch between an object and its sound relative to expectations
Addition: Introduction of supplementary sounds that manipulate the perception of a sound-producing object
Alternative match: Use of audio that provides a different yet still congruent perspective on an object
Extension: Space beyond the observable field created and suggested by sound
Revelation: Source of an obstructed sound gets revealed
Acousmatic sounds: Sounds without physical counterpart
Detachment: Sound disengaged from a physical object; becomes acousmatic
Attractor sounds: Objects beckoning the user with distinctive sounds
Spatial scale and offset: Spatial coordination system and scale manipulated in various ways
Near field: Utilising sounds very close to head
Removal and replacement: Removing or replacing sounds in user's perception
Synchronisation: Creating momentary synchronisation points between virtual and real world for enhanced meanings
Acoustic translocation: Real space acoustically transformed to a virtual one

10. The future of AAR

This final chapter revisits the future vision of audio augmented reality introduced at the beginning of the book, reflecting on its features and ideas in light of current and anticipated technological developments. It explores devices and their technical attributes, alongside methods for achieving spatially aligned augmentations, particularly in telepresence scenarios. The chapter acknowledges the role of standards and codecs. It examines emerging trends in areas such as sound separation and manipulation, simultaneous interpretation, navigation, situational awareness, soundscape simulations, and games. Additionally, the chapter addresses the challenges and risks posed by these advancements, including privacy concerns, digital deception, and resource limitations within cultural institutions.

Chapter Notes and Updates

Since writing the book, both the field and my own knowledge have evolved. Here, I will be revisiting some chapters with updates and additions that didn't make it into the original book. This is a (slow) work in progress, so bear with me!

1 Introduction

The book starts with a future vision of everyday, ubiquitous personal AAR. In the story, set in 2045, people use earbuds capable of manipulating their auditory perception by adding and replacing sounds around them.

It's hard to predict the future, but so far AR glasses seem more promising than earbuds as personal AAR devices. This is discussed later in the book, and one obvious reason is that forward-looking sensors are easier to install in eyewear than in ear-mounted devices, which can also be obscured by caps, hair, and so on.

In the more distant future, however, I think that compact and almost unnoticeable ear-mounted devices will be the winning platform. That is, provided that AAR keeps developing in such a ubiquitous direction. Something to stay alive long enough to witness!

2.3 Application domains

The book lists possible application domains for AAR (see, e.g., Yang, Barde & Billinghurst, 2022):
1) situational awareness
2) navigation
3) presentation and display
4) narratives in education, training, entertainment, art, and social applications
5) non-narrative art
6) telepresence
7) healthcare

At least one important domain is missing from this list, though it's mentioned a couple of times in the book:
8) auditory enhancement

If situational awareness is about adding contextual information, 'auditory enhancement' would be more about improving and filtering the sounds already present. This would mean improving what the user hears: auditory equivalents of sunglasses, spectacles, binoculars, or microscopes. Examples include isolating and enhancing speech in noisy environments (something modern hearing aids already do) as well as real-time translation from one language to another. And potentially much more, including applications we can barely imagine yet.

3.2 Sense of presence

In this subchapter, I discuss sense of presence and immersion, but without clearly distinguishing these concepts from each other. Only after returning the manuscript did I come across a paper by Hyunkook Lee (2025) that proposes a unified model distinguishing immersion, presence, and involvement. His framework nicely defines physical presence, social/self presence, and involvement as the core dimensions of an immersive experience, arising from different types of engagement. I recommend reading it!

10 The Future of AAR

For more about the potential dangers augmented reality may pose to society and democracy, see Morten Bay. His 2023 article lays out some genuinely unsettling scenarios, and his new book Mediating Plureality (2025) is currently on my bedside table.

. . . . . . .

More updates and comments coming up...