Spatializing Screen Readers: Extending VoiceOver via Head-Tracked Binaural Synthesis for User Interface Accessibility
Traditional screen-based graphical user interfaces (GUIs) pose significant accessibility challenges for visually impaired users. This paper demonstrates how existing GUI elements can be translated into an interactive auditory domain using high-order Ambisonics and inertial sensor-based head tracking, culminating in real-time binaural rendering over headphones. The proposed system spatializes the auditory output of VoiceOver, the built-in macOS screen reader, aiming to foster clearer mental mapping and enhanced navigability.
A between-groups experiment
was conducted to compare standard VoiceOver with the proposed
spatialized version. Sighted participants (n = 32),
with no visual access to the test interface, completed a list-based
exploration and then attempted to reconstruct the UI solely from
auditory cues. Experimental results indicate that the head-tracked
group achieved a slightly higher accuracy in reconstructing the interface, while user experience assessments showed no significant
differences in self-reported workload or usability. These findings suggest potential benefits from integrating head-tracked binaural audio into mainstream screen-reader workflows, but future investigations involving blind and low-vision users are needed.
Although the experimental testbed uses a generic desktop app, our ultimate goal is to tackle the complex visual layouts of music-production software, where a head-tracked audio approach could benefit visually impaired producers and musicians navigating plug-in controls.
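The paper's rendering pipeline is not detailed in this abstract; as a minimal illustration of the head-tracking step, the sketch below rotates a first-order Ambisonics (B-format, W/X/Y/Z ordering) frame about the vertical axis to compensate for listener head yaw, so that spatialized UI sounds stay anchored to the interface rather than turning with the head. The function name and channel convention are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rotate_foa_yaw(wxyz, head_yaw_rad):
    """Counter-rotate a first-order Ambisonics frame (W, X, Y, Z)
    by the listener's head yaw, keeping sources world-stable.

    NOTE: illustrative sketch only; a full system would use
    higher-order rotation matrices and a binaural decoder.
    """
    w, x, y, z = wxyz
    theta = -head_yaw_rad  # rotate the sound field opposite to the head
    c, s = np.cos(theta), np.sin(theta)
    # W (omnidirectional) and Z (vertical) are unaffected by yaw.
    return np.array([w, c * x - s * y, s * x + c * y, z])
```

For example, a source straight ahead (X = 1, Y = 0) heard while the listener turns 90° to the left ends up at the listener's right (Y = -1), matching the expected world-anchored behavior.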