A review of AudiAnnotate, a tool for annotating audio files, directed by Tanya Clement, Ben Brumfield, and Sara Brumfield
Craig Breaden, Duke University
The goal of the AudiAnnotate project is to accelerate access to, promote scholarship and teaching with, and extend understanding of significant digital audio collections in the humanities. Access to audio is often restricted by libraries, archives, and museums (LAMs) for copyright, privacy, and preservation reasons, but a lack of descriptive metadata and annotations stands in the way of all levels of access and use. Even with simple metadata, audio materials may not include enough information to pique user engagement. Annotation is what John Unsworth (2000) has called “a scholarly primitive”—an essential humanities method for adding context and meaning to cultural objects of study for use in research, teaching, and publication. Annotations have been the basis for engaging audiences with cultural objects, from monks adding commentary to medieval manuscripts to current online scholarly pages, editions, and exhibits. Presently, even when audio materials are made accessible, these digital objects often remain inaccessible for annotation and therefore inaccessible for learning, public comment, scholarship, and general use.
Annotating is only one of a list of scholarly primitives that also includes discovering, comparing, referring, sampling, illustrating, and representing (Unsworth). IIIF (International Image Interoperability Framework) is one standardized solution that LAMs have adopted for giving users the ability to perform these primitives with images held in cultural heritage institutions. Comprising 56 global members, including major research universities, national libraries, world-renowned museums, archives, software companies, and other organizations, the IIIF Consortium has worked since fall 2011 to create, test, refine, implement, and promote the IIIF specifications for interoperable functionality and collaboration across repositories. IIIF uses linked data and W3C web standards to facilitate sharing digital image data, migrating across technology systems, and using third-party software to enhance access to images, allowing for viewing, zooming, comparing, manipulating, and working with annotated images on the Web.
If an audio object is not available online, annotations can still provide context, much as liner notes can for a missing album. Further possibilities for access include the ability for scholars, students, or the public involved in larger projects across institutions to annotate systematically and collaboratively, or the ability for LAMs to showcase user annotations or incorporate them into their digital asset management (DAM) systems. Incorporating annotations in this way would add much-needed context to the standardized metadata schemas and systems on which LAMs depend for audio preservation and management.
In response to the need for a workflow that supports IIIF manifest creation, collaborative editing, flexible modes of presentation, and permissions control, the AudiAnnotate team, including Tanya Clement, Principal Investigator of HiPSTAS, and Brumfield Labs, has developed a workflow to connect open source tools for annotation (such as Audacity), public code and document repositories (GitHub), and the AudiAnnotate web application for creating and sharing IIIF manifests and annotations. Users who are usually limited by proprietary software and by LAM systems that restrict access to audio can rely on the AudiAnnotate workflow as a complete sequence of tools and transformations for accessing, identifying, annotating, and sharing annotations. LAMs will benefit from the AudiAnnotate workflow as it facilitates metadata generation, is built on W3C web standards in IIIF for sharing online scholarship, and generates static web pages that are lightweight and easy to preserve and harvest. Examples already include annotations of Zora Neale Hurston's WPA field recordings and archival material from Anne Sexton’s papers at the Harry Ransom Center. Others have used AudiAnnotate in workshops and classes. The AudiAnnotate workflow represents a new kind of AV ecosystem in which exchange is opened between institutional repositories, annotation software, online repositories and publication platforms, and all kinds of users.
The AudiAnnotate project, first supported by an ACLS (American Council of Learned Societies) grant in 2019 and now funded by a generous grant from the Andrew W. Mellon Foundation to extend AudiAnnotate to video, builds on these IIIF accomplishments to address the gaps in engaging with audio, developing a solution that brings together free annotation tools and the web as a standardized platform for collaboration and presentation.
In reading about Tanya Clement’s work on using audio for research and teaching, I was struck by her use of the term “slow research.” It reminded me of the term “slow food,” and of “slow” as shorthand for deliberate mindfulness, the beauty and art in simply having patience. Clement writes about this approach using AudiAnnotate—a tool she, Ben Brumfield, Sara Brumfield, and a team at the University of Texas have developed for describing audio and sharing that description—when analyzing a recording of poet Anne Sexton reading and discussing her work with students in the mid-1960s. For Clement, the work of research with an audio recording or collection is akin to the slow work of poetry study as Sexton presents it to the class, and AudiAnnotate represents a solution for audio archivists and researchers looking to describe sound in myriad, deeper, and creative ways, with multiple annotations, transcriptions, and interpretations as related layers making up an object’s whole.
In its elegant intention, AudiAnnotate honors recordings by providing an interoperable framework for sharing annotation. Its workflow is reasonably simple: use Audacity’s label tracks to create timestamped descriptions within a recording; export the track as a text file and import it into AudiAnnotate’s GitHub interface, where it is associated with an IIIF-compliant manifest pointing to the audio file; publish the description to a webpage associated with the user’s GitHub account. GitHub doesn’t host the audio, but a player embedded in the page points to the original source, and clicking on a timestamped description takes the reader/listener to that point in the audio. An audio file does not even have to be available for a user to post descriptions using AudiAnnotate—researchers or students creating scholarship on restricted archival audio can still post their descriptions. One of AudiAnnotate’s smartest features, though, is its ability to import multiple layers of tagged description. It’s easy to imagine summary, transcription, and interpretive layers describing a section of audio.
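For readers unfamiliar with Audacity’s label-track export, it is a plain tab-separated text file: a start time in seconds, an end time in seconds, then the label text. As a minimal sketch of what the first step of the workflow produces, the following Python parses such a file into simple annotation records (the field names and sample labels here are illustrative, not AudiAnnotate’s actual schema):

```python
def parse_label_track(text):
    """Parse an Audacity label-track export: one label per line,
    tab-separated as start-seconds, end-seconds, label text."""
    annotations = []
    for line in text.strip().splitlines():
        start, end, label = line.split("\t", 2)
        annotations.append({
            "start": float(start),   # seconds, e.g. 12.5
            "end": float(end),
            "body": label,           # the descriptive text
        })
    return annotations

# A two-label sample in Audacity's export format (invented labels).
sample = (
    "12.500000\t15.250000\tSexton introduces the poem\n"
    "30.000000\t42.000000\tReading begins"
)
for annotation in parse_label_track(sample):
    print(annotation)
```

Each record corresponds to one timestamped description on the published page; separate Audacity label tracks could carry the summary, transcription, and interpretive layers mentioned above.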
For those of us who labor to describe audio or who use audio in our research, AudiAnnotate could help mitigate many challenges of surfacing sound collections, where rich troves of art, thought, and history often remain invisible because the work is, by its very nature, slow. In my work, I could envision using or recommending AudiAnnotate for class- or group-sourced projects to establish enhanced audio metadata in recordings where we might only have a title, but I do see some potential scalability issues with the application as it stands. Its reliance on Audacity and its second-based timestamps makes it more challenging for those who would rather use a spreadsheet workflow, with more conventional hh:mm:ss timestamps, to produce the text output; entering separate layers through AudiAnnotate’s GitHub form is cumbersome (again, a spreadsheet workflow with a layer field would be helpful); and direct URLs to public audio resources don’t always work, so troubleshooting guidance for this kind of situation would be welcome. While there’s some room for improvement, as there would be with any young app, AudiAnnotate’s potential for describing audiovisual resources is clear and has been rewarded; Clement and crew have received further funding and are setting their sights on video next.
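The spreadsheet friction noted above is straightforward to bridge in the meantime. As a sketch under stated assumptions (the hypothetical spreadsheet columns hold hh:mm:ss start, hh:mm:ss end, and label text, and the target is Audacity’s tab-separated, second-based label format), a small conversion like this would let spreadsheet users feed their rows into the existing workflow:

```python
def hms_to_seconds(timestamp):
    """Convert an hh:mm:ss (or mm:ss) timestamp to Audacity-style seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def row_to_label(start, end, text):
    """Render one spreadsheet row as an Audacity label-track line:
    tab-separated start seconds, end seconds, label text."""
    return f"{hms_to_seconds(start):.6f}\t{hms_to_seconds(end):.6f}\t{text}"

# Example row (invented content): a passage from 1:30 to 2:05.
print(row_to_label("00:01:30", "00:02:05", "Discussion of the poem"))
```

The output lines can be saved as a .txt file and imported into Audacity, or exported directly for AudiAnnotate, sidestepping hand-conversion of timestamps.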