A review of Ticha, a digital text explorer for Colonial Zapotec, co-directed by Brook Danielle Lillehaugen and George Aaron Broadwell
Élika Ortega, University of Colorado Boulder
Brook Danielle Lillehaugen and Mike Zarafonetis
Zapotec has one of the longest records of alphabetic written documents of any indigenous language of the Americas, with the earliest text dating to 1565 (Oudijk 2008: 230). Reading and interpreting these documents can be difficult because of the challenges of early Zapotec orthography, vocabulary, grammar, and printing conventions, yet the documents contain rich linguistic, historical, and anthropological information. Ticha allows users to access and explore many interlinked layers of these texts, including images, transcriptions, translations, linguistic analysis, and commentary. The Ticha project employs an iterative development process that includes in-person workshops with Zapotec community members. Feedback from these interactions informs design decisions for the project. Collaboration with Zapotec communities is an integral part of Ticha, which also makes explicit connections to modern Zapotec languages.
Ticha seeks to make this corpus of Colonial Zapotec texts accessible to scholars in diverse fields (including linguistics, anthropology, and history), Zapotec community members, and the general public.
The team consists of an interdisciplinary group of academics, students, and Zapotec community members:
Dr. Brook Danielle Lillehaugen, linguist; co-director
Dr. George Aaron Broadwell, linguist; co-director
Dr. Michel R. Oudijk, ethnohistorian
Laurie Allen, librarian
Dr. Mike Zarafonetis, librarian
Dr. Xóchitl Flores-Marcial, ethnohistorian; Zapotec advisory board member
Moisés García Guzmán, Zapotec educator; Zapotec advisory board member
Dr. Felipe H. Lopez, Zapotec writer and educator; Zapotec advisory board member
May Helena Plumb, doctoral student in linguistics
Undergraduate research assistants, currently including: Felipe Acosta-Muñoz, Eloise Kadlecek, Emily Lin, Tomas Paris, and Conor Stuart Roe.
Ticha is a postcustodial digital archives project, in which a corpus of related manuscripts is created by digitizing manuscripts held in multiple archives and collections. At its core is a community-engaged approach in which the project team and the Zapotec advisory board shape the direction of the project in close collaboration with other members of the Zapotec community. The main web application, which provides access to digital surrogates of historical manuscripts in the corpus, is built in the Django framework. The project team employs a “progressive archiving” approach (Nathan 2013), in which manuscripts are made publicly available as soon as possible in whatever state of analysis they are in, including as images alone; this is a priority due to the sociolinguistic context around the languages. Ticha features a custom-built crowdsourcing transcription interface and a reverse-indexed Zapotec-to-English dictionary. It brings together data analyzed in FLEx (FieldWorks Language Explorer), a system for lexical and grammatical analysis, with manuscripts encoded under current TEI (Text Encoding Initiative) standards for paleographic and translational representations of texts to create digital critical editions of early Indigenous North American texts. The project employs a collections-as-data approach, making all transcribed and encoded texts available via a GitHub repository for reuse.
Ticha is freely available to the public and is committed to remaining so. The project has been supported by funding from the American Council of Learned Societies (2019-2021 Digital Extension Grant, Lillehaugen; 2015 Fellowship, Lillehaugen), the American Philosophical Society (2015 Franklin Research Grant, Lillehaugen), the Center for Peace and Global Citizenship at Haverford College, the Provost Office of Haverford College, the Hurford Center for the Arts and Humanities at Haverford College, the Haverford College Libraries, the National Endowment for the Humanities (2015 Fellowship, Lillehaugen; 2014 Summer Stipend, Lillehaugen), and Tri-Co Digital Humanities.
Citations in Scholarship
Ticha has been cited in scholarship on linguistics (Austin 2017, Rice & Thieberger 2018) and Mesoamerican history (Wood 2017, Farriss 2018). Ticha is listed as a resource in ADRELA: The Alliance for Digital Research on Early Latin America, in the Howard-Tilton Memorial Library’s guide to databases in Spanish Literary, Cultural, & Linguistic Studies, in Always Already Computational’s Collections as Data Facets, and on the web page of the Mexican government's Centro de Cultura Digital. Ticha was reviewed by Jessica Sánchez Flores in Alpert-Abrams and McDonough’s Fall 2018 Critical Digital Archives course. In addition, members of the Ticha team have written about the FLEx database behind Ticha (Broadwell & Lillehaugen 2013), the incorporation of modern Zapotec Talking Dictionaries (Harrison et al. 2019), and the project's collaborative methods and work with the Zapotec community (Broadwell et al. in review).
Austin, Peter. 2017. Language documentation and legacy text materials. Asian and African Languages and Linguistics 11: 23-44. http://hdl.handle.net/10108/89205
Broadwell, George Aaron, Moisés García Guzmán, Brook Danielle Lillehaugen, Felipe H. Lopez, May Helena Plumb, & Mike Zarafonetis. Ticha: Collaboration with indigenous communities to build digital resources on Zapotec language and history. Digital Humanities Quarterly. [in review]
Broadwell, George Aaron & Brook Danielle Lillehaugen. 2013. Considerations in the creation of an electronic database for Colonial Valley Zapotec. International Journal of the Linguistic Association of the Southwest 32(2): 77-110. DOI: 10.17613/M6WW1J
Farriss, Nancy. 2018. Tongues of Fire: Language and Evangelization in Colonial Mexico. Oxford: Oxford University Press.
Harrison, K. David, Brook Danielle Lillehaugen, Jeremy Fahringer, & Felipe H. Lopez. 2019. Zapotec language activism and Talking Dictionaries. Smart lexicography: Proceedings of the eLex 2019 conference.
Nathan, David. 2013. Progressive archiving: theoretical and practical implications for documentary linguistics. 3rd International Conference on Language Documentation and Conservation (ICLDC). http://hdl.handle.net/10125/26115
Oudijk, Michel R. 2008. El texto más antiguo en zapoteca. Tlalocan 15: 227-240.
Rice, Keren & Nicholas Thieberger. 2018. Tools and Technology for Language Documentation and Revitalization. In Kenneth L. Rehg & Lyle Campbell (eds.), The Oxford Handbook of Endangered Languages. Oxford: Oxford University Press. DOI: 10.1093/oxfordhb/9780190610029.013.13
Wood, Stephanie. 2017. Digital resources: Digital mesoamerica. In Oxford Research Encyclopedia of Latin American History. DOI: 10.1093/acrefore/9780199366439.013.299
Ticha, a digital text explorer for colonial Zapotec, is a bilingual social justice digital project dedicated to the archiving, recovery, and revitalization of the Zapotec language from Southern Mexico. The project includes an extraordinary number of resources, including over four hundred texts in or about colonial Zapotec from 1565 to 1832, high-resolution digital images of the corpus documents, an English-Zapotec dictionary, exportable PDF, JSON, CSV, and XML files, full and in-progress transcriptions, translations, maps, timelines, commentary, presentation slides, and linguistic analysis. Additionally, Ticha functions as a workspace, not only allowing users to navigate and explore its corpus but also providing interfaces for learning Zapotec and for crowdsourcing transcriptions. The core team, composed of Brook Danielle Lillehaugen, George Aaron Broadwell, Michel Oudijk, Laurie Allen, and Michael Zarafonetis, is aided by an advisory board of Zapotec experts, speakers, and activists including Xóchitl Flores-Marcial, Moisés García Guzmán, and Felipe H. López, as well as by graduate and undergraduate research assistants.
With this impressive wealth of resources and the variety of work and expertise required to put it all together, Ticha is exemplary of collaborative digital scholarship, tool development, corpus building, and community engagement. I want to dwell on the last of these, as the engagement of a community of Zapotec speakers is very clearly the backbone of the project and, through recurrent workshops, has given shape to its other components. At the root of Ticha, the gathering of the corpus itself responds to centuries of language suppression by the Mexican government. The Mexican educational system has cast indigenous languages as inferior to Spanish and even as offensive. Ticha, however, not only provides access to Zapotec documents but also includes modernized versions of colonial Zapotec writing and records of orthographic and phonetic variants. These two elements of the project offer Zapotec speakers a much-needed historical and contemporary valorization of their language. Reports by the advisory board members provide plenty of evidence that these objectives are being fulfilled.
From the point of view of the scholarly community Ticha serves, the project fills an enormous gap in Zapotec documentation, a gap that has left linguistic research on this indigenous language in the hands of a few dedicated experts. Moreover, gathering the digital corpus, whose originals are held at a variety of changing and temporary government institutions, into a dedicated space underscores the importance and urgency of such an endeavor. The Ticha team has also been careful to reuse and repurpose existing digital humanities tools, creating new ones (like the transcription interface) only when the project and the community demanded it. Ultimately, the crucial involvement of undergraduate and graduate students in the project ensures that the meticulous and detailed work of the core team becomes the ethos and praxis of learning.