English, philosophy, and comparative literature are not commonly thought of when discussing large datasets. Nevertheless, the nexus of literature and data analysis is precisely where Anna Preus specializes.
Ways of Knowing
The World According to Sound
Season 2, Episode 1
Digital Humanities
[poetry reading begins]
Chris Hoff: British modernist literature features a few key figures: W.B. Yeats, T.S. Eliot, Ezra Pound.
[reading continues]
CH: The prevailing thought is that these were the most significant and celebrated poets of their era. They have remained esteemed throughout the years. However, there were numerous other writers during that period who were also well-known, yet their works never made it into the established canon.
[reading of Tagore begins]
Who are you, reader, perusing my verses a century later?
I cannot send you a single blossom from this bounty of the spring, not even a ray of gold from those clouds afar.
Open your doors and gaze outside.
CH: This is Rabindranath [ruh BIN druh noth] Tagore [tuh GORE].
[reading of Tagore continues]
From your blooming garden collect sweet memories of the lost flowers from a century before.
In the joy of your heart may you sense the living happiness that sang on one spring morning, its cheerful voice echoing through a hundred years.
CH: And this is Sarojini [ser OH juh knee] Naidu [NAI doo].
[reading of Naidu begins]
When from my cheek I raise my veil
The roses pale with envious shame
and from their wounded hearts rich with sorrow
send forth their fragrances like a lament
Or if by chance one fragrant lock
Is set free to the wind’s embrace
The honeyed hyacinths complain
and languish in a sweet distress
[reading of Naidu fades]
CH: Tagore was awarded the Nobel Prize for Literature in 1913, becoming the first Asian recipient. Naidu was a poet and social activist, recognized throughout India by the title bestowed by Gandhi, “The Nightingale of India.” Yet, their works remain largely hidden from Western audiences.
Anna Preus: I had delved deeply into a Ph.D. program centered on the early 20th century, and no one had ever introduced her name to me.
CH: This is Anna Preus, a professor of English and data science at the University of Washington.
AP: The prevailing narratives regarding British modernism are largely based on a history framed by middle- and upper-class British and American men. The same set of individuals is frequently mentioned: Ezra Pound, James Joyce, T.S. Eliot. We finally saw a woman included, Virginia Woolf, in the 1970s and 1980s, but she wasn’t consistently acknowledged. Yet, there are numerous other authors whose contributions have been neglected in the traditional understanding of modernism.
CH: This concept is familiar: gifted writers, thinkers, and artists marginalized based on their ethnicities and identities. The story suggests they weren’t the most celebrated or esteemed in their era, hence their absence from the canon.
AP: This has created a situation where they can sometimes be perceived as less impactful than they genuinely were, in my view, not only within literary cultures of Anglophone literature in South Asia or the Caribbean, but also in what can be termed British Modernism, with capital B and M.
CH: However, this narrative has proven to be misleading. Authors like Naidu and Tagore were, in fact, quite prominent in their time. This truth becomes glaringly evident when examining the data.
[reading of data begins]
CH: This data pertains to the publication of colonial poets in early 20th-century Britain. Anna meticulously analyzed all this information for her research.
[reading of data continues]
CH: The volume of publications by non-British authors was substantial, prompting Anna to conclude that colonial writers like Naidu and Tagore were as popular as contemporaries such as Eliot, Pound, and Yeats. This implies that they were not excluded from the canon because they were obscure; rather, excluding them was a deliberate decision to prioritize white male authors like Eliot and Pound. To establish that these writers were celebrated in their time, Anna set out to find how many times the most prominent Indian authors were published in Britain. However, obtaining this information was challenging.
Publishing records are notoriously incomplete. If they exist at all, they could be tucked away in some archive or library, potentially unlabeled or not easily accessible to the public.
AP: A few decades ago, some institutions began digitizing historical texts on a large scale. The English Catalogue of Books is one of those texts that Google has digitized. When we began this project with a team, we had all these static PDFs and plain text files filled with a jumble of raw, unstructured text. We needed to break this apart so we could extract usable data on each book published each year, both for searching and for computational analysis.
CH: This was no minor undertaking.
AP: There were years spent merely transforming a PDF into a usable and accurate list of books published at the time. It’s considered digital humanities partly because of this process, of converting a historical text — this publishing catalog — into a spreadsheet that allows people to observe, for instance, which publisher was the most popular in 1913. Moving from Point A to Point B required numerous steps to convert historical text, but also involved broader collaborative efforts.
CH: Once Anna organized all that publication data into a spreadsheet, she could begin exploring deeper inquiries.
AP: For me, I am particularly keen on how the British publishing sector served as a pivotal institution in British imperialism. I was eager to analyze this data to understand what works were being published, especially in connection to British imperial endeavors. That’s why I am focused on the period from 1902 to 1922. This timeframe represents the zenith of British imperialism in terms of territorial expansion.
[data reading begins]
CH: Numerous books were published about South Asia, yet there weren’t as many literary works by South Asian authors. Still, the quantity was higher than Anna had anticipated. She sifted through the data and identified around 2,000 English-language texts by authors like Naidu and Tagore. Some of these authors were remarkably productive.
AP: Absolutely. This has definitely shifted my viewpoint. Initially, while looking at the data and browsing library databases for the number of works published by Tagore and Yeats, I discovered that Tagore had double the library records compared to Yeats based on the information I collected. Even I thought, that’s quite notable. I had heard so much about Yeats.
[instrumental music begins]
CH: She began recognizing something else, another part of the story of why these authors were left out of the canon: their work had been miscategorized. Many poetic works by these notable South Asian authors were not published under the classification “poetry.” Instead, publishers branded them as songs. Naidu’s poems, the excerpts we heard at the start of this episode, come from a volume titled “The Sceptred Flute: Songs of India.”
AP: Each of these writers’ initial publications were so intensely, generically tied to song. I was perplexed, asking, “What’s happening?” These authors are distinct, their poetry is notably different. It’s poetry. It continually gets referred to as song. This phenomenon was evident for other writers throughout the British colonies. I sensed that these works were being designated as songs to connect them with oral traditions, with types of oral literature and potentially to depict them not as necessarily literary texts, elevated poetry, or verse.
CH: This wasn’t confined to writers from Britain’s far-flung colonies. It also applied to those from the British Isles and Ireland.
AP: I discovered numerous poetry anthologies categorized as songs. When England reflected on its own poetic past, it identified the roots of poetry in these collectively generated ballads. They wrote extensively about this poetic evolution in the territories they colonized within the British Isles, including Wales, Ireland, and Scotland. Many collections of Welsh ballads or Scottish ballads illustrate this early communal poetic culture. So, to me, it appears that part of the reason authors like Naidu are not recognized as modernists is that from the start they were linked to the past and to this earlier style of poetry, rather than to the experimental, avant-garde, or contemporary poetry associated with writers like T.S. Eliot, who were marketed differently.
CH: For Anna, consigning these colonial writers to a more primitive mode of expression was intentional. To substantiate this, she compared the works of these South Asian writers against broader data sets. She could have done all this analysis without digital tools, but processing such a vast amount of information by hand could occupy a single scholar’s entire career. Digital tools can speed up much of this laborious work.
AP: Definitely. If you aimed to do this for 1912, you’d be sifting through a 400-page publishing catalog containing 70 entries per page, diligently noting every time a publisher was cited, and subsequently tallying them. You’re absolutely spot on. This approach enables us to dissect the text; essentially, we have a column in the spreadsheet for publishers and a count, which saves us from doing it manually.
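The publisher tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the column names (`year`, `title`, `publisher`), the function `count_publishers`, and the sample rows are all hypothetical placeholders, since the episode does not describe the real spreadsheet's layout.

```python
import csv
import io
from collections import Counter

# Hypothetical placeholder rows; the real catalog's columns and
# values are not given in the episode.
SAMPLE_CSV = """year,title,publisher
1912,Title One,Publisher A
1912,Title Two,Publisher B
1912,Title Three,Publisher A
1913,Title Four,Publisher B
"""


def count_publishers(csv_text, year):
    """Return a Counter mapping publisher -> number of entries for one year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # csv yields every field as a string, so compare against str(year).
    return Counter(
        row["publisher"] for row in reader if row["year"] == str(year)
    )


counts_1912 = count_publishers(SAMPLE_CSV, 1912)
top_publisher, top_count = counts_1912.most_common(1)[0]

# Scale of the manual task described above:
# a 400-page catalog with 70 entries per page.
manual_entries = 400 * 70
```

At 400 pages and 70 entries per page, a single year's catalog comes to 28,000 entries, which is the count a scholar would otherwise make by hand.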
CH: Instead, Anna and her team can produce these findings in mere days, which underscores a fundamental aspect of her research: collaboration. The necessary tools are quite accessible. Provided you have some coding proficiency and a laptop, everything else, including the texts and the processing software, is essentially free. Anna, however, has been supported over the years by staff and by graduate and undergraduate collaborators, each with different technical skills. This, she notes, is not typical for humanities scholarship, and it certainly isn’t common at many institutions.
Ultimately, though, this form of humanistic inquiry is not solely about data. It’s about intertwining the study of primary sources, like a poem, with data analysis.
AP: The narrative evolves from oscillating between the texts and the data. I don’t believe the story emerges from the data alone. I don’t think it materializes from individual works alone. For me, it’s a continual process of engaging with the data, examining all these publications, then identifying which of these I can access, actually reading them, analyzing their marketing strategies, the tropes invoked, and then returning to the data to see if those patterns resonate more broadly. Thus, it’s always this iterative process.
CH: Anna Preus operates within the contemporary sphere of the digital humanities. This field focuses on applying the analysis of extensive data sets within disciplines that are not typically associated with such methodologies, like English, philosophy, or comparative literature. Utilizing digital tools to investigate humanistic content opens avenues in the humanities that otherwise would be unattainable.
CH: Here are five texts that’ll assist you in exploring more about publishing culture and the digital humanities as a form of understanding.
“A World of Fiction: Digital Collections and the Future of Literary History,” by Katherine Bode
CH: This book leverages the world’s most extensive collection of mass-digitized newspapers to discern how Anglophone fiction in the 19th century circulated globally.
“Debates in the Digital Humanities,” edited by Matthew Gold and Lauren Klein
CH: A snapshot of the state of the digital humanities, so to speak — at least as of 2023. This compilation of essays emphasizes the significant questions, challenges, and practical insights of the discipline.
“New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy,” by Roopika Risam
CH: Risam’s volume investigates the influence of colonial violence on the establishment of digital archives and the potentiality of postcolonial digital repositories in countering this violence.
“Postcolonial Writers in the Global Literary Marketplace,” by Sarah Brouillette
CH: A work exploring the connection between postcolonial authors and the international market where their creations are disseminated.
“In Another Country: Colonialism, Culture, and the English Novel in India,” by Priya Joshi
CH: Joshi delves into how Indian creators of English novels adapted the formerly imperial form and personalized it for their own usage.
CREDITS
SH: Ways of Knowing is a creation of The World According to Sound. This season explores various interpretative and analytical approaches in the humanities. It was produced in partnership with the University of Washington and its College of Arts & Sciences. Appreciation to Casey Miner and Ben Trefny for their voice contributions. Music provided by Ketsa, Serge Quadrado, Graffiti Mechanism, Oootini, and our associates, Matmos.
Preus, an assistant professor of English and data science at the University of Washington, used digital tools to catalog how many non-British poets were published in early 20th-century Great Britain. The figure was substantial, yet these poets remain excluded from the literary canon, a gap that led Preus to suspect their omission was intentional. In this episode, she elaborates on her findings and the infrastructure needed for similar digital humanities projects.
This marks the inaugural episode of Season 2 of “Ways of Knowing,” a podcast showcasing how the humanities can mirror daily life. Through a collaboration between The World According to Sound and the University of Washington, each episode highlights a faculty member from the UW College of Arts & Sciences, discussing the projects that motivate them and providing resources for further exploration of the subject.