Professor of Digital Humanities
School of Advanced Study and Senate House Library
University of London
Humanities and the born digital: moving from a difficult past to a promising future?
Digitised (or born-analogue) materials have become an established part of the humanities research landscape. We routinely consult archives of digitised newspapers, early modern printed books and medieval manuscripts, taking full-text search for granted. The mass digitisation of the last two decades has in many ways transformed research processes, but the digital objects themselves remain relatively familiar, and retain a connection with an even more familiar analogue ‘original’. But humanities researchers are now beginning to work with born-digital materials – the archived web, social media, email archives – which present very different challenges. Does the idea of an ‘original’ have any meaning for born-digital materials? How can we work with vast born-digital archives when our preferred method of access – search – begins to break down? How can we deal with hybrid archives, without privileging either the analogue or the digital? How are traditional understandings of archives having to evolve to accommodate, for example, web crawls and email dumps? Thinking about these questions now, and working across disciplines and sectors to develop new approaches and methodologies for dealing with born-digital data, can help to shape the kinds of research that future historians, linguists, literary scholars and others will be able to undertake. It is essential that we do so if we are to explore the full potential of these new kinds of primary source.
|Mike Kestemont||Folgert Karsdorp|
|Department of Literature||Ethnology|
|University of Antwerp||Meertens Institute|
Synthesizing Humanities: Explaining Complex Models through Simple Data Synthesis
Models, and especially talk about models, have become ubiquitous in the Digital Humanities. In text analysis too, various models have been proposed to capture and describe textual characteristics, such as genre markers or the stylistic idiosyncrasies of authors. While models are traditionally divided into two categories (models _for_ and models _of_ phenomena), we turn to an exciting new type of model that has recently surfaced: generative models. When fed with enough data, such generative models are capable of autonomously generating new data samples, thereby synthesizing new, artificial data, so to speak.
We present a series of recent experiments in which computational models of text generation are employed to explore the processes of (creative) writing. We argue that such models complement the considerable body of scholarship in the Humanities, which focuses on hermeneutic approaches to text. We aim to show that this complementary focus on generation, which requires an algorithmic view of the process of creative writing, could serve to elucidate, extend, and verify theories put forth in, for instance, linguistics and literary theory about how texts are produced.
As one of these experiments, we describe a co-creative text generation system, AsiBot, which was applied in the context of the annual book campaign _Nederland Leest_ to produce a literary piece of science fiction written as part of a human-machine collaboration. We explore the ramifications of applying a model of Natural Language Generation within such a co-creative process, and examine where and to what extent the co-creative setting challenges both writer and machine.