Background
The survival of culture
Scholars studying the past of human cultures struggle with the fact that many artefacts
have been lost over time. This introduces “survivorship bias”, and the risk that we will
underestimate the cultural diversity of past societies. In this project, we show that
unseen species models from ecology can be used to estimate the loss rates of cultural artefacts.
For medieval literature (chivalric and heroic narratives in particular) we obtain survival
estimates which are compatible with prior research and which emphasize the severity of the
loss in this domain. Comparison of results between languages highlights interesting differences
in literary survival patterns across European medieval vernaculars.
Unseen species models
Unseen species models from ecology are statistical methods for estimating the number of species that have gone undetected in bioregistration campaigns. We argue that these models can also be used to estimate the survival rate of cultural artefacts from earlier societies, for instance, in the domain of historic literature.
Loss of medieval literature
Much medieval literature has been lost: fewer than one in ten manuscripts survives in the genre of chivalric and heroic narratives. Apart from the material loss of books, the immaterial loss of narratives is also considerable, although our analyses reveal marked differences across medieval vernaculars.
Evenness of collections
The evenness of cultural collections is an overlooked factor when it comes to their stability in the face of loss. Island literatures, such as Icelandic and Irish, display an elevated evenness, that is, copies are more evenly distributed across works.
Video
Author explanation video
In this author explanation video, multiple collaborators involved in this project provide a high-level explanation of our research project and results. Their explanations are complemented with footage of historical manuscripts from across medieval Europe.
Works and documents
The survival of medieval literature
In the Middle Ages, works of narrative fiction circulated in handwritten books, called manuscripts. Each manuscript was individually produced and each thus constitutes a unique material artefact surviving as a parchment or paper volume (or the fragmentary remnants of a volume). Multiple parallel witnesses of the same medieval work could therefore circulate, especially for more popular narratives. Scholars agree that much medieval literature has been lost, both through accidental losses (e.g. library fires) and deliberate destruction (e.g. the recycling of books as binding material for other books). However, the precise extent of these losses remains a matter of considerable debate and speculation. We draw a firm distinction between the (material, tangible) document and the (immaterial, non-tangible) work surviving in that document. Loss should be considered, we suggest, both at the level of the document and that of the work: in our model, a work is considered “lost” when none of the copies that once preserved it survives any longer.
The "Snow Whites" of Leuven, Belgium: some of the books which survived the library fires in WWI are now kept in glass boxes. © KU Leuven. Digitaal Labo.
Loss rates: documents and works
Abundance data in ecology records how often different species have been spotted during a bioregistration campaign. Chao1 is an unseen species model that estimates how many species were not observed, on the basis of the data for species which were observed only rarely. Once we have an idea of the actual, real number of species, we can estimate how many of these species were in fact detected. Chao1 has been integrated into the Hill number framework, an elegant model used to present multiple metrics for expressing species richness (ecodiversity) on a single spectrum, for various values of q. We apply this framework to cultural data to model the under-detection of medieval literature: we treat works as species and documents as sightings of those species. We show the empirical and estimated Hill number profiles (left) and a species accumulation curve (right), showing how many more works we are likely to find by discovering more documents. Of the original ca. 1,170 works that once existed, 799 would survive today; the 3,648 documents that still exist would be a sample from an original population of ca. 40,614 specimens.
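The mapping of works to species and documents to sightings can be sketched in a few lines of Python. This is a minimal illustration and not the project's own code: it does not use the copia package, the function names `chao1` and `hill_number` are chosen here for illustration, and the abundance counts are invented toy data. The sketch assumes the classic Chao1 form (observed richness plus f1²/(2·f2), with the usual fallback when no work has exactly two copies) and computes only empirical Hill numbers, without the estimated profile mentioned above.

```python
import math


def chao1(abundances):
    """Classic Chao1 lower bound on richness (works, in our setting).

    abundances: number of surviving documents per observed work.
    """
    counts = [c for c in abundances if c > 0]
    s_obs = len(counts)                       # works with at least one copy
    f1 = sum(1 for c in counts if c == 1)     # works surviving in one copy
    f2 = sum(1 for c in counts if c == 2)     # works surviving in two copies
    if f2 > 0:
        unseen = f1 * f1 / (2 * f2)
    else:
        # Bias-corrected fallback when there are no doubletons.
        unseen = f1 * (f1 - 1) / 2
    return s_obs + unseen


def hill_number(abundances, q):
    """Empirical Hill number of order q for a list of abundance counts."""
    counts = [c for c in abundances if c > 0]
    n = sum(counts)
    p = [c / n for c in counts]
    if q == 1:
        # Limit case: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


# Invented toy data: nine works, 24 surviving documents in total.
toy_counts = [1, 1, 1, 1, 2, 2, 3, 5, 8]

estimated_works = chao1(toy_counts)           # 9 observed + 4 unseen = 13.0
observed_works = hill_number(toy_counts, 0)   # order 0 is plain richness: 9.0
print(estimated_works, observed_works)
```

At q = 0 the Hill number is simply the count of observed works, which is the quantity Chao1 corrects upwards; higher values of q give progressively more weight to works that survive in many copies.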
Did island literatures fare better?
Our loss figures are compatible with prior studies in book history using other methodologies, but they hide considerable variation across the six medieval languages considered here. We present survival ratios both for (material) documents and (immaterial) works. While the confidence intervals are large, we can observe clear trends. Our analyses confirm the severity of loss, but suggest that German, Icelandic and Irish are characterized by higher survival ratios than French, Dutch or English. The results for the island literatures are remarkable: in spite of their small size, the survival ratios for these literatures were on par with or better than those for more widely influential literatures, such as French, on the mainland. That these isolated island cultures behave differently is particularly exciting, because in ecology too, islands are of special interest. In ecology, endemic species richness, for instance, is higher on islands: if islands are indeed better able to preserve their biological heritage, could the same be true for their cultural heritage?
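A survival ratio is simply the surviving count divided by the estimated original count. As a worked illustration, the snippet below applies this to the aggregate figures reported on this page (799 of ca. 1,170 works; 3,648 of ca. 40,614 documents). The helper name `survival_ratio` is hypothetical, the inputs are rounded point estimates, and the confidence intervals are not reproduced.

```python
def survival_ratio(surviving, original):
    """Fraction of an estimated original population that survives."""
    if original <= 0:
        raise ValueError("original population must be positive")
    return surviving / original


# Aggregate point estimates quoted on this page (rounded, "ca." values).
works_ratio = survival_ratio(799, 1170)
documents_ratio = survival_ratio(3648, 40614)

print(f"works:     {works_ratio:.1%}")      # roughly 68%
print(f"documents: {documents_ratio:.1%}")  # roughly 9%
```

The gap between the two ratios reflects the distinction drawn earlier: a work survives as long as a single copy of it does, so works are lost far more slowly than documents.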
Evenness of distributions
Past research has mostly focused on post-medieval factors that drove the loss of historic literature, such as library fires or collectors disposing of “duplicate” copies. We identify, however, an additional factor that has typically been overlooked: the original evenness of these literatures. Evenness is a concept that we borrow from ecology. In a more even literary tradition, copies are more evenly distributed over works, so that the difference in the number of copies between the most popular and the least popular works is smaller. A more even distribution can guard a literature against losses: if we randomly lose a manuscript in a more evenly distributed literature, the chance that we lose a unique copy of a work is smaller than it is in an equally sized literary tradition that is less even. To the right, we show “evenness curves” for the traditions studied; these are integrated in the Hill number framework and display evenness across various values of q. The island literatures (Irish and Icelandic) differ sharply from the other four languages.
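The protective effect of evenness can be made concrete with two invented toy traditions of equal size (12 documents, 4 works each). The sketch below computes the chance that one randomly lost manuscript is the only copy of its work, together with a simple evenness score. The score used here, the Hill number of order q divided by the number of works, is only one common normalization and is an assumption of this sketch; it is not necessarily the measure behind the evenness curves described above. All function names are illustrative.

```python
import math


def prob_losing_unique_copy(copies):
    """Chance that one randomly lost document was the sole copy of a work."""
    n_documents = sum(copies)
    singletons = sum(1 for c in copies if c == 1)
    return singletons / n_documents


def hill_evenness(copies, q):
    """Hill number of order q divided by richness (1.0 = perfectly even).

    Assumption: one simple normalization among several in the literature.
    """
    counts = [c for c in copies if c > 0]
    n = sum(counts)
    p = [c / n for c in counts]
    if q == 1:
        diversity = math.exp(-sum(pi * math.log(pi) for pi in p))
    else:
        diversity = sum(pi ** q for pi in p) ** (1 / (1 - q))
    return diversity / len(counts)


# Two invented traditions, each with 4 works and 12 documents.
even_tradition = [3, 3, 3, 3]
uneven_tradition = [9, 1, 1, 1]

for name, copies in [("even", even_tradition), ("uneven", uneven_tradition)]:
    risk = prob_losing_unique_copy(copies)
    score = hill_evenness(copies, q=2)
    print(f"{name}: risk of losing a work = {risk:.2f}, evenness = {score:.2f}")
```

In the even tradition no single loss can erase a work, whereas in the uneven one a quarter of all possible single losses would, even though both traditions hold the same number of documents and works.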
The spread of literature
Just like plant seeds, historic books have been subjected to a global dispersal after the Middle Ages. Often, fragments of manuscripts traveled unnoticed in the spines of later books, re-emerging later in distant corners of the world. In other cases, lavishly illustrated codices were traded by well-known book salespeople for record prices at public auctions. There are many aspects of the survival of literature that deserve further quantitative research. In the Sankey diagram to the right, we plot where the various documents of the vernaculars from our study are currently being kept. The English documents (again) stand out: their dispersion has remained surprisingly local to the British Isles, whereas the other vernaculars experienced a much wider spread across the European continent. Just as in ecology, the ability to migrate might have been a crucial factor in the survival of literatures.
To support the findings in our paper and ensure their replicability, we have made our
full datasets and Python code open access. This includes documented Jupyter notebooks and
the release of a new, open-source software package for running unseen species models, called
“copia” (Latin for “abundance”,
which is a classical concept in ecology), available from the
Python Package Index.
The logo with a horned goat playfully refers to the mythological
cornucopia or "horn of plenty", the legendary horn of the goat Amaltheia, who fed the infant
Zeus with her milk. The software has been published on GitHub, where we will welcome community
contributions in the future; a snapshot of this repository (including the data) has been
sustainably archived on Zenodo. Additionally, we provide an independent reimplementation
of our entire analysis in the statistical software R, which has an established tradition
in biostatistics, in particular for unseen species models.