“Data Pearl 2024” awarded to National Library of Estonia’s Digilab

28 August 2024

The “Data Pearl” competition was initiated by the Statistical Office of Estonia in 2021 in celebration of its 100th anniversary, to promote the use and interpretation of data and to improve data literacy in society. The winners of the “Data Pearl” competition are determined by combining the results of a public vote with the scores given by a jury.

The 2024 awards were announced in June this year. The title of the best data story in the “Data Pearl 2024” competition was awarded to the National Library of Estonia’s Digilab for a series of data stories published in four blog posts in 2023. These posts, available on the Digilab website, offer an excellent opportunity to explore our cultural heritage through data, covering the following topics:

  • How big are Estonian books?
  • Driving through big data
  • RaRa datasets on maps
  • How many books have been published in Estonian?

The winners of the competition, the Digilab team consisting of Krister Kruusmaa, Peeter Tinits, and Laura Nemvalts, share insights into how the stories, which became the favourite of both the public and the jury, were created:

The task of the Digilab is to bring the cultural heritage available at the National Library of Estonia to the public through data. We aim to make storytelling about culture through data accessible to everyone, while also trying to lead the way ourselves. The greatest challenge lies in organising this data, as it has primarily been collected for preservation purposes rather than for analysis or publication. We spend a lot of time “translating” the data from the world of librarianship to the world of researchers and scholars, which is a rather lengthy and complex process. The reward for this work comes when we can finally pose meaningful questions to the data and share the answers with the public.

Through data, we have been able to highlight the so-called big picture of cultural history. For example, we can see when various modes of transportation entered our media or how the dominant genres of books published in Estonian have changed over time. Since data is one way to help us understand the patterns of history and culture, the range of possible topics is extremely broad. In this field, close collaboration with the humanities and social sciences, which study these topics, is also crucial. Data-driven stories are always interesting because they build on what we already know, either through everyday experience or through expert narratives. We believe that data can bring new colour and freshness to our cultural history.

On the Digilab’s website, you can find several tools, such as word n-grams in newspapers, an interactive map of the publication locations of Estonian books, a network of translated literature, and an overview of the digitisation status of Estonian newspapers.

Translation network for the period 1991–2023.
In addition to revealing Estonians’ preferences in translated literature, the application, which presents the connections between translators and writers as a network, gives a sense of the undercurrents of history and the links between Estonian and world culture.

What fascinates us most about working with cultural data are the stories and narratives that emerge, which, on the one hand, can support existing knowledge and, on the other hand, help uncover hidden patterns and connections. Analysing big data also allows us to see the larger picture of cultural history, which is otherwise impossible to grasp. Stanford University Professor Richard White has said that the analysis of cultural data is a research tool in itself: it reveals connections that have previously remained hidden and raises questions that would not be asked in any other way. Similarly, in our analysis we have discovered how much linguistic experts’ guidelines can influence people’s language use, the extent to which major changes in literary history are driven by individuals or specific schools of thought, and which topics and events have brought environmental issues to the forefront of media attention.

The series of data stories actually continues, and in addition to the 2023 posts submitted for the competition, there are also two entirely new stories on our blog, both based on network analysis.

The first one provides an overview of the publishing network during the Estonian National Awakening and explores who the most central figures in shaping Estonian national literature were. The second story, created in collaboration with translation history researchers from Tallinn University, introduces an interactive tool designed for studying Estonian translated literature. This tool maps out all the connections between nearly 10,000 foreign authors and 4,000 translators who have translated their works into Estonian over the past two centuries.

In the upcoming data stories, we will explore slightly different directions, such as interactive maps and artificial intelligence. But what we look forward to most is contributions from external participants. The idea behind the Digilab is to open up data and its value to everyone, so if anyone has done something using the National Library’s data, they are welcome to reach out to us, and we will help them create a blog post for publication on our site. In the future, we envision the Digilab as a platform for co-creation and citizen science, where people can share their work with data and explore what others have created.