Evaluation and Data Enrichment

Task 3.1 in the SSHOC-NL project

This task aims to enable researchers in the Social Sciences and Humanities (SSH) to use data enrichment tools in a methodologically sound manner. To that end, we aim to provide a hub of information, previous case studies, and training sessions and materials. These resources help researchers find and use the right tools for their specific use case, and train them to test whether the outcome is sufficiently reliable for the intended purpose.

About SSHOC-NL

SSHOC-NL (Social Science and Humanities Open Cloud for the Netherlands) is a large-scale research initiative aiming to create a unified ecosystem of tools, data, and services tailored to the needs of both the humanities and social sciences. The project builds on the strengths of national research infrastructures like ODISSEI for social sciences and CLARIAH for the humanities, fostering interoperability and collaboration across disciplines. Its primary goal is to address researchers’ needs by developing a secure, FAIR (Findable, Accessible, Interoperable, Reusable) environment that enables the linking and analysis of diverse data types—ranging from historical records to social media. This infrastructure facilitates cutting-edge, interdisciplinary research that tackles pressing societal challenges such as polarization, inequality, and environmental change.

By integrating technological advancements with researchers’ workflows, SSHOC-NL ensures that the social sciences and humanities communities in the Netherlands can collaboratively explore new research avenues. The project emphasizes user-friendly tools, ethical and secure data handling, and a robust training framework to empower researchers of all levels. Through this initiative, the Netherlands strengthens its position at the forefront of European open science, while providing a platform for addressing complex social and cultural questions with broad societal impact.

The team

The team for Task 3.1 in the SSHOC-NL project is composed of researchers from both the Vrije Universiteit Amsterdam (VU) and the Instituut voor de Nederlandse Taal (INT, the Dutch Language Institute).

Eleanor Smith (VU): Eleanor has a background in digital humanities and in historical and computational linguistics. She is currently a PhD candidate at the VU in the Computational Linguistics and Text Mining Lab.

Bram Vanroy (INT): Bram is a researcher in computational linguistics at the Dutch Language Institute and KU Leuven, with a current focus on large language models and evaluation.

Antske Fokkens (VU): Antske is a professor of Computational Linguistics at the VU. Her research focuses on methodology for language technology in interdisciplinary contexts. Her group is embedded in the Computational Linguistics and Text Mining Lab at the VU.

Vincent Prins (INT): Vincent is a software engineer at the Dutch Language Institute and lead developer of GaLaHaD, a platform for linguistic annotation and evaluation of historical Dutch.

Jesse de Does (INT, task co-lead): Jesse is a senior computational linguist at the Dutch Language Institute. He was responsible for managing the INT contributions to the CLARIAH-core and CLARIAH-plus projects.

Katrien Depuydt (INT): Katrien is a senior researcher and linguist at the Dutch Language Institute. She is responsible for the institute's data infrastructure, i.e. the development and deployment of corpora and lexica.

Contact us for questions or collaborations

If you would like to know more about our work, or you see overlap between our aims and your prior or current work, you can reach us via the contact form.