2025 LLM workshop: materials

On November 27th we invited researchers to our workshop on LLMs. The focus was “reflection”: when using LLMs, we should reflect on multiple aspects of the tool. We asked questions such as: how do LLMs actually work, and how does that affect their output (biases, trustworthiness)? What are the ethical, copyright and privacy concerns involved? Is a massive LLM really necessary, or can I make do with smaller, specialized models?

The morning session featured four presentations, which you can download here:

Ellie Smith opened the workshop by introducing its broader context: the SSHOC-NL project in general and our Task 3.1 specifically.

Antske Fokkens took us through the ins and outs of LLMs, describing how they work, how they should (and should not) be used, and which common pitfalls to avoid.

Stella Verkijk presented how LLMs play a part in her research, leaving room for critical reflections and comparisons between specialized models and general-purpose models.

Bram Vanroy highlighted the importance, and fragility, of evaluating LLMs by describing common LLM benchmarks.

For those interested, the workshop continued into the afternoon. Bram guided attendees through two Python notebooks that explore how data scarcity can potentially be remedied by using an LLM to generate synthetic data: new data that resembles the expected training data and can help the model generalize better. Particular attention is paid to model evaluation and comparison, so that the results are methodologically sound: we only use the term “significant difference” when a statistical test actually supports it. In an artificial low-resource scenario for named entity recognition, we found that adding generated synthetic data significantly improved model performance.
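To make the idea of a statistically supported comparison concrete, here is a minimal sketch of a paired permutation (sign-flip) test on the scores of two setups. It is only one of several possible tests and not necessarily the one used in the notebooks; the function name `paired_permutation_test` and the F1 scores are hypothetical and chosen purely for illustration.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided paired permutation (sign-flip) test on the mean difference."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis, the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical F1 scores from five paired training runs (NOT workshop results).
baseline = [0.61, 0.63, 0.60, 0.62, 0.64]
with_synthetic = [0.66, 0.67, 0.65, 0.69, 0.68]

p_value = paired_permutation_test(with_synthetic, baseline)
print(f"p = {p_value:.4f}")
```

Note that with only five paired scores, the smallest two-sided p-value this test can reach is 2/32 = 0.0625, even when one setup wins every time. Claiming significance at the usual 0.05 level therefore requires more paired observations, for example more training runs or per-sentence scores.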

The following notebooks can be run in a free Google Colab session. They walk you through cleaning and critically reflecting on existing datasets; training a token-classification model for NER; using an LLM to generate synthetic data; and evaluating and comparing models in a statistically supported manner.
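As a rough illustration of the evaluation step, the sketch below computes entity-level F1 from BIO-tagged sequences, where a predicted entity only counts as correct if both its boundaries and its label match the gold annotation. The helper names `bio_to_spans` and `entity_f1` are hypothetical and the example tags are made up; the notebooks may well rely on an existing library for this instead.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, label) spans."""
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            # Assumption: a stray I- tag without a matching B- starts a new span.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold_sentences, predicted_sentences):
    """Micro-averaged entity-level F1 over parallel lists of tag sequences."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_sentences, predicted_sentences):
        gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: the model finds the person but misses the location.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
score = entity_f1(gold, pred)
print(f"entity-level F1 = {score:.3f}")
```

Scoring whole entities rather than individual tokens is stricter: a prediction that gets only part of a multi-token name right earns no credit.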
