New article published – testing the ability of language models to capture olfactory information.
In our most recent publication, Representations of smells: The next frontier for language models?, published in Cognition, we tested language models’ ability to capture olfactory-perceptual and olfactory-semantic information.
We trained three generations of language models using around 200 training configurations and four different text corpora. We then evaluated these models against three datasets capturing either olfactory-perceptual or olfactory-semantic information.
Surprisingly, we found that classic language models, such as Word2Vec, best captured olfactory-perceptual content, whereas state-of-the-art models, such as GPT, excelled at olfactory-semantic content. However, Word2Vec’s results depended heavily on the training data, with much better performance when it was trained on olfactory-related contexts.
The article can be read here: https://www.sciencedirect.com/science/article/
Authors: Murathan Kurfalı, Pawel Herman, Stephen Pierzchajlo, Jonas Olofsson, Thomas Hörberg
