19. June 2023 at 00:00

Teaching AI bots to speak better Slovak

There is relatively little data available in Slovak for AI training.

Matúš Beňo

Editorial

Digital question-and-answer systems in English work just fine: you type in the former and out comes the latter, in more-or-less perfectly grammatical form. In fact, the linguistic abilities of emergent technologies like generative AI are among their most remarkable aspects for users of major languages. However, the same cannot truly be said of systems built for less widely spoken languages, such as Slovak.

Slovak belongs to a group known as low-resource languages, meaning that far less data is available to train conversational systems than for, say, English.

This is where the efforts of Slovak researchers from the Technical University of Košice (TUKE) come in. They have created the first manually annotated Q&A dataset in Slovak, consisting of more than 91,000 factual questions and answers from various fields, and published it free of charge.

The Slovak Spectator talked to Daniel Hládek from the Department of Electronics and Multimedia Communications at TUKE's Faculty of Electrical Engineering and Informatics about the dataset, what a machine reading process looks like, and how a neural network, or more colloquially AI, can answer questions it has not seen before.


Two reasons

Ever since the dawn of conversational systems such as virtual assistants, they have been used predominantly on English-language websites.
