Digital question-and-answer systems in English work just fine: you type in the former and out comes the latter, in more-or-less perfectly grammatical form. In fact, the linguistic abilities of emergent technologies like generative AI are among their most remarkable aspects for users of major languages. However, the same cannot truly be said of systems based on less widely spoken languages, such as Slovak.
Slovak belongs to a group known as low-resource languages, meaning that relatively little data is available to train conversational systems compared with, say, English.
This is where the efforts of Slovak researchers from the Technical University of Košice (TUKE) come in. They have created the first manually annotated Q&A dataset in Slovak, consisting of more than 91,000 factual questions and answers from various fields, and published it free of charge.
The Slovak Spectator talked to Daniel Hládek from the Department of Electronics and Multimedia Communications at TUKE's Faculty of Electrical Engineering and Informatics about the dataset, what a machine reading process looks like, and how a neural network, or more colloquially AI, can answer questions it has not seen before.

Two reasons
Ever since conversational systems such as virtual assistants first appeared, they have been used predominantly on English-language websites.