tištěná kniha e-kniha

Unsupervised Machine Translation: How Machines Learn to Understand Across Languages

brožovaná, 176 str., 1. vydání
vydáno: duben 2025
ISBN: 978-80-246-6078-3
doporučená cena: 400 Kč

Anotace

For decades, machine translation between natural languages fundamentally relied on human-translated documents known as parallel texts, which provide direct correspondences between source and target sentences. The notion that translation systems could be trained on non-parallel texts, independently written in different languages, was long considered unrealistic. Fast forward to the era of large language models (LLMs), and we now know that given their sufficient computational resources, LLMs exploit incidental parallelism in their vast training data, i.e., they identify parallel messages across languages and learn to translate without explicit supervision. LLMs have since demonstrated the ability to perform translation tasks with impressive quality, rivaling systems specifically trained for translation.

This monograph explores the fascinating journey that led to this point, focusing on the development of unsupervised machine translation. Long before the rise of LLMs, researchers were exploring the idea that translation could be achieved without parallel data. Their efforts centered on motivating models to discover cross-lingual correspondences through various techniques, such as the mapping of word embedding spaces, back-translation, or parallel sentence mining. Although much of the research described in this monograph predates the mainstream adoption of LLMs, the insights gained remain highly relevant. They offer a foundation for understanding how and why LLMs are able to translate.

Sdílet