Microsoft Research
Published August 17, 2016, 21:06
Statistical machine translation systems generate their output by stitching together fragments of example translations. Two trends are fueling rapid progress in this field: more example data, and new modeling techniques that better exploit the information in the data. In particular, today's massive data sets allow our statistical models to capture larger linguistic contexts than ever before. In this talk, I will give a tour of the three stages of a modern system: training a model, searching for translations, and selecting one. For each stage, I will highlight innovations that have enabled us to leverage the rich patterns contained in large data sets.

The first stage of translation discovers how two languages correspond to each other. Models of correspondence have historically bottomed out in word-to-word statistics. The approach I will describe centers instead on statistics over multi-word phrases, which can capture idiomatic and non-literal translation patterns. These patterns are acquired automatically using nonparametric statistical machinery that scales up naturally with the data, introducing additional context whenever there is sufficient evidence to support it.

The second stage searches for translations that are scored highly by a model. As our models grow in size and complexity with the data, so does the scale of this search problem. I will present a coarse-to-fine approach to managing this complexity, which uses simpler approximate models to guide and constrain the full-scale search. This kind of multi-pass inference is proving to be a powerful general tool for deploying language processing systems at scale.

The final stage selects a single output translation from a set of high-scoring candidates. The consensus framework I will introduce selects a translation with high agreement among the multitude of strong candidates. Theoretically, this approach unifies two distinct translation problems: selecting final outputs and combining multiple systems. Empirically, this work has set new performance records for two of the world's most successful large-scale, highly distributed translation systems.
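To make the final, consensus-based stage concrete, here is a minimal sketch (not taken from the talk) of selecting one translation from an n-best list by expected agreement with its peers. It uses a simple clipped n-gram overlap as the agreement measure and softmax-normalized model scores as peer weights; the function names, the overlap measure, and the weighting scheme are illustrative assumptions rather than the system described in the abstract.

```python
# Sketch of consensus selection over an n-best list: pick the candidate that
# agrees most, on average, with the other high-scoring candidates.
# Assumptions: n-gram overlap as the agreement measure, softmax weights.

from collections import Counter
from math import exp


def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(hyp, ref, max_n=4):
    """Clipped n-gram overlap between two tokenized sentences (BLEU-like, unsmoothed)."""
    score = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches = sum(min(count, r[gram]) for gram, count in h.items())
        score += matches / max(sum(h.values()), 1)
    return score / max_n


def consensus_select(candidates, model_scores):
    """Return the candidate with the highest expected agreement with its peers,
    weighting each peer by its softmax-normalized model score."""
    weights = [exp(s) for s in model_scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    best, best_score = None, float("-inf")
    for hyp in candidates:
        expected = sum(w * overlap(hyp, peer)
                       for peer, w in zip(candidates, weights) if peer is not hyp)
        if expected > best_score:
            best, best_score = hyp, expected
    return best


# Usage with three hypothetical candidate translations and model scores.
cands = [["the", "cat", "sat", "on", "the", "mat"],
         ["the", "cat", "sat", "on", "a", "mat"],
         ["a", "feline", "rested", "upon", "the", "rug"]]
print(" ".join(consensus_select(cands, model_scores=[-1.0, -1.2, -1.1])))
```

Under this scoring, the two near-duplicate candidates reinforce each other, so one of them is selected over the outlier, which is the intended behavior of a consensus-style decision rule.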