Redesigning Neural Architectures for Sequence to Sequence Learning

Published March 26, 2019, 17:37
The Encoder-Decoder model with soft attention is now the de facto standard for sequence to sequence learning, having enjoyed early success in tasks like translation, error correction, and speech recognition. In this talk, I will present a critique of various aspects of this popular model, including its soft attention mechanism, local loss function, and sequential decoding. I will present a new Posterior Attention Network for a more transparent joint attention that provides easy gains on several translation and morphological inflection tasks. Next, I will expose a little-known problem of mis-calibration in state-of-the-art neural machine translation (NMT) systems. For structured outputs like in NMT, calibration is important not just for reliable confidence in predictions, but also for the proper functioning of beam-search inference. I will discuss reasons for mis-calibration and some fixes. Finally, I will summarize recent research efforts towards parallel decoding of long sequences.
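For context on the model the talk critiques, here is a minimal sketch (my own illustrative NumPy code, not the speaker's) of one soft-attention step in an encoder-decoder: the decoder state is scored against every encoder state, the scores are normalized with a softmax, and the weights form a context vector as a convex combination of encoder states.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_attention(decoder_state, encoder_states):
    # decoder_state: shape (d,); encoder_states: shape (T, d)
    scores = encoder_states @ decoder_state   # dot-product alignment scores, shape (T,)
    weights = softmax(scores)                 # soft attention distribution over source positions
    context = weights @ encoder_states        # weighted sum of encoder states, shape (d,)
    return context, weights

# Toy usage: 5 source positions, hidden size 4 (arbitrary example values).
rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 4))
dec = rng.normal(size=(4,))
ctx, attn = soft_attention(dec, enc)
print(attn.round(3), ctx.round(3))
```

Because the attention weights are computed independently at each decoding step and trained only through the local token-level loss, they need not reflect a coherent joint alignment; this is the kind of gap the Posterior Attention Network and the calibration analysis in the talk address.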

See more at microsoft.com/en-us/research/v...