Microsoft Research
Published June 3, 2025, 16:32
Next-token prediction trains a language model on every token in a sequence. VP Weizhu Chen discusses his team's 2024 NeurIPS paper on how distinguishing useful tokens from "noisy" ones during pretraining can improve token efficiency and model performance.
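The idea of training on only the useful tokens can be sketched in a few lines. This is a minimal illustration, not the paper's actual method: it assumes tokens are ranked by "excess loss" against a reference model (one common way to score token usefulness), and the function name, `keep_ratio`, and the top-k rule are all hypothetical choices for the example.

```python
def selective_token_loss(train_losses, ref_losses, keep_ratio=0.5):
    """Average loss over only the highest-scoring tokens.

    train_losses: per-token cross-entropy from the model being trained.
    ref_losses: per-token cross-entropy from a fixed reference model.
    keep_ratio and the excess-loss score are illustrative assumptions.
    """
    # Tokens the reference model already predicts well (low excess loss)
    # are treated as "noisy" and skipped.
    excess = [t - r for t, r in zip(train_losses, ref_losses)]
    k = max(1, int(len(train_losses) * keep_ratio))
    keep = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)[:k]
    # Only the selected tokens contribute to the training loss.
    return sum(train_losses[i] for i in keep) / k
```

For example, with per-token losses `[2.0, 0.5, 3.0, 1.0]` and reference losses `[1.0, 0.4, 1.0, 2.0]`, the two highest excess-loss tokens (indices 2 and 0) are kept and the rest are dropped from the objective.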
Show notes: microsoft.com/en-us/research/p...
Listen to the Abstracts series: microsoft.com/en-us/research/p...