Make some noise: Teaching the language of audio to an LLM using sound tokens

Microsoft Research (356K subscribers)

Published July 28, 2025, 19:56

August 22, 2024
Speakers: Shivam Mehta
Host: Hannes Gamper

We investigate the use of low-bitrate, causal, quantized audio representations to fine-tune large language models (LLMs) with LoRA for audio comprehension and generation. Unlike earlier approaches that depend on continuous audio representations for audio comprehension, our approach learns a discretized language of audio through causal variational quantization, yielding an ultra-low bitrate of 0.293 kbps. The resulting audio tokens are then used to fine-tune the Llama 7b model for multimodal tasks involving audio understanding and generation. By treating audio as a language with a similar left-to-right inductive bias, we can use these tokens to train a multimodal model and conduct qualitative multimodal analysis.
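
To give a feel for what a bitrate of 0.293 kbps means for a token stream, here is a minimal sketch of the standard relation between token rate, codebook size, and bitrate. The abstract does not state the codebook size, the number of codebooks, or the frame rate, so the values used below (a single 1024-entry codebook) are purely hypothetical, as are the function names; 1 kbps is taken as 1000 bits per second.

```python
import math


def token_bitrate_kbps(tokens_per_second: float,
                       codebook_size: int,
                       num_codebooks: int = 1) -> float:
    """Bitrate of a discrete token stream, in kbps (1 kbps = 1000 bits/s).

    Each token drawn from a codebook of `codebook_size` entries carries
    log2(codebook_size) bits when entries are coded with a fixed length.
    """
    bits_per_token = math.log2(codebook_size)
    return tokens_per_second * num_codebooks * bits_per_token / 1000.0


def tokens_per_second_at(bitrate_kbps: float,
                         codebook_size: int,
                         num_codebooks: int = 1) -> float:
    """Inverse relation: token rate implied by a target bitrate."""
    bits_per_token = math.log2(codebook_size)
    return bitrate_kbps * 1000.0 / (num_codebooks * bits_per_token)


if __name__ == "__main__":
    # Hypothetical setup, NOT taken from the talk: one codebook, 1024 entries.
    rate = tokens_per_second_at(0.293, codebook_size=1024)
    print(f"tokens per second at 0.293 kbps: {rate:.1f}")
```

Under these assumed values, each token carries 10 bits, so 0.293 kbps would correspond to roughly 29 tokens per second of audio. A different codebook configuration would give a different token rate for the same bitrate.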
