From SqueezeNet to SqueezeBERT: Developing Efficient Deep Neural Networks

Microsoft Research

Published August 14, 2020, 20:57

Deep neural networks have been trained to interpret images and text with increasingly high accuracy. In many cases, these accuracy improvements come from developing increasingly large and computationally intensive neural network models. These models tend to incur high latency during inference, especially when deployed on smartphones and edge devices. In this talk, we present two lines of work that focus on reducing the high cost of neural network inference on edge devices. First, we review the last four years of progress in the computer vision (CV) community towards developing efficient neural networks for edge devices, ranging from early work such as SqueezeNet to recent work leveraging neural architecture search. Second, we present SqueezeBERT, a mobile-optimized neural network design for natural language processing (NLP) that draws on ideas from efficient CV network design. SqueezeBERT achieves a 4.3x speedup over BERT-base on a Pixel 3 smartphone. Finally, we believe that SqueezeBERT is just the beginning of several years of fruitful research in the NLP community to develop efficient neural architectures.

See more at microsoft.com/en-us/research/v...
