Microsoft Research features a talk by Wei Wen on Efficient and Scalable Deep Learning (slides)

In deep learning, researchers keep gaining higher performance by using larger models. However, two obstacles block the community from building larger models: (1) training larger models is more time-consuming, which slows down model design exploration, and (2) inference with larger models is also slow, which prevents their deployment in computation-constrained applications. In this talk, I will introduce some of our efforts to remove those obstacles. On the training side, we propose TernGrad, which reduces the communication bottleneck to scale up distributed deep learning; on the inference side, we propose structurally sparse neural networks, which remove redundant neural components for faster inference. At the end, I will very briefly introduce (1) my recent efforts to accelerate AutoML, and (2) future work applying my research to overcome scaling issues in Natural Language Processing.
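To make the training-side idea concrete, here is a minimal sketch of TernGrad-style gradient ternarization, assuming PyTorch; the function name and structure are illustrative, not taken from the talk or paper. Each worker would send only the ternary signs plus one scalar scale instead of full-precision gradients.

```python
import torch

def ternarize(grad: torch.Tensor) -> torch.Tensor:
    # Scale s is the largest gradient magnitude in the tensor.
    s = grad.abs().max()
    if s == 0:
        return torch.zeros_like(grad)
    # Keep each component's sign with probability |g_i| / s,
    # otherwise transmit a zero. Every value ends up in {-s, 0, +s}.
    keep = torch.bernoulli(grad.abs() / s)
    return s * grad.sign() * keep
```

In expectation this recovers the original gradient (E[s * sign(g_i) * |g_i| / s] = g_i), which is why stochastic rounding is used rather than a hard threshold: the quantized update stays an unbiased estimate while needing only about two bits per component on the wire.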

See more on this talk at Microsoft Research:
https://www.microsoft.com/en-us/research/video/efficient-and-scalable-deep-learning/

Yannic Kilcher investigates BERT and the paper associated with it: https://arxiv.org/abs/1810.04805

Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.
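To see what "just one additional output layer" means in practice, here is a minimal PyTorch sketch of fine-tuning BERT for classification. The `pretrained_encoder` object and its call signature are placeholder assumptions for illustration, not the actual BERT API; only the final linear layer is new, and the whole stack is trained end to end.

```python
import torch.nn as nn

class BertClassifier(nn.Module):
    def __init__(self, pretrained_encoder: nn.Module,
                 hidden_size: int, num_labels: int):
        super().__init__()
        self.encoder = pretrained_encoder  # pre-trained BERT body, fine-tuned too
        # The single task-specific output layer added on top of BERT.
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        # Assumed to return final hidden states of shape (batch, seq_len, hidden_size).
        hidden_states = self.encoder(input_ids, attention_mask)
        cls_vector = hidden_states[:, 0]    # hidden state of the [CLS] token
        return self.classifier(cls_vector)  # task logits
```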

Regina Barzilay is a professor at MIT and a world-class researcher in natural language processing and in applications of deep learning to chemistry and oncology, including the use of deep learning for early diagnosis, prevention, and treatment of cancer.

She has also been recognized for teaching several successful AI-related courses at MIT, including the popular Introduction to Machine Learning course. This conversation is part of the Artificial Intelligence podcast run by Lex Fridman.

In this video, Lex Fridman interviews Oriol Vinyals, a senior research scientist at Google DeepMind.

From the video description:

Before that he was at Google Brain and Berkeley. His research has been cited over 39,000 times. He is one of the most brilliant and impactful minds in the field of deep learning. He is behind some of the biggest papers and ideas in AI, including sequence-to-sequence learning, audio generation, image captioning, neural machine translation, and reinforcement learning. He is a co-lead (with David Silver) of the AlphaStar project, which created an agent that defeated a top professional player at StarCraft.