Yannic Kilcher explores a recent innovation at Facebook

Code migration between languages is an expensive and laborious task: translating from one language to another requires expertise in both, and current automatic tools often produce illegible, convoluted code. This paper applies unsupervised neural machine translation to Python, C++, and Java source code and is able to translate between the three languages without ever being trained on a supervised (parallel) corpus.
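To make the task concrete, here is the kind of function-level pair such a transcompiler is expected to produce. The pair below is my own illustration, not an example from the paper, which scores translations by whether the output passes the same unit tests as the source function.

```python
# Illustrative transcompilation pair (my own, not from the paper): the model
# reads the Python function and must emit a semantically equivalent C++
# function. Correctness is naturally checked with unit tests, since many
# different target programs are acceptable.

python_source = '''
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
'''

expected_cpp = '''
int fib(int n) {
    int a = 0, b = 1;
    for (int i = 0; i < n; ++i) {
        int t = a + b;
        a = b;
        b = t;
    }
    return a;
}
'''
```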

Paper: https://arxiv.org/abs/2006.03511

Content index:

  • 0:00 – Intro & Overview
  • 1:15 – The Transcompiling Problem
  • 5:55 – Neural Machine Translation
  • 8:45 – Unsupervised NMT
  • 12:55 – Shared Embeddings via Token Overlap
  • 20:45 – MLM Objective
  • 25:30 – Denoising Objective
  • 30:10 – Back-Translation Objective
  • 33:00 – Evaluation Dataset
  • 37:25 – Results
  • 41:45 – Tokenization
  • 42:40 – Shared Embeddings
  • 43:30 – Human-Aware Translation
  • 47:25 – Failure Cases
  • 48:05 – Conclusion

How far can you go with ONLY language modeling?

Can a large enough language model perform NLP tasks out of the box?

OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything that has ever been built before, and the results are astounding.

Yannic Kilcher explores.

Paper: https://arxiv.org/abs/2005.14165
Code: https://github.com/openai/gpt-3

Time index:

  • 0:00 – Intro & Overview
  • 1:20 – Language Models
  • 2:45 – Language Modeling Datasets
  • 3:20 – Model Size
  • 5:35 – Transformer Models
  • 7:25 – Fine Tuning
  • 10:15 – In-Context Learning
  • 17:15 – Start of Experimental Results
  • 19:10 – Question Answering
  • 23:10 – What I think is happening
  • 28:50 – Translation
  • 31:30 – Winograd Schemas
  • 33:00 – Commonsense Reasoning
  • 37:00 – Reading Comprehension
  • 37:30 – SuperGLUE
  • 40:40 – NLI
  • 41:40 – Arithmetic Expressions
  • 48:30 – Word Unscrambling
  • 50:30 – SAT Analogies
  • 52:10 – News Article Generation
  • 58:10 – Made-up Words
  • 1:01:10 – Training Set Contamination
  • 1:03:10 – Task Examples
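The in-context learning chapter is the heart of the paper: task demonstrations go into the prompt itself, and the model is only ever asked to continue the text, with no gradient updates. A minimal sketch of few-shot prompt construction; the downstream completion call is left to whatever language-model API you use:

```python
# Few-shot prompting as GPT-3 uses it: k solved examples in the prompt,
# then the unanswered query. No weights are updated; "learning" happens
# purely at inference time.

def make_few_shot_prompt(examples, query):
    """Format demonstration (question, answer) pairs, then the open query."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

examples = [
    ("What is 12 plus 7?", "19"),
    ("What is 30 plus 45?", "75"),
]
prompt = make_few_shot_prompt(examples, "What is 24 plus 18?")
print(prompt)  # the model should continue this text with "42"
```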

MIT Introduction to Deep Learning 6.S191: Lecture 6 with Ava Soleimany.

Subscribe to stay up to date with new deep learning lectures at MIT, or follow us @MITDeepLearning on Twitter and Instagram to stay fully-connected!!

Lecture Outline

  • 0:00 – Introduction
  • 0:58 – Course logistics
  • 3:59 – Upcoming guest lectures
  • 5:35 – Deep learning and expressivity of NNs
  • 10:02 – Generalization of deep models
  • 14:14 – Adversarial attacks
  • 17:00 – Limitations summary
  • 18:18 – Structure in deep learning
  • 22:53 – Uncertainty & Bayesian deep learning
  • 28:09 – Deep evidential regression
  • 33:08 – AutoML
  • 36:43 – Conclusion
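For the adversarial-attacks portion of the outline, the classic construction is the fast gradient sign method (FGSM): nudge the input in the direction that increases the loss the most. The PyTorch sketch below is a generic textbook version under my own assumptions, not code from the lecture.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Fast gradient sign method: perturb x by eps along sign(grad of loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # step in the loss-increasing direction, keep pixel values in [0, 1]
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Toy usage on a random "image" batch and an untrained linear classifier
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
x_adv = fgsm(model, x, y)
```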

This tutorial on TensorFlow.org implements a simplified Quantum Convolutional Neural Network (QCNN), a proposed quantum analogue to a classical convolutional neural network that is also translationally invariant.

Wow.

This example demonstrates how to detect certain properties of a quantum data source, such as a quantum sensor or a complex simulation from […]
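As a taste of what the tutorial's "quantum data" looks like, here is a cluster-state circuit built with Cirq alone; the actual tutorial wraps circuits like this in tensorflow-quantum layers, so treat this as my stripped-down sketch of the data-preparation step only.

```python
import cirq

def cluster_state_circuit(qubits):
    """Hadamard every qubit, then entangle nearest neighbors with CZ (ring)."""
    circuit = cirq.Circuit()
    circuit.append(cirq.H.on_each(*qubits))
    for a, b in zip(qubits, qubits[1:] + qubits[:1]):
        circuit.append(cirq.CZ(a, b))
    return circuit

qubits = cirq.GridQubit.rect(1, 4)
circuit = cluster_state_circuit(qubits)
print(circuit)
result = cirq.Simulator().simulate(circuit)  # inspect the prepared state
print(result.final_state_vector)
```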

Lex Fridman shared this lecture by Vivienne Sze in January 2020 as part of the MIT Deep Learning Lecture Series.

Website: https://deeplearning.mit.edu
Slides: http://bit.ly/2Rm7Gi1
Playlist: http://bit.ly/deep-learning-playlist

LECTURE LINKS:
Twitter: https://twitter.com/eems_mit
YouTube: https://www.youtube.com/channel/UC8cviSAQrtD8IpzXdE6dyug
MIT professional course: http://bit.ly/36ncGam
NeurIPS 2019 tutorial: http://bit.ly/2RhVleO
Tutorial and survey paper: https://arxiv.org/abs/1703.09039
Book coming out in Spring 2020!

OUTLINE:
0:00 – Introduction
0:43 – Talk overview
1:18 – Compute for deep learning
5:48 – Power consumption for deep learning, robotics, and AI
9:23 – Deep learning in the context of resource use
12:29 – Deep learning basics
20:28 – Hardware acceleration for deep learning
57:54 – Looking beyond the DNN accelerator for acceleration
1:03:45 – Beyond deep neural networks
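A useful companion to the "compute for deep learning" part of the talk is a back-of-the-envelope multiply-accumulate (MAC) count, since MACs drive both latency and energy. The numbers below are my own illustrative choices (an AlexNet-style first conv layer), not figures from the lecture.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a conv layer: output pixels x kernel area x in/out channels."""
    return h_out * w_out * k * k * c_in * c_out

# AlexNet-style conv1: 96 filters of 11x11x3 producing a 55x55 output map
macs = conv_macs(h_out=55, w_out=55, c_in=3, c_out=96, k=11)
print(f"{macs:,} MACs")  # ~105 million MACs for this single layer
```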

Here’s my talk from the Azure Data Fest Philly 2020 last week!

Neural networks are an essential element of many advanced artificial intelligence (AI) solutions. However, few people understand the core mathematical or structural underpinnings of this concept. In this session, learn the basic structure of neural networks and how to build out a simple neural network from scratch with Python.
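In the spirit of the session, here is one complete from-scratch network in plain NumPy: one hidden layer, sigmoid activations, trained by gradient descent on XOR. The details (loss, layer sizes, learning rate) are my own choices, not necessarily those used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (squared-error loss, chain rule through the sigmoids)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent update, learning rate 1.0
    W2 -= h.T @ d_out
    b2 -= d_out.sum(0)
    W1 -= X.T @ d_h
    b1 -= d_h.sum(0)

print(out.round(3))  # should approach [[0], [1], [1], [0]]
```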

What is the universal inference engine for neural networks?

Microsoft Research just posted this video exploring ONNX.

TensorFlow? PyTorch? Keras? There are many popular frameworks for working with deep learning and ML models, each with its pros and cons for practical usability in product development and/or research. Once you decide what to use and train a model, you need to figure out how to deploy it onto your platform and architecture of choice. Cloud? Windows? Linux? IoT? Performance sensitive? How about GPU acceleration? With a landscape of 1,000,001 different combinations for deploying a trained model from some chosen framework into a performant production environment for prediction, we can benefit from some standardization.
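ONNX's pitch in one sketch: train in whatever framework you like, export a portable graph, then run it with ONNX Runtime on the target of your choice. The toy model and shapes below are my own; the torch.onnx.export and onnxruntime calls are the standard public APIs.

```python
import torch
import onnxruntime as ort

# Train-side framework: a toy PyTorch model standing in for a real one
model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 2))
model.eval()

# Export to the framework-neutral ONNX format
dummy = torch.randn(1, 4)
torch.onnx.export(model, dummy, "toy.onnx",
                  input_names=["x"], output_names=["y"])

# Deploy-side: run inference through ONNX Runtime, independent of PyTorch
sess = ort.InferenceSession("toy.onnx", providers=["CPUExecutionProvider"])
print(sess.run(None, {"x": dummy.numpy()}))
```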