Half Ideas – Startups and Entrepreneurship takes a closer look at GPT-3 and what it means for AI.

GPT-3 can write poetry, translate text, chat convincingly, and answer abstract questions. It’s already being used for coding, design, and much more. I’ll demo some of the latest applications of this technology and explain some of how it works.

GPT-3 has been in development for a number of years. One of the early papers behind it was on generative pre-training. The idea behind generative pre-training (GPT) is that while most AI models are trained on labeled data, there is a vast amount of text that isn’t labeled. If you train the model to predict the next word from the words that came before it, the unlabeled text itself supplies the training signal, and you repeat the process until the predictions start to converge.
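To make that concrete, here is a minimal sketch of a next-token-prediction training loop, using a toy character-level model in PyTorch. The model, context length, and training text are all illustrative assumptions, nothing here is from the GPT-3 paper itself, but the shape of the loop (unlabeled text in, next-token predictions out, cross-entropy loss, repeat) is the same idea:

```python
# Toy sketch of generative pre-training: predict the next token from unlabeled text.
# All names and hyperparameters below are illustrative, not from the paper.
import torch
import torch.nn as nn

text = "the quick brown fox jumps over the lazy dog " * 20   # stand-in for unlabeled text
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text])

context_len = 8

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim * context_len, vocab_size)

    def forward(self, x):                    # x: (batch, context_len) token ids
        h = self.embed(x).flatten(1)         # (batch, context_len * dim)
        return self.head(h)                  # logits over the next token

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Every position in the raw text yields a free training pair: (previous tokens -> next token).
xs = torch.stack([data[i:i + context_len] for i in range(len(data) - context_len)])
ys = data[context_len:]

for step in range(200):
    logits = model(xs)
    loss = loss_fn(logits, ys)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

GPT-3 does the same thing at a vastly larger scale, with a transformer instead of this toy network and hundreds of billions of tokens of text instead of one repeated sentence.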

How far can you go with ONLY language modeling?

Can a large enough language model perform NLP tasks out of the box?

OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything built before, and the results are astounding.
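“Out of the box” here means in-context (few-shot) learning: the task is described purely as text, a few examples follow, and the model is simply asked to continue, with no gradient updates. The snippet below is an illustrative prompt in that style; the translation examples mirror the ones shown in the paper’s figures, but the exact wording is my own:

```python
# Illustrative few-shot prompt: task demonstrations are given as plain text,
# and the model is asked to complete the pattern. No fine-tuning involved.
prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)
# A large enough language model, asked to continue this text, tends to emit
# "fromage" -- doing translation from context alone.
print(prompt)
```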

Yannic Kilcher explores.

Paper: https://arxiv.org/abs/2005.14165
Code: https://github.com/openai/gpt-3

Time index:

  • 0:00 – Intro & Overview
  • 1:20 – Language Models
  • 2:45 – Language Modeling Datasets
  • 3:20 – Model Size
  • 5:35 – Transformer Models
  • 7:25 – Fine Tuning
  • 10:15 – In-Context Learning
  • 17:15 – Start of Experimental Results
  • 19:10 – Question Answering
  • 23:10 – What I think is happening
  • 28:50 – Translation
  • 31:30 – Winograd Schemas
  • 33:00 – Commonsense Reasoning
  • 37:00 – Reading Comprehension
  • 37:30 – SuperGLUE
  • 40:40 – NLI
  • 41:40 – Arithmetic Expressions
  • 48:30 – Word Unscrambling
  • 50:30 – SAT Analogies
  • 52:10 – News Article Generation
  • 58:10 – Made-up Words
  • 1:01:10 – Training Set Contamination
  • 1:03:10 – Task Examples