OpenAI’s GPT-3 is quite the feat of AI engineering and now we have Two Minute Papers’ take on it.
How far can you go with ONLY language modeling?
Can a large enough language model perform NLP tasks out of the box?
OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything built before, and the results are astounding.
Yannic Kilcher explores.
- 0:00 – Intro & Overview
- 1:20 – Language Models
- 2:45 – Language Modeling Datasets
- 3:20 – Model Size
- 5:35 – Transformer Models
- 7:25 – Fine Tuning
- 10:15 – In-Context Learning
- 17:15 – Start of Experimental Results
- 19:10 – Question Answering
- 23:10 – What I think is happening
- 28:50 – Translation
- 31:30 – Winograd Schemes
- 33:00 – Commonsense Reasoning
- 37:00 – Reading Comprehension
- 37:30 – SuperGLUE
- 40:40 – NLI
- 41:40 – Arithmetic Expressions
- 48:30 – Word Unscrambling
- 50:30 – SAT Analogies
- 52:10 – News Article Generation
- 58:10 – Made-up Words
- 1:01:10 – Training Set Contamination
- 1:03:10 – Task Examples
Paper: https://arxiv.org/abs/2005.14165
Computers just got a lot better at mimicking human language. Researchers created computer programs that can write long passages of coherent, original text.
Language models like GPT-2, Grover, and CTRL create text passages that seem written by someone fluent in the language, but not in the truth. The AI field behind them, natural language processing (NLP), didn’t exactly set out to create a fake news machine. Rather, it’s the byproduct of a line of research into massive pretrained language models: machine learning programs that store vast statistical maps of how we use our language. So far, the technology’s creative uses seem to outnumber its malicious ones. But it’s not difficult to imagine how these text-fakes could cause harm, especially as these models become widely shared and deployable by anyone with basic know-how.
Christoph Henkelmann (DIVISIO) explains what sets Google’s natural language processing model BERT apart from other language models, how a custom version can be implemented, and what the so-called ImageNet moment is.
Two Minute Papers explores OpenAI’s GPT-2.
Check out this GPT-2 implementation too (thanks Robert Miles for the link!) – write something, then tab, enter, tab, enter and so on: https://transformer.huggingface.co/doc/gpt2-large
OpenAI’s post: https://openai.com/blog/gpt-2-6-month-follow-up/
Tweet source: https://twitter.com/gdm3000/status/1151469462614368256
Since hearing about OpenAI’s decision not to release GPT-2 due to it “being too dangerous,” I have been puzzled by their decision to release the research that went into creating it. Furthermore, the idea of an organization called “OpenAI” hiding its best work to date seemed off. To me, it smelled like a publicity stunt. Rob Miles suggests why it might not just be a PR grab.
Computerphile’s Rob Miles takes a closer look at GPT-2, the AI deemed “too dangerous” by its creators to release. If you’ve not heard about it, it’s an AI that, given a bit of text to prime it, continues writing in a believable and coherent way.
Rob Miles on language models and transformers: plausible text generation, how it works, and what’s next.