Mechanics and Evolutionary Process of LLMs
📌 Recommended Reference Materials for Those Who Want to Understand LLMs
Introduction
Large Language Models (LLMs) such as ChatGPT have attracted a great deal of attention in recent years, and many people want to start learning about them from scratch. For those readers, we have compiled a list of recommended reference materials (books, papers, etc.). We hope it helps.
The image below, generated by DALL-E, represents the history of LLMs.
Structure
We think it's easier to understand the evolution of LLMs if you study it in the following stages:
- Basics of Deep Learning: Essential for progressing to the next stages.
- Before Transformer: NLP (Natural Language Processing) and language models before the development of the Transformer.
- Transformer: The mechanics of the Transformer and its representative language models.
- Chatbot: The evolution of language models and the realization of chatbots.
- LLM: The evolution and challenges of LLMs.
Reference Materials
Basics of Deep Learning
Before Transformer
- Efficient Estimation of Word Representations in Vector Space (Word2vec)
- Sequence to Sequence Learning with Neural Networks
Transformer
- End-To-End Memory Networks
- Attention Is All You Need
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Chatbot
- Scaling Laws for Neural Language Models
- Language Models are Few-Shot Learners
- Improving Language Understanding by Generative Pre-Training
- Language Models are Unsupervised Multitask Learners
- BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
LLM
- A Survey of Large Language Models
- LaMDA: Language Models for Dialog Applications
- PaLM: Scaling Language Modeling with Pathways
- Improving alignment of dialogue agents via targeted human judgements
- LLaMA: Open and Efficient Foundation Language Models
- PaLM-E: An Embodied Multimodal Language Model
- On the Opportunities and Risks of Foundation Models
- GPT-4
- Llama 2: Open Foundation and Fine-Tuned Chat Models