Learning the Latent Structure in LLMs

Large Language Model

We will learn what large language models are and how they can be used as knowledge bases. By the end, we will build and train a BERT and a miniGPT entirely from scratch. Our miniGPT won't be as powerful as the GPT models out there, but we will learn how to improve it and, at the very end of the project, some of the techniques used to align such models towards instructions, if time permits :). The aim of the project is to make you so well versed in LLMs that you can build and train one from scratch on the go.

Prerequisites

Basics of probability theory, statistical machine learning, and Python

Week wise distribution of content

  1. Gentle introduction to NLP with word2vec, word embeddings, and distributional semantics
  2. Introduction to PyTorch and neural networks, convolutional layers and pooling, building CNNs and training them on dummy datasets
  3. Text classification, building generative and discriminative models
  4. Language modeling, N-gram LMs, neural LMs, evaluating LMs
  5. Building encoder-decoder models, autoencoders, and inference
  6. Introducing attention (Transformer: Attention Is All You Need) in encoder-decoders, building a transformer from scratch, Seq2Seq
  7. Transfer learning, replacing pre-trained word embeddings in GPT and BERT
  8. Building and training BERT and miniGPT in PyTorch from scratch
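
To give a flavour of the language modeling week, here is a minimal sketch of a bigram LM with add-one (Laplace) smoothing and a perplexity function for evaluating it. It is not part of the course material: the function names (`train_bigram`, `bigram_prob`, `perplexity`), the toy corpus, and the choice of add-one smoothing are all assumptions made for illustration, and it uses only the Python standard library.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is a list of tokens; <s> and </s> boundary markers are added.
    """
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)  # includes the boundary markers (toy simplification)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def perplexity(sentence, unigrams, bigrams):
    """Perplexity of one tokenized sentence: exp of the average negative log-probability."""
    tokens = ["<s>"] + sentence + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = sum(math.log(bigram_prob(p, w, unigrams, bigrams)) for p, w in pairs)
    return math.exp(-log_prob / len(pairs))

# Hypothetical toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train_bigram(corpus)
seen = perplexity(["the", "cat", "sat"], unigrams, bigrams)
shuffled = perplexity(["sat", "cat", "the"], unigrams, bigrams)
print(seen < shuffled)  # a sentence from the corpus should score lower perplexity
```

Lower perplexity means the model finds the sentence less surprising, which is why it is a common intrinsic metric for comparing LMs on held-out text.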

Checkpoints

  • Building and training CNNs in PyTorch
  • Building and training a Bayesian classifier
  • Building and training autoencoders for image generation
  • Building and training a transformer from scratch
  • Building and training BERT and miniGPT in PyTorch from scratch
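
As a taste of the transformer checkpoint, the sketch below computes scaled dot-product attention, softmax(QKᵀ / √d_k)·V, on plain Python lists. It is an illustrative assumption rather than the project's implementation: the names `softmax` and `scaled_dot_product_attention` are hypothetical, there is no masking, batching, or multi-head logic, and the project itself works in PyTorch rather than pure Python.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    queries: list of vectors of size d_k
    keys:    list of vectors of size d_k (one per value)
    values:  list of vectors of size d_v
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# A zero query is equally similar to every key, so the output is the mean of the values.
result = scaled_dot_product_attention(
    queries=[[0.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
print(result)
```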

Resources

To get started with Python: https://docs.python.org/3.11/tutorial/index.html

Python and NumPy tutorial: https://cs231n.github.io/python-numpy-tutorial/