LLM

A Not-So-Large-Yet Language Model.

This is a decoder-only generative transformer written from scratch.
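"Decoder-only" means each position may attend only to itself and to earlier positions, which is what allows the model to generate text one token at a time. The sketch below illustrates that causal masking for a single attention head in plain Python. It is an illustration under simplifying assumptions (no learned projections, no batching, no multi-head split), not the code used in this repo; the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (lists of floats); q and k share
    dimension d. Position t attends only to positions 0..t.
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Only score the current and earlier positions: this is the causal mask.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out


q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = causal_self_attention(q, k, v)
# The first position can only see itself, so out[0] equals v[0].
```

Because the first position sees no later tokens, changing `v[1]` leaves `out[0]` unchanged, which is the property that makes autoregressive sampling possible.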

To run:

Create a GPU-enabled environment (env/venv).
Clone this repo.
Run pip install -r requirements.txt
Check the run.py file for general training and sampling example(s).

You can find the link to the poems dataset here

The DataLoader class implements functions to load your own txt file, merge all txt files in a source directory into a single file, and split an input data file into train and val files. As for the tokenizer, the model currently supports only a character-level ASCII mapping and OpenAI's tiktoken BPE tokenizer, which works at the sub-word level.
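The data handling and character-level tokenization described above can be sketched roughly as follows. This is a minimal standalone illustration using only the standard library, not the repo's DataLoader or tokenizer code: `merge_txt_files`, `split_text`, `CharTokenizer`, and the `val_fraction` parameter are hypothetical names, and the real classes may differ in signatures and behavior.

```python
from pathlib import Path


def merge_txt_files(source_dir, merged_path):
    """Concatenate every .txt file in source_dir into a single file."""
    source_dir = Path(source_dir)
    merged_path = Path(merged_path)
    parts = [
        p.read_text(encoding="utf-8")
        for p in sorted(source_dir.glob("*.txt"))
        if p.resolve() != merged_path.resolve()  # skip a previous merge output
    ]
    merged_path.write_text("\n".join(parts), encoding="utf-8")
    return merged_path


def split_text(text, val_fraction=0.1):
    """Split text into contiguous (train, val) parts."""
    cut = len(text) - round(len(text) * val_fraction)
    return text[:cut], text[cut:]


class CharTokenizer:
    """Character-level tokenizer: each ASCII code point is its own token id."""

    vocab_size = 128

    def encode(self, text):
        ids = [ord(ch) for ch in text]
        if any(i >= self.vocab_size for i in ids):
            raise ValueError("non-ASCII character in input")
        return ids

    def decode(self, ids):
        return "".join(chr(i) for i in ids)


tok = CharTokenizer()
ids = tok.encode("Hi!")          # [72, 105, 33]
text = tok.decode(ids)           # "Hi!"
train, val = split_text("a" * 100, val_fraction=0.2)  # 80 and 20 characters
```

A character-level vocabulary is tiny but produces long sequences. A sub-word BPE tokenizer such as tiktoken trades a much larger vocabulary for shorter sequences. It is not shown here because it needs the external package.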

Additionally, if you're a stranger who found your way here, I assume you're interested in language models or all things NLP. Instead of relying on a readme, I think you should explore the code yourself, which could help you get a better understanding of how the model works :). Cheers!


AdarshAcharya5/LLM

Folders and files

Latest commit

History

Repository files navigation

LLM

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages