CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
Updated Mar 17, 2024 - Python
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS 2022).
🔥 🔥 🔥 Open Source & AI-driven Data Onboarding Platform: Free flatfile.com alternative
[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation
High-performance small model evaluation: Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task
Calculate perplexity on a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder LM (e.g. Flan-T5).
Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks. The initial version of the library focuses on the Polish language.
Code for "Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization"
A project that harnesses the Stanford NLP library to gauge sentiment from provided text via an intuitive graphical interface.
The PreTENS shared task hosted at SemEval 2022 focuses on semantic competence, with specific attention to the evaluation of language models with respect to the recognition of appropriate taxonomic relations between two nominal arguments (i.e. cases where one is a supercategory of the other, or in extensional terms, one denotes a superset…
A 78.5% word sense disambiguator based on Transformers and RoBERTa (PyTorch)
translatorlab: a machine translation tool that uses artificial intelligence models to provide accurate and fast translations between different languages
TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis
The project generates a sentence from a user-supplied starting phrase, such as "Ilbierah kont", which the script attempts to extend into a full sentence. Structurally, the generator works in an n-gram fashion; the main models used to generate the sentences are the unigram, bigram and trigram. The perplexity for each n-gr…
The PowerShell Random Text Generator is a script that generates random text based on a given model.
Simple next-word prediction model from scratch, implemented using only NumPy.
Personality test that classifies into four personality types. Classification uses a natural language processing algorithm, Multinomial Naive Bayes.