# Human-Feedback-For-AI-awesome

We aim to maintain an up-to-date list of progress (papers, blog posts, code, etc.) in Human Feedback for AI (LLMs, text-to-image generation, and other tasks), and to provide a guide to some of the papers that have received wide interest. Please feel free to open an issue to add papers.

- Deep reinforcement learning from human preferences, NeurIPS'17. [paper]
- Recursively Summarizing Books with Human Feedback, arXiv'22. [paper]
- InstructGPT: Training Language Models to Follow Instructions With Human Feedback, NeurIPS'22. [paper] [video]
- Fine-tuning language models to find agreement among humans with diverse preferences, NeurIPS'22. [paper]
- Constitutional AI: Harmlessness from AI Feedback, arXiv'22. [paper]
- Training a helpful and harmless assistant with reinforcement learning from human feedback, arXiv'22. [paper]
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model, arXiv'23. [paper] [code] [blogs] (a minimal sketch of the DPO loss appears after this list)
- RRHF: Rank responses to align language models with human feedback without tears, arXiv'23. [paper] [code] [blogs]
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment, arXiv'23. [paper] [code] [blogs]
- Fine-Grained Human Feedback Gives Better Rewards for Language Model Training, arXiv'23. [paper] [code] [blogs]
- Fine-Tuning Language Models with Advantage-Induced Policy Alignment, arXiv'23. [paper]
- Scaling Laws for Reward Model Overoptimization, ICLR'23. [paper]
- Reward Collapse in Aligning Large Language Models, arXiv'23. [paper] [blogs]
- Chain of Hindsight Aligns Language Models with Feedback, arXiv'23. [paper]
- Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons, arXiv'23. [paper]
- Reinforcement Learning from Diverse Human Preferences, arXiv'23. [paper]
- Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback, arXiv'23. [paper]
- Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization, ICLR'23. [paper] [code]
- How to Query Human Feedback Efficiently in RL? arXiv'23. [paper]
- Pretraining Language Models with Human Preferences, ICML'23. [paper]
- Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback, arXiv'23. [paper]
- Aligning text-to-image models using human feedback, arXiv'23. [paper] [blogs]
- Better aligning text-to-image models with human preference, arXiv'23. [paper] [code]
- DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models, arXiv'23. [paper]
- ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation, arXiv'23. [paper] [code]
- AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment, arXiv'23. [paper] [code]
- AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence, arXiv'23. [paper] [code]
- Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation, arXiv'23. [paper]
- Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis, arXiv'23. [paper] [code]
- Aligning human preferences with baseline objectives in reinforcement learning, ICRA'23. [paper]
- Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training, ICML'21. [paper]
- Augmented Proximal Policy Optimization for Safe Reinforcement Learning, AAAI'23. [paper]
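
As a quick illustration of one of the methods listed above, here is a minimal sketch of the Direct Preference Optimization (DPO) loss from the DPO paper. It is not the authors' released code: the function name, tensor names (`policy_chosen_logps`, etc.), and the default `beta` are illustrative choices; only the loss formula itself follows the paper.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Sketch of the DPO objective (Rafailov et al., 2023).

    Each argument is a tensor of per-example summed log-probabilities of
    the chosen / rejected response under the trainable policy or the
    frozen reference model; `beta` scales the implicit KL constraint.
    """
    # Implicit rewards: how much more the policy prefers each response
    # than the reference model does.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic (Bradley-Terry) loss on the margin between chosen and rejected.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```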