Replies: 1 comment 1 reply
-
I am 100% with you on this -- we are trying to exploit the gradient abstraction by swapping in human feedback instead of LLM feedback. This is on our roadmap!
-
The biggest limitation that I currently see for applying this framework to my production generative AI systems is simply that LLMs are not very good judges. It is much easier to get a model to give a good output than it is to get it to reliably recognize a good output.
I think that the current implementation is fantastic for reducing the cost of use cases that are clearly within reach of AI capabilities, such as using a more expensive model to train a cheaper model to do a better job.
I am personally much more interested in use cases where we are pushing the best models to their peak performance. To do this I think there needs to be a way to incorporate human evaluation into textgrad. Even just a few optimization cycles with expert human input as the evaluation criteria could quickly unlock novel capabilities that may have been previously undiscovered.
In terms of implementation, I think this could be as simple as adding a human input field, which the "teacher" would incorporate as context into the optimization step. It should be fairly straightforward to prompt the teacher to defer to the human evaluation instead of making up its own; a rough sketch of what this might look like is included after this comment.
Obviously, this approach doesn't scale very well, but again, I think that a few optimization steps with thoughtful human expert feedback could really unlock huge gains for many real-world applications.
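For what it's worth, here is a minimal sketch of the idea, assuming TextGrad's documented Variable / TextLoss / TGD interface (as shown in the README quickstart, with TextLoss taking an instruction string). The `collect_human_feedback` helper, the engine name, the loop count, and the exact prompt wording are hypothetical and only illustrate deferring to the expert's critique instead of an LLM judge's own criteria:

```python
# Human-in-the-loop sketch (hypothetical usage; assumes TextGrad's public
# Variable / TextLoss / TGD API as described in its README quickstart).
import textgrad as tg

tg.set_backward_engine("gpt-4o", override=True)  # "teacher" engine that produces textual gradients

# The text we want to optimize, e.g. a draft output from the production system.
draft = tg.Variable(
    "Initial draft produced by the production system...",
    requires_grad=True,
    role_description="draft output to refine with expert feedback",
)

optimizer = tg.TGD(parameters=[draft])

def collect_human_feedback(text: str) -> str:
    """Hypothetical helper: show the current draft to a domain expert and return their critique."""
    print(text)
    return input("Expert feedback: ")

# A handful of optimization steps, each driven by expert evaluation rather than an LLM judge.
for _ in range(3):
    feedback = collect_human_feedback(draft.value)

    # Build the evaluation instruction around the human critique and tell the
    # teacher to defer to it rather than inventing its own judgment.
    loss_fn = tg.TextLoss(
        "You are given expert human feedback on the text. Defer to this feedback; "
        "do not add your own evaluation criteria. Expert feedback: " + feedback
    )

    loss = loss_fn(draft)   # textual "loss" grounded in the expert's critique
    loss.backward()         # teacher converts the critique into textual gradients
    optimizer.step()        # rewrites the draft using those gradients
```

The only change from the standard loop is that the evaluation instruction is constructed from the expert's feedback at each step, so the gradient abstraction is reused as-is.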