
Automated feedback loop #75

Open
1 task
luarss opened this issue Oct 21, 2024 · 28 comments
@luarss (Collaborator) commented Oct 21, 2024

Think about an automatic mechanism for feeding the feedback we get from users back into our evaluation dataset.

Use it as part of our dataset's ground truth (good or bad).

Tasks

For this, we need to institute more metrics (tracked in #18).

Explicit feedback

  • Thumbs up / thumbs down. But this kind of feedback is rare.

Implicit feedback

  • Inferred from user behavior, such as response time, follow-up questions, or even actions like copying and pasting responses. Implicit feedback is far more abundant and can provide valuable insight into how well the LLM is performing. A rough sketch of turning these signals into a good/bad label follows after the references below.
  1. https://towardsdatascience.com/how-to-make-the-most-out-of-llm-production-data-simulated-user-feedback-843c444febc7
  2. https://www.nebuly.com/blog/llm-feedback-loop
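
A minimal sketch of that idea, assuming we collapse the signals into a good/bad label for the eval set; the field names and rules below are illustrative, not an agreed design:

```python
# Hypothetical sketch: deriving a ground-truth label for the eval dataset
# from explicit (thumbs) and implicit (behavioral) feedback signals.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackSignals:
    thumbs_up: Optional[bool] = None       # explicit: True/False, None if not given
    copied_response: bool = False          # implicit: user copied the answer
    asked_followup_rephrase: bool = False  # implicit: user re-asked the same question


def ground_truth_label(signals: FeedbackSignals) -> Optional[str]:
    """Return 'good', 'bad', or None if there is not enough signal."""
    if signals.thumbs_up is not None:       # explicit feedback wins when present
        return "good" if signals.thumbs_up else "bad"
    if signals.copied_response:             # copying usually implies the answer was useful
        return "good"
    if signals.asked_followup_rephrase:     # rephrasing the question suggests a miss
        return "bad"
    return None
```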
@Kannav02 (Collaborator)

Hey @luarss!

How would you want the feedback to be stored? Do you want a feeder pipeline that feeds back into the evaluation dataset, so that each subsequent prompt answer is influenced by these metrics?

Do you also want the thumbs up / thumbs down feature to be implemented? I believe the functionality would be similar to how LLMs like ChatGPT take feedback, maybe once every 100 prompts or so?

If those are the requirements, can I work on this issue?

thank you!

@luarss (Collaborator, Author) commented Oct 28, 2024

Hi @Kannav02, thanks for your interest.

We have some implementation of the thumbs up/down interface in this PR: #61, done by Aviral (@error9098x). That should be merged soon.

What we need is a convenient way to store and access this feedback programmatically. Currently we use Google Sheets, but that isn't the most programming-friendly database. This feedback could be thumbs up/down, or actual text (suggestions). Happy to hear your suggestions on this.

At a minimum, we would need to store: Question, Response, Sources, Date/Time, Thumbs up/down, Feedback, Conversation history
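
For illustration, that minimal record could look something like this as a Pydantic model (assuming a Pydantic-based Python backend; the field names are placeholders, not a decided schema):

```python
# Illustrative sketch of the minimal feedback record described above.
from datetime import datetime, timezone
from typing import Optional

from pydantic import BaseModel, Field


class FeedbackRecord(BaseModel):
    question: str
    response: str
    sources: list[str] = Field(default_factory=list)
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    thumbs_up: Optional[bool] = None      # explicit thumbs up/down, if given
    feedback_text: Optional[str] = None   # free-form suggestions from the user
    conversation_history: list[dict] = Field(default_factory=list)
```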

@Kannav02 (Collaborator)

Since the data is pretty structured, I believe something like this could be stored in a SQL-based database, maybe PostgreSQL.

I would consider MongoDB as well, since we should account for scalability given the amount of prompt feedback we will have. The trade-off is that it is schemaless, although we could work around that by enforcing the schema in the code itself.

Initially, when I saw this problem I thought Redis could be a game-changer given its fast-query capabilities, but there are quite a few fields here and I believe it would just complicate things.

In the end I believe it boils down to PostgreSQL vs. MongoDB. If it is only for the prompt services and feeding feedback back to the AI, I would go with MongoDB; but if you have other use cases for it, we might have to consider query consistency across the codebase, and then maybe PostgreSQL.
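
For context, storing and pulling such records from Python is only a few lines with pymongo; the connection string, database, and collection names below are placeholders:

```python
# Minimal sketch of storing and querying feedback with pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
feedback = client["orassistant"]["feedback"]

# Store one feedback document (schemaless, so the shape is enforced in code).
feedback.insert_one({
    "question": "How do I run global placement?",
    "response": "...",
    "sources": [],
    "thumbs_up": True,
    "feedback_text": None,
})

# Pull everything with a thumbs-down for review / the eval dataset.
negative = list(feedback.find({"thumbs_up": False}))
print(len(negative), "negative examples")
```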

@luarss (Collaborator, Author) commented Oct 29, 2024

I think MongoDB might be preferable, since our feedback might evolve over time and schemas could change. A nice checkpoint could be DB hosting and a basic web UI for filtering, querying, and sorting.

I know Postgres is relatively easy to set up, but I have no experience with Mongo. What is the setup and maintenance like?

@Kannav02 (Collaborator)

Postgres is good when we have laid-down schemas; even then we can have migrations, meaning that if you change fields or columns, an ORM like Prisma would help us with the migrations. But seeing the requirements of this feature now, I believe MongoDB would be the right choice.

If you agree, I can start working on the hosting part. As for the UI, what would you expect from it? Do you want an interactive dashboard-style UI to interact with MongoDB?

Thank you!

@luarss (Collaborator, Author) commented Oct 29, 2024

Some milestones:

  1. Interactive web UI
  2. Integration with ORAssistant: pending the PR merge, or you can try a small prototype with our existing Streamlit code in frontend/
  3. Export the dataset using Python

That would be a good amount for a PoC; then we can decide whether or not to go further with MongoDB/Postgres.

Maybe this will work? https://github.com/mongo-express/mongo-express
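
For reference, a minimal docker-compose sketch for running MongoDB together with mongo-express; the image tags, ports, and the absence of authentication here are simplifications, not a recommended production setup:

```yaml
# Sketch only: MongoDB plus the mongo-express admin UI.
services:
  mongodb:
    image: mongo:7
    ports:
      - "27017:27017"
    volumes:
      - mongo-data:/data/db

  mongo-express:
    image: mongo-express:latest
    ports:
      - "8081:8081"
    environment:
      ME_CONFIG_MONGODB_URL: mongodb://mongodb:27017
    depends_on:
      - mongodb

volumes:
  mongo-data:
```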

@Kannav02 (Collaborator)

So right now you wouldn't want this issue to focus on DB integration before the milestones, right?

And as for the web UI, what would you want in it? Is it related to the DB operations?

I can start working on these milestones if you want me to.

Thank you!

@luarss (Collaborator, Author) commented Oct 31, 2024

DB integration can be done hand in hand with the web UI; otherwise the web UI will not show any meaningful information. I would suggest spending most of the time on the DB integration and using a pre-built frontend like mongo-express that we can run as a Docker container.

Yes, you can start working on it :)

@Kannav02 (Collaborator)

Sure, I will start working on this and keep posting regular updates on this issue.

@Kannav02 (Collaborator) commented Nov 4, 2024

Hey @luarss,

Hope you're doing well.

I just had a question regarding the GOOGLE_APPLICATION_CREDENTIALS key in the backend folder: what is it supposed to do? Is it the one that provides access to service accounts, and is there a guide I could follow?

Thank you!

@error9098x (Collaborator)

Hi @Kannav02, currently we are using GOOGLE_APPLICATION_CREDENTIALS to access a service account that grants us access to use Gemini without the rate limit that is imposed when using the AI Studio platform. I hope this clarifies things.
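
For context, a minimal sketch of how that credential gets picked up by the Google client libraries; the path below is a placeholder (it just has to match wherever the key file actually lives):

```python
# Sketch: Google client libraries read GOOGLE_APPLICATION_CREDENTIALS automatically.
import os

import google.auth

# Point the libraries at the service-account key file (placeholder path).
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "backend/src/secret.json"

# Application Default Credentials now resolve to that service account.
credentials, project_id = google.auth.default()
print("Authenticated for project:", project_id)
```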

@Kannav02 (Collaborator) commented Nov 4, 2024

That's perfect, I appreciate your help

Thank you!

@Kannav02 (Collaborator) commented Nov 8, 2024

Just a quick update on this: I am still integrating the DB. One follow-up question: would you want a single big PR comprising all the features, or individual PRs for different features referencing this issue?

@luarss (Collaborator, Author) commented Nov 8, 2024

Individual, feature-based PRs are preferred and will be easier to review.

@Kannav02 (Collaborator)

Hey, so I was making and testing my MongoDB changes using the mock server, but when I switched to the main backend I got the error below. Is this the expected behavior right now with the backend, or should I look into something else?

(screenshot of the error, 2024-11-11 1:36 AM)

@luarss (Collaborator, Author) commented Nov 11, 2024

The main backend won't work because you need the Google credentials. Streamlit is entirely frontend-based, so you should be okay to proceed with the mock backend (to silence the errors).

@Kannav02 (Collaborator)

I did get the Google credentials on my end and then tried to run it using Docker Compose with those credentials. It failed at first due to some IAM issues, but I fixed that and the backend came up with docker-compose. Is it supposed to work with a certain Google project/credentials?

Thank you for your support

@luarss (Collaborator, Author) commented Nov 11, 2024

This is an example of the JSON credential you need. On top of this, you also need Vertex AI enabled for this credential; I can supply it to you once we are past the prototype stage.

Can I get your email please? You can drop me a PM.

(screenshot: example service-account JSON credential)
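
For reference, a service-account key file generally has this shape; every value below is a placeholder:

```json
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "0123456789abcdef",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account-name@your-project-id.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```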

@Kannav02 (Collaborator)

Thank you for the help

I did download the credentials file for the service account from my Google account. I have given it access to Vertex AI for the backend and the Sheets API for the frontend; I had a couple of free credits, so I decided to use those. However, the backend didn't respond to requests from the frontend.

I'll PM you as well

Once again thank you for the support!

@luarss (Collaborator, Author) commented Nov 11, 2024

Did you perhaps put the credentials in the correct directory?

For Docker Compose, you would have to ensure the file is called backend/src/secret.json.

@Kannav02 (Collaborator)

I believe the credentials were in the right directory, as the backend did compile and start to run (I have attached an image). It's just that the frontend was sending requests to an API endpoint that apparently doesn't exist on the backend, as I get a 404 Not Found error in the terminal.

(screenshot of the terminal output, 2024-11-11 11:52 AM)

@luarss (Collaborator, Author) commented Nov 11, 2024

I see, the Streamlit frontend is outdated and calling the old backend: https://github.com/The-OpenROAD-Project/ORAssistant/blob/master/backend/src/api/routers/chains.py

You can update it as part of the PR.

@Kannav02 (Collaborator)

Sure, I'll open a PR for this by today.

It's really wonderful to see minor inconsistencies causing code breaks, lol.

Thank you for your help on this, I appreciate it.

@Kannav02 (Collaborator)

I found the issue with the request being sent: in main.py in the API folder, where the routers are added, the router for chains wasn't being included, so I have added it now.
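
For illustration, this is the kind of wiring described above, assuming the backend is a FastAPI app and that chains.py exposes a `router` object (module paths here are simplified):

```python
# Sketch: making sure the chains router is actually registered on the app.
from fastapi import FastAPI

from .routers import chains  # e.g. the module at backend/src/api/routers/chains.py

app = FastAPI()
app.include_router(chains.router)  # this registration was the missing piece
```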

On another note, should we also add bind mounts to the Docker setup? I have found that if I make any changes, I have to restart docker-compose, and the container takes about 10-15 minutes to set up on my machine. Would this be a useful addition, considering the container would then respond to any changes made locally on the machine?

@luarss (Collaborator, Author) commented Nov 12, 2024

@Kannav02 We excluded it because in production we only have 1 chain running, the agent-retriever.

For bind mounts, I am fine with that as long as it is a separate target (maybe make-docker-dev) and a separate docker-compose file if needed.
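
Something along these lines could work as the separate dev compose file; this is only a sketch, and the service name, build context, container paths, and the make-docker-dev wiring are assumptions about the repo layout:

```yaml
# docker-compose.dev.yml (sketch): bind-mount the backend source so local edits
# are visible inside the container without rebuilding the image.
services:
  backend:
    build: ./backend
    volumes:
      - ./backend/src:/app/src      # bind mount: container picks up local changes
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /app/src/secret.json
```

A make-docker-dev target could then simply wrap `docker compose -f docker-compose.dev.yml up --build`.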

@Kannav02 (Collaborator)

Sure, I will make a separate docker-compose file for that.

Just a quick update as well: I have finally tested the entire flow of getting the feedback into Google Sheets. I am uploading an image for your reference.

(screenshot of the feedback flow result, 2024-11-12 1:57 PM)

Also, to follow up on the previous point: would the chains/listAll endpoint be used later on? There are endpoints in it that are still not implemented, like chains/ensemble. Should I still include it in my changes?

@luarss (Collaborator, Author) commented Nov 13, 2024

It could be used later on, but for now you may modify the frontend to do your tests.

@Kannav02 (Collaborator)

Hey, sorry for the late update, I was busy with school, but I have now opened a PR which gives an idea of how the schema would work. Do let me know if you want me to make any changes to it.

Once the PR is approved, I will start working on integrating and migrating away from Google Sheets, and on optimizing the design and the solution as well.

Just a side note: would you still want the Google Sheets functionality to remain, to support previous feedback, or once MongoDB is in place do you want me to remove the Google Sheets functionality entirely?

The PR linked to this comment is #100.
