Add ColQwen2 example #897
Conversation
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-7a1075b.modal.run
Shorter than 1 min would be nice. Where are we spending time here? Is it loading the models, indexing the PDFs, or something else?
I would suggest not optimizing inference time unless it's a durable improvement. I suspect vllm will resolve this issue soon, so I'd skip working on it for now.
This might be worth looking into. Getting onto A100-40s (and later L40Ses) would be really nice.
Cold start time is coming from loading the models.
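For context on the model-loading point: a common way to keep expensive loads out of the request path is to construct the model once per container process and reuse the cached object on every subsequent request. A minimal sketch of the pattern in plain Python (the loader and model names below are illustrative stand-ins, not the actual code in this PR):

```python
import functools


@functools.lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load the model once per process; later calls reuse the cached object.

    Stands in for an expensive from_pretrained()-style load. In a serverless
    container this cost is paid once at startup (or on the first request)
    rather than on every request.
    """
    return {"name": name, "weights": object()}  # placeholder for real weights


def answer(prompt: str) -> str:
    model = get_model("ColQwen2")  # cheap after the first call
    return f"{model['name']} reply to: {prompt}"
```

The same idea is what container-lifecycle hooks (loading in a startup method rather than inside the request handler) buy you on a serverless platform.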
Have reduced the cold start time.
What's the status?
@erik-dunteman to pick this one up
This one should be ready to go, pending the following:
Changes since I took the PR over:
Nice work! Will review quickly tomorrow.
@charlesfrye I'd like to disable the `keep_warm` on this, cool if I make that one-line change? (edit: below commit does this. Ask forgiveness, not permission)
Adds a chat-with-RAG example using the following things:

Some things I would do if I wanted people to actually use this in prod. Curious which of these people think are worth doing:

- Optimize the cold start (takes ~2 mins now on average).
- Optimize the inference time for the chat (currently ~10s). Would try vLLM, but there's an issue with vLLM and the `transformers` version that `ColQwen2` needs, so I'd have to build vLLM from source, which would increase build time.
- Try to make the app use less memory. I currently need an 80 GB A100, largely because, though the underlying model is the same, I couldn't find a clean way to use the same underlying object for the model, and so I end up having two model objects.
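On the memory point above: when the retrieval and generation wrappers share the same base checkpoint, one way to avoid holding two copies of the weights is to load the shared backbone once and hand the same object to both wrappers. A hedged sketch with stand-in classes (the real ColQwen2 and chat-model classes may not compose this cleanly, which is presumably the difficulty described above):

```python
class Backbone:
    """Stand-in for the shared underlying model weights."""

    def __init__(self) -> None:
        self.weights = object()  # imagine tens of GB of parameters here


class Retriever:
    """Embedding/retrieval head that borrows the backbone."""

    def __init__(self, backbone: Backbone) -> None:
        self.backbone = backbone  # borrow the object, don't copy the weights


class Generator:
    """Chat/generation head that borrows the same backbone."""

    def __init__(self, backbone: Backbone) -> None:
        self.backbone = backbone


shared = Backbone()            # weights loaded exactly once
retriever = Retriever(shared)  # both heads reference the same object,
generator = Generator(shared)  # so memory scales with one copy of the weights
```

Whether this works in practice depends on the actual model classes exposing a way to construct the head around an already-loaded backbone.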
Type of Change
Checklist
- `lambda-test: false` is added to example frontmatter (`---`)
- `modal run` or an alternative `cmd` is provided in the example frontmatter (e.g. `cmd: ["modal", "deploy"]`)
- `args` are provided in the example frontmatter (e.g. `args: ["--prompt", "Formula for room temperature superconductor:"]`)
- latest `python_version` for the base image, if it is used
- versions are pinned with `~=x.y.z` or `==x.y`; versions < 1 are pinned to patch version, `==0.y.z`
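For reference, the frontmatter the checklist refers to lives in a comment block at the top of an example's Python file. A sketch of what a deploy-only example's header might look like, using the same illustrative values as the checklist (the exact keys an example needs depend on how it runs):

```python
# ---
# lambda-test: false
# cmd: ["modal", "deploy"]
# args: ["--prompt", "Formula for room temperature superconductor:"]
# ---
```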
Outside contributors
You're great! Thanks for your contribution.