
High memory consumption #3427

Open
arouene opened this issue Dec 11, 2024 · 6 comments

arouene commented Dec 11, 2024

Hello,

We have a Danswer host installed with Docker. The host has 64GB of RAM, but Vespa keeps getting killed by the OOM killer, and while it restarts, the backend can no longer reach Vespa.

Memory:

# free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi        42Gi       2.6Gi       148Mi        18Gi        20Gi
Swap:             0B          0B          0B

Vespa getting killed for OOM:

[Tue Dec 10 17:20:04 2024] oom_reaper: reaped process 163083 (vespa-proton-bi), now anon-rss:0kB, file-rss:208kB, shmem-rss:0kB

The backend cannot reach Vespa anymore:

ERROR:    12/11/2024 09:29:58 AM       handle_regular_answer.py  269: [Channel ID: D07EXAZES4U] Unable to process message - did not successfully answer in 5 attempts
Traceback (most recent call last):
  File "/app/danswer/document_index/vespa/chunk_retrieval.py", line 303, in query_vespa
    response.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 761, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '503 Service Unavailable' for url 'http://index:8081/search/'

If I restart the backend container, it starts working again.

Is it normal for Danswer to require more than 64GB of memory?

rkuo-danswer (Contributor) commented:

Hi Aroune, it depends entirely on the number and size of the documents you are indexing. Could you provide us with some more context?

arouene (Author) commented Jan 7, 2025

Hello, thanks for your interest in this boring OOM problem!

Here is a screenshot of our connectors:
[screenshot: list of configured connectors]

The web connectors point at internal websites: blogs, forums, and front-page-style sites.

Is that helpful?

rkuo-danswer (Contributor) commented:

That is definitely a lot of docs; it may be necessary to add RAM to support that. You can get a better feel for this by inspecting Vespa's memory usage via its metrics endpoint (see the sketch after the links below).

https://docs.vespa.ai/en/operations/metrics.html
https://stackoverflow.com/questions/68014005/which-all-metric-to-trace-to-determine-if-the-resource-needs-to-be-added
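
A minimal sketch of pulling memory-related metrics from Vespa, assuming the Danswer "index" container serves Vespa's standard /state/v1/metrics endpoint on the same port the backend already queries (8081, per the traceback above); the URL, port, and metric-name filter here are assumptions to adapt to your deployment:

# check_vespa_memory.py -- hypothetical helper, not part of Danswer.
# Assumes the Vespa "index" container exposes /state/v1/metrics on port 8081.
import httpx

VESPA_METRICS_URL = "http://index:8081/state/v1/metrics"  # assumed host/port

def print_memory_metrics() -> None:
    resp = httpx.get(VESPA_METRICS_URL, timeout=10)
    resp.raise_for_status()
    payload = resp.json()
    for metric in payload.get("metrics", {}).get("values", []):
        name = metric.get("name", "")
        # Keep only memory-related gauges, e.g. content.proton.resource_usage.memory
        if "memory" in name:
            print(name, metric.get("values"))

if __name__ == "__main__":
    print_memory_metrics()

Running it from the backend container is probably easiest, since that container already resolves the "index" hostname and ships httpx (as the traceback above shows).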

arouene (Author) commented Jan 10, 2025

OK, so as I understand it, this seems pretty legit.
Thanks for the docs, I will take a peek!

JonnyPower commented:
@rkuo-danswer we are experiencing a similar issue. Is it expected that the index container's memory usage grows proportionally to the number of indexed documents?

That seems like a poor design decision if that's the case; it acts more like a memory leak. Isn't the Vespa database meant to avoid loading every document into RAM?

rkuo-danswer (Contributor) commented:

Not really ... in fact, keeping this data in memory is a key part of being able to perform similarity searches across documents quickly. There are probably some significant optimizations we can apply here, but generally speaking this is expected behavior.
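
To get an intuition for why memory scales with corpus size: the embedding of every indexed chunk typically sits in RAM for the HNSW index that powers similarity search. A rough back-of-the-envelope estimate, in which the chunk count, embedding dimension, and overhead multiplier are all assumptions to replace with your own numbers:

# Rough estimate of resident memory needed for in-memory vector search.
# Every constant below is an assumption -- plug in your own values.
NUM_CHUNKS = 10_000_000   # assumed total indexed chunks (not documents)
EMBEDDING_DIM = 768       # assumed dimension of a base-size embedding model
BYTES_PER_FLOAT = 4       # float32 vectors
OVERHEAD = 1.5            # rough multiplier for HNSW graph + attribute data

vector_bytes = NUM_CHUNKS * EMBEDDING_DIM * BYTES_PER_FLOAT
print(f"~{vector_bytes * OVERHEAD / 1024**3:.1f} GiB for vectors alone")
# -> ~42.9 GiB with these assumed numbers, before query and feed buffers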
