[catnip] Catnip runs out of memory at higher loads #862

Open
nimishwadekar opened this issue Aug 6, 2023 · 1 comment

Labels
bug (Something Isn't Working), confirmed (Issue Affects Multiple People)

Comments

@nimishwadekar

Description

Catnip panics with "failed to clone mbuf" at higher loads with multiple connections, and it does so on every run. The cause is that the DPDK mempool runs out of free mbufs to clone from, which I verified using rte_mempool_avail_count(). I tested with a custom static-page HTTP server example and the Caladan client generator at 150K requests per second over 100 connections. These numbers are specific to my setup, but there is always a threshold load above which the mempool runs out of memory. The issue did not occur on commit 803c363a765ae14f3eff2a9de021eabf23f8d10c, dated 30 September 2022.

The pattern stays similar every time:

1. The server works fine for a short while, and the free memory in the mempool stays roughly constant.
2. The free memory then starts decreasing. It drops in big jumps (presumably large allocations), recovers slowly for a while (presumably due to free()s), and then drops sharply again, so the net change over time is negative.
3. This continues until the mempool is out of memory and the server panics.
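
The pattern above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Catnip or DPDK code: `ToyPool` and `rounds_until_exhausted` are hypothetical names, and the pool is a plain counter standing in for the mempool. Each round makes one large burst allocation and then returns buffers one at a time. If fewer buffers come back than were taken, the free count falls in a sawtooth until an allocation fails.

```rust
/// Toy fixed-capacity buffer pool (hypothetical; a stand-in for a DPDK mempool).
struct ToyPool {
    capacity: usize,
    in_use: usize,
}

impl ToyPool {
    fn new(capacity: usize) -> Self {
        ToyPool { capacity, in_use: 0 }
    }

    /// Number of free buffers, analogous to what rte_mempool_avail_count() reports.
    fn available(&self) -> usize {
        self.capacity - self.in_use
    }

    /// Takes `n` buffers, or fails without taking any if fewer than `n` are free.
    fn alloc(&mut self, n: usize) -> Result<(), String> {
        if n > self.available() {
            return Err(format!(
                "failed to allocate {} buffers ({} free)",
                n,
                self.available()
            ));
        }
        self.in_use += n;
        Ok(())
    }

    /// Returns up to `n` buffers to the pool.
    fn free(&mut self, n: usize) {
        self.in_use -= n.min(self.in_use);
    }
}

/// Runs rounds of one burst allocation followed by `frees_per_round` single frees.
/// Returns the index of the round whose burst allocation failed, or `None`
/// if the pool survived all `max_rounds` rounds.
fn rounds_until_exhausted(
    capacity: usize,
    burst: usize,
    frees_per_round: usize,
    max_rounds: usize,
) -> Option<usize> {
    let mut pool = ToyPool::new(capacity);
    for round in 0..max_rounds {
        // Big jump down in free memory.
        if pool.alloc(burst).is_err() {
            return Some(round);
        }
        // Slow recovery: buffers come back one at a time.
        for _ in 0..frees_per_round {
            pool.free(1);
        }
    }
    None
}
```

With a capacity of 1024, bursts of 100, and 90 frees per round, `rounds_until_exhausted(1024, 100, 90, 1000)` returns `Some(93)`. With 100 frees per round the pool never runs out and the function returns `None`. All of these numbers are made up for illustration.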

Note: This issue does not occur with fewer connections. A possible reason is the O(N) time complexity of wait_any(). As the number of connections grows, so does the number of QTokens passed to wait_any(), which lengthens the interval between calls to poll(). Arrivals then back up at the NIC and are polled all at once, which requires a large allocation of DPDK memory. This still does not explain why the issue did not occur on the commit mentioned above, so something else has probably changed since then that causes this memory leak.
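
To make the hypothesis in the note concrete, here is a back-of-the-envelope sketch. It assumes the time between two poll() calls grows linearly with the number of QTokens scanned; the function names, the per-token scan cost, and the fixed cost are all hypothetical and were not measured on Catnip.

```rust
/// Hypothetical model: time between two poll() calls when every wait_any()
/// call scans all `num_qtokens` tokens at a fixed cost per token.
fn poll_interval_ns(num_qtokens: u64, scan_cost_ns_per_token: u64, fixed_cost_ns: u64) -> u64 {
    fixed_cost_ns + num_qtokens * scan_cost_ns_per_token
}

/// Packets that arrive at the NIC during one poll interval, rounded down.
/// This is the burst that a single poll() would have to buffer at once.
fn backlog_per_poll(arrival_rate_pps: u64, interval_ns: u64) -> u64 {
    arrival_rate_pps * interval_ns / 1_000_000_000
}
```

Under these assumed costs (1000 ns per token, 10,000 ns fixed) and the 150K requests per second from the description, `backlog_per_poll(150_000, poll_interval_ns(10, 1_000, 10_000))` gives 3 packets per poll, while 100 tokens gives 16. The absolute values mean little; the point is that in this model the burst size scales with the number of QTokens.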

@nimishwadekar nimishwadekar added the bug Something Isn't Working label Aug 6, 2023
@ppenna ppenna changed the title Catnip runs out of memory at higher loads [catnip] Catnip runs out of memory at higher loads Aug 14, 2023
@ppenna ppenna added the confirmed Issue Affects Multiple People label Feb 28, 2024
@anandbonde
Contributor

Fixing issue #1330 should alleviate the memory problem. We still need to verify whether there are any leaks when adding to and removing from the queue.

@anandbonde anandbonde reopened this Oct 2, 2024

3 participants