
Deliberate reject_messages_on_memory_ratio and evict_cache_on_memory_ratio #9743

Open
CalvinNeo opened this issue Dec 25, 2024 · 1 comment
Labels
type/bug The issue is confirmed as a bug.

Comments

@CalvinNeo
Member

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

@CalvinNeo CalvinNeo added the type/bug The issue is confirmed as a bug. label Dec 25, 2024
@CalvinNeo
Member Author

CalvinNeo commented Dec 25, 2024

Related Issues:
#6332
#6399

After the fix in https://github.com/pingcap/tidb-engine-ext/pull/244/files, the memory management of TiFlash can be sorted into two cases:

  • Restart case
    In this case, many raft logs could be read from disk and consume a lot of memory, even if there is no new input, so evict_cache_on_memory_ratio will take effect.
  • Normal case
    In this case, the smaller reject_messages_on_memory_ratio will take effect.

There have been observations that reject_messages_on_memory_ratio could be too conservative.

However, after tikv/tikv#17488, TiKV would evict the entry cache before the reject_messages_on_memory_ratio threshold is reached.

// MEMORY_HIGH_WATER_MARGIN_SPLIT_POINT is used to decide whether to use a fixed
// margin or a percentage-based margin, ensuring a reasonable margin value on
// both large and small memory machines.
const MEMORY_HIGH_WATER_MARGIN_SPLIT_POINT: u64 = 10 * 1024 * 1024 * 1024; // 10GB
const MEMORY_HIGH_WATER_FIXED_MARGIN: u64 = 1024 * 1024 * 1024; // 1GB
const MEMORY_HIGH_WATER_PERCENTAGE_MARGIN: f64 = 0.9; // Empirical value.
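As a hedged sketch of how these constants could be combined (the function name `near_high_water_threshold` is illustrative, not TiKV's actual code): a machine whose high water mark exceeds the 10 GiB split point keeps a fixed 1 GiB margin, while a smaller machine uses 90% of the mark, so the margin stays reasonable at both scales.

```rust
const MEMORY_HIGH_WATER_MARGIN_SPLIT_POINT: u64 = 10 * 1024 * 1024 * 1024; // 10GB
const MEMORY_HIGH_WATER_FIXED_MARGIN: u64 = 1024 * 1024 * 1024; // 1GB
const MEMORY_HIGH_WATER_PERCENTAGE_MARGIN: f64 = 0.9;

// Illustrative sketch: the usage level considered "near" the high water.
fn near_high_water_threshold(high_water: u64) -> u64 {
    if high_water >= MEMORY_HIGH_WATER_MARGIN_SPLIT_POINT {
        // Large machine: keep a fixed 1 GiB of headroom.
        high_water - MEMORY_HIGH_WATER_FIXED_MARGIN
    } else {
        // Small machine: keep 10% of the mark as headroom.
        (high_water as f64 * MEMORY_HIGH_WATER_PERCENTAGE_MARGIN) as u64
    }
}

fn main() {
    let gib: u64 = 1024 * 1024 * 1024;
    // 32 GiB high water: fixed margin, threshold is 31 GiB.
    assert_eq!(near_high_water_threshold(32 * gib), 31 * gib);
    // 8 GiB high water: percentage margin, threshold is 90% of the mark.
    assert_eq!(near_high_water_threshold(8 * gib), ((8 * gib) as f64 * 0.9) as u64);
    println!("ok");
}
```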

So on TiKV, there are two mechanisms:

  • reject msgAppend (needs_reject_raft_append)
    When global memory usage reaches the water limit, and raft_msg_usage + cached_entries + applying_entries reaches reject_messages_on_memory_ratio.
  • evict entry cache
    When global memory usage is near the water limit (90%), and the entry cache usage reaches evict_cache_on_memory_ratio.
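The reject side described in the first bullet can be sketched as follows; this is a hedged reconstruction, and the signature and parameters here are assumptions for illustration, not TiKV's actual API:

```rust
// Illustrative sketch of the reject-msgAppend check described above.
fn needs_reject_raft_append(
    reject_messages_on_memory_ratio: f64,
    global_usage: u64,       // current global memory usage
    high_water: u64,         // the memory high water limit
    raft_related_usage: u64, // raft_msg_usage + cached_entries + applying_entries
) -> bool {
    if reject_messages_on_memory_ratio < f64::EPSILON {
        return false; // a zero ratio disables rejection
    }
    // Both conditions from the bullet must hold: usage at the water limit,
    // and raft-related memory above its share of the limit.
    global_usage >= high_water
        && raft_related_usage as f64 > high_water as f64 * reject_messages_on_memory_ratio
}

fn main() {
    let gib: u64 = 1024 * 1024 * 1024;
    // At the water limit with raft usage above 5% of it: reject.
    assert!(needs_reject_raft_append(0.05, 12 * gib, 12 * gib, gib));
    // Below the water limit: no reject, even with large raft usage.
    assert!(!needs_reject_raft_append(0.05, 10 * gib, 12 * gib, 2 * gib));
    println!("ok");
}
```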

And the logic for eviction is

pub fn needs_evict_entry_cache(evict_cache_on_memory_ratio: f64) -> bool {
    fail_point!("needs_evict_entry_cache", |_| true);

    if evict_cache_on_memory_ratio < f64::EPSILON {
        return false;
    }

    let mut usage = 0;
    let is_near = memory_usage_reaches_near_high_water(&mut usage);
    if !is_near {
        return false;
    }

    let ec_usage = get_memory_usage_entry_cache();
    ec_usage as f64 > usage as f64 * evict_cache_on_memory_ratio
}

As a result, if the total memory usage reaches 0.9 of the high water mark and the entry cache usage reaches evict_cache_on_memory_ratio, eviction can happen.
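The condition in needs_evict_entry_cache can be restated as a side-effect-free function, which makes the threshold easy to test; the parameter names below are illustrative assumptions standing in for the fail point and the global memory probes, not TiKV's actual API:

```rust
// Hedged restatement of needs_evict_entry_cache with the global probes
// replaced by explicit parameters (names are illustrative).
fn would_evict(
    evict_cache_on_memory_ratio: f64,
    near_high_water: bool,  // memory_usage_reaches_near_high_water(&mut usage)
    global_usage: u64,      // usage reported by that probe
    entry_cache_usage: u64, // get_memory_usage_entry_cache()
) -> bool {
    if evict_cache_on_memory_ratio < f64::EPSILON {
        return false; // a zero ratio disables eviction
    }
    if !near_high_water {
        return false; // total memory still below ~90% of the high water
    }
    // Evict only when the cache is a large enough share of current usage.
    entry_cache_usage as f64 > global_usage as f64 * evict_cache_on_memory_ratio
}

fn main() {
    let gib: u64 = 1024 * 1024 * 1024;
    // Near the high water with the cache above 10% of usage: evict.
    assert!(would_evict(0.1, true, 10 * gib, 2 * gib));
    // Not near the high water: never evict, regardless of cache size.
    assert!(!would_evict(0.1, false, 10 * gib, 9 * gib));
    // A ratio of zero disables eviction entirely.
    assert!(!would_evict(0.0, true, 10 * gib, 9 * gib));
    println!("ok");
}
```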

According to our current settings, and supposing all raft memory is introduced by cached_entries:

  • If the total memory reaches memory_high_water and the entry cache reaches memory_high_water * 0.05, reject msgAppend.
  • If the total memory reaches memory_high_water * 0.9 and the entry cache reaches memory_high_water * 0.1, evict the cache.

We want reject msgAppend to take effect before cache eviction in normal cases.

However, there could be a case where the total memory reaches memory_high_water * 0.9 and the entry cache reaches memory_high_water * 0.1, so cache eviction happens while reject does not.

Note that the high water mark is total available memory * MEMORY_USAGE_LIMIT_RATE (75%).
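To make the overlap concrete, here is a hypothetical worked example; the 16 GiB machine size is an assumption chosen only for illustration:

```rust
fn main() {
    // Hypothetical machine with 16 GiB of available memory.
    let available = 16.0_f64; // GiB
    let high_water = available * 0.75; // MEMORY_USAGE_LIMIT_RATE -> 12 GiB

    // Reject path (per the bullets above): total memory must reach the
    // high water itself, and the entry cache 5% of it.
    let reject_total = high_water;        // 12 GiB
    let reject_cache = high_water * 0.05; // 0.6 GiB

    // Evict path: total memory only needs to reach 90% of the high water,
    // and the entry cache 10% of it.
    let evict_total = high_water * 0.9; // 10.8 GiB
    let evict_cache = high_water * 0.1; // 1.2 GiB

    // Eviction triggers at a lower total-memory threshold than reject, so
    // between evict_total and reject_total (with a large enough entry
    // cache) eviction fires while reject never does.
    assert!(evict_total < reject_total);
    assert!(reject_cache < evict_cache);
    println!("ok");
}
```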
