Reindexing performance degrades non-linearly #2896

bhalsey · 2024-10-09T16:37:49Z

Reindexing performance degrades non-linearly

Description

Reindexing a test instance with 460M entities took over 4 days on a 3-node cluster running FusionAuth 1.52.1. Throughput slowed significantly after 300M entities were indexed. More work is required to identify the bottleneck and to find mitigations, such as doubling the cluster size.
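To put numbers on "non-linear", it may help to compare the overall average rate against per-interval rates taken from progress samples. The sketch below is only an illustration: the function names and the sample format (elapsed seconds, entities indexed so far) are assumptions, not anything FusionAuth emits today.

```python
def average_throughput(entities: int, days: float) -> float:
    """Average entities indexed per second over the whole run."""
    return entities / (days * 86_400)


def throughput_by_interval(samples):
    """Entities per second for each consecutive pair of samples.

    samples: list of (elapsed_seconds, entities_indexed) tuples in
    increasing order. This format is hypothetical; the numbers would
    have to be collected by hand or from logs.
    """
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        rates.append((n1 - n0) / (t1 - t0))
    return rates


if __name__ == "__main__":
    # 460M entities in 4 days; the real run took longer than 4 days,
    # so this is an upper bound on the average rate.
    print(round(average_throughput(460_000_000, 4)))
```

For the run described above this works out to at most roughly 1,331 entities per second on average. If the per-interval rates fall well below that after the 300M mark, that would confirm the slowdown and show how steep it is.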

robotdan · 2024-10-11T17:03:37Z

There is not a lot of information here. Is the thesis that this is a problem with FusionAuth, or that we just need to scale Elasticsearch and the relational database adequately?

Ideally we would only open public GH issues for things that need work in FusionAuth.

mooreds · 2024-10-11T18:14:31Z

The servers were all adequately sized: XLs, 1.5TB of disk.

My take is that it is weird that the re-index got exponentially slower. This thread, however, suggests it may have been due to disk I/O: https://discuss.elastic.co/t/reindexing-throughput-degrades-over-time/265279

I couldn't figure out a way to see disk queue depth for the ES nodes, but maybe that would help determine if this was an infra problem.
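One partial option, sketched here under assumptions: the Elasticsearch nodes stats API (`GET _nodes/stats/fs`) reports cumulative disk counters under `fs.io_stats` on Linux nodes. That is not queue depth, but sampling it twice gives write operations per second per node, which could at least show whether disk activity changes as the reindex slows. The base URL and helper names below are placeholders, and an unauthenticated HTTP endpoint is assumed.

```python
import json
import urllib.request


def fetch_fs_stats(base_url):
    """Fetch filesystem stats for all nodes (assumes no auth is required)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/fs") as resp:
        return json.load(resp)


def io_totals(stats):
    """Map node name -> cumulative io_stats.total counters.

    io_stats is only reported on Linux; nodes without it map to {}.
    """
    return {
        node["name"]: node.get("fs", {}).get("io_stats", {}).get("total", {})
        for node in stats["nodes"].values()
    }


def write_ops_per_second(before, after, seconds):
    """Per-node write operations per second between two io_totals snapshots."""
    rates = {}
    for name, old in before.items():
        new = after.get(name, {})
        if "write_operations" in old and "write_operations" in new:
            rates[name] = (new["write_operations"] - old["write_operations"]) / seconds
    return rates
```

Calling `io_totals(fetch_fs_stats(...))` a minute apart and passing both results to `write_ops_per_second` gives a rough per-node rate. Actual queue depth would still need a host-level tool such as `iostat` on the ES nodes.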

More supposition here: https://inversoft.slack.com/archives/C051S8N8E/p1728071443879379

So I guess I think this does need some investigation to determine if there are any changes needed to the core product (which, after all, is what controls the re-indexing process).

bhalsey · 2024-10-11T19:22:55Z

Agreed that more work is needed. The outcome could simply be guidance on sizing a cluster. Or it could entail changes to how FusionAuth manages reindexing, such as modifying the index refresh interval.
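As a rough illustration of the refresh interval idea: Elasticsearch exposes `index.refresh_interval` as a dynamic index setting, so it can be changed through the update settings API (`PUT /<index>/_settings`). Setting it to `"-1"` disables refreshes during a bulk load, and setting it to `null` restores the default. The sketch below only builds the request. The base URL and the index name are assumptions, and nothing here reflects how FusionAuth currently manages the index.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build a PUT request that sets index.refresh_interval.

    interval: a string such as "30s" or "-1" (disable), or None to
    reset the setting to its default.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode()
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and index name, for illustration only.
    req = refresh_interval_request("http://localhost:9200", "fusionauth_entity", "-1")
    print(req.get_method(), req.full_url, req.data.decode())
    # To apply it: urllib.request.urlopen(req)
```

Testing this against a copy of the 460M instance, with the interval disabled during the reindex and restored afterward, would show whether refresh overhead contributes to the slowdown.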

bhalsey added performance elasticsearch labels Oct 9, 2024

mooreds added the scale label Oct 9, 2024

robotdan added the needs more info label Oct 11, 2024
