Reindexing performance degrades non-linearly #2896

Open
bhalsey opened this issue Oct 9, 2024 · 3 comments

Comments

@bhalsey

bhalsey commented Oct 9, 2024

Reindexing performance degrades non-linearly

Description

Reindexing a test instance with 460M entities took over 4 days to complete on a 3-node cluster running FusionAuth 1.52.1. Indexing slowed significantly after 300M entities were indexed. More work is required to clearly identify the bottleneck and to find mitigations, such as doubling the cluster size.
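
For a rough sense of scale, the figures above work out to an average of at most about 1,331 entities per second (460M entities over at least 4 days). The sketch below shows that arithmetic, plus a way to turn periodic progress samples into per-interval rates so a slowdown becomes visible. The function names and the sample format are assumptions for illustration; nothing here comes from FusionAuth itself.

```python
SECONDS_PER_DAY = 86_400


def average_rate(entities: int, days: float) -> float:
    """Average entities indexed per second over the whole run."""
    return entities / (days * SECONDS_PER_DAY)


def interval_rates(samples):
    """Entities per second for each interval between consecutive samples.

    `samples` is a hypothetical list of (elapsed_seconds, entities_indexed)
    pairs in time order, e.g. read off a progress log or dashboard.
    """
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        rates.append((n1 - n0) / (t1 - t0))
    return rates


if __name__ == "__main__":
    # 460M entities in 4 days; the real run took longer, so this is an upper bound.
    print(round(average_rate(460_000_000, 4), 2))
```

Comparing the per-interval rates before and after the 300M mark would show whether the drop is gradual or abrupt, which might help narrow down the bottleneck.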

[Image: Screenshot 2024-10-09 at 10 37 05 AM]
@robotdan
Member

There is not a lot of information here. Is the thesis that this is a problem with FusionAuth, or that we just need to scale Elasticsearch and the relational database adequately?

Ideally we would only open public GH issues for things that need work in FusionAuth.

@mooreds
Collaborator

mooreds commented Oct 11, 2024

The servers were all adequately sized: XLs, 1.5TB of disk.

My take is that it is weird that the re-index got exponentially slower. This thread, however, indicates that it is possible it was due to disk I/O: https://discuss.elastic.co/t/reindexing-throughput-degrades-over-time/265279

I couldn't figure out a way to see disk queue depth for the ES nodes, but maybe that would help determine if this was an infra problem.
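
One possible starting point, short of true queue depth: on Linux, the Elasticsearch node stats API (`GET /_nodes/stats/fs`) reports cumulative disk operation counters under `fs.io_stats`. The sketch below takes two snapshots of that response and computes write operations per second per node. This is only a throughput proxy, not queue depth; an OS-level tool such as `iostat -x` on each node would be needed for that. The base URL and helper names are assumptions, and the `io_stats` section may be absent on non-Linux hosts.

```python
import json
from urllib.request import urlopen


def fetch_fs_stats(base_url: str) -> dict:
    """Fetch filesystem stats for all nodes (base_url is an assumption,
    e.g. "http://localhost:9200")."""
    with urlopen(f"{base_url}/_nodes/stats/fs") as resp:
        return json.load(resp)


def write_ops_per_second(before: dict, after: dict, interval_s: float) -> dict:
    """Per-node write operations per second between two snapshots."""
    rates = {}
    for node_id, node in after["nodes"].items():
        prev = before["nodes"].get(node_id)
        if prev is None:
            continue  # node joined between snapshots; no baseline to compare
        now_ops = node["fs"]["io_stats"]["total"]["write_operations"]
        prev_ops = prev["fs"]["io_stats"]["total"]["write_operations"]
        rates[node.get("name", node_id)] = (now_ops - prev_ops) / interval_s
    return rates
```

Sampling this at a fixed interval during a reindex and plotting it next to the indexing rate might show whether disk writes flatten out at the point where indexing slows.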

More supposition here: https://inversoft.slack.com/archives/C051S8N8E/p1728071443879379

So I guess I think this does need some investigation to determine if there are any changes needed to the core product (which, after all, is what controls the re-indexing process).

@bhalsey
Author

bhalsey commented Oct 11, 2024

Agreed that more work is needed. The outcome could simply be guidance on sizing a cluster. Or it could entail changes to how FusionAuth manages reindexing, such as modifying the index refresh interval.
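
As an illustration of the refresh interval idea: Elasticsearch exposes `index.refresh_interval` as a dynamic index setting, and setting it to `"-1"` disables periodic refreshes during a bulk load. The sketch below only builds the `PUT /<index>/_settings` request and does not send it. The base URL and index name are placeholders, and whether FusionAuth already adjusts this setting during a reindex is not something this thread establishes.

```python
import json
from typing import Optional
from urllib.request import Request


def refresh_interval_request(base_url: str, index: str,
                             interval: Optional[str]) -> Request:
    """Build a PUT <index>/_settings request that sets index.refresh_interval.

    interval examples: "-1" disables refresh, "30s" refreshes every 30 seconds,
    None sends JSON null, which resets the setting to its default.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder URL and index name; pass the result to urlopen() to apply it.
    req = refresh_interval_request("http://localhost:9200", "my_index", "-1")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

If this were tried, the setting would need to be restored after the reindex completes, since documents are not visible to search until a refresh happens.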
