Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support queuing score by id #114

Merged
merged 32 commits into from
Jul 22, 2022
Merged

Conversation

notbakaneko
Copy link
Contributor

@notbakaneko notbakaneko commented Jul 15, 2022

Adds support for queuing solo_score for indexing by id.
The indexer will do the necessary lookups and index or delete the document as necessary; scores with missing user or beatmap will be deleted from the index.

The structure of SoloScore is specific to the indexer itself and probably not convenient for other clients to use, especially if the format changes.

To queue for indexing, push to the queue:

{ "ScoreId": 1 }

ScoreId in the queue item will have priority over Score (score gets ignored)

Added some utility commands:
To queue a score for indexing or deletion by id:

scores index ${id}

Pushing the contents of filename to the queue as a single item - this is mainly so I can test with arbitrary json blobs instead of the result of JsonConvert.SerializeObject:

push-file ${filename}

Also adds the convert flag to the schema but haven't added added the convert lookup yet.
Also does the convert lookup (assuming score's ruleset_id being different from beatmap's playmode)

closes #112

@notbakaneko notbakaneko self-assigned this Jul 15, 2022
@peppy
Copy link
Member

peppy commented Jul 15, 2022

@notbakaneko to confirm, is this ready to go?

@nanaya
Copy link
Contributor

nanaya commented Jul 15, 2022

The index action works as expected. Tested using my test branch.

@notbakaneko
Copy link
Contributor Author

Noticed some strange errors earlier; it should work properly now - I'll move the parts that organize the items to index/delete out later

README.md Outdated
Comment on lines 186 to 189
### Deleting a score by `id`
```json
{ "Action": "delete", "ScoreId": 1 }
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the case for deleting? Wondering if we even need an action if the deletion case is always "doesn't exist in database or has preserve=0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a document can be deleted directly when testing for effects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it will never be used in actual usages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's a convenience when testing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, considering it's going to be fetching from a mysql table at the end of the day (where you can set preserve:false) it's probably fine to remove here to keep things simple then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being able to just delete it makes it easier than needing to flip preserve between true and false on he db first; with the command I can just do score delete 1 to remove and score index 1 to put it back; it also means not needing an entire infrastructure and complete dataset to always be present and be in sync when testing 👀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't get it. The indexer does lookups from the database. How do you test without data/infrastructure?

If this is really required, can we keep it on a dev branch, as I don't foresee it being useful in production?

Comment on lines 25 to 29
var redis = new Redis();
var database = redis.Connection.GetDatabase();
var queueName = $"osu-queue:score-index-{AppSettings.Schema}";

database.ListLeftPush(queueName, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for doing it like this rather than using the existing queue method? I'd much prefer the latter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it easier to push arbitrary structures into the queue for testing instead of it being serialized into a known type, or defining a new type and queue processor instance for every payload I want to check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you can just new ScoreItem { id = ... } no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing directly to the key allows pushing things like { "ScoreId": 1, "not_a_real_field": "aa" } and also intentionally omit properties that didn't exist in other versions, instead of it being serialized into to type of the current version, which isn't as useful when testing for payloads from external sources.

processor.PushToQueue(new ScoreItem()) is always going to push "{\"Action\":null,\"ScoreId\":null,\"Score\":null,\"TotalRetries\":0,\"Tags\":null}" to the queue; a different client isn't necessarily going to be sending all the fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's fine to leave this, since this command probably won't be used in production. Still a bit weird though. I can imagine this breaking if we change the queue name or system and forget to update this. shrug

@peppy
Copy link
Member

peppy commented Jul 20, 2022

@notbakaneko I've applied some changes and tested locally that things still seem to work. Of concern is 1e8f036 where I change the deletion operation to use longs directly, instead of strings. Let me know if there was a reason you didn't have it running this way (it seems to work as expected).

peppy
peppy previously approved these changes Jul 20, 2022
@@ -86,23 +85,6 @@ private void addToBuffer(SoloScore score, ProcessableItemsBuffer buffer)
buffer.Deletions.Add(score.id);
}

private BulkResponse dispatch(BulkDescriptor bulkDescriptor)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't supposed to be unused; apparently I managed to test it and somehow not push the change using it 👀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to add the usage back for this then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to update the calls to it but there was an issue I came across when checking ES8 support, except I can't seem to replicate it now 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic means that if ES goes away, the queue processor will never exit. I don't think this should be the case? Queue processor itself already has fail and retry logic, so unless this is intended to handle actual intermittent errors during normal operation, I'd either leave the handling to that mechanism, or make it retry for a max of 1-5 seconds to ensure things don't get stuck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retrying queue items with minimal delay can easily deplete all the retries while a server restarts or network recovers; a better way may be to be able to just re-queue an item without it being marked as failed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a setting on QueueProcessor that ensures failed items are never dropped, for cases that matters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge as-is for now and we can revisit when this becomes an issue. Tracking at ppy/osu-queue-processor#16.

@notbakaneko
Copy link
Contributor Author

change the deletion operation to use longs directly, instead of strings.

The original call was DeleteMany(ids) without specifying <T>which needed the type of be an object or string but there's currently a bug requiring T to be explicitly specified when using ids only.

@peppy
Copy link
Member

peppy commented Jul 20, 2022

Yeah I saw that, but seems longs work fine as long as the generic is specified.

@@ -86,23 +85,6 @@ private void addToBuffer(SoloScore score, ProcessableItemsBuffer buffer)
buffer.Deletions.Add(score.id);
}

private BulkResponse dispatch(BulkDescriptor bulkDescriptor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic means that if ES goes away, the queue processor will never exit. I don't think this should be the case? Queue processor itself already has fail and retry logic, so unless this is intended to handle actual intermittent errors during normal operation, I'd either leave the handling to that mechanism, or make it retry for a max of 1-5 seconds to ensure things don't get stuck.

@peppy peppy merged commit 8a3b5f7 into ppy:master Jul 22, 2022
@notbakaneko notbakaneko deleted the feature/queue-item-action branch August 30, 2022 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Document/improve method to index scores
3 participants