[BUG] Bulk operation throwing 'action_request_validation_exception' when doing bulk delete. #873

Closed
ausmanlumeris opened this issue Dec 19, 2024 · 7 comments · Fixed by #878
Labels
bug Something isn't working

Comments

@ausmanlumeris

What is the bug?

When using helpers.bulk to delete objects in bulk, the helpers.bulk function throws the exception shown below. I noticed there are no examples of doing a delete in the documentation.

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.RequestError: RequestError(400, 'action_request_validation_exception', 'Validation Failed: 1: index is missing;')

How can one reproduce the bug?

Use this code.

docs = [{"delete": {"_index": "<index name>", "_id": "<id>"}}]
helpers.bulk(client, docs)

What is the expected behavior?

The bulk function should be able to delete objects in bulk. The same request works fine from Postman when going directly against the OpenSearch '_bulk' API.
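For comparison, the body that the '_bulk' API expects is newline-delimited JSON, one action per line, with a trailing newline. The sketch below builds such a body for a single delete. The helper name `to_bulk_ndjson` and the index name and id are placeholders made up for illustration; they are not part of opensearch-py.

```python
import json


def to_bulk_ndjson(lines):
    """Serialize dicts into newline-delimited JSON with a trailing newline."""
    return "".join(json.dumps(line) + "\n" for line in lines)


# Placeholder index name and id, for illustration only.
delete_actions = [{"delete": {"_index": "my-index", "_id": "1"}}]

body = to_bulk_ndjson(delete_actions)
print(body, end="")
```

A delete action has no document line after it, unlike index or create actions, so the whole request is a single line here.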

What is your host/environment?

OS: macOS 14.7.1 (23H222)
Python 3.12.3
opensearch-py: 2.8.0

@ausmanlumeris added the bug and untriaged labels on Dec 19, 2024
@dblock removed the untriaged label on Dec 19, 2024
@dblock
Member

dblock commented Dec 19, 2024

Looks like a bug. Want to try and write a (failing) test for this @ausmanlumeris?

@Harshil-Jani
Contributor

Facing the same issue at my day job. I can volunteer to write a failing test case.

@Harshil-Jani
Contributor

@dblock @ausmanlumeris Well, it's actually not a bug. We are supposed to use bulk from the client and not from the helper.
Basically, all you need to do is

docs = [{"delete": {"_index": "<index name>", "_id": "<id>"}}]
client.bulk(docs)

This should do what we are trying to do. The bulk function in the helper is only meant for bulk indexing: it takes your data object as it is and pushes it to the index, so it isn't actually detecting your operation. (Maybe the name of the bulk method in the helper is confusing us? Or we could add some comments on the difference between the client's bulk and the helper's.) For what we intend to do, the solution above works best.
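For anyone landing here with the same error, a simplified model of what seems to be happening may help. This is a sketch, not the library's actual code: `expand_action_sketch` is a hypothetical name, and it assumes the helper follows the `_op_type` convention inherited from elasticsearch-py, where a dict without `_op_type` is treated as a document to index. It is worth checking this against your installed opensearch-py version.

```python
# Metadata keys this sketch lifts into the action line (a small subset, for illustration).
META_KEYS = ("_index", "_id")


def expand_action_sketch(data):
    """Simplified model: split one helper-style dict into (action line, document)."""
    data = dict(data)  # copy so the caller's dict is not mutated
    op_type = data.pop("_op_type", "index")  # assumed default operation
    action = {op_type: {}}
    for key in META_KEYS:
        if key in data:
            action[op_type][key] = data.pop(key)
    if op_type == "delete":
        return action, None  # a delete carries no document payload
    return action, data.get("_source", data)


# Placeholder index name and id, for illustration only.
raw_style = {"delete": {"_index": "my-index", "_id": "1"}}
helper_style = {"_op_type": "delete", "_index": "my-index", "_id": "1"}

print(expand_action_sketch(raw_style))
print(expand_action_sketch(helper_style))
```

Under that assumption, the raw '_bulk' style dict ends up as a document to be indexed with no `_index` in its action line, which is consistent with the "index is missing" validation error, while the `_op_type` style produces a delete action.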

@dblock I have also updated the PR with a passing test case that uses the client implementation, so that we have a sanity check that it works. Happy to coordinate further if there is something else we need to do.

@dblock
Member

dblock commented Dec 29, 2024

> This should do what we are trying to do. The bulk function in the helper is only meant for bulk indexing: it takes your data object as it is and pushes it to the index, so it isn't actually detecting your operation. (Maybe the name of the bulk method in the helper is confusing us? Or we could add some comments on the difference between the client's bulk and the helper's.) For what we intend to do, the solution above works best.

Does it make sense to add support for deletions in the bulk helper or do something in code that prevents future errors?
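One possible shape for the "prevent future errors" option is a guard that spots raw '_bulk' style actions before they are sent and fails with a clearer message. The sketch below is purely illustrative: `check_helper_action` is a hypothetical function, not something that exists in opensearch-py, and the detection rule (a single top-level key naming a bulk operation, wrapping a dict with `_index` or `_id`) is an assumption that could misfire on unusual documents.

```python
BULK_OPS = {"index", "create", "update", "delete"}


def check_helper_action(action):
    """Hypothetical guard: reject dicts that look like raw _bulk action lines."""
    if isinstance(action, dict) and len(action) == 1:
        ((key, value),) = action.items()
        looks_raw = (
            key in BULK_OPS
            and isinstance(value, dict)
            and ("_index" in value or "_id" in value)
        )
        if looks_raw:
            raise ValueError(
                f"Action looks like a raw _bulk '{key}' line; "
                f"pass '_op_type': '{key}' with top-level metadata instead."
            )
    return action


# Placeholder index name and id, for illustration only.
raw_style = {"delete": {"_index": "my-index", "_id": "1"}}
helper_style = {"_op_type": "delete", "_index": "my-index", "_id": "1"}

try:
    check_helper_action(raw_style)
except ValueError as err:
    print(err)

print(check_helper_action(helper_style))
```

Whether to reject such input or translate it silently is a design choice; rejecting keeps the helper's behavior explicit, while translating would be friendlier to people copying bodies from the REST API.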

Either way do take a look at the documentation, maybe a few words added can help the next person?

@Harshil-Jani
Contributor

> Either way do take a look at the documentation, maybe a few words added can help the next person?

Yes, the documentation actually made me realize that we were using it the wrong way. There is no mention of helpers in the documentation at https://opensearch.org/docs/latest/clients/python-low-level/ and it also makes more sense to use the client instead.

> Does it make sense to add support for deletions in the bulk helper

Ah, I am not sure. I would like to learn why helpers exist. What is the purpose of helpers in the codebase, given that we allow the same methods in the client? Maybe they are meant as an abstraction somewhere? I cannot find anything about them in the documentation. Knowing this would give a better idea of whether we should add delete to it or not.

@dblock
Member

dblock commented Dec 30, 2024

@Harshil-Jani It's all legacy from when OpenSearch forked 3 years ago. I am 100% with you that helpers should not really exist as they are and should instead be wrappers on top of existing APIs. If you want to undertake that kind of change, please do!

That said, progress > perfection, so if you think you can contribute something/anything that would have prevented you from wasting a lot of time debugging a non-existent problem that would be very much appreciated.

@Harshil-Jani
Contributor

Sure @dblock. I will try to figure out what can be done and will post updates here as I explore the codebase.
