Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

return array data type in infer_dtype_bydata #2144

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

rgupta2508
Copy link

Now milvus supporting ARRAY data types so we can return DataType.ARRAY in case of if it is not VECTOR.

This will be helpful to create ARRAY data type field seamlessly in milvus

@sre-ci-robot
Copy link

Welcome @rgupta2508! It looks like this is your first PR to milvus-io/pymilvus 🎉

@mergify mergify bot added the needs-dco label Jun 20, 2024
@rgupta2508 rgupta2508 force-pushed the array_infer_dtype_bydata branch from 51e18a7 to 1dc200d Compare June 20, 2024 10:54
@mergify mergify bot added dco-passed and removed needs-dco labels Jun 20, 2024
@rgupta2508
Copy link
Author

@wangting0128 @longjiquan Please review , if this can be merged to return Milvus.ARRAY data type for infer_dtype_bydata function.

Or any thing we can do.

@XuanYang-cn XuanYang-cn self-assigned this Jun 24, 2024
@mergify mergify bot added the ci-passed label Jun 24, 2024
@rgupta2508
Copy link
Author

/assign @czs007

@sre-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rgupta2508
To complete the pull request process, please ask for approval from czs007 after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rgupta2508
Copy link
Author

This PR will resolve this issue #2165
idea is to take kwargs which accept datatype along with all the parameter, based on kwargs we can decide return type.
As per current implementation it is difficult to find which one is vector array or float array

@longjiquan
Copy link
Contributor

Please help to review this. @xiaocai2333
/assign @xiaocai2333

@xiaocai2333
Copy link
Contributor

It is recommended to change the default behavior to infer the vector, as pymilvus itself relies on the interface.

pymilvus/orm/types.py Outdated Show resolved Hide resolved
@rgupta2508 rgupta2508 force-pushed the array_infer_dtype_bydata branch from 7cd45fb to 727d14e Compare July 5, 2024 03:20
@mergify mergify bot added dco-passed and removed needs-dco labels Jul 5, 2024
@longjiquan
Copy link
Contributor

@rgupta2508 thanks for the contributions. I agree that we should infer data as array when the data is list-like and the infered type is bool or varchar, but it's also not enough to infer the schema since element type is required for array.

@mergify mergify bot added needs-dco and removed dco-passed labels Jul 5, 2024
@rgupta2508 rgupta2508 force-pushed the array_infer_dtype_bydata branch from 12a14d8 to cb9b12d Compare July 5, 2024 03:36
@mergify mergify bot added dco-passed and removed needs-dco labels Jul 5, 2024
@longjiquan
Copy link
Contributor

@rgupta2508 Take array type as considerations, I prefer that this function should infer a field schema instead of a data type (since data type don't contain other informations).

@longjiquan
Copy link
Contributor

If so, I think we should do much more refactors.

@rgupta2508
Copy link
Author

rgupta2508 commented Jul 5, 2024

@rgupta2508 thanks for the contributions. I agree that we should infer data as array when the data is list-like and the infered type is bool or varchar, but it's also not enough to infer the schema since element type is required for array.

Yes correct , we cant return field schema with this, this we can use for data type only. directly we cant use it for schema. even max_capacity for array and max_length for varchar array is also mandatory to create schema, not sure if we can use this method for this, this will just return DataType only.

used should handle this if they will get datatyepe Array.

@rgupta2508
Copy link
Author

@rgupta2508 Take array type as considerations, I prefer that this function should infer a field schema instead of a data type (since data type don't contain other informations).

Yes correct, but even max_capacity for array and max_length for varchar array is also mandatory to create schema, not sure if we can add some default values on this.

@mergify mergify bot added needs-dco and removed dco-passed labels Jul 5, 2024
Rohit Gupta added 11 commits July 5, 2024 18:00
Signed-off-by: Rohit Gupta <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: Rohit Gupta <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: Rohit Gupta <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
Signed-off-by: r0g0bum <[email protected]>
@rgupta2508 rgupta2508 force-pushed the array_infer_dtype_bydata branch from 7dd49ea to cadea3f Compare July 5, 2024 12:31
@mergify mergify bot added dco-passed and removed needs-dco labels Jul 5, 2024
@rgupta2508
Copy link
Author

Hi @longjiquan Any suggestion on this PR . if this can be merged with this idea.
Similar issue previously #1785

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants