OPIK-76 Add stream experiment items endpoint #294
My only concerns are:
- What if we have too many experiments?
- Shouldn't the response format be the responsibility of the API layer?
f45e966 to 28b5ad0
28b5ad0 to 7506224
Those are good questions. Let me answer them.
Regarding 1: there are multiple alternatives to handle this:
With all this in mind, I honestly think that the best option is to leave the endpoint as it is. Experiment names are randomly generated and, in practice, the number of matches within a workspace is going to be very low. We might never encounter an issue with this and, if we do, we can always tackle 2, 3 or both.
Regarding 2: I've refactored the services and resources a bit and encapsulated the streaming logic in a
With this, adding new streaming endpoints is much more straightforward. Let me know what you think.
Perfect. My concern was more with having a very large list of
Yeah, I didn't explain myself well in my previous message, but that's exactly what I meant. We would also need to limit the number of retrieved experiments (and their IDs), in addition to the current limit for items. But it's unlikely to be an issue for a very long time, so I propose not doing it for now.
Agreed! One option would be to do a join or use a subquery in the IN clause. But we can discuss it later, if and when needed.
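For reference, the subquery idea could look roughly like the sketch below. It uses an in-memory SQLite database purely for illustration; the table names, column names, and the `create_schema` and `find_items_with_subquery` helpers are assumptions and not the service's actual schema, storage engine, or code. The point is only that the matching experiment ids never have to be materialized and passed back as a large list.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables, used only for this illustration.
    conn.executescript(
        """
        CREATE TABLE experiments (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE experiment_items (
            id TEXT PRIMARY KEY,
            experiment_id TEXT NOT NULL,
            trace_id TEXT NOT NULL,
            dataset_item_id TEXT NOT NULL
        );
        """
    )


def find_items_with_subquery(conn: sqlite3.Connection, name_fragment: str, limit: int = 500):
    # The IN clause is fed by a subquery instead of a pre-fetched id list,
    # so the number of matching experiments does not inflate the statement.
    # instr() on lowercased values gives a case-insensitive "contains" match
    # without treating % or _ in the fragment as wildcards.
    query = """
        SELECT id, experiment_id, trace_id, dataset_item_id
        FROM experiment_items
        WHERE experiment_id IN (
            SELECT id FROM experiments
            WHERE instr(lower(name), lower(?)) > 0
        )
        ORDER BY experiment_id, id
        LIMIT ?
    """
    return conn.execute(query, (name_fragment, limit)).fetchall()
```

Whether a join or a subquery performs better would depend on the actual database, so this is a starting point for that later discussion rather than a recommendation.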
* OPIK-76 Add stream experiment items endpoint
* Rev2: created Streamer class
Details
This is meant to be used by the SDK and to return at least these fields per experiment item: `id`, `trace_id` and `dataset_item_id`. That's exactly what it does, plus any other field in the `experiments` table.
It has the same semantics as the other stream operations in the service: limit (default 500) and last retrieved id cursor.
The search by experiment name follows the same pattern as in the find experiment endpoints: a case-insensitive "contains" match.
Therefore, it can match experiment items from multiple experiments. For that reason, items are sorted by experiment id first.
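The semantics described above can be sketched in memory as follows. This is a minimal illustration, not the service's implementation: the `Experiment` and `ExperimentItem` classes, the `stream_experiment_items` function, the ascending sort direction, and the way the cursor is resolved are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Experiment:
    id: str
    name: str


@dataclass(frozen=True)
class ExperimentItem:
    id: str
    experiment_id: str
    trace_id: str
    dataset_item_id: str


def stream_experiment_items(
    experiments: Sequence[Experiment],
    items: Sequence[ExperimentItem],
    experiment_name: str,
    limit: int = 500,
    last_retrieved_id: Optional[str] = None,
) -> List[ExperimentItem]:
    # Case-insensitive "contains" match on the experiment name.
    pattern = re.compile(re.escape(experiment_name), re.IGNORECASE)
    matching_ids = {e.id for e in experiments if pattern.search(e.name)}

    # Several experiments may match, so sort by experiment id first,
    # then by item id (ascending order is an assumption of this sketch).
    ordered = sorted(
        (i for i in items if i.experiment_id in matching_ids),
        key=lambda i: (i.experiment_id, i.id),
    )

    # Resume right after the last retrieved item; an unknown or missing
    # cursor starts from the beginning in this sketch.
    start = 0
    if last_retrieved_id is not None:
        for index, item in enumerate(ordered):
            if item.id == last_retrieved_id:
                start = index + 1
                break

    return ordered[start:start + limit]
```

A client would call this repeatedly, passing the `id` of the last item it received as `last_retrieved_id`, until it gets back fewer items than `limit`.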
Issues
OPIK-76
Testing
Documentation
N/A