Add coordinatorId to /v1/info #23910
base: master
Conversation
Each node in a Trino cluster already has a node id. Why is that not sufficient? |
We need to make a routing decision based on the following exchange.
First request from client:
First response from coordinator:
The following request from client:
We want to route the request to the same coordinator. What we have here is the queryId. Trino Gateway acts as a transparent proxy for one or more Trino clusters (ref). Multiple Trino clusters share the same domain. |
The query id is an opaque identifier. It's not meant to be parsed and interpreted by clients. Its format is subject to change at any time. |
One possible solution would be to rewrite the nextUri in the gateway to include the node id of the coordinator (e.g., in a query parameter), so that subsequent requests can be interpreted and routed to the right place. |
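A minimal sketch of the nextUri rewrite suggested above, written in Python for brevity. The query parameter name `coordinator` and both helper functions are hypothetical; nothing here is part of Trino or Trino Gateway.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameter the gateway would add to nextUri.
ROUTING_PARAM = "coordinator"


def tag_next_uri(next_uri: str, node_id: str) -> str:
    """Return next_uri with the coordinator's node id attached as a query parameter."""
    parts = urlsplit(next_uri)
    # Drop any previous routing parameter so the URI carries exactly one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != ROUTING_PARAM
    ]
    query.append((ROUTING_PARAM, node_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def routing_target(request_uri: str) -> Optional[str]:
    """Return the node id carried by a follow-up request, or None if it has none."""
    for key, value in parse_qsl(urlsplit(request_uri).query, keep_blank_values=True):
        if key == ROUTING_PARAM:
            return value
    return None
```

With this approach the gateway would call `tag_next_uri` on every response it proxies and `routing_target` on every follow-up request, so it would not need to interpret the query id at all.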
Yes, that's a possible solution. There are some alternatives being discussed:
All of these modify the protocol between client and gateway. We don't have a specification. Another possible solution is to officially introduce some kind of sessionId into the protocol. |
Can @wendigo maybe look here as well? Then maybe we should discuss this as a group with @oneonestar @martint @electrum @mosabua and others interested. |
The protocol is specified here, and is guaranteed to be stable: https://trino.io/docs/current/develop/client-protocol.html
That's a much larger topic. There's no state kept in the server at this time. |
I don't see a problem with having Trino Gateway depend on the format of the query ID, if that is the simplest solution. The query ID format has existed since the beginning of Trino and it seems unlikely we would need to change it in the future. One consideration is that the coordinator ID is a random 5-digit base-32 number, so it only has ~33 million ( |
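The "~33 million" figure can be checked with a few lines of Python. The comment above is cut off, so reading it as a concern about two coordinators drawing the same ID is an assumption, as is the uniform, independent ID generation used in this sketch.

```python
ID_LENGTH = 5        # digits in the coordinator ID, per the comment above
ALPHABET_SIZE = 32   # base-32
ID_SPACE = ALPHABET_SIZE ** ID_LENGTH  # 33,554,432 possible IDs


def collision_probability(coordinators: int, id_space: int = ID_SPACE) -> float:
    """Chance that at least two coordinators share an ID.

    Assumes each ID is drawn uniformly and independently (birthday problem).
    """
    p_all_unique = 1.0
    for already_taken in range(coordinators):
        p_all_unique *= (id_space - already_taken) / id_space
    return 1.0 - p_all_unique
```

Under these assumptions, a gateway fronting a handful of coordinators sees a very small collision probability, but it is not zero, and each coordinator restart draws a new ID.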
This pull request has gone a while without any activity. Tagging for triage help: @mosabua |
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time. |
Now that @electrum has chimed in, I think we can go ahead, rebase, and get this towards merge. We should just add a test here, and maybe also in Trino Gateway, so we catch any changes that might break things. |
Can we use the node id that we already have instead of generating a new id?
Description
Add coordinatorId to /v1/info.
Motivation
Trino Gateway works as a proxy between the client and the coordinator. After the initial query submission, subsequent requests for the same query must be routed to the same coordinator. Currently, the gateway stores the queryId after the initial response from the coordinator for later routing.
If the queryId doesn't exist in the store (which happens when multiple gateway instances are deployed without sticky sessions), the gateway queries every coordinator to find the correct destination.
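The current behaviour described above can be sketched as follows. This is an illustration only, not the gateway's actual code: the dictionary store, the `knows_query` probe callback, and the function name are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, Optional


def route_current(
    query_id: str,
    store: Dict[str, str],
    coordinators: Iterable[str],
    knows_query: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the coordinator that owns query_id, or None if no coordinator does.

    store maps queryId to coordinator and is filled from initial responses.
    knows_query(coordinator, query_id) stands in for a request that asks one
    coordinator whether it is running the query.
    """
    backend = store.get(query_id)
    if backend is not None:
        return backend  # fast path: this gateway instance saw the submission

    # Slow path: another gateway instance handled the submission, so ask
    # every coordinator in turn until one recognises the query.
    for coordinator in coordinators:
        if knows_query(coordinator, query_id):
            store[query_id] = coordinator  # remember for later requests
            return coordinator
    return None
```

The slow path costs up to one request per coordinator for each query the gateway instance has not seen, which is the overhead this pull request aims to remove.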
By adding coordinatorId to /v1/info, the above routing logic can be simplified to:
1. Store the coordinatorId in gateway.
2. Extract the coordinatorId from queryId and route it accordingly.
Other concerns
coordinatorId is just a random string generated during server startup. Its only use so far is in generating queryId. I don't think there are security concerns with exposing it in the public API.
Additional context and related issues
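As further context, the simplified routing from the Motivation section could look roughly like this on the gateway side. The sketch assumes the query ID has the shape `<yyyyMMdd>_<HHmmss>_<5-digit counter>_<coordinatorId>`; that layout is an assumption about the current format, which the discussion above notes is not a documented contract. The function names and the mapping are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed query ID layout: date, time, counter, then the coordinator ID.
QUERY_ID_PATTERN = re.compile(r"\d{8}_\d{6}_\d{5}_([a-z0-9]+)")


def coordinator_id_of(query_id: str) -> Optional[str]:
    """Return the trailing coordinatorId, or None if the ID has an unexpected shape."""
    match = QUERY_ID_PATTERN.fullmatch(query_id)
    return match.group(1) if match else None


def route_by_coordinator_id(
    query_id: str, backends_by_coordinator_id: Dict[str, str]
) -> Optional[str]:
    """Look up the backend whose /v1/info reported the query's coordinatorId.

    backends_by_coordinator_id is a hypothetical map the gateway would build
    by polling /v1/info on each coordinator.
    """
    coordinator_id = coordinator_id_of(query_id)
    if coordinator_id is None:
        return None  # caller can fall back to the existing lookup
    return backends_by_coordinator_id.get(coordinator_id)
```

Returning None for an unrecognised ID lets the gateway keep its existing store-and-probe behaviour as a fallback if the format ever changes.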
From coordinator:
From worker:
Release notes
(x) Release notes are required, with the following suggested text: