fix getting updated user namespaces #2770

Open
wants to merge 2 commits into main from user_namespaces_fix

Conversation

alichaddad
Contributor

Closes

What changed?
When retrieving a user's namespaces, we now take a lock around the whole flow of reading the namespaces from the cache and updating it when necessary. We also only add a cluster's namespaces to a user's namespaces when that cluster is actually present in the cluster namespaces cache.

Why was this change made?
This change fixes an issue where requests for resources such as Kustomizations or HelmReleases would sometimes return empty results.
The issue happens when one request checks the user namespaces in the cache while a previous request is still building that cache, producing a response with partial or empty results. It becomes more apparent when a cluster has an issue (it is unreachable or has no namespaces for whatever reason), which makes empty responses a common occurrence. This is due to how the implementation was written:

  • The user namespaces are calculated from the namespaces in each cluster, filtered based on user permissions, and stored in a cache with a TTL of 30s
  • When a request is made for user namespaces, it is served from the cache; if the cache is empty, which happens periodically when the TTL expires, the request attempts to update the cache and return the data
  • If another request arrives during that update, it only checks whether the cache is empty or not, which means that even if a single key has been added it gets the current user namespaces, resulting in partial results. Subsequent requests return correct responses once the cache is properly built again (see the sketch below)
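
A rough illustration of the pattern described above (hypothetical names, not the actual code in this repository): the read path treats "the cache has at least one entry" as "the cache is fully built", so a request arriving while another request is still repopulating an expired cache can see partial data.

```go
package cache

import "sync"

// usersNamespaces is a hypothetical sketch of the cache described above:
// user ID -> cluster name -> namespace names, rebuilt whenever the TTL expires.
type usersNamespaces struct {
	mu     sync.RWMutex
	byUser map[string]map[string][]string
}

// Get illustrates the racy read: "has at least one key" is treated as
// "fully built", so a request arriving while another request is still
// repopulating the expired cache can observe partial results.
func (c *usersNamespaces) Get(userID string) (map[string][]string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	nss, ok := c.byUser[userID]
	return nss, ok && len(nss) > 0
}
```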

The proposed fix is to lock each user request by the user ID. If a request arrives from the same user while another is in progress, it only checks the status of the cache after the in-flight request has finished.
If a cluster has issues retrieving namespaces, or the cluster itself is unreachable, it is not set in the cluster namespaces cache, but it was still set in the user namespaces cache with an empty namespaces list. The PR changes this behavior and only adds a cluster's namespaces to the user cache when that cluster already exists in the cluster namespaces cache (see the sketch below).
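
As a rough sketch of that second change (hypothetical names, not the PR's exact code), the per-user cache is only populated from clusters that the cluster namespaces cache actually contains, so an unreachable cluster no longer produces an empty entry:

```go
// buildUserNamespaces copies only the clusters present in the cluster
// namespaces cache into the user's cache, filtering by permission.
// Hypothetical sketch; the real code performs the user permission checks.
func buildUserNamespaces(clusterNS map[string][]string, allowed func(cluster, ns string) bool) map[string][]string {
	out := map[string][]string{}
	for cluster, namespaces := range clusterNS {
		for _, ns := range namespaces {
			if allowed(cluster, ns) {
				out[cluster] = append(out[cluster], ns)
			}
		}
	}
	return out
}
```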

How was this change implemented?
By adding a lock when retrieving the user namespaces. Since there is a lock per user ID, we had to store the locks in a map. Currently this map is not cleaned up, which is something we may need to handle down the line.
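
A minimal sketch of that locking scheme, assuming a map of mutexes keyed by user ID and guarded by its own mutex (names are hypothetical, not the PR's exact code):

```go
package fetcher

import "sync"

// userLocks hands out one mutex per user ID so that concurrent requests
// from the same user serialise around the cache check and rebuild.
// Entries are never removed, matching the caveat mentioned above.
type userLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func (l *userLocks) forUser(userID string) *sync.Mutex {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.locks == nil {
		l.locks = map[string]*sync.Mutex{}
	}
	m, ok := l.locks[userID]
	if !ok {
		m = &sync.Mutex{}
		l.locks[userID] = m
	}
	return m
}
```

Callers would take `locks.forUser(user.ID).Lock()` before checking or rebuilding the user namespaces cache, and release it once the rebuild finishes.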

How did you validate the change?
By adding an invalid cluster to my instance, which consistently caused certain APIs to return empty responses before my changes, and then redoing the same scenario after my changes.

Release notes

Documentation Changes

@alichaddad alichaddad added the bug Something isn't working label Sep 22, 2022
@alichaddad alichaddad requested review from a team, bigkevmcd and foot September 22, 2022 07:14
@ozamosi
Contributor

ozamosi commented Sep 22, 2022

Thank you for the excellent explanation of the problem!

  • This puts the UsersNamespaces lock outside of it, in the factory itself, while the ClustersNamespaces lock is still inside the ClustersNamespaces struct. Is there a reason not to make them the same?
  • This - just like the UsersNamespaces TTL cache itself, I've now spotted, so perhaps a separate, bigger issue - uses the user.ID as a unique key. However, there are various auth forms that leave user.ID empty, so in that case the cache/lock will not be able to distinguish between different users.

@alichaddad
Contributor Author

Thanks for taking a look!

  • This puts the UsersNamespaces lock outside of it, in the factory itself, while the ClustersNamespaces lock is still inside the ClustersNamespaces struct. Is there a reason not to make them the same?

Good point, we might have to rethink the locking methodology in this part; it seems there might be too much of it. The case of clustersNamespaces is a bit different: it gets refreshed periodically and consistently without an external trigger, so the need for an extra lock is not the same, since the cache getting completely cleared out is rather rare. For userNamespaces this happens constantly and the rebuild is triggered by a user request; since that entails a lot of external actions that might overlap with another request, the lock is needed. Does that make sense?

  • This - just like the UsersNamespaces TTL cache itself, I've now spotted, so perhaps a separate, bigger issue - uses the user.ID as a unique key. However, there are various auth forms that leave user.ID empty, so in that case the cache/lock will not be able to distinguish between different users.

This would indeed be a problem. Is there another unique identifier that we can use for users?

@foot
Contributor

foot commented Sep 22, 2022

Is there another unique identifier that we can use for users?

There might be a .toString() kind of thing on the Principal. But it might strip out the token.. which would make it not great for this. The token is potentially sensitive, so we don't want it in logs etc.; maybe we add a .Hash() to the Principal instead?
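
For illustration only, a `.Hash()` along those lines might look like the following (a hypothetical sketch, not an agreed design); it folds the ID and token into a digest so the key is stable per user while the raw token never ends up in logs or cache keys:

```go
package auth

import (
	"crypto/sha256"
	"encoding/hex"
)

// Principal is a stand-in for the real principal type (hypothetical).
type Principal struct {
	ID    string
	Token string
}

// Hash returns a stable, non-reversible key derived from the principal's
// ID and token, usable as a cache/lock key without exposing the token.
func (p *Principal) Hash() string {
	sum := sha256.Sum256([]byte(p.ID + "/" + p.Token))
	return hex.EncodeToString(sum[:])
}
```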

@alichaddad alichaddad force-pushed the user_namespaces_fix branch 2 times, most recently from ac6b04d to a8da483 Compare October 4, 2022 11:40
@lasomethingsomething
Contributor

@alichaddad @foot Is this still in progress?
