Gist Searching in GitHub Provider is now Rate Limited (and doesn't appear to be affected by OAuth authentication) #823
Comments
This is a critical issue for all social identity discovery systems.
A sustainable solution is to switch to using the GitHub gist API, and using This means the graph has to be extended with metadata like the
Technically this could be defined by the
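A rough sketch of what incremental indexing over the gist API could look like, assuming the documented list-gists-for-a-user endpoint and its since query parameter. The names GistSummary, buildListGistsUrl and nextCursor are hypothetical helpers for illustration, not part of the Polykey provider; only the id and updated_at fields of the response are used here.

```typescript
// Hypothetical minimal view of a gist as returned by the list endpoint.
interface GistSummary {
  id: string;
  updated_at: string; // ISO 8601 timestamp
}

// Build the request URL for GET /users/{username}/gists.
// `since` limits results to gists updated after that timestamp.
function buildListGistsUrl(
  username: string,
  since?: string,
  perPage: number = 100,
): string {
  let url =
    'https://api.github.com/users/' +
    encodeURIComponent(username) +
    '/gists?per_page=' +
    String(perPage);
  if (since !== undefined) {
    url += '&since=' + encodeURIComponent(since);
  }
  return url;
}

// Advance the cursor to the newest updated_at seen so far.
// The returned value would be stored as provider metadata (an assumption
// about where it lives) and passed as `since` on the next run.
function nextCursor(
  current: string | undefined,
  gists: GistSummary[],
): string | undefined {
  let latest = current;
  for (const gist of gists) {
    if (
      latest === undefined ||
      Date.parse(gist.updated_at) > Date.parse(latest)
    ) {
      latest = gist.updated_at;
    }
  }
  return latest;
}
```

The idea is that each discovery run fetches the URL from buildListGistsUrl with the stored cursor, processes the returned gists, and then persists nextCursor, so only new or updated gists are visited on later runs.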
I think GitHub is basically preventing scraping of gists.
We should also take this information into account as part of the metadata of any given provider, so we can track how often we hit the API: https://docs.github.com/en/rest/rate-limit/rate-limit?apiVersion=2022-11-28
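As a sketch of how that metadata might be captured, the following parses the x-ratelimit-* response headers that GitHub documents for its REST API. RateLimitInfo, parseRateLimit and shouldPause are hypothetical names, and getHeader is a stand-in for whatever HTTP client the provider actually uses.

```typescript
// Hypothetical per-provider rate-limit metadata (not an existing Polykey type).
interface RateLimitInfo {
  limit: number; // maximum requests allowed in the current window
  remaining: number; // requests left in the current window
  resetAt: Date; // when the current window resets
}

// Parse GitHub's x-ratelimit-* headers; returns undefined if any is missing
// or not numeric.
function parseRateLimit(
  getHeader: (name: string) => string | null,
): RateLimitInfo | undefined {
  const read = (name: string): number => {
    const raw = getHeader(name);
    return raw === null ? NaN : Number(raw);
  };
  const limit = read('x-ratelimit-limit');
  const remaining = read('x-ratelimit-remaining');
  const reset = read('x-ratelimit-reset'); // UTC epoch seconds
  if (isNaN(limit) || isNaN(remaining) || isNaN(reset)) {
    return undefined;
  }
  return { limit, remaining, resetAt: new Date(reset * 1000) };
}

// True when the budget is exhausted and the window has not reset yet.
function shouldPause(info: RateLimitInfo, now: Date): boolean {
  return info.remaining === 0 && now.getTime() < info.resetAt.getTime();
}
```

Storing the parsed value next to each provider would let the discovery tasks skip or delay calls instead of running into the limit.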
To get around this atm, you can also do
Describe the bug
In our discovery system, we use the getIdentityData method from the GitHub provider, which will look up gists via https://gist.github.com/search.
It seems recently this now has some secondary rate limit applied (https://docs.github.com/en/rest/using-the-rest-api/troubleshooting-the-rest-api?apiVersion=2022-11-28), which is not solvable even with authenticated requests. Atm it is done unauthenticated, because it's basically a public page that we index over.
Gists are not currently searchable via the official GitHub API, so it seems that gist search has basically become impossible to index programmatically now. This is pretty bad, especially because it's a secondary rate limit.
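For reference, GitHub's REST API guidance for secondary rate limits is to honour the retry-after header when present, otherwise wait until x-ratelimit-reset when x-ratelimit-remaining is 0, and otherwise wait at least a minute and back off exponentially. A minimal sketch of that decision follows. retryDelayMs is a hypothetical helper, and whether the gist.github.com search page follows the same header conventions as the REST API is an assumption.

```typescript
// Decide how long to wait before retrying a rate-limited request.
// Returns undefined when the status is not a rate-limit candidate.
// Note: a 403 can also mean other failures; callers may want to inspect
// the response body before treating it as a rate limit.
function retryDelayMs(
  status: number,
  getHeader: (name: string) => string | null,
  attempt: number, // 0 for the first retry
  nowMs: number, // current time in epoch milliseconds
): number | undefined {
  if (status !== 403 && status !== 429) {
    return undefined;
  }
  // 1. retry-after is given in seconds.
  const retryAfter = getHeader('retry-after');
  if (
    retryAfter !== null &&
    retryAfter.trim() !== '' &&
    !isNaN(Number(retryAfter))
  ) {
    return Number(retryAfter) * 1000;
  }
  // 2. Budget exhausted: wait until the reset time (UTC epoch seconds).
  const remaining = getHeader('x-ratelimit-remaining');
  const reset = getHeader('x-ratelimit-reset');
  if (remaining === '0' && reset !== null && !isNaN(Number(reset))) {
    return Math.max(0, Number(reset) * 1000 - nowMs);
  }
  // 3. Fallback: at least one minute, doubling on each attempt.
  return 60000 * Math.pow(2, attempt);
}
```

This only spaces retries out; it does not make the search endpoint indexable again if the limit is hit on nearly every request.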
I tried doing things like:
But it was no use; it's just 429 Too Many Requests.
The only other option right now is to switch to using the API for gists. Because there's no search feature, you basically have to index over all gists via the API, but we could use the since parameter to do this efficiently without having to repeat work: https://docs.github.com/en/rest/gists/gists?apiVersion=2022-11-28#list-gists-for-a-user. Effectively, we would only go over the new gists representing new claims. The timestamp acts like a cursor.
To Reproduce
WARN:polykey.PolykeyAgent.task v0pocinl3mpo0195g4m2kd1t8k0:Failed - Reason: ErrorProviderCall: Provider responded with 429 Too Many Requests
show up in the agent logs.
Expected behavior
It needs to work just like normal and discover without problems.
Screenshots
Notify maintainers
@tegefaulkes