Currently Broken due to new non-signed URL Scheme #11

Open
nelsonjchen opened this issue Jan 1, 2025 · 5 comments

Comments

@nelsonjchen
Owner

nelsonjchen commented Jan 1, 2025

https://takeout-download.usercontent.google.com/download/takeout-20241222T093656Z-002.zip?j=3647d71e-7af8-4aa7-9dc1-1f682197329a&i=1&user=798667665537&authuser=0

What's this stuff?

It looks like downloading now isn't just a simple signed URL.
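
To see what the new link actually carries, the query string can be broken out with the standard `URL` API. This is only an inspection sketch: `describeTakeoutUrl` is a name made up for illustration, and what `j`, `i`, `user`, and `authuser` mean is not known here. The point is just that nothing in it looks like a signature.

```typescript
// Inspection sketch: list the parts of the new-style Takeout link.
// `describeTakeoutUrl` is a hypothetical helper, not part of GTR.
function describeTakeoutUrl(link: string): {
  host: string;
  file: string;
  params: Record<string, string>;
} {
  const url = new URL(link);
  const params: Record<string, string> = {};
  url.searchParams.forEach((value, key) => {
    params[key] = value;
  });
  // The last path segment is the archive file name.
  const file = url.pathname.split("/").pop() || "";
  return { host: url.hostname, file, params };
}

const info = describeTakeoutUrl(
  "https://takeout-download.usercontent.google.com/download/takeout-20241222T093656Z-002.zip?j=3647d71e-7af8-4aa7-9dc1-1f682197329a&i=1&user=798667665537&authuser=0",
);
console.log(info.host, info.file, Object.keys(info.params));
```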

OTOH, maybe this can reduce a hop needed on Cloudflare Workers. Unfortunately, it also means a fairly major rewrite.

The good news is this: If you smuggle the cookie out and all, you can totally download from a wholly different IP.

@nelsonjchen
Owner Author

I think resume works, so that's good. This means ranges probably work.
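
If ranges do work, a client could ask for one slice of the archive at a time. Here is a minimal sketch of building the `Range` header values. `byteRange` and `planRanges` are hypothetical helpers, and none of this has been confirmed against Takeout's servers.

```typescript
// Build an HTTP Range header value for `length` bytes starting at `start`.
// HTTP byte ranges are inclusive on both ends, hence the `- 1`.
function byteRange(start: number, length: number): string {
  if (!Number.isInteger(start) || !Number.isInteger(length) || start < 0 || length <= 0) {
    throw new RangeError("start must be >= 0 and length must be > 0");
  }
  return `bytes=${start}-${start + length - 1}`;
}

// Split a file of `total` bytes into consecutive Range values
// of at most `chunk` bytes each.
function planRanges(total: number, chunk: number): string[] {
  const ranges: string[] = [];
  for (let offset = 0; offset < total; offset += chunk) {
    ranges.push(byteRange(offset, Math.min(chunk, total - offset)));
  }
  return ranges;
}
```

A request would then send something like `Range: bytes=0-99`, and a server that honors it answers with `206 Partial Content` instead of `200 OK`.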

@nelsonjchen
Owner Author

I know what a signed URL is and how long it is valid for. I don't know either of those things for these "Google" cookies. They do seem to expire, though, and need refreshing.

I do notice that the URL still seems to work in the browser more than a few minutes afterwards. Of course, that'll use your Google cookies to "authenticate". It also doesn't seem like there is a download limit anymore.

The bad news is that if I want GTR to work, I'll need to handle Google cookies, and those are super sensitive. At least, if someone took a snapshot of the cookie, it's only valid for a limited time. It's still a lot more access than I want. Even if I suggest making a small takeout to test, that's still more power than I want handed to the sample Cloudflare Workers I set up.

The good news is that if my extension could have access to the Google cookies, I could theoretically implement a proper queue where all the user has to do is middle-click or Cmd-click all the way down the list, and transfers can continuously get proper credentials on demand to do the mass transfer.

Anyway, I'm putting the cart before the horse. I first need to see if some new scheme could still work with Put Block From URL https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-from-url?tabs=microsoft-entra-id + Cloudflare Workers.

@fcfort

fcfort commented Jan 5, 2025

I don’t see any support for passing custom headers or cookies via Azure's Put Block From URL API.

I do believe it's possible for an extension to request access to get google.com cookies, but it might be hard to get approval for it on the Chrome extension store.

Another, more manual option would be to extract your Google auth cookies by hand and deploy a CF proxy worker with those cookies attached as a Secret. Then, to store the data, make requests to the Azure Put Block From URL API using URLs that point to the CF proxy worker. The CF proxy worker would translate those URLs to Google Takeout URLs and attach the cookies at that point, returning the Google Takeout data to Azure to persist in storage.
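
The translation step could look roughly like the sketch below. Everything here is an assumption for illustration: the `/download/` path layout on the proxy, the `toUpstreamRequest` name, and `cookieSecret` standing in for the value of a Worker Secret. Only the Takeout host name comes from the link in this issue.

```typescript
// Download host seen in the new-style Takeout link.
const TAKEOUT_HOST = "takeout-download.usercontent.google.com";

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
}

// Map a request that hit the proxy onto the matching Takeout URL and
// attach the stored cookies. `cookieSecret` is a hypothetical stand-in
// for a Worker Secret holding the Google cookie string.
function toUpstreamRequest(proxyUrl: string, cookieSecret: string, range?: string): UpstreamRequest {
  const incoming = new URL(proxyUrl);
  // Refuse anything that is not a download path, so the proxy cannot be
  // pointed at other Google endpoints with the same cookies.
  if (!incoming.pathname.startsWith("/download/")) {
    throw new Error("only /download/ paths are proxied");
  }
  // Keep path and query as-is, swap the host for Google's.
  const upstream = new URL(incoming.pathname + incoming.search, `https://${TAKEOUT_HOST}`);
  const headers: Record<string, string> = { Cookie: cookieSecret };
  // Pass a Range header through if the caller sent one.
  if (range) headers["Range"] = range;
  return { url: upstream.toString(), headers };
}
```

In a Worker, the `fetch` handler would call this with the incoming request URL and then `fetch(url, { headers })`, streaming the response body back to the caller.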

It's unfortunate that the Google auth cookies are so powerful but at least with that method the cookies only transit between the CF worker and Google's servers.

@nelsonjchen
Owner Author

nelsonjchen commented Jan 5, 2025

Well, the extension part was never in and is never going to be in the store. It's kinda too gasoline/fire anyway. We can do anything.

That "more manual" option is a bit "singleton and storey". I'm thinking of making the extension somehow encode the cookies into the Authorization header and call Azure Storage that way. Then, when Azure Storage calls back into Cloudflare Workers, the Worker would unwrap the Authorization header back into cookies. We can control the Authorization header that Azure Storage sends out, and while I still need to test it, ChatGPT seems to say that it's sometimes possible to stuff 8K of information into that.
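
The wrap/unwrap idea can be sketched as a pair of functions. This is a sketch under assumptions, not a tested design: the `Bearer` scheme token, the base64url encoding, and the helper names are all made up here. Azure's Put Block From URL does document an `x-ms-copy-source-authorization` request header, but whether Azure would forward an arbitrary value to a non-Azure source still needs testing.

```typescript
// Assumption: a scheme token is needed in front of the payload.
const SCHEME = "Bearer";

// Extension side: pack the cookie string into a header-safe value.
function wrapCookies(cookieHeader: string): string {
  const b64 = btoa(cookieHeader); // cookie text is assumed to be ASCII
  // base64url: avoid '+', '/' and '=' in the header value.
  const b64url = b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
  return `${SCHEME} ${b64url}`;
}

// Worker side: recover the cookie string from the incoming header.
function unwrapCookies(authorization: string): string {
  const [scheme, payload] = authorization.split(" ");
  if (scheme !== SCHEME || !payload) {
    throw new Error("unexpected Authorization header");
  }
  const b64 = payload.replace(/-/g, "+").replace(/_/g, "/");
  // Restore the padding that wrapCookies stripped.
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  return atob(padded);
}
```

Base64 grows the payload by about a third, so if the limit really is around 8K, that leaves room for roughly 6 KB of raw cookie text.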

Yeah, the cookie stuff is super powerful, so I'm also thinking of changing the language of the README to say "Try this out with the public proxy on a fresh or new Google Account to get familiar" and to more strongly recommend that people deploy their own proxies.

@nelsonjchen
Owner Author

I wonder if it's possible to nail down a specific cookie to use/pass through, to minimize the cookies that have to be handled. This page from Meta's privacy cookie thing has a surprising amount of description about Google's cookies.

https://engineering.fb.com/privacy/
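
If a minimal set can be identified, trimming the cookie string down to an allowlist is simple. A sketch, with the caveat that which cookies Takeout actually needs is unknown; `filterCookies` is a hypothetical helper and the names passed in the usage note are placeholders, not real Google cookie names.

```typescript
// Keep only the cookies whose names appear in `allow`.
// Input and output are in `Cookie` request-header form: "a=1; b=2".
function filterCookies(cookieHeader: string, allow: string[]): string {
  const allowed = new Set(allow);
  return cookieHeader
    .split(";")
    .map((pair) => pair.trim())
    .filter((pair) => {
      // Split on the first '=' only, since values may contain '='.
      const eq = pair.indexOf("=");
      const name = eq === -1 ? pair : pair.slice(0, eq);
      return name !== "" && allowed.has(name);
    })
    .join("; ");
}
```

For example, `filterCookies("KEEP_ME=1; DROP_ME=2", ["KEEP_ME"])` passes through only the first pair, so anything else in the browser's cookie jar never leaves the extension.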
