Could not get the train json by gsutil #1
Comments
Exactly what url are you trying to retrieve? Are you authenticated on gcloud?
You are trying to access a non-open dataset. Where was this linked from?
The link is from notram/guides/configure_flax.md, Line 247 in 54aeb6b.
I want to pretrain RoBERTa large on the corpus. If I cannot get the json, where can I get the original corpus?
Sorry the link is
Sorry. There is an internal link in this guide. You should replace it with whatever dataset you have available. One alternative is of course the NCC (which was released after this tutorial was written). There are several ways of training on this dataset. Assuming you are using Flax (since you are following the tutorial), a simple way is to specify dataset_name
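As a rough illustration of the dataset_name approach, here is a minimal sketch that builds the training command with a dataset name instead of the internal json file. It assumes the guide drives the Hugging Face Flax example script run_mlm_flax.py, which accepts --dataset_name, --dataset_config_name and --output_dir. The helper build_mlm_command, the dataset id "your-org/your-dataset" and the output directory are hypothetical placeholders, so substitute the dataset you actually have access to.

```python
import shlex


def build_mlm_command(dataset_name, output_dir,
                      script="run_mlm_flax.py",
                      dataset_config_name=None):
    """Build the argument list for a masked-LM training run.

    Hypothetical helper: it only assembles the command line. The script
    name is an assumption about what the guide uses.
    """
    cmd = [
        "python", script,
        # Load the corpus by name rather than from a private gs:// json file.
        "--dataset_name", dataset_name,
        "--output_dir", output_dir,
    ]
    if dataset_config_name is not None:
        # Optional sub-configuration of the dataset, if it defines any.
        cmd += ["--dataset_config_name", dataset_config_name]
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace with your own dataset id and directory.
    cmd = build_mlm_command("your-org/your-dataset", "./roberta-large-out")
    print(shlex.join(cmd))
```

The remaining arguments from the guide (model config, tokenizer, batch size and so on) would be appended to the same list unchanged. Only the data source differs.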
Early next year, we will also place the NCC in an open gcloud bucket.
The error message is :
does not have storage.objects.list access to the Google Cloud Storage bucket.