Remove existing Asset Manager asset files from NFS #296
Comments
Note: We should stop the nightly Duplicity backup of the Asset Manager assets before doing this. |
I've rebased alphagov/govuk-puppet#6768 against |
Here's a first attempt at a plan:
Preparatory steps
Integration
Staging
Production
Note: I think it's worth running the Rake task in each environment and not just in production in order to avoid the nightly sync having to delete all the Asset Manager asset files, because I think it might make that job take a long time. However, an alternative approach would be to only run the Rake task in production and allow the environment syncing to delete the asset files in staging & integration. |
I ran the Rake task on integration at 10:55 today (see output below). It took about 16 mins to queue ~600K jobs.
|
It took ~1h30m to process all the jobs and there were no errors/retries. The CPU usage increased during this time, but not above the critical level. The asset master/slave disk comparison alert went critical for some of the time, but all of the files were deleted within 20 mins and the slaves had caught up within 50 mins. |
I ran the following commands on
The figures we recorded on 05 Jan were:
Thus we see that disk usage has dropped by 43,262,172K - 2,466,360K = 40,795,812K (i.e. 38.9GB), leaving 2,466,360K (i.e. 2.4GB) in use. The following suggests that most of the disk space is being used in the
This makes sense from the point of view that the
Running the following command demonstrates that there are no files (only directories) under the
And the following command shows that there are ~600K empty directories under the
I think it must be the directories which are taking up the disk space. |
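The disk-usage arithmetic quoted above can be double-checked with a few lines of Python. This is only a sketch: it assumes the "K" figures are 1024-byte blocks and that "GB" means 1024 × 1024 of those blocks, which is what the 38.9GB figure implies. The variable names are made up for illustration.

```python
# Assumption: "K" means KiB (1024 bytes) and "GB" means GiB.
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib: int) -> float:
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB


before_kib = 43_262_172  # earlier figure quoted in the comment above
after_kib = 2_466_360    # figure after the Rake task had run
drop_kib = before_kib - after_kib

print(f"drop:      {drop_kib:,}K = {kib_to_gib(drop_kib):.1f}GB")
print(f"remaining: {after_kib:,}K = {kib_to_gib(after_kib):.1f}GB")
```

With the figures above this reproduces the 40,795,812K (38.9GB) drop and the roughly 2.4GB that remains.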
I've created a new issue to capture the idea of clearing up the empty directories, because I don't think this is urgent. |
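For reference, here is a minimal Python sketch of how the empty directories could be counted and, optionally, removed. It is not the approach taken in the follow-up issue, which isn't described here; the function names and the `dry_run` flag are hypothetical. It deliberately never removes the root directory itself, since the app still needs that directory to exist.

```python
import os


def find_prunable_dirs(root: str) -> list[str]:
    """Return directories under root that contain no files at any depth.

    The list is ordered deepest first, so each entry can be passed to
    os.rmdir() in order. The root directory itself is never included.
    """
    prunable: set[str] = set()
    ordered: list[str] = []
    # Walk bottom-up so children are classified before their parents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # always keep the root directory
        children = [os.path.join(dirpath, name) for name in dirnames]
        # Symlinked directories are never walked, so they are never in
        # `prunable` and their parent is conservatively kept.
        if not filenames and all(child in prunable for child in children):
            prunable.add(dirpath)
            ordered.append(dirpath)
    return ordered


def prune_empty_dirs(root: str, dry_run: bool = True) -> int:
    """Count (and, unless dry_run, remove) the prunable directories."""
    dirs = find_prunable_dirs(root)
    if not dry_run:
        for path in dirs:
            os.rmdir(path)  # only succeeds on empty directories
    return len(dirs)
```

Running `prune_empty_dirs(path)` with the default `dry_run=True` only reports the count, which is a cheap way to confirm the ~600K figure before deleting anything.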
The Rake task was kicked off in production at 15:22 today and 600845 jobs were queued:
|
@rubenarakelyan ran the Rake task on production at 15:22 today (see output below). It took about 9 mins to queue ~600K jobs.
|
It took ~25 mins to process all the jobs and there were no errors/retries. The CPU usage increased during this time, but not above the critical level. The asset master/slave disk comparison alert went critical for some of the time, but all of the files were deleted within 6 mins. |
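To put the integration and production runs side by side, the rough throughput can be worked out from the timings reported in this thread. This is a back-of-the-envelope sketch: the integration job count is the approximate "~600K", the durations are the rounded figures from the comments, and the dictionary layout is purely illustrative.

```python
def jobs_per_second(jobs: int, minutes: float) -> float:
    """Average rate for a batch of jobs completed in the given time."""
    return jobs / (minutes * 60)


# Figures taken from the comments in this thread; all are approximate
# except the production job count (600845 jobs were queued).
runs = {
    "integration": {"jobs": 600_000, "queue_min": 16, "process_min": 90},
    "production": {"jobs": 600_845, "queue_min": 9, "process_min": 25},
}

for env, run in runs.items():
    queue_rate = jobs_per_second(run["jobs"], run["queue_min"])
    process_rate = jobs_per_second(run["jobs"], run["process_min"])
    print(f"{env}: queued ~{queue_rate:.0f} jobs/s, processed ~{process_rate:.0f} jobs/s")
```

On these numbers production processed the jobs between three and four times faster than integration.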
After discussion with @chrisroos, I took a different approach to that outlined in this earlier comment:
Integration
Staging
Production
Same day
Next day
I have completed all of the above steps except the two "Next day" ones, which I plan to do tomorrow. |
alphagov/govuk-puppet#7016 and alphagov/govuk-puppet#7019 have both been merged and so I'm happy to close this issue. |
Note that we need to implement a solution for new assets before doing this.
Once we have enabled proxying to S3 via Nginx for staging & integration, we should be able to delete all Asset Manager asset files from NFS on all environments and rely entirely on the assets being in the relevant S3 bucket.
/mnt/uploads/whitehall as opposed to /mnt/uploads/asset-manager.
/mnt/uploads/asset-manager directory, because the Asset Manager app will still need to store files there until they have been virus scanned and uploaded to S3.