Files are not deleted from S3 (primary) #20333
Comments
Your issue is likely file locking. Disable it in the Nextcloud config and disable Redis file locking, then restart PHP-FPM and try reproducing this again. In my case, disabling file locking resolved all of my deletion-related issues; I just let the S3 backend handle the locking now.
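As a minimal sketch (not an exact recipe; adapt to your own setup), disabling transactional file locking in config/config.php looks like this. The Redis line is only an example of a locking backend entry you might need to remove if it is present:

```php
// Inside the $CONFIG array in config/config.php -- sketch only
'filelocking.enabled' => false,

// If Redis is configured as the locking backend, remove or comment it out:
// 'memcache.locking' => '\OC\Memcache\Redis',
```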
File locking is already disabled (see my config in the first post).
My bad, another foot-in-mouth moment. If you wait a while, are they removed from the backend? Sometimes with S3, deletion is delayed on the backend.
Thanks, but I don't think so: if I upload a 200 MB file and delete it, I can see it in real time in the S3 backend. And 2 hours have passed now and the files are still there (cron is running every 5 minutes).
Right, but with S3 in particular, if a file is removed but is locked on the S3 backend, it can take a while for it to process the deletions. 2 hours is a fairly long time, though. If you have your S3 provider run the garbage collection process, do the files stay or are they deleted?
With Amazon, there is also an option to retain locked objects for x days: https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html Can you verify that's not the case and that garbage collection doesn't resolve the issue?
I'm not using Amazon but Scaleway; they have Lifecycle Rules, but they are disabled by default.
I use the radosgw-admin gc process. Each host has its own rules for garbage collection. Do they have an end-user option for garbage collection or an API call to process it? If not, you would need to contact them directly and ask how often it runs and if they can run it now.
Edit: Scaleway runs it once a day on their cold S3 storage. I don't know about the other storage options - best to contact them about it.
I owe you an apology. I don't use Nextcloud for images but for file storage. You are correct that files are not being deleted properly when it comes to image previews.
Same here with Wasabi as the storage backend. I am having many problems with S3 currently. Maybe something more general is broken.
I can still confirm this with v19.0.5.
These are clearly not image previews, as there are big files in the bucket:
Summary: an empty instance, and a bucket with …
Yes, I can confirm this issue with Nextcloud 20.0.1 as well.
Now my MinIO bucket has 10 × 10 MB chunked files which should have been deleted.
I've tested it with AWS S3, to eliminate any compatibility issues with S3 'compatible' providers. Problem remains.
Check the bucket stats of this empty (no files) NC instance:
This is a serious problem for many reasons, especially GDPR when users request their files to be deleted and they aren't, beyond the S3 billing for objects we aren't using anymore. cc @nextcloud/server-triage can someone take a look at this? I believe this affects every ObjectStorage instance, but since files are named …
+1 for GDPR concerns. It has been almost a year since this issue was brought up and S3 is heavily used in enterprise environments - any ideas when this will be prioritized? Unfortunately, we cannot rely on using S3 for storage if we cannot show that files are completely removed.
I have a suggestion on this issue. I know it is hard to keep files (whether on a filesystem or in object storage) and the database in sync, not to mention handling the cache involved. I believe it is impossible to keep the database correct after a hardware-level failure, such as a simple power failure. So I suggest there should be a way to check the current file list and file information against the database. There is a command … I would also appreciate it if the developers took a look at the object server direct download function; an issue was opened at #14675. That function could save our servers non-essential bandwidth and load. I use object storage (MinIO) as primary storage because files can be backed up easily, I don't need to shut down the Nextcloud server for a long time, and I can separate the database and file server easily. I believe this setup will be widely used at the enterprise level, and I hope this suggestion can help Nextcloud with deployment and migration.
I am having the same issue running Nextcloud 21.0.2 with Digital Ocean Spaces (S3) as primary storage. In my case it seems that the issue only occurs when server-side encryption is activated, although I haven't tested much without encryption, so I can't be too conclusive. Also, I agree with @caretuse.
I have been running Nextcloud using S3 storage for over two years. I noticed my bucket was bloating early on. Digital Ocean shows my S3 was using 800 GB even though my only user had 218 GB of files including versioning. I've been watching this issue for a long time now hoping for a solution, but finally got around to looking into it myself. I compared the … and finally found that the bloat was from old incomplete uploads. You can list these using … Nextcloud could remove old multipart data if it kept track of it, but S3 has the ability to do so on its own. Using s3cmd you upload an XML rule to the S3 bucket:
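A minimal sketch of such a rule (not necessarily the commenter's exact XML; the one-day window and rule ID here are placeholders) that tells the bucket to abort incomplete multipart uploads on its own:

```xml
<!-- lifecycle.xml -- sketch: abort incomplete multipart uploads after 1 day -->
<LifecycleConfiguration>
  <Rule>
    <ID>abort-incomplete-multipart-uploads</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
```

It can be applied with something like `s3cmd setlifecycle lifecycle.xml s3://<your-bucket>`, and pending multipart uploads can be listed beforehand with `s3cmd multipart s3://<your-bucket>`.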
This rule will run once a day at midnight UTC according to what I found. After waiting a day, my nearly 800 incomplete uploads spanning over two years were gone and my S3 storage now sits at 220 GB as it should. This doesn't appear to be the solution to all of the issues in this thread, but hopefully it helps some. In my case the files marked as trash or versioning in the database were being removed correctly according to the rules I have in Nextcloud's config file. I have transactional file locking disabled and encryption is not enabled.
I use Nextcloud 22.0.1 and have the exact same problem with Scaleway S3. At this point I have about 35 GB used by my users, but the storage is filled with 74 GB. Edit: …
Looks like the original issue is fixed then. @acsfer can you still reproduce this on NC21.0.4 or NC22.1.1?
@szaimen can't help anymore here, we moved away from S3...
I snooped around the Nextcloud database and it seems that the issue is that objects uploaded to S3 are not committed to the db until the transfer to S3 is completed. If a transfer is interrupted, Nextcloud loses track of the object, since no record of ongoing transfers is kept. A potential fix could be to log ongoing transfers in the database and occasionally do a clean-up if something goes wrong. Until this is fixed, Nextcloud will continue to bloat the bucket, so I've hacked together a Python script that cleans up the S3 storage. It doesn't solve any of the open issues with using S3 as primary storage - it simply cleans up orphaned objects in the bucket, thereby bringing down the amount of storage used by Nextcloud. DISCLAIMER: Since the issue seems to be caused by the db not being updated until a transfer to S3 is complete, the script might delete objects that have successfully been transferred to S3 but have not yet been recorded in the database if it's run while a sync is in progress. Therefore you should not run this while a sync is in progress. I repeat: do not run this while a sync is in progress! I've run/tested this against my own setup (Minio + Postgres) and haven't encountered any issues so far. If you use any other combination of S3-compatible storage and database, you'll need to modify the code to your needs.
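Purely to illustrate the general approach (this is not the commenter's actual script), a stripped-down sketch assuming Postgres, the default `oc_` table prefix, and the standard `urn:oid:<fileid>` object naming; all connection details below are placeholders:

```python
# Sketch only: report (and optionally delete) bucket objects whose fileid has no
# row in oc_filecache. Do NOT run while any upload/sync is in progress.
import boto3
import psycopg2

BUCKET = "nextcloud"  # placeholder

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example:9000",  # placeholder endpoint
    aws_access_key_id="...",
    aws_secret_access_key="...",
)
db = psycopg2.connect(host="localhost", dbname="nextcloud",
                      user="nextcloud", password="...")

# All file ids Nextcloud still knows about
with db.cursor() as cur:
    cur.execute("SELECT fileid FROM oc_filecache")
    known_ids = {row[0] for row in cur}

# Walk the bucket and flag objects with no matching database row
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if not key.startswith("urn:oid:"):
            continue  # leave anything that doesn't follow the expected naming alone
        if int(key.rsplit(":", 1)[1]) not in known_ids:
            print(f"orphan: {key} ({obj['Size']} bytes)")
            # s3.delete_object(Bucket=BUCKET, Key=key)  # uncomment after reviewing the output
```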
Why reinvent the wheel? Look back at my post from July 4 about S3 lifecycle rules. Since then I have had zero issues with S3 storage bloating, from NC 20 through 23.
@Scandiravian made a good script to solve the database inconsistency issue, although I believe this should be implemented in … @jeffglancy and otherguy also made a good script to solve another issue, which is cleaning up pending multipart uploads in S3. But lazy as I am, I would choose …
I tested some scenarios:
I can't confirm the "files not shown in Nextcloud" scenario; it would require manipulating things at the database level, which goes beyond my interest. Does anyone have an environment to test this?
Hi @szaimen, we are currently running Nextcloud 25 and still experience this problem. On one instance, our S3 bucket shows 209 GB of data, while adding up the different users' quotas in NC itself comes to about 55 GB. This occurs on different instances, which were built with a custom Docker image.
I created a script (S3 -> local) once upon a time when I had trouble with S3, partially because I found a bug and feared it was S3-related (it wasn't; fixed that one: #34422 ;) ). Later on (partially by building on that migration script), I dared to try to migrate back to S3. "Reversing" that script was quite a challenge, but I got it working. While creating it I built in various "sanity checks", and I now run my "local -> S3" script every now and then to clean up my S3; barring a little hiccup every now and then, the script rarely needs to clean stuff up. A few weeks ago I decided to publish it on GitHub, take a look at: … PS: I have various users on my Nextcloud, totaling some 100+ GB of data.
I wrote a Python script to delete orphaned S3 objects (among other work-arounds for NC's lack of proper S3 support): https://github.com/aurelienpierre/clean-nextloud-s3
Is there already a real solution from Nextcloud? @aurelienpierre your script might help, but it's not ready for other S3 vendors like OVH Cloud: aurelienpierre/clean-nextloud-s3#2 Also, scanning 300k objects takes a lot of time and downloading them costs €€ ;)
I can still reproduce this in Nextcloud 29. Pre-upload:
Upload a ~5 GB file, abort at ~50%, and find:
@tsohst sorry, I don't know. I stopped using Nextcloud because of this error years ago.
Disclaimer: work in progress. Based on my review of this thread, it doesn't appear everyone has the same underlying cause (though the symptoms are somewhat similar). Since this issue has a fairly broad title, it's likely there is also some overlap with other open issues (I'll try to review these as time permits and sort some of them out). Here are the apparent underlying causes I've been able to identify from this issue:
Locking (and legal holds) got mentioned, but hasn't seemed to be a factor for anyone here. I'll also toss a couple of others into the list:
Some of these (but not all) could be addressed through documentation tweaks. Keep in mind this is a work-in-progress analysis. Here are notes on a couple of the biggies above.
Versioning: Different providers and object store platforms have different defaults. For example, Backblaze has versioning on by default. AWS has it off by default, but when it's turned on, versions of individual objects are apparently hidden by default in their Web UI in some places, so it can be easy to miss that versioning has been turned on through org policy. Solution: either turn off versioning or add lifecycle management rules on your S3 platform. Also, the …
Aborted multipart uploads: Maybe we can do better here, but it's going to take some work to figure that out. On the other hand, lifecycle rules can be made to handle this situation well (and cleanly) from the looks of it.
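For anyone who wants to check whether aborted multipart uploads are what is bloating their bucket before adding lifecycle rules, here is a hedged sketch using boto3 (bucket name and credentials are placeholders, not taken from this issue):

```python
# Sketch: list (and optionally abort) incomplete multipart uploads left in a bucket.
import boto3

BUCKET = "nextcloud"  # placeholder
s3 = boto3.client("s3")  # endpoint/credentials come from your environment

for page in s3.get_paginator("list_multipart_uploads").paginate(Bucket=BUCKET):
    for upload in page.get("Uploads", []):
        print(upload["Key"], upload["UploadId"], upload["Initiated"])
        # Manual cleanup (a lifecycle rule is the cleaner long-term fix):
        # s3.abort_multipart_upload(Bucket=BUCKET, Key=upload["Key"],
        #                           UploadId=upload["UploadId"])
```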
I experience this same issue but in different scenarios. New, clean install. During performance testing I uploaded files of increasing size and had some failures for unknown reasons. I deleted them completely, but when I went to check the bucket on Linode it still had hundreds of GBs of data, with 0 files in the single user's directory. I continued testing with even larger files (even up to 1 TB) and those seemed to upload fine and delete fine. It would be helpful if there was a cleanup command like the one requested in #48143. I'm running a Docker Compose install using MariaDB.
@0x4E4448 What types of data? If MPUs, use a lifecycle policy:
How would I know? I see individual files of exactly the sizes I was uploading (1 GB, 1.25 GB, 1.5 GB, etc.). The names are all urn:oid:####. I will look at lifecycle policies as well.
Lifecycle policies did seem to resolve most of it, but not all of it. I've still got some files (which do not exist in NC) that I know were uploaded by me (judging by size) but were not removed from object storage. The ability to reconcile the database with the files in object storage really does make sense, especially considering it exists for local storage.
Various operations are available from the command-line administrative utility. In addition to the various other proposed improvements, at minimum basic cleanup should be supported, through the command-line utility scanning a bucket and then reporting and removing objects identified as orphans. Generally, S3 backend storage seems valuable and popular, due to its many advantages in provisioning, operation, and economy compared to storing user files on the local filesystem; yet, so far, the reliability and flexibility of this support have not seemed to be among the major priorities for platform development.
Steps to reproduce
Expected behaviour
Trashbin should be emptied correctly
Actual behaviour
After some time, an error appears ("Error while empty trash"). Reloading the page shows no more files in either Files or the Trashbin.
But the files are still in the Object Storage; here are OBJECTS and SIZE:
Before these test operations (upload, delete, ...):
The following commands had been executed (after):
One user has reported that the interface shows he is "using" 1.9 GB of storage, but he has NO FILES or FOLDERS at all, either in Files or the Trashbin, on a production instance.
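For reference, a hedged sketch of how OBJECTS and SIZE figures like the ones mentioned above can be collected directly from the bucket (bucket name and endpoint are placeholders):

```python
# Sketch: count objects and total size in the Nextcloud bucket.
import boto3

BUCKET = "nextcloud"  # placeholder
s3 = boto3.client("s3", endpoint_url="https://s3.example.com")  # placeholder endpoint

objects, size = 0, 0
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        objects += 1
        size += obj["Size"]
print(f"OBJECTS: {objects}  SIZE: {size / 1024**2:.1f} MiB")
```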
Server configuration
Operating system: Ubuntu 18.04
Web server: Nginx 1.17
Database: MariaDB 10.4
PHP version: 7.3
Nextcloud version: (see Nextcloud admin page) 18.0.3
Updated from an older Nextcloud/ownCloud or fresh install: Fresh install
Where did you install Nextcloud from: Official sources
Signing status:
List of activated apps:
Nextcloud configuration:
Logs are completely empty (we have just fired up a test instance and tested this use case).
Similar to #17744