Files are not deleted from S3 (primary) #20333
Comments
Your issue is likely file locking. Disable it in the Nextcloud config and disable Redis file locking, then restart PHP-FPM and try reproducing this again. In my case, disabling file locking resolved all of my deletion-related issues; I just let the S3 backend handle the locking now.
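As a minimal sketch (not an exact recipe; adapt to your own setup), disabling transactional file locking in config/config.php looks like this. The Redis line is only an example of a locking backend entry you might need to remove if it is present:

```php
// Inside the $CONFIG array in config/config.php -- sketch only
'filelocking.enabled' => false,

// If Redis is configured as the locking backend, remove or comment it out:
// 'memcache.locking' => '\OC\Memcache\Redis',
```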
File locking is already disabled (see my config in the first post).
My bad, another foot-in-mouth moment. If you wait a while, are they removed from the backend? Sometimes with S3, deletion is delayed on the backend.
Thanks, but I don't think so: if I upload a 200 MB file and delete it, I can see it in real time in the S3 backend. And 2 hours have passed now and the files are still there (cron is running every 5 minutes).
Right, but with S3 in particular, if a file is removed but is locked on the S3 backend, it can take a while for it to process the deletions. 2 hours is a fairly long time, though. If you have your S3 provider run the garbage collection process, do the files stay or are they deleted?
With Amazon, there is also an option to retain locked objects for x days: https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html Can you verify that's not the case and that garbage collection doesn't resolve the issue?
I'm not using Amazon but Scaleway; they have Lifecycle Rules, but they are disabled by default.
I use the radosgw-admin gc process. Each host has its own rules for garbage collection. Do they have an end-user option for garbage collection or an API call to process it? If not, you would need to contact them directly and ask how often it runs and if they can run it now.
Edit: Scaleway runs it once a day on their cold S3 storage. I don't know about the other storage options - best to contact them about it.
I owe you an apology. I don't use Nextcloud for images but for file storage. You are correct that files are not being deleted properly when it comes to image previews.
Same here with Wasabi as the storage backend. I am having many problems with S3 currently. Maybe something more general is broken.
I can still confirm this with v19.0.5.
These are clearly not image previews, as there are big files in the bucket:
Summary: an empty instance, and a bucket with …
Yes, I can confirm this issue with Nextcloud 20.0.1 as well.
Now my MinIO bucket has 10 × 10 MB chunked files which should have been deleted.
I've tested it with AWS S3, to eliminate any compatibility issues with S3 'compatible' providers. Problem remains.
Check the bucket stats of this empty (no files) NC instance:
This is a serious problem for many reasons, especially GDPR when users request their files to be deleted and they aren't, beyond the S3 billing for objects we aren't using anymore. cc @nextcloud/server-triage can someone take a look at this? I believe this affects every ObjectStorage instance, but since files are named …
+1 for GDPR concerns. It has been almost a year since this issue was brought up and S3 is heavily used in enterprise environments - any ideas when this will be prioritized? Unfortunately, we cannot rely on using S3 for storage if we cannot show that files are completely removed.
I have a suggestion on this issue. I know it is hard to keep files (whether on a filesystem or in object storage) and the database in sync, not to mention handling the cache involved. I believe it is impossible to keep the database correct after a hardware-level failure, such as a simple power failure. So I suggest there should be a way to check the current file list and file information against the database. There is a command … I would also appreciate it if the developers took a look at the object server direct download function; an issue was opened at #14675. That function could save our servers non-essential bandwidth and load. I use object storage (MinIO) as primary storage because files can be backed up easily, I don't need to shut down the Nextcloud server for a long time, and I can separate the database and file server easily. I believe this setup will be widely used at the enterprise level, and I hope this suggestion can help Nextcloud with deployment and migration.
I am having the same issue running Nextcloud 21.0.2 with Digital Ocean Spaces (S3) as primary storage. In my case it seems that the issue only occurs when server-side encryption is activated, although I haven't tested much without encryption, so I can't be too conclusive. Also, I agree with @caretuse.
I have been running Nextcloud using S3 storage for over two years. I noticed my bucket was bloating early on. Digital Ocean shows my S3 was using 800 GB even though my only user had 218 GB of files including versioning. I've been watching this issue for a long time now hoping for a solution, but finally got around to looking into it myself. I compared the … and finally found that the bloat was from old incomplete uploads. You can list these using … Nextcloud could remove old multipart data if it kept track of it, but S3 has the ability to do so on its own. Using s3cmd you upload an XML rule to the S3 bucket:
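A minimal sketch of such a rule (not necessarily the commenter's exact XML; the one-day window and rule ID here are placeholders) that tells the bucket to abort incomplete multipart uploads on its own:

```xml
<!-- lifecycle.xml -- sketch: abort incomplete multipart uploads after 1 day -->
<LifecycleConfiguration>
  <Rule>
    <ID>abort-incomplete-multipart-uploads</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
```

It can be applied with something like `s3cmd setlifecycle lifecycle.xml s3://<your-bucket>`, and pending multipart uploads can be listed beforehand with `s3cmd multipart s3://<your-bucket>`.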
This rule will run once a day at midnight UTC according to what I found. After waiting a day, my nearly 800 incomplete uploads spanning over two years were gone and my S3 storage now sits at 220 GB as it should. This doesn't appear to be the solution to all of the issues in this thread, but hopefully it helps some. In my case the files marked as trash or versioning in the database were being removed correctly according to the rules I have in Nextcloud's config file. I have transactional file locking disabled and encryption is not enabled.
I use Nextcloud 22.0.1 and have the exact same problem with Scaleway S3. At this point I have about 35 GB used by my users, but the storage is filled with 74 GB. Edit: …
Looks like the original issue is fixed then. @acsfer can you still reproduce this on NC21.0.4 or NC22.1.1?
@szaimen can't help anymore here, we moved away from S3...
I snooped around the Nextcloud database and it seems that the issue is that objects uploaded to S3 are not committed to the db until the transfer to S3 is completed. If a transfer is interrupted, Nextcloud loses track of the object, since no record of ongoing transfers is kept. A potential fix could be to log ongoing transfers in the database and occasionally do a clean-up if something goes wrong. Until this is fixed, Nextcloud will continue to bloat the bucket, so I've hacked together a Python script that cleans up the S3 storage. It doesn't solve any of the open issues with using S3 as primary storage - it simply cleans up orphaned objects in the bucket, thereby bringing down the amount of storage used by Nextcloud. DISCLAIMER: Since the issue seems to be caused by the db not being updated until a transfer to S3 is complete, the script might delete objects that have successfully been transferred to S3 but have not yet been recorded in the database if it's run while a sync is in progress. Therefore you should not run this while a sync is in progress. I repeat: do not run this while a sync is in progress! I've run/tested this against my own setup (Minio + Postgres) and haven't encountered any issues so far. If you use any other combination of S3-compatible storage and database, you'll need to modify the code to your needs.
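Purely to illustrate the general approach (this is not the commenter's actual script), a stripped-down sketch assuming Postgres, the default `oc_` table prefix, and the standard `urn:oid:<fileid>` object naming; all connection details below are placeholders:

```python
# Sketch only: report (and optionally delete) bucket objects whose fileid has no
# row in oc_filecache. Do NOT run while any upload/sync is in progress.
import boto3
import psycopg2

BUCKET = "nextcloud"  # placeholder

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example:9000",  # placeholder endpoint
    aws_access_key_id="...",
    aws_secret_access_key="...",
)
db = psycopg2.connect(host="localhost", dbname="nextcloud",
                      user="nextcloud", password="...")

# All file ids Nextcloud still knows about
with db.cursor() as cur:
    cur.execute("SELECT fileid FROM oc_filecache")
    known_ids = {row[0] for row in cur}

# Walk the bucket and flag objects with no matching database row
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if not key.startswith("urn:oid:"):
            continue  # leave anything that doesn't follow the expected naming alone
        if int(key.rsplit(":", 1)[1]) not in known_ids:
            print(f"orphan: {key} ({obj['Size']} bytes)")
            # s3.delete_object(Bucket=BUCKET, Key=key)  # uncomment after reviewing the output
```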
Why reinvent the wheel? Look back at my post from July 4 about S3 lifecycle rules. Since then I have had zero issues with S3 storage bloating, from NC 20 through 23.
@Scandiravian made a good script to solve the database inconsistency issue, although I believe this should be implemented in … @jeffglancy and otherguy also made a good script to solve another issue, which is cleaning up pending multipart uploads in S3. But lazy as I am, I would choose …
I tested some scenarios:
I can't confirm the "files not shown in Nextcloud" scenario; it would require manipulating things at the database level, which goes beyond my interest. Does anyone have an environment to test this?
Hi @szaimen, we are currently running Nextcloud 25 and still experience this problem. On one instance, our S3 bucket shows 209 GB of data, while adding up the different users' quotas in NC itself comes to about 55 GB. This occurs on different instances, which were built with a custom Docker image.
I created a script (S3 -> local) once upon a time when I had trouble with S3, partially because I found a bug and feared it was S3-related (it wasn't; fixed that one: #34422 ;) ). Later on (partially by building on that migration script), I dared to try to migrate back to S3. "Reversing" that script was quite a challenge, but I got it working. While creating it I built in various "sanity checks", and I now run my "local -> S3" script every now and then to clean up my S3; barring a little hiccup every now and then, the script rarely needs to clean stuff up. A few weeks ago I decided to publish it on GitHub, take a look at: … PS: I have various users on my Nextcloud, totaling some 100+ GB of data.
I wrote a Python script to delete orphaned S3 objects (among other work-arounds for NC's lack of proper S3 support): https://github.com/aurelienpierre/clean-nextloud-s3
Is there already a real solution from Nextcloud? @aurelienpierre your script might help, but it's not ready for other S3 vendors like OVH Cloud: aurelienpierre/clean-nextloud-s3#2 Also, scanning 300k objects takes a lot of time and downloading them costs €€ ;)
I can still reproduce this in Nextcloud 29. Pre-upload:
Upload a ~5 GB file, abort at ~50%, and find:
@tsohst sorry, I don't know. I stopped using Nextcloud because of this error years ago.
Disclaimer: work in progress. Based on my review of this thread, it doesn't appear everyone has the same underlying cause (though the symptoms are somewhat similar). Since this issue has a fairly broad title, it's likely there is also some overlap with other open issues (I'll try to review these as time permits and sort some of them out). Here are the apparent underlying causes I've been able to identify from this issue:
Locking (and legal holds) got mentioned, but hasn't seemed to be a factor for anyone here. I'll also toss a couple of others into the list:
Some of these (but not all) could be addressed through documentation tweaks. Keep in mind this is a work-in-progress analysis. Here are notes on a couple of the biggies above.
Versioning: Different providers and object store platforms have different defaults. For example, Backblaze has versioning on by default. AWS has it off by default, but when it's turned on, versions of individual objects are apparently hidden by default in their Web UI in some places, so it can be easy to miss that versioning has been turned on through org policy. Solution: either turn off versioning or add lifecycle management rules on your S3 platform. Also, the …
Aborted multipart uploads: Maybe we can do better here, but it's going to take some work to figure that out. On the other hand, lifecycle rules can be made to handle this situation well (and cleanly) from the looks of it.
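For anyone who wants to check whether aborted multipart uploads are what is bloating their bucket before adding lifecycle rules, here is a hedged sketch using boto3 (bucket name and credentials are placeholders, not taken from this issue):

```python
# Sketch: list (and optionally abort) incomplete multipart uploads left in a bucket.
import boto3

BUCKET = "nextcloud"  # placeholder
s3 = boto3.client("s3")  # endpoint/credentials come from your environment

for page in s3.get_paginator("list_multipart_uploads").paginate(Bucket=BUCKET):
    for upload in page.get("Uploads", []):
        print(upload["Key"], upload["UploadId"], upload["Initiated"])
        # Manual cleanup (a lifecycle rule is the cleaner long-term fix):
        # s3.abort_multipart_upload(Bucket=BUCKET, Key=upload["Key"],
        #                           UploadId=upload["UploadId"])
```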
I experience this same issue but in different scenarios. New, clean install. During performance testing I uploaded files of increasing size and had some failures for unknown reasons. I deleted them completely, but when I went to check the bucket on Linode it still had hundreds of GBs of data, with 0 files in the single user's directory. I continued testing with even larger files (even up to 1 TB) and those seemed to upload fine and delete fine. It would be helpful if there was a cleanup command like the one requested in #48143. I'm running a Docker Compose install using MariaDB.
@0x4E4448 What types of data? If MPUs, use a lifecycle policy:
How would I know? I see individual files of exactly the sizes I was uploading (1 GB, 1.25 GB, 1.5 GB, etc.). The names are all urn:oid:####. I will look at lifecycle policies as well.
Lifecycle policies did seem to resolve most of it, but not all of it. I've still got some files (which do not exist in NC) that I know were uploaded by me (judging by size) but were not removed from object storage. The ability to reconcile the database with the files in object storage really does make sense, especially considering it exists for local storage.
Various operations are available from the command-line administrative utility. In addition to the various other proposed improvements, at minimum basic cleanup should be supported, through the command-line utility scanning a bucket and then reporting and removing objects identified as orphans. Generally, S3 backend storage seems valuable and popular, due to its many advantages in provisioning, operation, and economy compared to storing user files on the local filesystem; yet, so far, the reliability and flexibility of this support have not seemed to be among the major priorities for platform development.
Steps to reproduce
Expected behaviour
Trashbin should be emptied correctly
Actual behaviour
After some time, an error appears ("Error while empty trash"). Reloading the page shows no more files in either Files or the Trashbin.
But the files are still in the Object Storage; here are OBJECTS and SIZE:
Before these test operations (upload, delete, ...):
The following commands had been executed (after):
One user has reported that the interface shows he is "using" 1.9 GB of storage, but he has NO FILES or FOLDERS at all, either in Files or the Trashbin, on a production instance.
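For reference, a hedged sketch of how OBJECTS and SIZE figures like the ones mentioned above can be collected directly from the bucket (bucket name and endpoint are placeholders):

```python
# Sketch: count objects and total size in the Nextcloud bucket.
import boto3

BUCKET = "nextcloud"  # placeholder
s3 = boto3.client("s3", endpoint_url="https://s3.example.com")  # placeholder endpoint

objects, size = 0, 0
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        objects += 1
        size += obj["Size"]
print(f"OBJECTS: {objects}  SIZE: {size / 1024**2:.1f} MiB")
```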
Server configuration
Operating system: Ubuntu 18.04
Web server: Nginx 1.17
Database: MariaDB 10.4
PHP version: 7.3
Nextcloud version: (see Nextcloud admin page) 18.0.3
Updated from an older Nextcloud/ownCloud or fresh install: Fresh install
Where did you install Nextcloud from: Official sources
Signing status:
List of activated apps:
Nextcloud configuration:
Logs are completely empty (we have just fired up a test instance and tested this use case).
Similar to #17744