Implement recursive folder support for S3 bucket syncs #284

Open
alanking opened this issue Sep 30, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@alanking
Collaborator

Currently, every S3 bucket sync treats the entire bucket as one flat directory. That matches how S3 buckets actually work, but treating "/" characters in object names as delimiters between "sub-folders" could massively improve performance. The Minio.list_objects call in the S3 bucket task specifies recursive=True:

itr = client.list_objects(bucket_name, prefix=prefix, recursive=True)

The recursive argument should probably be False, but switching it would require a lot of other changes.
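As a rough illustration of what a non-recursive traversal might look like, here is a minimal sketch. It assumes that list_objects with recursive=False returns the entries directly under a prefix and marks common prefixes ("sub-folders") with is_dir set, which is how the MinIO Python client is documented to behave. The name walk_bucket is hypothetical and is not part of the existing task code.

```python
def walk_bucket(client, bucket_name, prefix=""):
    """Yield (folder_prefix, object_names) one "folder" level at a time.

    `client` is expected to behave like minio.Minio: list_objects(...,
    recursive=False) yields entries with `object_name` and `is_dir`.
    This is a sketch, not the sync task's actual implementation.
    """
    pending = [prefix]  # prefixes still to be listed
    while pending:
        current = pending.pop()
        names = []
        for entry in client.list_objects(
            bucket_name, prefix=current, recursive=False
        ):
            if entry.is_dir:
                # A common prefix such as "photos/2024/": descend into it later.
                pending.append(entry.object_name)
            else:
                names.append(entry.object_name)
        # Only the object names of a single level are held at this point.
        yield current, names
```

Each yielded pair could then be handed to whatever code creates or updates the matching collection, so no more than one level of the bucket needs to be in memory at a time.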

Additionally, this would greatly improve the potential implementation of #282. As it stands, that implementation requires a query that returns all of the data objects under the target collection. This means holding the entire S3 bucket listing in memory (possibly; it depends on the implementation of Minio.list_objects) along with the entire contents of the target collection, which could be very large.
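To make the memory argument concrete, here is a sketch of comparing a single "folder" level of the bucket against the matching collection. It assumes the comparison is a plain set difference on names, which may not be what #282 actually needs. Both compare_level and list_collection_names are hypothetical: the latter stands in for whatever catalog query returns the data object names directly under one collection.

```python
def compare_level(client, bucket_name, prefix, list_collection_names):
    """Compare one bucket "folder" with the matching collection.

    Returns (only_in_bucket, only_in_collection) as sets of names relative
    to `prefix`. `list_collection_names` is a hypothetical callable that
    takes the prefix and returns the names in the corresponding collection.
    """
    in_bucket = set()
    for entry in client.list_objects(
        bucket_name, prefix=prefix, recursive=False
    ):
        if not entry.is_dir:
            # Strip the prefix so names are comparable with collection entries.
            in_bucket.add(entry.object_name[len(prefix):])
    in_collection = set(list_collection_names(prefix))
    return in_bucket - in_collection, in_collection - in_bucket
```

With this shape, only the names under one prefix and one collection are held at once, rather than the whole bucket and the whole collection tree.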

@alanking alanking added the enhancement New feature or request label Sep 30, 2024
@alanking alanking modified the milestone: 0.6.0 Sep 30, 2024