Spark 3.5: Support RewriteManifestsProcedure with a target size parameter #11959

Open: wants to merge 2 commits into base: main

majian1998
Current Limitations: The rewrite manifests procedure currently accepts few parameters. To set or change the target manifest size, you must go through the table config, which limits flexibility. We would like rewriting and setting the target size to be a single operation, with the size applying only to that particular rewrite. This matters during tuning, when we test several manifest sizes. By contrast, when rewriting data files we can pass the target size directly without altering table properties, and other files remain unaffected by it.
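To make the limitation concrete, here is a minimal sketch of the table-config route, written as a Python helper that only builds the Spark SQL statements. It assumes the manifest target size is controlled by the Iceberg table property `commit.manifest.target-size-bytes`; the catalog name, table name, and the helper itself are placeholders for illustration, not code from this PR.

```python
# Sketch only: builds the SQL needed to try one manifest size through table config.
# The catalog/table names passed in are placeholders; nothing is executed here.

MANIFEST_TARGET_PROP = "commit.manifest.target-size-bytes"


def rewrite_with_table_property(catalog, table, target_bytes, restore_bytes=None):
    """Return the statements for a rewrite at `target_bytes`, then undo the change."""
    full_name = f"{catalog}.{table}"
    statements = [
        # Step 1: change table-wide config, which affects every later commit too.
        f"ALTER TABLE {full_name} SET TBLPROPERTIES "
        f"('{MANIFEST_TARGET_PROP}' = '{target_bytes}')",
        # Step 2: run the rewrite, which picks the size up from the table property.
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
    ]
    # Step 3: put the table back the way it was.
    if restore_bytes is None:
        statements.append(
            f"ALTER TABLE {full_name} UNSET TBLPROPERTIES ('{MANIFEST_TARGET_PROP}')"
        )
    else:
        statements.append(
            f"ALTER TABLE {full_name} SET TBLPROPERTIES "
            f"('{MANIFEST_TARGET_PROP}' = '{restore_bytes}')"
        )
    return statements


if __name__ == "__main__":
    # Trying a 16 MiB target on a placeholder table takes three statements.
    for sql in rewrite_with_table_property("my_catalog", "db.sample", 16 * 1024 * 1024):
        print(sql)
```

Each trial size costs three statements, and any commit that lands between the first and the last also sees the temporary value, which is the side effect described above.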

Consideration for Options: I initially considered using an options map to specify parameters, but since rewriting manifests needs so few, adding a single parameter for the target size seems sufficient. This approach is open to further discussion.
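As a rough illustration of the single-parameter approach, the helper below builds the `CALL` statement with and without a target size. The parameter name `target_manifest_size_bytes` is hypothetical, chosen only for this sketch; the actual name is whatever the PR's code defines. The catalog and table names are placeholders as well.

```python
# Sketch only: the default `param_name` is a HYPOTHETICAL argument name used for
# illustration; check the PR diff for the real one before relying on it.


def rewrite_manifests_call(catalog, table, target_size_bytes=None,
                           param_name="target_manifest_size_bytes"):
    """Build a rewrite_manifests CALL, optionally with a per-call target size."""
    args = [f"table => '{table}'"]
    if target_size_bytes is not None:
        # The size travels with this one call; table properties are not touched.
        args.append(f"{param_name} => {target_size_bytes}")
    joined = ", ".join(args)
    return f"CALL {catalog}.system.rewrite_manifests({joined})"


if __name__ == "__main__":
    # Tuning loop: one statement per candidate size, no table config changes.
    for size_mib in (4, 8, 16):
        print(rewrite_manifests_call("my_catalog", "db.sample", size_mib * 1024 * 1024))
```

Omitting `target_size_bytes` produces the existing form of the call, so under this sketch the new parameter would be purely additive.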
