feat: Register Spark date_format function #11953

Open
wants to merge 4 commits into
base: main

Contributor

@PHILO-HE PHILO-HE commented Dec 24, 2024

This PR registers the existing FormatDateTimeFunction for Spark SQL to use.

@facebook-github-bot added the CLA Signed label Dec 24, 2024

netlify bot commented Dec 24, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 21d95d1
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/677f43d5864eef0008644073

@PHILO-HE changed the title from "misc: support date_format Spark function by simply reusing a Presto function" to "misc: Support date_format Spark function by simply reusing a Presto function" Dec 24, 2024
@PHILO-HE
Contributor Author

We only use one signature of FormatDateTimeFunction. The other one is only applicable to Presto, so I prefer keeping the whole implementation in the Presto folder.
@rui-mo, please take a look. Thanks!

Collaborator

@rui-mo rui-mo left a comment

Thanks.

velox/docs/functions/spark/datetime.rst (two review threads, both outdated and resolved)
Converts `timestamp` to a string in the format specified by `dateFormat`.

SELECT date_format('2020-01-29', 'yyyy'); -- '2020'
SELECT date_format('2024-05-30 08:00:00', 'yyyy-MM-dd'); -- '2024-05-30'
Collaborator

I wonder if the patterns of 'dateFormat' are the same between Presto and Spark. If not, we might need a separate implementation.

Contributor Author

@PHILO-HE PHILO-HE Dec 27, 2024

Yes, JodaDateTimeFormatter is used as expected. And it has been validated by Spark UTs.
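
For readers unfamiliar with these pattern strings, here is a minimal Python sketch of how a pattern such as 'yyyy-MM-dd' maps onto a timestamp. It is an illustration only, not Velox's JodaDateTimeFormatter: the function name `date_format_sketch` and the token table are hypothetical, and only a handful of pattern letters are handled.

```python
import re
from datetime import datetime

# Hypothetical token table for illustration; it covers only a few
# pattern letters and is not the table used by any real formatter.
_TOKENS = {
    "yyyy": "%Y",  # four-digit year
    "MM": "%m",    # two-digit month
    "dd": "%d",    # two-digit day of month
    "HH": "%H",    # two-digit hour (00-23)
    "mm": "%M",    # two-digit minute
    "ss": "%S",    # two-digit second
}


def date_format_sketch(ts: datetime, pattern: str) -> str:
    """Format ts using a small subset of 'yyyy-MM-dd'-style pattern tokens."""
    out = []
    # Split the pattern into runs of one repeated letter, or single
    # non-letter characters that are copied through as literals.
    for match in re.finditer(r"([A-Za-z])\1*|[^A-Za-z]", pattern):
        piece = match.group(0)
        if piece[0].isalpha():
            if piece not in _TOKENS:
                raise ValueError(f"unsupported pattern token: {piece!r}")
            out.append(ts.strftime(_TOKENS[piece]))
        else:
            out.append(piece)
    return "".join(out)
```

With this sketch, `date_format_sketch(datetime(2024, 5, 30, 8, 0, 0), "yyyy-MM-dd")` yields '2024-05-30', matching the second documentation example above. Whether Presto and Spark agree on every pattern letter is exactly the question raised in this thread, and the sketch does not answer it.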

// supported range.
// date_format and from_unixtime throws VeloxRuntimeError when the
// timestamp is out of the supported range.
"date_format",
Collaborator

The VeloxRuntimeError with the UNSUPPORTED_INPUT_UNCATCHABLE code is allowed in the fuzzer test. I wonder if we can throw it in 'from_unixtime' and 'date_format' to make the fuzzer test work. See details in 8a6ab15.

Contributor Author

@rui-mo, I just removed this function from the fuzzer blacklist. Please review this PR again.
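
The idea discussed in this thread, that a fuzzer can tolerate one specific error code instead of skipping the function entirely, can be sketched in Python as below. This is a toy model under stated assumptions, not the Velox fuzzer: `EngineRuntimeError`, `toy_date_format`, `count_failures`, and `MAX_SUPPORTED_SECONDS` are all hypothetical names, and only the error-code string is taken from the comment above.

```python
class EngineRuntimeError(Exception):
    """Hypothetical stand-in for a runtime error that carries an error code."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code


# Error codes the toy fuzzer tolerates; the string comes from the review
# comment, the set itself is an assumption made for this sketch.
ALLOWED_ERROR_CODES = frozenset({"UNSUPPORTED_INPUT_UNCATCHABLE"})

# Hypothetical supported range, chosen only to make the example concrete.
MAX_SUPPORTED_SECONDS = 10**11


def toy_date_format(seconds: int) -> str:
    """Toy function under test: rejects out-of-range input with an allowed code."""
    if abs(seconds) > MAX_SUPPORTED_SECONDS:
        raise EngineRuntimeError(
            "UNSUPPORTED_INPUT_UNCATCHABLE", "timestamp out of supported range"
        )
    return str(seconds)


def count_failures(func, inputs):
    """Run func on each input; return (iterations, failures).

    Errors whose code is in ALLOWED_ERROR_CODES are tolerated and not
    counted as failures; any other error code counts as a failure.
    """
    iterations = 0
    failures = 0
    for value in inputs:
        iterations += 1
        try:
            func(value)
        except EngineRuntimeError as err:
            if err.code not in ALLOWED_ERROR_CODES:
                failures += 1
    return iterations, failures
```

In this toy model, a function that reports out-of-range input with the allowed code can stay in the fuzzer's function list, because such errors no longer register as failures.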

@rui-mo
Collaborator

rui-mo commented Jan 9, 2025

./velox/expression/fuzzer/spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 120 --minloglevel=0 --stderrthreshold=2 --only date_format
E0110 00:29:32.479840 4036313 ExpressionFuzzerVerifier.cpp:455] Total iterations: 225914
E0110 00:29:32.480005 4036313 ExpressionFuzzerVerifier.cpp:456] Total failed: 135471
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[ PASSED ] 0 tests.

@rui-mo changed the title from "misc: Support date_format Spark function by simply reusing a Presto function" to "feat: Register Spark date_format function" Jan 9, 2025
@rui-mo added the ready-to-merge label Jan 9, 2025
@facebook-github-bot
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


3 participants