Skip to content

Commit

Permalink
Presto's from_unixtime with TSwTZ can be 1 millisecond off in Velox c…
Browse files Browse the repository at this point in the history
…ompared to Presto Java (#11426)

Summary:
Pull Request resolved: #11426

from_unixtime with TSwTZ was written to reuse the logic used to implement the function for
Timestamps. That is to say, it takes subtracts the seconds from the unixtime, multiplies the
remainder by 1000, and rounds to get the integer number of milliseconds. The seconds and
milliseconds are later recombined to form the unix timestamp portion of the TSwTZ.

In Presto Java, for TSwTZ they simply multiply the unix timestamp by 1000 and round to get the unix
timestamp portion of the TSwTZ.  Due to the quirks of floating points, this means Presto Java and
Velox can get slightly different answers.

I modified Velox to reproduce the Presto Java logic for TSwTZ and confirmed this gives consistent
results.

Reviewed By: kgpai

Differential Revision: D65434542

fbshipit-source-id: bfe434fe58103557dab72ff7a9309511460a1ae4
  • Loading branch information
Kevin Wilfong authored and facebook-github-bot committed Nov 5, 2024
1 parent f4a1223 commit 2ade7f7
Show file tree
Hide file tree
Showing 3 changed files with 34 additions and 4 deletions.
8 changes: 4 additions & 4 deletions velox/functions/prestosql/DateTimeFunctions.h
Original file line number Diff line number Diff line change
Expand Up @@ -70,10 +70,10 @@ struct FromUnixtimeFunction {
FOLLY_ALWAYS_INLINE void call(
out_type<TimestampWithTimezone>& result,
const arg_type<double>& unixtime,
const arg_type<Varchar>& timezone) {
int16_t timezoneId =
tzID_.value_or(tz::getTimeZoneID((std::string_view)timezone));
result = pack(fromUnixtime(unixtime).toMillis(), timezoneId);
const arg_type<Varchar>& timeZone) {
int16_t timeZoneId =
tzID_.value_or(tz::getTimeZoneID((std::string_view)timeZone));
result = fromUnixtime(unixtime, timeZoneId);
}

// (double, bigint, bigint) -> timestamp with time zone
Expand Down
25 changes: 25 additions & 0 deletions velox/functions/prestosql/DateTimeImpl.h
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,31 @@ FOLLY_ALWAYS_INLINE Timestamp fromUnixtime(double unixtime) {
return Timestamp(seconds, milliseconds * kNanosecondsInMillisecond);
}

FOLLY_ALWAYS_INLINE boost::int64_t fromUnixtime(
double unixtime,
int16_t timeZoneId) {
if (FOLLY_UNLIKELY(std::isnan(unixtime))) {
return pack(0, timeZoneId);
}

static const int64_t kMin = std::numeric_limits<int64_t>::min();

if (FOLLY_UNLIKELY(unixtime >= kMinDoubleAboveInt64Max)) {
return pack(std::numeric_limits<int64_t>::max(), timeZoneId);
}

if (FOLLY_UNLIKELY(unixtime <= kMin)) {
return pack(std::numeric_limits<int64_t>::min(), timeZoneId);
}

if (FOLLY_UNLIKELY(std::isinf(unixtime))) {
return unixtime < 0 ? pack(std::numeric_limits<int64_t>::min(), timeZoneId)
: pack(std::numeric_limits<int64_t>::max(), timeZoneId);
}

return pack(std::llround(unixtime * kMillisecondsInSecond), timeZoneId);
}

namespace {
enum class DateTimeUnit {
kMillisecond,
Expand Down
5 changes: 5 additions & 0 deletions velox/functions/prestosql/tests/DateTimeFunctionsTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -300,6 +300,11 @@ TEST_F(DateTimeFunctionsTest, fromUnixtimeWithTimeZone) {
// Nan.
static const double kNan = std::numeric_limits<double>::quiet_NaN();
EXPECT_EQ(fromUnixtime(kNan, "-04:36"), TimestampWithTimezone(0, "-04:36"));

// Check rounding behavior.
EXPECT_EQ(
fromUnixtime(1.7300479933495E9, "America/Costa_Rica"),
TimestampWithTimezone(1730047993350, "America/Costa_Rica"));
}

TEST_F(DateTimeFunctionsTest, fromUnixtimeTzOffset) {
Expand Down

0 comments on commit 2ade7f7

Please sign in to comment.