refactor: avoid async_trait macro for IcebergWriter and provide extra dyn trait for object safety #760
cc @ZENOTME
Thanks, @wenym1! I think this PR is great, because it lets us avoid some of the limits that object safety places on designing the writer API. cc @liurenjie1024 @Xuanwo @Fokko
This PR needs to resolve conflicts after #741, which changed the interface of the writer builder.
@ZENOTME Comments are addressed. PTAL
Thanks @wenym1 for this PR. Could you elaborate on the benefit of this change? As you said, it may introduce a breaking API change, so why do we need to do this? One point you mentioned is the box allocation; do we have a measurement of how much this costs compared with the actual IO?
Personally, I think the more important benefit of this PR is that it provides extra dyn traits for object safety. After separating the two traits, we can design the inner trait without worrying about object safety, which avoids problems like #703 (comment). Also, after this PR our writer builder is object safe, which it originally wasn't. In practice, we found this useful because in some cases users want to store the writer builder somewhere, and wrapping it in a Box makes things easier.
I found a problem with the current implementation: it recurses endlessly and eventually overflows the stack. To reproduce:
Thanks for pointing it out. The stack overflow was caused by accidentally repeatedly interleaving call on the |
Hi, thank you @wenym1 for your work on this, and thanks to @ZENOTME and @liurenjie1024 for their reviews. I'm a bit concerned about the complexity this PR introduces.
Given that users always utilize the dyn-compatible API from outside, I believe the box allocation cannot be avoided.
I thought |
Previously, we used `async_trait` for the traits `IcebergWriter` and `IcebergWriterBuilder`. For traits implemented with `async_trait`, every call to an async method generates a `BoxedFuture`, which may incur unnecessary box-allocation cost.

In this PR, we avoid using `async_trait` for the two traits, so that the `BoxedFuture` can optionally be avoided. To retain object safety, we provide object-safe counterparts to the two traits, named `DynIcebergWriter` and `DynIcebergWriterBuilder`. We add `impl IcebergWriter for Box<dyn DynIcebergWriter>` and `impl IcebergWriterBuilder for Box<dyn DynIcebergWriterBuilder>`, so that the type-erased dyn trait objects can still be used as `IcebergWriter` and `IcebergWriterBuilder`. Nevertheless, for the two dyn traits, the futures generated by calling their async methods are still boxed futures.

Note that after this PR there may be backward-compatibility issues in the public API of the library. When implementing the two traits, implementors may have to remove the `async_trait` attribute that previously wrapped the `impl` block, as we did in this PR.
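
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the two-trait split described above. It is not the code from this PR: `Writer`, `DynWriter`, `VecWriter`, `BoxFut`, `write_all`, and `block_on` are hypothetical names with simplified signatures, standing in for `IcebergWriter` and `DynIcebergWriter`. It assumes Rust 1.75 or later (return-position `impl Trait` in traits) and uses only the standard library.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical statically dispatched trait: no boxing, but not object safe.
pub trait Writer: Send + 'static {
    fn write(&mut self, input: u32) -> impl Future<Output = Result<(), String>> + Send;
    fn close(&mut self) -> impl Future<Output = Result<Vec<u32>, String>> + Send;
}

type BoxFut<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical object-safe counterpart: every call returns a boxed future.
pub trait DynWriter: Send + 'static {
    fn dyn_write(&mut self, input: u32) -> BoxFut<'_, Result<(), String>>;
    fn dyn_close(&mut self) -> BoxFut<'_, Result<Vec<u32>, String>>;
}

/// Every `Writer` is usable as a `DynWriter` by boxing its futures.
impl<W: Writer> DynWriter for W {
    fn dyn_write(&mut self, input: u32) -> BoxFut<'_, Result<(), String>> {
        Box::pin(self.write(input))
    }
    fn dyn_close(&mut self) -> BoxFut<'_, Result<Vec<u32>, String>> {
        Box::pin(self.close())
    }
}

/// The type-erased box is itself a `Writer`.
impl Writer for Box<dyn DynWriter> {
    fn write(&mut self, input: u32) -> impl Future<Output = Result<(), String>> + Send {
        // `(**self)` dispatches on the inner trait object. Calling
        // `self.dyn_write(..)` would pick the blanket impl for the `Box`
        // itself and bounce between the two impls without terminating.
        (**self).dyn_write(input)
    }
    fn close(&mut self) -> impl Future<Output = Result<Vec<u32>, String>> + Send {
        (**self).dyn_close()
    }
}

/// Toy writer that collects its inputs in memory.
#[derive(Default)]
pub struct VecWriter {
    buf: Vec<u32>,
}

impl Writer for VecWriter {
    // Plain `async fn`, with no `async_trait` attribute on the impl block.
    async fn write(&mut self, input: u32) -> Result<(), String> {
        self.buf.push(input);
        Ok(())
    }
    async fn close(&mut self) -> Result<Vec<u32>, String> {
        Ok(std::mem::take(&mut self.buf))
    }
}

/// Generic code accepts both concrete writers and `Box<dyn DynWriter>`.
async fn write_all<W: Writer>(mut writer: W, inputs: &[u32]) -> Result<Vec<u32>, String> {
    for &i in inputs {
        writer.write(i).await?;
    }
    writer.close().await
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor, only suitable for this demo.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

pub fn run_static() -> Vec<u32> {
    block_on(write_all(VecWriter::default(), &[1, 2, 3])).unwrap()
}

pub fn run_dyn() -> Vec<u32> {
    let erased: Box<dyn DynWriter> = Box::new(VecWriter::default());
    block_on(write_all(erased, &[4, 5])).unwrap()
}

fn main() {
    assert_eq!(run_static(), vec![1, 2, 3]);
    assert_eq!(run_dyn(), vec![4, 5]);
}
```

In this sketch, `run_static` never allocates a box for a future, while `run_dyn` pays one allocation per call, which mirrors the trade-off stated in the description: boxing becomes optional for callers that know the concrete type and remains in place for type-erased use.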