POC for multi-output #767
Draft
audunska wants to merge 20 commits into master from audunska/multi-output
This works with the new implementation in terms of Writable
Codecov Report
@@ Coverage Diff @@
## master #767 +/- ##
==========================================
- Coverage 77.07% 77.04% -0.04%
==========================================
Files 49 50 +1
Lines 3407 3441 +34
Branches 152 154 +2
==========================================
+ Hits 2626 2651 +25
- Misses 781 790 +9
This is a proof of concept for how we might support multiple outputs from a single transformation.

The idea is that you write SQL and then, in the `type` relation option, specify the types of the names it uses, like `a:asset,e:event`. This would be passed to the jetfire-backend as structured JSON, but we have to flatten it to a string in Spark.

Internally, we parse out the names and types, and then create a relation for each of them. Options specific to one source can be passed by prefixing the option name with `${name}:`, i.e., an option named `a:assetSubTreeIds` would be passed to the asset relation.

See the new test in `AssetTests` for a proof that this works in a simple case.

Unfortunately, we have to give all fields in the correct order and explicitly cast all null fields for this to typecheck in Spark. Hopefully this can be alleviated in some way.

EDIT: This now just works!

This is just a POC. Ideas / criticisms are welcome!
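
To make the option handling concrete, here is a minimal sketch of the parsing and prefix routing described above. It is written in Python purely for illustration and is not the code in this PR; the function names `parse_output_types` and `options_for`, the error handling, and the sample option values are all assumptions. It also assumes that only prefixed options are forwarded to a relation, which the description does not specify.

```python
from typing import Dict


def parse_output_types(type_option: str) -> Dict[str, str]:
    """Parse a flattened type string such as 'a:asset,e:event'.

    Returns a mapping from output name to resource type.
    (Hypothetical helper; the PR's real parser may differ.)
    """
    outputs: Dict[str, str] = {}
    for entry in type_option.split(","):
        name, sep, resource_type = entry.strip().partition(":")
        if not sep or not name or not resource_type:
            raise ValueError(f"expected 'name:type', got {entry!r}")
        if name in outputs:
            raise ValueError(f"duplicate output name {name!r}")
        outputs[name] = resource_type
    return outputs


def options_for(name: str, options: Dict[str, str]) -> Dict[str, str]:
    """Select the options prefixed with '<name>:' and strip the prefix.

    Assumption: unprefixed options are not forwarded by this helper.
    """
    prefix = f"{name}:"
    return {
        key[len(prefix):]: value
        for key, value in options.items()
        if key.startswith(prefix)
    }


# Sample options; the value "1,2" is made up for illustration.
options = {
    "type": "a:asset,e:event",
    "a:assetSubTreeIds": "1,2",
}

for name, resource_type in parse_output_types(options["type"]).items():
    # One relation would be created per (name, type) pair.
    print(name, resource_type, options_for(name, options))
```

With these sample options, the `a` output resolves to the `asset` type and receives `assetSubTreeIds`, while the `e` output resolves to `event` and receives no extra options.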