Is your feature request related to a problem? Please describe.
Reading features from the TSV file may fail.
These errors should be wrapped in Validated so that they can be propagated to the job output as an error in the location record instead of failing the whole job.
Examples:
WKB can be invalid (bad user input)
WKB can be valid but encode an invalid geometry
Pre-processing of features (like splitting along the grid tiles) triggers some kind of JTS TopologyException
???
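A minimal sketch of what catching these failures per feature could look like. Everything here is an assumption for illustration: FeatureError, readRow and the two-column TSV layout are hypothetical, and Either from the standard library stands in for Validated so the snippet has no dependencies. A real version would run the actual WKB parser and the tile-splitting step through the same attempt helper.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical error record; not part of the existing code base.
final case class FeatureError(featureId: String, stage: String, message: String)

object ValidatedFeatureSketch {
  // Either is used as a dependency-free stand-in for Validated.
  type ValidatedRow[A] = Either[FeatureError, A]

  // Decode a hex string into bytes; throws on odd length or non-hex characters.
  def hexToBytes(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "odd number of hex digits")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  // Run one processing step and turn any non-fatal exception into a FeatureError.
  def attempt[A](featureId: String, stage: String)(body: => A): ValidatedRow[A] =
    Try(body) match {
      case Success(value) => Right(value)
      case Failure(e) =>
        Left(FeatureError(featureId, stage, Option(e.getMessage).getOrElse(e.toString)))
    }

  // Assumed layout: feature id in column 0, hex-encoded WKB in column 1.
  def readRow(line: String): ValidatedRow[(String, Array[Byte])] =
    line.split("\t", -1) match {
      case Array(id, wkbHex, _*) =>
        attempt(id, "wkb-decode")(hexToBytes(wkbHex)).map(bytes => (id, bytes))
      case _ =>
        Left(FeatureError(line, "tsv-split", "expected at least two tab-separated columns"))
    }
}
```

A bad row then becomes a Left carrying the feature id and the failing stage, and the rest of the input keeps flowing.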
Describe the solution you'd like
ErrorSummaryRDD introduced the use of Validated for the polygonal summary operation, with a mechanism to report the errors. We should have a ValidatedFeatureRDD to cover feature input.
This should be a new class so we can test it in ForestChangeDiagnostic and dashboard jobs without having to change the rest of the code base. Later refactors can clean that up.
Because part of the logic here is covered by the DataFrame API, the logic from SummaryRDD will not be enough.
I'm not sure what the best way to handle exceptions in the DataFrame API is, or what they will actually look like. (They SHOULD result in null fields without much explanation, but that may not be the case.)
Either way, I would expect to see some use of Validated here:
gfw_forest_loss_geotrellis/src/main/scala/org/globalforestwatch/summarystats/forest_change_diagnostic/ForestChangeDiagnosticCommand.scala
Lines 42 to 46 in 5b4a436
such that they could be passed here:
gfw_forest_loss_geotrellis/src/main/scala/org/globalforestwatch/summarystats/forest_change_diagnostic/ForestChangeDiagnosticAnalysis.scala
Lines 61 to 65 in 5b4a436
gfw_forest_loss_geotrellis/src/main/scala/org/globalforestwatch/summarystats/ErrorSummaryRDD.scala
Lines 80 to 84 in 5b4a436
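If the DataFrame side does surface failures as null fields, one option is to turn each null into an explicit error right where the row is converted into a feature. A rough sketch under that assumption; requireField is a hypothetical helper, the message format is made up, and Either again stands in for Validated:

```scala
object NullFieldSketch {
  // Treat a null or blank column value as an explicit, attributable error
  // instead of letting it fail later with a NullPointerException.
  def requireField(featureId: String, name: String, value: String): Either[String, String] =
    Option(value)
      .map(_.trim)
      .filter(_.nonEmpty)
      .toRight(s"feature $featureId: column '$name' is null or empty")
}
```

This does not recover the original cause of the failure, only the fact that a field is missing, so it is a fallback for the case where the DataFrame API gives no further explanation.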
Describe alternatives you've considered
I've considered doing nothing because geometries should be valid, but that proved not to be so.
I'm not sure how an invalid geometry is going to interact with the partitioning scheme in the ErrorSummaryRDD. It may be that they will have to be filtered out and then joined onto the results, as it's neither possible to place them on a map nor to run a polygonal summary on them. Interested to see how that turns out.
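The filter-then-rejoin idea could look roughly like the sketch below. It uses plain Scala collections in place of an RDD, Either in place of Validated, and a hypothetical summarize function. The point is only that failed features skip the summary step, and with it the partitioning, and are appended to the output afterwards.

```scala
object SplitAndRejoinSketch {
  // features: (feature id, either an input error or a usable geometry of type G).
  // summary: stand-in for the polygonal summary, applied to valid geometries only.
  def summarize[G, S](features: Seq[(String, Either[String, G])])(
      summary: G => S
  ): Seq[(String, Either[String, S])] = {
    // Input errors never reach the summary step.
    val errors: Seq[(String, Either[String, S])] =
      features.collect { case (id, Left(err)) => (id, Left(err)) }
    // Only valid geometries are summarized.
    val results: Seq[(String, Either[String, S])] =
      features.collect { case (id, Right(geom)) => (id, Right(summary(geom))) }
    // Rejoin: every input feature shows up in the output exactly once.
    results ++ errors
  }
}
```

With an RDD the last step would be a union rather than ++, but whether that plays well with the existing partitioner is exactly the open question above.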