Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GTC-2598 Fix gadm null filter #200

Merged
merged 3 commits into from
Nov 15, 2023
Merged

GTC-2598 Fix gadm null filter #200

merged 3 commits into from
Nov 15, 2023

Conversation

manukala6
Copy link
Member

@manukala6 manukala6 commented Nov 7, 2023

Pull request checklist

Please check if your PR fulfills the following requirements:

  • Make sure you are requesting to pull a topic/feature/bugfix branch (right side). Don't request your master!
  • Make sure you are making a pull request against the develop branch (left side). Also you should start your branch off our develop.
  • Check the commit's or even all commits' message styles matches our requested structure.
  • Check your code additions will fail neither code linting checks nor unit test.

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

What is the current behavior?

AFi Analysis currently filters out groups in which gadm_id contains any null. This causes areas without adm2 IDs, such as Singapore, to be excluded from results

Issue Number: GTC-2598

What is the new behavior?

  • AFi Analysis filters out groups only when the ISO code is null

Does this introduce a breaking change?

  • Yes
  • No

Other information

@@ -48,9 +48,9 @@ object AFiAnalysis extends SummaryAnalysis {
val summaryDF = AFiAnalysis.aggregateResults(
AFiDF
.getFeatureDataFrame(summaryRDD, spark)
.filter(!$"gadm_id".contains("null"))
.filter(!$"gadm_id".substr(1, 3).contains("null"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment above this statement to explain the 'substr(1, 3)'? It would be good to explain the general idea (that you are only looking for null in the ISO, not adm1 or adm2), and also specifically why the substr() call works.

In particular, I don't understand how a substring of length 3 is ever going to contain the string "null". Is that actually correct, and if so, maybe you can explain why in the comment. I.e. what is the format of the "gadm_id" column (something like "IDN.1.5", right?) and does "null" actually show up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually might not want to check if anything is null? "null" is valid for adm1 and adm2 depending on the country, the only thing we'd want to filter out is when iso is an empty string.

Or maybe not filter anything at all? Because we saw there are occasionally results outside of GADM boundaries somehow, do we want to discard those or keep those?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jterry64 We currently use the row with location_id = -1 and gadm_id = '' to hold summed values for the dissolved list. The filter disregards stray pixels (like over oceans) that would add another row with those specific values. Or we assign them location_id = -3 or some other value. Let me know what works

@@ -48,9 +48,9 @@ object AFiAnalysis extends SummaryAnalysis {
val summaryDF = AFiAnalysis.aggregateResults(
AFiDF
.getFeatureDataFrame(summaryRDD, spark)
.filter(!$"gadm_id".contains("null"))
.filter(!$"gadm_id".substr(1, 3).contains("null"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually might not want to check if anything is null? "null" is valid for adm1 and adm2 depending on the country, the only thing we'd want to filter out is when iso is an empty string.

Or maybe not filter anything at all? Because we saw there are occasionally results outside of GADM boundaries somehow, do we want to discard those or keep those?

@codecov-commenter
Copy link

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Files Coverage Δ
...obalforestwatch/summarystats/afi/AFiAnalysis.scala 0.00% <0.00%> (ø)
...lobalforestwatch/summarystats/afi/AFiSummary.scala 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

📢 Thoughts on this report? Let us know!

@manukala6
Copy link
Member Author

@jterry64 I moved the filer from AFiAnalysis to AFiSummary, it now skips pixels with null ISO. However pixels with null adm1 or adm2 would have a "null" string in the gadmId (for example "SGP.1.null"). Could that introduce front-end issues?

@manukala6 manukala6 requested a review from jterry64 November 13, 2023 16:56
@danscales
Copy link
Collaborator

@jterry64 I moved the filer from AFiAnalysis to AFiSummary, it now skips pixels with null ISO. However pixels with null adm1 or adm2 would have a "null" string in the gadmId (for example "SGP.1.null"). Could that introduce front-end issues?

It seems like it might be easier to understand and process for the front-end if you make null values be the empty string "" in the gadmId. So, then it would look like "SGP.1." But I'm sure the front-end folks could deal with either, as long as you warn them about the possibility (missing adm1 or adm2) and how it will show up.

What does a missing adm1 or adm2 mean? Is it actually land in the country which somehow is not in any adm1 (or any adm2 with an adm1)?

if (gadmAdm0 == "") {
return
}
val gadmAdm1: Integer = raster.tile.gadmAdm1.getData(col, row)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to add a comment here that either gadmAdm1 and gadmAdm1 could possibly be null (which came from a 9999 pixel value in the raster)

@manukala6 manukala6 merged commit 0abc076 into master Nov 15, 2023
2 of 3 checks passed
@manukala6 manukala6 deleted the bugfix/afi_gadm_filter branch November 15, 2023 06:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants