Bug fix: Enhanced SQL statement validation with word boundary matching #2324

Open
wants to merge 2 commits into
base: master

nitin-singla
Member

Enhanced SQL statement validation so that a disallowed keyword is rejected only when it appears as a whole word, not when it is a substring of an identifier. The check now uses regular expression-based word matching, which removes false positives on otherwise valid statements.

Issue #, if available:
#2323

Description of changes:

  • Modified the customConnectorVerifications method in the DDBQueryPassthrough class.
  • Replaced the previous implementation that used a simple contains check to detect disallowed keywords.
  • Introduced the use of regular expressions and the java.util.regex.Matcher class to perform word boundary matching.
  • Added the regular expression \w+ (written as "\\w+" in a Java string literal) to match each run of one or more word characters.
  • Implemented a loop that iterates through all word matches found in the SQL statement using the Matcher.
  • For each word match, checked if the word is present in the set of disallowed keywords.
  • If a disallowed keyword is found as a whole word, an AthenaConnectorException is thrown with the appropriate error message and error code.
  • With this change, a statement such as "SELECT * from xyupdatez" is accepted as a valid SELECT statement, even though the identifier xyupdatez contains the substring "UPDATE".
  • The new implementation continues to accurately reject statements that contain disallowed keywords representing operations like "UPDATE", "INSERT", or "DELETE".
  • Fixes the substring edge case that the previous contains check got wrong, so validation no longer rejects identifiers that merely contain a disallowed keyword.
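
The steps listed above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the class name StatementKeywordCheck, the keyword set, the upper-casing step, and the use of IllegalArgumentException in place of AthenaConnectorException are all assumptions made so the example compiles without the connector SDK.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the check described above; not the connector's real class.
final class StatementKeywordCheck {
    // Assumed keyword set, taken from the operations named in the description.
    private static final Set<String> DISALLOWED = Set.of("UPDATE", "INSERT", "DELETE");

    // One or more word characters: letters, digits, or underscore.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Throws if any whole word in the statement is a disallowed keyword.
    // IllegalArgumentException stands in for AthenaConnectorException here.
    static void verify(String sql) {
        // Upper-casing is an assumption so the comparison is case-insensitive.
        Matcher matcher = WORD.matcher(sql.toUpperCase(Locale.ROOT));
        while (matcher.find()) {
            String word = matcher.group();
            if (DISALLOWED.contains(word)) {
                throw new IllegalArgumentException("Unsupported operation: " + word);
            }
        }
    }

    // Convenience wrapper that reports the result as a boolean.
    static boolean isAllowed(String sql) {
        try {
            verify(sql);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

In this sketch, isAllowed("SELECT * from xyupdatez") returns true, while isAllowed("update t set a = 1") returns false. Because \w also matches the underscore, an identifier such as my_update_log is treated as a single word and is accepted. A keyword inside a quoted string literal is still matched as a word, so the sketch would reject a statement like SELECT * FROM t WHERE note = 'delete'.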

In summary: disallowed operations are now detected by whole-word matching instead of substring matching, so valid SELECT statements whose identifiers happen to contain a disallowed keyword are no longer rejected.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

nitin-singla and others added 2 commits October 9, 2024 23:50
Enhanced SQL statement validation to handle disallowed keywords appearing as substrings. Introduced regular expression-based word boundary matching to accurately detect whole-word occurrences of disallowed operations, preventing potential misinterpretations and unintended false positives.
2 participants