Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reporting jsonb subscripting issue in assessment and analyze #2129

Merged
merged 19 commits into from
Jan 7, 2025

Conversation

priyanshi-yb
Copy link
Contributor

@priyanshi-yb priyanshi-yb commented Dec 28, 2024

Describe the changes in this pull request

Reporting the JSONB subscription in the DMLs, DDLs and PLPGSQL objects.
The approach to detecting these is not defined, so I am doing my best effort to detect as many as possible in the source schema. There are still edge cases that can't be detected easily.

https://yugabyte.atlassian.net/browse/DB-14545

Describe if there are any user-facing changes

addition of a feature to the report-

Unsupported Feature

Screenshot 2024-12-30 at 1 56 56 PM

Unsupported Query Constructs

Screenshot 2024-12-30 at 1 57 03 PM

Unsupported PL/pgSQL objects

Screenshot 2024-12-30 at 2 07 23 PM

How was this pull request tested?

Unit tests
TODO: add end-to-end tests

Does your PR have changes that can cause upgrade issues?

Component Breaking changes?
MetaDB No
Name registry json No
Data File Descriptor Json No
Export Snapshot Status Json No
Import Data State No
Export Status Json No
Data .sql files of tables No
Export and import data queue No
Schema Dump No
AssessmentDB No
Sizing DB No
Migration Assessment Report Json No
Callhome Json No
YugabyteD Tables No
TargetDB Metadata Tables No

@priyanshi-yb priyanshi-yb marked this pull request as ready for review December 30, 2024 08:46
@priyanshi-yb priyanshi-yb changed the title Reporting jsonb subscripting Reporting jsonb subscripting issue in assessment and analyze Dec 30, 2024

func (p *ParserIssueDetector) getJsonbColumns() []string {
var jsonColumns []string
for _, mp := range p.columnsWithUnsupportedIndexDatatypes {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not re-use columnsWithUnsupportedIndexDatatypes to get jsonb columns here. If in the future, jsonb index gets supported, we'll have to remove it from that list, but we'll need it for detecting jsonb-related issues.

Can we use a different variable here?

Also, this means that we need to have a single instance of parserIssueDetector for both analyzeSchema as well as assess-migration AND that the unsupportedqueryconstructs are fetched AFTER the schema issues have been detected, correct? Can you verify that this is indeed happening?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this means that we need to have a single instance of parserIssueDetector for both analyzeSchema as well as assess-migration AND that the unsupportedqueryconstructs are fetched AFTER the schema issues have been detected, correct? Can you verify that this is indeed happening?

yes correct, we would need that because we rely on exported schema details for other things, e.g. datatypes (UDT, array etc..) so it is still the case that parserIssueDetector needs to be the same for analyze and assessment and run any other detection in assessment after exported schema is Parsed (ParseRequiredDDL()) .

Copy link
Contributor Author

@priyanshi-yb priyanshi-yb Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if required all this can be abstracted out if we run the ParseRequiredDDL separately on exported schema for assessment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's okay for now, it's a fair assumption to make, I suppose. just add a comment in code at the point where we're calling parserIssueDetector to get unsupported query constructs.

func (mv *FunctionProcessor) Process(parseTree *pg_query.ParseResult) (DDLObject, error) {
funcNode, ok := getCreateFuncStmtNode(parseTree)
if !ok {
return nil, fmt.Errorf("not a CREATE VIEW statement")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a CREATE FUNCTION

SchemaName string
FuncName string
ReturnType string
IsReturnPercType bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this field? doesn't seem to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, not using it currently, will remove it.

return aIndirection, ok
}

func GetGenericNode(msg protoreflect.Message) (*pg_query.Node, bool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not being used? remove?

return nil
}

arg := aIndirectionNode.GetArg()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here explaining the indirection node and the arg, indices fields at a high level. Explain that we're essentially checking that the arg is a jsonb type

type_name:{names:{string:{sval:"jsonb"}} typemod:-1 location:306} location:304}}
*/
typeCast := node.GetTypeCast()
typeName, _ := GetTypeNameAndSchema(typeCast.GetTypeName().GetNames())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetTypeNameAndSchema need not be exported anymore because you moved the entire function to queryparser? Same probably for a few other functions that you decided to export. Pls check.

@@ -264,3 +268,69 @@ func IsDDL(parseTree *pg_query.ParseResult) (bool, error) {
//Not Full-proof as we don't have all DDL types but atleast we will skip all the types we know currently
return !ok, nil
}

func IsJsonbType(node *pg_query.Node, jsonbColumns []string, jsonbFunctions []string) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about DoesNodeEvaluateToJsonB or something similar? add some doc to this function. Explain that we are checking if the particular node, when evaluated, results in a jsonb value.. could be a column, a typecase, a function call, an expression, etc,etc.

@priyanshi-yb priyanshi-yb force-pushed the priyanshi/report-json-subscripting branch from f5de10d to a0954f1 Compare January 6, 2025 15:43
Copy link
Collaborator

@makalaaneesh makalaaneesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

`SELECT ('{"key": "value1"}'::jsonb || '{"key": "value2"}'::jsonb)['key'] AS object_in_array;`,
}

for _, sql := range withoutIssueSqls {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for writing negative tests 👍

@priyanshi-yb priyanshi-yb merged commit 23db698 into main Jan 7, 2025
67 checks passed
@priyanshi-yb priyanshi-yb deleted the priyanshi/report-json-subscripting branch January 7, 2025 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants