Ingestion Utility: All class , func metadata and Func to func call ingestion into Neo4j #206

Closed
JayGhiya opened this issue Nov 5, 2024 · 6 comments · Fixed by #212
Labels
enhancement New feature or request

Comments

@JayGhiya
Member

JayGhiya commented Nov 5, 2024

No description provided.

@JayGhiya
Member Author

JayGhiya commented Nov 14, 2024

Package Support

  • Support Poetry
  • Support PIP

Qualified Names

  • Finish qualified names of classes (full path, handling both class-based and plain function code for Python)
  • Finish qualified names of packages
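
A rough sketch of how a qualified name could be derived from the workspace path and the file path. The function names and the dotted `module.Class.function` format are assumptions for illustration, not the scheme the project settled on.

```python
from pathlib import PurePosixPath
from typing import Optional


def module_qualified_name(workspace: str, file_path: str) -> str:
    """Turn a file path inside the workspace into a dotted module name."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(workspace))
    parts = list(relative.with_suffix("").parts)
    # A package's __init__.py is addressed by the package name itself.
    if parts and parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


def qualified_name(
    workspace: str,
    file_path: str,
    class_name: Optional[str] = None,
    function_name: Optional[str] = None,
) -> str:
    """Build module[.Class][.function]; plain functions simply skip the class part."""
    parts = [module_qualified_name(workspace, file_path)]
    if class_name:
        parts.append(class_name)
    if function_name:
        parts.append(function_name)
    return ".".join(part for part in parts if part)
```

With this, a method and a module-level function in the same file share the same module prefix, which keeps class code and functional code addressable in one namespace.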

Data Model Enhancements

  • Add support for URL, GitHub name, and README content in the codebase
  • Support a Repository-level node with 1:many codebases for monorepos, and align the schema. This also extends support to repository-level metadata.

Python Version Management

  • Support setting the Python version through external config when it is not declared in the Poetry/PIP packaging system
    • Required for stdlist to return the correct internal packages

Import Handling

  • Implement system imports reading through stdlist
  • Implement external imports reading by mapping with data from package manager
  • Read internal imports and make them absolute
  • Read imports properly through the AST and populate chapi with them, as chapi's imports data structure is broken
    • Will raise an upstream issue describing the algorithm
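
For the AST-based import reading above, a minimal sketch using Python's standard `ast` module. `ImportRecord` and `read_imports` are hypothetical names, and the record shape is an assumption, not chapi's actual import structure.

```python
import ast
from dataclasses import dataclass


@dataclass
class ImportRecord:
    module: str       # "os.path", or "" for "from . import x"
    names: list[str]  # names pulled in by a "from ... import ..." statement
    level: int = 0    # number of leading dots (0 means absolute)


def read_imports(source: str) -> list[ImportRecord]:
    """Collect every import statement found in the given source text."""
    records: list[ImportRecord] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a, b" yields one record per imported module.
            for alias in node.names:
                records.append(ImportRecord(module=alias.name, names=[]))
        elif isinstance(node, ast.ImportFrom):
            records.append(
                ImportRecord(
                    module=node.module or "",
                    names=[alias.name for alias in node.names],
                    level=node.level,
                )
            )
    return records
```

Keeping `level` on the record means relative imports can be made absolute in a later step without re-parsing the file.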

Onboarding Documentation

  • Use Ruff ecosystem in Python to remove unused imports
  • Use Ruff ecosystem in Python to support relative imports rules

Data Models

  • Simplify Pydantic data models between chapi and dspy
    • dspy now extends the chapi model, so much of the data needed later is available as part of the core chapi model (work in progress)

Neomodel Schema Implementation

  • Implement Neomodel schema to reflect imports, content, description, annotation, function calls, and corresponding relations
  • Implement batch support for ingestion.
  • Implement Transactional Support.
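
A sketch of the batching idea only, assuming nodes arrive as plain dicts. `write_batch` is a hypothetical callable; in a real setup its body would be where one database transaction is opened per batch (for example inside neomodel's transaction context manager), so a failure rolls back a single batch rather than the whole run.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def ingest(
    nodes: Iterable[dict],
    write_batch: Callable[[list[dict]], None],
    batch_size: int = 500,
) -> int:
    """Send nodes to the writer batch by batch and return how many were written."""
    written = 0
    for batch in batched(nodes, batch_size):
        write_batch(batch)  # hypothetical: one transaction per call
        written += len(batch)
    return written
```

The default batch size of 500 is an arbitrary placeholder and would need tuning against the actual graph size.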

Dspy Pipelines and Summary Generation

  • Remove parallel processing in classes, as the function call hierarchy has to be respected.
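
One way to respect the call hierarchy is to process functions in topological order, so that every callee is handled before its callers. This is a sketch using the standard library's `graphlib`; the `calls` mapping (function name to the names it calls) is an assumed input shape, not the pipeline's real data model.

```python
from graphlib import TopologicalSorter


def summary_order(calls: dict[str, set[str]]) -> list[str]:
    """Return function names ordered so that callees come before callers."""
    # Drop direct self-calls; they would otherwise count as a cycle.
    graph = {fn: callees - {fn} for fn, callees in calls.items()}
    # TopologicalSorter expects node -> predecessors. A callee must be done
    # before its caller, so the callees are exactly the predecessors.
    # Mutual recursion still raises graphlib.CycleError and needs its own handling.
    return list(TopologicalSorter(graph).static_order())
```

Processing sequentially in this order is the simplest option; functions with no dependency between them could in principle still run concurrently, but that is beyond this sketch.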

@JayGhiya
Member Author

JayGhiya commented Nov 14, 2024

For Python: we had to revamp the code again because chapi splits a single file that mixes functional and class code into separate JSON items, which creates duplicate nodes. This needs to be sorted out. Since there are already 5-6 Python-specific processing steps, we have reworked the entire processing around the strategy pattern, keyed on programming language. That reduces the burden of writing the special processing each programming language requires, and we do not have to restructure again.
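
A minimal sketch of what a language-keyed strategy setup could look like. All class names and the node keys (`file_path`, `class_type`, `functions`) are hypothetical and do not reflect chapi's real JSON schema; the merge rule is a simplification that assumes at most one class node per file.

```python
from typing import Protocol


class LanguageStrategy(Protocol):
    def process(self, nodes: list[dict]) -> list[dict]: ...


class PassthroughStrategy:
    """Fallback for languages that need no special handling."""

    def process(self, nodes: list[dict]) -> list[dict]:
        return list(nodes)


class PythonStrategy:
    """Merge nodes that come from the same file so it is ingested only once."""

    def process(self, nodes: list[dict]) -> list[dict]:
        merged: dict[str, dict] = {}
        for node in nodes:
            path = node["file_path"]
            if path not in merged:
                merged[path] = {**node, "functions": list(node["functions"])}
                continue
            target = merged[path]
            target["functions"].extend(node["functions"])
            # Keep the class metadata if the first node seen was the functional one.
            if target["class_type"] is None:
                target["class_type"] = node["class_type"]
        return list(merged.values())


STRATEGIES: dict[str, LanguageStrategy] = {"python": PythonStrategy()}


def process_codebase(language: str, nodes: list[dict]) -> list[dict]:
    """Pick the strategy registered for the language, or pass nodes through."""
    strategy = STRATEGIES.get(language.lower(), PassthroughStrategy())
    return strategy.process(nodes)
```

Supporting another language then means registering one more strategy object instead of touching the shared pipeline.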

@JayGhiya
Member Author

  • Test - external dependency tomr
  • Implement merging of functional and class chapi nodes into one single node by extending it with the functions from the node that has class_type none
  • For internal imports, do the following:
    • Make the import path absolute for internal imports based on the workspace, the current file path, and the current path of the import. Also implement Ruff rules to clean up bad imports in codebases; otherwise there will be many cases to handle.
    • Make the usage names of the importname structure a set instead of a list, to make search efficient
    • Go through all functions and their local variables in the chapi struct to find matches with internal imports, and keep adding them to the usage names.
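
The "make the import path absolute" step can be sketched as below, following Python's relative import semantics (one leading dot means the current package, each further dot climbs one package). `absolute_import` and its parameters are illustrative names, and the sketch assumes the workspace root is also the import root.

```python
from pathlib import PurePosixPath


def absolute_import(workspace: str, file_path: str, module: str, level: int) -> str:
    """Resolve a possibly relative import to a dotted absolute module path.

    `module` is the part after the dots ("" for "from . import x") and
    `level` is the number of leading dots, as reported by the AST.
    """
    if level == 0:
        return module  # already absolute
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(workspace))
    package_parts = list(relative.parent.parts)  # package containing the file
    climb = level - 1  # one dot stays in the current package
    if climb > len(package_parts):
        raise ValueError("relative import goes above the workspace root")
    base = package_parts[: len(package_parts) - climb]
    if module:
        base.extend(module.split("."))
    return ".".join(base)
```

For example, `from ..core import x` inside `pkg/sub/mod.py` resolves to `pkg.core` under these assumptions.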

@JayGhiya
Member Author

Function params and function call params are fixed, thanks Phodal! Here is the issue -> archguard/archguard#154

@JayGhiya
Member Author

JayGhiya commented Nov 15, 2024

We can't depend on cross-referencing package dependencies with the package manager, because the package name and the module name (the name used in imports) can differ, and there is no deterministic way to map between them as of now. So for now, after figuring out the system imports, we should determine internal imports based on the import itself, the path of the file the import belongs to, the root package name, and the workspace path. Whatever remains is external.

@JayGhiya
Member Author

For the dev environment, added mypy and Ruff. Will be committed by end of day.

Projects
Status: Done
1 participant