Pipeline from config file #220

Merged
merged 67 commits into from
Dec 12, 2024
Changes from 26 commits
67 commits
e9712a9
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 15, 2024
b52c45e
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 16, 2024
84c1780
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 17, 2024
47d4782
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 21, 2024
bc7a2f9
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 22, 2024
a945284
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 22, 2024
4e13c23
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 23, 2024
5367bed
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 24, 2024
21d1223
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 25, 2024
3329cd7
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 25, 2024
d8f6364
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Oct 28, 2024
4cec2f3
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Nov 4, 2024
4445b49
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Nov 5, 2024
939b18c
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python
stellasia Nov 18, 2024
6437fe7
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python…
stellasia Nov 25, 2024
e92d7bb
SimpleKGPipeline config parser
stellasia Nov 27, 2024
221c734
Factorize common parameters
stellasia Nov 27, 2024
9813743
Fix
stellasia Nov 27, 2024
fb5939a
Fix again (ruff)
stellasia Nov 27, 2024
45dd67e
Add builder, docstrings
stellasia Nov 28, 2024
53aba3a
Adds example
stellasia Nov 28, 2024
ba56f44
Ruff
stellasia Nov 28, 2024
cc85fb0
Add headers
stellasia Nov 28, 2024
62ae4e1
Another header
stellasia Nov 28, 2024
4b80926
Remove old file - more detailed example
stellasia Nov 28, 2024
4d8eddb
Fix JSON
stellasia Nov 28, 2024
536c9ff
WIP
stellasia Nov 30, 2024
668fbc2
Adds more param resolvers
stellasia Dec 1, 2024
1d884a1
A bit of mypy
stellasia Dec 1, 2024
e2eb5d0
Add root types to allow instantiation from python object directly
stellasia Dec 3, 2024
390be0c
Document + mypy
stellasia Dec 3, 2024
adfc908
Add embedder config
stellasia Dec 3, 2024
ecda174
Implement SimpleKGBuilder with this setup
stellasia Dec 3, 2024
ab46edb
Add YAML config example
stellasia Dec 4, 2024
9dc8bec
Example config files for custom pipeline
stellasia Dec 4, 2024
d551fb4
Update SimpleKGPipeline
stellasia Dec 4, 2024
b070f6b
Fix UT
stellasia Dec 4, 2024
d192667
Fix tests
stellasia Dec 4, 2024
137cd76
Simplify and mypy
stellasia Dec 4, 2024
260478a
ruff
stellasia Dec 4, 2024
35133dd
Add missing dep
stellasia Dec 4, 2024
b5e6471
Missing import for '|' annotations
stellasia Dec 4, 2024
30b878d
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python…
stellasia Dec 4, 2024
a7d37e4
Restructure files + increase tests coverage
stellasia Dec 5, 2024
99af0de
mypy
stellasia Dec 5, 2024
3632557
Refactor examples
stellasia Dec 5, 2024
3f8fbf4
Test runner, clean simple kg builder test (remove duplicates)
stellasia Dec 5, 2024
3f6a9e1
E2E tests
stellasia Dec 5, 2024
30543dc
Use fsspec in config reader
stellasia Dec 5, 2024
ccb46f6
Changelog
stellasia Dec 5, 2024
aedd9a3
Fix test
stellasia Dec 5, 2024
3122eda
Use cast to remove a type ignore comment
stellasia Dec 6, 2024
ce21353
Also use cast here to remove type ignore
stellasia Dec 6, 2024
4e04d04
Close instantiated drivers
stellasia Dec 6, 2024
1e1ce2a
ruff
stellasia Dec 6, 2024
8834351
Adding loggers
stellasia Dec 6, 2024
82b4598
Make close function async
stellasia Dec 6, 2024
ede69f1
Fix tests
stellasia Dec 6, 2024
6613451
fix UT
stellasia Dec 6, 2024
aeea211
Write doc about SimpleKGPipeline and config files
stellasia Dec 6, 2024
83f5133
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python…
stellasia Dec 6, 2024
76434ec
Update api.rst
stellasia Dec 6, 2024
2bebbfe
Merge branch 'main' of https://github.com/neo4j/neo4j-graphrag-python…
stellasia Dec 11, 2024
54a55ae
Recreate lock file after merge
stellasia Dec 11, 2024
8572b42
Merge branch 'main' into feature/config-files
stellasia Dec 12, 2024
94483d8
Merge branch 'feature/config-files' of https://github.com/stellasia/n…
stellasia Dec 12, 2024
663df47
Add more comments to explain Config/Type models
stellasia Dec 12, 2024
@@ -0,0 +1,57 @@
{
  "version_": "1",
  "neo4j_config": {
    "uri": {
      "resolver_": "ENV",
      "var_": "NEO4J_URI"
    },
    "user": {
      "resolver_": "ENV",
      "var_": "NEO4J_USER"
    },
    "password": {
      "resolver_": "ENV",
      "var_": "NEO4J_PASSWORD"
    },
    "database": {
      "resolver_": "ENV",
      "var_": "NEO4J_DATABASE"
    }
  },
  "llm_config": {
    "name_": "openai",
    "class_": "OpenAILLM",
    "params_": {
      "api_key": {
        "resolver_": "ENV",
        "var_": "OPENAI_API_KEY"
      },
      "model_name": "gpt-4o"
    }
  },
  "embedder_config": {
    "name_": "openai",
    "class_": "OpenAIEmbeddings",
    "params_": {
      "api_key": {
        "resolver_": "ENV",
        "var_": "OPENAI_API_KEY"
      }
    }
  },
  "from_pdf": false,
  "entities": ["Person", {"label": "Organization"}],
  "relations": ["WORKS_FOR", {"label": "DIRECTED_BY"}],
  "potential_schema": [
    ["Person", "WORKS_FOR", "Organization"],
    ["Organization", "DIRECTED_BY", "Person"]
  ],
  "text_splitter": {
    "class_": "fixed_size_splitter.FixedSizeSplitter",
    "params_": {
      "chunk_size": 100,
      "chunk_overlap": 10
    }
  },
  "perform_entity_resolution": false
}
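Review note: the config above marks secrets with `{"resolver_": "ENV", "var_": ...}` objects instead of literal values. The following is a minimal, standard-library-only sketch of how such markers could be substituted before the config is used. The helper name `resolve_env_params` is hypothetical and is not part of this PR; the library's own parser may resolve parameters differently.

```python
import json
import os
from typing import Any


def resolve_env_params(node: Any) -> Any:
    """Hypothetical helper: replace ENV markers with environment values.

    A marker is a dict of the form {"resolver_": "ENV", "var_": NAME}.
    Unset variables resolve to None, mirroring os.environ.get.
    """
    if isinstance(node, dict):
        if node.get("resolver_") == "ENV" and "var_" in node:
            return os.environ.get(node["var_"])
        # Not a marker: recurse into the values of a regular mapping.
        return {key: resolve_env_params(value) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_env_params(item) for item in node]
    # Scalars (str, int, bool, None) are returned unchanged.
    return node


# Demo values only; NEO4J_DATABASE is deliberately left unset.
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ.pop("NEO4J_DATABASE", None)

raw = json.loads(
    """
    {
      "neo4j_config": {
        "uri": {"resolver_": "ENV", "var_": "NEO4J_URI"},
        "database": {"resolver_": "ENV", "var_": "NEO4J_DATABASE"}
      },
      "from_pdf": false
    }
    """
)
resolved = resolve_env_params(raw)
print(resolved)
```

In this sketch an unset variable silently becomes `None`, so a caller that needs a value (for example the API key) would have to check for it explicitly.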
@@ -0,0 +1,36 @@
"""In this example, the pipeline is defined in the JSON file
'simple_kg_pipeline_config.json'. According to that configuration file, some
parameters are read from environment variables (the Neo4j credentials and the
OpenAI API key).
"""

import asyncio

## If env vars in a .env file, uncomment:
## (requires pip install python-dotenv)
# from dotenv import load_dotenv
# load_dotenv()
# env vars manually set for testing:
import os

from neo4j_graphrag.experimental.pipeline.config.parser import SimpleKGPipelineBuilder
from neo4j_graphrag.experimental.pipeline.pipeline import PipelineResult

os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_USER"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "password"
# os.environ["OPENAI_API_KEY"] = "sk-..."


# Text to process
TEXT = """The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House Atreides,
an aristocratic family that rules the planet Caladan, the rainy planet, since 10191."""


async def main() -> PipelineResult:
    file_path = "examples/customize/build_graph/pipeline/simple_kg_pipeline_config.json"
    pipeline = SimpleKGPipelineBuilder.from_config_file(file_path)
    return await pipeline.run_async(text=TEXT)


if __name__ == "__main__":
    print(asyncio.run(main()))
18 changes: 17 additions & 1 deletion src/neo4j_graphrag/experimental/components/schema.py
@@ -14,7 +14,7 @@
 # limitations under the License.
 from __future__ import annotations

-from typing import Any, Dict, List, Literal, Optional, Tuple
+from typing import Any, Dict, List, Literal, Optional, Self, Tuple, Union

 from pydantic import BaseModel, ValidationError, model_validator, validate_call

@@ -55,6 +55,14 @@ class SchemaEntity(BaseModel):
     description: str = ""
     properties: List[SchemaProperty] = []

+    @classmethod
+    def from_text_or_dict(
+        cls, input: str | dict[str, Union[str, dict[str, str]]]
+    ) -> Self:
+        if isinstance(input, str):
+            return cls(label=input)
+        return cls.model_validate(input)
+

 class SchemaRelation(BaseModel):
     """
@@ -65,6 +73,14 @@ class SchemaRelation(BaseModel):
     description: str = ""
     properties: List[SchemaProperty] = []

+    @classmethod
+    def from_text_or_dict(
+        cls, input: str | dict[str, Union[str, dict[str, str]]]
+    ) -> Self:
+        if isinstance(input, str):
+            return cls(label=input)
+        return cls.model_validate(input)
+

 class SchemaConfig(DataModel):
     """
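Review note: `from_text_or_dict` is what lets the config file write `"entities": ["Person", {"label": "Organization"}]`, mixing bare labels with full objects. A minimal sketch of that normalization is shown below. `EntitySketch` is a hypothetical dataclass stand-in used only to keep the example dependency-free; the real `SchemaEntity` is a pydantic model and validates dict input with `model_validate`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class EntitySketch:
    """Hypothetical stand-in for SchemaEntity (not the library class)."""

    label: str
    description: str = ""
    properties: List[Dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_text_or_dict(cls, value: Union[str, Dict[str, Any]]) -> "EntitySketch":
        # A bare string is shorthand for an entity that only has a label.
        if isinstance(value, str):
            return cls(label=value)
        # A dict carries explicit fields. Unknown keys raise TypeError here,
        # whereas the pydantic model applies its own validation rules.
        return cls(**value)


raw_entities = ["Person", {"label": "Organization", "description": "A company"}]
entities = [EntitySketch.from_text_or_dict(item) for item in raw_entities]
labels = [entity.label for entity in entities]
print(labels)
```

Both input shapes end up as the same object type, so downstream code only has to handle one representation.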
14 changes: 14 additions & 0 deletions src/neo4j_graphrag/experimental/pipeline/config/__init__.py
@@ -0,0 +1,14 @@
# Copyright (c) "Neo4j"
# Neo4j Sweden AB [https://neo4j.com]
# #
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# #
# https://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
58 changes: 58 additions & 0 deletions src/neo4j_graphrag/experimental/pipeline/config/param_resolvers.py
@@ -0,0 +1,58 @@
# Copyright (c) "Neo4j"
# Neo4j Sweden AB [https://neo4j.com]
# #
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# #
# https://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
from typing import Any, Optional

from .types import ParamFromEnvConfig, ParamResolverEnum, ParamToResolveConfig


class ParamResolver:
    """A base class for all parameter resolvers."""

    name: ParamResolverEnum

    def resolve(self, param: ParamToResolveConfig) -> Any:
        raise NotImplementedError


class EnvParamResolver(ParamResolver):
    """Resolve a parameter by reading its value
    from the environment variables.

    Example:

    .. code-block:: python

        import os
        os.environ["MY_ENV_VAR"] = "LOCAL"

        resolver = EnvParamResolver()
        # resolve() expects a config object exposing `var_`, not a bare string
        resolver.resolve(ParamFromEnvConfig(var_="MY_ENV_VAR"))
        # Output: "LOCAL"
    """

    name = ParamResolverEnum.ENV

    def resolve(self, param: ParamFromEnvConfig) -> Optional[str]:
        # Returns None when the variable is not set.
        return os.environ.get(param.var_)


# Registry mapping each resolver name to its class (not an instance).
PARAM_RESOLVERS = {
    resolver.name: resolver
    for resolver in [
        EnvParamResolver,
    ]
}
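Review note: `PARAM_RESOLVERS` maps a resolver name to a resolver class, so a caller has to look up the class and instantiate it before calling `resolve`. A minimal sketch of that lookup follows. All names in it (`ResolverKind`, `BaseResolver`, `EnvResolver`, `REGISTRY`, `resolve_param`) are hypothetical stand-ins so the example runs without the package; how the parser in this PR actually consumes the registry is not shown in this diff.

```python
import os
from enum import Enum
from typing import Any, Dict, Optional, Type


class ResolverKind(str, Enum):
    """Hypothetical stand-in for ParamResolverEnum."""

    ENV = "ENV"


class BaseResolver:
    name: ResolverKind

    def resolve(self, var_name: str) -> Any:
        raise NotImplementedError


class EnvResolver(BaseResolver):
    name = ResolverKind.ENV

    def resolve(self, var_name: str) -> Optional[str]:
        # None is returned when the variable is not set.
        return os.environ.get(var_name)


# Same shape as PARAM_RESOLVERS: name -> class, built from a list of classes.
REGISTRY: Dict[ResolverKind, Type[BaseResolver]] = {
    resolver.name: resolver for resolver in [EnvResolver]
}


def resolve_param(spec: Dict[str, str]) -> Any:
    """Look up the resolver class for a spec, instantiate it, and resolve."""
    kind = ResolverKind(spec["resolver_"])  # raises ValueError for unknown kinds
    resolver_cls = REGISTRY[kind]
    return resolver_cls().resolve(spec["var_"])


os.environ["MY_ENV_VAR"] = "LOCAL"
value = resolve_param({"resolver_": "ENV", "var_": "MY_ENV_VAR"})
print(value)
```

Keying the registry on the class attribute `name` means a new resolver only needs to be appended to the list to become available.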