[Question]: on the PropertyGraphIndex, is there an easy way to exclude metadata when building an index from documents? #17535
Comments
Currently, there is no specific configuration or parameter in the [...] To achieve the exclusion of certain metadata, you might need to customize or extend the [...]
Under what circumstances is it desirable to pollute a graph with a massive community of metadata nodes? If max_triplets_per_chunk is 5, then at least every sixth node is something like "source, is, file path". Won't this degrade retriever performance?
Including a large number of metadata nodes in a graph can be desirable when the metadata provides valuable context or additional information that enhances the understanding or utility of the graph. For example, metadata can be useful for:
However, as you pointed out, a high density of metadata nodes can indeed degrade retriever performance by increasing the complexity and size of the graph. This can lead to slower query times and higher computational costs. To mitigate these issues, it's important to balance the inclusion of metadata with the performance requirements of the application. Techniques such as filtering out non-essential metadata, optimizing the graph structure, or using efficient retrieval mechanisms can help manage the trade-offs between metadata richness and performance.
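One way to picture "filtering out non-essential metadata" is a post-processing step that drops triplets whose subject is a known metadata key before they are written to the graph. The following is a minimal, dependency-free sketch. The helper names `normalize` and `drop_metadata_triplets` are hypothetical and not part of LlamaIndex, and the triplets are modeled as plain tuples rather than the library's own node and relation types.

```python
from typing import Iterable, List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def normalize(name: str) -> str:
    """Lower-case a name and treat underscores as spaces, so 'file_path' matches 'file path'."""
    return name.strip().lower().replace("_", " ")


def drop_metadata_triplets(
    triplets: Iterable[Triplet], metadata_keys: Set[str]
) -> List[Triplet]:
    """Keep only triplets whose subject is not one of the given metadata keys (hypothetical helper)."""
    blocked = {normalize(key) for key in metadata_keys}
    return [t for t in triplets if normalize(t[0]) not in blocked]


# Example triplets modeled on the ones quoted in the question.
triplets = [
    ("source", "is", "12348873"),
    ("file path", "is", "home/folder"),
    ("Alice", "founded", "Acme"),
]
kept = drop_metadata_triplets(triplets, {"source", "file_path"})
print(kept)
```

This only removes the noise after the LLM has already spent tokens producing it, so it addresses graph pollution but not the token cost raised in the question.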
@chartsNgraphs The source is coming from the [...] The LLM reads node/document metadata by default. If this is not helpful, exclude it. Example for both suggestions: [...]
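The original example did not survive in this copy of the thread. As a rough illustration of the second suggestion only (hiding metadata from the LLM), here is a dependency-free sketch of how per-document exclusion keys change the text an extractor would see. `SketchDocument` and `llm_content` are hypothetical stand-ins, not LlamaIndex classes. To my knowledge the corresponding real attribute on LlamaIndex documents and nodes is `excluded_llm_metadata_keys`, but check the library documentation before relying on that.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a document with metadata (not the LlamaIndex class)."""

    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    excluded_llm_metadata_keys: List[str] = field(default_factory=list)

    def llm_content(self) -> str:
        """Return what an extraction LLM would read: visible metadata lines, then the body text."""
        visible = [
            f"{key}: {value}"
            for key, value in self.metadata.items()
            if key not in self.excluded_llm_metadata_keys
        ]
        return "\n".join(visible + [self.text])


doc = SketchDocument(
    text="Alice founded Acme in 2019.",
    metadata={"source": "12348873", "file_path": "home/folder", "title": "Acme history"},
)

# Hide the keys that would otherwise be turned into noise triplets.
doc.excluded_llm_metadata_keys = ["source", "file_path"]
print(doc.llm_content())
```

Because the excluded keys never reach the prompt, this approach avoids both the unwanted triplets and the tokens spent generating them, while the metadata itself stays attached to the document for other uses.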
Question Validation
Question
Whenever I use PropertyGraphIndex to build a graph using an LLM, it always includes the metadata. This seems pretty wasteful, since the metadata is not useful to the knowledge graph and it wastes a lot of LLM tokens.
I'd like my graph NOT to include triplets like "source, is, 12348873" and "file path, is, home/folder". How can this be easily achieved?